It’s good that we took a closer look at data types in the last lesson and how choosing the incorrect data type can cause data error. In this lesson we’re going to look at the ‘Date’ data type and some of the related things that you can do once data is in a data format.
Dealing with dates is a really important skill in Finance Teams; we need dates for analysis as we’re always wanting to know when things have happened, we need to put in dates into journals, or perhaps working out budgets. Extracting details from dates such as month names, weekday names, end of month dates or beginning of month dates can be done in standard Excel but you need to know a range of functions to do it. In PowerQuery (predictably!) it’s much easier and literally a click of a couple of buttons and that’s what we cover in this lesson.
We also cover quite a lot of useful abilities including how to duplicate columns, how to rename columns and then how to move them about.
Let’s have a look…
As you may know, Standard Excel has lots of options for extracting date information from dates in Standard Excel. These include YEAR (extracts the Year from a date), DAY (returns the number of a day in a month from 1-31), MONTH (returns month number from 1-12), EOMONTH (gives the date of the last day of a month) and so on. You can also use the TEXT function to extract things like the month name. However, even if you know that these functions exist, you often need to go rummaging around the internet to remind yourself of the function and what it does.
The great thing about PowerQuery is that it’s very easy to extract date or time information from dates in PowerQuery. We just have to click on a couple of buttons!
However before you can get to any of that exciting extraction, you need to convert your date columns into the ‘Date’ or ‘DateTime’ datatype. If it’s in a number of text format then you won’t be able to use the date extraction options, which we’ll go through in a moment.
Converting data to the ‘Date’ or ‘DateTime’ format is also useful as it means you’ll be able to filter the data column by date in PowerQuery. Once we eventually load the data into Excel, either into a table or a pivot table, you’ll also be able to filter the data by date in those tables as well.
Quickly though, what is the difference between ‘Date’ and ‘DateTime’?
Well the ‘Date’ format shows the date only whereas the ‘DateTime’ datatype shows both the date and the time. How can you tell which one you can use?
Well a column of dates will be a column of 5 whole digits from 1-9 e.g 41539. In Excel, each day is represented by 1, with the number 1 itself representing 1st January 1900, number 2 being 2nd January 1900, number 3 being 3rd January 1900 and so on. The number 41539 is the date 22nd September 2013 as it’s 41538 days since 1st January 1900. Any such 5 digit serial number can be converted into the ‘Date’ data type. If you try and convert this information into a DateTime format then you’ll get a data formatting error as PowerQuery looks for some hour and minute information which it can’t find.
A column of date and time information will be a column of 5 whole digits from 1-9 and then a decimal point and a fraction. As the number 1 represents a 24 hour period, 0.5 represents 12 hours, 0.25 represents 6 hours and so on. The number 41539.05 therefore represents 12 noon on 22nd September 2013. A column of this type of number can be converted to DateTime datatype.
Converting a column to the ‘Date’ or ‘DateTime’ data type
So how can we convert a column to the ‘Date’ or ‘DateTime’ datatype?
Once we’ve done this, then we’ll see a new Changed Type action in our Query Settings listing.
Different countries arrange their dates in different ways.
For example, in the UK, the date is formatted dd/mm/yyyy. However in the USA, dates are formatted mm/dd/yyyy. When we receive files from systems that have different locale settings then this can sometimes trip us up in standard Excel. An example would be where a UK Finance Team with UK operating systems are using a file created in the US with a US operating system. In this example, if we don’t change the locale settings, then PowerQuery will create an Error.
To change the Locale Settings:
Date or DateTime information extraction introduction
As we said when we started off this lesson, it’s frequently very useful to extract the name of days of the week, name of months of the year or the number of days of the week or weeks of the year. This can be particularly useful when analysing data like sales trends or comparing retail weeks year on year. In Standard Excel, extracting all of that information is possible but you need to know the functions to be able to do it. In PowerQuery though we’ve got some simple buttons to extract this information. All we need to do is highlight the date column from which we want to extract the information and then select the extraction option that we want. However we’ll also lose the date information in the original column, so first we’d need to sort out duplicating the column.
How to Duplicate a Column.
Duplicating a column creates an exact replica of the duplicated column and puts it at the end of the dataset. The new column will have the same header but with the word ‘copy’ at the end of it.
This then creates a Duplicated Column step in the Applied Steps pane of the Query Settings pane.
How to change the position of columns in the dataset.
Ok you’ve duplicated your column…but it’s at the end of your dataset so you’ll probably want to move that. Like in a pivot table, you can click and hold the header of the column you want to move and then drag it where you want it to go. That is a veeeeeeery slooooow way of doing things though. A possible faster plan is to:
How to rename your column
Super simple this one. Double left click on the header that you want to rename and then when the header becomes highlighted, type in your new Header title.
Back to Date or DateTime information extraction
Date or DateTime information extraction introduction
Now you have a duplicated column in the right place in the dataset and with an appropriate Header. To extract our date information then:
Age – Calculates the length of time between the current date and the date in each row of the column. Perfect for calculating things like debtor ageing.
Date Only – For those columns formatted as a DateTime, this option removes the Time element
Year, Month, Quarter, Week and Day – each have subsequent options such as Start of Year, End of Year or Start of Month, End of Month, which give the first or last date of the relevant time period. You can also extract from here the number of the day of the week from 1-7, or the name of the week, the number of the month from 1-12 and so on.
Earliest and Latest – This returns a table showing the Earliest and Latest dates in the column.
To extract the name of the month from the date, select your date column then select the ‘Year’ option and then the ‘Name of Month’ option. You’ll then have a column with the name of the month :-D.
Date differences between PowerQuery and Excel – Ok this is very geeky and chances are you’ll never have to deal with this. The number 1 in PowerQuery represents the date 31/12/1899 whereas in standard Excel it is 01/01/1900. This mismatch continues until the leap year day 29/02/1900 in standard Excel ( which is the number 61). This leap year day does not exist in PowerQuery and so the mismatch corrects itself on 01/03/1900 and after this Standard Excel and PowerQuery match.