Power BI – The Data Model – Introducing the Dataset

Introducing the Dataset

Previous Session Summary

In the previous lesson, we covered how to enter the PowerQuery editor for the first time by highting some Purchase Order Data in our current workbook and then going to the Data tab and then going to From Table/Range.
We then looked at the different components of the PowerQuery editor being the Preview Screen, the Query Pane, the Tabs and Ribbon and the Query Settings Pane.

Lesson Synopsis

In this lesson we start looking at some of the basic steps that we or PowerQuery automatically use to transform our source data from its original ‘dirty’ state, to clean columnar data. Now before we start off in this lesson, one very simple, important point:
You cannot edit individual cells in the PowerQuery Editor. You can only perform transformations on whole columns, rows or tables.
In this lesson then we’re going to continue working with the PO data from the last lesson. We’re going to look at a couple of transformation steps which PowerQuery has added automatically and then we’re going to put in a few of our own, including removing some unnecessary rows, promoting the top row into the Header row and then loading our nice, neat columnar data into Excel.

Resource Listing

The Video

Play Video

Lesson Notes

After an introduction to PowerBI’s data model modelling phase in the last lesson we’re going to have a proper look at the data model views available in Power BI and what each of the different functions do.

But before we do any of that though we’re going to leave behind the container imports file that we used in Stage 1 and introduce a new dataset into Power BI which is fictitious sales data for a number of retail stores.

Let’s get this into Power BI using our new PowerQuery skills. I’ve got Power BI open on a blank report and then I’ll select the Get data drop down menu.

 Now of course this being Power BI we can see considerably more connection options than we had in PowerQuery in Excel, but the one that I’m looking for which is the excel option is still available so I’ll select that option and then find where I’ve saved the Excel file on my OneDrive. I’ll select the FactTable tab from my list and then select Transform Data

Now Power BI has already done a lot of my formatting for me, but there’s one thing that I need to do before we start and that is to change my ReportingPeriodID from whole number to a Date format .

Ok let’s have a look at our dataset and what is contained in our data. This data overall shows a number of columns including:

MonthID, a SKU number or product unique reference called ItemID, A location ID showing where an item was sold, a gross margin value, unit price in Sterling, A reduced price in Sterling, the reporting period, which is a month and year combined, the sum of units sold at regular price, the sum of units sold at the reduced price, and some data on when sales were made.

Just a quick word on SKU numbers. These are unique seven digit codes with a value unique for each product sold and stand for Stock Keeping Unit.

I’m also going to bring in three other datasets:

  • The first one shows the details about the stores where goods have been sold.
  • The second shows some information about the store manager
  • The third shows some information about the products that have been sold and how they can be categorised.

I’m going to do this off screen however as by now you know how to bring datasets in using PowerQuery.

Now I can see my new queries have popped up the query pane and I’ve done just a little formatting to get the data into shape.

I’m happy that my data is in a good place and that I can start viewing it in the data model so I’ll click Close and Applly. Now we come out of the Power Query screen and a new menu has popped up saying Apply Query Changes.

Power BI is now sorting out putting the data from the dataset into the data model view and depending on the amount of data and the amount of tables being used, this can sometimes take a little time.

There we go, we’re now ready to go, al the information is now available in the data model view, so let’s go and take a tour.  

.... and a little more!

The Formula Bar – The Formula Bar does not automatically come up in the PowerQuery Editor the first time that you enter the Editor in Excel. To bring the Formula Bar up, go to the View tab. In the Layout section of the Ribbon, tick the Formula Bar box.

Remove rows – While we remove the unnecessary top rows in the video, many system reports also have totals lines at the bottom of the reports which can cause a problem when we’re doing things like summing our data later once we’re finished in PowerQuery. We also have the option to remove a specific number of bottom rows using the Remove Rows drop down in the Home tab, as well as removing alternative rows, duplicate rows, rows with errors and blank rows.

Use First Row as Headers – In the video we promote the first row in the table into the headers row. However we can also take the Headers and put them into the first row. To do this go to the same Use First Row as Headers drop down in the Home or Transform tab and select Use Headers as First Row.