Power BI – The Data Model – How Does the Data Model Work

How Does the Data Model Work?

Previous Session Summary

In the previous lesson, we covered how to enter the PowerQuery editor for the first time by highting some Purchase Order Data in our current workbook and then going to the Data tab and then going to From Table/Range.
 
We then looked at the different components of the PowerQuery editor being the Preview Screen, the Query Pane, the Tabs and Ribbon and the Query Settings Pane.

Lesson Synopsis

In this lesson we start looking at some of the basic steps that we or PowerQuery automatically use to transform our source data from its original ‘dirty’ state, to clean columnar data. Now before we start off in this lesson, one very simple, important point:
 
You cannot edit individual cells in the PowerQuery Editor. You can only perform transformations on whole columns, rows or tables.
 
In this lesson then we’re going to continue working with the PO data from the last lesson. We’re going to look at a couple of transformation steps which PowerQuery has added automatically and then we’re going to put in a few of our own, including removing some unnecessary rows, promoting the top row into the Header row and then loading our nice, neat columnar data into Excel.

Resource Listing

The Video

Play Video

Lesson Notes

Hi there and welcome to the first of our series of lessons on the Power BI data model. In this lesson we’re going to talk through at a high level what a data model is, and the process that we go through to use the data model capabilities of Power BI.

So to start off, what is a data model?

For the purposes of this course, a data model is a collection of datasets grouped together and that are interconnected through relationships that we can then interrogate through analysis to get (hopefully) some useful information and insight. Ultimately the aim of the data model is to be able to manipulate the information from all the reports as if they were in a single dataset and that’s what the diagram here shows. That might all sound a but complicated, but this is exactly what the data model section of Power BI is designed to do – to link datasets together.

So what’s the process of creating a data model then?

Well step 1 of course is to bring all of the reports that we need into Power BI. We can do that of course through Power Query and form stage 1 of the course, we’re already comfortable with that phase.

Step 2 after that is to create relationships between the datasets which means linking the datasets together througha piece of information that is common in more than one dataset. We’ll then end in a position quite similar to what we have here in step showing all of our 5 datasets joined together.

Step 3:

Create calculated columns.

The concept of a calculated columns is not new – creating calculated columns in PowerBI are exactly the same as performing calculations in an Excel formatted Table for example multiplying a sales units column by a price column to generate a revenue column or subtracting a revenue column from a cost column to generate a profit column. This means that the calculation or function that is performed in one cell will automatically be applied to the rest of the column. The purpose of course of this is to create columns in datasets that don’t already exist and on screen we can see the creation of a revenue column in dataset 3 using the units and price columns to create a revenue column.

You’ll be glad to know that the Microsoft wizards designed the DAX language which we use to create calculated columns (and measures which we’re about to go onto) to be as close as possible to the Excel language. This means that lots of your favourite functions that you use in Excel can be found in Power BI with along with some excellent new ones.

Step 4:

Create Measures

Measures are one of the main reasons why PowerBI is so brilliant and the best way to think of them is that they are customizable aggregation functions that you can add to a pivot table ot to a power bi report that are used to summarize data into a single value. For those of you advanced enough in standard pivot tables to know that you can create calculated fields in Pivot tables, the concept is similar, but the capability, calculation speed and ease of use is hugely improved meaning that we can create some really powerful functions. On screen we can see a simple SUM measure calculating the total of our evenue column in dataset 3 and this is our first example of DAX – and look! It’s exactly the same syntax as we would use in Excel other than we have to be a bit more careful with referencing the correct table and column rather than cell.

Once we’ve got our four steps in place, then we can move onto Stage 3, the visualisation phase, but first we’re going to take the time to understand steps 2-4 of the data model phase in more detail.

.... and a little more!

The Formula Bar – The Formula Bar does not automatically come up in the PowerQuery Editor the first time that you enter the Editor in Excel. To bring the Formula Bar up, go to the View tab. In the Layout section of the Ribbon, tick the Formula Bar box.

Remove rows – While we remove the unnecessary top rows in the video, many system reports also have totals lines at the bottom of the reports which can cause a problem when we’re doing things like summing our data later once we’re finished in PowerQuery. We also have the option to remove a specific number of bottom rows using the Remove Rows drop down in the Home tab, as well as removing alternative rows, duplicate rows, rows with errors and blank rows.

Use First Row as Headers – In the video we promote the first row in the table into the headers row. However we can also take the Headers and put them into the first row. To do this go to the same Use First Row as Headers drop down in the Home or Transform tab and select Use Headers as First Row.