Power BI – The Data Model – Bridging Tables

What Are Bridge Tables?

Previous Session Summary

In the previous lesson, we covered how to enter the PowerQuery editor for the first time by highting some Purchase Order Data in our current workbook and then going to the Data tab and then going to From Table/Range.
We then looked at the different components of the PowerQuery editor being the Preview Screen, the Query Pane, the Tabs and Ribbon and the Query Settings Pane.

Lesson Synopsis

In this lesson we start looking at some of the basic steps that we or PowerQuery automatically use to transform our source data from its original ‘dirty’ state, to clean columnar data. Now before we start off in this lesson, one very simple, important point:
You cannot edit individual cells in the PowerQuery Editor. You can only perform transformations on whole columns, rows or tables.
In this lesson then we’re going to continue working with the PO data from the last lesson. We’re going to look at a couple of transformation steps which PowerQuery has added automatically and then we’re going to put in a few of our own, including removing some unnecessary rows, promoting the top row into the Header row and then loading our nice, neat columnar data into Excel.

Resource Listing

The Video

Play Video

Lesson Notes

In order to create relationships we of course need to have a key, a piece of information that is present in both datasets that we can use to link the two datasets together. Here from a couple of lessons ago is the slide showing us connecting the Retail Analysis table and the Item table together using the ItemID as a connecting field.

But what happens if you can’t do that? What happens if you want to connect two datasets but there’s no key – no field available that is present in both datasets? Here for example we have the District Dataset which is in our Power BI model and the Retail Analysis dataset also in our Power BI model, albeit condensed a little to get it onto the screen and we can see that there’s no available key to link the two datasets together.

Of course we frequently see this problem in Excel models as well and in that situation, what do we do?

Well I at any rate frequently go searching for another piece of information that will act as a bridge, a piece of information that is present in the two datasets that I’m trying to connect. We can do the same thing in Power BI so therefore in this lesson we’re going to cover:

How to use bridge tables to connect datasets which have no ‘keys’ in common

Let’s go back into Power BI and look at the datasets that we’ve got remaining to us to connect.

Switch into the Model view.

Back in the model view then I can see that along with the retail analysis table and the district dataset, we also have the Store Dataset which is currently not connected. So let’s see if we have a piece of information that we can use to connect together the Retail Analysis dataset and Store dataset. Ok scrolling down the Store table I can see that we have Location ID, and I can also see that field in Retail Analysis table so that could be an option for connection. First I should check whether there’s a one to many relationship so let’s go into the data view and have a quick scan down the Location ID over here on the left hand side. And yes, it doesn’t look like there are any duplicates. Ok let’s go ahead and try and connect the two fields. I’ll go to Manage relationsbips, click new, then select Store and Location ID and then select Retail Analusis and Location ID. Oh dear Power BI is telling us that tjere’s a problem here – that we have a many to many relationship going on which means that there must be a duplicate somewhere in our Store field. So what do we do? Well let’s come out of these screens

Ok let’s scroll down to the bottom of the dataset and we can see that we have a couple of blank rows here in the column Location ID, as there’s some trailing information in one of the other columns. This is what’s causing us the problem. To get rid of these we’ll go back into PowerQuery by selecting Edit Queries and then go to the Store dataset. Now I’ll scroll to the bottom and we can see theses nulls here at the bottom so firstly I’ll right click on the Location ID column header, and then select remove duplicates to get rid of any duplicated rows. That’s now removed two of the nulls at the bottom of the page, but it has left us with one. Now Power BI will not allow us to have a null on the one side of a many to one relationship, so we need to get rid of that as well. Easy enough, I’ll select the arrow at the top of the Location ID filter and then filter our the null. Then Ill select Close and Apply.

Ok let’s try and recreate our relationship again. This time I’ll go to the Model view, highlight the Location ID Field in Store and then drag and drop the field onto the LocationID field in Retail Analysis. And wait a moment. And there we go, we now have a relationship and by the 1 here and the asterisk here, I can see that it is a One to many relationship.

But we’re still left with our District table unrelated. Now I can see that we have a District ID field in both the District dataset and the Store dataset so I’m going to use the Store dataset which is connected to retail analysis as a bridge table. To do this, I’ll create a relationship between the District table and the Store table using districtID, so I’ll click and hold on District ID in the District Dataset, then drag it over to be on top of the District ID field in the Store Dataset. And once again it’s a one to many relationship.

Now let’s take our new relationships for a spin. Let’s go back to the report view, where we’re left with our matrix table showing our units sold by category. Now let’s take off the category field  and the segment field and instead put on District Manager from the District dataset.

There we are, we now have the units sold by district manager. How amazing is that, the relationship filter has flown down from the District field, through the Store field and into analysis – no trouble!

.... and a little more!

The Formula Bar – The Formula Bar does not automatically come up in the PowerQuery Editor the first time that you enter the Editor in Excel. To bring the Formula Bar up, go to the View tab. In the Layout section of the Ribbon, tick the Formula Bar box.

Remove rows – While we remove the unnecessary top rows in the video, many system reports also have totals lines at the bottom of the reports which can cause a problem when we’re doing things like summing our data later once we’re finished in PowerQuery. We also have the option to remove a specific number of bottom rows using the Remove Rows drop down in the Home tab, as well as removing alternative rows, duplicate rows, rows with errors and blank rows.

Use First Row as Headers – In the video we promote the first row in the table into the headers row. However we can also take the Headers and put them into the first row. To do this go to the same Use First Row as Headers drop down in the Home or Transform tab and select Use Headers as First Row.