This lesson we’re going to talk about:
a) The idea of Cardinality – also known as:
A 1 to Many and a
Many to Many relationship
So let’s jump back into our Model View in Power BI to look further.
Ok here we are in the Model View and first off let’s look at the line joining the two datasets and the fact that there’s an arrow pointing from the Item dataset to the Retail Analysis dataset. Now next to the Item dataset we can see that there’s a 1 and then just above the 2018 Sales Data, there’s an asterisk. Let’s keep this in mind. Ok next I’ll click on the Manage Relationships button….and select Edit with the selected relationship.
Ok now down to the bottom of the Edit relationship menu we have a couple of drop down menus and a few tick boxes that we skipped over in the last lesson. Well we’re going to address some of them now.
So first of all, the Cardinality box. This currently says Many to one and you’ll notice the asterisk and the 1 appears here as well as in the Model View that we just looked at. So what does this mean?
Well Power BI works best when we have what we call a Many to One or a One to Many relationship (they are the same thing) Let’s flip back into a slide to explain this better.
Flip back to PowerPoint
So here we go. The “One” and the “Many” in the cardinality box relates to the amount of instances of a value in the key field of our Datasets.
So in the relationship we’ve just created, our “Key” is the ItemID column and this column is populated with 7 digit codes. Our first picture here is a representation of our ItemID dataset that we’ve just imported and as we can see there are no duplicates ItemID numbers in this column. This is an example of what we call a Dimension Table, and generally shows the characteristics of what’s in the “Key” column. Each of the numbers we see here are unique and that of course makes sense as the Item Dataset shows rich the information in relation to a particular product; its segment, category and buyer. There’s no need to have multiple instances or multiple records of the same product as that would be unnecessary duplication – each instance should show the same information. This table therefore represents the One part of the Many to One relationship.
Our second picture here is a representation of the Retail Analysis table and this is an example of a Fact table so named because they generally detail the facts of the history that we want to analyse – in this case the Sales data. Now as we can see here there are duplicates in our SKU number column and this table represents the Many table in our Many to One relationship. And of course that makes sense that there are duplicates of SKU or Item ID numbers as this table shows a record for the sale of each product EVERY DAY in each store, so potentially 365 lines for a each store of which there are over 100.
Now for a relationship to work effectively in Power BI, at least to start off with, there needs to be this many to one relationship. If we don’t have this many to one relationship, Power BI won’t be able to combine the two reports together correctly in a Many to One relationship. So why is this?`
Well let’s go back to our VLOOKUP analogy. Let’s say we wanted to take the segment column from the Item ID dataset and put it into the Retail Analysis table. At the moment we could do that happily We could do a VLOOKUP happily and as there are no duplicates in the Item dataset, the right information would always be brought across into the Retail Analysis table based on the Item ID number
Change slide.
Now let’s say though that we have duplicates in the Item dataset like we do on this slide where SKU number 1292637 has three segments, one being 908, another being 4028 and a third being 4128. Well a VLOOKUP or an IndexMatch will always take the first row that it finds – in this case returning the segment 9087 – but who’s to say that that is the correct segment to bring across?
In this instance Power BI works differently and does not allow you to perform a Many to One Relationship, instead insisting on a Many to Many relationship which has a different set of rules altogether.
So what can we takeaway from this lesson:
Firstly:
a) Cardinality means whether a designated relationship is “One to many” or “Many to Many”
Secondly
b) If the ‘Key’ fields in the relationship both have duplicates, the relationship is Many to Many
Finally
b) If the ‘Key’ fields in the relationship both have duplicates, the relationship is Many to Many
The Formula Bar – The Formula Bar does not automatically come up in the PowerQuery Editor the first time that you enter the Editor in Excel. To bring the Formula Bar up, go to the View tab. In the Layout section of the Ribbon, tick the Formula Bar box.
Remove rows – While we remove the unnecessary top rows in the video, many system reports also have totals lines at the bottom of the reports which can cause a problem when we’re doing things like summing our data later once we’re finished in PowerQuery. We also have the option to remove a specific number of bottom rows using the Remove Rows drop down in the Home tab, as well as removing alternative rows, duplicate rows, rows with errors and blank rows.
Use First Row as Headers – In the video we promote the first row in the table into the headers row. However we can also take the Headers and put them into the first row. To do this go to the same Use First Row as Headers drop down in the Home or Transform tab and select Use Headers as First Row.