Hi there, now this session is absolutely essential to ensuring that you understand how Relationships and a lot of the calculations that we use in Power BI work. This lesson is going to be all about filters and how we can introduce filters into Power BI but it also starts to give a little idea of implicit measures. To really understand all of this, we’re going to go back to the humble pivot table.
Open up 2018 Sales Data table
Ok here we are in a separate Excel file that holds a bit of new Sales data for an unrelated company who has a store in Bromley by Bow and we’ve got our table of data here and then on this tab we’ve got the Pivot table over this set of data. In the dataset we’ve got columns for the amount of sales made, the SKU number or ItemID number, t he day the sale was made and date the sale was made, the store the sale was made in and whether there was a sale active at the time.
Now we’re going to think about what pivot tables do and a lot of what we’re about to go through may seem really obvious when I say it, but at the same time you may not have consciously thought exactly how pivot tables work. It’s really important that you do start to think in the way that we’re about to go through to understand not only how we can use relationships to control multiple pivot tables (which we’ll go through in the next lesson) but also how to build good reports including calculations that do what you’re expecting.
Ok so let’s kick off. What is a pivot table?
Well really it should be called a filter and summarise table shouldn’t it as it goes through the linked table, filters the data according to filters that we set either in the filter pane, row pane or column pane and then summarises or aggregates the remaining information. So all pivot tables are actually doing is presenting a filtered view of a set of data and then summarising the result into a single cell.
Let’s take a look at a couple of examples. So first of all , let’s put sales units into the values field. Ok we’ve got a number of 699136 – that’s the amount of units that we’ve sold according to the base data in this tab here across the whole year for this particular store. So what the pivot table has done is said ok you want me to summarise the values in the sales units column without applying any filters and then aggregate the values into a single cell. This is our first hint of an implicit measure by the way but we’ll talk about those a bit more later and in future lessons. So just to check our number I’ll go into our data tab and then highlight the Sales Units column. And there we go we can see that the pivot table agrees with the overall value in this column.
Ok now let’s say we only wanted to look at sales made on a weekend, there’s three main ways (main ways not the only ways) that we could do this in a pivot table. We can either put the Day field on columns and then select the drop down here and select Saturday and Sunday only, or we could transfer this field to Filters, or we could transfer it to Rows. In each case the grand total of Sales Uits is the same showing 200141 with the Saturday number 99966 and the Sunday number being 100175. Now if we go onto our Sales Data table and filter Day on Saturday and then sum the Sales Units column then we get 99666. If we then bring in Sunday as well then we get the answer 200141 exactly the same as the pivot table.
So then we can say that by adding the day criteria into either the Row, Column or Filter sections of the Pivot Table pane that we have introduced filters into the pivot table by introducing the Day filter of Satrurday and Sunday. In our cell showing 99966, we can say that this cell is calculated in the filter context of Saturday and on the cell showing 100175 we can say that this cell is being calculated in the filter context of Days being set to Sunday. The grand total column is evaluated with the filter context being set to either Saturday AND Sunday and this is very important to understand. The Grand total cell is NOT the numbers of the Saturday and Sunday cells added together. No it is the Sales unit column filtered to either Saturday and Sunday in the Day column. The fact that the grand total happens to equal the total of the individual Saturday and Sunday cells are complete coincidence as far as the pivot table is concerned. Filter context therefore tells each cell what filters are being applied to the underlying dataset, to arrive at the value in that cell and therefore the filter context on every cell in a pivot table is different.
Now at the moment we are applying our filter context through Rows on the Pivot table. But of course we could add further filters to the pivot table through adding filters to the filter section here or the columns section here. So for example we could add in the filter context into Columns of Sales? and then we see the total amount of sales and then that adds four new columns. Our original values are now split into Sales made during the Black Friday sales period, the Christmas Sales Period and the Summer Sales period as well as the periods with no sale going on.
Now if we look here at the Summer Sales on Saturday, this is the equivalent of the Pivot table going through the following steps:
1) Filtering the dataset on this tab by the Sales Column to show only rows with Summer Sales in this column
2) Filtering the dataset by the Day Column to show only rows with Saturday in this column.
3) Evaluating the Sum calculation over the Sales Units column for the remaining datasets.
If we were to look at the number below at the Summer Sales on Sunday this is the equivalent of the pivot table going through the following steps:
1) Filtering the dataset on this tab by the Sales Column to show only rows with Summer Sales in this column
2) Filtering the dataset by the Day Column to show only rows with Sunday in this column.
3) Evaluating the Sum calculation over the Sales Units column for the remaining datasets and presenting it in the pivot table.
Essentially then, a pivot table will run the same sum calculation over evert cell in the pivot table, but it will take into account the data visible to the calculation based on the filters in place for that cell.
Now Power BI works in a very similar way to a pivot table, in fact it runs off the PowerPivot technology included in Excel and then creates charts of the resulting tables, so all of this talk about filter context is absolutely essential for the rest of the Power BI pack. And it actually gets more complicated as there are a lot more ways that you can introduce filter contexts into a Power BI report than just the three that we’ve looked at here. You’ve got slicers (although of course we have those in Pivot tables as well), you can introduce filter context through measures, visual filters, page filters, report filters and Row Level Security. But that’s ok as now we’ve got the basics, it’s all about just layering those contexts one on another.
So what have we learned from this lesson:
Filter context is the set of filters applied to a supporting dataset before a pivot table or Power BI calculation occurs
Filter context therefore dictates the subset of data available to PowerBI or to the pivot table to perform a calculation.
Filter context therefore dictates the subset of data available to PowerBI or to the pivot table to perform a calculation.
The Formula Bar – The Formula Bar does not automatically come up in the PowerQuery Editor the first time that you enter the Editor in Excel. To bring the Formula Bar up, go to the View tab. In the Layout section of the Ribbon, tick the Formula Bar box.
Remove rows – While we remove the unnecessary top rows in the video, many system reports also have totals lines at the bottom of the reports which can cause a problem when we’re doing things like summing our data later once we’re finished in PowerQuery. We also have the option to remove a specific number of bottom rows using the Remove Rows drop down in the Home tab, as well as removing alternative rows, duplicate rows, rows with errors and blank rows.
Use First Row as Headers – In the video we promote the first row in the table into the headers row. However we can also take the Headers and put them into the first row. To do this go to the same Use First Row as Headers drop down in the Home or Transform tab and select Use Headers as First Row.