The previous lesson was a big one with understanding how to connect to and transform multiple files at once by connecting to a folder of files. This really is one of the simplest and biggest time saving techniques in PowerQuery and will save most Finance Teams a huge amount of time.
In this lesson were going to focus on removing unnecessary columns and moving the headers into the first line of the dataset.
Removing columns is useful, both to slim down a dataset and reduce a dataset as well as to really focus in on the information that is important in a file. Demoting headers can also be necessary if there’s transformation or cleaning of the text in the headers that we need to do, as the Headers can’t be transformed without specific functions. In the case of the file in the example, the headers are split by delimiter and will eventually need to be separated. We can only do this if the headers are part of the main dataset.
Ok once we’ve got into the PowerQuery Editor (from the standard Excel screen, select Data, then Queries and Connections and then in the Queries and Connections pane, right click on a query and select ‘Edit’, we need to get rid of some columns.
How to delete columns
In the example video, we remove the column on the left of the screen called Source.Name. This column shows the file in the folder that each row has come from. This can be quite useful, for example for introducing data creation dates into datasets. However for this lesson, we’re going to remove it.
There’s three main ways to remove a column:
We can either:
At the moment in our video, our data is split by a space, a comma and then another space; a slightly elaborate delimiter.
Unfortunately, this also applies to our top line which has been promoted into the Headers. This line therefore needs the same cleaning and formatting work that we apply to the remainder of our dataset. To do that cleaning and formatting then we need to demote those headers into the first row
To demote headers then: