So far in this course we’ve done a lot of work with data that’s been imported from Excel files, but in fact we can connect our queries directly from an awful lot of other sources of data.
Now we’re going to look at some of the ones available in Excel, but Power BI which also uses PowerQuery has an awful lot more and we’ll look at those in a few lessons time.
So how can we find these other data sources? Of course in the usual way!
From here there are plenty of options of course. In the course, we’ve always gone to “From File” and used either “From Workbook” or “From Folder”.
– “From Text/CSV” is also available which connects to a CSV file, although in truth, PowerQuery will do the same thing if you go through the From Workbook route and then select a CSV file (you’ll need to show ‘All Files’ in the browsing menu for a CSV file to show).
– There’s also a couple of other options including JSON which is a Java file and XML files which are Extensible Markup Language. In both cases, simply select either XML or JSON and then select where you’ve saved the file to connect to them.
– Finally, and not shown in the video, depending on your version of Excel are Sharepoint Folders. These are extremely useful particularly when using Power BI as they make automated file refreshing easier. However, as long as you’ve synced your Sharepoint site to your OneDrive account, and therefore the Sharepoint site can be accessed through Windows Explorer, you’ll be able to access files saved in Sharepoint through the normal “From Workbook connector.
Database connection options:
Other options that are available including querying information directly from a database including:
Aside from Access which your Finance Team may use independently, if your business happens to use one or more of these databases, they’ll very likely be run and curated by your IT team so get in touch with them and they should be able to help you out with connection. This would include giving you vital information including server or database addresses.
Additional connection options are available in Excel depending on your level of Microsoft subscription. Two other useful connectors that were not shown on the video is the ability to connect to Microsoft Exchange and extract data from emails as well as Sharepoint Folders.
Connecting to inboxes and extracting data from attachments is such a handy feature as it means that you don’t need to always be saving attachments to drives before connecting to those attachments. Do note though that you can’t connect to a personal email address, such as one ending in ‘hotmail.com’ or ‘outlook.com – it has to be a company email address that uses Microsoft Exchange.
If your company uses Sharepoint, then the Sharepoint folder is also extremely helpful as it allows you to connect directly to Sharepoint folders and files through the web. If you’re interested in learning more about this file connection type then have a look at this lesson.
We’ve already looked at connecting to a table or range right at the beginning of this course, and we’re going to look at connecting to a web page in a moment, but these three options here are also worth a mention.
Connecting to data on a website
Connecting to data on a website is actually a very straightforward thing to do. Let’s imagine that we wanted to download some information directly from a website. Now historically, we would have had to go to the webpage and attempt to copy and paste the data. Sometimes this would work, other times it might not. Sometimes we might be able to copy the data but then pasting it would result in some strange code or peculiar format.
So how can we use PowerQuery to connect to website data?
At this point, Excel may ask how we want to connect to the web data. As we’re querying a public website here, Anonymous on the left works fine.
PowerQuery then connects to the website and gives us the Navigator screen which has the options for which of the tables on the webpage we want to use. A bit of trial and error may be needed to find the data that you’d like but use the preview screen here and you’ll find the data you’d like to use!
Refreshing options in Excel
Now while, we’re at it – another excellent tool that PowerQuery has is the automatic refresh option.
A few lessons ago we looked at how you could manually refresh queries and looked at it again in the last lesson, but if other people have access to our files and we want the information to be as live as possible, without having to hit refresh…well we need to do something else.
What we can do is switch on automatic refreshing which will refresh the connection every hour, every 10 minutes or whatever amount of time that we choose.
To do that we need to:
There’s also a couple of other useful options including Refreshing Data When Opening a File and Enabling a Fast Data Load. Both of these are useful options, but be warned that the former could really slow down the opening of a file and if external users are using the file, they need to be warned of that! The latter is also useful but can also really slow down any other work you’re doing on your computer (depending upon how pacey your computer is!)
Refreshing CSV files vs .xlsx files – A key difference that’s worth noting between .xlsx files (standard Excel workbooks) and CSV files is on refreshing. If you connect to a .xlsx file and then subsequently add new columns to that file, then PowerQuery will spot the new columns and bring them into the query. That’s not the case with CSV files where PowerQuery will only bring in the amount of columns that it identified in the original CSV file. The number of columns brought in can be changed in the Source step in the Formula bar but it has to be done manually, when new columns are included on the source data.