
How to access and combine the files in a SharePoint folder using Power Query

In a recent blog post, I described the steps required to connect Power Query to a folder on a local drive using the Folder.Files function in M. The SharePoint.Files function follows the same process to connect to folders on SharePoint. Note that M is case-sensitive, so the function must be written as SharePoint.Files.
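As a rough sketch of how the two calls compare, the query below uses both functions side by side. The local path and the site URL are hypothetical placeholders, and the ApiVersion option is simply what I'd expect the connector dialog to generate, so treat it as an assumption rather than a requirement.

```
let
    // Hypothetical local folder, as used with Folder.Files
    LocalFiles = Folder.Files("C:\Data\SampleFiles"),

    // Hypothetical SharePoint site URL. This is the site root, not the folder URL.
    // The ApiVersion option is an assumption based on what the UI typically generates.
    SharePointFiles = SharePoint.Files(
        "https://contoso.sharepoint.com/sites/Example",
        [ApiVersion = 15]
    )
in
    // Both functions return one row per file, with columns such as
    // Content, Name, Extension and Folder Path
    SharePointFiles
```

Because SharePoint.Files lists every file on the site, the folder is selected afterwards by filtering, rather than being passed in the URL.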

In this example, I've created a sample file and a number of duplicates of it, which I've saved in a folder on SharePoint.

Once you have something similar, the process is as follows:
  • Open Power BI or Excel, both of which contain Power Query. I'll be using Power BI for this demo.
  • Click the button labelled 'Get Data', then 'More'. In the window that appears, select 'SharePoint folder' and click 'Connect'.
  • In the next window, enter the URL of the site (not the folder) that contains the files you'd like to combine, then click 'OK'.
  • The next window lists all of the files on the SharePoint site, with buttons at the bottom of the screen to load the data in a few different ways.
  • Click 'Transform Data' to load the list of files into a query.
  • In this example, the site contains a number of different file types, including some that we would like to exclude from the import. First we'll filter by file type so that only CSV files are listed. To do this, select the drop-down in the 'Extension' column and ensure only '.csv' is selected.
  • Then we'll remove the 2 CSV files that don't match the format we wish to combine by filtering the 'Name' column in the same way we filtered the 'Extension' column.
  • Now that the list contains only the files we wish to combine, click the 'Combine Files' icon in the 'Content' column header, or select the 'Content' column and click 'Combine Files' in the Home ribbon.
  • The next screen confirms the settings used to import a file. For CSV files the defaults are usually sufficient. There is an option to 'Skip files with errors', which is unticked by default. If there is a chance that future files will contain errors, ticking this will exclude them and allow the query to load without presenting an error to the user.
  • Click 'OK' to combine the files
  • Once the above sequence has completed, the combined data will be displayed as a single query which can then be loaded to the data model.
  • In this example, the header row hasn't been configured correctly: each CSV file has been loaded with its header row as the first row of data. This can be addressed in a couple of ways. The recommended method is to modify the helper query titled 'Transform Sample File' and apply 'Use First Row as Headers' there, so that every file is fixed before it is combined. The alternative is to apply the same 'Use First Row as Headers' transformation in the combined query (here titled 'Query1'), but this also requires the header row from each subsequent file to be filtered out. Although this achieves the same result, it adds 2 extra steps to the query, so it's less elegant than applying the transformation in the 'Transform Sample File' query.

  • Rename the query as required by clicking on its name and pressing F2, then load the data by clicking 'Close & Apply' (or 'Close & Load' if you're using Excel).
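To illustrate the alternative header fix described in the steps above, here is a minimal sketch of the two extra steps it adds to the combined query. The inline table stands in for the combined data, and its column names and values are made up purely for illustration.

```
let
    // Stand-in for the combined query before any header fix: two files
    // appended, each still carrying its header row as data (made-up values)
    Combined = Table.FromRows(
        {
            {"Date", "Product", "Units"},
            {"2021-01-01", "A", "10"},
            {"Date", "Product", "Units"},
            {"2021-01-02", "B", "7"}
        },
        {"Column1", "Column2", "Column3"}
    ),

    // Extra step 1: promote the first row to headers
    Promoted = Table.PromoteHeaders(Combined, [PromoteAllScalars = true]),

    // Extra step 2: filter out the header rows repeated by each later file
    NoRepeatedHeaders = Table.SelectRows(Promoted, each [Date] <> "Date")
in
    NoRepeatedHeaders
```

Applying 'Use First Row as Headers' in 'Transform Sample File' instead means the second step is never needed, because each file has its headers promoted before the files are appended.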
Now you can add files to the folder on SharePoint and update the data in the data model with a single click of the 'Refresh' button.
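For anyone who prefers to read the steps above as M, the sketch below is a hand-written approximation of the whole process. It is not the code the 'Combine Files' button generates (that uses helper queries such as 'Transform Sample File'). The site URL, the excluded file names and the CSV options are all assumptions for illustration.

```
let
    // Hypothetical site URL (the site, not the folder)
    Source = SharePoint.Files(
        "https://contoso.sharepoint.com/sites/Example",
        [ApiVersion = 15]
    ),

    // Keep only CSV files, comparing the extension case-insensitively
    CsvOnly = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv"),

    // Exclude the files with a different layout (hypothetical names)
    Wanted = Table.SelectRows(
        CsvOnly,
        each not List.Contains({"Notes.csv", "Summary.csv"}, [Name])
    ),

    // Parse each file and promote its first row to headers, which mirrors
    // applying 'Use First Row as Headers' in 'Transform Sample File'.
    // Delimiter and encoding (65001 = UTF-8) are assumed values.
    Parsed = Table.AddColumn(
        Wanted,
        "Data",
        each Table.PromoteHeaders(
            Csv.Document([Content], [Delimiter = ",", Encoding = 65001]),
            [PromoteAllScalars = true]
        )
    ),

    // Roughly what ticking 'Skip files with errors' does: drop any file
    // whose parsed result is an error
    WithoutErrors = Table.RemoveRowsWithErrors(Parsed, {"Data"}),

    // Append the per-file tables into a single table
    Combined = Table.Combine(WithoutErrors[Data])
in
    Combined
```

Because the file list is re-read from the site on every refresh, any new CSV file that passes the filters is picked up automatically.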
