
How to load a large file into Excel

In this example, I'll load a large CSV file into Excel using Power Query, filtering the data before it reaches the worksheet so the result is more manageable. I found a suitable file at http://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/, which offers a range of pre-made CSV files that are perfect for this example. I've used the '1000000 Sales Records' file, but any of the files on this page should give a similar result.




Although an Excel sheet can hold just over a million rows (1,048,576), in this hypothetical example we'd like to filter the source data to a single country before loading it to the sheet, which keeps the workbook more manageable.

  • Open Excel
  • In the ribbon, click 'Data', then 'New Query', then navigate to 'From CSV' and click it.


  • Locate the CSV file you wish to load and click 'Open'. The 'Query Editor' window will then open and display a preview of the file. Note that the preview is limited to the first rows of the file (a few hundred to around a thousand, depending on the version), so it does not display the whole file.
  • Apply the required filter and then click 'Close & Load'. This uses the default load options and displays the results of the query in a table on a sheet in Excel.


    • If the table is large, selecting the 'Close & Load To…' option presents some alternative load options, where you can choose 'Only Create Connection' and/or load the data to the Data Model. This can be useful if the table is too large to display on a sheet in Excel.


  • If you've loaded with the default options you should then see a table of the data with the transformations applied in the query…


  • Now the data can be used to create pivot tables and/or conduct any further analysis.

The process above has created a query in Power Query behind the scenes, which you can view by clicking 'Advanced Editor' in the Power Query window. Below is the code that was generated to import and transform the data. There is no need to understand this code, but if you want to create more advanced transformations it's useful to know what it is doing. Most of the required transformations can be achieved through the Power Query GUI, which is very similar to the common functions in Excel.

let
    // Read the CSV file as comma-delimited text using the Windows-1252 code page
    Source = Csv.Document(File.Contents("C:\Users\---------\Downloads\1000000 Sales Records\1000000 Sales Records.csv"),[Delimiter=",",Encoding=1252]),
    // Use the first row of the file as the column names
    #"Promoted Headers" = Table.PromoteHeaders(Source),
    // Assign a data type to each column
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Region", type text}, {"Country", type text}, {"Item Type", type text}, {"Sales Channel", type text}, {"Order Priority", type text}, {"Order Date", type text}, {"Order ID", Int64.Type}, {"Ship Date", type text}, {"Units Sold", Int64.Type}, {"Unit Price", type number}, {"Unit Cost", type number}, {"Total Revenue", type number}, {"Total Cost", type number}, {"Total Profit", type number}}),
    // Keep only the rows for the chosen country
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Country] = "Afghanistan"))
in
    #"Filtered Rows"
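
As an illustration of the kind of "more advanced transformation" you might write by hand in the Advanced Editor, here is a minimal sketch of a variation on the query above. It keeps more than one country and converts 'Order Date' to a real date. The file path, the list of countries and the "en-US" culture (which assumes the dates in the file are written month/day/year) are my assumptions for the example rather than anything generated by the steps above, so adjust them to suit your own file.

```powerquery
let
    // Hypothetical path: replace with the location of your own CSV file
    FilePath = "C:\Data\1000000 Sales Records.csv",
    Source = Csv.Document(File.Contents(FilePath), [Delimiter = ",", Encoding = 1252]),
    PromotedHeaders = Table.PromoteHeaders(Source),
    // Hypothetical list of countries to keep
    CountriesToKeep = {"Afghanistan", "Albania"},
    // Keep a row when its Country value appears in the list
    FilteredRows = Table.SelectRows(PromotedHeaders, each List.Contains(CountriesToKeep, [Country])),
    // Type only the columns needed for analysis; the other columns stay as untyped text.
    // "en-US" is an assumption that dates are written month/day/year
    ChangedType = Table.TransformColumnTypes(
        FilteredRows,
        {{"Order Date", type date}, {"Units Sold", Int64.Type}, {"Total Profit", type number}},
        "en-US"
    )
in
    ChangedType
```

To try it, paste the code into the Advanced Editor in place of the generated query and click 'Done'. The step names here have no spaces, so they don't need the #"..." quoting that the GUI-generated steps use.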
