In this example, I'll be loading a large CSV file into Excel using PowerQuery to filter the data before it is loaded to Excel to make the file more manageable. I've found a suitable file to upload from this site http://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/ where there are a range of pre-made CSV files that are perfect for this example. I've used the '1000000 Sales Records' file but any of the files on this page should provide a similar result.
Although Excel can display over a million rows, in this hypothetical example we'd like to filter the source data on one country before loading it to the sheet in Excel to make it more manageable.
- Open Excel
- In the ribbon, click on 'Data' then click, 'New Query' and navigate to 'From CSV' and click on it.
- Locate the CSV file you wish to load and click 'Open'. Once this is done the 'Query Editor' window will open and display a preview of the file. Note, this is limited to the first few hundred rows and therefore will not display the whole file.
- Apply the required filter and then click 'Close & Load'. This will assume the default load options and display the results of the query in a table on a sheet in Excel.
- If the table is large, selecting the 'Close & Load To… ' option will present some alternative load options where you can choose to 'Only Create Connection' to the data and/or load the data to the data model. This can be useful if the table is too large to display on a sheet in Excel.
- If you've loaded with the default options you should then see a table of the data with the transformations applied in the query…
- Now the data can be used to create pivot tables and/or conduct any further analysis.
The above process has indirectly created a query in Power Query which can be accessed by clicking on the 'Advanced Editor' in the power query window. Below is the code that has been created to import and transform the data. Although there is no need to understand this code, if you want to create more advanced transformations its useful to understand what the code is doing. Most of the required transformations can be achieved in Power Query using the GUI which is very similar to the common functions in excel.
let
Source = Csv.Document(File.Contents("C:\Users\---------\Downloads\1000000 Sales Records\1000000 Sales Records.csv"),[Delimiter=",",Encoding=1252]),
#"Promoted Headers" = Table.PromoteHeaders(Source),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Region", type text}, {"Country", type text}, {"Item Type", type text}, {"Sales Channel", type text}, {"Order Priority", type text}, {"Order Date", type text}, {"Order ID", Int64.Type}, {"Ship Date", type text}, {"Units Sold", Int64.Type}, {"Unit Price", type number}, {"Unit Cost", type number}, {"Total Revenue", type number}, {"Total Cost", type number}, {"Total Profit", type number}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Country] = "Afghanistan"))
in
#"Filtered Rows"
Comments
Post a Comment