Use Power Query (M Language) to scrape a web page and output it to a CSV file

When using Power Query (M) in Power BI to scrape web pages, it is often useful to write the data to a CSV file. This lets you keep a record of multiple scrapes (with the time stamp in the filename) and also makes transformation and reloading more efficient. Below is a piece of M code I recently used to capture a web page and record its details to a CSV file. The code is simplified to demonstrate the approach; any additional transformations can be added before the RScript step.

Click here to download the file used in this example.

let
    Source = Web.BrowserContents("www.bbc.co.uk"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", ".module--highlight .media__link"}, {"Column2", ".module--highlight .media__tag"}, {"Column3", ".module--highlight .block-link__overlay-link"}, {"Column4", ".module--hig...