Scraping the Shopify API
If you’re doing market research before creating your own e-commerce store, you may want to know which products to sell and which to avoid. While you can do your own manual research and trust your gut, you may also want to take a more data-driven approach using scraped market data from Shopify stores.
Other Shopify stores are a good place to start gathering this data, and we’ll walk through how to legally scrape specific Shopify stores using only your web browser, without any complicated code, risky software, or illegal automation.
We can scrape a lot of data from the store used in this example, including what may be real sales data, shown in the ga_unique_purchases column above. This can help us figure out which products are selling well for any store, including your competition.
1. Check Compatibility
If you find a store that looks like a Shopify store, first test whether you can scrape it with HAR files. Some Shopify stores are configured to send their product data back as HTML instead of JSON, in which case our HAR File Web Scraper will not work.
Working Example
We can scrape this Mid-Century Furniture Section with HAR files, as the store sends its data back as JSON. To test this web page, first go to the page, then right-click on the page and select “Inspect” to open developer tools. Then refresh the page, scroll down, and click on the next page of results (just one additional page is enough to test).
Now go to the “Network” tab under developer tools and click the down arrow labeled “Export HAR…” to download a HAR file containing the product data. Upload that file to the HAR File Web Scraper and look for a grouping like the one shown below.
It may not look exactly like this group as each Shopify Store is different, but if you see fields like “title” and “price” under the Fields section, odds are you can scrape the products!
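If you would like to run the same check locally, here is a minimal sketch in Python. It assumes only the standard HAR layout (log.entries, with each response body under response.content.text) and looks for JSON responses containing an object with both “title” and “price” keys. The function names are made up for this illustration, and this is not how the HAR File Web Scraper itself works internally.

```python
import base64
import json


def json_bodies(har):
    """Yield (url, parsed_body) for each HAR entry whose response body parses as JSON."""
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        text = content.get("text")
        if not text:
            continue
        # HAR marks base64-encoded bodies with an "encoding" field.
        if content.get("encoding") == "base64":
            try:
                text = base64.b64decode(text).decode("utf-8", errors="replace")
            except ValueError:
                continue
        try:
            body = json.loads(text)
        except ValueError:
            continue  # not JSON (for example HTML, CSS or an image)
        yield entry.get("request", {}).get("url", ""), body


def has_product_fields(node, required=("title", "price")):
    """Return True if any dict nested inside node contains every required key."""
    if isinstance(node, dict):
        if all(key in node for key in required):
            return True
        return any(has_product_fields(value, required) for value in node.values())
    if isinstance(node, list):
        return any(has_product_fields(value, required) for value in node)
    return False


def scrapable_urls(har):
    """List the request URLs whose JSON responses look like product data."""
    return [url for url, body in json_bodies(har) if has_product_fields(body)]
```

To use it, load the exported file with json.load and pass the result to scrapable_urls. An empty list suggests the store is not sending product data as JSON, although stores that use different field names would need a different required tuple.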
Non-Working Example
If we try to scrape this Sofa Furniture Section using the same method as above, we’ll get the following warning when we upload to the HAR File Web Scraper. This store sends its data as HTML instead of JSON, which is not currently supported.
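One rough way to see the difference yourself is to count the response types recorded in the HAR file. The sketch below assumes the standard mimeType field under response.content, and mime_summary is a hypothetical helper name. A capture where paging through products only produces text/html responses, with no application/json entries, is a hint that the store renders its products as HTML.

```python
from collections import Counter


def mime_summary(har):
    """Count HAR entries by response MIME type, ignoring parameters like charset."""
    counts = Counter()
    for entry in har.get("log", {}).get("entries", []):
        mime = entry.get("response", {}).get("content", {}).get("mimeType", "")
        # "application/json; charset=utf-8" becomes "application/json".
        base_type = mime.split(";")[0].strip().lower()
        counts[base_type or "unknown"] += 1
    return counts
```

Calling mime_summary(har).most_common() on a loaded HAR file lists the types from most to least frequent. This is only a heuristic, since analytics and tracking requests often return JSON even on stores whose product pages are HTML.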
2. Browse Products
Once you’re sure the website will work, keep developer tools open and continue browsing through all of the product pages until you reach the end. While tedious, this ensures you won’t get blocked for using an automated bot or violate the hosting store’s Terms of Service, since you’re just using the website normally.
3. Export a HAR File
After browsing through all the products, export a second HAR file (just like we did earlier when testing for compatibility) and upload it to the HAR File Web Scraper again. Then parse the group and you’ll see even more products this time.
4. Download Products
In the parse data section, look for a collection labeled results or similar. It will contain all of the products we browsed, combined into a single CSV file you can download as shown below.
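To illustrate what this combining step involves, here is a small Python sketch that merges the products found in several parsed JSON pages into one CSV. It is an approximation under stated assumptions, not the HAR File Web Scraper’s actual logic: products are taken to be any object with both “title” and “price” keys, duplicates are dropped by title, and the names find_products and products_to_csv are invented for this example.

```python
import csv
import io
import json


def find_products(node, required=("title", "price")):
    """Recursively collect dicts that contain every required key."""
    found = []
    if isinstance(node, dict):
        if all(key in node for key in required):
            found.append(node)
        else:
            for value in node.values():
                found.extend(find_products(value, required))
    elif isinstance(node, list):
        for value in node:
            found.extend(find_products(value, required))
    return found


def products_to_csv(pages, key="title"):
    """Merge products from parsed JSON pages into one CSV string.

    Assumption: `key` is one of the required fields, so every product has it.
    The first product seen for each key value is kept.
    """
    unique = {}
    for page in pages:
        for product in find_products(page):
            unique.setdefault(product[key], product)
    rows = list(unique.values())
    # Union of all keys in first-seen order, since pages may expose different fields.
    columns = list(dict.fromkeys(k for row in rows for k in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    for row in rows:
        # Nested values such as variant lists are stored as JSON text in one cell.
        writer.writerow({k: json.dumps(v) if isinstance(v, (dict, list)) else v
                         for k, v in row.items()})
    return out.getvalue()
```

Each element of pages would be one JSON response body taken from the HAR file. De-duplicating matters because refreshing or revisiting a page while browsing records the same products more than once.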
The following columns are what we scraped from the example store, but these will vary based on how the Shopify Store you’re scraping is configured.
- Product Name
- Product Title
- Created Timestamp
- Unique Purchases
- URL Slug
- Image URL
- Shipping Availability
- MSRP
- Sales Price
- Product Type
- Published Timestamp
- Brand / Manufacturer
- SKU
- On Sale?
- Variants (colors, sizes, etc…)