Scraping the Trulia API
If you need property listings data for real estate investing, we’ll show you how to legally scrape real estate data from the Trulia API without violating its Terms of Service, extracting addresses, pricing and much more from any market.
To legally collect the real estate listings, we’ll be recording the network traffic Trulia sends to our browser (as we use the website in full compliance with its Terms of Service).
We’ll then scrape the real estate data from the recording of web traffic instead of the actual Trulia website, so as not to violate the Terms of Service or be detected the way other web scraping services are.
1. Browse Trulia Listings
Head to the Trulia website and jump to any local market (or search results URL) whose property listings you want to scrape. You can apply any of the filters on the search page that you would normally use (e.g. if you only want sold properties).
Once the listings load, right-click on the page and hit “Inspect” to open up developer tools. This will begin recording your web traffic so we can capture the raw data that Trulia sends to your browser.
After opening developer tools, you’ll want to refresh the page to force Trulia to re-send its data to your browser. Then you can use the map to scroll and zoom around the area and/or scroll down through the listings and use the next arrow to load more and more properties. Just keep interacting with the website to get it to load more data now that we’re recording.
2. Export a HAR File
Once you’ve browsed through enough listings, open the “Network” tab in developer tools and click the down arrow labeled “Export HAR…” to export a HAR file containing the recorded network data.
Upload the HAR file to the HAR File Web Scraper and we’ll parse out the requests into groups you can then extract the data from.
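If you’d like to peek inside the export yourself first, a HAR file is plain JSON with the recorded requests stored under log › entries. The sketch below is only an illustration of that idea, not how the HAR File Web Scraper works internally; the function names are our own, and the grouping by host and path is an assumption about what makes a useful summary.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def load_har_entries(path):
    """Read a HAR file (plain JSON) and return its list of recorded entries."""
    # utf-8-sig also accepts files that start with a byte order mark
    with open(path, encoding="utf-8-sig") as fh:
        return json.load(fh)["log"]["entries"]


def count_by_endpoint(entries):
    """Count recorded requests per host + path, ignoring query strings."""
    counts = Counter()
    for entry in entries:
        parts = urlsplit(entry["request"]["url"])
        counts[parts.netloc + parts.path] += 1
    return counts
```

For example, calling count_by_endpoint(load_har_entries("trulia.har")).most_common(10) would list the ten busiest endpoints in the recording (trulia.har is a placeholder for whatever you named your export).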
You’ll want to look for the group with graphql in it. There may be several, so look for the group with fields like streetAddress, pricePerSquareFoot, isForeclosure, etc. to find the correct group (you can always just try them all if you’re not sure). Click “Parse Group” on the group to proceed to the next step.
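The same search can be sketched in code for anyone working with the raw HAR entries directly. This is a rough approximation under our own assumptions: we treat "the group with graphql in it" as any request whose URL contains graphql, and we check the response body for the field names as plain substrings. The function names are hypothetical, and the entries argument is the list found under log › entries in the HAR file.

```python
import base64


def response_text(entry):
    """Return the response body of a HAR entry as text ('' if none was saved)."""
    content = entry.get("response", {}).get("content", {})
    text = content.get("text") or ""
    # HAR marks binary-safe bodies with encoding == "base64"
    if content.get("encoding") == "base64":
        text = base64.b64decode(text).decode("utf-8", errors="replace")
    return text


def find_listing_entries(entries, markers=("streetAddress", "pricePerSquareFoot")):
    """Keep graphql entries whose response body mentions every marker field."""
    matches = []
    for entry in entries:
        if "graphql" not in entry["request"]["url"]:
            continue
        body = response_text(entry)
        if all(marker in body for marker in markers):
            matches.append(entry)
    return matches
```

A substring check is crude but cheap; it narrows the candidates before any JSON parsing, which is usually enough to tell the listings responses apart from the other graphql traffic.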
3. Download Trulia Data
Once we parse out the data, you’ll see downloadable collections you can save to your computer as CSV or Excel files. Here, the most relevant data is in the data › searchHomesByDetails › homes collection, but this may vary depending on your exact circumstances.
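To make the collection path concrete, here is a minimal sketch of following data › searchHomesByDetails › homes through already-parsed response JSON and writing the result as CSV text. Only the path comes from the steps above; the helper names are ours, and because we don’t assume any particular shape for each home, nested objects are simply flattened into dotted column names.

```python
import csv
import io


def dig(payload, path):
    """Follow a sequence of keys into nested dicts; return [] if a key is missing."""
    node = payload
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return node or []


def flatten(record, prefix=""):
    """Turn nested dicts into one flat dict with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def homes_to_csv(payloads, path=("data", "searchHomesByDetails", "homes")):
    """Collect the homes from every payload and return them as CSV text."""
    rows = [
        flatten(home)
        for payload in payloads
        for home in dig(payload, path)
        if isinstance(home, dict)
    ]
    columns = sorted({column for row in rows for column in row})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are left blank
    return out.getvalue()
```

If your collection lives under a different path, pass it as the path argument instead of editing the function.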
You’ll get back a lot of detailed fields which can help with real estate market analysis. Some fields you may be interested in are as follows, but there are plenty more than what’s on this list!
- Trulia URL
- City
- State
- Zip Code
- Street Address
- Latitude & Longitude Coordinates
- Listing Price
- Square Footage
- Number of Bedrooms
- Number of Bathrooms
- Sale Type (Resale, Builder, Lot Land, etc…)
- Is Foreclosure?
- Listing Agent Name
- Broker Company Name
- Listing Image URL
- Timestamp Listed
Removing Duplicates
If you pan around the map a lot while generating the HAR file, you’ll likely load duplicates into the HAR file and need to de-duplicate them after downloading the scraped data. To do this, we suggest using the url field when removing duplicates, either in Excel or your spreadsheet program of choice.
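If you’d rather de-duplicate in code than in a spreadsheet, the same idea fits in a few lines. This is a minimal sketch that assumes each listing is a dict with a url key, as in the downloaded data; the function name is ours, and keeping rows that have no url at all is our own choice, since there is nothing to compare them on.

```python
def dedupe_by_url(rows, key="url"):
    """Keep the first row seen for each url, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        value = row.get(key)
        if value is None:
            # No url to compare on, so keep the row rather than guess
            unique.append(row)
            continue
        if value in seen:
            continue
        seen.add(value)
        unique.append(row)
    return unique
```

Keeping the first occurrence means that if a listing’s details changed while you were browsing, the earliest recorded version wins; reverse the list before calling the function if you prefer the latest.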