Scraping the Hotels.com API
If you work in the travel or hospitality industry, you may want pricing and availability data for your local market.
Unfortunately, travel aggregator websites like Hotels.com don’t make this information easy to export. By recording your own browser traffic into a HAR file, however, we can legally intercept and scrape pricing data from Hotels.com without violating the Terms of Service.
1. Browse Hotel Search Results
To start scraping, head to the Hotels.com website and enter the area and search term you want results for. You can also apply any filters you need. We suggest being as specific as possible so the results are narrowed down to only the ones you’re interested in.
Once you’re happy with the results shown on the page, right-click on the web page and hit “Inspect” to open developer tools, which will automatically begin recording your web traffic. Now refresh the page so that the Hotels.com API reloads its data into your browser while the recording is running.
If you’re curious to see what’s going on, open the “Network” tab in developer tools and type graphql into the filter to show only these API requests. You’ll be able to see the raw JSON data that the Hotels.com API sends to your browser as you interact with the website.
2. Export a HAR File
You can scroll through the results if you’d like to load more, but the results appear to load with the first page, so you may not need to scroll at all. To get the raw JSON data out of the browser, click the down arrow labeled “Export HAR…” in the “Network” tab of developer tools. This downloads a HAR file containing all of the intercepted API data from Hotels.com.
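If you'd like to inspect the exported file yourself, the following is a minimal sketch. It relies only on the standard HAR layout (a JSON document with log › entries, each holding a request URL and a response body that may be base64-encoded). The function names and the idea of matching URL paths that end in graphql are illustrative assumptions, not part of any Hotels.com tooling.

```python
import base64
import json
from urllib.parse import urlparse


def load_har(path):
    """Read an exported HAR file (it is plain JSON) into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def graphql_entries(har):
    """Return the HAR entries whose request URL path ends in 'graphql'."""
    entries = har.get("log", {}).get("entries", [])
    return [
        entry for entry in entries
        if urlparse(entry["request"]["url"]).path.rstrip("/").endswith("graphql")
    ]


def response_json(entry):
    """Decode one entry's response body as JSON; return None if it is not JSON."""
    content = entry.get("response", {}).get("content", {})
    text = content.get("text")
    if not text:
        return None
    # Browsers may store the body base64-encoded and flag it in "encoding".
    if content.get("encoding") == "base64":
        text = base64.b64decode(text).decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

For example, graphql_entries(load_har("hotels.har")) would list the candidate requests, where hotels.har is a placeholder for whatever name you saved the export under.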
On the HAR file grouping page, look for a HAR grouping ending in graphql, but also check that you see fields with the word “price” in them, as shown in the screenshot above. Several different graphql API calls are made, so you need to make sure you select the correct one. Click “Parse Group” to parse the JSON data into CSV format.
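The "does this group contain price fields" check can also be done in code. The sketch below assumes you already have a response body parsed into Python objects (for instance with json.loads) and simply searches every key in the tree for the substring "price". The function name is hypothetical.

```python
def has_price_key(node):
    """True if any dict key anywhere in the JSON tree contains 'price'."""
    if isinstance(node, dict):
        return any(
            "price" in key.lower() or has_price_key(value)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(has_price_key(item) for item in node)
    # Strings, numbers, booleans and None have no keys to inspect.
    return False
```

Only keys are inspected, so a response that merely mentions the word "price" in a text value is not treated as a match.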
3. Download Hotels Data
Once the parse completes, you should see the data › propertySearch › propertySearchListings group as the first result. Unfortunately, there are a lot of excess columns, but the columns listed below hold the most useful information about each search result.
id: ID of the Property
cardLink.resource.value: URL to the property with exact pricing
mediaSection.gallery.media[0].media.url: Image URL for the first carousel image
mediaSection.badges.tertiaryBadge.text: Call to Action Text (E.g. “Only 5 Left!”)
headingSection.heading: Hotel Name
headingSection.messages[0].text: Hotel Area
priceSection.priceSummary.displayMessages[0].lineItems[0].price.formatted: Original Price
priceSection.priceSummary.options[0].displayPrice.formatted: Discounted (Current) Price
summarySections[0].guestRatingSection.badge.text: Hotel Rating (Out of 10)
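If you prefer to pull these columns out of the raw JSON yourself, a rough sketch follows. It assumes the response nests listings under data › propertySearch › propertySearchListings and that each listing uses the column paths listed above. The CSV header names (property_id, hotel_name and so on) and the helper names are made up for illustration.

```python
import csv
import io
import re

# Column path in the JSON -> CSV header (header names are illustrative).
COLUMNS = {
    "id": "property_id",
    "cardLink.resource.value": "url",
    "mediaSection.gallery.media[0].media.url": "image_url",
    "mediaSection.badges.tertiaryBadge.text": "call_to_action",
    "headingSection.heading": "hotel_name",
    "headingSection.messages[0].text": "hotel_area",
    "priceSection.priceSummary.displayMessages[0].lineItems[0].price.formatted": "original_price",
    "priceSection.priceSummary.options[0].displayPrice.formatted": "current_price",
    "summarySections[0].guestRatingSection.badge.text": "rating",
}

# Matches either a plain key ("heading") or a list index ("[0]").
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def dig(node, path):
    """Follow a path such as 'a.b[0].c'; return None if any step is missing."""
    for key, index in _TOKEN.findall(path):
        try:
            node = node[int(index)] if index else node[key]
        except (KeyError, IndexError, TypeError):
            return None
    return node


def listings_to_csv(listings):
    """Render the selected columns of each listing as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(COLUMNS.values()))
    writer.writeheader()
    for listing in listings:
        writer.writerow({name: dig(listing, path) for path, name in COLUMNS.items()})
    return buffer.getvalue()
```

Missing fields come out as empty cells rather than raising an error, which is convenient because not every listing carries a badge or an original price.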
Images & Amenities
You’ll also see other collections that let you download the image URLs and amenities separately, since each property can have several of each. Look for the data › propertySearch › propertySearchListings › mediaSection › gallery › media collection, which contains a row for each image, and the data › propertySearch › propertySearchListings › headingSection › amenities collection to download the full list of amenities.
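The one-row-per-image and one-row-per-amenity layout can be approximated in code as well. This sketch assumes each listing is a dict with an id, that images live under mediaSection › gallery › media with a media.url field (as in the column path above), and that amenities live under headingSection › amenities. The shape of an individual amenity item is not documented here, so the sketch passes each item through unchanged. Function names are hypothetical.

```python
def child_rows(listings, *keys):
    """Yield (property id, position, item) for each item of a nested list."""
    for listing in listings:
        node = listing
        for key in keys:
            node = node.get(key) if isinstance(node, dict) else None
        # Listings without the nested list simply produce no rows.
        items = node if isinstance(node, list) else []
        for position, item in enumerate(items):
            yield listing.get("id"), position, item


def image_rows(listings):
    """One dict per gallery image, keyed back to its property id."""
    for prop_id, position, item in child_rows(listings, "mediaSection", "gallery", "media"):
        media = item.get("media") if isinstance(item, dict) else None
        url = media.get("url") if isinstance(media, dict) else None
        yield {"property_id": prop_id, "position": position, "url": url}
```

Calling child_rows(listings, "headingSection", "amenities") gives the matching amenity rows, each still tied to the id of the property it belongs to.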