Scraping the LinkedIn API
LinkedIn scraping can be an uphill battle, as the company has made clear through legal actions that it does not want its data scraped. But this hasn’t stopped companies like Apollo from building their own LinkedIn profile scrapers, and you can scrape Apollo’s data through our Apollo LinkedIn Scraper.
However, it’s still possible to get data from LinkedIn’s API (as an authenticated member while logged in) in a 100% legal (and undetectable) way. Using a little-known technique involving HAR files and our LinkedIn web scraper, you can extract basic information, such as exporting LinkedIn search results to Excel, without violating their Terms of Service.
In order to legally scrape LinkedIn, we’ll need to scrape a recording of the API data they send to our browser instead of scraping the actual LinkedIn website. We can then use our HAR File Web Scraper as shown below as a LinkedIn scraper API substitute.
This means that (1) LinkedIn can never detect the scraping (as we are just passively recording) and (2) the scraping happens on a recording of LinkedIn network traffic rather than on LinkedIn itself, so using our HAR File LinkedIn data extractor cannot be governed under their Terms of Service.
For more thorough scraping of public profiles & company profiles from LinkedIn, check out our Apollo LinkedIn Scraper or watch the video below to see how you can easily scrape LinkedIn data from Apollo as an alternative to the instructions shown below.
1. Browse Search Results
To get started, simply head to LinkedIn and perform a search for people, jobs, companies or posts you’re interested in. You can also navigate to a company page or LinkedIn group and then browse through people, posts & jobs there as an alternative.
When you have the data you want to scrape on the screen, right click in the browser and hit “Inspect” to open up developer tools. This will begin recording the network traffic that LinkedIn sends to your browser with the raw LinkedIn data in it.
You’ll then want to refresh the page to force LinkedIn to reload the data (now that you’re recording) and then proceed to scroll down and paginate through the results (keep clicking next). You should be able to browse to a few hundred items of whatever you’re scraping.
2. Export a HAR File
Once you’ve browsed through the data you want to scrape, find the Network tab under developer tools and click the down arrow labeled “Export HAR…” to download a HAR file containing a copy of all the data LinkedIn sent to your browser.
Upload that file to the HAR File Web Scraper and look for the group labeled /voyager/api/graphql or similar (as pictured above). This contains all of the pages of data you browsed with the data you want. Click the “Parse Group” button for that group to access the data.
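If you are curious what the grouping step looks like under the hood, here is a minimal sketch in Python. It is not the HAR File Web Scraper's actual implementation. It relies only on the standard HAR layout (a top-level log object with an entries list, each entry holding a request URL and a response body), and the function names are made up for illustration.

```python
import base64
import json
from collections import defaultdict
from urllib.parse import urlparse


def load_har_entries(har_text):
    """Return the list of request/response entries from HAR JSON text."""
    return json.loads(har_text)["log"]["entries"]


def group_by_path(entries):
    """Group entries by the path of the request URL (query string ignored)."""
    groups = defaultdict(list)
    for entry in entries:
        path = urlparse(entry["request"]["url"]).path
        groups[path].append(entry)
    return dict(groups)


def response_json(entry):
    """Decode an entry's response body as JSON, or return None if it is not JSON."""
    content = entry["response"].get("content", {})
    text = content.get("text")
    if not text:
        return None
    # HAR files may store bodies base64-encoded; the "encoding" field says so.
    if content.get("encoding") == "base64":
        text = base64.b64decode(text).decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

To try it on your own export, read the file into a string (for example a file you saved as linkedin.har, a hypothetical name), pass it to load_har_entries, and look up the /voyager/api/graphql key in the result of group_by_path.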
3. Download LinkedIn Data
The data you want should be in the included collection. However, it’s a little tricky, as the first 10 rows will all appear to be blank. If you hover your mouse over the navigationUrl or title.text column, you can see a distribution of all the values and verify that the data is there from our LinkedIn data scraper.
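The blank rows make more sense once you see how the columns are filled. The sketch below assumes each parsed response body is a dictionary with an included list, and that a dotted column name such as title.text means "the text key inside the title object". Both are assumptions based on the column names shown above, not a documented LinkedIn schema, and the helper names are hypothetical. Records that carry neither field are the ones that show up as blank rows, so the sketch skips them.

```python
def get_path(record, dotted):
    """Follow a dotted key such as "title.text" through nested dicts."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict):
            return None
        value = value.get(key)
    return value


def extract_rows(payloads, columns=("navigationUrl", "title.text")):
    """Collect the chosen columns from every record in each payload's "included" list."""
    rows = []
    for payload in payloads:
        if not isinstance(payload, dict):
            continue  # e.g. a response body that was not JSON
        for record in payload.get("included") or []:
            if not isinstance(record, dict):
                continue
            row = {column: get_path(record, column) for column in columns}
            # Drop records that have none of the columns (the "blank" rows).
            if any(value is not None for value in row.values()):
                rows.append(row)
    return rows
```

Passing a different columns tuple lets you pull other fields you spot in the collection, using the same dotted naming.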
The files will contain basic information about the people (with job titles & company), jobs or posts you’re scraping - basically anything that was presented in the search results or listings will appear in the LinkedIn data scraping output.
The navigationUrl column is useful as it links to the resource included in the scrape. So for example if you’re trying to do outreach to lead lists matching your search, you can save the URLs to a spreadsheet and then divide up that work and track progress that way.
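As a rough illustration of that workflow, the following Python sketch turns scraped rows into a spreadsheet-ready CSV. It assumes each row is a dictionary with navigationUrl and title.text keys, removes duplicate URLs, and hands the URLs out to teammates in turn. The owner and status columns, the function names, and the round-robin split are all illustrative choices, not something our tool produces for you.

```python
import csv
import io
from itertools import cycle


def build_outreach_sheet(rows, assignees):
    """Deduplicate URLs and assign each one to a teammate, round-robin."""
    if not assignees:
        raise ValueError("at least one assignee is required")
    owners = cycle(assignees)
    seen = set()
    sheet = []
    for row in rows:
        url = row.get("navigationUrl")
        if not url or url in seen:
            continue  # skip blank rows and repeated URLs
        seen.add(url)
        sheet.append({
            "url": url,
            "title": row.get("title.text") or "",
            "owner": next(owners),
            "status": "todo",
        })
    return sheet


def to_csv_text(sheet):
    """Render the sheet as CSV text that Excel or Google Sheets can open."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["url", "title", "owner", "status"])
    writer.writeheader()
    writer.writerows(sheet)
    return buffer.getvalue()
```

Write the returned text to a .csv file and each person can filter by the owner column and update status as they go.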
Please note that our LinkedIn scraping tool will not capture contact information or detailed LinkedIn profile data like email address or phone numbers. To get those, you must manually visit each profile (with the navigationUrl) and see what the LinkedIn user makes publicly available or use another email finder like LinkedIn Sales Navigator.
Legal Concerns
It’s no secret that LinkedIn doesn’t like it when people scrape its public data. If you need to know more, simply see the hiQ Labs v. LinkedIn lawsuit on the legality of scraping public data. While the courts are favoring hiQ’s right to scrape LinkedIn as a first party, the ruling is irrelevant for third party tools like our service which help you with web scraping LinkedIn.
It’s still illegal for a tool (like a Chrome extension) to help you violate another company’s Terms of Service (but it’s not illegal if you web scrape LinkedIn yourself).
If you do extract LinkedIn data (legally with our service of course), you need to be extremely careful about republishing the data in public as it contains personally identifiable information. However, we usually see our customers use this data strictly for private research and use, so it generally does not become an issue.