Scraping Instagram Data
In an ideal world, Instagram would offer a data API we could responsibly use to access public data about users, hashtags, locations & more. This would help countless social media marketers, academic researchers & businesses who need that data. Unfortunately, Instagram does not offer this type of data API… officially at least.
While Instagram offers a Graph API for businesses to manage engagement and a Basic Display API to display your own information from Instagram, neither allows you to programmatically scrape public data from Instagram that you could otherwise manually access.
Unofficial Instagram API
Since Instagram is primarily a mobile application, it maintains an unofficial API that the mobile app & desktop website use to communicate with Instagram’s servers. Through the use of open source software & traffic interception, we’re able to document how this unofficial API works and how one could use it for data scraping as an alternative to “screen scraping” or even hiring someone to copy & paste data from Instagram all day long.
Emails, Phone Numbers & Locations
Instagram allows business accounts to publicly share their emails, phone numbers, business categories & locations on Instagram, meaning that by using the unofficial Instagram API, anyone could scrape this public data. And since anyone can convert their personal Instagram account into a “business” account, many individual users share this information publicly too, not just actual businesses.
You can learn more about Instagram Email Scraping or watch the accompanying video for a brief introduction.
In order to scrape emails & contact information from users, you need to first build a list of Instagram user IDs. If you already have a list of usernames (but not IDs), you can import the Instagram User Basic Info Formula to see how one could convert the list of usernames into user IDs. Otherwise, you could target users based on one of the following approaches if you’re starting from scratch.
Followers of Popular Accounts
If you have a lot of Instagram followers (or want to target the followers of rival accounts 😼), you could scrape the followers of any public Instagram account. To see how to do this, simply import the Instagram Followers Email & Details Formula and enter your list of target accounts. The accompanying video explains more about follower scraping.
Because only engaged accounts tend to share their emails publicly, if you scrape the followers of an account with a lot of fake followers, you’ll get a very low percentage of users who share their email address. You may be better off scraping by hashtag, or you should be prepared to scrape a lot of followers to get email addresses.
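As a rough way to size such a scrape, the number of followers to pull is the target number of emails divided by the share of followers who publish one. The sketch below just does that arithmetic. `followers_needed` is a hypothetical helper, and the 2% and 10% rates are illustrative assumptions rather than measured figures.

```python
import math


def followers_needed(target_emails: int, public_email_rate: float) -> int:
    """Estimate how many followers to scrape to collect `target_emails` addresses.

    `public_email_rate` is the assumed fraction (0 < rate <= 1) of followers
    who share an email publicly. It is a guess you would refine from a sample.
    """
    if target_emails < 0:
        raise ValueError("target_emails must be non-negative")
    if not 0 < public_email_rate <= 1:
        raise ValueError("public_email_rate must be in (0, 1]")
    # Round up: a fractional follower still means one more profile to check.
    return math.ceil(target_emails / public_email_rate)


if __name__ == "__main__":
    # Assumed rates only: an account with many fake followers vs. an engaged one.
    for rate in (0.02, 0.10):
        print(rate, followers_needed(500, rate))
```

Measuring the rate on a small sample first, then plugging it in, gives a more realistic estimate than either assumed value.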
Hashtag Post Authors
Depending on who you’re trying to target, you may have a particular hashtag in mind that your target customers use, and you’d like to reach anyone who posted with it. Fortunately, Instagram lets you search posts by hashtag, so you could easily scrape all the recent posts using your target hashtag and then get the contact information for all post authors. Import the Instagram Hashtag Post Author Emails Formula to get started, or watch the accompanying video to learn more.
Note that only a certain percentage of users share a public email. If you search for a hashtag used by a lot of businesses (who are more likely to share their email publicly), you’ll get many more emails than if you search for a more obscure hashtag.
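Since one author can appear on many posts under the same hashtag, it helps to collapse the scraped posts to unique authors before looking up contact details. Here is a minimal sketch, assuming each post is a dict with an `author_id` field. That field name is hypothetical and will depend on how your post data is exported.

```python
from typing import Dict, Iterable, List, Set


def unique_author_ids(posts: Iterable[Dict]) -> List[str]:
    """Return author IDs in first-seen order, skipping posts without one.

    The "author_id" key is an assumed field name, not a documented one.
    """
    seen: Set[str] = set()
    ordered: List[str] = []
    for post in posts:
        author_id = post.get("author_id")
        if author_id and author_id not in seen:
            seen.add(author_id)
            ordered.append(author_id)
    return ordered


if __name__ == "__main__":
    # Made-up records purely to exercise the function.
    sample = [{"author_id": "101"}, {"author_id": "202"}, {"author_id": "101"}, {}]
    print(unique_author_ids(sample))
```

Keeping first-seen order means the most recent posters stay at the top when the input is sorted newest first.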
Location Post Authors
If you primarily care about finding users in a particular location, you can search for Instagram users who geotag posts in a popular location you care about. You can import the Instagram Location Posts Formula to find all the recent posts for a list of locations you’re interested in, and then target those users to get their email addresses. The accompanying video explains more about Instagram location scraping.
User Posts Scraping
If you want to perform detailed analysis on a particular Instagram user (or set of users, including yourself), you’ll want to get all of their posts so you can analyze the captions, hashtags, likers & comments for each post. To do this, just import the Instagram Public User Posts Formula and enter the Instagram user IDs you want to scrape posts for. The accompanying video has more information.
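Once the captions are collected, one simple form of analysis is pulling out the hashtags and mentions each caption contains. The sketch below does this with regular expressions. The patterns are an approximation chosen for illustration (letters, digits, underscores, and periods for usernames), not Instagram's own parsing rules.

```python
import re
from collections import Counter
from typing import List, Tuple

# "#" followed by word characters (Unicode-aware in Python 3).
HASHTAG_RE = re.compile(r"#(\w+)")
# "@" not preceded by a word character or dot (so emails are skipped),
# followed by a name that does not end in a period.
MENTION_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_](?:[A-Za-z0-9_.]*[A-Za-z0-9_])?)")


def caption_tags(caption: str) -> Tuple[List[str], List[str]]:
    """Return (hashtags, mentions) from a caption, lowercased, without # or @."""
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(caption)]
    mentions = [name.lower() for name in MENTION_RE.findall(caption)]
    return hashtags, mentions


def hashtag_counts(captions: List[str]) -> Counter:
    """Count how often each hashtag appears across a list of captions."""
    counts: Counter = Counter()
    for caption in captions:
        counts.update(caption_tags(caption)[0])
    return counts


if __name__ == "__main__":
    # Made-up caption for demonstration.
    print(caption_tags("Thanks @some.user. Great #Sunset at #beach"))
```

Lowercasing before counting treats `#Sunset` and `#sunset` as the same tag, which is usually what you want for frequency analysis.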
Likes & Comments Scraping
Sometimes you need to scrape a particular post’s likers, comments & replies, paginating through thousands of likers & comments for very popular posts. Fortunately, this is easy to do as long as the post you’re scraping is public. You first need to generate a list of posts and get their shortcodes (you can get the shortcode from the public URL of the post). Once you have a target list of posts, you can follow these steps to get likers, comments & replies for all the posts in bulk.
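Getting the shortcode from a public post URL can be sketched as below. This assumes the common URL shape `https://www.instagram.com/p/<shortcode>/`. The set of accepted path prefixes (`p`, `reel`, `tv`) is an assumption you may need to adjust, and `shortcode_from_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlparse

# Path prefixes seen on public post URLs; treat this set as an assumption.
POST_PREFIXES = {"p", "reel", "tv"}


def shortcode_from_url(url: str) -> Optional[str]:
    """Return the shortcode from a full post URL, or None if it does not look like one.

    The URL must include a scheme (https://...), otherwise no host is parsed.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host != "instagram.com" and not host.endswith(".instagram.com"):
        return None
    # Drop empty segments produced by leading/trailing slashes.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) >= 2 and parts[0] in POST_PREFIXES:
        return parts[1]
    return None


if __name__ == "__main__":
    # The shortcode below is made up for illustration.
    print(shortcode_from_url("https://www.instagram.com/p/ABC123xyz_-/?utm_source=share"))
```

Query strings and trailing slashes are ignored, so URLs copied from a share sheet work without cleanup.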
To get the list of likers for a specific post (or list of posts), just import the Instagram Post Likers Formula and enter the posts whose likers you need; this shows how one could access the data. You can also learn more from the accompanying video.
Comments & Replies
To get the comments & replies for a particular Instagram post (or list of posts), just import the Instagram Post Comments Formula and enter the post shortcodes to get started. You can also learn more from the accompanying video.
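Paginating through thousands of likers or comments generally comes down to a loop that keeps requesting the next page until none is left. The sketch below shows that loop in isolation. It assumes a cursor-style scheme in which each page returns its items plus a cursor for the next page. The source does not describe how the unofficial API actually paginates, so `fetch_page` and `fake_fetch` are hypothetical stand-ins rather than real endpoints.

```python
from typing import Any, Callable, Iterator, List, Optional, Tuple

# A page fetcher takes a cursor (None for the first page) and returns
# (items, next_cursor); next_cursor is None when there are no more pages.
FetchPage = Callable[[Optional[str]], Tuple[List[Any], Optional[str]]]


def paginate(fetch_page: FetchPage, max_pages: int = 100) -> Iterator[Any]:
    """Yield items page by page until the cursor runs out or max_pages is reached."""
    cursor: Optional[str] = None
    for _ in range(max_pages):
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            break


def fake_fetch(cursor: Optional[str]) -> Tuple[List[Any], Optional[str]]:
    """In-memory stand-in so the loop can run without any network access."""
    pages = {None: (["c1", "c2"], "page2"), "page2": (["c3"], None)}
    return pages[cursor]


if __name__ == "__main__":
    print(list(paginate(fake_fetch)))
```

The `max_pages` cap is a safety net for very popular posts, so a single run cannot loop indefinitely.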
If you’re interested in getting data about stories for a particular Instagram account you have access to (or your own account), you can see Scraping Instagram Stories for more information. If you run this on your own account, you’ll also get back basic information on who viewed your story.
In addition to accessing basic information for each story item, you can also get back structured data including hashtags, locations, mentions & even advertisement URLs you would otherwise see in the Instagram app. The accompanying video has more information.
- Always use a proxy to protect your IP address.
- Do not scrape faster than a normal Instagram user would browse… per account.
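Both tips can be sketched with the Python standard library: a per-account throttle that enforces a minimum gap between requests, and a `urllib` opener that routes traffic through a proxy. The class and function names are hypothetical, and the right interval is left for you to choose, since the source gives no specific number.

```python
import time
import urllib.request
from typing import Callable, Dict


class AccountThrottle:
    """Enforce a minimum gap between requests made by the same account."""

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_request: Dict[str, float] = {}

    def wait(self, account: str) -> float:
        """Block until `account` may send again; return the seconds slept."""
        now = self._clock()
        last = self._last_request.get(account)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval_s - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last_request[account] = now + delay
        return delay


def proxied_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build a urllib opener that sends HTTP and HTTPS traffic through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

Calling `throttle.wait(account)` before each request keeps every account on its own schedule, so adding more accounts does not speed up any single one.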
Open Source Software
If you need additional functionality or want to learn more about the underlying unofficial Instagram API, please see these open source libraries, from which most of these endpoints were uncovered.
- Instagram Private API (Python)
- Instagram Scraper (Python)
- Instagram PHP Scraper (PHP)
- Instagram Java Scraper (Java)
Be warned: Some open source libraries may no longer work in 2020 after Instagram began cracking down on spam bots (a separate issue from scraping), so be sure to check the issues for each repository and make sure the library is still operational.