👥 Scraping Instagram Followers

April 8, 2020, 12:33 p.m.

Instagram allows users to follow each other, resulting in a small fraction of accounts having a large number of followers and acting as “hubs.” If you’re interested in social media marketing and targeting people in different niches, finding people who follow influential accounts is a great idea. You could go on the Instagram app, find the niche accounts whose followers you want, and then manually scroll through everyone while copying and pasting their usernames to build your... Read More »

🎶 Visualizing Spotify New Release Album Data

April 3, 2020, 12:06 p.m.

Spotify’s API allows anyone with a Spotify account to scrape data on newly released albums, returning standard data like the artists, track information, duration - and even a link to download a 30-second sample of each track. What’s even more interesting is that Spotify can also send back “audio features” about each track, offering insight into how its recommendation algorithm works. You can see the Spotify New Album Visualization, and we’ll walk through how to make it below. Read More »

🍿 Visualizing Netflix Catalog Data from Guidebox

March 20, 2020, 1:11 p.m.

This article will walk you through how to visualize data from the Guidebox Data API. You’ll want to import & run the Guidebox Movies Workflow Formula to generate the all_netflix_movies.csv file and the Guidebox Shows Workflow Formula to get the all_netflix_shows.csv file (specify netflix in the sources input collection). Data Visualization Let’s start with visualizing the movies - you’ll first want to read all Read More »

⚖️ Is Data Scraping Legal?

March 17, 2020, 10:24 p.m.

Disclaimer: I am not a lawyer - I just know a lot about data scraping after onboarding hundreds of clients onto Stevesie Data as the founder. The following are general guidelines I’ve seen in industry for what constitutes responsible data scraping practices. Nothing in this article guarantees that what you are doing is legal or illegal to any extent. This is not legal advice! Data Scraping in Industry If data scraping were illegal, we would not have Google. Search engines like Google, Bing &... Read More »

🎬 Mining Your YouTube Subscriber Data for Insight

Oct. 5, 2019, 12:47 a.m.

If you have a YouTube channel, like Stevesie Data, you may be interested in understanding your subscribers a little more so you can make better content in the future. You could call this “data-driven” content generation, and we’ll go over how to do this using the YouTube Data API. Get Your Subscriber List You’ll first want to get the list of your own subscribers, which you can do with the [YouTube Subscribers Formula](https://stevesie.com/cloud/workflows/formulas/youtube-subscriber Read More »

💌 Scraping Emails from Instagram

Sept. 4, 2019, 1:25 a.m.

Instagram allows its users to share their emails publicly, making it a goldmine if you’re looking to connect with new customers, understand more about your existing audience (via email analysis tools like ClearBit for lead enhancement), or for any other legal reason (do NOT spam people or you WILL get in trouble). You can easily get thousands of emails per day if you manually browse around on your phone all day using the Instagram app & copy-paste the email addresses. If this sounds too... Read More »

🗓 How to Scrape Airbnb Occupancy Rates in Any City

Aug. 2, 2019, 2:50 a.m.

If you’re looking to become an Airbnb host, you’re probably wondering how much you can expect to earn. One way of estimating this is to find the average occupancy rate for specific locations over the next year, accounting for seasons, special events, etc… While Airbnb makes all of this occupancy and pricing data available via its website & app, it doesn’t allow an easy way to aggregate and analyze all this data to get a pulse on the market. We’ll walk through how to get a day-by-day breakdown... Read More »
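As a rough illustration of the arithmetic behind an occupancy estimate, here is a minimal Python sketch. It assumes you already have a day-by-day calendar for a listing as a mapping from date to an "available" flag; that data shape, the function name `occupancy_rate`, and the sample values are all hypothetical. It also treats every unavailable day as booked, which overstates occupancy when hosts block dates for other reasons.

```python
from datetime import date


def occupancy_rate(calendar):
    """Return the fraction of days marked unavailable (treated here as booked)."""
    if not calendar:
        raise ValueError("calendar must contain at least one day")
    booked = sum(1 for available in calendar.values() if not available)
    return booked / len(calendar)


# Made-up sample data: date -> whether the listing is still available that day.
sample_calendar = {
    date(2019, 8, 1): False,
    date(2019, 8, 2): False,
    date(2019, 8, 3): True,
    date(2019, 8, 4): False,
}

print(f"{occupancy_rate(sample_calendar):.0%}")  # prints 75%
```

Averaging this value across many listings in one location gives a simple market-level estimate.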

🖼 Instagram Posts by Hashtag, Location & Date

Aug. 2, 2019, 2:02 a.m.

Let’s say you want to collect Instagram posts containing a given hashtag, posted in a given location on a given date or time range (or some combination of these 3 requirements). Unfortunately, Instagram doesn’t offer this type of “advanced search” in its app, but with some clever collecting of public data from the Unofficial Instagram API, we may be able to get pretty close. Of the 3 criteria (hashtag, location & time), it’s easiest to query Instagram by the hashtag we’re interested in and... Read More »
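One possible way to combine the three criteria, sketched in Python: collect posts for a hashtag first, then filter them locally by location and time. This is an assumption about the approach, not the article's confirmed method. The record fields (`id`, `location`, `taken_at`) and the function name `filter_posts` are hypothetical and do not reflect the actual API response format.

```python
from datetime import datetime, timezone


def filter_posts(posts, location_name=None, start=None, end=None):
    """Keep posts matching the optional location and the [start, end) time range."""
    matches = []
    for post in posts:
        if location_name is not None and post.get("location") != location_name:
            continue
        taken_at = post["taken_at"]
        if start is not None and taken_at < start:
            continue
        if end is not None and taken_at >= end:
            continue
        matches.append(post)
    return matches


# Made-up records standing in for posts already collected for one hashtag.
hashtag_posts = [
    {"id": "a", "location": "Brooklyn", "taken_at": datetime(2019, 8, 1, 12, tzinfo=timezone.utc)},
    {"id": "b", "location": "Queens", "taken_at": datetime(2019, 8, 1, 13, tzinfo=timezone.utc)},
    {"id": "c", "location": "Brooklyn", "taken_at": datetime(2019, 8, 3, 9, tzinfo=timezone.utc)},
]

selected = filter_posts(
    hashtag_posts,
    location_name="Brooklyn",
    start=datetime(2019, 8, 1, tzinfo=timezone.utc),
    end=datetime(2019, 8, 2, tzinfo=timezone.utc),
)
print([post["id"] for post in selected])  # prints ['a']
```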

🍽 Scrape GrubHub Restaurants & Menu Items from Different Cities

Aug. 1, 2019, 9:09 p.m.

If you’re looking to collect data from GrubHub for market research (maybe you’re looking to join the ‘Hub with a new restaurant!) you may want to get some overall market restaurant and pricing data from a few cities. Let’s walk through how one could do this, documenting some of the underlying API endpoints that GrubHub publicly uses (with the structured data you want) while using its website or app! First Page of Results The simplest place to start is with the GrubHub search endpoint which... Read More »

🔑 How to Get Your YouTube OAuth Access Token

July 30, 2019, 12:49 a.m.

YouTube API endpoints that return sensitive data (that only you’re entitled to see) require stricter access than simple API Tokens, and instead use OAuth. For example, to get back the list of people who subscribe to you, you’ll need to use the mySubscribers parameter in the Subscriptions Endpoint, which only works with OAuth. OAuth Playground If you’re only interested in accessing your own data, you can use the Google OAuth Playground to easily get credentials Read More »
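A minimal sketch of what an OAuth-authenticated call with the mySubscribers parameter might look like, using only the Python standard library. It assumes the YouTube Data API v3 `subscriptions` endpoint with `part=subscriberSnippet`; the helper name `build_my_subscribers_request` and the placeholder token are hypothetical. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL for the YouTube Data API v3 Subscriptions endpoint.
SUBSCRIPTIONS_URL = "https://www.googleapis.com/youtube/v3/subscriptions"


def build_my_subscribers_request(access_token, max_results=50):
    """Build (but do not send) a request for the caller's own subscribers."""
    query = urlencode({
        "part": "subscriberSnippet",
        "mySubscribers": "true",
        "maxResults": max_results,
    })
    # OAuth access tokens are passed as a Bearer token in the Authorization header.
    return Request(
        f"{SUBSCRIPTIONS_URL}?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


request = build_my_subscribers_request("YOUR_OAUTH_ACCESS_TOKEN")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call once a real access token is substituted.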

🔑 How to Get Your YouTube API Token

July 30, 2019, 12:27 a.m.

You’ll need an API Key to use the YouTube API. All you need to do is head to the credentials section in the Google Console for the YouTube API. Note: Be sure to enable YouTube on your account or you may get an access not configured error Existing Key If you already have an API Key, you’ll see it below in the interface under Key in the UI: New Key If you don’t have an API Key, simply visit your [Google Cloud Credentials](https://conso Read More »
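Once you have a key, it is typically passed as the `key` query parameter on each request. The Python sketch below builds such a URL, assuming the YouTube Data API v3 `videos` endpoint; the helper name `build_video_lookup_url`, the placeholder key, and the placeholder video ID are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for the YouTube Data API v3 Videos endpoint.
VIDEOS_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_video_lookup_url(api_key, video_id):
    """Return a URL that looks up one video's snippet using an API key."""
    params = {"part": "snippet", "id": video_id, "key": api_key}
    return f"{VIDEOS_URL}?{urlencode(params)}"


print(build_video_lookup_url("YOUR_API_KEY", "VIDEO_ID"))
```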

🔎 How to Find Your Instagram Session ID

July 17, 2019, 8:24 p.m.

Most Unofficial Instagram API Endpoints require a valid Session ID from an Instagram account you have access to. This is to ensure that Instagram only shows its data to logged-in users of the Instagram platform (even though anyone can create an account, therefore most likely making the data it would show to an anonymous but registered Instagram user part of the public domain). Very Important Any attempt to use your Session ID outside of an official Instagram client may be against the Instagram te Read More »

🔎 Scrape UPCs & Product Details from Home Depot SKUs

July 16, 2019, 2:42 p.m.

In the wonderful world of retail there are several identifiers for products - ranging from UPCs (Universal Product Codes) to SKUs (Stock Keeping Units). When you’re buying or selling a product, you typically have one of these pieces of information, but not the other. We’ll walk through the case where you have a list of SKUs for Home Depot products and need to convert that to a list of UPCs (perhaps to use with some third party software that only works with UPCs). We’ll talk about how one... Read More »

🏡 Scraping ALL Airbnb Listings from a City

July 11, 2019, 1:29 p.m.

If you’re looking to collect ALL the Airbnb listings for a location, not just the first 300 that Airbnb returns by default, you’ve found the right article. If you go to Airbnb’s Official Website and search for listings, you’ll notice that you can only paginate up to about 300 before you get cut off. If you’re searching within a large city, you know there are way more than 300 listings! So how do we get them all? The trick here is to note that Airbnb returns 300 listings per search and offers... Read More »

🤖 Machine Learning Image Popularity - Predict Image Success

Feb. 6, 2019, 5:48 a.m.

What makes people click, like & share online images? Is it the colors, composition, contrast, tones or something else? In this post, we’ll walk through developing an algorithm to predict whether or not an image is popular on GrubHub with 65% accuracy. Part 1 - Getting Training Image Data In this exercise, we’ll keep things simple and focus on predicting whether or not an image’s click-through rate will exceed a certain percentage. Ideally, we’d like to train our system on images with a... Read More »

💰 Home Depot vs. Lowe's Price Comparison

Jan. 30, 2019, 3:15 a.m.

You’re looking to buy something? Who’s got the lower price: Home Depot, or Lowe’s? Of course it’s not that simple, since each store sells thousands of different products at different prices. Some products will be cheaper at Home Depot, and other products will be cheaper at Lowe’s. If you’re shopping in their physical stores, it will also depend on where you are in the country. However, if you’re interested in a specific type of product or brand, it makes sense to do a little bit of research... Read More »

🌍 Visualize Worldwide Airbnb Listings

Nov. 6, 2018, 1:04 a.m.

With so much Airbnb data available, it can be hard to get a grasp on the trends and demands in your area. Here, we’ll explore ways to visualize Airbnb listings using Python and popular visualization libraries. Check out the Airbnb Listings endpoint, where you can retrieve structured data for listings around the world. You can enter a city or just use the default settings to get listings from all over the world. If you execute the endpoint, you’ll see an option to download the Expanded CSV... Read More »

🔥 Reveal Tinder Photo Success Rates

Oct. 27, 2018, 11:09 a.m.

When Tinder released Smart Photos, a feature that tests your photos to increase matches, they also released an API endpoint with some very interesting data, notably a “Success Rate” for each of your photos - also returning this data in the response from the Tinder profile endpoint. You’ll notice it sends back a successRate for some (or all) of your photos if you have the feature enabled. Exactly what successRate represents is a mystery, but one would guess Read More »
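To make the successRate field concrete, here is a small Python sketch that ranks photos by it. Only the `successRate` field name comes from the text above; the surrounding record shape, the `id` field, the sample values, and the function name `rank_photos_by_success` are hypothetical. Photos without a successRate are skipped, since the value may be returned for only some of them.

```python
def rank_photos_by_success(photos):
    """Return photos that have a successRate, highest first."""
    scored = [photo for photo in photos if photo.get("successRate") is not None]
    return sorted(scored, key=lambda photo: photo["successRate"], reverse=True)


# Made-up sample standing in for the photos list of a profile response.
sample_photos = [
    {"id": "photo-1", "successRate": 0.12},
    {"id": "photo-2"},  # no successRate returned for this photo
    {"id": "photo-3", "successRate": 0.31},
]

for photo in rank_photos_by_success(sample_photos):
    print(photo["id"], photo["successRate"])
```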

🤳 Instagram Stories Scraping

Oct. 27, 2018, 11:09 a.m.

Stevesie’s Unofficial Instagram API Endpoints show endpoints that mimic how Instagram fetches stories from within the Instagram app. To access this content and view your own stories (or see data about stories from other public accounts), check out this video with some background on Instagram stories scraping: 1. Get the User ID To get the current stories of a user you’re interested in (including yourself), you’ll need to know their User ID (this is different from someone’s username). If you kno Read More »