Blog


New Twitter Data API V2 - Scrape 10,000,000 Tweets per Month

May 1, 2021, 11:48 a.m.

Twitter is working on an absolutely amazing & delightful update to its official API in the form of Twitter API V2. The new API is more streamlined and combines the previous APIs into a uniform one. They have also changed how they return data, returning fully hydrated related objects (such as users) adjacent to the objects you’re querying. E.g. if you’re searching for Tweets with #beer, the new API will not only return the Tweets, but also the users who posted these Tweets as fully... Read More »
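As a rough illustration of the "adjacent objects" idea, the sketch below joins Tweets to their authors. The payload is hand-written sample data, not a real response. It assumes the v2 layout in which Tweets sit under `data` with an `author_id`, and the expanded users sit under `includes.users`. The helper name `attach_authors` is hypothetical.

```python
# Hand-written sample shaped like a v2 search response requested with
# expansions=author_id; every value here is invented for illustration.
sample_response = {
    "data": [
        {"id": "1", "author_id": "10", "text": "Friday #beer"},
        {"id": "2", "author_id": "11", "text": "Trying a new stout #beer"},
    ],
    "includes": {
        "users": [
            {"id": "10", "username": "hop_fan", "name": "Hop Fan"},
            {"id": "11", "username": "stout_lover", "name": "Stout Lover"},
        ]
    },
}


def attach_authors(response):
    """Return each Tweet with its author object merged in under 'author'."""
    # Index the expanded users once so each Tweet lookup is a dict access.
    users = response.get("includes", {}).get("users", [])
    users_by_id = {user["id"]: user for user in users}
    return [
        {**tweet, "author": users_by_id.get(tweet.get("author_id"))}
        for tweet in response.get("data", [])
    ]


tweets = attach_authors(sample_response)
for tweet in tweets:
    print(tweet["author"]["username"], "-", tweet["text"])
```

Indexing the users by `id` first means you don't have to scan the user list for every Tweet, which matters once a page holds many Tweets.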

🎶 Visualizing Spotify New Release Album Data

April 3, 2020, 12:06 p.m.

Spotify’s API allows anyone with a Spotify account to scrape data on newly released albums, returning standard data like the artists, track information, duration - and even a link to download a 30-second sample of each track. What’s even more interesting is that Spotify can also send back “audio features” about each track, offering insight into how its recommendation algorithm works. You can see the Spotify New Album Visualization, and we’ll walk through how to make this below. Read More »

🍿 Visualizing Netflix Catalog Data from Guidebox

March 20, 2020, 1:11 p.m.

This article will walk you through how to visualize data from the Guidebox Data API. You’ll want to import & run the Guidebox Movies Workflow Formula to generate the all_netflix_movies.csv file and the Guidebox Shows Workflow Formula to get the all_netflix_shows.csv file (specify netflix in the sources input collection). Data Visualization Let’s start with visualizing the movies - you’ll first want to read all Read More »

⚖️ Is Data Scraping Legal?

March 17, 2020, 10:24 p.m.

Disclaimer: I am not a lawyer - I just know a lot about data scraping after onboarding hundreds of clients onto Stevesie Data as the founder. The following are general guidelines I’ve seen in industry for what constitutes responsible data scraping practices. Nothing in this article guarantees that what you are doing is legal or illegal to any extent. This is not legal advice! Data Scraping in Industry If data scraping were illegal, we would not have Google. Search engines like Google, Bing &... Read More »

🎬 Mining Your YouTube Subscriber Data for Insight

Oct. 5, 2019, 12:47 a.m.

If you have a YouTube channel, like Stevesie Data, you may be interested in understanding your subscribers a little more so you can make better content in the future. You could call this “data-driven” content generation, and we’ll go over how to do this using the YouTube Data API. Get Your Subscriber List You’ll first want to get the list of your own subscribers, which you can do with the [YouTube Subscribers Formula](https://stevesie.com/cloud/workflows/formulas/youtube-subscriber Read More »

🗓 How to Scrape Airbnb Occupancy Rates in Any City

Aug. 2, 2019, 2:50 a.m.

If you’re looking to become an Airbnb host, you’re probably wondering how much you can expect to earn. One way of estimating this is to find the average occupancy rate for specific locations over the next year, accounting for seasons, special events, etc… While Airbnb makes all of this occupancy and pricing data available via its website & app, it doesn’t allow an easy way to aggregate and analyze all this data to get a pulse on the market. We’ll walk through how to get a day-by-day breakdown... Read More »
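To make the occupancy estimate concrete, here is a minimal sketch that turns day-by-day availability into a monthly occupancy rate. The record layout and the function name `occupancy_by_month` are hypothetical. Counting every unavailable day as booked is also an assumption, since hosts can block days for other reasons.

```python
from collections import defaultdict
from datetime import date

# Hypothetical day-by-day records: (listing_id, calendar day, available?).
calendar_days = [
    ("a", date(2019, 8, 1), False),
    ("a", date(2019, 8, 2), True),
    ("b", date(2019, 8, 1), False),
    ("b", date(2019, 8, 2), False),
    ("a", date(2019, 9, 1), True),
    ("b", date(2019, 9, 1), False),
]


def occupancy_by_month(days):
    """Return {(year, month): share of listing-days that were unavailable}."""
    booked = defaultdict(int)
    total = defaultdict(int)
    for _listing_id, day, available in days:
        key = (day.year, day.month)
        total[key] += 1
        if not available:
            # Assumption: an unavailable day counts as a booked day.
            booked[key] += 1
    return {key: booked[key] / total[key] for key in total}


rates = occupancy_by_month(calendar_days)
for (year, month), rate in sorted(rates.items()):
    print(f"{year}-{month:02d}: {rate:.0%}")
```

Grouping by month is one simple way to surface seasonality; the same loop could group by week or by a special-event date range instead.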

🍽 Scrape GrubHub Restaurants & Menu Items from Different Cities

Aug. 1, 2019, 9:09 p.m.

If you’re looking to collect data from GrubHub for market research (maybe you’re looking to join the ‘Hub with a new restaurant!) you may want to get some overall market restaurant and pricing data from a few cities. Let’s walk through how one could do this, documenting some of the underlying API endpoints that GrubHub publicly uses (with the structured data you want) while using its website or app! First Page of Results The simplest place to start is with the GrubHub search endpoint which... Read More »

🔑 How to Get Your YouTube OAuth Access Token

July 30, 2019, 12:49 a.m.

YouTube API endpoints that return sensitive data (that only you’re entitled to see) require stricter access than simple API Tokens, and instead use OAuth. For example, to get the list of people who subscribe to you, you’ll need to use the mySubscribers parameter in the Subscriptions Endpoint, which only works with OAuth. OAuth Playground If you’re only interested in accessing your own data, you can use the Google OAuth Playground to easily get credentials Read More »
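For a sense of what such an OAuth call looks like, the sketch below builds (but does not send) a request to the Subscriptions endpoint with `mySubscribers` set. The function name `build_my_subscribers_request` is hypothetical and the token is a placeholder you would replace with your own access token.

```python
from urllib.parse import urlencode
from urllib.request import Request

SUBSCRIPTIONS_URL = "https://www.googleapis.com/youtube/v3/subscriptions"


def build_my_subscribers_request(access_token, max_results=50):
    """Build (but do not send) a request listing the caller's own subscribers."""
    params = {
        "part": "subscriberSnippet",
        "mySubscribers": "true",
        "maxResults": max_results,
    }
    # OAuth access tokens go in the Authorization header as a Bearer token,
    # unlike simple API keys, which are passed as a query parameter.
    return Request(
        f"{SUBSCRIPTIONS_URL}?{urlencode(params)}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


request = build_my_subscribers_request("YOUR_OAUTH_ACCESS_TOKEN")  # placeholder
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call. The request is only constructed here so the example runs without credentials.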

🔑 How to Get Your YouTube API Token

July 30, 2019, 12:27 a.m.

You’ll need an API Key to use the YouTube API. All you need to do is head to the credentials section in the Google Console for the YouTube API. Note: Be sure to enable YouTube on your account or you may get an access not configured error. Existing Key If you already have an API Key, you’ll see it in the interface under Key: New Key If you don’t have an API Key, simply visit your [Google Cloud Credentials](https://conso Read More »

🔎 Scrape UPCs & Product Details from Home Depot SKUs

July 16, 2019, 2:42 p.m.

In the wonderful world of retail there are several identifiers for products - ranging from UPCs (Universal Product Codes) to SKUs (Stock Keeping Units). When you’re buying or selling a product, you typically have one of these pieces of information, but not the other. We’ll walk through the case where you have a list of SKUs for Home Depot products and need to convert that to a list of UPCs (perhaps to use with some third party software that only works with UPCs). We’ll talk about how one... Read More »

🏡 Scraping ALL Airbnb Listings from a City

July 11, 2019, 1:29 p.m.

If you’re looking to collect ALL the Airbnb listings for a location, not just the first 300 that Airbnb returns by default, you’ve found the right article. If you go to Airbnb’s Official Website and search for listings, you’ll notice that you can only paginate up to about 300 before you get cut off. If you’re searching within a large city, you know there are way more than 300 listings! So how do we get them all? The trick here is to note that Airbnb returns 300 listings per search and offers... Read More »

🤖 Machine Learning Image Popularity - Predict Image Success

Feb. 6, 2019, 5:48 a.m.

What makes people click, like & share online images? Is it the colors, composition, contrast, tones or something else? In this post, we’ll walk through developing an algorithm to predict whether or not an image is popular on GrubHub with 65% accuracy. Part 1 - Getting Training Image Data In this exercise, we’ll keep things simple and focus on predicting whether an image’s click-through rate will exceed a certain percent. Ideally, we’d like to train our system on images with a... Read More »

🌍 Visualize Worldwide Airbnb Listings

Nov. 6, 2018, 1:04 a.m.

With so much Airbnb data available, it can be hard to get a grasp on the trends and demands in your area. Here, we’ll explore ways to visualize Airbnb listings using Python and popular visualization libraries. Check out the Airbnb Listings endpoint, where you can retrieve structured data for listings around the world. You can enter a city or just use the default settings to get listings from all over the world. If you execute the endpoint, you’ll see an option to download the Expanded CSV... Read More »

🔥 Reveal Tinder Photo Success Rates

Oct. 27, 2018, 11:09 a.m.

When Tinder released Smart Photos, a feature that tests your photos to increase matches, they also released an API endpoint with some very interesting data, notably a “Success Rate” for each of your photos, which is also returned in the response from the Tinder profile endpoint. You’ll notice it sends back a successRate for some (or all) of your photos if you have the feature enabled. Exactly what successRate represents is a mystery, but one would guess Read More »