Scraping YouTube Data
YouTube offers an official Data API that lets you interact with the app programmatically, including searching for public content and retrieving public information about the channels and videos you’re interested in.
To use the official API, you’ll need a Google account and must register for API access. Don’t worry, it’s pretty simple and Google provides tutorials.
Each YouTube user can have multiple channels. Each channel then has one or more playlists. Each playlist then has a collection of videos.
- Username -> Channel IDs - Use the
- Channel ID -> Playlists - Use the
- Playlist -> Playlist Items - Use the playlistId filter. Note that a playlist item will have a Video ID that can be used to get comments and other information about the video.
- Video ID -> Comment Threads - Use the
- Comment Thread -> Comments - Use the
Most (if not all) of the endpoints have a part parameter, which specifies what types of data you want the YouTube API to return. It’s a comma-separated list, and the more data you’d like back from YouTube, the more “credits” they will charge to your account.
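To make the hierarchy and the part parameter a bit more concrete, here is a minimal sketch of how the request URLs could be assembled. It assumes the public v3 REST base URL and the standard resource names and filter parameters (channels/forUsername, playlists/channelId, playlistItems/playlistId, commentThreads/videoId, comments/parentId). The names STEPS and build_url, and the YOUR_API_KEY placeholder, are made up for illustration and are not part of the API.

```python
from urllib.parse import urlencode

# Assumption: the public base URL of the YouTube Data API v3.
API_BASE = "https://www.googleapis.com/youtube/v3"

# Each step of the hierarchy mapped to (resource, filter parameter).
# The dictionary keys are illustrative labels, not API names.
STEPS = {
    "username_to_channels": ("channels", "forUsername"),
    "channel_to_playlists": ("playlists", "channelId"),
    "playlist_to_items": ("playlistItems", "playlistId"),
    "video_to_comment_threads": ("commentThreads", "videoId"),
    "thread_to_comments": ("comments", "parentId"),
}


def build_url(step, value, parts, api_key):
    """Build a request URL for one step, joining `parts` with commas."""
    resource, filter_name = STEPS[step]
    query = urlencode({
        "part": ",".join(parts),  # comma-separated list of data types
        filter_name: value,
        "key": api_key,
    })
    return f"{API_BASE}/{resource}?{query}"


url = build_url(
    "playlist_to_items",
    "UUArmutk8nAbYQdaYzgqKOwA",
    ["snippet", "contentDetails"],
    "YOUR_API_KEY",  # placeholder, replace with your own key
)
print(url)
```

Asking for fewer parts keeps responses smaller, so it is usually worth listing only the ones you actually plan to use.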
Channel Videos Example
You may have a specific target channel in mind you’d like to get videos for.
Get the Username
You’ll first need to get the username, which may be different from the URL name.
E.g., if I go to https://www.youtube.com/stevesiedata, the URL key is stevesiedata; however, this is not the username!
To get the username of the channel, click on something like the Videos tab and you’ll notice the URL change to something like https://www.youtube.com/user/StevesieLLC/featured, which reveals the username! In this case, the username is stevesiellc and NOT stevesiedata.
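If you have a handful of channel URLs to process, pulling the username out by hand gets tedious. The sketch below shows one possible way to do it, assuming the URL follows the /user/<name>/... pattern shown above. The helper name username_from_url is hypothetical, and lowercasing the result simply mirrors the example in this article rather than any documented rule.

```python
from urllib.parse import urlparse


def username_from_url(url):
    """Return the lowercased username from a /user/<name>/... URL, or None."""
    # Split the path into non-empty segments, e.g. ["user", "StevesieLLC", "featured"].
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) >= 2 and segments[0] == "user":
        return segments[1].lower()
    # URLs without a /user/ prefix do not expose the username directly.
    return None


print(username_from_url("https://www.youtube.com/user/StevesieLLC/featured"))
print(username_from_url("https://www.youtube.com/stevesiedata"))
```

The second call returns None because the short URL only contains the URL key, not the username.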
Get the Channel Info
Now we can use the User Channels integration to look up the channel info from the username:
We can see in the response, that the channel ID is
We can also see the Playlist ID for all of the channel’s uploads:
We’ll use this playlist ID, UUArmutk8nAbYQdaYzgqKOwA, to fetch all the videos for the channel.
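If you are working with the raw JSON rather than reading it by eye, the two values can be pulled out with a few lines of code. This is a minimal sketch that assumes the channel lookup was requested with the contentDetails part, in which case the uploads playlist is found under contentDetails.relatedPlaylists.uploads. The function name channel_summary and the sample response (including the CHANNEL_ID_PLACEHOLDER value) are invented for illustration.

```python
def channel_summary(response):
    """Extract the channel ID and uploads playlist ID from a channels response."""
    items = response.get("items", [])
    if not items:
        return None  # no channel matched the username
    channel = items[0]
    related = channel.get("contentDetails", {}).get("relatedPlaylists", {})
    return {
        "channel_id": channel.get("id"),
        "uploads_playlist_id": related.get("uploads"),
    }


# Hypothetical, trimmed-down response; the channel ID is a placeholder.
sample_response = {
    "items": [
        {
            "id": "CHANNEL_ID_PLACEHOLDER",
            "contentDetails": {
                "relatedPlaylists": {"uploads": "UUArmutk8nAbYQdaYzgqKOwA"}
            },
        }
    ]
}

summary = channel_summary(sample_response)
print(summary["uploads_playlist_id"])
```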
Get the Playlist Videos
Now that we know the playlist ID, we just need to use the Playlist Videos integration and enter the playlist ID:
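For anyone calling the official API directly instead, a channel with many uploads will come back in pages, so the request has to be repeated with the nextPageToken from each response. Below is a rough sketch of that loop, assuming the v3 playlistItems endpoint with its playlistId, maxResults and pageToken parameters. The function names, the get_json hook (there so the loop can be tried without network access), and the API key argument are all illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public v3 endpoint for playlist items.
PLAYLIST_ITEMS_URL = "https://www.googleapis.com/youtube/v3/playlistItems"


def http_get_json(url):
    """Fetch a URL and decode the JSON body (performs a real network call)."""
    with urlopen(url) as resp:
        return json.load(resp)


def fetch_playlist_items(playlist_id, api_key, get_json=http_get_json):
    """Collect every item in a playlist by following nextPageToken."""
    items = []
    page_token = None
    while True:
        params = {
            "part": "snippet,contentDetails",
            "playlistId": playlist_id,
            "maxResults": 50,  # largest page size the endpoint accepts
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        page = get_json(f"{PLAYLIST_ITEMS_URL}?{urlencode(params)}")
        items.extend(page.get("items", []))
        page_token = page.get("nextPageToken")
        if not page_token:
            return items  # last page reached
```

A call would look like fetch_playlist_items("UUArmutk8nAbYQdaYzgqKOwA", "YOUR_API_KEY"), where the key is your own.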
We can now download this response in CSV or JSON format for further analysis.
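If you end up with the raw JSON items and want a CSV yourself, the conversion is short. This sketch assumes each playlist item carries contentDetails.videoId, snippet.title and snippet.publishedAt, which are the fields returned when both the snippet and contentDetails parts are requested. The column names, the function name and the sample item are made up for illustration.

```python
import csv
import io


def playlist_items_to_csv(items):
    """Flatten playlist items into CSV text with one row per video."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["video_id", "title", "published_at"]
    )
    writer.writeheader()
    for item in items:
        snippet = item.get("snippet", {})
        writer.writerow({
            "video_id": item.get("contentDetails", {}).get("videoId", ""),
            "title": snippet.get("title", ""),
            "published_at": snippet.get("publishedAt", ""),
        })
    return buffer.getvalue()


# Hypothetical item, trimmed to the fields used above.
sample_items = [
    {
        "snippet": {"title": "Hello", "publishedAt": "2020-01-01T00:00:00Z"},
        "contentDetails": {"videoId": "abc123"},
    }
]

csv_text = playlist_items_to_csv(sample_items)
print(csv_text)
```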