Spotify API New Releases Scraper

Download Data to Excel & CSV Files

Written by Steve Spagnola
Last updated April 6, 2022

Visualizing Spotify New Release Album Data

Spotify’s API allows anyone with a Spotify account to scrape data on newly released albums, returning standard data like the artists, track information and duration, and even a link to download a 30-second sample of each track. Even more interesting, Spotify can also send back “audio features” about each track, offering insight into how its recommendation algorithm works.

Spotify New Albums

You can see the Spotify New Album Visualization, and we’ll walk through how to make it below.

Getting Spotify Data

Our data will come from the official Spotify API, which any Spotify user can access for free. To get CSV data back from the API, though, we need a client. For this, we’ll use the Spotify Wrapper to aggregate the data from Spotify on our behalf.

New Release Albums

The first step is to fetch the new releases from the Spotify API with the Spotify New Releases Formula. Just import the formula into your Stevesie account and hit execute. You’ll get back the 100 most recent Spotify releases in CSV format.
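For readers curious what the equivalent requests might look like against the API directly, here is a minimal sketch that only builds the paginated request URLs. It assumes Spotify's `browse/new-releases` endpoint with `limit` and `offset` query parameters and a cap of 50 albums per request. The helper name `new_release_urls` is hypothetical, and the article does not say whether the formula pages through results this way internally.

```python
from urllib.parse import urlencode

# Assumed endpoint for Spotify's "new releases" listing.
NEW_RELEASES_URL = 'https://api.spotify.com/v1/browse/new-releases'
PAGE_SIZE = 50  # assumed maximum number of albums per request


def new_release_urls(total=100, page_size=PAGE_SIZE):
    """Return the request URLs needed to page through `total` albums."""
    urls = []
    for offset in range(0, total, page_size):
        # the last page may be smaller than a full page
        limit = min(page_size, total - offset)
        query = urlencode({'limit': limit, 'offset': offset})
        urls.append('{}?{}'.format(NEW_RELEASES_URL, query))
    return urls


for url in new_release_urls():
    print(url)
```

Under these assumptions, 100 releases take two requests. Each real request would also need an access token, which this sketch leaves out.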

Album Tracks

Once we have the albums in CSV format, we’ll want the individual tracks from each of those albums so we can visualize the track audio features. To do this, open up the CSV for the new albums, find the albums.items.id column, and copy all the values in that column.
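If you would rather not copy the column by hand, the same values can be pulled out with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `column_values` and the sample rows are made up for illustration, and only the `albums.items.id` column name comes from the CSV described above.

```python
import csv
import io


def column_values(csv_file, column):
    """Return the non-empty, de-duplicated values of one CSV column, in order."""
    seen = set()
    values = []
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or '').strip()
        if value and value not in seen:
            seen.add(value)
            values.append(value)
    return values


# Tiny made-up sample standing in for the new-releases CSV.
sample = io.StringIO(
    'albums.items.id,albums.items.name\n'
    'abc123,First Album\n'
    'def456,Second Album\n'
    'abc123,First Album\n'
)
album_ids = column_values(sample, 'albums.items.id')
print('\n'.join(album_ids))
```

With the real file, pass `open(path, newline='')` instead of the `io.StringIO` sample and paste the printed IDs wherever they are needed.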

Now we can use the Album Tracks Formula to go from our list of album IDs to the combined track information for all albums. Just import the formula and paste in your albums.items.id values from the last step, then execute the workflow. Save the output file in a safe place and name it something like Album_Tracks.csv, as we’ll use this data directly in our visualization step.

Audio Features

The last thing we’ll need is to look up the track audio features for each track in our new releases. We can use the Track Audio Features Formula to get the audio features one-by-one for each track in the Album_Tracks.csv file.

Simply copy the values in the tracks.items.id column from the tracks output, import the Audio Features Formula and paste the track IDs into the formula. Execute the workflow and you’ll then have an Audio_Features.csv file that will be the same length as your Album_Tracks.csv file (each row containing a track). We’ll now visualize this data!
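Before moving on, it can be worth confirming that the two files really do line up. The merge in the next section uses `pd.merge` with its default inner join, so any track without a matching audio features row would simply drop out. The sketch below compares two lists of IDs; the helper name `compare_ids` and the sample IDs are hypothetical.

```python
def compare_ids(track_ids, audio_ids):
    """Return (missing, unexpected): track IDs with no audio features row,
    and audio-feature IDs that were never in the track list."""
    track_set, audio_set = set(track_ids), set(audio_ids)
    missing = sorted(track_set - audio_set)
    unexpected = sorted(audio_set - track_set)
    return missing, unexpected


# Made-up IDs for illustration only.
track_ids = ['t1', 't2', 't3']
audio_ids = ['t1', 't3']
missing, unexpected = compare_ids(track_ids, audio_ids)
print('missing audio features for:', missing)
print('unexpected audio rows:', unexpected)
```

With real data, the two lists would come from the tracks.items.id column of Album_Tracks.csv and the input.track_id column of Audio_Features.csv.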

Visualizing in Python

We’ll use Python, Pandas & Matplotlib to visualize this data and make the video linked to above. The first step is to load everything into a dataframe (from our 2 data source files):

import pandas as pd

TRACKS_FILEPATH = '~/Desktop/Album_Tracks.csv'
AUDIO_FILEPATH = '~/Desktop/Audio_Features.csv'

tracks = pd.read_csv(TRACKS_FILEPATH)
audio = pd.read_csv(AUDIO_FILEPATH)

# combine both dataframes into a single one, matching by track ID
track_infos = pd.merge(
    left=tracks,
    right=audio,
    left_on='tracks.items.id',
    right_on='input.track_id',
)

# only look at albums, not singles nor compilations
# (.copy() avoids a SettingWithCopyWarning when we add a column below)
track_infos = track_infos[track_infos['album_type'] == 'album'].copy()

# order by release date
track_infos['_date'] = pd.to_datetime(track_infos['release_date'])
track_infos.sort_values(by=['_date'], inplace=True)

Now we have our combined track and album data in a single dataframe, track_infos. We can take a quick peek at the data first and do some experiments to get a feel for the distributions and what will be interesting to visualize.

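# average each artist's numeric audio features, then compare two of them
# (numeric_only=True skips text columns, which newer pandas versions refuse to average)
track_infos.groupby('tracks.items.artists[0].name').mean(numeric_only=True).plot.scatter(
    x='danceability',
    y='tempo',
)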

Basic Album Visualization

Now to build the video, we need to get a little more advanced. We’re going to group all the tracks by their release date and artist name (grouping on the date first lets Pandas sort the groups by release date for us). Then we’ll iterate through each group (which will be an album) and plot all the tracks in that group by their audio features.
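To make the grouping step concrete, here is a rough pure-Python analogue of what the group-by does. It is not how Pandas is implemented, and the rows are made up. The point is that sorting on a (date, artist) key first makes the groups come out in chronological order, which matches the default sorted behaviour of a Pandas group-by.

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows: (release date, artist name, track name).
rows = [
    ('2021-03-05', 'Artist B', 'Track 1'),
    ('2021-03-01', 'Artist A', 'Intro'),
    ('2021-03-05', 'Artist B', 'Track 2'),
    ('2021-03-01', 'Artist A', 'Outro'),
]

group_key = itemgetter(0, 1)  # (date, artist)

# Sorting first is what makes the groups come out in chronological order
# (sorted() is stable, so tracks keep their original order within a group).
grouped = [
    (key, [track for _, _, track in group])
    for key, group in groupby(sorted(rows, key=group_key), key=group_key)
]

for (date, artist), tracks in grouped:
    print(date, artist, tracks)
```

One consequence of this key is that two albums released by the same artist on the same day would land in a single group.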

To produce the video, we’ll make each frame individually and save to our hard drive, making the actual video in the next step.

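import os

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib

plt.style.use('dark_background')

# make sure the output folder for the frames exists
os.makedirs('/tmp/spotify', exist_ok=True)

i = 0  # running frame number, used for the PNG filenames
continuous_angle = 0  # camera angle carried over from one album to the next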

track_info_groups = track_infos.groupby(['_date', 'tracks.items.artists[0].name'])
for group_name, artist_track_infos in track_info_groups:
    artist_name = group_name[1]

    # start a fresh 10x10 inch figure with a 3D axes for this album
    # (fig.gca(projection='3d') no longer works on recent Matplotlib releases)
    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111, projection='3d')

    scat = ax.scatter(
        artist_track_infos['danceability'],
        artist_track_infos['energy'],
        artist_track_infos['valence'],
        c=artist_track_infos['tempo'],
        s=artist_track_infos['duration_ms'] / 250.0
    )

    for track_i in range(len(artist_track_infos)):
        ax.text(
            artist_track_infos.iloc[track_i]['danceability'],
            artist_track_infos.iloc[track_i]['energy'],
            artist_track_infos.iloc[track_i]['valence'],
            artist_track_infos.iloc[track_i]['tracks.items.name'],
            fontsize=18,
        )

    ax.set_xlabel('Danceability')
    ax.set_ylabel('Energy')
    ax.set_zlabel('Happiness')

    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_zlim(0, 1)

    ax.auto_scale_xyz([0, 1], [0, 1], [0, 1])

    cbaxes = fig.add_axes([0.8, 0.3, 0.03, 0.5]) 
    scat.set_clim(0, 250)
    fig.colorbar(scat, shrink=0.5, aspect=5, cax=cbaxes).set_label('Beats Per Minute')

    ax.text2D(
        0.5,
        0.98,
        artist_name,
        horizontalalignment='center',
        fontweight=600,
        verticalalignment='top',
        fontsize=22,
        transform=ax.transAxes,
    )
    ax.text2D(
        0.5,
        0.935,
        artist_track_infos.iloc[0]['name'][:40],
        horizontalalignment='center',
        fontweight=600,
        verticalalignment='top',
        fontsize=22,
        transform=ax.transAxes,
    )
    ax.text2D(
        0.5,
        0.89,
        artist_track_infos.iloc[0]['release_date'],
        horizontalalignment='center',
        fontweight=600,
        verticalalignment='top',
        fontsize=16,
        transform=ax.transAxes,
    )

    ax.text2D(
        0.5,
        0.075,
        'Spotify New Releases',
        horizontalalignment='center',
        fontweight=600,
        verticalalignment='top',
        fontsize=40,
        transform=ax.transAxes,
    )

    if continuous_angle == 360:
        continuous_angle = 0

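    # rotate the camera a quarter turn per album, saving one frame every 5 degrees
    for angle in range(continuous_angle, continuous_angle + 90, 5):
        ax.view_init(15, angle)
        filename = '/tmp/spotify/{}.png'.format(i)
        i += 1
        plt.savefig(filename, format='png', bbox_inches='tight', pad_inches=0, dpi=160)
    continuous_angle += 90

    # release the figure so memory doesn't grow with every album
    plt.close(fig)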

You’ll now have a long list of files generated like /tmp/spotify/0.png, /tmp/spotify/1.png, /tmp/spotify/2.png, etc… Now we can turn these files into a movie using ffmpeg:

ffmpeg -r 12 -f image2 -s 1080x1052 -i /tmp/spotify/%d.png -vcodec libx264 -crf 25 -pix_fmt yuv420p ~/Desktop/spotify_albums_visualization.mp4
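As a quick sanity check on the result, the length of the video follows directly from the numbers used above: each album produces 90 / 5 = 18 frames, and at 12 frames per second that is 1.5 seconds on screen per album. The sketch below just does that arithmetic; the function name `video_stats` and the album count of 40 are arbitrary examples.

```python
FRAMES_PER_SECOND = 12   # the -r 12 flag passed to ffmpeg
DEGREES_PER_ALBUM = 90   # each album rotates the camera a quarter turn
DEGREES_PER_FRAME = 5    # step used in range(..., 5)


def video_stats(album_count):
    """Return (total frames, video length in seconds) for `album_count` albums."""
    frames_per_album = DEGREES_PER_ALBUM // DEGREES_PER_FRAME
    total_frames = album_count * frames_per_album
    return total_frames, total_frames / FRAMES_PER_SECOND


frames, seconds = video_stats(album_count=40)  # 40 is an arbitrary example
print('{} frames, {:.0f} seconds of video'.format(frames, seconds))
```

To slow the video down, either lower the frame rate passed to ffmpeg or save more frames per album by using a smaller angle step.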