@sunflsks
Last active August 12, 2024 19:37
Spotify Private API exploration: /v2/recently_played
#!/usr/bin/env python3
'''
While working on my custom scrobbler program that uses the Spotify API, I ran into a limitation that many
others have also come across
(https://stackoverflow.com/questions/73240867/is-there-a-way-to-retrieve-more-than-50-recently-played-tracks-using-spotipy,
https://stackoverflow.com/questions/74190136/is-there-a-way-to-get-my-full-listening-history-from-the-spotify-api):
the public recently_played API seems to be limited to the 50 most recent tracks, no matter what. However, the Spotify
app itself has a built-in Listening History feature that uses a private API to go further back in time. Curious (and with
some time to kill), I decided to look into this API and see what it provides.

The API (endpoint shown below) is not a conventional one, in the sense of returning JSON models that the frontend
parses and displays as it sees fit; instead, it seems to be Spotify's own implementation of a server-driven
UI. The UI elements themselves are sent over the wire and presented (semi-)directly to the user. This allows for a lot
of flexibility in how the UI itself is drawn (dividers between dates, rows, what information is presented for each song, etc.).
A really interesting design pattern that I had no idea about until approximately 3 hours ago!

However, this makes things more difficult for me, as the underlying data is not sent through the API, only the elements
deemed fit to be displayed to the user. This means that only the date is sent, not the actual timestamp for each song
(as the Listening History pane only shows the YYYY-MM-DD date). To implement paging, a timestamp (which I assume belongs
to the last played song sent) is appended to the JSON; this timestamp is then used in the next request, and so on.
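
The paging scheme described above can be sketched offline. This is a minimal sketch, not Spotify's documented
behaviour: iter_pages and the fake page dictionary are hypothetical names and made-up values, and the stop condition
(a cursor that repeats) is an assumption that mirrors what the script at the end of this file does over the network.

```python
from typing import Callable, Iterator


def iter_pages(fetch: Callable[[int], dict], start: int = 0) -> Iterator[dict]:
    # Follow custom.timestamp from page to page; stop once a cursor repeats.
    seen = set()
    cursor = start
    while cursor not in seen:
        seen.add(cursor)
        page = fetch(cursor)
        yield page
        cursor = page["custom"]["timestamp"]


# Fake pages standing in for real responses (cursor values are made up).
fake_pages = {
    0: {"body": ["a"], "custom": {"timestamp": 200}},
    200: {"body": ["b"], "custom": {"timestamp": 100}},
    100: {"body": [], "custom": {"timestamp": 100}},
}

pages = list(iter_pages(fake_pages.__getitem__))
print(len(pages))  # 3
```

Passing the fetch function in as a parameter keeps the cursor logic testable without tokens or network access.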
Example Response:
{
    "title": "Recently played",
    "body": [
        ...
        {
            "id": "2024-08-12-spotify:track:1uTeYqdZf9oYwgkhE0hlf0-spotify:track:1uTeYqdZf9oYwgkhE0hlf0",
            "component": {
                "id": "listeninghistory:trackRow",
                "category": "row"
            },
            "text": {
                "title": "I Don't Miss You at All"
            },
            "images": {
                "main": {
                    "uri": "https://i.scdn.co/image/ab67616d00001e02500f0405c1d3feb14d62849c",
                    "placeholder": "track"
                }
            },
            "custom": {
                "has_play_context": false,
                "artists": [
                    {
                        "uri": "spotify:artist:37M5pPGs6V1fchFJSgCguX",
                        "name": "FINNEAS"
                    }
                ]
            },
            "logging": {
                "ubi:specification_id": "mobile-listening-history",
                "ubi:app": "music",
                "ubi:impression": false,
                "ubi:specification_version": "8.1.0",
                "ubi:path": [
                    {
                        "name": "container"
                    },
                    {
                        "name": "contextless_item",
                        "id": "play-entity-2024-08-12",
                        "uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0"
                    }
                ],
                "ubi:generator_version": "11.0.1"
            },
            "metadata": {
                "creator_name": "FINNEAS",
                "sectionId": "section-header-2024-08-12",
                "album_uri": "spotify:album:2b7DunZFOVCs0QgiTI1FJW"
            },
            "events": {
                "rightAccessoryClick": {
                    "name": "contextMenu",
                    "data": {
                        "uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0"
                    }
                },
                "click": {
                    "name": "playFromContext",
                    "data": {
                        "uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0",
                        "player": {
                            "context": {
                                "pages": [
                                    {
                                        "tracks": [
                                            {
                                                "uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0"
                                            }
                                        ]
                                    }
                                ],
                                "uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0"
                            },
                            "options": {
                                "skip_to": {
                                    "track_uri": "spotify:track:1uTeYqdZf9oYwgkhE0hlf0"
                                }
                            }
                        }
                    }
                }
            }
        },
        ...
    ],
    "custom": {
        "last_component_had_play_context": false,
        "timestamp": 1723334400
    }
}
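
The custom.timestamp value at the end of the response can be decoded if it is read as Unix epoch seconds. That reading
is an assumption (nothing in the response labels the unit), and cursor_to_utc is a hypothetical helper name. A minimal
sketch:

```python
from datetime import datetime, timezone


def cursor_to_utc(timestamp: int) -> datetime:
    # Assumption: the paging value is Unix epoch seconds.
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


print(cursor_to_utc(1723334400).isoformat())  # 2024-08-11T00:00:00+00:00
```

Under that assumption the sample value falls exactly on midnight UTC, which would fit a cursor that only has
day-level granularity, though a single sample is not enough to conclude that.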
Each kind of UI element (track rows, dividers, etc.) has its own component ID.
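
One way to find out which component IDs a page actually contains is to tally them. This is a rough sketch:
count_component_ids is a hypothetical helper, and the divider ID in the sample is made up for illustration; only
listeninghistory:trackRow comes from a real response.

```python
from collections import Counter


def count_component_ids(response: dict) -> Counter:
    # Tally body elements by component.id to see which row types a page holds.
    return Counter(
        element.get("component", {}).get("id", "<missing>")
        for element in response.get("body", [])
    )


sample = {
    "body": [
        {"component": {"id": "listeninghistory:trackRow"}},
        {"component": {"id": "listeninghistory:trackRow"}},
        {"component": {"id": "example:hypotheticalDivider"}},  # made-up ID
    ]
}

print(count_component_ids(sample).most_common())
```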
There also seem to be a number of parameters that can be passed to the API call; it might be interesting to look more
into these and see what they mean.
Example: /listening-history/v2/mobile/0?type=merged&last_component_had_play_context=false&client-timezone=America%2FChicago
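
For experimenting with these parameters, the query string can be assembled with the standard library so that values
such as the timezone get percent-encoded. This is only a sketch: build_history_url is a hypothetical helper, the
parameter names and values are copied from the example request, and whether the endpoint requires them or accepts
other values is unknown.

```python
from urllib.parse import urlencode

BASE_URL = "https://spclient.wg.spotify.com/listening-history/v2/mobile"


def build_history_url(timestamp: int, timezone_name: str) -> str:
    # Parameter names and values are taken from one observed request, nothing more.
    params = {
        "type": "merged",
        "last_component_had_play_context": "false",
        "client-timezone": timezone_name,
    }
    return f"{BASE_URL}/{timestamp}?{urlencode(params)}"


print(build_history_url(0, "America/Chicago"))
```

urlencode escapes the slash in the timezone name as %2F.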
LIMITATIONS:
- This API also seems to be limited to only the past 90 days, a far cry from being able to fetch the entire history of a user.
- Given that it's a private API, a client token (which seems to be static, or at least persists for >3 hours) and
  an access token are required. Not sure if the normal OAuth tokens can be used for this; I just intercepted the HTTP
  requests from an Android emulator running Spotify to get these values.
- Again, no proper timestamps for each song played, only the date. A deal-breaker :(
'''
import re
import json
import time
import requests
import datetime
from requests import Response

AUTH_TOKEN = ''    # Bearer access token captured from the app's requests
CLIENT_TOKEN = ''  # client-token header value captured the same way
API_URL = 'https://spclient.wg.spotify.com/listening-history/v2/mobile'

timestamps = set()
all_songs_played = set()


def get_api(timestamp: int) -> dict:
    headers: dict = {
        "Authorization": f"Bearer {AUTH_TOKEN}",
        "client-token": CLIENT_TOKEN,
        "App-Platform": "Android"
    }
    api_url: str = f"{API_URL}/{timestamp}"
    r: Response = requests.get(api_url, headers=headers)
    r.raise_for_status()  # Fail loudly (e.g. on expired tokens) instead of parsing an error body.
    return r.json()


def songs_from_json(input_dict: dict) -> tuple[int, list]:
    timestamp = input_dict["custom"]["timestamp"]
    song_list: list = []
    for element in input_dict["body"]:
        # Skip dividers and any other non-track UI components.
        if element["component"]["id"] != "listeninghistory:trackRow":
            continue
        song_name: str = element["text"]["title"]
        # The element ID starts with the play date (YYYY-MM-DD).
        played_date: str = re.search(r"\d{4}-\d{2}-\d{2}", element["id"]).group(0)
        song_list.append((song_name, played_date))
    return (timestamp, song_list)


def main():
    response_dict: dict = get_api(0)
    # Annotations are not allowed on tuple-unpacking targets, so unpack without them.
    timestamp, songs_played = songs_from_json(response_dict)
    while True:
        # Stop once the API hands back a paging timestamp we have already seen.
        if timestamp in timestamps:
            break
        timestamps.add(timestamp)
        for song in songs_played:
            print(f"Found song {song[0]}, played at {song[1]}")
        all_songs_played.update(songs_played)
        time.sleep(3) # Don't know what sort of voodoo rate-limiting or blacklisting they may do; playing it safe.
        timestamp, songs_played = songs_from_json(get_api(timestamp))
    # discard() instead of remove(): 0 may never have been returned, and remove() would raise KeyError.
    timestamps.discard(0)
    timediff: int = int(time.time()) - min(timestamps)
    delta: datetime.timedelta = datetime.timedelta(seconds=timediff)
    print(f"Found a total of {len(all_songs_played)} songs over a time range of {delta}")


if __name__ == "__main__":
    main()