@obswork
Created December 31, 2018 17:40
extract your "Top 100 Songs 2018" (spotify)
"""Prerequisites:
(1) Open the developer console in your browser (e.g. cmd-shift-c in Chrome)
(2) Locate the div element with id "main" (it is the main enclosing div in the body, so it should be easy to find)
(3) Copy all the inner HTML of that div (easy way: right-click it, select "Edit as HTML", then copy as usual)
(4) Save it to a file somewhere (e.g. /tmp/tracklist.html)
(5) Open a Python shell (preferably IPython) and execute the following
"""
import lxml.html
# read in the file from wherever it is saved
with open('/tmp/tracklist.html', 'r') as f:
    html = f.read()
# convert the html blob into an lxml tree
tree = lxml.html.fragment_fromstring(html)
# grab all the songs
songs = tree.xpath("//div[contains(concat(' ', normalize-space(@class), ' '),' tracklist-name ')]")
# grab all the artist/album info
metadata = tree.xpath("//div[contains(concat(' ', normalize-space(@class), ' '),' second-line ')]")
# zip the songs and metadata together, clean up the results a little, and spit them out
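for s, m in zip(songs, metadata):
    # when the text contains "Explicit", drop its first 8 characters (the length of that label)
    meta = m.text_content()[8:] if "Explicit" in m.text_content() else m.text_content()
    # use "/" instead of the bullet that separates artist and album
    meta = meta.replace('•', '/')
    print("%s - %s" % (s.text, meta))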
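
The cleanup inside the loop can also be pulled into a small helper. This is only a sketch, not part of the original gist: the names `clean_metadata` and `format_track` are hypothetical, and it assumes the "Explicit" label, when present, sits at the very start of the metadata text. Under that assumption it strips the label only when the text starts with it, so an artist or album name that merely contains the word is left alone. It also pads the separator as " / ", which differs slightly from the bare "/" produced above.

```python
def clean_metadata(text):
    """Normalize one artist/album line taken from the tracklist."""
    text = text.strip()
    # Assumption: explicit tracks begin with the literal label "Explicit".
    if text.startswith("Explicit"):
        text = text[len("Explicit"):]
    # Split on the bullet separator and rejoin the non-empty parts with " / ".
    parts = [p.strip() for p in text.split("•")]
    return " / ".join(p for p in parts if p)


def format_track(title, metadata_text):
    """Build the "title - artist / album" output line."""
    # An lxml element's .text can be None, so fall back to an empty string.
    return "%s - %s" % ((title or "").strip(), clean_metadata(metadata_text))
```

In the loop, the three lines of the body would then reduce to `print(format_track(s.text, m.text_content()))`.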