@ivan
Last active May 16, 2024 22:16
Download a podcast episode from anchor.fm
#!/usr/bin/env bash
# Download a podcast episode from anchor.fm
#
# Usage:
# grab-anchor-episode "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-from-the-Life-and-Death-of-the-Integral-Center-e31val" # (m4a example)
# grab-anchor-episode "https://anchor.fm/free-chapel/episodes/Are-You-Still-In-Love-With-Praise--Pastor-Jentezen-Franklin-e19u4i8" # (mp3 example)
#
# anchor.fm serves a list of m4a or mp3 files that need to be concatenated with ffmpeg.
#
# For debugging, uncomment:
# set -o verbose
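set -eu -o pipefail
url=$1
# The episode page embeds its data as JSON assigned to window.__STATE__.
json=$(curl -sL "$url" | grep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g')
ymd=$(echo -E "$json" | jq -r '.episodePreview.publishOn' | cut -d 'T' -f 1)
# Take the extension from the first enclosure URL; fall back to m4a.
# The space in "$( (" keeps this from being parsed as arithmetic expansion.
extension=$( (echo -E "$json" | jq -r '.[].episodeEnclosureUrl' | grep -F --max-count=1 :// | grep -oP '\.[0-9a-z]+$' | cut -d . -f 2) || echo m4a)
output_basename=$ymd-$(basename -- "$url").$extension
if [[ -f "$output_basename" ]]; then
    echo "$output_basename already exists; skipping download"
    exit
fi
audio_urls=$(echo -E "$json" | jq -r '.station.audios|map(.audioUrl)|.[]')
# Without any URLs, .copy_list is never created and ffmpeg fails confusingly.
if [[ -z "$audio_urls" ]]; then
    echo "No audio URLs found in the page JSON for $url" >&2
    exit 1
fi
temp_dir="$(mktemp -d)"
cd "$temp_dir"
for i in $audio_urls; do
    output_file=$(basename -- "$i")
    wget "$i" -O "$output_file"
    echo "file '$output_file'" >> .copy_list
done
# Concatenate the segments without re-encoding.
ffmpeg -f concat -safe 0 -i .copy_list -c copy "$output_basename"
cd -
mv "$temp_dir/$output_basename" ./
rm -rf "$temp_dir"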
@hutber

hutber commented Dec 24, 2021

I can't think why it's complaining, but I get: .copy_list: No such file or directory

@ivan
Author

ivan commented Dec 24, 2021

If there is no .copy_list, the issue is that it did not find any audio_urls.

You can add some debug prints e.g. echo -E $json before audio_urls=$(echo -E $json | jq -r '.station.audios|map(.audioUrl)|.[]') if you would like to investigate the JSON.

Which URL causes that?
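
As a rough illustration of the failure mode described above, here is a minimal sketch of a guard that reports an empty URL list before ffmpeg ever looks for .copy_list. The function name require_audio_urls is hypothetical and not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): fail early with a
# readable message when jq produced no audio URLs.
require_audio_urls() {
    # Strip all whitespace; an empty result means there is nothing to download.
    stripped=$(printf '%s' "$1" | tr -d '[:space:]')
    if [ -z "$stripped" ]; then
        echo "no audio URLs found in the page JSON" >&2
        return 1
    fi
}
```

In the script this could be called right after audio_urls is assigned, for example: require_audio_urls "$audio_urls" || exit 1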

@solomonrb

solomonrb commented Dec 26, 2021

I have a 2-step hybrid solution on Mac that I think is a bit easier; it only requires ggrep (installed via brew install grep).

From the terminal, insert your anchor URL into the following command and run it: curl -sL "<insert-url-here>" | ggrep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g'

Cmd-F the printed output for "episodeEnclosureUrl":, and copy the string that follows it (e.g. "https: ...")

Replace any \u002F in that string with /, and paste the resultant url into your web browser. Then click the 3 dots and click download!
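
The manual replacement in the last step could also be done in the shell. This is only a sketch: unescape_slashes is a hypothetical name, and it handles only the \u002F escape, not other JSON escapes (jq -r would decode all of them).

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn JSON-escaped slashes (\u002F) back into "/".
unescape_slashes() {
    # printf '%s' passes the backslashes through untouched;
    # the sed pattern \\u002[Ff] matches a literal backslash plus u002F/u002f.
    printf '%s\n' "$1" | sed 's|\\u002[Ff]|/|g'
}
```

For example, unescape_slashes 'https:\u002F\u002Fexample.com\u002Fa.mp3' prints https://example.com/a.mp3 (example.com is a placeholder, not a real enclosure URL).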

@ivan
Author

ivan commented Dec 27, 2021

The whole point of the script is to deal with anchor.fm's multi-file serving: for many podcasts, anchor publishes audio as multiple files that need to be concatenated. I believe the segments are split up as they were originally edited using their software.
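
To make the concatenation step concrete, here is a small sketch of how the list file for ffmpeg's concat demuxer can be built from the downloaded segment names. make_concat_list is a hypothetical helper, and the single-quote escaping is an assumption based on ffmpeg's documented quoting rules; the original script writes the lines directly with echo.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one "file '<name>'" line per segment,
# in the order given, for use with: ffmpeg -f concat -i <list>
make_concat_list() {
    for f in "$@"; do
        # Inside single quotes, ffmpeg expects a literal quote as '\''.
        escaped=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
        printf "file '%s'\n" "$escaped"
    done
}
```

Usage would look like: make_concat_list part1.m4a part2.m4a > .copy_list, followed by ffmpeg -f concat -safe 0 -i .copy_list -c copy out.m4a (the file names here are placeholders).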

@solomonrb

Great point. I guess my solution is only useful for single-file podcasts from anchor.fm

@Potatrix

Potatrix commented Jan 6, 2022

Thank you so much for this!

In my case, I'm trying to download all episodes of a certain podcast. After poking around this one for a bit, I found that the JSON returned by the curl request contains URLs for other episodes. (Maybe all of them? It seems so in my case.)

For those looking to do the same, here is a helper script that works with this one:

./grab-all-anchor-episodes.sh

#!/bin/bash

# Downloads all(?) episodes from a podcaster

# Usage:
# ./grab-all-anchor-episodes.sh "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-from-the-Life-and-Death-of-the-Integral-Center-e31val"
#
# Must be run from the same directory as ./grab-anchor-episode.sh

# URL from an episode seems to contain information about other episodes too
# writes JSON to file in /tmp and iterates through each 'shareLinkPath' and writes to urlList
#
# Runs ./grab-anchor-episode.sh for each URL in list
#
#

url=$1

echo "$url"

json=$(curl -sL "$url" | grep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g')

echo -E "$json" > /tmp/json

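python3 - <<END
import json

# Load the page state that the shell wrote to /tmp/json.
with open("/tmp/json", "r") as f:
    state = json.load(f)

# Write one episode URL per line; writing the file directly avoids
# passing URLs through a shell via os.system.
with open("/tmp/urlList", "w") as out:
    for episode in state['episodePreview']['episodes']:
        out.write("https://anchor.fm%s\n" % episode['shareLinkPath'])
END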

urlList=$(cat /tmp/urlList)

for url in $urlList
do
        ./grab-anchor-episode.sh "$url"
done

#cleanup
rm /tmp/json
rm /tmp/urlList

@ivan
Author

ivan commented Jan 6, 2022

@Potatrix Did the script actually produce incorrect audio files? If it needs to be fixed, it would really help to have the URL for testing.

@viocar

viocar commented Feb 16, 2022

This seems to have stopped working at some point. I have the latest version, and anything I try to download, even the provided examples, just results in .copy_list: No such file or directory.

The command I used: bash grab-anchor-episode.sh "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-fr om-the-Life-and-Death-of-the-Integral-Center-e31val"

@viocar

viocar commented Feb 18, 2022

With the help of a friend, I've managed to modify the script so that it works again (at least for my purposes). I've forked it here: https://gist.github.com/viocar/a6b6a0f485b3f400b8bcb0f8334b454d

@Potatrix

@ivan the script downloaded the audio files fine. Sometimes it didn't convert to mp3, but that wasn't really an issue for me. I had to download all of the recordings for an anchor podcast I manage and needed a quick way to do it, which is why I made the modification.

@viocar I notice a space in your URL, but I assume it wasn't there when you tried to run the script?

@viocar

viocar commented Feb 28, 2022

No, I tried several URLs that I copied directly from my browser. I'm not sure why there's a space in my post.

@bshankarpandey

Could anyone please suggest script code for tracking anchor.fm podcast audio in Tag Manager tools?

@MaxMussi

Where can I change the output location? Sorry, I am new to Linux.
