@nfriedly
Last active August 6, 2024 03:06
Chirp Audiobook Download Script

⚠️ Not currently working. Chirp changed something that broke this script.


This script eases the process of downloading the audio files from Chirp Audiobooks. It uses the browser's console to collect the URL of each track, and then generates a list of curl or wget commands to download them.

Tested with Firefox + Terminal on macOS, and Firefox + PowerShell on Windows 10.

As an aside, I want to give a shout out to Libro.fm for providing a simple download button for each purchase. Then you don't need a script like this!

Instructions

  1. Find the book in your Chirp Library.
    • If you've already listened to it, you may need to move it back from your Archive.
  2. Click the book to open Chirp's web player.
  3. Open the browser's Web Developer Tools.
  4. Copy-paste the script.js contents into the console and press [enter].
  5. Initiate the script:
    • If the book is already at the start, click Play (▶).
    • If the book is on any other track, open the Chapters menu (top left) and select the first Track.
  6. Wait while the script advances through each track; it's saving the URLs in the background.
    • It may say "There was an error loading your audiobook, please reload the page." under the Play button. Ignore this.
    • It may also show a number of URLs in red in the console, along with a warning after each one. Ignore these as well.
  7. When it reaches the final track, the script will show a list of commands on the screen in a white box.
    • Click once to highlight the complete list.
    • Copy-paste it to a command line (Terminal, PowerShell, etc.) and press [enter] to execute it.
      • Some command lines will begin executing immediately; however, you still need to press [enter] to execute the final command.
  8. Once the commands finish, you should have a new folder with a cover image and each of the tracks as .m4a files.
    • On macOS, type open . and press [enter] to view the files.
    • On Windows, type explorer . and press [enter] to view the files.
  9. Check the file size of each track:
    • If any are 0 bytes, the download URL may have expired.
      • In that case, go through the process again, but in step 7, first paste the commands into a text editor and delete everything except for the ones to download the 0-byte files.
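
The size check in step 9 can be partly automated. The following is a minimal sketch, assuming Node.js is installed; `findEmptyTracks` is a hypothetical helper and not part of this gist. It lists the .m4a files in a folder that are 0 bytes, which are the ones to download again.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the names of the .m4a files in `dir`
// whose size is 0 bytes, sorted alphabetically.
function findEmptyTracks(dir) {
  return fs.readdirSync(dir)
    .filter((name) => name.toLowerCase().endsWith('.m4a'))
    .filter((name) => fs.statSync(path.join(dir, name)).size === 0)
    .sort();
}

// Usage (from inside the book's folder):
//   console.log(findEmptyTracks('.'));
```

Any file names it returns correspond to the commands to keep when you repeat step 7.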

Enjoy!

const $ = document.querySelector.bind(document);

// Make a string safe to use as a file or folder name.
function filename(name) {
  return name.replaceAll('&', 'and').replaceAll(':', ' -').replaceAll(/[^a-z0-9 ._-]+/ig, '');
}

const title = filename($('h1.book-title').textContent);
const credits = Array.from(document.querySelectorAll('.credit'))
  .map(n => filename(n.textContent))
  .join(' - ');
const dirname = `${title} - ${credits}`;

const commands = [
  `mkdir "${dirname}"`,
  `cd "${dirname}"`,
  // note: unlike the audio files, this one doesn't need to follow redirects, so we can use the same curl command everywhere.
  `curl -o "cover.jpg" "${$('.cover-image').src}"`
];

const tracks = [];
let count = 0;

// Record the URL of the current track along with its number and chapter name.
function addUrl(url) {
  count += 1;
  const chapter = filename($('div.chapter').textContent);
  tracks.push({
    count,
    chapter,
    url
  });
}

// Build one download command per track and show the full list in a textarea.
function showCommands() {
  const padSize = tracks.length.toString().length;
  // macOS comes with curl but not wget. Windows PowerShell has fake versions of both.
  // The "real" curl needs the -L (--location) flag set to know to follow redirects.
  // The fake Windows version turns it on by default and *refuses to work if you set the flag manually*.
  // So, we generate correct curl commands for Mac and correct wget commands for Windows/Linux/etc.
  const isMac = navigator.userAgent.includes('Macintosh');
  const cmd = isMac ? 'curl -L -o' : 'wget -O';
  tracks.forEach(({count, chapter, url}) => {
    const trackNum = count.toString().padStart(padSize, '0');
    commands.push(`${cmd} "${title} - ${trackNum} - ${chapter}.m4a" "${url}"`);
  });
  const div = document.createElement('div');
  div.innerHTML = '<div style="position: absolute; top: 100px; left: 100px; z-index: 100000; background: white; padding: 10px;"><p>Copy these commands to PowerShell/Terminal/etc:</p><textarea id="dl-commands" style="min-height:20em; min-width:30em"></textarea></div>';
  document.body.appendChild(div);
  const textarea = document.querySelector('#dl-commands');
  textarea.value = commands.join('\n');
  textarea.onfocus = function() { this.select(); };
}

// Advance to the next track, or show the commands once the last track is reached.
function next() {
  const btn = $('button.next-chapter');
  // Treat a missing button the same as a disabled one, instead of throwing.
  if (!btn || btn.disabled) {
    showCommands();
  } else {
    btn.click();
  }
}

// Shadow the audio element's src property so each URL the player assigns is
// recorded instead of loaded.
const audio = $('audio');
Object.defineProperty(audio, 'src', {
  get() {
    return '';
  },
  set(url) {
    // Wait briefly so the page can update the chapter name before it is read.
    setTimeout(() => {
      addUrl(url);
      next();
    }, 500);
  },
});
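
The script's core technique is defining its own `src` property on the `<audio>` element, so that each URL the player assigns is captured rather than loaded. The sketch below shows the same idea on a plain object so that it can run outside the browser; `fakeAudio`, `captured`, and the example.com URLs are illustrative assumptions, not part of the script.

```javascript
// Stand-in for the <audio> element; in the browser this would be
// document.querySelector('audio').
const fakeAudio = {};
const captured = [];

Object.defineProperty(fakeAudio, 'src', {
  // Reads always return an empty string.
  get() {
    return '';
  },
  // Writes are recorded instead of being applied.
  set(url) {
    captured.push(url);
  },
});

// The player would perform an assignment like this for each track:
fakeAudio.src = 'https://example.com/track-1.m4a';
fakeAudio.src = 'https://example.com/track-2.m4a';

console.log(captured.length);      // 2
console.log(fakeAudio.src === ''); // true
```

Because the assigned URL never reaches the real element, the player has nothing to load, which is probably why the page reports a loading error while the script runs.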
@Bostwickenator

Bostwickenator commented Feb 12, 2024 via email

@CommanderJoy

Just as an update: I'm on a Mac and use Audiobook Builder, which seems to be finicky about filenames, so I substituted mp3 for m4a in the code, and that works perfectly.

@Sbackus65

In the last day or so, I have gotten the following message after each wget command: "HTTP request sent, awaiting response... 401 Unauthorized

Username/Password Authentication Failed." I have tried with Chrome and Firefox, and with Windows 10 PowerShell and with Ubuntu Konsole...

@nfriedly
Author

@Sbackus65 Yep, it seems broken for me too now. Chirp must have changed something.

@CommanderJoy

Well, the code is no longer working with Chirp, and I wonder if this is no longer possible. I could still paste the script into the JavaScript console and the commands into Terminal, but all of the files are 37 KB and do not play. A sad day for sure. Is there any way to come up with a fix for whatever they did?

@Bostwickenator

I have a solution here but it's significantly more complex. I need to clean it up somewhat before posting.

@CommanderJoy

I have a solution here but it's significantly more complex. I need to clean it up somewhat before posting.

Wonderful that you have given it a go. I hope it won't be too complex for those of us non-programmers. I can copy and paste and put that into Terminal, but not much more without detailed instructions. Wishing you success!

@danielb2

danielb2 commented Aug 5, 2024

When playing the book manually, there are now multiple requests generated, and the one this script uses gets an unauthorized response. There are, however, other requests that work just fine when copied and pasted into curl. I think this script is quite fixable.

@Bostwickenator

Yup, the distribution links work, and I have it working. The key is that they have a short lifetime, so you have to fetch them on the fly. I've done this with an unbundled Chrome extension that snoops the distribution links and a Selenium WebDriver that then picks them up and downloads them. I spent 2 days on it. I just have to remove the cookies I hardcoded into my version so I can put it up on GitHub. I'll do this tomorrow.

@CommanderJoy

Hey guys, whatever fix you are working on, could it be accessible for us non-programmer folks? I'm on a Mac, using Safari, so hopefully whatever you conjure up will work with that. Nice that you are taking this on!

@Bostwickenator

Bostwickenator commented Aug 5, 2024 via email

@CommanderJoy

Oh, too bad. I won't install Chrome; I don't want it on my system. Ah well. If you ever get it to run on Safari, please do post!

@Bostwickenator

Bostwickenator commented Aug 5, 2024 via email

@Bostwickenator

OK, the code is now available here: https://github.com/Bostwickenator/ch.rip

It is a little more involved than the original script here, but from my investigation that seems pretty unavoidable due to some clever engineering on the Chirp side. I've tried to outline all the steps clearly, including why they are needed. This should help non-programmers set it up and run it.

Please report any issues directly on the new repo and I'll see about getting those cleaned up.

If someone figures out a way to make nfriedly's script here work with the 60-second timeouts for the hidden URLs, that would obviously be cleaner, but I don't think it's plausible right now.
