Last active May 4, 2023 02:21
FlaixFm podcast URL extractor
// Collected audio URLs; inspect or copy this array once the loop finishes
const audioUrls = [];

function getNextButton() {
  return document.querySelector('.podcast-pagination .right_arrow');
}

function hasNextButton() {
  const nextButton = getNextButton();
  // Treat a missing button, or one hidden with opacity 0, as "no more pages"
  return nextButton !== null && window.getComputedStyle(nextButton).opacity !== '0';
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function grabAudioUrlAndSleep() {
  const audioItem = document.querySelector('audio#soundId');
  const audioUrl = audioItem.src;
  console.log(`Found audio URL: ${audioUrl}`);
  audioUrls.push(audioUrl);
  // Pause to avoid getting banned from the audio server
  await sleep(1000);
}

// Top-level await works when this is pasted into the DevTools console
while (true) {
  const podcastPlayButtons = document.querySelectorAll('.podcast-right-bottom .llista-button-component-wrapper > div');
  for (const podcastPlayButton of podcastPlayButtons) {
    podcastPlayButton.click();
    // A podcast item may be divided by hours
    const hours = document.querySelectorAll('.player .time_handlers-hours');
    // Not divided by hours: grab the audio URL and go
    if (hours.length === 0) {
      await grabAudioUrlAndSleep();
    }
    // Divided by hours: loop over them and grab each audio URL
    for (const hour of hours) {
      hour.click();
      await grabAudioUrlAndSleep();
    }
  }
  // Process the current page first so the last page is not skipped,
  // then stop if there is no next page to move to
  if (!hasNextButton()) {
    break;
  }
  getNextButton().click();
}
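The snippet reads the audio element right after each click and relies on a fixed pause. If the player turns out to update the element asynchronously, a polling helper is one possible alternative. The sketch below is only an illustration: `waitFor` and its option names are hypothetical and not part of the original snippet.

```javascript
// Hypothetical helper: resolves with the first truthy value returned by
// predicate(), or rejects once timeoutMs has elapsed.
function waitFor(predicate, { intervalMs = 100, timeoutMs = 5000 } = {}) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const check = () => {
      const value = predicate();
      if (value) {
        resolve(value);
      } else if (Date.now() - startedAt >= timeoutMs) {
        reject(new Error('Timed out waiting for condition'));
      } else {
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}

// Possible usage in the browser, assuming the same audio#soundId element:
// const previousSrc = document.querySelector('audio#soundId').src;
// podcastPlayButton.click();
// const newSrc = await waitFor(() => {
//   const src = document.querySelector('audio#soundId').src;
//   return src !== previousSrc ? src : null;
// });
```

Whether this is needed depends on how the site's player behaves, which the snippet itself does not tell us.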
FlaixFM podcast audio URLs extractor
A small snippet for extracting podcast audio URLs from FlaixFM's podcast website.
Extracting audio file URLs
Go to the website and open a DevTools window. Paste the snippet into the console. You'll see each podcast item being loaded in turn and, for each item, every hour of the podcast being selected. For every audio file, the URL is displayed in the console and pushed into the audioUrls array.
Copying audio file URLs to clipboard
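A minimal sketch of the copy step this section describes, assuming the asynchronous Clipboard API (`navigator.clipboard.writeText`) is used. That API only works while the page has focus, which would explain why you need to switch to the website window. The helper names `toClipboardText` and `copyUrlsAfterDelay`, and the 3 second delay, are assumptions made for illustration.

```javascript
// Hypothetical helper: one URL per line, without empty entries or duplicates
function toClipboardText(urls) {
  return [...new Set(urls.filter(Boolean))].join('\n');
}

// Hypothetical helper: waits delayMs so you have time to click on the page,
// then writes the URLs to the clipboard.
function copyUrlsAfterDelay(urls, delayMs = 3000) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      navigator.clipboard.writeText(toClipboardText(urls)).then(resolve, reject);
    }, delayMs);
  });
}

// In the DevTools console, once the extractor has finished:
// copyUrlsAfterDelay(audioUrls).then(() => console.log('Copied'));
```

If the page does not have focus when the delay ends, the returned promise rejects instead of copying.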
Once all podcast URLs have been stored, you can copy the audioUrls array to the clipboard so you can download them. Type the copy command into the DevTools console and then switch to the website window (otherwise you'll get an error).
Downloading audio files using their URLs
Paste the URLs into a file. If you are using macOS:
pbpaste > audio_urls.txt
Then you can use something like aria2c to download them all (its -i option reads a list of URLs from a file).
🎉 You've downloaded all podcast items from FlaixFM
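If aria2c is not at hand, the download step could also be sketched in Node.js (18 or later, for the built-in `fetch`). This is a swapped-in alternative, not the method used above. The function names, the `podcasts` output directory, and the one second pause between downloads are assumptions for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Parse the pasted list: one URL per line, ignoring blank lines
function parseUrlList(text) {
  return text.split(/\r?\n/).map(line => line.trim()).filter(line => line.length > 0);
}

// Use the last path segment of the URL as the local file name
function fileNameFromUrl(url) {
  const name = path.basename(new URL(url).pathname);
  return name || 'audio.mp3';
}

// Download every URL in listFile into outDir, one at a time
async function downloadAll(listFile, outDir) {
  const urls = parseUrlList(fs.readFileSync(listFile, 'utf8'));
  fs.mkdirSync(outDir, { recursive: true });
  for (const url of urls) {
    const response = await fetch(url);
    if (!response.ok) {
      console.error(`Skipping ${url}: HTTP ${response.status}`);
      continue;
    }
    const target = path.join(outDir, fileNameFromUrl(url));
    fs.writeFileSync(target, Buffer.from(await response.arrayBuffer()));
    console.log(`Saved ${target}`);
    // Pause between requests to go easy on the audio server
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
}

// Example call (assumed file and directory names):
// downloadAll('audio_urls.txt', 'podcasts');
```

Each file is buffered fully in memory before being written, which keeps the sketch short but may be wasteful for very long recordings.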
Bonus: Merging audio files
Some podcast items are divided into several audio files, one per hour of the podcast. To join them again, you can use mp3wrap:
mp3wrap 2023-03-05_Podcast.mp3 20230430*.mp3
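When many episodes need merging, the mp3wrap command lines could be generated rather than typed by hand. The sketch below assumes that every downloaded file name starts with an 8-digit date (as in 20230430*.mp3) and contains no spaces. Both are assumptions, and the function names and the `_Podcast.mp3` output naming are hypothetical.

```javascript
// Assumption: file names start with an 8-digit date such as 20230430.
// Groups the names by that prefix, keeping each group in sorted order.
function groupByDatePrefix(fileNames) {
  const groups = new Map();
  for (const name of [...fileNames].sort()) {
    const match = /^(\d{8})/.exec(name);
    if (!match) continue; // skip files that do not follow the assumed pattern
    if (!groups.has(match[1])) groups.set(match[1], []);
    groups.get(match[1]).push(name);
  }
  return groups;
}

// Builds one "mp3wrap <output> <inputs...>" command per multi-file date
function buildMergeCommands(fileNames) {
  const commands = [];
  for (const [date, names] of groupByDatePrefix(fileNames)) {
    if (names.length < 2) continue; // single-file episodes need no merging
    const output = `${date.slice(0, 4)}-${date.slice(4, 6)}-${date.slice(6, 8)}_Podcast.mp3`;
    commands.push(`mp3wrap ${output} ${names.join(' ')}`);
  }
  return commands;
}

// Example: print the commands for the .mp3 files in the current directory
// const fs = require('fs');
// console.log(buildMergeCommands(fs.readdirSync('.').filter(f => f.endsWith('.mp3'))).join('\n'));
```

The commands are only printed, so you can review them before running anything.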
The Title and Album tags of the merged file will be a bit weird. You can fix that with id3v2.
Or, if you want to add a cover image too, try eyeD3:
eyeD3 --add-image "cover.jpg:FRONT_COVER" 2023-03-05_Podcast.mp3