Last active
October 18, 2023 13:52
Download reddit-hosted videos/audio
import subprocess
import requests

# change this url to the post's url
post_url = "https://www.reddit.com/r/holdmycatnip/comments/7vyada/hmc_so_i_can_drink_this_air_real_quick/"

# send a User-Agent header to prevent a 429 (Too Many Requests) error
headers = {
    'User-Agent': 'My User Agent 1.0',
    'From': '[email protected]'
}

# appending ".json" to the post url returns the post data as JSON
url = post_url + ".json"
data = requests.get(url, headers=headers).json()
media_data = data[0]["data"]["children"][0]["data"]["media"]
video_url = media_data["reddit_video"]["fallback_url"]
# the audio track is assumed to sit next to the video under the name "audio"
audio_url = video_url.split("DASH_")[0] + "audio"
print(video_url, audio_url)

# curl both audio and video separately
# (argument lists avoid shell quoting problems with characters in the urls)
subprocess.run(["curl", "-o", "video.mp4", video_url], check=True)
subprocess.run(["curl", "-o", "audio.wav", audio_url], check=True)

# mux them: copy the video stream as-is and encode the audio as AAC
subprocess.run(["ffmpeg", "-i", "video.mp4", "-i", "audio.wav",
                "-c:v", "copy", "-c:a", "aac", "-strict", "experimental",
                "output.mp4"], check=True)
Hi, it's not working correctly. The audio filename changes dynamically.
You can get it from dash_url:
media_data['reddit_video']['dash_url']
Parse the XML it points to. The manifest comes in two versions:
filename = bs4.find("adaptationset", {'contenttype': 'audio'}).find('representation').find('baseurl').text
filename = bs4.find("representation", {'id': 'AUDIO-1'}).find('baseurl').text
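A minimal sketch of that lookup, covering both manifest versions. It uses the standard library's xml.etree.ElementTree instead of BeautifulSoup, so the tag and attribute names keep their original case. The function name find_audio_filename and the sample manifest are made up for illustration; real manifests will contain more elements.

```python
import xml.etree.ElementTree as ET

def find_audio_filename(mpd_text):
    """Return the audio BaseURL from a DASH manifest, or None if absent."""
    root = ET.fromstring(mpd_text)
    # Strip XML namespaces so tags can be matched by their local names.
    for elem in root.iter():
        if "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
    # Version 1: an AdaptationSet marked contentType="audio".
    for aset in root.iter("AdaptationSet"):
        if aset.get("contentType") == "audio":
            base = aset.find("Representation/BaseURL")
            if base is not None and base.text:
                return base.text.strip()
    # Version 2: a Representation whose id is "AUDIO-1".
    for rep in root.iter("Representation"):
        if rep.get("id") == "AUDIO-1":
            base = rep.find("BaseURL")
            if base is not None and base.text:
                return base.text.strip()
    return None

# Hypothetical, heavily trimmed manifest used only to exercise the function.
SAMPLE_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation id="AUDIO-1">
        <BaseURL>DASH_audio.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""

print(find_audio_filename(SAMPLE_MPD))
```

In the script, the manifest text would come from something like requests.get(media_data['reddit_video']['dash_url'], headers=headers).text.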
The second problem is when the post is from a subreddit; then:
and then work with media_data as before.
The third problem is the audio file's type. It may be mp4 or wav (the default), so you need to check it (see the filename in problem 1).
The fourth problem is a post with no audio track. Check for it via the filename in dash_url: if there is no filename, there is no audio track.
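The third and fourth problems could be handled along these lines. This is a sketch, not a tested downloader: build_commands is a hypothetical helper, the v.redd.it URL in the usage line is made up, and it assumes the audio file lives in the same directory as the video's fallback_url (as the original script does).

```python
import os

def build_commands(video_url, audio_filename=None):
    """Return curl/ffmpeg argument lists for one post.

    audio_filename is the BaseURL read from the DASH manifest,
    or None when the manifest lists no audio track.
    """
    # Assumption: audio sits next to the video, as in the original script.
    base_url = video_url.split("DASH_")[0]
    commands = [["curl", "-o", "video.mp4", video_url]]
    if not audio_filename:
        # No audio track: the downloaded video is already the final file.
        return commands
    # Keep the extension the manifest reports; fall back to .wav otherwise.
    ext = os.path.splitext(audio_filename)[1] or ".wav"
    audio_path = "audio" + ext
    commands.append(["curl", "-o", audio_path, base_url + audio_filename])
    commands.append(["ffmpeg", "-i", "video.mp4", "-i", audio_path,
                     "-c:v", "copy", "-c:a", "aac", "output.mp4"])
    return commands

# Hypothetical URL, shown only to illustrate the call.
for cmd in build_commands("https://v.redd.it/abc/DASH_720.mp4", "DASH_audio.mp4"):
    print(" ".join(cmd))
```

Each returned list can be passed directly to subprocess.run, which avoids building shell strings from URLs.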