Avner Sorek avnersorek
avnersorek / scrape_subreddit.js
Created August 20, 2025 08:17
A Reddit subreddit scraper that reads posts and comments and outputs a text file.
/**
The script is designed to be respectful of Reddit's API limits while comprehensively collecting both posts
and comments for offline analysis or archival purposes.
I have used it to scrape subreddits and feed the resulting text into NotebookLM.
Downloading a subreddit takes about 30 minutes (depending on how busy it is) because of Reddit's rate limits.
1. Scrapes recent posts from a specified subreddit (last 30 days) using Reddit's JSON API
2. Fetches comments for each post along with the post content
3. Saves everything to a text file named {subreddit}_posts.txt with formatted content including: