@macloo
Created February 22, 2024 18:54
Imports and starter code for a scraping script with BeautifulSoup and Requests
from bs4 import BeautifulSoup
import requests
hdr = {'User-Agent': 'your user-agent info here'}
# find YOUR user-agent HERE: https://www.whatismybrowser.com/detect/what-is-my-user-agent/
url = 'https://www.some_domain.com/some_dir'
page = requests.get(url, headers=hdr)
soup = BeautifulSoup(page.text, 'html.parser')
'''
If you have a list of URLs to scrape, loop over the list and
create a new page and a new soup on each pass through the loop.
'''
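
The note above about looping over a list of URLs could be sketched roughly as follows. This is only one possible approach, not part of the original starter code: the `urls` list, the helper names `make_soup` and `scrape_all`, the `get` and `delay` parameters, the 10-second timeout, and the choice to collect each page's `<title>` text are all assumptions made for illustration.

```python
import time

import requests
from bs4 import BeautifulSoup

hdr = {'User-Agent': 'your user-agent info here'}

# hypothetical list of pages to scrape; replace with real URLs
urls = [
    'https://www.some_domain.com/some_dir/page1',
    'https://www.some_domain.com/some_dir/page2',
]


def make_soup(url, get=requests.get):
    """Fetch one URL and return a BeautifulSoup object for it."""
    page = get(url, headers=hdr, timeout=10)
    # raise an exception on 4xx/5xx responses instead of parsing an error page
    page.raise_for_status()
    return BeautifulSoup(page.text, 'html.parser')


def scrape_all(urls, get=requests.get, delay=1.0):
    """Loop over the list, making page and soup each time the loop runs."""
    results = {}
    for url in urls:
        soup = make_soup(url, get=get)
        # as an example, keep only the text of the <title> tag, if there is one
        results[url] = soup.title.get_text(strip=True) if soup.title else None
        time.sleep(delay)  # pause between requests to be polite to the server
    return results
```

Calling `scrape_all(urls)` would then return a dictionary that maps each URL to its page title. The `get` parameter is there only so the function can be tried with a stand-in for `requests.get` without touching the network.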