@harshanas
Last active August 28, 2019 17:14
BeautifulSoup Code Snippets

Basic Imports, Loading a Page, and Parsing It with BeautifulSoup

from bs4 import BeautifulSoup
import urllib.request

page = urllib.request.urlopen("https://google.com/").read()
soup = BeautifulSoup(page, 'html.parser')  # Uses html.parser to parse the page
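The load step above can be wrapped in a small helper. This is only a sketch: load_soup is a hypothetical name, and the timeout and User-Agent values are assumptions, not part of the original snippet. It closes the response when done and fails fast on slow servers.

```python
from bs4 import BeautifulSoup
import urllib.request


def load_soup(url, timeout=10):
    """Fetch url and parse it with html.parser (hypothetical helper)."""
    # Some sites reject urllib's default User-Agent; this value is an assumption.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    # The with-block closes the connection once the body has been read.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        page = response.read()
    return BeautifulSoup(page, "html.parser")
```

Usage would look like soup = load_soup("https://google.com/").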

Working with Elements

# Get all elements in the page
pageElementsAll = soup.find_all()

# Get only the body element(s)
pageElementsBody = soup.find_all("body")

# Get only paragraph elements
pageElementsParagraphs = soup.find_all("p")

# Get the first div with a given id (returns None if there is no match)
elem = soup.find("div", {"id": "articlebody"})

# Get the first div with a given class (returns None if there is no match)
elem = soup.find("div", {"class": "articlebody"})
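To try these lookups without a network call, here is a minimal sketch that runs them against an inline HTML string. The markup is made up for illustration. It uses the keyword form (id=..., class_=...), which is equivalent to passing a dict.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of a live page
html = """
<html><body>
  <div id="articlebody" class="content main"><p>First</p><p>Second</p></div>
  <div class="articlebody"><p>Third</p></div>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet of every match
paragraphs = soup.find_all("p")
paragraph_texts = [p.get_text() for p in paragraphs]

# find returns the first match only
by_id = soup.find("div", id="articlebody")
by_class = soup.find("div", class_="articlebody")

# find returns None when nothing matches, so check before using the result
missing = soup.find("div", id="does-not-exist")
if missing is None:
    print("no such div")
```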

Extracting Data from Elements

# find() returns a single Tag (or None), so do not index it with [0]

# Get attributes as a dict
elem.attrs

# Get tag name
elem.name
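A short sketch of reading data from a found element, again on made-up markup. It shows the attribute dict, the tag name, a single attribute lookup with get(), and the element's text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration
html = '<div id="articlebody" class="post featured"><a href="/about">About <b>us</b></a></div>'
soup = BeautifulSoup(html, "html.parser")

elem = soup.find("div", id="articlebody")

# All attributes as a dict; class is multi-valued, so it comes back as a list
attrs = elem.attrs
tag_name = elem.name

link = elem.find("a")
# get() returns None for a missing attribute instead of raising KeyError
href = link.get("href")
title = link.get("title")

# get_text() joins the text of the tag and all of its descendants
text = link.get_text()
```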