Skip to content

Instantly share code, notes, and snippets.

@kevsersrca
Created April 20, 2020 10:26
Show Gist options
  • Save kevsersrca/da20d804e9c1f80dd822ecf9647bb60f to your computer and use it in GitHub Desktop.
Save kevsersrca/da20d804e9c1f80dd822ecf9647bb60f to your computer and use it in GitHub Desktop.
from bs4 import BeautifulSoup
import requests
def checkUrl(url):
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
types = set()
for item in soup.find_all(itemtype=True):
types.add(item.get('itemtype'))
for t in types:
if t:
print('- contains a "{}" schema'.format(t))
@kevsersrca
Copy link
Author

tested url : https://www.allrecipes.com/recipe/95294/chewy-caramel/

- contains a "http://schema.org/BreadcrumbList" schema
- contains a "http://schema.org/Review" schema
- contains a "http://schema.org/Recipe" schema
- contains a "http://schema.org/ListItem" schema
- contains a "http://schema.org/NutritionInformation" schema
- contains a "http://schema.org/AggregateRating" schema
- contains a "http://schema.org/Rating" schema

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment