Skip to content

Instantly share code, notes, and snippets.

@sinebeef
Created November 28, 2019 17:59
Show Gist options
  • Save sinebeef/9a7304e088d46de816daebc1f3d75468 to your computer and use it in GitHub Desktop.
Save sinebeef/9a7304e088d46de816daebc1f3d75468 to your computer and use it in GitHub Desktop.
Python script, output list of products / category names for whatever sitemap you provide.
# Lists all the items from either product-sitemap.xml or product_category-sitemap.xml
# you must provide the full url.
from bs4 import BeautifulSoup
import json, requests
import re, sys
url = ""
try:
url = sys.argv[1]
except:
pass
if url:
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
links = soup.find_all('loc')
for link in links:
print(link.text.split('/')[-2])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment