@alexkuang0
Last active April 3, 2020 08:57
thor-downloader
import requests
from bs4 import BeautifulSoup

main_url = "https://www.thorlabs.com/newgrouppage9.cfm?objectgroup_id=3569"

# Collect the product page links listed on the group page.
main_page = BeautifulSoup(requests.get(main_url).text, 'html.parser')
products = main_page.find_all(class_="prodNumber")
prod_urls = []
for product in products:
    prod_urls.append("https://www.thorlabs.com/" + product.contents[0].get('href'))

# Visit each product page and download the document it links to.
for prod_url in prod_urls:
    prod_page = BeautifulSoup(requests.get(prod_url).text, 'html.parser')
    pdf = prod_page.find(class_="downloadDoc")
    if pdf is None:
        continue  # this product page has no download link
    pdf_url = "https://www.thorlabs.com/" + pdf.get('href')
    pdf_filename = pdf.get('data-filename')
    # Stream the file to disk in 1 KiB chunks instead of loading it into memory.
    with open(pdf_filename, "wb") as pdf_file:
        for chunk in requests.get(pdf_url, stream=True).iter_content(chunk_size=1024):
            if chunk:
                pdf_file.write(chunk)
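
The chunked write at the end of the script can be pulled out into a small helper so it can be checked without touching the network. This is a minimal sketch, not part of the original gist: `write_chunks` is a hypothetical name, and it assumes only that it receives an iterable of `bytes` objects, which is what `iter_content` yields.

```python
from pathlib import Path
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], dest: Path) -> int:
    """Write every non-empty chunk to dest and return the total bytes written.

    Hypothetical helper (not in the original script); it mirrors the
    `if chunk: ...write(chunk)` loop but works on any iterable of bytes.
    """
    written = 0
    with open(dest, "wb") as out_file:
        for chunk in chunks:
            if chunk:  # skip empty chunks, as the original loop does
                written += out_file.write(chunk)
    return written
```

In the script it could be called as `write_chunks(requests.get(pdf_url, stream=True).iter_content(chunk_size=1024), Path(pdf_filename))`. Returning the byte count makes it easy to spot a download that came back empty.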
@alexkuang0
Author

Install dependencies first:

pip install requests
pip install beautifulsoup4
