@tudoanh
Created December 3, 2021 03:47
Download a web page as a PDF using google-chrome
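#!/usr/bin/env python3
"""Save a web page as a PDF using headless Google Chrome."""
import subprocess
import sys

import bs4
import requests
from rich.console import Console
from slugify import slugify

console = Console()

if len(sys.argv) != 2:
    sys.exit(f"Usage: {sys.argv[0]} URL")
url = sys.argv[1]

# Fetch the page only to read its <title>, which becomes the PDF file name.
r = requests.get(url, timeout=30)
html = bs4.BeautifulSoup(r.text, features="lxml")
# Fall back to the URL when the page has no <title> element.
title = html.title.text if html.title else url
file_name = f"{slugify(title)}.pdf"

# Chrome loads the URL itself and prints the rendered page to file_name.
# check=True raises if Chrome exits with an error, so no false success message.
subprocess.run(["google-chrome", "--headless", f"--print-to-pdf={file_name}", url], check=True)
console.print(f"Downloaded from [i]{url}[/i] to [bold green]{file_name}[/bold green]")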
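The file name above comes from the page's <title>, so a page without one needs some other source for the name. A minimal sketch of one option, assuming a hypothetical helper `pdf_name_from_url` (not part of the gist) that builds the name from the URL using only the standard library rather than python-slugify:

```python
import re
from urllib.parse import urlparse


def pdf_name_from_url(url: str) -> str:
    """Hypothetical fallback: derive a PDF file name from the URL itself."""
    parsed = urlparse(url)
    # Join host and path, then collapse every non-alphanumeric run into "-".
    raw = f"{parsed.netloc}{parsed.path}".lower()
    slug = re.sub(r"[^a-z0-9]+", "-", raw).strip("-")
    # An empty slug (for example, an empty URL) falls back to a fixed name.
    return f"{slug or 'page'}.pdf"


if __name__ == "__main__":
    print(pdf_name_from_url("https://example.com/r/Python/comments/abc/"))
```

The result is less readable than a title-based name, but it never depends on the page content being parseable.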

tudoanh commented Dec 3, 2021

pip install requests beautifulsoup4 lxml python-slugify rich

chmod a+x web_to_pdf.py
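The script also needs a Chrome binary on the PATH, which pip does not install. A small sketch for checking this up front, assuming a hypothetical helper `find_chrome` and a guessed list of common binary names (the gist itself only ever calls `google-chrome`):

```python
import shutil
from typing import Optional, Sequence

# Assumed candidate names; adjust for your system.
CANDIDATES = ("google-chrome", "google-chrome-stable", "chromium", "chromium-browser")


def find_chrome(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    chrome = find_chrome()
    print(chrome or "No Chrome/Chromium binary found on PATH")
```

If this prints a path other than `google-chrome`, that name can be substituted in the `subprocess.run` call.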


tudoanh commented Dec 3, 2021

Example:

./web_to_pdf.py https://teddit.net/r/Python/comments/r6y73y/indepth_analysis_of_python_solutions_to_advent_of/

Result:
[Images: in-depth-analysis-of-python-solutions-to-advent-of-code-day-1-python-1, in-depth-analysis-of-python-solutions-to-advent-of-code-day-1-python-2]
