@ixuuux
Forked from mezhgano/playwright_save_mhtml.py
Last active May 5, 2023 06:00
Save a web page as an MHTML file using Playwright
from playwright.sync_api import sync_playwright


def save_mhtml(path: str, text: str):
    # newline='\n' writes the snapshot's line endings unchanged
    with open(path, mode='w', encoding='UTF-8', newline='\n') as file:
        file.write(text)


def save_page(url: str, path: str):
    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(url)

        # Page.captureSnapshot is a Chrome DevTools Protocol command (Chromium only);
        # its result holds the MHTML document under the 'data' key.
        client = page.context.new_cdp_session(page)
        mhtml = client.send("Page.captureSnapshot")['data']
        save_mhtml(path, mhtml)
        browser.close()


if __name__ == '__main__':
    save_page('https://example.com/', 'example.mhtml')
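
To check what the script above actually captured, the saved file can be inspected with the standard library. MHTML is a MIME multipart/related message, so Python's email parser should be able to read it. The sketch below is not part of the original gist; the function name list_mhtml_parts is hypothetical, and it assumes each part carries a Content-Location header, as Chromium snapshots normally do.

```python
import email
from email import policy


def list_mhtml_parts(path: str) -> list[tuple[str, str]]:
    """Return (content type, Content-Location) for each leaf part of an MHTML file.

    Hypothetical helper for illustration; assumes the file is a MIME
    multipart message such as the one written by save_page().
    """
    with open(path, 'rb') as file:
        message = email.message_from_binary_file(file, policy=policy.default)

    parts = []
    for part in message.walk():
        # Skip the multipart container itself; only leaf parts hold resources.
        if part.is_multipart():
            continue
        location = str(part.get('Content-Location', ''))
        parts.append((part.get_content_type(), location))
    return parts
```

Calling list_mhtml_parts('example.mhtml') after running the script should return one entry per captured resource (the HTML document first, then stylesheets, images, and so on), which makes it easy to spot an empty or truncated snapshot.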