@ggorlen
Last active June 4, 2024 20:34
Puppeteer Resources
const puppeteer = require("puppeteer");
let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  const ua =
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36";
  await page.setUserAgent(ua);
  const url = "https://www.example.com";
  // optional optimization stubs:
  //await page.setJavaScriptEnabled(false);
  //await page.setRequestInterception(true);
  //const allowed = [
  //  "https://www.example.com",
  //];
  //page.on("request", request => {
  //  if (allowed.some(e => request.url().startsWith(e))) {
  //    request.continue();
  //  }
  //  else {
  //    request.abort();
  //  }
  //});
  //page.on("request", req => {
  //  if (req.resourceType() === "image") {
  //    req.abort();
  //  }
  //  else {
  //    req.continue();
  //  }
  //});
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const $ = (...args) => page.waitForSelector(...args);
  console.log(await page.title());
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

Puppeteer Resources

Gotchas

"load" can time out

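If a page requests a resource that stalls for a long time before failing with a 504 gateway timeout, waiting for "load" (the default waitUntil value for navigation) can exceed the navigation timeout (30 seconds by default), because "load" waits for subresources such as images. "domcontentloaded" does not wait for images, so it typically resolves without timing out: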

const puppeteer = require("puppeteer"); // ^22.7.1

// request a resource that stalls and eventually fails with a 504 gateway timeout
const html = `<img src="https://via.placeholder.com/150/FF0000/FFFFFF?text=CD">`;
let browser;

(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html, {
    waitUntil: "load" // fix with: "domcontentloaded"
  }); // times out on "load"
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
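
One way to apply the fix to real navigations is to pair "domcontentloaded" with an explicit wait for the element the script actually needs. The following is a minimal sketch, not a definitive pattern: openAndWaitFor is a hypothetical helper name (it is not part of Puppeteer), and the 10-second timeout is an arbitrary assumption. It relies only on page.goto and page.waitForSelector with their waitUntil and timeout options.

```javascript
// Hypothetical helper (not a Puppeteer API): navigate without waiting for
// every subresource, then wait for the one element the script needs.
// `page` is assumed to be a Puppeteer Page; the default timeout is arbitrary.
async function openAndWaitFor(page, url, selector, timeout = 10000) {
  // "domcontentloaded" does not wait for images, so a stalled image
  // request should not hold up navigation the way "load" can.
  await page.goto(url, {waitUntil: "domcontentloaded", timeout});

  // Wait explicitly for the content of interest and return its handle.
  return page.waitForSelector(selector, {timeout});
}

// Usage (inside an async function, with a Puppeteer `page`):
//   const h1 = await openAndWaitFor(page, "https://www.example.com", "h1");
```

Taking the page as a parameter keeps the helper independent of how the browser was launched, so it can be dropped into the template at the top of this gist.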

Unresolved

Haven't used yet but interesting
