- Logging JSON to the console
      page.on("console", msg =>
        Promise.all(msg.args().map(e => e.jsonValue()))
          .then(args => console.log(...args))
      );
- Registering `.on` handlers on a page
- Waiting for responses with `page.waitForResponse`
- Mocking responses and mocking fetch in browser context
- Navigating a SPA
- Promisifying `page.on("request", ...)`
- Opening a bunch of links: `browser.newPage()` can be used to create new tabs
- Persisting browser data:
- Filling out forms inside iframes
- Injecting functions into the browser:
- Disable JS: `await page.setJavaScriptEnabled(false);`
- Disable images (can also block other resource types such as `"stylesheet"`):
      await page.setRequestInterception(true);
      page.on("request", req => {
        if (req.resourceType() === "image") {
          req.abort();
        } else {
          req.continue();
        }
      });
- Puppeteer can't find elements when Headless TRUE
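
The request-interception snippet for disabling images generalizes to any set of resource types. Below is a minimal sketch, assuming a hypothetical helper named `blockResourceTypes` (not part of Puppeteer) that takes an already-open `page` and a list of resource type strings; it only uses the `setRequestInterception`, `resourceType`, `abort` and `continue` calls shown above.

```javascript
// Hypothetical helper (not a Puppeteer API): abort every request whose
// resource type is in `types` and let all other requests through.
async function blockResourceTypes(page, types = ["image", "stylesheet"]) {
  const blocked = new Set(types);

  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);

  page.on("request", req => {
    if (blocked.has(req.resourceType())) {
      req.abort();
    } else {
      req.continue();
    }
  });
}
```

It would be called once per page before navigating, for example `await blockResourceTypes(page, ["image", "font"])`.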
- Use the `Promise.all` pattern when triggering navigation.
- Make sure to use `page.waitForSelector("...", {visible: true})` before clicking an element when getting false negatives.
- If `page.waitForSelector` isn't firing or `page.click` has no effect, there may be a hidden element with the same selector earlier in the page than the targeted one, so Puppeteer is waiting for the wrong element to appear. Another possible scenario is that the element has no text or is zero width. See "Puppeteer in NodeJS reports 'Error: Node is either not visible or not an HTMLElement'". Similarly, using eval and a native visibility check can help bypass `{visible: true}` and `page.$()` false negatives.
- `page.exposeFunction` runs the function in Node context, not browser context. See: "Why can't I access 'window' in an exposeFunction() function with Puppeteer?"
- Detecting navigation end can be tricky and inconsistent. See "Puppeteer wait until page is completely loaded".
- Using `page.click` can change the history push behavior relative to triggering a navigation with `page.evaluate()`. See this.
- For some sites, `headless: false` works but `headless: true` fails. See this thread and this SO answer.
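
The first two tips above can be combined into one small helper. This is a sketch under assumptions: `clickAndWaitForNavigation` is a hypothetical name, and it presumes the click really does trigger a full navigation (for a SPA route change, `page.waitForNavigation` may never resolve). The point of `Promise.all` is that the navigation wait is registered before the click fires, so a fast navigation isn't missed.

```javascript
// Hypothetical helper (not a Puppeteer API): wait until the element is
// visible, then click it while already listening for the navigation.
async function clickAndWaitForNavigation(page, selector, navOptions = {}) {
  // Guards against clicking an element that exists but isn't visible yet.
  await page.waitForSelector(selector, {visible: true});

  // waitForNavigation is listed first so it is registered before the click.
  const [response] = await Promise.all([
    page.waitForNavigation(navOptions),
    page.click(selector),
  ]);

  return response;
}
```

Passing `{waitUntil: "domcontentloaded"}` as `navOptions` is one option when the default `"load"` event is slow to fire.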
On a site that requests a resource that has a long 504 gateway timeout, "load" (the default nav predicate) can time out but "domcontentloaded" won't:
    const puppeteer = require("puppeteer"); // ^22.7.1

    // request some resource with a 504 gateway timeout
    const html = `<img src="https://via.placeholder.com/150/FF0000/FFFFFF?text=CD">`;

    let browser;
    (async () => {
      browser = await puppeteer.launch();
      const [page] = await browser.pages();
      await page.setContent(html, {
        waitUntil: "load" // fix with: "domcontentloaded"
      }); // times out on "load"
    })()
      .catch(err => console.error(err))
      .finally(() => browser?.close());
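
One way to apply the `"domcontentloaded"` fix without giving up on readiness checks is to wait for the DOM and then for the one element the script actually needs. The helper below is a sketch, assuming a hypothetical name `setContentAndWaitFor` and an arbitrary 10-second default timeout; it uses only `page.setContent` and `page.waitForSelector`.

```javascript
// Hypothetical helper (not a Puppeteer API): set the page content without
// waiting for every subresource, then wait for a specific element.
async function setContentAndWaitFor(page, html, selector, timeout = 10000) {
  // "domcontentloaded" fires once the HTML is parsed, so a hanging image
  // or other slow resource doesn't hold this step up.
  await page.setContent(html, {waitUntil: "domcontentloaded", timeout});

  // Wait only for the element of interest rather than the whole page.
  return page.waitForSelector(selector, {timeout});
}
```

With the example above, `await setContentAndWaitFor(page, html, "img")` would resolve as soon as the `<img>` element is in the DOM, whether or not its source ever loads.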