Creating Reproducible Browser Automation Examples

Motivation

Without a minimal, reproducible example (reprex), it's usually impossible to answer your question.
Without a reprex, the question isn't likely to provide value to future visitors with the same problem (the main purpose of Q&A sites like Stack Overflow).

Question Checklist

Include your reprex code as text in the question itself, not as an image or an external link.
- Can someone copy and paste the code into an editor and run it as-is? If not, it's not complete.
- Does running the code reproduce the problem? If not, it's not reproducible.
- Is there anything in the code that can be removed, while still causing the failure? If so, then it's not minimal.
- Autoformat the code (Prettier for HTML, CSS and JS, Black for Python).
Show any error messages, with the full stack trace, as text, as generated by the reprex code.
Include the site you're automating, preferably with the URL or HTML string in the code itself.
- If the site is private and you can't provide access, then your problem is not reproducible.
- Try to come up with a sample site that reproduces the relevant issue, either a custom HTML/JS page, or a public site (preferably a site specifically created for testing browser automation).
Include versions for all packages. Include versions and details for system or environment, if relevant.
Show the exact expected output. If you're scraping to JSON or CSV, show a couple of objects or rows so the desired result is clear. If the automation involves filling out a form, show screenshots of the steps and of the completed form, filled out manually.
Finally, ask a specific technical question about your code.
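
To make the "include versions" item above concrete, here is a minimal sketch of a Node.js script that gathers the runtime, OS, and package versions into one block you can paste into the question. It assumes your packages are installed locally under ./node_modules in the current directory; the helper names `installedVersion` and `collectVersions` are hypothetical, and the package names in the last line are only placeholders.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Read the installed version of a package from <rootDir>/node_modules.
// Assumption: a local install; anything else is reported as "not found".
function installedVersion(name, rootDir = process.cwd()) {
  const manifest = path.join(rootDir, "node_modules", name, "package.json");
  try {
    return JSON.parse(fs.readFileSync(manifest, "utf8")).version;
  } catch {
    return "not found";
  }
}

// Gather runtime, OS and package versions into one plain object.
function collectVersions(packageNames, rootDir = process.cwd()) {
  const packages = {};
  for (const name of packageNames) {
    packages[name] = installedVersion(name, rootDir);
  }
  return {
    node: process.version,
    os: `${os.type()} ${os.release()} (${os.arch()})`,
    packages,
  };
}

// Placeholder package names; replace with whatever your reprex imports.
console.log(JSON.stringify(collectVersions(["puppeteer", "cheerio"]), null, 2));
```

Paste the printed JSON into the question as text, alongside the browser version if it's relevant to the problem.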

Examples

Great

Access elements inside closed shadow DOM
- Good: Provides custom example page under test.
- Good: Provides complete automation code.
- Minor issue: Missing package version.
Puppeteer - Cannot Target Checkbox on Hotel Website
- Good: Provides a link to the website being scraped, which is too complex to reproduce as a standalone page or to replace with a playground/sandbox site.
- Good: Provides complete automation code.
- Minor issue: Missing package version.

OK

Is it possible to write a "isAlertDisplayed" method on a separate AlertHelper file?
- Good: Provides a stable scraping playground site.
- Minor issue: Missing packages and all boilerplate.
How will I get the content ie the title of a tag while web scrapping with puppeteer?
- Good: Provides a stable scraping playground site.
- Minor issue: Missing packages and all boilerplate.
- Minor issue: Doesn't show output data as text (but a screenshot of dev tools is better than nothing).
How can I get an element by xpath?
- Good: No need for a website because the question is about core Puppeteer API functionality. This is an exception to the rule.
- Bad: Missing package version, which is really important for this particular question, because the method they're calling was only temporarily available in certain versions.
- Minor issue: Missing complete sample boilerplate (can't hurt to provide it, even if it's small).

Bad

Using Puppeteer, how do you get text from an <h1> tag?
- Bad: Doesn't show the site being automated.
- Bad: Doesn't show how the browser was launched or what happens between the page arriving at the site and the problematic selection.
- Bad: Potential XY problem (it's unusual to be scraping a server-side template).
Cheerio errors at runtime when built using rollup, dependency resolution seems to be the issue
- Good: Full stack trace was provided as text.
- Good: Package versions were given.
- Bad: Code was given as a link to a repo, which disappeared over time, rendering the question useless and unanswerable.
- Bad: Build commands were not provided (they're probably in the repo that can't be accessed).
Cheerio -Web Scraping - Not able to scrape innertext of a div
- Bad: No URL provided and impossible to answer.

Watch out for XY Problems

Simplifications can be good, but always provide context for what you're trying to achieve. When askers aren't able (or don't bother) to provide the actual page they're automating, they often simplify the problem in a way that invalidates answers or forces answers to use approaches that don't make sense, frustrating the asker, the answerer and future visitors alike.

For example, it's OK to ask "how do I select and click a button on page X?", but also mention your broader goal in clicking the button. If your goal is to scrape some data that can actually be accessed without the DOM, then clicking the button (problem Y) was never necessary to get the actual result (problem X).

Using HTML-only Snippets

If you're providing an HTML snippet, make sure async JS behavior, iframes, shadow roots or Cloudflare blocks aren't the real reason you can't select an element.

Provide at least the full HTML tree from the element up to the document root, omitting irrelevant or excessive <div>s if needed. Often, the best way to locate an element is not by its own attributes, but by its ancestor tree.

If an element is part of a list, provide at least two items of the list so it's clear what distinguishes one from the other.

Providing Screenshots

Screenshots can be misleading if it's not clear what state the page was in when the screenshot was taken, or whether the screenshot is from an automated browser or a human session. Things change dramatically when you switch from a human browsing session to browser automation, and even more so when you go headless. Don't assume things are the same.

Use caution with screenshots of dev tools component trees. These are dynamic and may not reflect what you see when your automation script runs.

Browser Automation Playground Sites

These are nearly perfect for a reprex, because they're unlikely to change and they isolate a single piece of functionality cleanly. The only downside is that they sometimes disappear over time, so they're slightly worse than making your own simple site.

Of course, if the behavior is too complex to capture in a simple site or a browser playground, you can share the actual site as a last resort, which is better than nothing.
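
If you go the "make your own simple site" route mentioned above, one possible approach is to keep the page's HTML inside the script and write it to a temporary file, so the whole reprex stays in a single copy-pasteable block. The sketch below is only an illustration: the page content, the one-second delay, and the helper name `writeSamplePage` are all made up, and the inline script adds a list item late to mimic content that loads asynchronously.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");
const { pathToFileURL } = require("node:url");

// A deliberately tiny page: one list with two items, plus a third item
// added by script after a delay to mimic asynchronously loaded content.
const html = `<!DOCTYPE html>
<html>
  <body>
    <ul id="results">
      <li class="item">first</li>
      <li class="item">second</li>
    </ul>
    <script>
      setTimeout(() => {
        const li = document.createElement("li");
        li.className = "item";
        li.textContent = "late";
        document.querySelector("#results").append(li);
      }, 1000);
    </script>
  </body>
</html>`;

// Write the page to a fresh temporary directory and return a file:// URL.
function writeSamplePage(markup) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "reprex-"));
  const file = path.join(dir, "index.html");
  fs.writeFileSync(file, markup);
  return pathToFileURL(file).href;
}

const url = writeSamplePage(html);
console.log(url);
```

The printed URL can then be passed to your automation library's navigation call (for example, Puppeteer's `page.goto`). With Puppeteer, passing the same string to `page.setContent` is an alternative that skips the file entirely.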

ggorlen/browser-automation-reprex.md