Running Playwright projects from GitHub inside a browser-extension runtime

Browser automation usually assumes a controller outside the browser: a local process, a CI worker, or a driver that can launch and own browser instances. 100xbot started from a different constraint.

The original 100xbot question was:

Can we do browser automation from inside the browser at all, given browser-extension CSP rules and the fact that we cannot just fetch JavaScript over the network and execute it?

The extension manifest keeps extension pages on script-src 'self' 'wasm-unsafe-eval'. Code from GitHub cannot be treated as a normal remote script include. The automation runtime also cannot depend on arbitrary network-loaded JavaScript running with extension privileges.

For a long stretch, this had nothing to do with Playwright. The first answer was 100xbot's own browser-native automation stack. It had background capabilities and content scripts. It had DOM and tab tools. It had workflow execution and file storage. It had a JS-compatible interpreter, and later 100xui for page-local automation.

At that point, the browser side worked. The LLM side was still open.

The tool list kept growing. Each tool had its own input contract and output contract. Each also had rules for tabId and selectors. Some had rules for file keys or page handles. The model had to be taught those private 100xbot rules before it could do ordinary browser work. Even then, it would often drift back toward Playwright-like code. That is the browser automation pattern it had seen the most.

Playwright entered later because fighting that prior was the wrong use of the model. The model naturally wants to write page.locator(...), getByRole(...), and expect(...). The runtime should make that code meaningful. Forcing every workflow through a private vocabulary was the expensive path.

Playwright is designed around a Node-side runner that owns the browser process. The official docs make that architecture explicit: the runner installs browser binaries, runs configured projects across browser engines and branded channels, and exposes APIs such as BrowserType.launch() and BrowserType.connect() for controlling browser instances from the runner side.

That model works because the runner has filesystem access and process control. It also owns project configuration and browser contexts. It owns reporters and traces. It owns videos, downloads, and browser-version matching.

100xbot lives somewhere else. It is a browser extension with the user's real browser and real tabs. It has extension permissions and content scripts. It has background capabilities, IndexedDB storage, and a workflow runtime.

The later Playwright question became:

Can existing Playwright repositories from GitHub execute through a CSP-compatible 100xbot runtime, with repo files stored locally first and no separate runner?

The answer is not a blanket yes. It is a useful partial yes, with test runs and failure cases attached.

Before Playwright

The first version of browser automation in 100xbot was much lower level.

It started with tab and DOM capabilities. Instead of one vague "web page" abstraction, the system grew explicit operations:

dom_click
dom_getText
dom_getValue
dom_querySelector
dom_querySelectorAll
dom_setAttribute
dom_setValue
tab operations for opening, updating, and inspecting browser tabs

Then came dom_executeWang and the content-script executor. Automation code could run in a controlled extension/content-script path instead of being fetched and injected as arbitrary page JavaScript.

The boring problems arrived quickly: scrolling, typing, contenteditable fields, iframes, shadow DOM, DOM-to-file capture, workflow search, and interpreter behavior. None of this looked like a Playwright repo yet. It was just the work needed to make a browser extension drive real pages without pretending it had the powers of a Node process.

The runtime also moved from Wang toward jslike, keeping execution local and interpreter-based while making workflow code closer to normal JavaScript.

100xui made the CSP issue concrete. It matched URLs, resolved selectors, highlighted elements, injected content, attached menus, triggered workflows, and automated page interactions. Its portal renderer moved away from HTML templates and toward hyperscript m(), so event handlers attach with addEventListener rather than inline attributes. The same constraint kept showing up: do useful page automation without pretending CSP does not exist.

The conversation history around this period was about URL-matched UI injection, menu actions, background messages, interpreter execution, CSP-safe event handling, and iframe/shadow-DOM selector chains. None of that was Playwright compatibility. It was the native 100xbot answer to browser automation under extension constraints.

Later, the design shifted again. The app-framework work proposed giving workflows a browser facade instead of making agents juggle low-level capability names. One correction from that thread stuck: page should not appear magically. Workflow code should acquire it from a tab handle:

const page = await browser.page(__inputs.tabId);

The tool-count work pushed the same conclusion from another angle. There were too many dom_* and tab_* tools in the prompt-visible tool list, but the raw count was only part of the problem. The model also had to learn the contract for each tool:

which inputs belonged at the outer tool-call level
which values belonged inside workflow_execute
when a result was wrapped in { success, data }
when a direct workflow call returned a plain string, array, or object
when a page handle was available
when a numeric tabId was still required

That made browser automation less predictable than it needed to be. The model would sometimes mix a PlaywrightLikePage with raw dom_* calls, or read .success and .data from a value that was already unwrapped inside workflow code. The pattern was consistent: we were asking the model to keep a private execution model in its head while its training prior kept pulling it toward Playwright-style browser code.
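The wrapped-versus-unwrapped confusion can be made concrete with a small sketch. The type and helper names here (Wrapped, isWrapped, unwrap) are assumptions for illustration, not 100xbot's actual contracts; only the { success, data } shape comes from the text.

```typescript
// Illustrative only: these types and helpers are assumptions,
// not 100xbot's real tool contracts.
type Wrapped<T> = { success: boolean; data: T };

// Type guard: does this value look like a { success, data } envelope?
function isWrapped(value: unknown): value is Wrapped<unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    "success" in value &&
    "data" in value
  );
}

// Accept either shape and always hand back the payload.
function unwrap<T>(value: T | Wrapped<T>): T {
  if (isWrapped(value)) return value.data as T;
  return value as T;
}

// An outer tool call returns an envelope; a direct workflow call returns the value.
const fromToolCall = unwrap<string>({ success: true, data: "Example Domain" });
const fromWorkflow = unwrap<string>("Example Domain");
console.log(fromToolCall === fromWorkflow); // true
```

A helper like this only papers over the problem. The model still has to know which shape it is holding, which is the contract burden described above.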

PlaywrightLikePage already existed as a facade over the raw capabilities, and workflow_execute already gave the runtime a place to run multi-step browser code. The design direction became:

hide most raw DOM tools from the prompt
keep them internally available
route composed browser automation through workflow_execute

The before/after was simple:

before: ask the model to choose among many browser tools and remember each input/output contract
after: ask the model to write browser code against a familiar page/locator API, then let the runtime map that code onto extension capabilities

By April 30, that facade had been tested in ordinary browser tasks. A live workflow used:

browser.page(tabId)
locator(...)
fill(...)
press("Enter")

against Google search. That was still not "run a Playwright repo from GitHub." It was proof that the Playwright-like page abstraction could drive a real tab through the existing workflow runtime.
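A minimal sketch of what such a workflow body might look like follows. Only the method names browser.page, locator, fill, and press come from the text; the interface shapes, the selector, and the recording stand-in are assumptions so the example runs without a real tab.

```typescript
// Hypothetical shapes: assumed for this sketch, not the real 100xbot types.
interface LocatorLike {
  fill(value: string): Promise<void>;
  press(key: string): Promise<void>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}
interface BrowserLike {
  page(tabId: number): Promise<PageLike>;
}

// The workflow body: acquire the page from a tab handle, then drive it.
async function searchFor(browser: BrowserLike, tabId: number, query: string): Promise<void> {
  const page = await browser.page(tabId);
  const box = page.locator('textarea[name="q"]'); // selector is an assumption
  await box.fill(query);
  await box.press("Enter");
}

// A recording stand-in that logs each call instead of touching a browser.
const calls: string[] = [];
const fakeBrowser: BrowserLike = {
  async page(tabId: number): Promise<PageLike> {
    calls.push(`page(${tabId})`);
    return {
      locator(selector: string): LocatorLike {
        calls.push(`locator(${selector})`);
        return {
          async fill(value: string): Promise<void> {
            calls.push(`fill(${value})`);
          },
          async press(key: string): Promise<void> {
            calls.push(`press(${key})`);
          },
        };
      },
    };
  },
};
```

Calling searchFor(fakeBrowser, 42, "playwright") fills the calls array in the order a real runtime would have to map onto extension capabilities.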

By the time Playwright compatibility work started, 100xbot already had content-script and background execution paths; DOM/tab/file/CDP capabilities; workflow_execute as the generic execution entry point; WangWorkflowExecutor running jslike; IndexedDB-backed file storage; selector resolution across normal DOM/iframe/shadow DOM contexts; and incident history around content-script lifecycle, sandbox APIs, illegal invocation errors, and navigation survival.

Playwright had to fit into that platform, not replace it.

Bringing Playwright In

The Playwright work asked a narrower runtime question:

Can the existing browser-extension runtime accept a Playwright-style project from GitHub and execute the parts that map cleanly onto 100xbot's local capabilities?

The first version of that plan proposed a new playwright_run capability. That was the wrong abstraction. Playwright is a library API that code may use. It is not the executor. Keeping workflow_execute as the executor also kept the LLM contract smaller: write code in one place, use familiar browser primitives, and let the runtime handle the bridge to extension capabilities.

The better design kept the existing executor. workflow_execute stayed the generic code execution capability. Project mode added inputs such as githubUrl, repoUrl, ref, projectPrefix, entry, testMatch, and baseURL. GitHub repositories came in as file data. Repository files executed through the existing WangWorkflowExecutor and jslike runtime. @playwright/test resolved to a native facade backed by 100xbot's Playwright-like browser/page/context APIs. Missing browser behavior belonged in shared capability/runtime layers.
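As a rough picture of the project-mode inputs, the sketch below types them as an interface. The field names come from the text; the types, the optionality, the example values (including the ref), and the hasProjectSource check are all assumptions.

```typescript
// Field names are from the text. Types and optionality are assumed.
interface ProjectModeInputs {
  githubUrl?: string;
  repoUrl?: string;
  ref?: string;
  projectPrefix?: string;
  entry?: string;
  testMatch?: string | string[];
  baseURL?: string;
}

// Hypothetical check: project mode needs some way to locate project files.
function hasProjectSource(inputs: ProjectModeInputs): boolean {
  return Boolean(inputs.githubUrl ?? inputs.repoUrl ?? inputs.projectPrefix);
}

// Example values are illustrative; the branch name is a guess.
const exampleInputs: ProjectModeInputs = {
  githubUrl: "https://github.com/test-to-exist/the-internet-playwright",
  ref: "main",
  testMatch: "**/*.spec.ts",
};
console.log(hasProjectSource(exampleInputs)); // true
```

The point of the shape is that Playwright-specific details arrive as inputs to the existing executor rather than as a second capability.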

In tracker form, the path became:

GitHub repository
  -> GitHubRepoSync
  -> IndexedDB file_storage
  -> generated project wrapper
  -> WangWorkflowExecutor + jslike
  -> @playwright/test facade
  -> Playwright-like browser/page/locator/context objects
  -> extension DOM, tab, and CDP capabilities

100xbot does not bypass the extension security model with remote executable scripts. It accepts Playwright projects as data, stores those files locally in the extension runtime, runs the parts it can express, and records the gaps.

Importing a GitHub repo into the runtime

The import layer lives in GitHubRepoSync. It parses GitHub URLs, fetches the repo tree through the GitHub API, skips paths that do not belong in a browser-side project cache, fetches raw text files, and stores them under a deterministic prefix:

playwright-projects/{owner}/{repo}/{path}

The default exclusions are intentionally practical: node_modules, .git, Playwright reports, test results, build artifacts, caches, videos, traces, coverage, and common binary/archive formats. Existing files are reused instead of downloaded again, so repeated runs do not keep refetching the same repository content.
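The URL parsing, key layout, and exclusion filter can be sketched roughly as follows. The function names, the regular expression, and the specific exclusion entries are assumptions modelled on the description, not the GitHubRepoSync source; only the playwright-projects/{owner}/{repo}/{path} layout comes from the text.

```typescript
// Sketch only: names and rules here are assumed, not taken from GitHubRepoSync.
interface RepoRef {
  owner: string;
  repo: string;
}

// Accepts plain repo URLs, ".git" suffixes, and deeper paths such as /tree/main.
function parseGitHubUrl(url: string): RepoRef | null {
  const match = /^https:\/\/github\.com\/([^\/]+)\/([^\/#?]+?)(?:\.git)?(?:[\/#?].*)?$/.exec(url);
  return match ? { owner: match[1], repo: match[2] } : null;
}

// Deterministic storage key for one repository file.
function storageKey(ref: RepoRef, path: string): string {
  return `playwright-projects/${ref.owner}/${ref.repo}/${path}`;
}

// A small, assumed subset of the default exclusions.
const EXCLUDED_SEGMENTS = ["node_modules", ".git", "playwright-report", "test-results", "coverage"];
const EXCLUDED_EXTENSIONS = [".zip", ".webm", ".png"];

function isExcluded(path: string): boolean {
  const segments = path.split("/");
  if (segments.some((s) => EXCLUDED_SEGMENTS.includes(s))) return true;
  const lower = path.toLowerCase();
  return EXCLUDED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

const ref = parseGitHubUrl("https://github.com/blueimp/playwright-example");
if (ref !== null) {
  console.log(storageKey(ref, "tests/example.spec.ts"));
  // playwright-projects/blueimp/playwright-example/tests/example.spec.ts
}
```

Because the key is derived only from owner, repo, and path, a second import of the same repository can check for an existing key before fetching.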

workflow_execute then prepares project execution:

If githubUrl or repoUrl is present, import the repository into file storage.
If the project has no .env but does have an example env file, create the root .env in file storage before execution.
If an explicit entry is provided, import that file.
Otherwise, discover setup files and .spec / .test files under the project prefix.
Apply testMatch as the run filter.
Generate a wrapper module that imports the config, setup files, and test files, then returns the accumulated test results.
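The discovery and wrapper steps above can be sketched as two small functions. The file-name pattern, the use of a RegExp for testMatch, the wrapper's exact text, and the __collectResults hook name are all assumptions; the real generated wrapper is not shown in this article.

```typescript
// Sketch under assumptions: matching rules and wrapper shape are illustrative.
function discoverTestFiles(paths: string[], prefix: string, testMatch?: RegExp): string[] {
  return paths
    .filter((p) => p.startsWith(prefix))
    .filter((p) => /\.(spec|test)\.[cm]?[jt]sx?$/.test(p))
    .filter((p) => (testMatch ? testMatch.test(p) : true)) // pass a non-global RegExp
    .sort();
}

function generateWrapper(configPath: string | null, setupFiles: string[], testFiles: string[]): string {
  const lines: string[] = [];
  // Import order: config first, then setup files, then tests.
  if (configPath !== null) lines.push(`import ${JSON.stringify(configPath)};`);
  for (const file of setupFiles) lines.push(`import ${JSON.stringify(file)};`);
  for (const file of testFiles) lines.push(`import ${JSON.stringify(file)};`);
  // __collectResults is a hypothetical hook for returning accumulated results.
  lines.push("export default __collectResults();");
  return lines.join("\n");
}

const stored = [
  "playwright-projects/a/b/tests/login.spec.ts",
  "playwright-projects/a/b/tests/cart.test.js",
  "playwright-projects/a/b/README.md",
];
const tests = discoverTestFiles(stored, "playwright-projects/a/b/");
console.log(generateWrapper("playwright-projects/a/b/playwright.config.ts", [], tests));
```

The wrapper is plain module text built from stored file paths, so nothing is fetched or injected at execution time.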

The runner does not shell out to npm install, clone onto local disk, start a Playwright process, or inject remote script tags. Imported repo files are data until the controlled workflow runtime parses and executes them.

What Real Repos Broke

The first temptation was to add methods to PlaywrightLikePage and call it compatibility. Real repositories made that look naive.

The repos exercised the test framework: test, test.describe, hooks, test.use, test.extend, test.step, retries, skips, expected failures, and serial behavior. They also used browser/context/page objects such as browser.newContext(); project config such as baseURL and storage state; page objects; TypeScript imports; locators like getByRole, getByText, getByLabel, and getByTestId; assertion APIs like toHaveText, toContainText, and toHaveURL; routing; HAR replay; fake clocks; screenshots; downloads; and file inputs.

So the work spread across the test framework, module resolver, browser/page/context objects, locators, content selector resolver, workflow executor, and upstream jslike.

The small packages mattered too

Some of the compatibility work had nothing to do with Playwright APIs. Real Playwright repos also import the small packages test suites tend to use: dotenv, dotenv/config, uuid, and @faker-js/faker. Those imports are boring in Node. In 100xbot they matter because there is no npm install step and no remote package execution. The module resolver has to map known package names to browser-safe local implementations or facades.

That is part of the same LLM problem. A model will naturally preserve or write imports like dotenv, faker, and uuid when it is producing Playwright-style test code. Telling it not to use common test dependencies is another prompt tax. Handling the common ones in the runtime made imported repos and generated workflows behave more like the code the model already wanted to write.
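A resolver of this kind can be pictured as a lookup table from package specifier to a local factory. The registry contents, facade shapes, and error text below are assumptions; only the package names come from the text, and the uuid stand-in returns a fixed placeholder rather than a random value.

```typescript
// Hypothetical resolver: registry and facade shapes are assumed for illustration.
type ModuleFacade = Record<string, unknown>;

const localModules = new Map<string, () => ModuleFacade>([
  ["dotenv", () => ({ config: () => ({ parsed: {} }) })],
  ["dotenv/config", () => ({})],
  // Placeholder value only; a real facade would generate random UUIDs.
  ["uuid", () => ({ v4: () => "00000000-0000-4000-8000-000000000000" })],
]);

function resolveModule(specifier: string): ModuleFacade {
  const factory = localModules.get(specifier);
  if (factory === undefined) {
    // No npm install and no remote fetch: unknown packages fail loudly.
    throw new Error(`Unsupported module in this runtime: ${specifier}`);
  }
  return factory();
}

const uuid = resolveModule("uuid");
console.log(typeof uuid.v4); // function
```

Failing loudly on unknown specifiers keeps the gap visible in the run results instead of silently executing a test with a missing dependency.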

One example: a later repo failed on this pattern:

page.getByLabel("nav-categories").getByText("Hand Tools")

Native Playwright matched an element with aria-label="nav-categories". Our first getByLabel() only handled form-label semantics. The fix was not in the test. The compatibility layer had to support ARIA label semantics too.
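The two label semantics can be sketched over a simplified element model. This is not the 100xbot selector resolver: the El shape is invented for the example, matching is exact-string only, and real accessible-name resolution covers more cases (aria-labelledby, wrapping labels, substring matching) than shown here.

```typescript
// Simplified element model for illustration only.
interface El {
  tag: string;
  id?: string;
  ariaLabel?: string;
  labelFor?: string; // models <label for="...">
  text?: string;
}

function getByLabel(elements: El[], label: string): El[] {
  // Form-label semantics: ids referenced by a <label for> whose text matches.
  const labelledIds = new Set<string>();
  for (const e of elements) {
    if (e.tag === "label" && e.text === label && e.labelFor !== undefined) {
      labelledIds.add(e.labelFor);
    }
  }
  return elements.filter(
    (e) =>
      (e.id !== undefined && labelledIds.has(e.id)) ||
      // ARIA semantics: aria-label on the element itself.
      e.ariaLabel === label,
  );
}

const dom: El[] = [
  { tag: "label", text: "Email", labelFor: "email" },
  { tag: "input", id: "email" },
  { tag: "ul", ariaLabel: "nav-categories" },
];
console.log(getByLabel(dom, "nav-categories").map((e) => e.tag)); // [ 'ul' ]
```

A resolver that implements only the first branch finds the email input but never the navigation list, which is the failure described above.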

Another example: a repo configured:

use: {
  testIdAttribute: "data-test"
}

Native Playwright carries that into page.getByTestId(). The first 100xbot path still looked for data-testid, which meant it was polling the wrong selector. The fix was to make testIdAttribute part of the page/context contract and preserve it through chained locators.
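One way to picture "preserve it through chained locators" is a locator that hands its options to every locator it creates. The class and function names below are assumptions, and building a CSS selector string is a simplification of whatever the real resolver does; the data-testid default matches Playwright's documented default.

```typescript
// Sketch: names other than testIdAttribute are assumed.
interface LocatorOptions {
  testIdAttribute: string;
}

class SketchLocator {
  readonly selector: string;
  private readonly options: LocatorOptions;

  constructor(selector: string, options: LocatorOptions) {
    this.selector = selector;
    this.options = options;
  }

  getByTestId(testId: string): SketchLocator {
    const next = `[${this.options.testIdAttribute}=${JSON.stringify(testId)}]`;
    const chained = this.selector === "" ? next : `${this.selector} ${next}`;
    // Pass the same options down so deeper chains keep the configured attribute.
    return new SketchLocator(chained, this.options);
  }
}

function pageLocator(options: Partial<LocatorOptions> = {}): SketchLocator {
  return new SketchLocator("", {
    testIdAttribute: options.testIdAttribute ?? "data-testid",
  });
}

const cart = pageLocator({ testIdAttribute: "data-test" })
  .getByTestId("inventory")
  .getByTestId("add-to-cart");
console.log(cart.selector); // [data-test="inventory"] [data-test="add-to-cart"]
```

If the second getByTestId rebuilt its options from defaults instead of inheriting them, the chain would mix data-test and data-testid and poll a selector that never matches.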

Most failures were not exotic. They came from defaults that ordinary Playwright projects assume.

What We Ran

The work started with user-supplied public repositories:

microsoft/playwright-examples
ecureuill/saucedemo-playwright
blueimp/playwright-example
later checks against test-to-exist/the-internet-playwright, hmcts/tcoe-playwright-example, clerk/playwright-e2e-template, and hathihuyen/ecom-test-project-playwright

Three of them drove the implementation.

SauceDemo

ecureuill/saucedemo-playwright looked like an app test suite: page objects, login-ish flows, selectors, fixtures, and state.

It also showed why public demo apps are noisy baselines. A repo can fail in native Playwright because the demo site changed. Once that happened, it stopped being a clean compatibility signal, so we kept the learnings and moved the baseline elsewhere.

The Internet

test-to-exist/the-internet-playwright became the primary comparison target.

Native Chromium baseline from May 8, 2026:

34 Chromium tests discovered
0 unexpected failures/timeouts
0 flaky
0 skipped

100xbot rerun from May 8, 2026:

34 tests executed
32 passed
2 failed
1 status mismatch against expected status
.env was created from example.env in file storage before execution

The remaining useful mismatch was geolocation:

Test: geolocation.spec.ts / Should access and show users geolocation
Native status: passed
100xbot status: failed
Reason: browser-context permission parity is not available in the current extension runtime

Playwright grants permissions at the browser-context level. 100xbot can apply Emulation.setGeolocationOverride to a page target, and it can try chrome.contentSettings.location.

The verified extension runtime did not expose a browser CDP target for full Browser.grantPermissions parity. We did not add a page-script shim for geolocation prompts. If this gets fixed, it should live in the shared browser/context capability layer.
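The scope difference shows up in the shape of the two CDP commands. The sketch below only builds the command payloads. CdpCommand and the builder functions are hypothetical helpers, while the method names and parameters follow the Chrome DevTools Protocol. How 100xbot actually dispatches commands is not shown in this article.

```typescript
// Hypothetical helper types; only the CDP method names and params are real.
interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

// Page-target scope: overrides the position that one page sees.
function geolocationOverride(latitude: number, longitude: number, accuracy: number = 1): CdpCommand {
  return {
    method: "Emulation.setGeolocationOverride",
    params: { latitude, longitude, accuracy },
  };
}

// Browser-target scope: grants the permission itself. This needs a
// browser-level CDP session, which the verified extension runtime lacked.
function grantGeolocation(origin: string): CdpCommand {
  return {
    method: "Browser.grantPermissions",
    params: { permissions: ["geolocation"], origin },
  };
}

console.log(geolocationOverride(52.52, 13.405).method);
console.log(grantGeolocation("https://example.com").method);
```

The first command can change the coordinates but cannot answer the permission prompt, so a test that waits on the prompt being granted still fails.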

Microsoft Playwright Examples

microsoft/playwright-examples gave a different kind of signal: API mocking, route handling, HAR replay, and clock behavior.

Native Chromium result from May 8, 2026:

15 tests executed
7 passed
8 failed
The 8 failures were boxed-step / boxed-step-POM tests with net::ERR_NAME_NOT_RESOLVED for https://cloudtesting.contosotraders.com/

100xbot Chrome rerun after route/clock/HAR fixes:

15 tests executed
7 passed
8 failed
The 8 failures matched the same native DNS failure class

The useful part was the comparison: the failures matched native Chromium. The API mocking tests passed. The clock tests passed. The remaining failures came from the sample site's DNS behavior, not from 100xbot.

How It Got Here

The work did not move in a straight line from "browser extension" to "Playwright." First came tab automation and explicit DOM capabilities. Then came content-script execution. Then iframe and shadow-DOM handling, file-backed DOM capture, workflow search, and the move from Wang to jslike.

100xui brought the same runtime into page-local automation. The app-framework work then put a familiar browser/page facade over the lower-level tools. The next step was to hide most raw DOM tools from the prompt-visible tool list and push composed browser work through workflow_execute.

Only after that did GitHub-imported Playwright projects become a realistic target. The Playwright-specific work added repo sync, project-mode execution, module resolution, @playwright/test facades, fixture behavior, stronger expect handling, route primitives, screenshots, environment-file setup, selector fixes, and a tracker for parity runs.

Boundaries

These choices kept the implementation tied to the runtime instead of a demo-only shim.

workflow_execute remains the generic execution entry point; there is no separate Playwright-specific runner.

The capability name stayed framework-neutral.

The geolocation failure is tracked as a browser-context permission gap instead of being hidden with page-script tricks.

Multi-browser parity is out of scope for the current work. 100xbot runs in a controlled Chromium extension runtime today. Native Playwright can run across Chromium, Firefox, WebKit, branded Chrome, branded Edge, and device configurations.

Artifact parity is separate. Native Playwright has reporters, traces, videos, downloads, HAR files, and test-results layouts. 100xbot can return test result objects and some artifacts; byte-compatible Playwright output is a separate problem.

Third-party repo failures were not treated as 100xbot failures until native Playwright passed the same:

repo
browser/project selection
environment
test selection

Compatibility rule

The tracker now has a simple rule:

Do not claim a repo is supported until there is a stored native Playwright baseline and a stored 100xbot run for the same:

project
environment
test selection
browser/project selection

For now, comparisons are against native Chromium unless 100xbot grows explicit multi-browser project support.

The comparison fields are:

file path
test title
expected status
actual status
retry number
failure message
artifacts when relevant

That rule keeps the claim tied to actual runs.
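A comparison over those fields might look like the sketch below. The record shape mirrors the listed comparison fields, but the type names, the key format, and the "missing" status are assumptions rather than the tracker's actual schema.

```typescript
// Sketch: shape follows the comparison fields; names are assumed.
interface TestRecord {
  file: string;
  title: string;
  expectedStatus: string;
  actualStatus: string;
  retry: number;
  failureMessage?: string;
}

interface Mismatch {
  key: string;
  native: string;
  candidate: string;
}

// Report every native result whose status the candidate run did not reproduce.
function compareRuns(native: TestRecord[], candidate: TestRecord[]): Mismatch[] {
  const keyOf = (r: TestRecord): string => `${r.file} / ${r.title} #${r.retry}`;
  const byKey = new Map<string, TestRecord>();
  for (const r of candidate) byKey.set(keyOf(r), r);

  const mismatches: Mismatch[] = [];
  for (const n of native) {
    const c = byKey.get(keyOf(n));
    const status = c === undefined ? "missing" : c.actualStatus;
    if (status !== n.actualStatus) {
      mismatches.push({ key: keyOf(n), native: n.actualStatus, candidate: status });
    }
  }
  return mismatches;
}

const title = "Should access and show users geolocation";
const nativeRun: TestRecord[] = [
  { file: "geolocation.spec.ts", title, expectedStatus: "passed", actualStatus: "passed", retry: 0 },
];
const extensionRun: TestRecord[] = [
  { file: "geolocation.spec.ts", title, expectedStatus: "passed", actualStatus: "failed", retry: 0 },
];
console.log(compareRuns(nativeRun, extensionRun).length); // 1
```

Comparing against the native actual status, not against "passed", is what lets a repo with native failures (such as the DNS failures above) still count as matching.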

What This Gives 100xbot

Running Playwright projects from GitHub inside a browser extension is not a replacement for Playwright, CI, browser-process ownership, or multi-engine testing.

For 100xbot, existing test automation code can become browser-agent executable knowledge.

A GitHub repo can contain real workflows, selectors, page objects, fixtures, route mocks, and assertions. Once 100xbot can import the repo, resolve modules, and execute tests against real browser tabs, the browser agent can reuse the testing ecosystem instead of asking users to rewrite everything as custom automation snippets.

The browser is a bad place to reimplement Playwright. Extension CSP also blocks the lazy version of "just load remote code and run it." The practical path is narrower: keep execution inside a local extension runtime, import GitHub projects as files, resolve known module APIs, and map browser automation calls onto shared 100xbot capabilities.

The gaps are still real:

browser-context permission grants
multi-browser projects
stronger context isolation
complete timeout and actionability semantics
full Node/npm package emulation
CommonJS and dynamic import behavior
reporter and artifact parity
video, tracing, and exact HAR/report layouts

The current state: 100xbot can import Playwright projects from GitHub, store them in browser-side file storage, discover and execute project test files through workflow_execute, run Playwright-style @playwright/test code through a compatibility facade, match native Chromium results on one public sample, and get within one known browser-context permission gap on the primary reference sample.

So the claim is not "Playwright in the browser." The narrower claim is:

100xbot can run Playwright-style projects from GitHub through a browser-extension-native execution stack.

The LLM lesson is the bigger one. Predictability improved when the prompt stopped asking the model to learn a private browser-automation API and the runtime absorbed that translation instead. The model writes the kind of browser code it already tends to write; 100xbot makes that code run inside the extension runtime.

artpar/playwright-execution-in-browser.md
