usirin/2026-03-14-a11y-automation-design.md

Created March 14, 2026 06:46

Cross-Platform Accessibility Automation — Design Doc

Cross-Platform Accessibility Automation

Problem

PlusQA's accessibility audit has surfaced 382 [A11y] tickets (335 open) across iOS (31%), Android (36%), and web/desktop (22%). Fixing these manually requires engineers to navigate to hard-to-reach screens, understand WCAG criteria, make the fix, then verify with a screen reader. This doesn't scale: feature teams shouldn't be spending cycles on mechanical a11y prop additions when roughly 70% of fixes are templatable.

Approach: A Collection of Primitives

Not a framework. Not a monolithic system. A set of composable primitives that:

Read a ticket and locate the code to fix
Fix the code (template or LLM-generated)
Verify the fix with deterministic property checks
Ship the fix as a PR

Each primitive is independently useful from day 1. The LLM (Claude Code) is the glue that composes them, not the brain that makes judgment calls.

Audit Breakdown

WCAG Category	Count	%	Fix template
1.3.1 Info and Relationships	93	24%	`add_role`
2.4.3 Focus Order	64	17%	`fix_focus_order`
4.1.2 Name, Role, Value	47	12%	`add_label` / `add_role`
2.4.6 Headings and Labels	33	9%	`add_heading`
1.4.4 Resize Text	30	8%	`fix_truncation`
1.1.1 Non-text Content	27	7%	`add_alt_text`
1.4.3 Contrast Minimum	21	6%	`fix_contrast`
Other	67	17%	`custom`

~70% of fixes are mechanical: add an accessibilityLabel, set a role, mark a heading. These are AST transforms, not creative work.

Architecture

TICKET (Asana)
  │
  ├─ INGEST: parse WCAG, platform, search terms (deterministic)
  │
  ├─ LOCATE: grep → LSP hover/incomingCalls/findReferences → module graph
  │   │       (deterministic — finds files, types, screen, blast radius)
  │   │
  │   └─ Cap comment (if available): skip locate, use provided file paths
  │
  ├─ FIX: template transform or LLM code gen
  │   │   (deterministic for templates, LLM for ~30% custom fixes)
  │   │
  │   └─ LSP diagnostics: type-check the fix before building
  │
  ├─ VERIFY: property-based checks against platform-native a11y tree
  │   │
  │   ├─ T1 (static): LSP diagnostics + AST prop check — no build needed
  │   ├─ T2 (a11y tree): build → navigate → snapshot → property check
  │   └─ T3 (visual): T2 + screenshot at 200% zoom + contrast check
  │
  └─ SHIP: git branch → commit → PR → update Asana
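
A minimal sketch of how the stages above might compose, assuming each stage is an injected async function. The type names and signatures (`Ticket`, `CodeLocation`, `Patch`, `Stages`, `runTicket`) are hypothetical; only the stage order and the Cap shortcut come from the diagram.

```typescript
// Hypothetical shapes; the real primitives may carry more data.
interface Ticket { id: string; wcag: string; platform: "web" | "ios" | "android" }
interface CodeLocation { file: string; line: number }
interface Patch { file: string; diff: string }

interface Stages {
  locate(ticket: Ticket): Promise<CodeLocation>
  fix(ticket: Ticket, at: CodeLocation): Promise<Patch>
  verify(ticket: Ticket, patch: Patch): Promise<boolean>
  ship(ticket: Ticket, patch: Patch): Promise<string> // resolves to a PR reference
}

type Outcome =
  | { status: "shipped"; pr: string }
  | { status: "rejected"; stage: "verify" }

// `ticket` is assumed to be the output of INGEST.
// `capLocation` is the file location from a Cap comment, when one exists.
async function runTicket(
  ticket: Ticket,
  stages: Stages,
  capLocation?: CodeLocation,
): Promise<Outcome> {
  // Cap comment present: skip LOCATE and use the provided location.
  const at = capLocation ?? (await stages.locate(ticket))
  const patch = await stages.fix(ticket, at)
  // A patch that fails verification is never shipped.
  if (!(await stages.verify(ticket, patch))) {
    return { status: "rejected", stage: "verify" }
  }
  return { status: "shipped", pr: await stages.ship(ticket, patch) }
}
```

Injecting the stages keeps each primitive replaceable and lets the orchestration be tested with stubs, without a device or a dev server.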

Property-Based Verification

Core insight

Define universal accessibility invariants (properties that must always hold), then implement them per platform against raw native data. The interface is universal; the implementations are platform-specific.

Verification protocol

1. Navigate to screen
2. Run properties → violations_before
3. Assert: target violation EXISTS        ← confirms bug is real
4. Apply fix, hot reload (1-3 seconds)
5. Run properties → violations_after
6. Assert: target violation GONE          ← confirms fix works
7. Assert: no NEW violations              ← no regressions

Step 3 is critical — if properties can't detect the bug before the fix, fail fast. Step 7 is free regression detection.
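
The seven steps can be sketched as a small runner. This is an illustration under assumptions: violations are reduced to stable string keys, and `VerifyDeps`, `Verdict`, and `verifyFix` are hypothetical names, not existing pipeline code.

```typescript
// Hypothetical dependencies, injected so the runner is platform-agnostic.
interface VerifyDeps {
  navigate(): Promise<void>
  runProperties(): Promise<string[]> // violation keys for the current screen
  applyFix(): Promise<void>          // patch the file and wait for hot reload
}

type Verdict =
  | { ok: true }
  | { ok: false; step: 3 | 6 | 7; detail: string }

async function verifyFix(target: string, deps: VerifyDeps): Promise<Verdict> {
  await deps.navigate()                                     // step 1
  const before = await deps.runProperties()                 // step 2
  if (before.indexOf(target) === -1) {                      // step 3: fail fast
    return { ok: false, step: 3, detail: "target violation not detected before fix" }
  }
  await deps.applyFix()                                     // step 4
  const after = await deps.runProperties()                  // step 5
  if (after.indexOf(target) !== -1) {                       // step 6
    return { ok: false, step: 6, detail: "target violation still present" }
  }
  const added = after.filter(k => before.indexOf(k) === -1) // step 7
  if (added.length > 0) {
    return { ok: false, step: 7, detail: "new violations: " + added.join(", ") }
  }
  return { ok: true }
}
```

Returning the failing step number keeps the failure actionable: a step 3 failure points at the property or the navigation, while steps 6 and 7 point at the fix itself.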

Platform interface

interface AccessibilityPlatform<V> {
  name: string

  checks: {
    interactiveElementsHaveLabels(): V[]  // WCAG 4.1.2
    interactiveElementsHaveRoles(): V[]   // WCAG 4.1.2
    imagesAreAccessible(): V[]            // WCAG 1.1.1
    headingsAreMarked(): V[]              // WCAG 2.4.6
    focusOrderMatchesVisualOrder(): V[]   // WCAG 2.4.3
    textMeetsContrastMinimum(): V[]       // WCAG 1.4.3
    noTruncationAtZoom(): V[]             // WCAG 1.4.4
    statefulElementsExposeState(): V[]    // WCAG 4.1.2
  }

  navigate: {
    openUrl(url: string): Promise<void>
    tap(query: PlatformQuery): Promise<void>
    snapshot(): Promise<PlatformTree>
    screenshot(): Promise<Buffer>
  }
}

Each platform (web, ios, android) implements checks against its raw native data — no lossy normalization into a universal tree format. Web checks work with Playwright's ARIA snapshot (knows about heading levels, aria-labelledby, computed accessible names). iOS checks work with WDA source trees (knows about trait bitmasks, accessibilityElements containers). Android checks work with UIAutomator XML (knows about contentDescription vs text, labelFor relationships).
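
As an illustration of a check written directly against native data, here is a rough sketch of `interactiveElementsHaveLabels` for Android. It assumes the input is the XML string from a UIAutomator dump and reads the `clickable`, `text`, `content-desc`, `class`, `resource-id`, and `bounds` attributes. The `AndroidViolation` shape and the regex-based attribute reader are assumptions made to keep the example dependency-free. It inspects only each node's own attributes, so a real check would likely also accept a label supplied by a descendant.

```typescript
// Hypothetical violation shape carrying Android-native fields.
interface AndroidViolation {
  className: string
  resourceId: string
  bounds: string
}

// Read one attribute from a single <node ...> tag. Assumes attribute values
// are XML-escaped, so they contain no raw double quotes.
function attr(tag: string, name: string): string {
  const m = new RegExp('\\s' + name + '="([^"]*)"').exec(tag)
  return m ? m[1] : ""
}

// Flags clickable nodes that have neither text nor content-desc.
function interactiveElementsHaveLabels(xml: string): AndroidViolation[] {
  const tags: string[] = xml.match(/<node\b[^>]*>/g) || []
  const violations: AndroidViolation[] = []
  for (const tag of tags) {
    if (attr(tag, "clickable") !== "true") continue
    const labelled = attr(tag, "text") !== "" || attr(tag, "content-desc") !== ""
    if (!labelled) {
      violations.push({
        className: attr(tag, "class"),
        resourceId: attr(tag, "resource-id"),
        bounds: attr(tag, "bounds"),
      })
    }
  }
  return violations
}
```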

Violation types are platform-native too

Each platform defines its own violation shape with full platform fidelity. The pipeline only needs didFixWork<V>() — a boolean.

function didFixWork<V>(before: V[], after: V[]): boolean {
  return before.length > 0              // bug existed
    && after.length < before.length     // bug fixed
    && !hasNewViolations(before, after) // no regressions
}
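
`hasNewViolations` is not defined above. One possible shape, assuming each platform supplies a function that maps a violation to a stable key, is sketched below. `keyOf`, `IosViolation`, and `didFixWorkKeyed` are hypothetical names. The keyed variant also checks the target violation directly, because a count comparison alone can pass when an unrelated violation disappears while the target remains.

```typescript
// Hypothetical iOS violation shape, for illustration only.
interface IosViolation {
  check: string
  elementPath: string
  traits: number // trait bitmask, kept for platform fidelity
}

// True when `after` contains a violation whose key was absent from `before`.
function hasNewViolations<V>(before: V[], after: V[], keyOf: (v: V) => string): boolean {
  const seen = new Set(before.map(keyOf))
  return after.some(v => !seen.has(keyOf(v)))
}

// Keyed variant of didFixWork: the target must exist before the fix,
// be gone after it, and nothing new may appear.
function didFixWorkKeyed<V>(
  before: V[],
  after: V[],
  target: V,
  keyOf: (v: V) => string,
): boolean {
  const targetKey = keyOf(target)
  const has = (list: V[]) => list.some(v => keyOf(v) === targetKey)
  return has(before) && !has(after) && !hasNewViolations(before, after, keyOf)
}
```

The pipeline still only sees a boolean; the key function is the one extra thing each platform has to provide.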

Ticket Ingestion

Self-derived (works for 100% of tickets)

1. Parse WCAG criterion from notes (regex)
2. Parse platform from custom field
3. Map WCAG → fix template (lookup table)
4. Extract search terms from title
5. grep codebase → LSP hover/incomingCalls → locate file + screen
6. Map template → property assertion (lookup table)
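
Steps 1, 3, and 6 could look roughly like this. The criterion regex assumes ticket notes contain the criterion number in `N.N.N` form. The WCAG → template table follows the audit breakdown. The template → property table is an assumption based on the WCAG comments in the platform interface, and choosing `add_label` first for 4.1.2 is also an assumption.

```typescript
type FixTemplate =
  | "add_role" | "fix_focus_order" | "add_label" | "add_heading"
  | "fix_truncation" | "add_alt_text" | "fix_contrast" | "custom"

// From the audit breakdown table.
const TEMPLATE_BY_CRITERION: { [criterion: string]: FixTemplate } = {
  "1.3.1": "add_role",
  "2.4.3": "fix_focus_order",
  "4.1.2": "add_label", // table lists add_label / add_role; first choice is an assumption
  "2.4.6": "add_heading",
  "1.4.4": "fix_truncation",
  "1.1.1": "add_alt_text",
  "1.4.3": "fix_contrast",
}

// Assumed pairing of templates with checks from the platform interface.
const PROPERTY_BY_TEMPLATE: Partial<Record<FixTemplate, string>> = {
  add_label: "interactiveElementsHaveLabels",
  add_role: "interactiveElementsHaveRoles",
  add_heading: "headingsAreMarked",
  add_alt_text: "imagesAreAccessible",
  fix_focus_order: "focusOrderMatchesVisualOrder",
  fix_contrast: "textMeetsContrastMinimum",
  fix_truncation: "noTruncationAtZoom",
}

// Step 1: pull the first WCAG criterion number out of free-text notes.
function parseCriterion(notes: string): string | null {
  const m = /\b(\d\.\d{1,2}\.\d{1,2})\b/.exec(notes)
  return m ? m[1] : null
}

// Step 3: unknown or missing criteria fall through to the custom path.
function templateFor(notes: string): FixTemplate {
  const criterion = parseCriterion(notes)
  if (criterion !== null &&
      Object.prototype.hasOwnProperty.call(TEMPLATE_BY_CRITERION, criterion)) {
    return TEMPLATE_BY_CRITERION[criterion]
  }
  return "custom"
}
```

Anything the tables do not cover resolves to `custom`, which matches the split between templated and LLM-generated fixes.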

Cap-enhanced (works for 8.4% of tickets, growing)

Cap provides exact file paths, root cause, and fix direction. When present, skip step 5 entirely. Cap's structured output could be enhanced to include navigation routes and assertion specs — see "Cap Integration" section.

Navigation

Mobile

Two mechanisms, layered, plus a documented skip list for screens neither can reach:

Deep links (~50 LinkingTypes in ConstantsIOS.tsx): parsed in parseURL.tsx, dispatched in handleSupportedURL.tsx. Covers main screens, settings (discord://feature/<name>), channels.
Tap sequences: for modals, action sheets, and screens without deep links. Replay steps using a11y labels from the tree. Fragile but necessary — the button-migration-routes.json pattern already proved this works at scale (175 routes mapped).
Unreachable screens: require specific user state (e.g. active raid, monetization-enabled guild). Document and skip. PlusQA reached them manually — we can't always replicate.

Key files: parseURL.tsx, handleSupportedURL.tsx, ConstantsIOS.tsx, SettingsConstants.tsx
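
One way to represent the layering is a route record: an optional deep link to land near the screen, followed by tap steps keyed on a11y labels. The `Route`, `ScreenDriver`, and `navigateTo` names are hypothetical, and the actual format of button-migration-routes.json is not assumed here.

```typescript
// Hypothetical route record.
interface Route {
  deepLink?: string          // opened first, when the screen or a parent has one
  taps: string[]             // a11y labels to tap, in order, after the deep link lands
  unreachableReason?: string // set for screens that need special user state
}

// Hypothetical driver; openUrl mirrors the platform interface's navigate.openUrl.
interface ScreenDriver {
  openUrl(url: string): Promise<void>
  tapByLabel(label: string): Promise<void>
}

type NavResult = { reached: true } | { reached: false; reason: string }

async function navigateTo(route: Route, driver: ScreenDriver): Promise<NavResult> {
  // Documented-and-skipped screens never touch the device.
  if (route.unreachableReason !== undefined) {
    return { reached: false, reason: route.unreachableReason }
  }
  if (route.deepLink !== undefined) await driver.openUrl(route.deepLink)
  for (const label of route.taps) await driver.tapByLabel(label)
  return { reached: true }
}
```

A route with only `taps` covers modals and action sheets, and a route with only `deepLink` covers main screens.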

Web

Direct URL navigation via Playwright. Routes defined in RouteConstants.tsx (100+ paths), rendered in ViewsWithMainInterface.tsx.

Build & Hot Reload

A11y fixes are JS-only changes (adding props). No native rebuild needed.

Platform	Dev server	Incremental rebuild	Mechanism
iOS (Metro)	`clyde mobile watch` on :8081	1-3 seconds	Watchman → Babel transform → Fast Refresh via websocket
Android (Metro)	`clyde android watch` on :8081	1-3 seconds	Same Metro server
Web (rspack)	`clyde app watch` on :3333	0.5-2 seconds	rspack HMR → React Fast Refresh via websocket

Dev servers run as persistent background processes. The pipeline connects to them — doesn't start/stop them.

Module Graph

Both Metro (mobile) and rspack (web) can dump dependency graphs for LSP incomingCalls-style tracing:

Metro: METRO_DUMP_GRAPH=/path/to/graph.json clyde mobile watch (already exists)
rspack: ~30-line plugin using stats.toJson({modules: true, reasons: true}) in compiler.hooks.afterDone. The web-bundler already has --analyze and --statoscope flags that dump stats; the missing piece is an adapter that emits a Metro-compatible format.
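
A rough sketch of the rspack side, using minimal structural types so it does not import the bundler. `GraphDumpPlugin`, `toImporterGraph`, and the output shape (module name → importing modules) are hypothetical, and the Metro graph format is not reproduced here. Only the `afterDone` hook and the `modules` / `reasons` stats options come from the description above.

```typescript
// Minimal structural types for the parts of the bundler API used below.
interface StatsReason { moduleName?: string | null }
interface StatsModule { name?: string; reasons?: StatsReason[] }
interface StatsJson { modules?: StatsModule[] }
interface StatsLike {
  toJson(options: { modules: boolean; reasons: boolean }): StatsJson
}
interface CompilerLike {
  hooks: { afterDone: { tap(name: string, fn: (stats: StatsLike) => void): void } }
}

// Hypothetical output: module name -> names of the modules that import it.
type ImporterGraph = { [moduleName: string]: string[] }

function toImporterGraph(json: StatsJson): ImporterGraph {
  const graph: ImporterGraph = {}
  for (const mod of json.modules || []) {
    if (!mod.name) continue
    const importers: string[] = []
    for (const reason of mod.reasons || []) {
      const from = reason.moduleName
      // Skip entry reasons (no module), self-references, and duplicates.
      if (from && from !== mod.name && importers.indexOf(from) === -1) {
        importers.push(from)
      }
    }
    graph[mod.name] = importers
  }
  return graph
}

class GraphDumpPlugin {
  private onGraph: (graph: ImporterGraph) => void

  // The callback decides where the graph goes (file, socket, memory).
  constructor(onGraph: (graph: ImporterGraph) => void) {
    this.onGraph = onGraph
  }

  apply(compiler: CompilerLike): void {
    compiler.hooks.afterDone.tap("GraphDumpPlugin", stats => {
      this.onGraph(toImporterGraph(stats.toJson({ modules: true, reasons: true })))
    })
  }
}
```

Walking the importer lists upward from a changed file gives the screens it can affect, which is the blast-radius question LOCATE needs answered.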

Own Primitives (No External Deps)

We build our own iOS/Android/web primitives. mobile-mcp is a reference implementation, not a dependency.

Platform	Tree source	Interaction	What we call
Web	Playwright `page.accessibility.snapshot()` (deprecated upstream; `locator.ariaSnapshot()` is the current ARIA snapshot API)	Playwright API	Direct
iOS	WebDriverAgent HTTP API (`/source`, `/screenshot`, `/tap`)	WDA (local HTTP server on sim)	Direct
Android	`adb shell uiautomator dump` + `adb exec-out screencap`	`adb shell input tap`	Direct

Owning the primitives gives us full control over what data gets extracted (no filtering), how trees are queried, and how the tools evolve.

Cap Integration

Cap already provides high-quality triage for 8.4% of tickets. The proposal is for Cap to emit a structured spec alongside its existing natural-language comment:

{
  "fix": {
    "template": "add_label",
    "target": { "file": "...", "component": "...", "line": 99 },
    "props": { "accessibilityLabel": { "value": "...", "source": "..." } }
  },
  "navigation": {
    "deep_link": "discord://...",
    "expected_screen": { "indicator": { "label": "...", "role": "header" } }
  },
  "assertion": {
    "property": "interactiveElementsHaveLabels",
    "target": { "role": "button", "within": { "label": "..." } }
  },
  "tier": 2
}
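
On the consuming side, the spec could be typed and checked before use. The sketch below mirrors the JSON fields above. Which fields are required, the optional `within`, the 1 to 3 range for `tier`, and the `parseCapSpec` name are assumptions. It returns `null` for anything malformed so the pipeline can fall back to the self-derived path.

```typescript
interface CapSpec {
  fix: {
    template: string
    target: { file: string; component: string; line: number }
    props: { [prop: string]: { value: string; source: string } }
  }
  navigation: {
    deep_link: string
    expected_screen: { indicator: { label: string; role: string } }
  }
  assertion: {
    property: string
    target: { role: string; within?: { label: string } }
  }
  tier: 1 | 2 | 3
}

function isRecord(x: unknown): x is { [key: string]: unknown } {
  return typeof x === "object" && x !== null && !Array.isArray(x)
}

// Validates only the fields the pipeline would branch on; the rest is trusted.
function parseCapSpec(raw: string): CapSpec | null {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch (_err) {
    return null
  }
  if (!isRecord(data)) return null
  const fix = data["fix"]
  const assertion = data["assertion"]
  const tier = data["tier"]
  if (!isRecord(fix) || !isRecord(assertion)) return null
  const target = fix["target"]
  if (!isRecord(target)) return null
  if (typeof fix["template"] !== "string") return null
  if (typeof target["file"] !== "string") return null
  if (typeof assertion["property"] !== "string") return null
  if (tier !== 1 && tier !== 2 && tier !== 3) return null
  return data as unknown as CapSpec
}
```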

This eliminates the only non-deterministic step (parsing NL → fix spec) for Cap-triaged tickets.

Parallelism

Platform	Method	Concurrency
Web	Playwright browser contexts	Unlimited (trivially parallel)
iOS	Multiple simulators via `xcrun simctl` (21 devices available)	3-5 sims sharing 1 Metro server
Android	Multiple emulators with port isolation (built into clyde)	3-5 emulators sharing 1 Metro server
T1 (static)	No device needed — just LSP	Unlimited (spawn Claude Code agents)
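
For the device-bound rows, a bounded worker pool is one way to hold concurrency at the 3-5 range. This is a generic sketch: `runWithLimit` and the `slot` index (which could identify a simulator or emulator) are hypothetical, and nothing here is specific to `xcrun simctl` or clyde.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight.
// Results keep the order of `items`. `slot` identifies the lane doing the work.
async function runWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, slot: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  // Each lane pulls the next unclaimed index until the queue is empty.
  async function lane(slot: number): Promise<void> {
    while (next < items.length) {
      const index = next++
      results[index] = await worker(items[index], slot)
    }
  }

  const lanes: Promise<void>[] = []
  const laneCount = Math.min(limit, items.length)
  for (let slot = 0; slot < laneCount; slot++) lanes.push(lane(slot))
  await Promise.all(lanes)
  return results
}
```

Because a lane only claims the next ticket after finishing its current one, a slow T2 run on one simulator does not block the others. A rejected worker rejects the whole call, so per-ticket error handling would belong inside `worker`.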

Primitives Inventory

#	Primitive	What it does	Effort
1	`web.propertyChecker`	Playwright a11y snapshot → property checks	Small
2	`ios.treeReader`	WDA `/source` → raw a11y tree with hierarchy/traits	Small-Medium
3	`ios.propertyChecker`	iOS tree → property checks	Small
4	`android.treeReader`	`adb shell uiautomator dump` → parsed hierarchy	Small
5	`android.propertyChecker`	Android tree → property checks	Small
6	`asana.ingest`	Parse ticket → WCAG, platform, template, search terms	Small
7	`code.locate`	grep + LSP + module graph → file, line, component, screen	Medium
8	`code.fix`	Template AST transform or LLM code gen	Medium
9	`code.verify`	LSP diagnostics post-fix	Small
10	`nav.deepLink`	`parseURL.tsx` + `handleSupportedURL.tsx` → static map	Medium
11	`nav.tapSequence`	Replay a11y-label-based tap steps	Small (pattern exists)
12	`rspack.graphDump`	`stats.toJson()` plugin for web module graph	Small (30 lines)
13	`pipeline.orchestrate`	Compose all primitives: ingest → locate → fix → verify → PR	Medium

Execution Plan

Phase 1: First tickets closed (days 1-3)

Build web.propertyChecker — Playwright adapter, implement interactiveElementsHaveLabels and interactiveElementsHaveRoles
Build asana.ingest — parse ticket fields
Build code.locate (web subset) — grep + LSP for web components
Close first web a11y tickets with T1 validation (LSP diagnostics + AST check)

Phase 2: Mobile (days 4-7)

Build ios.treeReader — WDA raw source tree extraction
Build ios.propertyChecker — implement checks against iOS traits
Build nav.deepLink — parse handleSupportedURL into static map
Close first iOS tickets with T2 validation

Phase 3: Pipeline & scale (week 2)

Build pipeline.orchestrate — full Asana → PR loop
Build android.treeReader + android.propertyChecker
Build rspack.graphDump
Parallel execution across multiple simulators

Phase 4: Regression prevention (week 3)

CI integration — run property checks on changed screens per PR
Nightly full-screen sweep — catch regressions before PlusQA does

Metrics

Tickets closed per week
% of audit resolved
Time per fix (ingest → PR)
Regression catch rate (violations caught in CI before manual report)
Feature team hours saved (tickets that would have been assigned to them)

Open Questions

None — all resolved during design.

References

Asana project: https://app.asana.com/1/236888843494340/project/1212709914769994/list/1212709939264532
Button migration route registry: misc/users/usirin/button-migration-routes.json
Button migration RFC: https://www.notion.so/312f46fd48aa8147b41bf701f505108d
Discord IC Level Matrix: https://www.notion.so/18b82dd0f8264db0879d5f1ae6aaf857
mobile-mcp (reference): https://github.com/mobile-next/mobile-mcp
