
@usirin
Created April 14, 2026 23:01

Programmatic VoiceOver Control on iOS Simulator

Executive Summary

VoiceOver can be automated on the iOS Simulator, but no single tool or API provides an end-to-end solution. This research investigated five dimensions of the problem: enabling/disabling VoiceOver, sending navigation commands, reading focus state, existing frameworks, and React Native's accessibility APIs. The core finding is that a viable automation stack exists by combining xcrun simctl (VoiceOver lifecycle), AppleScript keystroke injection (navigation), and the macOS AXUIElement API (tree inspection), but reading VoiceOver cursor position remains the hardest unsolved problem. No existing framework (Apple, Google, Deque, or open-source) automates VoiceOver itself. Every tool in the ecosystem validates accessibility metadata, not screen reader behavior. This gap is real and unaddressed.

Recommended architecture: A three-layer system combining (1) defaults write + launchctl for VoiceOver lifecycle control, (2) osascript keystroke injection for navigation, and (3) an in-app debug server calling UIAccessibility.focusedElement(using:) for focus state reads. For React Native apps, this is the only path to automated VoiceOver integration testing.


1. VoiceOver Lifecycle Control

VoiceOver can be enabled and disabled programmatically on the iOS Simulator using a two-step approach.

Working Method

# Enable
xcrun simctl spawn booted defaults write com.apple.Accessibility VoiceOverTouchEnabled -bool true
xcrun simctl spawn booted launchctl start com.apple.VoiceOverTouch

# Disable
xcrun simctl spawn booted defaults write com.apple.Accessibility VoiceOverTouchEnabled -bool false
xcrun simctl spawn booted launchctl stop com.apple.VoiceOverTouch

# Verify (look for PID field when running)
xcrun simctl spawn booted launchctl list com.apple.VoiceOverTouch

Both steps are required. Setting the preference alone does not start VoiceOver because VoiceOverTouch is a Mach on-demand service. On real devices, setting the preference triggers a Mach connection from AccessibilityUIServer that starts VoiceOverTouch automatically. On the simulator, this activation pathway is broken. launchctl start forces the daemon to launch regardless.
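The two-step sequence is easy to get wrong by hand, so it is worth wrapping in a helper. A minimal sketch, assuming a booted simulator; voiceover_set and voiceover_is_running are hypothetical names, not part of simctl:

```shell
#!/bin/sh
# Hypothetical wrapper around the two-step enable/disable sequence.
# Usage: voiceover_set true|false
voiceover_set() {
  enabled="$1"  # "true" or "false"
  xcrun simctl spawn booted defaults write com.apple.Accessibility \
    VoiceOverTouchEnabled -bool "$enabled"
  if [ "$enabled" = "true" ]; then
    xcrun simctl spawn booted launchctl start com.apple.VoiceOverTouch
  else
    xcrun simctl spawn booted launchctl stop com.apple.VoiceOverTouch
  fi
}

# Verification: a running launchd job reports a "PID" field in launchctl list.
voiceover_is_running() {
  xcrun simctl spawn booted launchctl list com.apple.VoiceOverTouch \
    | grep -q '"PID"'
}
```

The verification step matters because launchctl start returns success even when the daemon fails to come up; polling launchctl list is the only observable state.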

Methods Tested and Rejected

Eight alternative approaches were tested: preference-only (no launchctl), Darwin notification posts (notifyutil), AccessibilityUIServer restart, boot with --enabledJob flag, preference pre-set before boot, accessibility shortcut (triple-click side button via idb), Simulator.app menu/AppleScript, and URL scheme to Settings.app. None successfully toggled VoiceOver.

Simulator vs Device Differences

The VoiceOverTouch binary runs on the simulator and intercepts gestures, and the VoiceOver welcome dialog appears on first activation. However, the activation pathway differs (a manual launchctl start is required), speech output may need separate audio configuration, and the accessibility shortcut (triple-click side button) does not work. Most online documentation (including Apple Developer Forums threads) incorrectly states that VoiceOver is unavailable on the simulator; that guidance is outdated as of iOS 18.


2. Sending Navigation Commands

No single clean API exists for sending VoiceOver navigation commands. Five approaches were identified, each with different tradeoffs.

Approach A: macOS VoiceOver + AppleScript Keystroke Injection (Most Promising)

When macOS VoiceOver is active (Cmd+F5), it treats the Simulator window as a native macOS view. Navigation commands are sent by injecting VoiceOver keyboard shortcuts via System Events:

# Next element (VO + Right Arrow)
osascript -e 'tell application "System Events" to key code 124 using {control down, option down}'

# Activate (VO + Space)
osascript -e 'tell application "System Events" to key code 49 using {control down, option down}'

Key commands: next element (VO+Right), previous element (VO+Left), activate (VO+Space), interact with group (VO+Shift+Down), stop interacting (VO+Shift+Up), read all (VO+A).

Critical limitation: Requires foreground GUI focus on the Simulator window. Non-headless.
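The key commands above can be collected into small helpers; vo_key, vo_next, and focus_simulator are hypothetical names built on the same osascript injection:

```shell
#!/bin/sh
# Hypothetical helpers wrapping the osascript injection above.
# macOS key codes: 124 = Right Arrow, 123 = Left Arrow, 49 = Space.
vo_key() {
  osascript -e "tell application \"System Events\" to key code $1 using {control down, option down}"
}
vo_next()     { vo_key 124; }   # VO+Right: next element
vo_prev()     { vo_key 123; }   # VO+Left: previous element
vo_activate() { vo_key 49; }    # VO+Space: activate

# The Simulator window must be frontmost for the keystrokes to land.
focus_simulator() {
  osascript -e 'tell application "Simulator" to activate'
}
```

Calling focus_simulator before each navigation sequence mitigates, but does not remove, the foreground-focus requirement.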

Approach B: iOS VoiceOver + External Keyboard Commands

When iOS VoiceOver is enabled inside the simulator (not macOS VoiceOver), Connect Hardware Keyboard (Cmd+Shift+K) maps the Mac keyboard as an external keyboard. The same VO modifier (Control+Option) works. Quick Nav mode (toggle with Left+Right Arrow simultaneously) enables single-arrow navigation without the VO modifier.

Scriptable via the same osascript keystroke injection, but targeting iOS VoiceOver instead of macOS VoiceOver.

Approach C: In-App UIAccessibility.post(notification:argument:)

From within the app, UIAccessibility.post(notification: .screenChanged, argument: targetView) moves VoiceOver focus to a specific element. Useful for verifying focus lands correctly after actions, but does not simulate user navigation traversal. Timing is important: notifications are processed asynchronously and may be ignored if posted before UIKit finishes rendering.

Approach D: XCUITest

Can query the accessibility tree but cannot simulate VoiceOver gestures or read focus state. performAccessibilityAudit (iOS 17+) checks static properties only. Useful for metadata validation, not navigation testing.

Approach E: Accessibility Inspector

GUI-only tool with "Auto Navigate" that walks VoiceOver reading order. No CLI, no AppleScript dictionary, no automation API. The com.apple.private.accessibility.inspection entitlement blocks third-party use of its underlying framework.


3. Reading VoiceOver Focus State

This is the hardest problem. The simulator exposes the full iOS accessibility tree to macOS via AXUIElement, but VoiceOver cursor position is not available through any external API.

What the AXUIElement Bridge Exposes

The iOS content appears under a group with AXSubrole = "iOSContentGroup" in the Simulator's AXUIElement hierarchy. Each iOS element exposes: AXDescription (label), AXValue, AXRole, AXIdentifier, AXFocused, AXSelected, AXFrame, AXEnabled, AXCustomActions, AXChildrenInNavigationOrder, and more.

What AXFocused Does NOT Represent

AXFocused reflects keyboard/system focus, not VoiceOver cursor position. When VoiceOver focuses a button, that button's AXFocused remains false unless it also has keyboard focus. Probing for VoiceOver-specific attributes (AXAccessibilityFocused, AXVoiceOverFocused, AXAssistiveTechnologyFocused) returns error -25212 (kAXErrorNoValue): the attributes are not served.

AXObserver Notifications

AXObserver can watch for kAXFocusedUIElementChangedNotification on the Simulator process. These fire for keyboard focus changes but likely not for VoiceOver cursor movement.

What simctl and MCP Tools Provide

xcrun simctl has no accessibility inspection subcommand. xcrun simctl ui only supports appearance, contrast, and content size. The ios-simulator MCP ui_describe_all returns the full accessibility tree but omits focus/selected state.

Ranked Approaches for Focus State Reading

| Approach | Accuracy | Requires app modification? |
| --- | --- | --- |
| In-app debug server calling UIAccessibility.focusedElement(using: .notificationVoiceOver) | Exact | Yes |
| In-app elementFocusedNotification listener writing to file/IPC | Exact | Yes |
| Position-based inference (count gestures, index into AXChildrenInNavigationOrder) | High | No |
| Visual detection of VoiceOver focus outline in screenshots | Medium | No |
| AXUIElement AXFocused property | Keyboard focus only | No |

The in-app debug server is the most reliable approach for owned apps. It calls UIAccessibility.focusedElement(using: .notificationVoiceOver) (the only API that returns the actual VoiceOver-focused element) and exposes the result via HTTP/WebSocket. This requires modifying the app under test.

For fully external observation, position-based inference using AXChildrenInNavigationOrder is the best option: read the navigation order, send N "next element" gestures, and the focused element is the (N+1)th element.
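A sketch of that inference loop, assuming the navigation order has already been dumped one label per line (e.g. from AXChildrenInNavigationOrder) into a file named by NAV_ORDER_FILE, and that the VoiceOver cursor starts on the first element; both assumptions are illustrative:

```shell
#!/bin/sh
# Position-based inference: track how many "next element" gestures have
# been sent and index into the dumped navigation order.
GESTURES_SENT=0

send_next() {
  osascript -e 'tell application "System Events" to key code 124 using {control down, option down}'
  GESTURES_SENT=$((GESTURES_SENT + 1))
}

# After N "next" gestures from the start, the cursor should sit on
# element N+1 of the navigation order.
inferred_focus() {
  sed -n "$((GESTURES_SENT + 1))p" "$NAV_ORDER_FILE"
}
```

The inference drifts if VoiceOver skips elements, wraps at the end of the screen, or the layout changes mid-run, which is why it ranks below the in-app approaches.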


4. Existing Framework Landscape

Every framework validates accessibility metadata. None automate VoiceOver.

Apple Native

  • XCUITest: Queries the accessibility tree. performAccessibilityAudit(for:) (iOS 17+) automates Accessibility Inspector checks (contrast, element description, hit region, dynamic type, text clipping, traits). Audits are limited to on-screen elements and static properties.
  • Accessibility Inspector: GUI-only tool with inspection, audit, and Auto Navigate modes. No CLI or scripting interface. Cannot be automated.

Google

  • GTXiLib: Hooks into XCTest teardown for automated checks (label presence, trait conflicts, tap target size, contrast). Last release July 2021. Effectively unmaintained.
  • EarlGrey 2: Combines EarlGrey's synchronization with XCUITest. Uses accessibility properties for element selection. Pairs with GTXiLib but does not add accessibility validation.

Facebook

  • idb: CLI automation for simulators. GitHub issue #792 (August 2022) requests VoiceOver toggle. Unresolved. Can write accessibility preferences but cannot start/stop launchd services.

Commercial

  • Deque axe DevTools: Automated WCAG scanning via XCUITest integration. Supports UIKit, SwiftUI, React Native, Flutter. Results upload to dashboard. Does not simulate VoiceOver.

Open Source

  • CashApp AccessibilitySnapshot: Snapshot testing of the accessibility hierarchy. Captures labels, traits, and activation points as visual images for diff comparison. Actively maintained (641 stars, last release April 2026). Best tool for accessibility regression testing.
  • A11yUITests: Archived December 2025. Maintainer recommends AccessibilitySnapshot and Reveal instead.
  • Maestro: E2E testing framework that interacts via the accessibility layer. Tests implicitly surface missing labels but do not test VoiceOver.

React Native Ecosystem

  • react-native-testing-library: Accessibility queries (*ByRole, *ByLabelText, *ByHintText) and matchers (toHaveAccessibilityState, toHaveAccessibilityValue). Runs in JS test environment, not on device. Cannot test iOS behavior.
  • react-native-accessibility-engine: Jest .toBeAccessible() matcher with 11 rules. JavaScript-only validation.
  • Detox: Gray-box E2E using accessibility identifiers. toBeFocused() tests keyboard focus, not VoiceOver focus. No VoiceOver simulation.

5. React Native Accessibility Focus APIs

Programmatic Focus Setting

Two APIs exist:

  • AccessibilityInfo.setAccessibilityFocus(reactTag) (deprecated): requires findNodeHandle, being removed in New Architecture
  • AccessibilityInfo.sendAccessibilityEvent(ref, 'focus') (current): works with Fabric, no findNodeHandle needed

Neither can be triggered from JS tests to simulate real VoiceOver focus. They send native notifications that are no-ops without a running screen reader.

FlatList Virtualization vs VoiceOver

FlatList's virtualization fundamentally conflicts with VoiceOver's scroll-into-view behavior. Items removed from the native hierarchy cannot be discovered by VoiceOver. React Native issue #23140 (filed January 2019) remains unresolved. Additional bugs: wrong reading order in horizontal/multi-column FlatLists (#48028, #28299), focus lost on horizontal scroll (#41566).

Workarounds: use ScrollView for small accessibility-critical lists, implement custom accessibilityActions with increment/decrement for manual scroll announcements.

Discord-Specific Infrastructure

Discord has custom native modules beyond stock React Native:

  • AccessibilityFocusView: native component providing onAccessibilityFocus and onAccessibilityBlur callbacks
  • NativeDeviceAccessibilityModule: Android crash workaround for setAccessibilityFocus
  • useAccessibilityNativeStackFocusTracking: saves/restores focus across navigation transitions
  • experimental_accessibilityOrder (RN 0.82+): declarative focus traversal order prop, snapshot-testable but not verifiable without a real screen reader

6. Recommended Architecture

A three-layer automation stack:

Layer 1: VoiceOver Lifecycle (simctl + launchctl)

defaults write → launchctl start/stop

Fully scriptable, no GUI required. Wrap in a CLI tool that handles the two-step enable/disable and state verification via launchctl list.

Layer 2: Navigation Commands (osascript keystroke injection)

osascript → System Events → Simulator window

Requires foreground focus on Simulator. Use iOS VoiceOver + external keyboard commands (not macOS VoiceOver) for closer parity with real device behavior. Quick Nav mode simplifies navigation to arrow keys.

Layer 3: Focus State Reading (in-app debug server)

App-embedded HTTP endpoint → UIAccessibility.focusedElement(using: .notificationVoiceOver)

The only way to read actual VoiceOver cursor position. Embed a lightweight server in debug builds. For CI, use the elementFocusedNotification listener that writes focus changes to a log file readable by the test harness.
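From the harness side, reading focus state then reduces to an HTTP call. A sketch, assuming a hypothetical debug server: the port and the /focused route (returning the focused element's accessibilityLabel) are invented for illustration, not part of any real API:

```shell
#!/bin/sh
# Harness-side read of a hypothetical in-app debug server.
DEBUG_SERVER="${DEBUG_SERVER:-http://localhost:8765}"

focused_label() {
  curl -fsS "$DEBUG_SERVER/focused"
}

# Poll until focus lands on an expected label, or time out.
wait_for_focus() {
  expected="$1"; tries="${2:-20}"
  while [ "$tries" -gt 0 ]; do
    [ "$(focused_label)" = "$expected" ] && return 0
    tries=$((tries - 1))
    sleep 0.25
  done
  return 1
}
```

Polling rather than asserting immediately absorbs VoiceOver's asynchronous cursor movement, which matters for the timing calibration called out in Next Steps.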

Fallback for unmodified apps: Position-based inference using AXChildrenInNavigationOrder from the AXUIElement tree.

Trade-offs

| Property | Status |
| --- | --- |
| VoiceOver enable/disable | Solved (simctl + launchctl) |
| Navigation commands | Solved, but requires GUI focus (osascript) |
| Focus state reading | Solved for owned apps (in-app server); inference-only for third-party apps |
| Headless execution | Not possible (keystroke injection needs a foreground window) |
| CI compatibility | Possible on macOS runners with GUI sessions |

Next Steps

  1. Build a CLI wrapper around the enable/disable commands with state verification
  2. Prototype the in-app debug server as a React Native native module that exposes VoiceOver focus via HTTP
  3. Prototype keystroke-based navigation with timing calibration (how long to wait between gestures for VoiceOver to settle)
  4. Evaluate position-based inference accuracy against the in-app server ground truth
  5. Test CI feasibility on macOS runners (Buildkite) with GUI session access
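Step 3 (timing calibration) can be prototyped with a measure-and-poll loop. A sketch, where READ_FOCUS is a placeholder for whichever focus reader is in use and must print the currently focused label; the 50 ms poll interval and 2 s ceiling are arbitrary starting points:

```shell
#!/bin/sh
# Send one "next element" chord and measure how long until the reported
# focus changes, polling every 50 ms for up to ~2 s.
READ_FOCUS="${READ_FOCUS:-true}"

measure_settle() {
  before="$($READ_FOCUS)"
  osascript -e 'tell application "System Events" to key code 124 using {control down, option down}'
  i=0
  while [ "$i" -lt 40 ]; do
    [ "$($READ_FOCUS)" != "$before" ] && { echo "$((i * 50))ms"; return 0; }
    sleep 0.05
    i=$((i + 1))
  done
  echo "timeout"; return 1
}
```

Running this across a variety of screens gives an empirical upper bound on the inter-gesture delay the navigation layer should use.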

Consolidated Sources

Direct Experimentation

  • iOS 26.2 Simulator (iPhone 17, macOS Darwin 24.6.0, Apple Silicon)
  • Swift AXUIElement test programs (ApplicationServices API)
  • AppleScript System Events queries on Simulator process
  • AXObserver notification probing on Simulator PID
  • xcrun simctl spawn defaults/launchctl command testing

