@shreeve
Last active April 20, 2026 09:15
CYW43439 (Raspberry Pi Pico W) WiFi driver: low-level bring-up notes, a pure-Zig rewrite plan, and PMKSA cache / ICV_ERROR reliability research

CYW43439 PMKSA Research — verified findings

Purpose. This is the pre-Phase-3 research artifact mandated by docs/CYW43-REWRITE.md §5.9.2. It documents the exact iovar, wire format, chip compatibility, and reliability context for PMKSA cache management on the CYW43439 chip as used in the Raspberry Pi Pico W.

Status. Produced 2026-04-20 from first-party sources (Linux kernel tree, pico-sdk issue tracker, cyw43-driver upstream). Sign-off for Phase 3 PMKSA coding: ready.


TL;DR

  1. The PMKSA iovar on CYW43439 is pmkid_info, plain (not bsscfg:-prefixed). Verified from Linux brcmfmac source.

  2. Our blob is firmware 7.95.61 from 2023-01-11 (identical to Embassy's shipped firmware; verified from strings output on both). Its WLC version is below 12.0, so we use the legacy API; V2 requires WLC ≥ 12.0 and V3 requires WLC ≥ 13.0 (§2.1). The legacy API is adequate for everything we need.

    Correction note 2026-04-20: an earlier draft of this document stated our blob was 43439A0_7_95_49_00 (2018-era). That was the pico-sdk reference's bundled version; our actual src/cyw43/firmware/43439A0_combined.bin is 7.95.61. The 43439A0_combined.bin filename is just a label.

  3. Legacy payload: exactly 356 bytes = __le32 npmk + 16 × {u8 bssid[6]; u8 pmkid[16]}. No padding, no bsscfg prefix.

  4. Flush = memset to zero, send the whole 356-byte buffer. That's it.

  5. Important framing correction. The reliability issue most commonly reported on Pico W (pico-sdk #2153) is not a stale-PMKSA problem — it is an ICV_ERROR event flood triggered by missed rekeys under power-save mode. That issue was fixed upstream by adding pend_rejoin on event 49 (cyw43-driver PR #130, merged Jan 2025). Our current Zig driver is missing that handler. The rewrite already commits to handling it via the join state machine (§6.1).

  6. PMKSA clear_on_boot is still worth doing — cheap (one 356-byte iovar write at end of wifiOn), addresses an independent 802.11 failure mode (AP-side cache-key drift after AP reboot), has no documented downside. But it is not the silver-bullet fix for the most-cited Pico W reliability bugs. The plan now reflects both fixes, and orders them: ICV_ERROR handler (high value, free) first, PMKSA clear (medium value, cheap) second.


Sources consulted (all first-party)

| # | Source | Purpose |
| --- | --- | --- |
| S1 | Linux kernel brcmfmac/cfg80211.c | Definitive iovar name + legacy/V3 flow |
| S2 | Linux kernel brcmfmac/fwil_types.h | Struct layouts, BRCMF_MAXPMKID = 16 |
| S3 | Linux kernel brcmfmac/feature.c | CY_CC_43439_CHIP_ID explicit handling; WLC-version feature gate for V3 |
| S4 | Hector Martin's PMKID_V3 patch | V3 API introduction + use of pmkid_info iovar |
| S5 | pico-sdk issue #2153 | Real-world symptom and actual root cause on Pico W |
| S6 | cyw43-driver PR #130 | 4-line fix: ICV_ERROR → pend_rejoin (merged Jan 2025) |
| S7 | pico-sdk issue #1373 | "First connection after reboot fails" symptom |
| S8 | wpa_supplicant / hostapd wpa_common.h | 802.11 standard constants: PMKID_LEN=16, PMK_LEN=32 |
| S9 | Embassy cyw43 crate PR #3323 | Confirmed no PMKSA support in Embassy |
| S10 | soypat/cyw43439 Go driver | Confirmed no PMKSA support in this driver either |

Links: all captured via WebSearch + WebFetch 2026-04-20.


1. The iovar

Name: pmkid_info (10 characters + NUL, so 11 bytes on the wire).

Addressing: plain iovar, NOT bsscfg:-prefixed. Verified from S1:

/* brcmfmac/cfg80211.c, brcmf_update_pmklist() */
return brcmf_fil_iovar_data_set(ifp, "pmkid_info", pmk_list, sizeof(*pmk_list));

brcmf_fil_iovar_data_set is brcmfmac's standard iovar-set path; it does not apply a bsscfg prefix. Contrast with e.g. bsscfg:event_msgs in cyw43-driver, where the bsscfg prefix is part of the name string. For pmkid_info the name is literally "pmkid_info" with no prefix.

Direction: both read and write (brcmf_fil_iovar_data_set for SET, brcmf_fil_iovar_data_get for GET). For clear-on-boot we only need SET.

Maps to WLC command: WLC_SET_VAR (263) for SET, WLC_GET_VAR (262) for GET. Same wire framing as every other iovar in the reference cyw43-driver.
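
The framing above can be sketched in C. This is a minimal sketch, assuming the conventional Broadcom iovar buffer layout (the NUL-terminated name immediately followed by the value bytes) that the text says is shared with every other iovar; build_iovar and PMKID_INFO_LEN are hypothetical names for illustration, not part of cyw43-driver or brcmfmac.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PMKID_INFO_LEN 356u  /* legacy payload size from this document */

/* Hypothetical helper: lay out "<name>\0<payload>" in out[].
 * Returns the total number of bytes written, or 0 if out[] is too small. */
static size_t build_iovar(uint8_t *out, size_t out_cap,
                          const char *name,
                          const uint8_t *payload, size_t payload_len)
{
    assert(out != NULL && name != NULL);
    size_t name_len = strlen(name) + 1;   /* include the terminating NUL */
    if (name_len + payload_len > out_cap)
        return 0;
    memcpy(out, name, name_len);
    if (payload_len > 0)
        memcpy(out + name_len, payload, payload_len);
    return name_len + payload_len;
}
```

For pmkid_info with the 356-byte legacy payload, the resulting WLC_SET_VAR body would be 11 + 356 = 367 bytes under this assumption.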


2. Chip compatibility

CYW43439 is explicitly recognized by brcmfmac as CY_CC_43439_CHIP_ID. From S3 (feature.c):

if (drvr->bus_if->chip != BRCM_CC_43430_CHIP_ID &&
    drvr->bus_if->chip != BRCM_CC_4345_CHIP_ID &&
    drvr->bus_if->chip != BRCM_CC_43454_CHIP_ID &&
    drvr->bus_if->chip != CY_CC_43439_CHIP_ID)
    brcmf_feat_iovar_data_set(ifp, BRCMF_FEAT_GSCAN, ...);

The chip is in the "legacy family" group along with 43430 / 4345 / 43454 — they share firmware lineage. Any feature supported on one is almost always supported across the group. pmkid_info has been present in brcmfmac's PMKSA flow for years (the legacy API predates both V2 and V3), so it is expected to be present on every firmware vintage shipped with these chips.

2.1 API version selection

From S3, brcmf_feat_wlcfeat_map:

static const struct brcmf_feat_wlcfeat brcmf_feat_wlcfeat_map[] = {
    { 12, 0, BIT(BRCMF_FEAT_PMKID_V2) },
    { 13, 0, BIT(BRCMF_FEAT_PMKID_V3) },
};
  • Firmware with WLC version ≥ 12.0 supports PMKID_V2.
  • Firmware with WLC version ≥ 13.0 supports PMKID_V3.
  • Firmware with WLC version < 12.0 supports legacy only.

Our blob vintage is 7.95.61 (Jan 2023, confirmed via version string dump). Firmware 7.x families ship with WLC versions well below 12 — so we are on the legacy PMKID API. The pico-sdk shipping cyw43-driver makes no attempt to negotiate V2/V3 at all, consistent with legacy being the only applicable API for this chip family.

Conclusion: our chip+blob uses the legacy API. V2 is never implemented by brcmfmac anyway (TODO: implement PMKID_V2 throughout cfg80211.c). V3 is the future-compatible path but not relevant to us.
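
The version gate above can be expressed as a small lookup. This is a sketch only: the thresholds are the ones listed in this section, while the enum, the table, and pmkid_api_for_wlc are hypothetical names. It also simplifies by returning the single newest applicable API, whereas the quoted map sets independent feature bits.

```c
#include <stddef.h>

enum pmkid_api { PMKID_API_LEGACY, PMKID_API_V2, PMKID_API_V3 };

struct wlc_gate { unsigned major, minor; enum pmkid_api api; };

/* Thresholds as listed in this section (WLC >= 12.0 -> V2, >= 13.0 -> V3). */
static const struct wlc_gate gates[] = {
    { 12, 0, PMKID_API_V2 },
    { 13, 0, PMKID_API_V3 },
};

/* Hypothetical helper: pick the newest API whose threshold is met. */
static enum pmkid_api pmkid_api_for_wlc(unsigned major, unsigned minor)
{
    enum pmkid_api api = PMKID_API_LEGACY;
    for (size_t i = 0; i < sizeof gates / sizeof gates[0]; i++) {
        if (major > gates[i].major ||
            (major == gates[i].major && minor >= gates[i].minor))
            api = gates[i].api;   /* later rows override earlier ones */
    }
    return api;
}
```

Any WLC version below 12.0, which is what this document expects for the 7.95.61 blob, maps to PMKID_API_LEGACY.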

2.2 Negotiation probe (optional)

brcmf_feat_is_enabled(ifp, BRCMF_FEAT_PMKID_V3) is the host-side flag. The firmware itself does not announce capability; the driver reads the WLC version via wlc_ver iovar and sets the feature flag accordingly. Our Zig port does not need to replicate this — we can hard-code legacy for the committed blob, and treat any firmware-upgrade as an R17 revalidation.


3. Wire-format spec (legacy API)

From S2 (fwil_types.h):

#define BRCMF_MAXPMKID   16
#define WLAN_PMKID_LEN   16      /* 802.11 standard, S8 */
#define ETH_ALEN          6

struct brcmf_pmksa {
    u8 bssid[ETH_ALEN];      /* 6 bytes */
    u8 pmkid[WLAN_PMKID_LEN];/* 16 bytes */
};                           /* total 22 bytes, no padding */

struct brcmf_pmk_list_le {
    __le32 npmk;             /* 4 bytes, little-endian count */
    struct brcmf_pmksa pmk[BRCMF_MAXPMKID]; /* 16 × 22 = 352 bytes */
};                           /* total 356 bytes */

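Alignment: npmk sits at offset 0 and is naturally 4-byte aligned; bssid and pmkid are byte arrays, so each 22-byte entry has alignment 1 and no padding is inserted inside or between entries. The total of 356 is already a multiple of 4, so no tail padding is added either. The C compiler therefore emits exactly 356 bytes for this struct on every platform we care about.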
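
The size claim can be checked at compile time. The sketch below restates the layout with independent, hypothetical type names (pmksa_entry, pmk_list) and uses C11 _Static_assert; it is a check of the arithmetic in this section, not a copy of the kernel header.

```c
#include <stddef.h>
#include <stdint.h>

#define MAXPMKID  16
#define PMKID_LEN 16
#define MAC_LEN    6

struct pmksa_entry {
    uint8_t bssid[MAC_LEN];      /* 6 bytes */
    uint8_t pmkid[PMKID_LEN];    /* 16 bytes */
};

struct pmk_list {
    uint32_t npmk;               /* little-endian on the wire */
    struct pmksa_entry pmk[MAXPMKID];
};

/* Compilation fails if the compiler inserts any padding. */
_Static_assert(sizeof(struct pmksa_entry) == 22, "entry must be 22 bytes");
_Static_assert(offsetof(struct pmk_list, pmk) == 4, "entries start at offset 4");
_Static_assert(sizeof(struct pmk_list) == 356, "list must be 356 bytes");
```

An equivalent comptime check in the Zig port would serve the same purpose; the C form is shown only because the reference layouts are given in C.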

3.1 Operation semantics (legacy API, from S1)

| Operation | How it's built |
| --- | --- |
| Flush (clear-all) | memset(&pmk_list, 0, sizeof(pmk_list)), i.e. npmk = 0 and all 352 bytes of the entry array zero. Send. |
| Add an entry | Find slot where bssid matches (or first free slot up to npmk). Write bssid + pmkid (16 bytes). If new slot, npmk += 1. Send the whole 356-byte list. |
| Remove an entry | Find slot with matching bssid. Shift all subsequent entries down by one. Zero last slot. npmk -= 1. Send the whole 356-byte list. |

All operations rewrite the whole list. This is not an efficiency concern at our scale (356 bytes, once per join/deauth) but is worth noting: there is no per-entry add/delete wire op in the legacy API. Each operation builds a complete list view and sends it atomically.
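
The three operations can be sketched against a host-side copy of the list. This is an illustrative sketch written from the description above, with hypothetical names (pmk_add, pmk_remove, pmk_flush); npmk is kept in host byte order here and would be converted to little-endian when the list is serialized for sending.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAXPMKID 16

struct pmksa_entry { uint8_t bssid[6]; uint8_t pmkid[16]; };
struct pmk_list    { uint32_t npmk; struct pmksa_entry pmk[MAXPMKID]; };

/* Returns the slot index holding bssid, or npmk if it is absent. */
static uint32_t find_slot(const struct pmk_list *l, const uint8_t bssid[6])
{
    uint32_t i;
    for (i = 0; i < l->npmk; i++)
        if (memcmp(l->pmk[i].bssid, bssid, 6) == 0)
            break;
    return i;
}

/* Add or refresh an entry. Returns 0 on success, -1 if the list is full. */
static int pmk_add(struct pmk_list *l, const uint8_t bssid[6],
                   const uint8_t pmkid[16])
{
    assert(l->npmk <= MAXPMKID);
    uint32_t i = find_slot(l, bssid);
    if (i == MAXPMKID)
        return -1;
    memcpy(l->pmk[i].bssid, bssid, 6);
    memcpy(l->pmk[i].pmkid, pmkid, 16);
    if (i == l->npmk)
        l->npmk++;
    return 0;
}

/* Remove an entry, shifting later ones down. Returns -1 if absent. */
static int pmk_remove(struct pmk_list *l, const uint8_t bssid[6])
{
    assert(l->npmk <= MAXPMKID);
    uint32_t i = find_slot(l, bssid);
    if (i == l->npmk)
        return -1;
    for (; i + 1 < l->npmk; i++)
        l->pmk[i] = l->pmk[i + 1];
    memset(&l->pmk[i], 0, sizeof l->pmk[i]);   /* zero the vacated last slot */
    l->npmk--;
    return 0;
}

/* Flush: the whole list becomes zero, including npmk. */
static void pmk_flush(struct pmk_list *l) { memset(l, 0, sizeof *l); }
```

After any of these calls the caller would send the entire list, since the legacy API has no per-entry wire operation.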

3.2 Endianness

  • npmk is __le32 — little-endian u32. Must be serialized as LE from the host.
  • bssid and pmkid are byte arrays — no endianness.
  • There is NO length or version prefix in the legacy API (contrast with V2/V3 which have __le16 version + __le16 length headers).
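
A byte-by-byte serializer makes the endianness rule independent of the host CPU. The following is a minimal sketch under the layout stated above; pmk_list_serialize, put_le32, and the constants are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAXPMKID  16
#define ENTRY_LEN 22u
#define WIRE_LEN  (4u + MAXPMKID * ENTRY_LEN)   /* 356 bytes */

struct pmksa_entry { uint8_t bssid[6]; uint8_t pmkid[16]; };

/* Store v as a little-endian u32, regardless of host byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Write npmk followed by the entries; unused slots stay zero.
 * No version or length header is emitted (legacy API). */
static void pmk_list_serialize(uint8_t out[WIRE_LEN],
                               const struct pmksa_entry *entries,
                               uint32_t npmk)
{
    assert(npmk <= MAXPMKID);
    memset(out, 0, WIRE_LEN);
    put_le32(out, npmk);
    for (uint32_t i = 0; i < npmk; i++) {
        uint8_t *slot = out + 4 + i * ENTRY_LEN;
        memcpy(slot, entries[i].bssid, 6);
        memcpy(slot + 6, entries[i].pmkid, 16);
    }
}
```

Calling it with npmk = 0 yields the all-zero flush payload.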

3.3 The complete wire payload for clear-on-boot

356 bytes, all zero. No version field, no length field. The firmware interprets npmk = 0 as "flush all entries."

offset  bytes
  0     00 00 00 00                                           <- npmk = 0
  4     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    <- pmk[0]
  ...
356     (end)

3.4 Reference implementation (from S1)

The flush path reduces to:

memset(&cfg->pmk_list, 0, sizeof(cfg->pmk_list));
brcmf_fil_iovar_data_set(ifp, "pmkid_info", &cfg->pmk_list, sizeof(cfg->pmk_list));

Two lines. sizeof(cfg->pmk_list) == 356.


4. What this fixes — and what it does NOT fix

4.1 What clear_on_boot addresses

802.11 scenario: the CYW43439 firmware can retain a PMKSA cache entry across a host reboot if the host chip does not power-cycle the CYW43 (our case — the chip stays powered, only the RP2040 restarts with watchdog/UF2). On reconnect, the chip attempts fast-reauth using the cached PMKID. If the AP has since forgotten the PMKSA (AP reboot, session timeout, config change), the AP refuses the fast-reauth; the chip retries with stale state; reliability degrades.

Clearing the firmware cache at wifiOn (i.e. at every host boot) forces the first join in a session to be a clean 4-way handshake, side-stepping the "AP-forgot-us-but-chip-thinks-it-knows-us" state drift.

This scenario is inferred from 802.11 fundamentals, not documented as a specific reported Pico W bug.

4.2 What clear_on_boot does NOT address

The most-reported reliability issue on Pico W is NOT a stale-PMKSA issue. From S5 (pico-sdk #2153):

  • Symptom: ASYNC(0000,49,0,0,0) events flood the UART; device stops responding; WLAN.isconnected() still returns True.
  • Event 49 = CYW43_EV_ICV_ERROR = "integrity check value error on received frame."
  • Root cause: under power-save mode, the chip sleeps through a group-key rekey exchange. The AP gives up. The chip wakes with an outdated GTK, so every received broadcast/multicast frame (the traffic the GTK protects) fails its integrity check. The chip never realises the link is dead because it never received a deauth (either none was sent or it was missed during sleep).

Fix (S6, cyw43-driver PR #130, merged Jan 2025):

} else if (ev->event_type == CYW43_EV_ICV_ERROR) {
    self->pend_rejoin = true;
    self->pend_rejoin_wpa = false;
    cyw43_schedule_internal_poll_dispatch(cyw43_poll_func);
}

Three statements. On ICV_ERROR, queue a rejoin. The poll loop executes cyw43_ll_wifi_rejoin() which re-issues WLC_SET_SSID and forces a full fresh association.

This fix was merged in January 2025, so it appears to postdate our reference commit (dd7568…, April 2024 vintage) and should be assumed absent there until checked against the reference tree. Regardless, our current Zig driver src/cyw43/protocol/events.zig drops event 49 via else => {} and does nothing about it. This is the most impactful reliability gap in our current driver.

4.3 Secondary issue

From S7 (pico-sdk #1373): first connection after reboot fails; second succeeds. Root cause: AP still has us in its association table; refuses a new connection until the stale entry times out or we explicitly issue cyw43_wifi_leave() before reboot.

Fix (observed): retry with 12 s timeout, OR call cyw43_wifi_leave() in a shutdown hook. The former is already the reference's happy-path under auto_reconnect. The latter is a host-level integration concern (bindings/wifi.zig).

PMKSA clear_on_boot does NOT fix this. Even with a clean PMKSA cache, the AP's own association table is the obstacle, not any key material.


5. Implementation spec for Phase 3

5.1 Minimum viable clear_on_boot (Phase 3 P3b, MANDATORY)

// In ll/boot.zig or ctrl/state.zig near the end of wifiOn().
//
// Clear firmware-side PMKSA cache so the first join of this boot session
// does a clean 4-way handshake. Prevents "AP-forgot-us-but-chip-cached"
// state drift after AP power-cycle.

fn pmksaClearOnBoot(driver: *Driver) WifiError!void {
    // Legacy API payload: __le32 npmk = 0, 16 × {u8 bssid[6]; u8 pmkid[16]}
    // Total 356 bytes, all zero.
    var buf: [356]u8 = @splat(0);

    // Send as plain iovar (no bsscfg prefix).
    try ioctl.setIovar(driver, "pmkid_info", &buf);
}

Placement: at the end of wifiOn(), after WLC_UP has completed successfully and before any join() call is permitted. Driver.state transitions from wifi_up to wifi_up_pmksa_cleared; join() requires the latter.

Error handling:

  • Firmware returns 0 → great, done. State advances.
  • Firmware returns non-zero → record the status; this is an R17-adjacent signal (either the iovar is unsupported on our blob — unexpected given S3 — or the blob is newer than expected and we need V2/V3). For the P3b hardware-verification gate: a non-zero response is a Phase 3 validation failure. For post-verification runtime: log a warning and continue (degraded reliability rather than refusing to boot).
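
The two-regime error policy above amounts to a small decision function. The sketch below is written in C for consistency with the reference snippets; clear_outcome and classify_clear are hypothetical names, and the status value is whatever the firmware returns for the pmkid_info SET.

```c
#include <stdbool.h>

enum clear_outcome {
    CLEAR_OK,            /* status 0: state advances */
    CLEAR_DEGRADED,      /* runtime: log a warning and continue */
    CLEAR_GATE_FAILED    /* P3b hardware-verification gate: validation failure */
};

/* status: firmware status for the pmkid_info SET (0 means success).
 * validating: true while running the hardware-verification gate. */
static enum clear_outcome classify_clear(int status, bool validating)
{
    if (status == 0)
        return CLEAR_OK;
    return validating ? CLEAR_GATE_FAILED : CLEAR_DEGRADED;
}
```

Keeping the decision in one place makes it straightforward to record the raw status alongside the outcome, as the policy asks.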

5.2 cache_in_boot (Phase 3 P3c, STRETCH)

In-boot caching adds:

  • A host-side (bssid, pmkid) → slot_index map, bounded at 16 entries (matches BRCMF_MAXPMKID).
  • On successful joined transition: query firmware for the current PMKID via pmkid_info GET; copy into host cache.
  • On DEAUTH_IND / DISASSOC_IND / ICV_ERROR for the current BSSID: evict that entry from the host cache and re-send the updated list to firmware.

Important: the PMKID produced by the 4-way handshake is firmware-generated, not host-computed. The host cache is a mirror; the firmware's view is authoritative at the wire level. We send the merged list back to firmware so the two views stay consistent.

Host cache is NOT persisted across reboots in V1 — flash-write support is not yet implemented (ISSUES.md open item #3). Persistent PMKSA is Phase B material at the earliest.

5.3 Phase 3 ordering

The plan §8.3 should be updated to reflect the research-derived ordering:

  1. Highest-priority reliability fix: ICV_ERROR (event 49) → pend_rejoin. This is a 3-line addition in the join state machine's event dispatcher. Our plan already commits to exhaustive event decoding — this specifically must include event 49 with the queue-rejoin action.
  2. Secondary: PMKSA clear_on_boot. ~20 LOC. Addresses the independent "AP-forgot-us" failure mode.
  3. Stretch: PMKSA cache_in_boot with DEAUTH eviction. Skip for P3; revisit in Phase B if soak data suggests reconnect-latency is a user-visible issue.

All three land in the Phase 3 commit range; the first two are gates for Phase 4 cutover.


6. Verification plan (on-hardware)

6.1 Does the iovar work on our blob?

30-minute test:

  1. Build a minimal Phase-1-ish test harness (just boot + wifiOn + issue pmkid_info).
  2. Inspect the iovar response status.
  3. Expected: 0 (success). Anything else → consult S3's WLC version compatibility matrix and §5.9.4 fallback in the plan.

6.2 Does clear_on_boot change observed behavior?

Evidence of effectiveness (per §11.3 negative-proof gate):

  1. Timing: reconnect latency after AP power-cycle, new-driver + clear_on_boot vs old-driver (which never issues the iovar). Measure association-to-keyed time across ≥10 trials. New driver should show latency consistent with full 4-way handshake — not the "retry-then-succeed" pattern seen in S7.
  2. On-wire (preferred, requires monitor-mode NIC): capture 802.11 frames during reconnect. New driver's Association Request should have no PMKID in the RSN Information Element. Old driver may include a stale PMKID.
  3. AP-log evidence (preferred if hostapd is the AP): hostapd logs should show no "PMKSA cache entry found for ..." message for our BSSID on the reconnect.

6.3 Does the ICV_ERROR handler work?

This is separate from PMKSA but worth noting:

  • Reproduce the pico-sdk #2153 scenario: put the chip in aggressive PM, wait for rekey, observe event 49 flood.
  • With the handler present: the first event 49 triggers pend_rejoin; the driver reconnects in ~5 s.
  • Without the handler: event 49 floods forever (current state of our Zig driver).

7. Open questions (deliberately deferred)

Q1: Does firmware auto-populate PMKSA cache on a successful handshake, without us calling pmkid_info SET? Best guess: yes, based on brcmfmac design. Verification: issue pmkid_info GET after a join and see whether npmk > 0. If yes, firmware is self-populating; we only need clear_on_boot to prevent carry-over. Phase 3 experiment.

Q2: Does clear_on_boot need to fire on every wifiOn, or only on the first one after a host reboot? The chip stays powered across a host-reboot-via-watchdog on Pico W, so firmware state is preserved — meaning clear_on_boot on every wifiOn is the right choice. (There is no Pico W scenario where the chip has been freshly powered but we skip the clear; both cases converge on "send the iovar at wifiOn.")

Q3: For an extreme-aggressive clear_on_every_join policy, is there a cost? Semantically: every join becomes a full 4-way handshake, so fast-reauth is lost. On a scale where joins happen at the rate of "once per session, maybe once per reconnect," the cost is negligible. On a theoretical mass-roam scenario (many APs, rapid roaming between them within a boot), fast-reauth matters — but that is not Pico W's use case.


8. Sign-off

For Phase 3 P3b coding:

  • Iovar name identified: pmkid_info
  • Payload shape verified: 356 bytes, __le32 npmk + 16 × {bssid[6], pmkid[16]}
  • Bsscfg prefix: no, plain iovar
  • Endianness: npmk is little-endian; rest is byte-arrays
  • Blob compatibility: CYW43439 explicitly supported via legacy API in brcmfmac; WLC<12.0 firmware uses legacy only
  • License boundary: brcmfmac is GPL-2.0; this research document cites algorithmic behavior and struct layouts as protocol evidence, not copied code. Zig implementation should be written from this spec without referencing brcmfmac source directly.
  • Hardware verification plan defined (§6)
  • Reliability-context framing corrected: PMKSA complements, does not replace, the ICV_ERROR handler

Ready for Phase 3 coding. The plan's §5.9.4 fallback path is still appropriate if outcome (1) does not pan out on hardware, but the expected outcome is (1) — the iovar works on our blob and clear_on_boot is a ~20-line addition.


Research conducted 2026-04-20. See docs/CYW43-REWRITE.md §5.9 for how these findings integrate with the larger rewrite plan.

CYW43 pure-Zig rewrite — execution plan

Status box

  • State: plan produced, peer-reviewed (GPT-5.4, conversation pico-cyw43-rewrite-plan-2026). Ready to execute in subsequent coding sessions.
  • Goal: replace src/cyw43/ with a reference-quality pure-Zig driver that matches or exceeds pico-sdk/lib/cyw43-driver reliability.
  • Non-goals (this milestone): Bluetooth, SDIO transport, lwIP glue, AP-mode as default path. All enumerated as deferred phases in §8.
  • Primary regression gate: -Dengine=js UF2 byte-identical to .preflight-baseline/pico-preintegration.uf2 throughout the rewrite.
  • Secondary regression gate: wire-byte equivalence of the new driver against captured old-driver SPI traces at init / scan / join / idle (see §7.2).
  • Scope in lines: audit of ~5,400 reference C; delivery of ~2,800–3,500 Zig across ~18 new files. See §4.
  • Effort: 5 focused coding sessions of 6–10 hours each, across 2–4 weeks of hardware-iteration calendar time. See §8.
  • Highest risks: wire-format misalignment (endianness, padding, unaligned access on Cortex-M0+); event-ordering assumptions that differ from firmware’s actual delivery order; PMKSA iovar discovery on our blob vintage (now a mandatory deliverable — §5.9, R6); auth-retry improvements colliding with firmware-side retry logic. See §9.

Reading order. Read this document top-to-bottom before writing any code. Sections build on each other. If you are executing the plan, do not skip §3 (Zig idiom style guide) or §10.3 (attribution mechanics). The go/no-go checklist in §11 is a hard gate, not a suggestion.

Companion references.

  • AGENTS.md § "CYW43 Gotchas" — 7 hard-won gotchas (#15–#21) that apply to any driver.
  • ISSUES.md #25 — the 180 s UART corruption burst; the rewrite must collect data sufficient to identify the offending event.
  • ZIG-0.16.0-REFERENCE.md / ZIG-0.16.0-QUICKSTART.md — 0.16 idioms.
  • docs/CYW43-PMKSA-RESEARCH.md — first-party-verified spec for the pmkid_info iovar (pre-Phase-3 research completed).
  • docs/NANORUBY.md + src/ruby/nanoruby/UPSTREAM.md — the template for this plan’s structure and for the vendoring-change-tracking discipline applied in §10.

Plan revision log. Key decisions made during planning, in reverse chronological order:

Date Decision Record
2026-04-20 GPT-5.4 final sign-off (peer-review turn 6) with 12 targeted tightenups all applied. Plan is "proceed to Phase 1" per peer. Tightenups: (a) WPA3 bool → Wpa3Mode enum with .auto default + fallback table; (b) event-mask blob-coupling note; (c) yieldDuringLongOp non-reentrancy rules (5 explicit); (d) EventLog tuple-key precision ({event_type, status, reason, auth_type, ifidx}); (e) PMKSA clear_on_boot lifecycle pinned to 5-step wifiOn flow with race-window rationale; (f) §6.1.5 rejoin-storm coalescing rule (prevents livelock under crypto-error bursts); (g) three-flag semantics pin-downs for WPA2/WPA3/open + failure-latch clearing; (h) logger-disabled soak variant for ISSUES.md #25 (separates event existence from UART print perturbation); (i) fixture-metadata sidecar schema with mandatory SHA checks (closes R17 loop); (j) §8.0.4 no-optimization-before-parity rule; (k) §8.1 host-interrupt-pin behavioral verification task; (l) §8.0.2 checkpoints labeled required vs conditional. This row + 12 sections named.
2026-04-20 Unknown-event logging mechanism consolidated into authoritative §6.1.4. Replaces scattered fragments in §3.2, §3.4, §4.1, §5.8, §6.1 event-table last row, §8.2, and §11.3 with a single spec: full UnknownEvent struct (10 fields), 89-entry event-name table, EventLog ring buffer (16 entries, 5-s coalescing window), concrete log-line formats, Driver.getEventLog() query API, wifi events UART shell command, and specific ISSUES.md #25 resolution procedure. This is the component that closes the loop on any remaining spurious events. §6.1.4 + §11.3 checkbox updated
2026-04-20 Executability closures applied — added build.zig integration sketch (§8.0.1), peer-review checkpoint table (§8.0.2), PIO SPI carryover note (§8.0.3), full golden-trace capture procedure + logic-analyzer pin mapping + spi_trace_to_mock.py tool spec (§7.2.3–7.2.4), mock SPI framework code sketch (§7.2.5), HostHooks.yieldDuringLongOp watchdog-feeding hook (§5.3), event-mask bit-by-bit construction table (§12.3), NVRAM variant verification checklist (§12.4.1), event-payload capture procedure for Phase 2 (§7.1.1). This table + sections named.
2026-04-20 Reference drivers consulted (Embassy Rust @ 3c70a9bf, soypat Go @ 045049fee, Linux brcmfmac). 12 new stability-relevant findings surfaced. §10.4.1 lineage, §2.8 G17–G28
2026-04-20 Firmware vintage reality-check: our blob is 7.95.61 (Jan 2023), not 7.95.49.00 as earlier drafts assumed. No update recommended (identical to Embassy's shipped firmware; WPA3 already supported). §12.4, §5.9.2, docs/CYW43-PMKSA-RESEARCH.md correction
2026-04-20 Three-flag link-state model (auth_ok, join_ok, keyed) adopted, replacing pico-sdk's wifi_join_state bitmask. Cleaner to test; aligns with Embassy+soypat. §6.1
2026-04-20 WPA3-SAE mandated as Phase 3 deliverable, not optional. Firmware supports sae+mfp; reference driver has the join flow; no cost to include. wpa3_mode: Wpa3Mode = .auto by default per §5.10 (replaces earlier enable_wpa3: bool proposal after GPT-5.4 turn-6 review flagged bool-vs-enum conflation). §5.10
2026-04-20 Seven additional events added to the handler: MIC_ERROR (17), UNICAST_DECODE_ERROR (50), MULTICAST_DECODE_ERROR (51), PSM_WATCHDOG (41), PMKID_CACHE (21), GTK_PLUMBED (84), BCNLOST_MSG (31). Several are ignored by every reference driver; we're a step-change. §2.2 expanded table, §6.1 event-to-transition table
2026-04-20 PSK_SUP reason=14 → IGNORE rule added. This fixes a real latent bug in earlier drafts — without it, roam events trigger spurious rejoins. §6.1, §7.1 test matrix
2026-04-20 PMKID_CACHE event used for cache_in_boot sync (not polling). §5.9.5
2026-04-20 PMKSA reframed from time-boxed enhancement to hard Phase 3 deliverable (peer-review-approved override). §5.9
2026-04-20 PMKSA research artifact produced with verified iovar spec. docs/CYW43-PMKSA-RESEARCH.md
(initial) Peer-reviewed rewrite plan committed. §1–12 core structure

Section 1 — Executive summary

The pico project runs firmware on Raspberry Pi Pico W (RP2040 + CYW43439 2.4 GHz WiFi/BT combo). WiFi is provided today by src/cyw43/ — a ~2,200-line Zig driver that associates and carries TCP/TLS/MQTT in a happy-path workload but has documented reliability gaps:

| Gap | Symptom | Root cause | Fix in this plan |
| --- | --- | --- | --- |
| No auto-reconnect on DEAUTH/DISASSOC | Dropped link stays down until next TX fails. | Event handler has else => {} on most paths. | Exhaustive event decoder + join state machine §6.1. |
| Minimal PSK retry | Transient key-exchange edge misses are terminal. | No pend_rejoin deferred-action mechanism. | pend-flag machinery §6.1. |
| Stale AP state | Repeated 4-way handshake failures after router power-cycle. | No PMKSA cache management. | clear_on_boot (mandatory Phase 3, §5.9). |
| Many events dropped silently | ISSUES.md #25: 180 s UART-corruption burst. | Exhaustive event decoding absent. | Broad event mask + rate-limited unknown-event logging (§8.2). |
| 72-line join.zig | ~10% of reference driver's join logic ported. | Several hundred lines of state-machine unfolded. | Three-flag model §6.1 + 20+ event handlers. |
| No link-status monitoring | Link drops invisible until next send fails. | No dedicated link state machine. | Link state machine §6.2 with BCNLOST_MSG trigger. |
| ICV_ERROR flood under power-save (pico-sdk #2153) | Silent link-death after ~minutes of PM2; isconnected() lies. | Event 49 dropped via else => {}. | Event 49 + MIC/UNICAST/MCAST_DECODE_ERROR + PSM_WATCHDOG in same pend_rejoin class (§6.1). |
| Spurious rejoin during roams | Unnecessary reassociation when AP hands off. | PSK_SUP reason=14 treated as failure. | Explicit IGNORE rule (§6.1 table). |
| WPA3-only APs break (AGENTS.md gotcha #28) | DEAUTH type=6 repeated on WPA3 networks. | No SAE join flow; mfp=1 hard-coded. | WPA3-SAE implementation (§5.10). |

What this plan produces. Not code. A document that a subsequent coding session — with access to this plan, AGENTS.md, the reference C, and the user-ai MCP — can follow step-by-step to produce a new src/cyw43_new/ pure-Zig driver that:

  1. is file-by-file structured around protocol layers, not around the C source’s historical decomposition (§4);
  2. ports behavior, not bit layout, of the reference state machines (§6);
  3. decodes every event type exhaustively from day 1 (§6.4 + §9 mitigation for ISSUES.md #25);
  4. surfaces a clean Zig instance API over a compat façade that preserves existing bindings/wifi.zig calls during migration (§5);
  5. is validated against golden SPI-wire traces before any hardware cutover (§7.2);
  6. carries rigorous attribution of reference-driver lineage (§10).

Scope cuts. Out of scope for this rewrite (each is a potential future phase, explicitly not required now):

  • SDIO transport (Pico W is SPI-only).
  • Bluetooth HCI (WiFi-only rewrite).
  • lwIP integration (we have our own TCP/IP in src/net/).
  • AP-mode as a default-reachable path. Internal architecture must preserve bsscfgidx / interface-id plumbing so AP-mode can land in a future phase without an API-wide refactor.

What "done" looks like. -Dcyw43=new is the default. The old tree is deleted. bindings/wifi.zig still works. The driver survives the acceptance matrix in §7.3, including router power-cycle (H2 — PMKSA clear_on_boot is what makes this pass) and forced deauth (H3). The JS-mode UF2 is still byte-identical to the pre-rewrite baseline.


Section 2 — Deep audit of the reference C driver

The reference is misc/pico-sdk/lib/cyw43-driver/ at commit dd7568229f3bf7a37737b9e1ef250c26efe75b23 (April 2024), under LICENSE.RP. Non-SPI / non-WiFi files (cyw43_sdio.*, cyw43_bthci_uart.c, cyw43_lwip.c, cyw43_stats.*, cyw43_btbus.h) are audited to the extent needed to carve the SPI+WiFi seam; they are not ported.

2.1 cyw43.h (733 lines — public API header)

Purpose. Public C API consumed by the pico-sdk integrator (cyw43_arch_*). Defines the cyw43_t top-level state struct and the full function surface.

Top-level struct cyw43_t (lines 108–152):

  • cyw43_ll_t cyw43_ll; — the low-level opaque state (array of u32 words, size CYW43_LL_STATE_SIZE_WORDS).
  • uint8_t itf_state; — bitmask: bit 0 = STA up, bit 1 = AP up.
  • uint32_t trace_flags; — debug flags (CYW43_TRACE_ASYNC_EV, CYW43_TRACE_ETH_TX, CYW43_TRACE_ETH_RX, CYW43_TRACE_ETH_FULL, CYW43_TRACE_MAC).
  • volatile uint32_t wifi_scan_state; 0 = idle, 1 = active, 2 = complete. Volatile because it’s written from event-callback context.
  • uint32_t wifi_join_state; this is the join state machine. See §2.3.
  • void *wifi_scan_env; int (*wifi_scan_cb)(void*, const cyw43_ev_scan_result_t*); — scan callback.
  • bool pend_disassoc, pend_rejoin, pend_rejoin_wpa; — deferred actions processed by cyw43_poll_func.
  • AP settings: ap_auth, ap_channel, ap_ssid[32], ap_key[64] + lengths.
  • uint8_t mac[6]; — from OTP or generated LAA.

Link-status constants (lines 98–106):

CYW43_LINK_DOWN    = 0
CYW43_LINK_JOIN    = 1    (wifi up, no IP yet)
CYW43_LINK_NOIP    = 2    (associated, DHCP pending)
CYW43_LINK_UP      = 3    (IP assigned)
CYW43_LINK_FAIL    = -1
CYW43_LINK_NONET   = -2   (no matching SSID)
CYW43_LINK_BADAUTH = -3

Public functions (summarised; line refs to cyw43.h):

| Function | Line(s) | Purpose | Ported? |
| --- | --- | --- | --- |
| cyw43_init / cyw43_deinit | 165 / 174 | Lifecycle. | Yes. |
| cyw43_ioctl | 188 | Raw ioctl dispatch. | Yes (thin). |
| cyw43_send_ethernet | 202 | TX ethernet frame. | Yes. |
| cyw43_wifi_pm / cyw43_wifi_get_pm | 220 / 239 | Power-save mode. | Yes. |
| cyw43_wifi_link_status | 261 | Query STA link state (maps wifi_join_state → LINK_*). | Yes. |
| cyw43_wifi_set_up | 278 | Bring STA/AP interface up. | Yes. |
| cyw43_wifi_get_mac | 290 | Fetch MAC. | Yes. |
| cyw43_wifi_update_multicast_filter | 303 | Add/remove mcast addr. | Yes. |
| cyw43_wifi_scan | 318 | Start scan + register callback. | Yes. |
| cyw43_wifi_scan_active | 328 | Is a scan running? | Yes (inline). |
| cyw43_wifi_join | 351 | WPA/WPA2/WPA3 join. | Yes. |
| cyw43_wifi_leave | 362 | Disassoc. | Yes. |
| cyw43_wifi_get_rssi | 374 | RSSI query. | Yes. |
| cyw43_wifi_get_bssid | 383 | BSSID query. | Yes. |
| cyw43_wifi_ap_* | 394..495 | AP-mode (init/set_ssid/set_password/set_auth/get_stas). | Deferred (§8). |
| cyw43_is_initialized | 505 | init flag. | Yes. |
| cyw43_cb_tcpip_* | 518..552 | Integrator-supplied callbacks. | Replaced by event-sink interface (§5). |
| cyw43_tcpip_link_status | 574 | Combined link status. | Yes. |
| cyw43_gpio_set / cyw43_gpio_get | 588 / 601 | CYW43-hosted GPIO (LED). | Yes. |
| cyw43_pm_value | 627 | Pack PM params into u32. | Yes (inline). |
| cyw43_bluetooth_* | 662–720 | BT. | Out of scope. |

Global state. Reference exposes cyw43_t cyw43_state;, void (*cyw43_poll)(void);, uint32_t cyw43_sleep; as extern (lines 154–156). In Zig we do not mirror this. The new driver is an explicit Driver instance passed into every function that needs it. The cyw43_poll function pointer shape is replaced by a method on the driver; cyw43_sleep becomes a field. See §3.8 (globals) and §5.

2.2 cyw43_ll.h (322 lines — low-level API)

Purpose. The "low-level" layer: the set of operations the mid-level driver (cyw43_ctrl.c) calls into. Also defines the on-wire event struct and scan-result struct.

IOCTL commands (lines 56–64). Bottom bit encodes SET vs GET: (cmd & 1) ? SDPCM_SET : SDPCM_GET, actual WLC_* command is cmd >> 1. Constants:

CYW43_IOCTL_GET_SSID       0x32   (WLC_GET_SSID   = 25,  get)
CYW43_IOCTL_GET_CHANNEL    0x3a   (WLC_GET_CHANNEL= 29,  get)
CYW43_IOCTL_SET_DISASSOC   0x69   (WLC_DISASSOC   = 52,  set)
CYW43_IOCTL_GET_ANTDIV     0x7e   (WLC_GET_ANTDIV = 63,  get)
CYW43_IOCTL_SET_ANTDIV     0x81   (WLC_SET_ANTDIV = 64,  set)
CYW43_IOCTL_SET_MONITOR    0xd9   (WLC_SET_MONITOR= 108, set)
CYW43_IOCTL_GET_RSSI       0xfe   (WLC_GET_RSSI   = 127, get)
CYW43_IOCTL_GET_VAR        0x20c  (WLC_GET_VAR    = 262, get)
CYW43_IOCTL_SET_VAR        0x20f  (WLC_SET_VAR    = 263, set)
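
The encoding can be sketched in C as below. This is a minimal illustration of the bit layout described above; the helper names (`ioctl_make`, `ioctl_wlc_cmd`, `ioctl_is_set`) are hypothetical and do not appear in the reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for the CYW43_IOCTL_* encoding:
 * bit 0 selects SET (1) or GET (0); the WLC_* id sits in the bits above. */
static inline uint32_t ioctl_make(uint32_t wlc_cmd, bool is_set) {
    assert(wlc_cmd <= 0x7fffffffu);   /* must leave room for the SET/GET bit */
    return (wlc_cmd << 1) | (is_set ? 1u : 0u);
}

static inline uint32_t ioctl_wlc_cmd(uint32_t ioctl) { return ioctl >> 1; }

static inline bool ioctl_is_set(uint32_t ioctl) { return (ioctl & 1u) != 0; }
```

For example, `ioctl_make(263, true)` yields 0x20f (CYW43_IOCTL_SET_VAR) and `ioctl_wlc_cmd(0x69)` yields 52 (WLC_DISASSOC).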

Event types. The on-wire event_type field is big-endian u32. Table below enumerates every event our decoder specifically handles; the full 89-entry name table appears in §12.3. Events not in this table are decoded but log-only (with rate-limiting per §8.2).

Scope expansion from original reference. The pico-sdk C driver handles a narrow event set (~10 event types). The plan expands to 20+ after cross-referencing Embassy, soypat, and Linux brcmfmac for events the reference C ignores but which carry real reliability signal. The "Source" column indicates which reference surfaced the handling pattern:

| # | Name | Used for | Action | Source |
|---|------|----------|--------|--------|
| 0 | SET_SSID | Join success/failure signaling. status=NO_NETWORKS → failed_nonet. | join_ok / failed_* | pico-sdk C cyw43_ctrl.c:373 |
| 1 | JOIN | Associated with AP. | join_ok = true | Embassy runner.rs:1198 |
| 3 | AUTH | Authentication complete/failed. FAIL+reason=16+auth_type=3 is WPA3-specific. | auth_ok update; WPA3 link-down | Embassy runner.rs:1183-1196 |
| 5 | DEAUTH | Received deauth from AP (SUCCESS). | Link-down class | Embassy runner.rs:1182 |
| 6 | DEAUTH_IND | Deauth indication. reason=2 → bad-pw disassoc. | pend_disassoc or link-down | pico-sdk C cyw43_ctrl.c:396 |
| 7 | ASSOC | Association complete (AP mode; decoder-only for STA). | Log only | n/a |
| 11 | DISASSOC | Locally initiated disassoc. | Link-down class | pico-sdk C cyw43_ctrl.c:353 |
| 12 | DISASSOC_IND | Remote disassoc. | Link-down class | pico-sdk C has it commented out; we enable it |
| 16 | LINK | Link up/down (flags bit 0). | Link-down or link-up | pico-sdk C cyw43_ctrl.c:402 |
| 17 | MIC_ERROR | TKIP/WPA MIC failure (crypto drift). | pend_rejoin | New in plan: all reference drivers ignore; sibling of ICV_ERROR |
| 21 | PMKID_CACHE | Firmware-originated PMKSA change notification. | §5.9.3 cache_in_boot sync | New in plan: enables cleaner cache_in_boot design |
| 23 | PRUNE | AP pruned. reason=8 = RSN mismatch → WPA1 fallback rejoin. | pend_rejoin_wpa + pend_rejoin | pico-sdk C cyw43_ctrl.c:366 |
| 31 | BCNLOST_MSG | Beacon loss from AP. Early disassoc signal. | LinkState degraded | New in plan: §6.2 trigger |
| 41 | PSM_WATCHDOG | Firmware microcode watchdog fired. Firmware in distress. | Log WARN + stats + pend_rejoin | New in plan: all references ignore |
| 46 | PSK_SUP | Supplicant status. status=6 + flags=0 + reason=0 → keyed. reason=14 → IGNORE (roam). reason=15 timeout → rejoin. | keyed / IGNORE / pend_rejoin / failed_badauth | Embassy runner.rs:1208-1222 |
| 49 | ICV_ERROR | Integrity check value failed (crypto drift). | pend_rejoin | cyw43-driver PR #130 (Jan 2025) + pico-sdk issue #2153 |
| 50 | UNICAST_DECODE_ERROR | Can't decrypt our unicast; our session key is wrong. | pend_rejoin | New in plan |
| 51 | MULTICAST_DECODE_ERROR | Can't decrypt broadcast. Mixed-client networks produce natural misses. | Threshold-triggered pend_rejoin (3 within 5 s) | New in plan: §6.1.1 |
| 69 | ESCAN_RESULT | Scan progress/complete. status=8 partial / status=0 done. | Dispatch to scan sink | pico-sdk C cyw43_ctrl.c:340 |
| 80 | CSA_COMPLETE_IND | Channel switch complete. | Log only | pico-sdk C |
| 84 | GTK_PLUMBED | Group key successfully installed. | Update last_healthy_ms counter | New in plan: positive health signal |
| 87 | ASSOC_REQ_IE | Our assoc request IEs (debug). | Log only | pico-sdk C |
| 88 | ASSOC_RESP_IE | AP's assoc response IEs (debug). | Log only | pico-sdk C |

All other events (32 ROAM_PREP, 37 ROAM_START, 36 JOIN_START, 38 ASSOC_START, 35 RESET_COMPLETE, 40 RADIO, etc.) are log-only with rate-limiting. Unknown event types fall through to the unknown-event handler per §2.4.9 and §8.2.

Event status values (lines 85–95). Generic across event types. SUCCESS=0, FAIL=1, TIMEOUT=2, NO_NETWORKS=3, ABORT=4, NO_ACK=5, UNSOLICITED=6, ATTEMPT=7, PARTIAL=8, NEWSCAN=9, NEWASSOC=10.

PSK_SUP supplicant states (lines 98–112). Carried in the event status for PSK_SUP:

SUP_DISCONNECTED=0    SUP_CONNECTING=1      SUP_IDREQUIRED=2
SUP_AUTHENTICATING=3  SUP_AUTHENTICATED=4   SUP_KEYXCHANGE=5
SUP_KEYED=6           SUP_TIMEOUT=7         SUP_LAST_BASIC=8
SUP_KEYXCHANGE_PREP_M4=9   SUP_KEYXCHANGE_WAIT_G1=10
SUP_KEYXCHANGE_PREP_G2=11

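The reference treats status 4, 8, or 10 (SUP_AUTHENTICATED, SUP_LAST_BASIC, SUP_KEYXCHANGE_WAIT_G1) combined with reason 15 (SUP_WPA_PSK_TMO) as the trigger for pend_rejoin. Any other status that is neither KEYED nor one of these timeouts is terminal BADAUTH.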
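
A minimal C sketch of that decision follows. It covers only the rule as stated here for the reference (status and reason), not the extra flags/reason conditions listed for Embassy in the event table; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { SUP_KEYED = 6, SUP_WPA_PSK_TMO = 15 };

/* Hypothetical result type for illustration only. */
typedef enum { PSK_KEYED, PSK_REJOIN, PSK_BADAUTH } psk_sup_action_t;

/* Classify a PSK_SUP event from its status and reason fields. */
static psk_sup_action_t classify_psk_sup(uint32_t status, uint32_t reason) {
    if (status == SUP_KEYED) {
        return PSK_KEYED;                       /* sets the KEYED flag */
    }
    if ((status == 4 || status == 8 || status == 10) &&
        reason == SUP_WPA_PSK_TMO) {
        return PSK_REJOIN;                      /* key-exchange timeout: retry */
    }
    return PSK_BADAUTH;                         /* anything else is terminal */
}
```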

Roam reasons (115–123), prune reasons (126–144), supplicant failure reasons (147–162). Complete tables — the new Zig code must include them verbatim because log decoding depends on them.

Auth types (170–175):

CYW43_AUTH_OPEN              = 0
CYW43_AUTH_WPA_TKIP_PSK      = 0x00200002
CYW43_AUTH_WPA2_AES_PSK      = 0x00400004     (preferred)
CYW43_AUTH_WPA2_MIXED_PSK    = 0x00400006
CYW43_AUTH_WPA3_SAE_AES_PSK  = 0x01000004
CYW43_AUTH_WPA3_WPA2_AES_PSK = 0x01400004

These are wire-format values. The low byte is wsec (the WLC_SET_WSEC value — 4 = AES, 2 = TKIP, 6 = AES|TKIP); the upper bytes encode WPA/WPA2/WPA3 auth flags used elsewhere.
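
As a small illustration of that split, the sketch below separates the low wsec byte from the upper flag bits. The helper names and the `WSEC_*` enum are assumptions made for the example; only the `CYW43_AUTH_*` values come from the list above.

```c
#include <assert.h>
#include <stdint.h>

#define CYW43_AUTH_WPA_TKIP_PSK   0x00200002u
#define CYW43_AUTH_WPA2_AES_PSK   0x00400004u
#define CYW43_AUTH_WPA2_MIXED_PSK 0x00400006u

enum { WSEC_TKIP = 2, WSEC_AES = 4 };   /* hypothetical names for the wsec bits */

static_assert((CYW43_AUTH_WPA2_AES_PSK & 0xffu) == WSEC_AES, "low byte is wsec");

/* Low byte: the value handed to WLC_SET_WSEC. */
static inline uint32_t auth_wsec(uint32_t auth) { return auth & 0xffu; }

/* Remaining bits: WPA/WPA2/WPA3 family flags, interpreted elsewhere. */
static inline uint32_t auth_family_flags(uint32_t auth) { return auth & ~0xffu; }
```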

Scan-result struct (cyw43_ev_scan_result_t, 216–227). 48 bytes including bssid[6], ssid_len, ssid[32], channel (u16, top byte is flags), auth_mode (1/2/4 bitmask = WEP/WPA/WPA2), rssi (i16). The layout has several _0[5], _1[2], _2[5], _3 padding fields — the wire order and byte offsets matter, do not re-pack.

Async event struct (cyw43_async_event_t, 230–242). Fixed header (flags, event_type, status, reason, 30 bytes reserved, interface, 1 byte reserved), then a union containing either a scan_result or (in the reference) other typed payloads. On-wire fields flags/event_type/status/reason are big-endian, decoded by the parser.

LL API functions (270–318):

cyw43_ll_init / cyw43_ll_deinit
cyw43_ll_bus_init                  — SPI bringup + fw/nvram/clm upload
cyw43_ll_bus_sleep                 — sleep/wake (KSO on SDIO; simpler on SPI)
cyw43_ll_process_packets           — drain RX queue
cyw43_ll_ioctl                     — raw ioctl
cyw43_ll_send_ethernet             — TX ethernet
cyw43_ll_wifi_on                   — enable country + basic iovars + WLC_UP
cyw43_ll_wifi_pm / _get_pm         — power-save
cyw43_ll_wifi_scan                 — escan
cyw43_ll_wifi_join                 — start association (fills last_ssid_joined)
cyw43_ll_wifi_set_wpa_auth         — switch to WPA1 (for WPA2→WPA1 fallback on PRUNE)
cyw43_ll_wifi_rejoin               — re-issue SET_SSID with cached last_ssid_joined
cyw43_ll_wifi_get_bssid / _get_mac / _update_multicast_filter
cyw43_ll_wifi_ap_init / _set_up / _get_stas  — AP mode (deferred)
cyw43_ll_gpio_set / _get           — CYW43 GPIO (LED is gpio 0)
cyw43_ll_has_work / _bt_has_work   — "anything pending?"
cyw43_ll_write_backplane_reg/mem, _read_backplane_reg/mem  — BT use only in ref

Mid-level callbacks (309–312). Integrator must provide:

cyw43_cb_read_host_interrupt_pin   — read WL_HOST_WAKE / WL_IRQ
cyw43_cb_ensure_awake              — set cyw43_sleep=CYW43_SLEEP_MAX
cyw43_cb_process_async_event       — dispatch decoded event
cyw43_cb_process_ethernet          — dispatch RX ethernet

In Zig these become fields on a HostHooks struct passed to Driver.init().

2.3 cyw43_ctrl.c (808 lines — mid/high level)

Purpose. The behavioral contract: join state machine, scan result dispatch, link state transitions driven by async events, deferred-action processing.

2.3.1 Join state machine (the core reliability win)

Encoding (lines 56–65):

WIFI_JOIN_STATE_KIND_MASK = 0x000f    // 4-bit kind enum
WIFI_JOIN_STATE_ACTIVE    = 0x0001
WIFI_JOIN_STATE_FAIL      = 0x0002
WIFI_JOIN_STATE_NONET     = 0x0003
WIFI_JOIN_STATE_BADAUTH   = 0x0004
WIFI_JOIN_STATE_AUTH      = 0x0200    // flag: authenticated
WIFI_JOIN_STATE_LINK      = 0x0400    // flag: link-up received
WIFI_JOIN_STATE_KEYED     = 0x0800    // flag: PSK supplicant done
WIFI_JOIN_STATE_ALL       = 0x0e01    // success condition

A join succeeds when wifi_join_state == (ACTIVE | AUTH | LINK | KEYED), i.e. 0x0e01. At that point cyw43_ctrl.c:434-438 clears the flag bits back to ACTIVE and calls cyw43_cb_tcpip_set_link_up(STA).
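
The success test can be sketched in C as follows. The constants are the ones listed above; `join_state_add_flag` is a hypothetical helper, and the link-up callback is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WIFI_JOIN_STATE_ACTIVE 0x0001u
#define WIFI_JOIN_STATE_AUTH   0x0200u
#define WIFI_JOIN_STATE_LINK   0x0400u
#define WIFI_JOIN_STATE_KEYED  0x0800u
#define WIFI_JOIN_STATE_ALL    0x0e01u

static_assert(WIFI_JOIN_STATE_ALL ==
              (WIFI_JOIN_STATE_ACTIVE | WIFI_JOIN_STATE_AUTH |
               WIFI_JOIN_STATE_LINK | WIFI_JOIN_STATE_KEYED),
              "ALL is ACTIVE plus the three progress flags");

/* Hypothetical helper: OR a progress flag into the state. Returns true
 * exactly when the join completes; the flags are then cleared back to
 * ACTIVE and the caller is expected to raise link-up. */
static bool join_state_add_flag(uint32_t *state, uint32_t flag) {
    *state |= flag;
    if (*state == WIFI_JOIN_STATE_ALL) {
        *state = WIFI_JOIN_STATE_ACTIVE;
        return true;
    }
    return false;
}
```

Because the comparison is against the exact value 0x0e01, a state whose kind is FAIL, NONET, or BADAUTH can never satisfy it, whichever flags arrive later.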

State transitions driven by cyw43_cb_process_async_event (333–439):

                     +---------+
                     |  idle   |  wifi_join_state = 0
                     +----+----+
                          | cyw43_wifi_join()
                          v
                     +---------+
                     | ACTIVE  |
                     +----+----+
                          |
                  +-------+-----------+-----------+-----------+
                  |                   |           |           |
              EV_AUTH              EV_LINK     EV_PSK_SUP  EV_SET_SSID
              ok/fail              (flags&1)   (status)    (status)
                  |                   |           |           |
                 |AUTH              |LINK       s=6:|KEYED   s=0: noop
                 status=6:                      s=4|8|10 r=15
                 ignore                           pend_rejoin
                 else: BADAUTH                    else: BADAUTH
                                                               s=3 r=0: NONET
                                                               else: FAIL

          if all four bits set (ACTIVE|AUTH|LINK|KEYED) → tcpip_set_link_up()
                                                         state = ACTIVE alone
                          |
                          v
                  +---------+
                  |JOINED   |  wifi_join_state = ACTIVE only, link up
                  +----+----+
                       |
              +--------+--------+---------+
              |                 |         |
          EV_DEAUTH_IND     EV_DISASSOC EV_ICV_ERROR
          (reason=2:        (locally    (always pend_rejoin)
           wrong passwd)    disassoc)
          pend_disassoc=t;  state=0;    pend_rejoin=t
          next poll issues  tcpip_set_
          WLC_DISASSOC      link_down
              |                 |         |
              v                 v         v
                         ... back to idle / rejoining ...

          EV_PRUNE (status=0 reason=8):  RSN mismatch at AP — try WPA1:
            pend_rejoin = true
            pend_rejoin_wpa = true   (poll issues cyw43_ll_wifi_set_wpa_auth first)

Deferred actions in cyw43_poll_func (218–271):

if (pend_disassoc)   → WLC_DISASSOC ioctl; clear flag
if (pend_rejoin_wpa) → cyw43_ll_wifi_set_wpa_auth(); clear flag
if (pend_rejoin)     → cyw43_ll_wifi_rejoin(); state = ACTIVE

The order matters: pend_disassoc first (cleans slate), then pend_rejoin_wpa (switches auth mode), then pend_rejoin (starts new SET_SSID).
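
A small C sketch of that ordering is shown below. It only records which actions would run, in order; the struct and function names are hypothetical, and the real poll function issues the ioctls directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ACT_DISASSOC, ACT_SET_WPA_AUTH, ACT_REJOIN } pend_action_t;

typedef struct {
    bool pend_disassoc;
    bool pend_rejoin_wpa;
    bool pend_rejoin;
} pend_flags_t;

/* Hypothetical drain routine: writes the actions to run, in the required
 * order, into out[] (capacity 3), clears each flag, and returns the count. */
static size_t drain_pending(pend_flags_t *p, pend_action_t out[3]) {
    size_t n = 0;
    assert(p != NULL && out != NULL);
    if (p->pend_disassoc)   { p->pend_disassoc = false;   out[n++] = ACT_DISASSOC; }
    if (p->pend_rejoin_wpa) { p->pend_rejoin_wpa = false; out[n++] = ACT_SET_WPA_AUTH; }
    if (p->pend_rejoin)     { p->pend_rejoin = false;     out[n++] = ACT_REJOIN; }
    return n;
}
```

Event handlers only set flags; the poll loop is the single place where the flags turn into ioctls.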

What cyw43_ll_wifi_rejoin does (cyw43_ll.c:2184): re-issues WLC_SET_SSID on the 36-byte last_ssid_joined buffer cached at join time. Does not re-send the PMK, does not re-set WSEC — the firmware keeps those from the original join call.

Why this is the reliability win. Every reconnect scenario (deauth, disassoc, ICV error, router power-cycle, password-related, RSN-mismatch, cell-edge PSK timeout) is funneled through this single deferred-rejoin mechanism. Our current Zig driver lacks the machinery, so a single deauth is terminal.

2.3.2 Scan state machine

Much simpler (333–351):

wifi_scan_state:
  0 = idle
  1 = scanning (set by cyw43_wifi_scan)
  2 = complete (set when ESCAN_RESULT event arrives with status=0)

per-result (status=8):
  call wifi_scan_cb(env, &ev->u.scan_result); continue
on complete (status=0):
  wifi_scan_state = 2

Gotcha. The reference declares wifi_scan_state as volatile (cyw43.h:115) because it’s written from event-callback context and polled from cyw43_wifi_scan_active. In our Zig port the same must be true — use an atomic or explicit volatile load/store; the driver must not require a lock to read scan state.
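
One way to get lock-free reads is sketched below in C11 with `<stdatomic.h>`; the Zig port would use its own atomics. The type and function names are hypothetical, and the reference itself uses a plain volatile field rather than atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { SCAN_IDLE = 0, SCAN_ACTIVE = 1, SCAN_COMPLETE = 2 };

typedef struct { atomic_int state; } scan_state_t;

/* Called when a scan is started. */
static void scan_start(scan_state_t *s) {
    atomic_store(&s->state, SCAN_ACTIVE);
}

/* Called from event context for each ESCAN_RESULT.
 * status 8 = partial result (state unchanged), status 0 = scan done. */
static void scan_on_event(scan_state_t *s, unsigned status) {
    if (status == 0) {
        atomic_store(&s->state, SCAN_COMPLETE);
    }
}

/* Safe to call from any context without taking a lock. */
static bool scan_is_active(scan_state_t *s) {
    return atomic_load(&s->state) == SCAN_ACTIVE;
}
```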

2.3.3 Other control-plane helpers

  • cyw43_ensure_up (148–215) — pin init, REG_ON reset pulse (20ms low, 50ms high), SDIO init, bus init (uploads firmware), enables interrupts, kicks the scheduler.
  • cyw43_wifi_on (475) — cyw43_ensure_up, turn on RF switch, cyw43_ll_wifi_on(country).
  • cyw43_wifi_set_up (553) — if up: ensure on, set PM default, init AP or init netif; if down (AP only): tear down AP link.
  • cyw43_wifi_join (632) — preconditions STA active, calls cyw43_ll_wifi_join, sets wifi_join_state = ACTIVE and (if OPEN) |= KEYED.
  • cyw43_wifi_leave (663) — one-liner cyw43_ioctl(WLC_DISASSOC).

2.3.4 Event-name table (298–313)

A sparse 89-entry table mapping event type → string. Only populated for event types the reference handles; all others print as decimal. The new Zig code must include every event name in the table, not just handled ones — for diagnostics (ISSUES.md #25).

2.4 cyw43_ll.c (2,435 lines — wire protocol)

The heavy reading. Broken down by responsibility below.

2.4.1 Constants and wire layouts (125–296)

Backplane function IDs (cyw43_internal.h):

BUS_FUNCTION       = 0   // SPI/SDIO config (F0)
BACKPLANE_FUNCTION = 1   // AHB/SoC registers (F1)
WLAN_FUNCTION      = 2   // 802.11 data path (F2)

Backplane addresses:

CHIPCOMMON_BASE  = 0x1800_0000
SDIO_BASE        = 0x1800_2000   (reused on SPI for some regs)
WLAN_ARMCM3_BASE = 0x1800_3000   (= ARM CM3 core)
SOCSRAM_BASE     = 0x1800_4000
WRAPPER_OFFSET   = 0x0010_0000
BACKPLANE_ADDR_MASK = 0x7fff
SBSDIO_SB_ACCESS_2_4B_FLAG = 0x8000

SDPCM channels (line 180):

CONTROL_HEADER    = 0     (ioctl)
ASYNCEVENT_HEADER = 1     (events)
DATA_HEADER       = 2     (ethernet)

CDC flags (lines 184–189):

CDCF_IOC_ID_SHIFT = 16
CDCF_IOC_ID_MASK  = 0xffff0000
CDCF_IOC_IF_SHIFT = 12
SDPCM_GET = 0
SDPCM_SET = 2

WLC commands (191–211). Already summarised in §2.2. Note: WLC_GET_VAR = 262 / WLC_SET_VAR = 263 are used for all iovars.

Security constants (288–295):

CYW43_WPA_AUTH_PSK      = 0x0004
CYW43_WPA2_AUTH_PSK     = 0x0080
CYW43_WPA3_AUTH_SAE_PSK = 0x40000
AUTH_TYPE_OPEN = 0
AUTH_TYPE_SAE  = 3
MFP_NONE = 0, MFP_CAPABLE = 1, MFP_REQUIRED = 2
CYW43_WPA_MAX_PASSWORD_LEN     = 64
CYW43_WPA_SAE_MAX_PASSWORD_LEN = 128
CYW_EAPOL_KEY_TIMEOUT          = 5000  (ms)

Scan/IE constants (262–266):

DOT11_CAP_PRIVACY           = 0x0010
DOT11_IE_ID_RSN             = 48       // WPA2
DOT11_IE_ID_VENDOR_SPECIFIC = 221
WPA_OUI_TYPE1               = "\x00\x50\xF2\x01"  // WPA1 marker

2.4.2 SPI command word packing (cyw43_spi.c:58)

pack_cmd(write, inc, fn, addr, sz) =
    (write << 31) | (inc << 30) | (fn << 28) | ((addr & 0x1ffff) << 11) | sz

sz is the byte length of the data phase for writes, and the requested read length for reads. fn is 0/1/2 (2 bits used).

Byte ordering on wire. Before the host switches the CYW43 SPI block to 32-bit mode (pre-SPI_BUS_CONTROL write), the chip expects WORD_LENGTH_16 + ENDIAN_BIG — which produces the pattern {b[1], b[0], b[3], b[2]} when we write a u32 host-little-endian. That’s what cyw43_put_swap32/cyw43_get_swap32 (cyw43_spi.c:40-48) do: a two-u16 byte swap rather than a full endian reverse. After the switch (host writes WORD_LENGTH_32 | ENDIAN_BIG | ...), subsequent accesses use plain cyw43_put_le32 byte order.
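
The command packing and the pre-mode-switch swap can be sketched in C as below. `pack_cmd` follows the formula above; `swap16x2` is a hypothetical name for the two-u16 byte swap, and the range asserts are an addition for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* gSPI command word, per the pack_cmd formula above. */
static inline uint32_t pack_cmd(bool write, bool inc, uint32_t fn,
                                uint32_t addr, uint32_t sz) {
    assert(fn <= 3u);            /* 2-bit function field */
    assert(sz < (1u << 11));     /* size occupies the low 11 bits */
    return ((uint32_t)write << 31) | ((uint32_t)inc << 30) | (fn << 28)
         | ((addr & 0x1ffffu) << 11) | sz;
}

/* Swap the two bytes inside each 16-bit half: b0 b1 b2 b3 -> b1 b0 b3 b2. */
static inline uint32_t swap16x2(uint32_t x) {
    return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}
```

For a 4-byte incrementing read of address 0x14 on function 0, `pack_cmd` gives 0x4000A004, and the swapped form sent before the mode switch is 0x004004A0. Applying `swap16x2` twice returns the original word.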

The endianness matrix (referenced across the plan):

| Field | Direction | Order |
|-------|-----------|-------|
| gSPI command word, pre-32-bit-mode | host → CYW43 | half-word-swapped LE ({b1,b0,b3,b2}) |
| gSPI command word, post-32-bit-mode | host → CYW43 | plain LE |
| Register data (post-mode-switch) | both | plain LE |
| Backplane register byte-addressable writes | host → CYW43 | LE per byte |
| SDPCM header fields (size, size_com, …) | both | LE |
| CDC header fields (cmd, len, flags, status) | both | LE |
| BDC header (4 bytes of flags/priority/flags2/data_offset) | both | individual bytes |
| Async event flags/event_type/status/reason | CYW43 → host | BE (decoder calls be16toh / be32toh) |
| Iovar u32 args (bsscfg:... prefix, etc.) | host → CYW43 | LE |
| Scan result struct fields | CYW43 → host | LE |
| Ethernet frame payload | both | network order (BE for ethertype) |

All on-wire reads/writes in the new driver must pass through the endianness-aware helpers in §3.11; no @ptrCast of a packed struct over a raw []u8 buffer (see §3.5).

2.4.3 Backplane window management (349–393)

cyw43_set_backplane_window(self, addr):
  addr = addr & ~BACKPLANE_ADDR_MASK       // top bits only
  if addr == self->cur_backplane_window: return
  for each of HIGH/MID/LOW bytes whose value differs:
    write that byte of SDIO_BACKPLANE_ADDRESS_{HIGH,MID,LOW}
  self->cur_backplane_window = addr

Critical architectural invariant. The backplane window registers are write-only from SPI (AGENTS.md gotcha #17). The software cache (cur_backplane_window) is the only authoritative record. Every backplane access in the new driver must go through a single Backplane.setWindow(addr) call-site; no ad-hoc writes to HIGH/MID/LOW elsewhere. The plan enforces this as an invariant; ll/boot.zig and the firmware-upload loop must not bypass it.

cyw43_read_backplane / cyw43_write_backplane (366/381) call setWindow, mask off window bits, set the 4-byte-access flag (SBSDIO_SB_ACCESS_2_4B_FLAG = 0x8000) on the remaining address, do the access, then restore the window to CHIPCOMMON_BASE_ADDRESS as a known baseline.
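
A C sketch of the cached-window logic follows. Instead of touching hardware it counts how many of the HIGH/MID/LOW byte registers would be written; the struct, the counter, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BACKPLANE_ADDR_MASK        0x7fffu
#define SBSDIO_SB_ACCESS_2_4B_FLAG 0x8000u

typedef struct {
    uint32_t cur_window;     /* software cache; the only record of the window */
    unsigned bytes_written;  /* window-register byte writes, for illustration */
} backplane_t;

/* Hypothetical single call-site for window changes. Only the bytes that
 * differ from the cached window would be written to HIGH/MID/LOW. */
static void backplane_set_window(backplane_t *bp, uint32_t addr) {
    uint32_t win = addr & ~(uint32_t)BACKPLANE_ADDR_MASK;
    if (win == bp->cur_window) {
        return;
    }
    uint32_t diff = win ^ bp->cur_window;
    if (diff & 0xff000000u) bp->bytes_written++;   /* ..._ADDRESS_HIGH */
    if (diff & 0x00ff0000u) bp->bytes_written++;   /* ..._ADDRESS_MID  */
    if (diff & 0x0000ff00u) bp->bytes_written++;   /* ..._ADDRESS_LOW  */
    bp->cur_window = win;
}

/* Address used in the F1 access once the window is set. */
static uint32_t backplane_local_addr(uint32_t addr) {
    return (addr & BACKPLANE_ADDR_MASK) | SBSDIO_SB_ACCESS_2_4B_FLAG;
}
```

Note that SDIO_BASE (0x1800_2000) and CHIPCOMMON_BASE (0x1800_0000) fall in the same 32 KiB window, so moving between them costs no register writes.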

2.4.4 Bus init + firmware upload (cyw43_ll_bus_init, 1424–1794)

Step-by-step sequence (SPI path only):

  1. Call cyw43_spi_init and cyw43_spi_gpio_setup / _reset (port-provided).
  2. Poll SPI_READ_TEST_REGISTER (addr 0x0014) for value 0xFEEDBEAD, using the pre-mode-switch byte-swapped read (read_reg_u32_swap), up to 10 × 1 ms.
  3. Write SPI_BUS_CONTROL (addr 0x0000) with WORD_LENGTH_32 | ENDIAN_BIG | HIGH_SPEED_MODE | WAKE_UP | (4 << (8*SPI_RESPONSE_DELAY)) | (INTR_WITH_STATUS << (8*SPI_STATUS_ENABLE)) — this is a single 32-bit write that replaces four byte-sized fields atomically. Use write_reg_u32_swap because we’re still in pre-mode-switch byte order. After this write cyw43_spi_set_polarity(self, 0) is called (port-specific PIO polarity reset).
  4. Verify SPI_BUS_CONTROL readback (now uses plain cyw43_read_reg_u32).
  5. Set SPI_RESP_DELAY_F1 = CYW43_BACKPLANE_READ_PAD_LEN_BYTES (16 for SPI on Pico W).
  6. Clear pending SPI_INTERRUPT_REGISTER bits.
  7. Enable a specific set of interrupts: F2_F3_FIFO_RD_UNDERFLOW | F2_F3_FIFO_WR_OVERFLOW | COMMAND_ERROR | DATA_ERROR | F2_PACKET_AVAILABLE | F1_OVERFLOW.
  8. Set ALP: write SDIO_CHIP_CLOCK_CSR with SBSDIO_ALP_AVAIL_REQ (0x08); poll for SBSDIO_ALP_AVAIL (0x40) with 10 × 1 ms. On SPI the ALP-force bits aren’t set (unlike SDIO).
  9. Clear ALP request (write 0 to SDIO_CHIP_CLOCK_CSR).
  10. Core reset sequence:
    • disable_device_core(CORE_WLAN_ARM, halt=false)
    • disable_device_core(CORE_SOCRAM, halt=false)
    • reset_device_core(CORE_SOCRAM, halt=false)
    • Clear SRAM_3 remap: write SOCSRAM_BANKX_INDEX = 3, SOCSRAM_BANKX_PDA = 0.
  11. Firmware sanity check (cyw43_check_valid_chipset_firmware, 395–418). Read last 800 bytes of fw blob, find the 16-byte DVID trailer, find the "Version: " string in the ~500 bytes before it. This is a sanity check, not cryptographic validation.
  12. Firmware upload via cyw43_download_resource — 64-byte chunks to backplane addr 0, with CYW43_WRITE_BYTES_PAD(len) (4-byte align on SPI). AGENTS.md gotcha #15: must use 64-byte chunks; larger silently corrupts. Gotcha #16: payload words must be LE-packed.
  13. NVRAM upload at (CYW43_RAM_SIZE - 4 - wifi_nvram_len) = 0x8_0000 - 4 - len, then write ((~(len/4) & 0xffff) << 16) | (len/4) to (CYW43_RAM_SIZE - 4). This is the "nvram header" the firmware checks at boot.
  14. reset_device_core(CORE_WLAN_ARM, false), device_core_is_up() sanity check.
  15. Poll SDIO_CHIP_CLOCK_CSR for SBSDIO_HT_AVAIL (0x80) — up to 1000 × 1 ms. Firmware-dependent ~29 ms.
  16. Write SDIO_INT_HOST_MASK = I_HMB_SW_MASK (0xf0).
  17. SPI: lower F2 watermark to 32 (SPI_F2_WATERMARK).
  18. Poll SPI_STATUS_REGISTER for STATUS_F2_RX_READY (bit 5) — up to 1000 × 1 ms.
  19. KSO setup (1713–1736). Write SDIO_WAKEUP_CTRL |= SBSDIO_WCTRL_WAKE_TILL_HT_AVAIL, set SDIOD_CCCR_BRCM_CARDCAP = CMD_NODEC, write SDIO_CHIP_CLOCK_CSR = SBSDIO_FORCE_HT (keep HT), set SDIO_SLEEP_CSR |= SBSDIO_SLPCSR_KEEP_SDIO_ON. Then write SDIO_PULL_UP = 0xf to put SPI interface block to sleep.
  20. Clear pad pulls. Write SDIO_PULL_UP = 0, read back.
  21. Clear residual DATA_UNAVAILABLE bit in SPI_INTERRUPT_REGISTER.
  22. cyw43_ll_bus_sleep(false) — the first wake-before-access to transition bus_is_up to true.
  23. CLM upload — call cyw43_clm_load (see §2.4.5).
  24. Iovar writes: bus:txglom = 0, apsta = 1.
  25. If mac provided, cyw43_write_iovar_n("cur_etheraddr", 6, mac, STA).

Returns 0 on success; ~400 ms typical.
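
The NVRAM placement arithmetic from step 13 can be sketched in C as below. The function names are hypothetical, and the sketch assumes the NVRAM length has already been padded to a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>

#define CYW43_RAM_SIZE 0x80000u   /* 512 KiB, per step 13 */

/* Backplane address at which the NVRAM image is written. */
static uint32_t nvram_load_addr(uint32_t len) {
    assert(len % 4u == 0 && len < CYW43_RAM_SIZE - 4u);  /* assumed precondition */
    return CYW43_RAM_SIZE - 4u - len;
}

/* Word written at CYW43_RAM_SIZE - 4: the length in 32-bit words in the
 * low half, and its bitwise complement in the high half. */
static uint32_t nvram_trailer_word(uint32_t len) {
    uint32_t words = len / 4u;
    return ((~words & 0xffffu) << 16) | words;
}
```

For a 0x300-byte NVRAM image the load address is 0x7FCFC and the trailer word is 0xFF3F00C0.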

Startup timing invariant (1886–1891). cyw43_ll_wifi_on enforces cyw43_hal_ticks_us() - self->startup_t0 >= 150000 (150 ms). Reference comments say missing this causes SDIOIT/OOB WL_HOST_WAKE IRQs to misbehave in bus-sleep mode. Preserve this delay; treat it as "early-bringup stability timing not fully characterised" and don’t drop below it.

2.4.5 CLM upload (cyw43_clm_load, 1351–1396)

CLM (Country Locale Matrix) is a ~8–9 KB blob appended to the firmware blob, loaded via the clmload iovar. Upload in 1024-byte chunks (1024+512 on SDIO). Each chunk is preceded by a 20-byte header:

offset  size  field
0       8     "clmload\x00"
8       2     flag (u16 LE)
10      2     type = 2 (u16 LE)
12      4     len  (u32 LE)
16      4     CRC (always 0)
20      N     chunk bytes

Flag bits: DLOAD_HANDLER_VER = 1<<12, DL_BEGIN = 2, DL_END = 4. First chunk sets DL_BEGIN, last chunk sets DL_END, all chunks set DLOAD_HANDLER_VER.

After upload, issue clmload_status as GET_VAR with a 19-byte buffer; first u32 of response should be 0 on success.
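
The per-chunk header can be sketched in C as below, following the 20-byte layout and flag rules above. `clm_fill_header` and the `put_le*` helpers are hypothetical names; sending the header and chunk through the iovar path is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DLOAD_HANDLER_VER (1u << 12)
#define DL_BEGIN          2u
#define DL_END            4u
#define CLM_HDR_LEN       20u

static void put_le16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v) {
    for (int i = 0; i < 4; i++) p[i] = (uint8_t)(v >> (8 * i));
}

/* Fill the header for the chunk starting at `off` of a CLM blob of `total`
 * bytes, sent in pieces of at most `chunk_max`. Returns the chunk length. */
static size_t clm_fill_header(uint8_t hdr[CLM_HDR_LEN], size_t off,
                              size_t total, size_t chunk_max) {
    assert(off < total && chunk_max > 0);
    size_t len = total - off;
    if (len > chunk_max) len = chunk_max;

    uint16_t flag = DLOAD_HANDLER_VER;          /* set on every chunk */
    if (off == 0)           flag |= DL_BEGIN;   /* first chunk */
    if (off + len == total) flag |= DL_END;     /* last chunk */

    memcpy(hdr, "clmload\0", 8);                /* name plus NUL, 8 bytes */
    put_le16(hdr + 8, flag);
    put_le16(hdr + 10, 2);                      /* type */
    put_le32(hdr + 12, (uint32_t)len);
    put_le32(hdr + 16, 0);                      /* CRC, always 0 */
    return len;
}
```

A blob that fits in one chunk gets both DL_BEGIN and DL_END in the same header.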

2.4.6 SDPCM framing

Header layout (struct sdpcm_header_t, 626–636):

offset  size  field
0       2     size        (total length including header)
2       2     size_com    (= ~size & 0xffff; XOR check)
4       1     sequence    (TX seq #)
5       1     channel_and_flags   (channel: low 4 bits)
6       1     next_length (0 typically)
7       1     header_length (= SDPCM_HEADER_LEN + (DATA ? 2 : 0))
8       1     wireless_flow_control
9       1     bus_data_credit
10      2     reserved

Total: 12 bytes.

TX credit protocol (cyw43_sdpcm_send_common, 641–717):

  • Host TX sequence number is wwd_sdpcm_packet_transmit_sequence_number, increments per packet.
  • Device publishes bus_data_credit in the header of every RX packet (byte 9).
  • Stall condition: wlan_flow_control != 0 OR last_bus_data_credit == tx_seq. In other words, credits are one-byte unsigned modular; when they catch up we stall.
  • Stall recovery: enter 1-second busy-wait loop. On SDIO, poke SDIO_TO_SB_MAILBOX with bit 3 every 100 ms to kick the device. On SPI this poke is a no-op — the only way credits come back is via RX packet decoding. Therefore the SPI stall-loop must:
    • repeatedly call cyw43_ll_sdpcm_poll_device to drain RX;
    • for each RX ASYNCEVENT, dispatch cyw43_cb_process_async_event (do not dispatch DATA — reentrancy hazard: sending another ethernet frame in response while in the middle of sending this one would corrupt the TX buffer);
    • timeout after 1 s, return -ETIMEDOUT.
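  • Credit arithmetic is modulo 256 (credit = header->bus_data_credit - last_bus_data_credit; accept if credit <= 20, reject otherwise, which discards stale or out-of-order credit values). Preserve this in Zig with wrapping u8 subtraction (the -% operator) rather than a checked subtraction that would trap on underflow.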
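
The credit bookkeeping from the bullets above can be sketched in C as follows. The struct and function names are hypothetical; the arithmetic relies on `uint8_t` truncation to get the modulo-256 behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t tx_seq;        /* next TX sequence number */
    uint8_t last_credit;   /* last accepted bus_data_credit */
    uint8_t flow_control;  /* wireless_flow_control from the last RX header */
} sdpcm_credit_t;

/* Accept a credit value from an RX header only if it is at most 20 ahead
 * of the last accepted value, computed modulo 256. */
static void credit_update(sdpcm_credit_t *c, uint8_t bus_data_credit) {
    uint8_t delta = (uint8_t)(bus_data_credit - c->last_credit);
    if (delta <= 20) {
        c->last_credit = bus_data_credit;
    }
}

/* TX must wait while flow control is asserted or the credits are used up. */
static bool tx_stalled(const sdpcm_credit_t *c) {
    return c->flow_control != 0 || c->last_credit == c->tx_seq;
}
```

With last_credit = 250, an incoming credit of 4 gives a delta of 10 and is accepted, while a credit of 200 gives a delta of 206 and is ignored.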

Header packing for TX:

size                = SDPCM_HEADER_LEN + payload.len
size_com            = ~size & 0xffff
sequence            = tx_seq++
channel_and_flags   = kind (CONTROL or DATA)
next_length         = 0
header_length       = 12 + (DATA ? 2 : 0)    // the 2 bytes are BDC-align padding for DATA
wireless_flow_control = 0
bus_data_credit     = 0
reserved[2]         = 0

Writes go to WLAN_FUNCTION (F2), addr 0, with CYW43_WRITE_BYTES_PAD(size) (4-byte align for SPI, 64-byte align for SDIO).
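
A C sketch of the TX header packing, plus the matching RX-side size check, is given below. It writes bytes explicitly so no struct is cast over the buffer; `sdpcm_pack` and `sdpcm_size_ok` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SDPCM_HEADER_LEN 12u
enum { CONTROL_HEADER = 0, ASYNCEVENT_HEADER = 1, DATA_HEADER = 2 };

/* Write a 12-byte SDPCM TX header for `payload_len` bytes on `channel`. */
static void sdpcm_pack(uint8_t h[SDPCM_HEADER_LEN], uint8_t channel,
                       uint8_t seq, size_t payload_len) {
    assert(payload_len <= 0xffffu - SDPCM_HEADER_LEN);
    uint16_t size = (uint16_t)(SDPCM_HEADER_LEN + payload_len);
    uint16_t size_com = (uint16_t)~size;

    memset(h, 0, SDPCM_HEADER_LEN);   /* next_length, credits, reserved = 0 */
    h[0] = (uint8_t)size;      h[1] = (uint8_t)(size >> 8);       /* LE */
    h[2] = (uint8_t)size_com;  h[3] = (uint8_t)(size_com >> 8);   /* LE */
    h[4] = seq;
    h[5] = channel;
    h[7] = (uint8_t)(SDPCM_HEADER_LEN + (channel == DATA_HEADER ? 2 : 0));
}

/* RX-side sanity check: size and size_com must be bitwise complements. */
static bool sdpcm_size_ok(const uint8_t *h) {
    uint16_t size = (uint16_t)(h[0] | (h[1] << 8));
    uint16_t size_com = (uint16_t)(h[2] | (h[3] << 8));
    return (uint16_t)(size ^ size_com) == 0xffffu;
}
```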

2.4.7 CDC (IOCTL) header

Header layout (struct ioctl_header_t, 719–724):

offset  size  field
0       4     cmd         (WLC_*, LE)
4       4     len         (lower 16: output len, upper 16: input len) LE
8       4     flags       LE: [31:16] ioc_id, [15:12] interface, [1] SET/GET (SDPCM_SET = 2)
12      4     status      LE, 0 from host; device sets on response

Total: 16 bytes.

Flags packing:

flags = ((ioc_id & 0xffff) << 16) | ((iface & 0xf) << 12) | (SDPCM_GET or SDPCM_SET)

Response matching. wwd_sdpcm_requested_ioctl_id is incremented per send; sdpcm_process_rx_packet extracts id = (ioctl_header->flags & 0xffff0000) >> 16 and matches against the last-sent id, dropping mismatches.
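
The flags packing and the id extraction used for response matching can be sketched in C as below. The constants are those listed under CDC flags; the two function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CDCF_IOC_ID_SHIFT 16
#define CDCF_IOC_ID_MASK  0xffff0000u
#define CDCF_IOC_IF_SHIFT 12
#define SDPCM_GET 0u
#define SDPCM_SET 2u

/* Build the CDC flags word: ioctl id, interface, and SET/GET kind. */
static uint32_t cdc_flags(uint16_t ioc_id, uint32_t iface, uint32_t kind) {
    return ((uint32_t)ioc_id << CDCF_IOC_ID_SHIFT)
         | ((iface & 0xfu) << CDCF_IOC_IF_SHIFT)
         | kind;
}

/* Recover the ioctl id from a response so it can be compared against
 * the id of the last request sent. */
static uint16_t cdc_ioc_id(uint32_t flags) {
    return (uint16_t)((flags & CDCF_IOC_ID_MASK) >> CDCF_IOC_ID_SHIFT);
}
```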

IOCTL dispatch (cyw43_do_ioctl, 1154–1185):

send_ioctl(kind, cmd, payload, iface)
start = now()
while (now() - start < CYW43_IOCTL_TIMEOUT_US /* 500 ms */):
    ret = poll_device()
    if CONTROL matching id:
        copy response back to payload
        return 0
    elif ASYNCEVENT:
        dispatch event (recursion-safe)
    elif DATA:
        dispatch ethernet RX (recursion-safe)
    else: warn
return -ETIMEDOUT

Reentrancy hazard. A dispatched ASYNCEVENT can (indirectly) call back into the driver’s ioctl surface (e.g. PRUNE → pend_rejoin → poll_func → cyw43_ll_wifi_set_wpa_auth → another ioctl). The reference avoids this by queueing pend flags rather than calling directly. The new Zig driver must preserve this queue-not-call discipline — no event handler is allowed to call doIoctl synchronously.

2.4.8 BDC (data-path) header

Header layout (struct sdpcm_bdc_header_t, 788–793):

offset  size  field
0       1     flags       (version in high nibble: 2 = 0x20; AGENTS.md gotcha #21)
1       1     priority
2       1     flags2      (low 4 bits = interface id)
3       1     data_offset (in 4-byte words; additional header bytes beyond BDC)

Total: 4 bytes.

TX ethernet (cyw43_ll_send_ethernet, 795–820):

spid_buf[0..12]     SDPCM header (header_length = 14 due to DATA)
spid_buf[12..14]    2 bytes zero padding
spid_buf[14..18]    BDC header (flags=0x20, priority=0, flags2=itf, data_offset=0)
spid_buf[18..]      ethernet frame (copy from buf or pbuf_copy_partial)

Send via cyw43_sdpcm_send_common(DATA_HEADER, 6 + len, buf) — the 6 is the 2 pad bytes + 4 BDC bytes.

RX ethernet. sdpcm_process_rx_packet case DATA_HEADER (888–909):

bdc_header    = &buf[header->header_length]
itf           = bdc_header->flags2
payload_start = bdc_header + 4 + (bdc_header->data_offset << 2)
payload_len   = header->size - (payload_start - header)
// caller: cyw43_cb_process_ethernet(itf, payload_len, payload_start)

The data_offset is firmware-variable; the decoder must honour it.
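
A bounds-checked version of that RX calculation is sketched below in C. The offsets follow the SDPCM and BDC layouts above; the `eth_rx_t` type, the function name, and the explicit range checks are additions for the example, not the reference code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t itf;           /* taken from BDC flags2, as in the reference */
    const uint8_t *data;   /* start of the ethernet frame */
    size_t len;            /* ethernet frame length */
} eth_rx_t;

/* Locate the ethernet frame inside a DATA-channel SDPCM packet. `buf` holds
 * `buf_len` bytes starting at the SDPCM header. Returns false if any offset
 * would fall outside the packet. */
static bool bdc_locate_payload(const uint8_t *buf, size_t buf_len, eth_rx_t *out) {
    if (buf_len < 12) return false;
    size_t size = (size_t)buf[0] | ((size_t)buf[1] << 8);   /* SDPCM size, LE */
    size_t hdr_len = buf[7];                                /* SDPCM header_length */
    if (size > buf_len || hdr_len + 4 > size) return false;

    const uint8_t *bdc = buf + hdr_len;
    size_t start = hdr_len + 4 + ((size_t)bdc[3] << 2);     /* honour data_offset */
    if (start > size) return false;

    out->itf = bdc[2];
    out->data = buf + start;
    out->len = size - start;
    return true;
}
```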

2.4.9 Async event pipeline

Wire framing. Events ride in ASYNCEVENT channel SDPCM packets. The payload after BDC is an ethernet-format frame with ethertype 0x886C (Broadcom custom). The first 24 bytes of that frame are the 14-byte ethernet header followed by the 10-byte Broadcom vendor header; the event header starts at offset 24:

offset 12..14  ethertype = 0x886C (BE)
offset 19..22  Broadcom OUI = 00:10:18
offset 24..    event_header_t:
  [0]   uint16 be  version
  [2]   uint16 be  flags
  [4]   uint32 be  event_type
  [8]   uint32 be  status
  [12]  uint32 be  reason
  [16]  uint32 be  auth_type
  [20]  uint32 be  datalen
  [24]  uint8[6]   src_addr
  [30]  char[16]   ifname
  [46]  uint8      ifidx (the reference's interface field)
  [47]  uint8      bsscfgidx
  ... (event data follows)

Reference code (592–621, cyw43_ll_parse_async_event) does something subtle:

// buf = &spid_buf[46], alignment only 2 bytes.
// Copy word-by-word 2 half-words into each u32 slot of buf[-2..]
for i in ((len + 3) >> 2) downto 1:
    *d++ = s[0] | s[1] << 16
    s += 2
// After relocation, buf[-2..] is word-aligned.
ev = &buf[-2]
ev->flags      = be16toh(ev->flags)
ev->event_type = be32toh(ev->event_type)
ev->status     = be32toh(ev->status)
ev->reason     = be32toh(ev->reason)

Do not port the relocation trick. The new Zig code must parse bytes directly using LE/BE helpers; no struct-cast over unaligned buffer. This is the correct path on Cortex-M0+ which does hard-fault on unaligned word access (unlike the reference’s apparent tolerance on host-test builds). See §3.5.
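
A byte-wise parse that makes no alignment assumption can be sketched in C as below. The field offsets are the ones from the event header layout above; the `event_hdr_t` type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t flags;
    uint32_t event_type;
    uint32_t status;
    uint32_t reason;
} event_hdr_t;

static uint16_t get_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | p[3];
}

/* `msg` points at the event header (offset 24 of the event frame).
 * Reads are byte-by-byte, so `msg` may sit at any address. */
static bool event_parse(const uint8_t *msg, size_t len, event_hdr_t *out) {
    if (len < 16) return false;
    out->flags      = get_be16(msg + 2);
    out->event_type = get_be32(msg + 4);
    out->status     = get_be32(msg + 8);
    out->reason     = get_be32(msg + 12);
    return true;
}
```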

2.4.10 Scan result IE parsing (cyw43_ll_wifi_parse_scan_result, 538–590)

The escan result wraps a larger cyw43_scan_result_internal_t (503–528) at offset 48 from the event. After the fixed fields, IEs live at offset ie_offset with length ie_length. Walk them:

while (ie_ptr < ie_top):
    ie_type = ie_ptr[0]
    ie_len  = ie_ptr[1]
    if ie_ptr + 2 + ie_len <= ie_top:   // bounds check
        if ie_type == 48 (RSN):
            ie_rsn = ie_ptr
        elif ie_type == 221 (VENDOR):
            if memcmp(ie_ptr+2, "\x00\x50\xF2\x01", 4) == 0:
                ie_wpa = ie_ptr
    ie_ptr += 2 + ie_len
security = 0
if ie_rsn:     security |= 4  // WPA2
if ie_wpa:     security |= 2  // WPA
if capability & 0x10:  security |= 1  // WEP

The bounds check is mandatory. Malformed IE lists (length fields that overrun ie_top) must not crash the decoder. In Zig this is a while with explicit if (ie_ptr + 2 + ie_len > ie_top) break;.
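
The IE walk with its bounds checks can be sketched in C as below. It works on offsets instead of pointers and returns the security bitmask directly; the function name is hypothetical. Unlike the pseudocode above, it also checks that the two header bytes and the 4-byte OUI are inside the buffer before reading them, which is an added safeguard rather than reference behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DOT11_CAP_PRIVACY           0x0010u
#define DOT11_IE_ID_RSN             48
#define DOT11_IE_ID_VENDOR_SPECIFIC 221

/* Returns the 1/2/4 bitmask (WEP/WPA/WPA2) for one scan result. */
static uint8_t scan_security(const uint8_t *ies, size_t ies_len, uint16_t capability) {
    static const uint8_t wpa_oui_type1[4] = { 0x00, 0x50, 0xF2, 0x01 };
    uint8_t security = 0;
    size_t pos = 0;

    while (pos + 2 <= ies_len) {               /* type and length bytes present */
        uint8_t type = ies[pos];
        size_t len = ies[pos + 1];
        if (pos + 2 + len > ies_len) {
            break;                             /* length overruns the buffer */
        }
        if (type == DOT11_IE_ID_RSN) {
            security |= 4;                     /* WPA2 */
        } else if (type == DOT11_IE_ID_VENDOR_SPECIFIC && len >= 4 &&
                   memcmp(&ies[pos + 2], wpa_oui_type1, 4) == 0) {
            security |= 2;                     /* WPA */
        }
        pos += 2 + len;
    }
    if (capability & DOT11_CAP_PRIVACY) {
        security |= 1;                         /* WEP / privacy bit */
    }
    return security;
}
```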

2.4.11 WiFi bring-up (cyw43_ll_wifi_on, 1843–1914)

After bus_init:

  1. Set country (20-byte payload: "country\0" + country & 0xffff + rev + country & 0xffff). Reference uses specific rev override for CYW43_COUNTRY_WORLDWIDE on SDIO.
  2. cyw43_delay_ms(50).
  3. Set WLC_SET_ANTDIV = 0 (chip antenna).
  4. Iovars: bus:txglom = 0, apsta = 1, ampdu_ba_wsize = 8, ampdu_mpdu = 4, ampdu_rx_factor = 0.
  5. Wait until startup_t0 + 150 ms.
  6. Set event mask. 19 bytes of 0xff, then clear specific bits (events 19, 20, 40, 44, 54, 71). Sent via bsscfg:event_msgs iovar with u32 bsscfgidx = 0 prefix. The wire format: "bsscfg:event_msgs\0" + <4-byte LE u32 bsscfgidx> + <19-byte mask>, for a total of 41 bytes. The 18 + 4 + 19 in the C source looks like an off-by-one but it’s correct: the name buffer is declared 18 bytes in spid_buf (including the NUL), 4 bytes for bsscfgidx, 19 for mask.
  7. cyw43_delay_ms(50).
  8. WLC_UP ioctl with no payload.
  9. cyw43_delay_ms(50).
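
The 41-byte event-mask payload from step 6 can be sketched in C as below. The sizes and the cleared event numbers come from the text; the function name is hypothetical, and the bit order within each mask byte (event n at byte n/8, bit n%8) is an assumption about how the reference clears bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EVENT_MSGS_NAME_LEN 18u   /* "bsscfg:event_msgs" plus NUL */
#define EVENT_MASK_LEN      19u
#define EVENT_MSGS_TOTAL    (EVENT_MSGS_NAME_LEN + 4u + EVENT_MASK_LEN)

static_assert(EVENT_MSGS_TOTAL == 41u, "18 + 4 + 19 bytes");

/* Build name + bsscfgidx + mask into buf. Returns the payload length. */
static size_t build_event_msgs(uint8_t buf[EVENT_MSGS_TOTAL]) {
    static const uint8_t masked[] = { 19, 20, 40, 44, 54, 71 };
    uint8_t *mask = buf + EVENT_MSGS_NAME_LEN + 4;

    memcpy(buf, "bsscfg:event_msgs", EVENT_MSGS_NAME_LEN);  /* includes the NUL */
    memset(buf + EVENT_MSGS_NAME_LEN, 0, 4);                /* bsscfgidx = 0, LE */
    memset(mask, 0xff, EVENT_MASK_LEN);                     /* enable everything */
    for (size_t i = 0; i < sizeof masked; i++) {
        /* Assumed bit order: LSB-first within each byte. */
        mask[masked[i] / 8] &= (uint8_t)~(1u << (masked[i] % 8));
    }
    return EVENT_MSGS_TOTAL;
}
```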

2.4.12 Join (cyw43_ll_wifi_join, 2051–2177)

Pre-conditions: wifi already on.

For WPA2-PSK (typical):

ampdu_ba_wsize = 8  (iovar)
WLC_SET_WSEC = auth_type & 0xff            // 4 = AES
"bsscfg:sup_wpa" = {0, 1}                  // bsscfgidx=0, supplicant on
"bsscfg:sup_wpa2_eapver" = {0, -1}         // EAP version: auto
"bsscfg:sup_wpa_tmo" = {0, 5000}           // supplicant timeout 5s
// Set PMK (actually passphrase in PMK format):
WLC_SET_WSEC_PMK with 68-byte buf:
  [0..2]  LE u16: key_len
  [2..4]  LE u16: 1      (WSEC_PASSPHRASE flag)
  [4..68] key bytes (up to 64, CYW43_WPA_MAX_PASSWORD_LEN)
  // 2ms delay before this ioctl — firmware-required
WLC_SET_INFRA = 1
WLC_SET_AUTH  = 0   (open; SAE would be 3 for WPA3)
"mfp" = 1           (MFP_CAPABLE for WPA2/WPA3; MFP_NONE for WPA1/open)
WLC_SET_WPA_AUTH = 0x80   (WPA2_AUTH_PSK)
// Cache ssid for rejoin:
last_ssid_joined[0..4]  = LE u32 ssid_len
last_ssid_joined[4..]   = ssid bytes

if bssid specified:
  use "join" iovar with 70-byte payload including chanspec for channel
else:
  WLC_SET_SSID = 36-byte payload from last_ssid_joined

For WPA3-SAE-PSK: use "sae_password" iovar (130 bytes) instead of WLC_SET_WSEC_PMK. Set WLC_SET_AUTH = 3 (AUTH_TYPE_SAE).

For WPA1 fallback (triggered by PRUNE reason=8): cyw43_ll_wifi_set_wpa_auth just writes WLC_SET_WPA_AUTH = 4 (WPA_AUTH_PSK).
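
The two fixed-size buffers used in the WPA2-PSK join can be sketched in C as below: the 68-byte WLC_SET_WSEC_PMK payload and the 36-byte SSID buffer cached for rejoin. The function names and the error return convention are hypothetical; issuing the ioctls is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WSEC_PMK_BUF_LEN 68u
#define WSEC_MAX_KEY_LEN 64u   /* CYW43_WPA_MAX_PASSWORD_LEN */
#define SSID_BUF_LEN     36u

/* LE u16 key_len, LE u16 flags (1 = passphrase), then up to 64 key bytes. */
static int build_wsec_pmk(uint8_t buf[WSEC_PMK_BUF_LEN],
                          const uint8_t *key, size_t key_len) {
    if (key_len > WSEC_MAX_KEY_LEN) return -1;
    memset(buf, 0, WSEC_PMK_BUF_LEN);
    buf[0] = (uint8_t)key_len;
    buf[1] = (uint8_t)(key_len >> 8);
    buf[2] = 1;
    buf[3] = 0;
    memcpy(buf + 4, key, key_len);
    return 0;
}

/* LE u32 ssid_len followed by up to 32 SSID bytes. */
static int build_ssid_buf(uint8_t buf[SSID_BUF_LEN],
                          const uint8_t *ssid, size_t ssid_len) {
    if (ssid_len > 32) return -1;
    memset(buf, 0, SSID_BUF_LEN);
    buf[0] = (uint8_t)ssid_len;    /* upper three length bytes stay zero */
    memcpy(buf + 4, ssid, ssid_len);
    return 0;
}
```

Because the SSID buffer is self-describing, a rejoin can resend it unchanged without the caller keeping the original SSID around.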

No PMKID caching in the reference. The reference driver does not implement PMKSA caching. Every reconnect goes through the full 4-way handshake, which is also why a stale-PMK scenario (AP power-cycle while firmware still has our old PMKID) causes repeated handshake failures in practice. The new driver mandates PMKSA management as a Phase 3 deliverable (§5.9), with clear_on_boot as the default and cache_in_boot as a stretch goal. See also §9 risk R6.

2.4.13 RX poll device (cyw43_ll_sdpcm_poll_device, SPI path 1006–1122)

if not had_successful_packet:
    if host_interrupt_pin != active: return -1
cyw43_ll_bus_sleep(false)
if not had_successful_packet:
    spi_int = read_u16(SPI_INTERRUPT_REGISTER)
    if spi_int != last_spi_int:
        if spi_int & BUS_OVERFLOW_UNDERFLOW: warn; stat++
    // (optional CYW43_CLEAR_SDIO_INT block)
    if spi_int: write_u16(SPI_INTERRUPT_REGISTER, spi_int)   // clear
    last_spi_int = spi_int
    if not (spi_int & F2_PACKET_AVAILABLE): return -1
// Read bus status, retry up to 1000x on 0xFFFFFFFF (bus not ready)
bus_gspi_status = read_u32(SPI_STATUS_REGISTER) (retry loop)
if bus_gspi_status & GSPI_PACKET_AVAILABLE:
    bytes_pending = (bus_gspi_status >> 9) & 0x7FF
    if invalid (0, oversize, underflow):
        write_u8(SPI_FRAME_CONTROL, 1)   // reset frame state
        had_successful_packet = false
        return -1
else: return -1
read_bytes(WLAN_FUNCTION, 0, bytes_pending, spid_buf)
// First 4 bytes are hdr[0]=size, hdr[1]=size_com with XOR check
check (hdr[0] ^ hdr[1]) == 0xffff   // size and its complement must XOR to all-ones
return sdpcm_process_rx_packet(spid_buf, ...)
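
The two validity checks in the poll path above (extracting bytes_pending from the status word, and the size/complement XOR check on the frame header) are pure functions and easy to test off-target. A minimal sketch, assuming the packet-available flag sits at bit 8 and using a hypothetical MAX_FRAME_LEN bound; neither value is stated in the listing, and the header words are assumed to be little-endian:

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit position of the packet-available flag; the listing only names it. */
#define GSPI_PACKET_AVAILABLE  (1u << 8)
/* ASSUMPTION: hypothetical upper bound for a sane frame in this sketch. */
#define MAX_FRAME_LEN          2040u

/* Returns the pending F2 byte count, or 0 when no valid packet is waiting. */
uint32_t gspi_bytes_pending(uint32_t status) {
    if ((status & GSPI_PACKET_AVAILABLE) == 0) return 0;
    uint32_t n = (status >> 9) & 0x7FFu;          /* 11-bit length field */
    if (n == 0 || n > MAX_FRAME_LEN) return 0;    /* caller resets frame state */
    return n;
}

/* The first two little-endian u16 words must be bitwise complements. */
bool sdpcm_len_ok(const uint8_t *frame) {
    uint16_t size     = (uint16_t)(frame[0] | (frame[1] << 8));
    uint16_t size_com = (uint16_t)(frame[2] | (frame[3] << 8));
    return (uint16_t)(size ^ size_com) == 0xFFFFu;
}
```

Reading the header with byte shifts rather than a pointer cast keeps the check safe regardless of where the frame starts in the receive buffer.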

2.4.14 cyw43_ll_process_packets (1126–1150)

Drains RX until poll_device returns no-packet, dispatching events/ethernet per-packet. This is the main loop call-path invoked by cyw43_poll_func whenever cyw43_ll_has_work() is true.

2.4.15 KSO / bus sleep (1248–1343)

KSO mode is the "keep SDIO on" protocol. Required on SDIO for bus-sleep integration with chip clock gating; on SPI it has weaker applicability but is still used in the reference.

cyw43_kso_set(value):
  write_value = value ? SBSDIO_SLPCSR_KEEP_SDIO_ON : 0
  write_reg_u8(SDIO_SLEEP_CSR, write_value)
  write_reg_u8(SDIO_SLEEP_CSR, write_value)   // yes, twice
  for i = 0..63:
      read_value = read_reg_u8(SDIO_SLEEP_CSR)
      if value:
          if (read_value & (KEEP_ON|DEVICE_ON)) == (KEEP_ON|DEVICE_ON) and read_value != 0xff:
              return
      else:
          if (read_value & KEEP_ON) == 0:
              return
      delay_ms(1)
      write_reg_u8(SDIO_SLEEP_CSR, write_value)
  warn("cyw43_kso_set failed")

cyw43_ll_bus_sleep(can_sleep):
  if can_sleep:
      if not bus_is_up: return
      kso_set(false)
      bus_is_up = false
  else:
      cyw43_cb_ensure_awake()   // integrator hook to reset sleep countdown
      if bus_is_up: return
      kso_set(true)
      bus_is_up = true

For the new Zig driver, the plan is conservative: preserve wake-before-access semantics on every TX ioctl and RX poll, without literally tying it to the KSO register sequence. If a reduced sleep protocol (e.g. just the SPI WAKE_UP bit toggle) works on Pico W, that’s an acceptable implementation of the contract. But the contract is "the device must be awake before F2 access" — and the burden of proof is on "it works without KSO" (observe stability under bus-sleep workloads in §7.3).
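
To compare a reduced sleep protocol against the KSO sequence, it helps to have the retry loop in a form that runs against a mock register. The following is a sketch under stated assumptions: the flag bit values and the kso_ops_t hook struct are hypothetical (the listing above only names the flags), and the mock chip is purely illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: illustrative bit values; the listing only names these flags. */
#define KEEP_ON        0x01u
#define DEVICE_ON      0x02u
#define KSO_MAX_TRIES  64

/* Hypothetical register-access hooks so the loop can run against a mock. */
typedef struct {
    void    *ctx;
    uint8_t (*read_csr)(void *ctx);
    void    (*write_csr)(void *ctx, uint8_t value);
    void    (*delay_ms)(void *ctx, uint32_t ms);
} kso_ops_t;

/* Returns true once the sleep CSR reflects the requested state. */
bool kso_set(const kso_ops_t *ops, bool on) {
    uint8_t wr = on ? KEEP_ON : 0;
    ops->write_csr(ops->ctx, wr);
    ops->write_csr(ops->ctx, wr);          /* written twice, as in the reference */
    for (int i = 0; i < KSO_MAX_TRIES; ++i) {
        uint8_t rd = ops->read_csr(ops->ctx);
        if (on) {
            if ((rd & (KEEP_ON | DEVICE_ON)) == (KEEP_ON | DEVICE_ON) && rd != 0xFF)
                return true;
        } else if ((rd & KEEP_ON) == 0) {
            return true;
        }
        ops->delay_ms(ops->ctx, 1);
        ops->write_csr(ops->ctx, wr);
    }
    return false;                          /* caller logs the failure */
}

/* Mock chip: reports DEVICE_ON only after `reads_until_ready` reads. */
typedef struct { uint8_t csr; int reads_until_ready; int writes; } mock_chip_t;

static uint8_t mock_read(void *ctx) {
    mock_chip_t *m = ctx;
    if (m->reads_until_ready > 0) { m->reads_until_ready--; return m->csr; }
    return (m->csr & KEEP_ON) ? (uint8_t)(m->csr | DEVICE_ON) : m->csr;
}
static void mock_write(void *ctx, uint8_t v) { mock_chip_t *m = ctx; m->csr = v; m->writes++; }
static void mock_delay(void *ctx, uint32_t ms) { (void)ctx; (void)ms; }
```

Unlike the reference, this variant returns a bool instead of only warning, so a host test can assert on the give-up path after 64 attempts.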

2.5 cyw43_spi.c / cyw43_spi.h (290 lines total)

cyw43_spi.c: port-agnostic SPI wrapping (cyw43_read_bytes, cyw43_write_bytes, cyw43_read_reg_u8/16/32, cyw43_write_reg_u8/16/32, read_reg_u32_swap, write_reg_u32_swap). Requires port-provided cyw43_spi_init, _deinit, _gpio_setup, _reset, _set_polarity, _transfer.

cyw43_spi.h: SPI register addresses (0x0000–0x001f for function-0 regs), SPI_STATUS bit layout, SPI interrupt bit layout. Most of this is already mirrored in our src/cyw43/regs.zig — the new tree will consolidate it into src/cyw43_new/bus/regs.zig.

On the byte-swap accessors. read_reg_u32_swap / write_reg_u32_swap (cyw43_spi.c:62, 76) are used exactly twice in the init path: once to read the test register before mode switch, and once to write the mode-switch value itself. After that, all access uses the plain cyw43_read_reg_u32. The new Zig code should name these readReg32Swapped / writeReg32Swapped and restrict them to the boot path. Everywhere else uses plain LE.

2.6 cyw43_config.h (223 lines — tunables)

Central port-integration header. Defines defaults for every tunable; the port overrides via CYW43_CONFIG_FILE / cyw43_configport.h. Key tunables to preserve (by name) in our Zig Config struct (§5.1):

| C macro | Default | Zig Config field | Use |
| --- | --- | --- | --- |
| CYW43_USE_SPI | 0 | hard-coded true (pico W only) | transport selection |
| CYW43_IOCTL_TIMEOUT_US | 500000 | ioctl_timeout_us: u32 = 500_000 | ioctl wait |
| CYW43_SLEEP_MAX | 50 | sleep_max_ticks: u32 = 50 | bus-sleep countdown |
| CYW43_RESOURCE_VERIFY_DOWNLOAD | 0 | verify_firmware: bool = false | debug |
| CYW43_BACKPLANE_READ_PAD_LEN_BYTES | 16 (SPI) | compile-time constant | read pad size |
| CYW43_BUS_MAX_BLOCK_SIZE | 64 (SPI) | compile-time constant | fw upload chunk |
| CYW43_USE_OTP_MAC | 0 | use_otp_mac: bool = false | MAC source |
| CYW43_GPIO | 0 | enable_gpio: bool = true | LED iovar on |

Logging/debug macros (CYW43_DEBUG, CYW43_VDEBUG, CYW43_PRINTF, CYW43_WARN) in the C reference map in the Zig port to functions on the Config.logger interface — see §5.2.

2.7 cyw43_country.h

117 lines, trivial — defines CYW43_COUNTRY(A, B, REV) as a 3-byte packed value and enumerates common country codes. Port verbatim into src/cyw43_new/ctrl/country.zig.
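
The packing can be sketched as below. This is an illustration, assuming the first letter occupies the low byte, the second letter the next byte, and the revision the third; the macro and helper names are hypothetical and should be checked against the header before the port relies on them.

```c
#include <stdint.h>

/* ASSUMED layout: A in bits 0-7, B in bits 8-15, REV in bits 16-23. */
#define COUNTRY_CODE(A, B, REV) \
    ((uint32_t)(uint8_t)(A) | ((uint32_t)(uint8_t)(B) << 8) | ((uint32_t)(REV) << 16))

/* Unpack helpers, e.g. for logging which country the driver was started with. */
static inline char    country_first(uint32_t c)  { return (char)(c & 0xFFu); }
static inline char    country_second(uint32_t c) { return (char)((c >> 8) & 0xFFu); }
static inline uint8_t country_rev(uint32_t c)    { return (uint8_t)((c >> 16) & 0xFFu); }
```

For example, under this layout COUNTRY_CODE('U', 'S', 0) evaluates to 0x5355, since 'U' is 0x55 and 'S' is 0x53.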

2.8 Summary of reliability/recovery gaps in our current Zig driver

The audit surfaces the following behaviors present in the reference (or in Embassy/soypat/brcmfmac for events the C reference doesn't handle) but absent or defective in src/cyw43/:

| # | Gap | Source | Severity |
| --- | --- | --- | --- |
| G1 | Join state machine (three-flag model per §6.1; originates in bit-flag form at cyw43_ctrl.c:56-65, 383-438) | pico-sdk C | High |
| G2 | pend_disassoc / pend_rejoin / pend_rejoin_wpa deferred actions | cyw43_ctrl.c:99, 218-254, 369-431 | High |
| G3 | last_ssid_joined cache + rejoin iovar | cyw43_ll.c:2140-2187 | High |
| G4 | Exhaustive event decoding (PSK_SUP sub-states, PRUNE, ICV_ERROR, etc.) | cyw43_ctrl.c:340-432 | High (diagnostic for ISSUES.md #25) |
| G5 | Event-name table for all 89 slots | cyw43_ctrl.c:298-313 | Low |
| G6 | SDPCM TX credit stall drain loop | cyw43_ll.c:651-696 | Medium |
| G7 | IOCTL timeout loop servicing async events + data | cyw43_ll.c:1154-1185 | Medium |
| G8 | Alignment-safe async event parsing | cyw43_ll.c:592-621 | High (ARM hard-fault risk) |
| G9 | IE-walker in scan results (RSN/WPA/WEP) | cyw43_ll.c:538-590 | Medium |
| G10 | 150 ms startup bring-up timing gate | cyw43_ll.c:1886-1891 | Medium |
| G11 | Firmware "Version:" trailer sanity check | cyw43_ll.c:395-418 | Low |
| G12 | Single backplane-window cache owner | cyw43_ll.c:349-364 | Medium (architectural) |
| G13 | Broad event mask (all 19 bytes 0xff minus 6 explicit clears) | cyw43_ll.c:1893-1904 | High (directly unblocks ISSUES.md #25 diagnosis) |
| G14 | Multicast filter management | cyw43_ll.c:1929-1977 | Low |
| G15 | Scan callback ownership (env pointer + cb) | cyw43_ctrl.c:340-351 | Medium |
| G16 | AUTH bad→good recovery (status=0 clears BADAUTH) | cyw43_ctrl.c:383-395 | Medium |
| G17 | PMKSA clear_on_boot (mandatory §5.9; not in any reference driver) | Plan §5.9, docs/CYW43-PMKSA-RESEARCH.md | High |
| G18 | ICV_ERROR (49) → pend_rejoin | cyw43-driver PR #130 (Jan 2025) | High (pico-sdk #2153) |
| G19 | MIC_ERROR (17) → pend_rejoin | Embassy events.rs:52 + brcmfmac knowledge | Medium (crypto-drift sibling) |
| G20 | UNICAST_DECODE_ERROR (50) → pend_rejoin | Embassy events.rs:110 | Medium (crypto-drift sibling) |
| G21 | MULTICAST_DECODE_ERROR (51) → threshold rejoin | Embassy events.rs:112 | Medium (crypto-drift sibling, needs §6.1.1 threshold) |
| G22 | PSM_WATCHDOG (41) → log+stats+rejoin | Embassy events.rs:94, all drivers ignore | Medium (firmware distress signal) |
| G23 | GTK_PLUMBED (84) as health heartbeat | Embassy events.rs:173 | Low (positive signal tracking) |
| G24 | BCNLOST_MSG (31) → degraded link | Embassy events.rs:80 | Medium (§6.2 trigger) |
| G25 | PMKID_CACHE (21) event for cache_in_boot sync | Embassy events.rs:60 | Medium (§5.9.5 — cleaner cache_in_boot design) |
| G26 | PSK_SUP reason=14 IGNORE (CCX_FAST_ROAM noise) | Embassy runner.rs:1216, soypat identical | High (real bug — without this we trigger spurious rejoin during roams) |
| G27 | WPA3 AUTH FAIL reason=16 auth_type=3 detection | Embassy runner.rs:1183, soypat identical | Medium (§5.10) |
| G28 | WPA3-SAE join flow (sae_password iovar, AUTH_TYPE_SAE) | pico-sdk C cyw43_ll.c:2065-2111, Embassy PR #3323 | Medium (§5.10 deliverable) |

Legend for source column: rows G1–G16 originate in the pico-sdk C reference and are the core port. Rows G17–G28 are additions sourced from cross-referencing Embassy Rust + soypat Go + Linux brcmfmac (details in §10.4.1). G17 and G18 are the two most impactful reliability fixes; G26 is a real bug in our earlier draft caught by reading Embassy.


Section 3 — Impedance-mismatch catalog

Each category: C idiom → Zig 0.16 idiom. Every concrete example is a pattern to replicate across the rewrite. References to ZIG-0.16.0-REFERENCE.md are by section heading, not page.

3.1 Pointer arithmetic → slices

C (cyw43_ll.c:1353):

uint8_t *buf = &self->spid_buf[SDPCM_HEADER_LEN + 16];
memcpy(buf, "clmload\x00", 8);

Zig:

const buf = self.spid_buf[sdpcm_header_len + cdc_header_len ..];
@memcpy(buf[0..8], "clmload\x00");

Rule: never carry a [*]u8 across function boundaries. If you need a subrange of a buffer, pass a []u8 slice; the compiler preserves length; indexing is bounds-checked in Debug. For multi-hop functions where the inner function needs to know the offset origin, pass (buf: []u8, origin: usize) and let the inner compute buf[origin..].

3.2 Tagged unions

C (cyw43_ll.h:230-242):

typedef struct _cyw43_async_event_t {
    uint16_t _0;  uint16_t flags;  uint32_t event_type;
    uint32_t status; uint32_t reason;  uint8_t _1[30];
    uint8_t interface; uint8_t _2;
    union { cyw43_ev_scan_result_t scan_result; } u;
} cyw43_async_event_t;

Zig: use a tagged union for the decoded-into-host-memory form:

pub const Event = union(enum) {   // host-side tag; the decoder maps wire-level EventKind (§3.4) values onto these variants
    set_ssid: SetSsidEv,
    auth: AuthEv,
    link: LinkEv,
    psk_sup: PskSupEv,
    escan_result: ScanResult,
    prune: PruneEv,
    deauth_ind: DeauthIndEv,
    disassoc: DisassocEv,
    icv_error: void,
    unknown: UnknownEv,   // carries raw type + status + reason + payload len
};

The decoder converts the on-wire byte buffer into Event by length-checked parsing; no struct cast over raw bytes.

3.3 Function pointers → interface structs

C:

int (*wifi_scan_cb)(void *env, const cyw43_ev_scan_result_t *);

Zig (preferred pattern):

pub const ScanSink = struct {
    ctx: *anyopaque,
    onResult: *const fn (ctx: *anyopaque, result: *const ScanResult) void,
};

The ctx field is a type-erased pointer to the "environment" that owns the callback. Callees recover the concrete type with const typed_ctx: *MyStuff = @ptrCast(@alignCast(ctx));.

For simple single-method hooks (e.g. Config.logger):

pub const Logger = struct {
    ctx: *anyopaque,
    puts: *const fn (ctx: *anyopaque, msg: []const u8) void,
};

Avoid *const fn (...) anyerror!void in hot paths — error union type inference across boundaries is a compilation-time cost; prefer explicit error sets (WifiError).

3.4 enum with explicit wire values

C:

#define CYW43_EV_ESCAN_RESULT (69)

Zig:

pub const EventKind = enum(u16) {
    set_ssid = 0,
    join = 1,
    auth = 3,
    deauth = 5,
    deauth_ind = 6,
    assoc = 7,
    disassoc = 11,
    disassoc_ind = 12,
    link = 16,
    prune = 23,
    psk_sup = 46,
    icv_error = 49,
    escan_result = 69,
    csa_complete_ind = 80,
    assoc_req_ie = 87,
    assoc_resp_ie = 88,
    _, // non-exhaustive — other values are decoded as Event.unknown
};

The _ trailing discriminant makes the enum non-exhaustive — indispensable for event types where firmware may emit values we don’t name.
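
For comparison, the same "named values plus a catch-all" behaviour on the C side is usually a lookup table with a fallback. A minimal sketch using the wire values from the enum above; the event_name and event_is_known helpers are hypothetical names for illustration, not functions from the reference driver:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t id; const char *name; } event_name_t;

/* Wire values copied from the EventKind enum above; deliberately partial. */
static const event_name_t k_event_names[] = {
    {0, "set_ssid"}, {1, "join"}, {3, "auth"}, {5, "deauth"}, {6, "deauth_ind"},
    {7, "assoc"}, {11, "disassoc"}, {12, "disassoc_ind"}, {16, "link"},
    {23, "prune"}, {46, "psk_sup"}, {49, "icv_error"}, {69, "escan_result"},
    {80, "csa_complete_ind"}, {87, "assoc_req_ie"}, {88, "assoc_resp_ie"},
};

/* True when the wire value has a name in the table. */
bool event_is_known(uint32_t id) {
    for (size_t i = 0; i < sizeof k_event_names / sizeof k_event_names[0]; ++i) {
        if (k_event_names[i].id == id) return true;
    }
    return false;
}

/* Returns the name for a wire value, or "unknown" for anything unnamed. */
const char *event_name(uint32_t id) {
    for (size_t i = 0; i < sizeof k_event_names / sizeof k_event_names[0]; ++i) {
        if (k_event_names[i].id == id) return k_event_names[i].name;
    }
    return "unknown";
}
```

The fallback string plays the role of the non-exhaustive `_` case: an unnamed firmware event is still logged rather than dropped or treated as an error.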

3.5 Bit-packed registers / wire structs

DO NOT use packed struct over raw SPI RX buffers. Two reasons:

  1. Alignment. ARM Cortex-M0+ hard-faults on unaligned u16/u32 loads. The CYW43 RX buffer frequently delivers event payloads at offsets that are not 4-byte aligned (e.g. spid_buf + 46). Casting the byte pointer to *const EventHeader via @ptrCast(@alignCast(...)) and then reading a field is a potential fault.
  2. Endianness. Many fields are big-endian on the wire (event headers). packed struct in Zig assumes host endianness; you’d need separate Be mirrors and manual conversion, which negates the benefit.

Use byte readers instead:

pub fn readEventHeader(buf: []const u8) EventHeader {
    return .{
        .flags      = std.mem.readInt(u16, buf[2..4], .big),
        .event_type = std.mem.readInt(u32, buf[4..8], .big),
        .status     = std.mem.readInt(u32, buf[8..12], .big),
        .reason     = std.mem.readInt(u32, buf[12..16], .big),
    };
}

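std.mem.readInt(T, buf[n..m], endian), with endian either .big or .little, is the Zig 0.16 idiom. Slicing with comptime-known bounds yields the pointer-to-fixed-size-array that readInt expects, so the length is checked at compile time and the generated code is efficient with no unaligned access.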

packed struct is acceptable for host-side register composition (e.g. SPI command word) where the value starts as a u32 and the bit layout helps readability. But even then prefer bit-shift composition — it’s clearer and is exactly what the reference uses.
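
The byte-reader approach is language-independent, which is useful when cross-checking the Zig decoder against a host-side C test. A minimal C sketch of the same decode, using the offsets from readEventHeader above; the struct and function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t flags;
    uint32_t event_type;
    uint32_t status;
    uint32_t reason;
} event_header_t;

/* Big-endian reads assembled from single bytes: valid at any alignment. */
static uint16_t rd_be16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 and fills *out, or -1 if fewer than the 16 bytes read are available. */
int read_event_header(const uint8_t *buf, size_t len, event_header_t *out) {
    if (len < 16) return -1;
    out->flags      = rd_be16(buf + 2);
    out->event_type = rd_be32(buf + 4);
    out->status     = rd_be32(buf + 8);
    out->reason     = rd_be32(buf + 12);
    return 0;
}
```

Because every load is a single byte, the function behaves identically whether the header starts at an aligned address or not, which is the property G8 in §2.8 asks for.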

3.6 void * opaque state → generic parameter or concrete driver struct

C:

void *cb_data;
int (*cyw43_cb_process_async_event)(void *cb_data, const cyw43_async_event_t *);

Zig:

pub const Driver = struct {
    allocator: std.mem.Allocator,
    spi: Spi,
    hooks: HostHooks,
    log: Logger,
    state: CoreState,
    // ...
};

A single Driver struct replaces the opaque cyw43_t; callbacks become methods on that struct.

3.7 printf-style logging → project logger

Freestanding build forbids std.debug.print. The driver gets a Logger interface in Config (§5.2). Minimum operations: puts([]const u8), putHex32(u32), putDec(u32), putBytes([]const u8). The log call-sites in src/cyw43/ already use this pattern via fmt.puts etc.; preserve the convention.

No formatted output (std.fmt.bufPrint) in the driver — call the primitives directly. Format strings drag in the full formatter and explode code size.

3.8 Global mutable state → driver instance owned by the binding site

C (cyw43_ctrl.c:70-72):

cyw43_t cyw43_state;
void (*cyw43_poll)(void);
uint32_t cyw43_sleep;

Zig: the driver is an instance type. For the pico integration which has exactly one CYW43 chip, the binding site (src/bindings/wifi.zig) owns a single var driver: Driver = undefined; and passes &driver into every call. The driver itself does not declare a module-scope singleton.

Rationale: easier testing, cleaner dependency graph, no implicit ordering between the driver and the integrator’s init.

3.9 #define-heavy configuration → comptime config struct

C: cyw43_config.h is 223 lines of #ifndef X #define X default #endif.

Zig:

pub const Config = struct {
    ioctl_timeout_us: u32 = 500_000,
    sleep_max_ticks: u32 = 50,
    verify_firmware: bool = false,
    country: Country = .worldwide,
    default_pm: u32 = PmValue.performance,
    use_otp_mac: bool = false,
    enable_gpio: bool = true,
    log: Logger,
    hooks: HostHooks,
    // transports are injected via `init(allocator, transport: Spi, config: Config)`
};

Behavior toggles (e.g. whether to run verification code) are plain fields checked via if (config.verify_firmware) { ... } inside the driver functions. Because config is stored in the driver struct, that check is a runtime branch on a bool field, not a compile-time decision. For zero-cost dispatch the field can be promoted to a comptime parameter of a generic wrapper, but that’s an optimisation, not a requirement.

3.10 Volatile access + memory barriers

SPI register reads return device-observable state; the PIO peripheral MMIO backing them is already volatile in src/platform/hal.zig. At the driver layer we do not need @as(*volatile T, ...) — that’s a transport-layer concern. The PIO SPI transport in src/cyw43_new/transport/pio_spi.zig is responsible for volatile access; the driver-layer code treats the returned value as a plain u32.

No memory barriers required on M0+ single-core; existing HAL regWrite is sufficient.

3.11 Error codes as int → Zig error sets

C:

#define CYW43_EIO       5
#define CYW43_EINVAL   22
#define CYW43_EPERM     1
#define CYW43_ETIMEDOUT 110
int ret = cyw43_ll_ioctl(...);
if (ret < 0) return ret;

Zig:

pub const WifiError = error{
    // I/O / bus
    SpiTestFailed,
    BusInitTimeout,
    BackplaneUnreachable,
    HtClockTimeout,
    F2NotReady,
    FirmwareSanityFailed,
    ClmLoadFailed,

    // IOCTL
    IoctlTimeout,
    IoctlResponseMismatch,
    IoctlKindInvalid,

    // SDPCM
    SdpcmCreditStall,
    SdpcmFrameTooLarge,
    SdpcmFrameMalformed,

    // WiFi control
    NotInitialized,
    NotStaActive,
    InvalidSsidLen,
    InvalidKeyLen,
    UnsupportedAuthType,
    PmksaNotCleared,       // join() called before PMKSA clear completed (§5.9)
    PmksaIovarUnsupported, // blob vintage does not support the iovar; see §5.9.4

    // Join
    JoinTimeout,
    JoinBadAuth,
    JoinNoNetwork,
    JoinFail,

    // Scan
    ScanInProgress,
    ScanNotStarted,
    ScanResultTruncated,

    // Ethernet
    EthFrameTooLarge,
};

Error union return: fn join(self: *Driver, cfg: JoinConfig) WifiError!void. Call-sites use try.

3.12 Byte-order conversions

std.mem.readInt(T, slice, endian) and std.mem.writeInt(T, slice, value, endian), with endian either .big or .little, are the 0.16 idioms. For the occasional inline byte swap within each 16-bit half (pre-mode-switch gSPI command words), write a small helper fn writeSwapped32(dst: *[4]u8, val: u32) void that produces {b1, b0, b3, b2}, where b0..b3 are the little-endian bytes of val.
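
The swapped layout is easy to get wrong by one position, so a reference vector helps. The following C sketch shows the {b1, b0, b3, b2} ordering and its inverse; the function names mirror the proposed Zig helpers but are otherwise hypothetical:

```c
#include <stdint.h>

/* Write val so that, relative to its little-endian bytes b0..b3,
 * the output order is {b1, b0, b3, b2}. */
void write_swapped32(uint8_t dst[4], uint32_t val) {
    uint8_t b0 = (uint8_t)(val & 0xFFu);
    uint8_t b1 = (uint8_t)((val >> 8) & 0xFFu);
    uint8_t b2 = (uint8_t)((val >> 16) & 0xFFu);
    uint8_t b3 = (uint8_t)((val >> 24) & 0xFFu);
    dst[0] = b1; dst[1] = b0; dst[2] = b3; dst[3] = b2;
}

/* Inverse of write_swapped32: rebuild the value from {b1, b0, b3, b2}. */
uint32_t read_swapped32(const uint8_t src[4]) {
    return  (uint32_t)src[1]        |
           ((uint32_t)src[0] << 8)  |
           ((uint32_t)src[3] << 16) |
           ((uint32_t)src[2] << 24);
}
```

For the value 0x04030201 the little-endian bytes are 01 02 03 04, so the swapped output is 02 01 04 03; the same vector can be reused as a test case for the Zig helper.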


Section 4 — Proposed Zig module architecture

Parallel tree at src/cyw43_new/. Existing src/cyw43/ is frozen during the rewrite (§8.1) — no changes except the minimum needed to keep the -Dcyw43=old build green.

Total estimated size: ~4,050 Zig lines across 24 files (the sum of the per-file estimates in the tree below).

src/cyw43_new/
├── cyw43.zig                 (~120 LOC — public API re-export; compat façade)
├── types.zig                 (~180 LOC — public enums, Event, ScanResult, LinkState)
├── config.zig                (~120 LOC — Config struct, Logger, HostHooks interfaces)
├── errors.zig                (~60  LOC — WifiError error set, Result helpers)
├── firmware.zig              (~50  LOC — @embedFile split; blob split helpers)
├── UPSTREAM.md                        — C-driver SHA tracking + M1..Mn local mod log
├── LICENSE-REFERENCE.md               — LICENSE.RP copy + attribution notes
│
├── transport/
│   ├── spi.zig               (~80  LOC — Spi vtable interface)
│   └── pio_spi.zig           (~420 LOC — RP2040 PIO implementation; replaces src/cyw43/transport/pio_spi.zig)
│
├── bus/
│   ├── regs.zig              (~220 LOC — SPI/backplane/SDPCM/CDC/event register & field defs)
│   ├── cmd.zig               (~40  LOC — gSPI command word packing + byte-swap helpers)
│   ├── bus.zig               (~170 LOC — u8/u16/u32 reg access, readBytes/writeBytes with pad handling)
│   └── backplane.zig         (~120 LOC — window cache, bpRead/bpWrite, bpReadBlock/bpWriteBlock)
│
├── ll/
│   ├── frame.zig             (~260 LOC — SDPCM + CDC + BDC headers pack/parse; credit arithmetic)
│   ├── ioctl.zig             (~200 LOC — doIoctl, setIovar/getIovar, iovar_u32, bsscfg helpers)
│   ├── events.zig            (~320 LOC — exhaustive event decoder; Event tagged union; name table)
│   ├── scan.zig              (~200 LOC — escan request + IE walker + result dispatch)
│   ├── boot.zig              (~420 LOC — cyw43_ll_bus_init equivalent: SPI setup, fw/nvram upload, core reset, ALP/HT wait)
│   ├── clm.zig               (~80  LOC — CLM chunked upload)
│   └── power.zig             (~140 LOC — bus sleep/wake; KSO implementation hidden behind "wake-before-access" contract)
│
├── ctrl/
│   ├── state.zig             (~100 LOC — Driver struct, core mutable state, pend flags)
│   ├── poll.zig              (~140 LOC — poll loop: drain RX, process pend_disassoc/rejoin_wpa/rejoin)
│   ├── join.zig              (~280 LOC — wifi_join + state machine + last_ssid cache + timeout)
│   ├── link.zig              (~100 LOC — link up/down transitions, integrator callback dispatch)
│   └── country.zig           (~130 LOC — country code table, direct port of cyw43_country.h)
│
└── hal.zig                   (~100 LOC — HostHooks interface: readIrqPin, ensureAwake, delayMs, ticksUs)

4.1 Per-file responsibility notes

  • cyw43.zig — pub const Driver, pub const init, pub const deinit. Re-exports common types. Defines the compat façade: pub fn joinWpa2Compat(...) wrapping the new instance API to match the old module-level API shape. Kept intentionally thin.
  • types.zig — public-facing. Event, EventKind, ScanResult, LinkState, Country, AuthType, PmValue, ItfId. Internal wire structs do not live here — they live next to their encoders/decoders (e.g. SDPCM header in ll/frame.zig).
  • config.zig — Config, Logger, HostHooks, Transport. The public "how to wire up the driver" surface.
  • errors.zig — WifiError. Isolated so other modules can @import("errors.zig").WifiError.
  • firmware.zig — @embedFiles of 43439A0_combined.bin and 43439A0_nvram.bin. Splits the combined blob into firmware-prefix + CLM-suffix with a known-at-compile-time offset.
  • transport/spi.zig — interface: transferRx, transferTx, setPolarity, reset. Per §3.3 uses the ctx: *anyopaque + vtable pattern.
  • transport/pio_spi.zig — the one concrete implementation we ship. Directly ports the existing src/cyw43/transport/pio_spi.zig with cleanups (that file is 406 LOC and sound; ~80% line-for-line preservation).
  • bus/regs.zig — all on-wire constants. Replaces src/cyw43/regs.zig and expands coverage (event-type enum, CDC flag shifts, SDPCM constants).
  • bus/cmd.zig — packCmd(write, incr, fn, addr, sz) u32, writeSwapped32, readSwapped32. Isolated so the swap pattern has a single owner.
  • bus/bus.zig — readRegU8/16/32, writeRegU8/16/32, readBytes(fn, addr, buf), writeBytes(fn, addr, src). Handles backplane read-pad alignment automatically.
  • bus/backplane.zig — Backplane struct with cur_window: u32 cache. Methods: setWindow, read32, write32, readBlock, writeBlock. Single owner of window state per §2.4.3 invariant.
  • ll/frame.zig — SdpcmHeader (in-memory aligned struct), CdcHeader, BdcHeader, pack_sdpcm, parse_sdpcm, parse_cdc_response, pack_bdc_tx, parse_bdc_rx, credit-arithmetic helpers.
  • ll/ioctl.zig — doIoctl(driver, kind, cmd, iface, payload) WifiError!usize, plus convenience wrappers: setIoctlU32, setIovar, setIovarU32, setBsscfgIovarU32, readIovarU32. Implements the 500 ms timeout + RX drain loop.
  • ll/events.zig — decodeEvent(bytes) Event. Includes the full event-name table (all 89 slots in the reference + any additional observed names; unknowns return Event.unknown{ raw_type, status, reason, len }). Event is the tagged union in §3.2.
  • ll/scan.zig — startScan(driver, opts) WifiError!void, parseScanResult(bytes) ScanResult (bounds-checked IE walker).
  • ll/boot.zig — busInit(driver, mac) WifiError!void. The 400+ ms bring-up sequence from §2.4.4. Uses ll/clm.zig for the CLM upload sub-step.
  • ll/clm.zig — clmLoad(driver, clm_blob) WifiError!void.
  • ll/power.zig — ensureAwake(driver), allowSleep(driver). Implements the KSO sequence internally but the public contract is "wake before access, allow sleep when idle". Gives us freedom to drop KSO if SPI-only behavior proves sufficient (§2.4.15).
  • ctrl/state.zig — Driver struct. Fields: spi, bus, backplane, config, log, hooks, join_state: JoinState, link_state: LinkState, scan_state: ScanState, mac: [6]u8, last_ssid_joined: [36]u8, pend_disassoc/pend_rejoin/pend_rejoin_wpa: bool, sleep_countdown: u32, sdpcm_tx_seq/last_credit: u8, ioctl_id: u16, spid_buf: [2048]u8, itf_state: u8, trace_flags: u32, initted: bool.
  • ctrl/poll.zig — pollOnce(driver). The port of cyw43_poll_func. Drives pend-action processing and RX drain.
  • ctrl/join.zig — joinWpa2(driver, JoinConfig), rejoin(driver), plus the state-machine transitions driven by events. The events module dispatches into functions here; callers never interact with raw states.
  • ctrl/link.zig — thin module. Derived link-status projection + callback emission only. No independent association policy lives here; all state transitions are owned by ctrl/join.zig + ctrl/state.zig. link.zig translates the current join_state + itf_state + IP-readiness flag into the LinkState enum consumed by HostHooks.onLinkUp / onLinkDown. If a future contributor is tempted to put an "if link degraded and no event in N seconds, do X" policy here, that policy belongs in ctrl/join.zig instead.
  • ctrl/country.zig — country code table. Port of cyw43_country.h as a Zig Country enum(u24) with a helper toWire(self) u32.
  • hal.zig — the HostHooks interface: readIrqPin(), ensureAwake(), delayMs(n), ticksUs(), ticksMs().

4.2 Comparison against current src/cyw43/ structure

| Current path | Target path | Disposition |
| --- | --- | --- |
| cyw43.zig (69 LOC) | cyw43.zig + types.zig | Rewritten. Public API expands. |
| device.zig (375 LOC) | Split across ll/* and ctrl/state.zig | Deleted. Responsibilities cleanly factored. |
| regs.zig (182 LOC) | bus/regs.zig (expanded to ~220 LOC) | Ported and extended. |
| types.zig (23 LOC) | types.zig + errors.zig | Ported, expanded, split. |
| board.zig (65 LOC) | Keep in pico integration layer. Not a driver concern. | Moves to src/bindings/wifi.zig or src/platform/boards.zig. |
| transport/bus.zig (162 LOC) | bus/bus.zig | Ported with cleanups. |
| transport/pio_spi.zig (406 LOC) | transport/pio_spi.zig | Ported ~80% as-is. |
| control/boot.zig (307 LOC) | ll/boot.zig | Rewritten to match C reference sequence exactly. |
| control/ioctl.zig (219 LOC) | ll/frame.zig + ll/ioctl.zig | Split. Credit arithmetic and frame shape cleanly separated. |
| control/join.zig (72 LOC) | ctrl/join.zig (~280 LOC) | Expanded ~4×. Adds state machine + retry + last_ssid cache. |
| control/scan.zig (138 LOC) | ll/scan.zig | Ported + IE walker added. |
| control/gpio.zig (24 LOC) | Merged into ctrl/state.zig or a small ctrl/gpio.zig | Tiny; inline or keep. |
| protocol/events.zig (70 LOC) | ll/events.zig (~320 LOC) | Rewritten. Exhaustive decoding; _ non-exhaustive enum. |
| netif/ethernet.zig (68 LOC) | ll/frame.zig + ctrl/link.zig | TX ethernet belongs next to frame framing; RX dispatch in ctrl. |
| netif/service.zig (24 LOC) | ctrl/poll.zig | Ported into the poll loop. |
| firmware/*.bin | firmware/*.bin | Preserved unchanged. Blobs are not modified. |

4.3 Dependency graph (informal)

types.zig ──────────────────────────────────────────────────┐
errors.zig ─────────────────────────────────────────────────┤
config.zig ─────────────────────────────────────────────────┤
hal.zig  ───────────────────────────────────────────────────┤
firmware.zig ───────────────────────────────────────────────┤
                                                            │
transport/spi.zig ──→ transport/pio_spi.zig ────────────────┤
                                                            │
bus/cmd.zig ────────→ bus/bus.zig ──→ bus/backplane.zig ────┤
bus/regs.zig  ─────────────────────────────────────────────────→ referenced by all ll/
                                                                                  │
                        ll/frame.zig ──→ ll/ioctl.zig  ──→ ll/events.zig          │
                        ll/power.zig                       ll/scan.zig  ←─────────┘
                        ll/clm.zig ────→ ll/boot.zig
                                                   │
                                                   ▼
                                    ctrl/state.zig ──→ ctrl/join.zig
                                                   └→ ctrl/link.zig
                                                   └→ ctrl/poll.zig
                                                   └→ ctrl/country.zig
                                                                                  │
                                                                                  ▼
                                                                          cyw43.zig  (public)

No cycles. bus/* does not depend on ll/*; ll/* does not depend on ctrl/* (events dispatch up via callback). This is the dependency-inversion the current tree gets mostly right; the rewrite tightens it.


Section 5 — Public API design

The public surface exposed to src/bindings/wifi.zig, src/net/*.zig, and any other pico code. Designed from first principles; §5.10 gives the compat façade preserved for migration.

5.1 Config struct

pub const Config = struct {
    country: Country = .worldwide,
    default_pm: u32 = PmValue.performance,
    ioctl_timeout_us: u32 = 500_000,
    sleep_max_ticks: u32 = 50,
    verify_firmware: bool = false,   // verify blob upload after write; adds ~40 ms to boot
    use_otp_mac: bool = false,       // if true, use MAC from OTP instead of caller-supplied
    enable_gpio: bool = true,        // enable CYW43 GPIO iovar (required for LED on Pico W)
    wpa3_mode: Wpa3Mode = .auto,     // WPA3-SAE policy; see §5.10 for fallback semantics
    auto_reconnect: bool = true,     // use join state machine's pend_rejoin on transient failures
    pmksa_policy: PmksaPolicy = .clear_on_boot, // mandatory Phase 3 deliverable — see §5.9
    log: Logger,
    hooks: HostHooks,
    transport: *anyopaque,           // Spi vtable pointer
    transport_vt: *const Spi.VTable,
};

5.2 Logger interface

pub const Logger = struct {
    ctx: *anyopaque,
    vt: *const VTable,
    pub const VTable = struct {
        puts: *const fn (ctx: *anyopaque, msg: []const u8) void,
        putDec: *const fn (ctx: *anyopaque, value: u32) void,
        putHex32: *const fn (ctx: *anyopaque, value: u32) void,
    };
    pub inline fn puts(self: Logger, msg: []const u8) void {
        self.vt.puts(self.ctx, msg);
    }
    // ...
};

5.3 HostHooks interface

pub const HostHooks = struct {
    ctx: *anyopaque,
    vt: *const VTable,
    pub const VTable = struct {
        readIrqPin: *const fn (ctx: *anyopaque) bool,   // returns "active"
        delayMs: *const fn (ctx: *anyopaque, ms: u32) void,
        delayUs: *const fn (ctx: *anyopaque, us: u32) void,
        ticksMs: *const fn (ctx: *anyopaque) u32,
        ticksUs: *const fn (ctx: *anyopaque) u32,
        onLinkUp: *const fn (ctx: *anyopaque, itf: ItfId) void,
        onLinkDown: *const fn (ctx: *anyopaque, itf: ItfId) void,
        onEthernetRx: *const fn (ctx: *anyopaque, itf: ItfId, frame: []const u8) void,
        onEvent: ?*const fn (ctx: *anyopaque, event: *const Event) void = null,

        /// Called during long-running driver operations (firmware upload,
        /// CLM load, credit-stall-wait, ioctl timeout loop). Replaces the
        /// reference C driver's CYW43_EVENT_POLL_HOOK macro (cyw43_ll.c:434).
        /// The host must feed watchdog, run its own cooperative scheduler,
        /// and return quickly — target budget ≤ 1 ms per call.
        /// Called from driver context; must not call back into the driver.
        yieldDuringLongOp: ?*const fn (ctx: *anyopaque) void = null,
    };
};

The onEvent hook is optional — if set, the driver dispatches every decoded event to the host. The pico integration sets it so DEAUTH-triggered reconnect logic can live in the pico layer (as opposed to only inside the driver); see §5.8.

yieldDuringLongOp is also optional but strongly recommended for embedded integrators. Call sites inside the driver:

  • ll/boot.zig::busInit — once per 64-byte firmware chunk (~3,500 calls over ~400 ms).
  • ll/clm.zig::clmLoad — once per 1024-byte CLM chunk (~10 calls).
  • ll/frame.zig::waitForCredit — every poll iteration during credit stall.
  • ll/ioctl.zig::doIoctl — every 1 ms wait iteration.

Pico integration implements this as watchdog.feed(); scheduler.poll(); led.poll(); — same set of work as the superloop's main iteration per docs/NANORUBY.md §A4. Without this hook, firmware upload can starve the watchdog on a Pico W configured for aggressive (<1 s) watchdog timeout.

Non-reentrancy rules (MANDATORY — integrators will get this wrong otherwise):

  1. Must not call back into any CYW43 driver API. No driver.sendEthernet(), no driver.doIoctl(), no driver.pollOnce(), no driver.getEventLog(). The driver is in a partial-state inside its long op; reentry corrupts state.
  2. Must not mutate driver-owned buffers or state. The integrator's ctx is the integrator's state; touching it is fine. Touching anything reachable through the driver instance is not.
  3. Allowed operations only: feed watchdog, poll a non-driver scheduler, update a GPIO (LED), service a non-driver IRQ handler's posted-work queue, increment host-side counters.
  4. Must return quickly. Budget ≤ 1 ms per call. Not a place for network polling or flash I/O.
  5. Idempotent on spurious call. The driver may call this more often than strictly necessary (e.g. once per chunk even if some chunks are fast). Integrator must tolerate 3,500+ calls over 400 ms of firmware upload.

Violating rule 1 is the most likely integration bug. A reviewer checking new integrator code should grep yieldDuringLongOp implementations for any symbol that starts with cyw43. / driver. and reject immediately.
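
The call pattern for the firmware-upload site can be sketched in C as below: one optional yield per 64-byte chunk, with a host hook that touches only host-owned state. This is an illustration of the contract, not driver code; upload_chunked, write_chunk_fn, host_state_t and the other names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FW_CHUNK_LEN 64u   /* per-chunk size named in the call-site list above */

typedef void (*yield_fn)(void *ctx);   /* optional; may be NULL */
typedef int  (*write_chunk_fn)(void *ctx, uint32_t off, const uint8_t *src, size_t n);

/* Uploads `len` bytes in FW_CHUNK_LEN pieces, yielding once per chunk.
 * Returns the number of chunks written, or -1 on the first write error. */
long upload_chunked(const uint8_t *blob, size_t len,
                    write_chunk_fn write_chunk, void *bus_ctx,
                    yield_fn yield, void *host_ctx) {
    long chunks = 0;
    for (size_t off = 0; off < len; off += FW_CHUNK_LEN) {
        size_t n = (len - off < FW_CHUNK_LEN) ? (len - off) : FW_CHUNK_LEN;
        if (write_chunk(bus_ctx, (uint32_t)off, blob + off, n) != 0) return -1;
        if (yield) yield(host_ctx);    /* host work only; no driver calls here */
        chunks++;
    }
    return chunks;
}

/* Host side: the hook increments a host-owned counter (stand-in for a watchdog feed). */
typedef struct { unsigned long watchdog_feeds; } host_state_t;
static void host_yield(void *ctx) { ((host_state_t *)ctx)->watchdog_feeds++; }

/* Stand-in bus write that always succeeds, for off-target testing. */
static int null_write(void *ctx, uint32_t off, const uint8_t *src, size_t n) {
    (void)ctx; (void)off; (void)src; (void)n;
    return 0;
}
```

With a hypothetical 224,000-byte blob this loop calls the hook 3,500 times, which is the order of magnitude rule 5 asks integrators to tolerate. The hook receives only the host context, so it has no handle through which to reach driver state.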

5.4 Driver.init / deinit

pub fn init(allocator: std.mem.Allocator, config: Config) WifiError!Driver;
pub fn deinit(self: *Driver) void;

init does nothing on the SPI bus — it just sets up state. To bring the chip up:

pub fn wifiOn(self: *Driver, country: Country) WifiError!void;
pub fn wifiOff(self: *Driver) WifiError!void;

5.5 Scan

pub const ScanOpts = struct {
    ssid: ?[]const u8 = null,   // null = all
    passive: bool = false,
};

pub fn scanStart(self: *Driver, opts: ScanOpts, sink: ScanSink) WifiError!void;
pub fn scanActive(self: *const Driver) bool;
pub fn scanStop(self: *Driver) void;

pub const ScanSink = struct {
    ctx: *anyopaque,
    onResult: *const fn (ctx: *anyopaque, r: *const ScanResult) bool,  // return true to continue
    onComplete: *const fn (ctx: *anyopaque) void,
};

onResult returning false signals "stop scanning" — the driver issues an abort.
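
As an illustration of the early-stop contract, here is a sketch of an integrator-side sink that looks for one SSID and returns false as soon as it is seen. ScanResult below is a stand-in with only two assumed fields (ssid, rssi); the driver's real struct may differ. Finder is a hypothetical helper.

```zig
const std = @import("std");

/// Stand-in for the driver's ScanResult; the field names are assumptions.
const ScanResult = struct {
    ssid: []const u8,
    rssi: i16,
};

/// Same shape as the ScanSink declared in this section.
const ScanSink = struct {
    ctx: *anyopaque,
    onResult: *const fn (ctx: *anyopaque, r: *const ScanResult) bool,
    onComplete: *const fn (ctx: *anyopaque) void,
};

/// Hypothetical integrator state: remember the RSSI of one wanted SSID.
const Finder = struct {
    wanted: []const u8,
    found_rssi: ?i16 = null,
    done: bool = false,

    fn onResult(ctx: *anyopaque, r: *const ScanResult) bool {
        const self: *Finder = @ptrCast(@alignCast(ctx));
        if (std.mem.eql(u8, r.ssid, self.wanted)) {
            self.found_rssi = r.rssi;
            return false; // ask the driver to abort the scan
        }
        return true; // keep scanning
    }

    fn onComplete(ctx: *anyopaque) void {
        const self: *Finder = @ptrCast(@alignCast(ctx));
        self.done = true;
    }

    fn sink(self: *Finder) ScanSink {
        return .{ .ctx = self, .onResult = onResult, .onComplete = onComplete };
    }
};
```

The sink would be handed to scanStart as finder.sink(); the Finder must outlive the scan because the sink only holds a pointer to it.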

5.6 Join / leave

pub const JoinConfig = struct {
    ssid: []const u8,
    key: []const u8 = &.{},
    auth: AuthType,
    bssid: ?[6]u8 = null,     // bind to specific AP
    channel: u32 = CYW43_CHANNEL_NONE,
    timeout_ms: u32 = 15_000,
};

pub fn join(self: *Driver, config: JoinConfig) WifiError!void;
pub fn leave(self: *Driver) WifiError!void;
pub fn rejoin(self: *Driver) WifiError!void;  // re-issue last join

join is synchronous: it polls internal state until join_state == joined or timeout. During the polling, it drives hooks.delayMs(1) and internal RX drain. This is fine for pico’s cooperative model.

5.7 State queries

pub fn linkStatus(self: *const Driver) LinkState;  // LinkState enum below
pub fn rssi(self: *Driver) WifiError!i32;
pub fn macAddr(self: *const Driver) [6]u8;
pub fn bssid(self: *Driver) WifiError![6]u8;
pub fn currentAuth(self: *const Driver) ?AuthType;

pub const LinkState = enum {
    down,
    associating,
    associated_no_ip,   // up at L2; integrator knows about L3
    up,                 // integrator has called markIpReady
    degraded,           // beacon loss / signal warning but still associated
    reconnecting,
    fail_badauth,
    fail_nonet,
    fail_general,
};

5.8 Event subscription

The HostHooks.onEvent callback fires for every decoded event (not just those the driver handles internally). Pico uses this for:

  • Logging unknown event types (ISSUES.md #25 diagnostic).
  • Externalizing DEAUTH-triggered reconnect policy beyond what auto_reconnect implements.
  • Surfacing RSSI warnings and roam events to user-facing JS/Ruby.

The hook runs in poll context, not IRQ — safe to allocate, log, etc.

5.9 PMKSA cache (mandatory Phase 3 deliverable; improvement over reference)

The reference C driver does not materially implement host-side PMKSA cache management. Neither does our current Zig driver. Research (see docs/CYW43-PMKSA-RESEARCH.md) establishes both the mechanism and the scope of what PMKSA addresses.

PMKSA handling is a hard Phase 3 deliverable. Cutover (Phase 4) does not flip the default build until clear_on_boot is landed and verified on hardware (§7.3 H2).

Framing correction from research. The most-reported Pico W reliability issue (pico-sdk #2153) is not a PMKSA-cache problem. It is an ICV_ERROR (event 49) flood triggered by missed rekey exchanges under power-save mode. The fix is a 3-line addition to the event handler (queue pend_rejoin on event 49) — exactly the pattern our join state machine (§6.1) already commits to. This is the highest-value reliability fix in the rewrite, and it is orthogonal to PMKSA. The rewrite gets both.

PMKSA clear_on_boot addresses an independent failure mode. The CYW43 firmware can retain a PMKSA cache entry across a host watchdog reset (the chip stays powered; only the RP2040 restarts). If the AP has since forgotten that PMKSA (AP reboot, session timeout, config change), the chip attempts fast reauth with stale state and the AP refuses. Clearing the firmware cache at every wifiOn forces the first join to be a clean 4-way handshake, which sidesteps this drift.

Together, both mechanisms cover the reliability surface:

| Fix | Cost | Addresses | Evidence |
|---|---|---|---|
| Exhaustive event decoder + event 49 → pend_rejoin | 3 lines in event dispatcher | pico-sdk #2153 (ICV_ERROR flood under PM) | cyw43-driver PR #130 (Jan 2025) |
| PMKSA clear_on_boot | ~20 LOC, one 356-byte iovar at wifiOn end | AP-side PMKSA drift across host reboot (802.11-fundamental) | Inferred from 802.11 spec + brcmfmac behavior |

Both ship in Phase 3. See §5.9.1 for PMKSA API surface; §6.1 for the event handler (which includes event 49 per the state-machine success criteria).

5.9.1 API surface

pub const PmksaPolicy = enum {
    // Active management modes — one of these must be chosen for Phase 4 cutover.
    clear_on_boot,    // default. Wipe firmware PMKID cache at every wifiOn so the
                      // first join is a clean 4-way handshake, preventing
                      // firmware-vs-AP state drift after router power-cycle.
    cache_in_boot,    // clear_on_boot + host maintains a (BSSID → PMKID) cache for
                      // the remainder of this boot. Enables fast reauth (skip 4-way
                      // handshake on reconnect to same AP within a boot session).
                      // Evicts on DEAUTH/DISASSOC per §5.9.3.

    // Escape hatch only — present so a debug build can A/B against reference
    // behavior. Not for production.
    disabled,         // no explicit PMKSA management; rely on firmware defaults.
                      // Behavior matches the reference C driver; expect stale-PMK
                      // failures after AP power-cycle.
};

// in Config:
pmksa_policy: PmksaPolicy = .clear_on_boot,

// Additional config for cache_in_boot mode:
pmksa_cache_capacity: usize = 4,   // number of (BSSID, PMKID) entries retained

Default is clear_on_boot. cache_in_boot is a Phase 3 stretch goal (ship if budget allows; defer to Phase B if not) — but clear_on_boot is non-negotiable.

5.9.2 Iovar research (COMPLETED — see docs/CYW43-PMKSA-RESEARCH.md)

The pre-Phase-3 research mandated here has been completed in this planning session and is captured at docs/CYW43-PMKSA-RESEARCH.md. Summary of verified findings (Phase 3 coding may proceed from these facts without further research):

  • Iovar name: pmkid_info (plain, NOT bsscfg:-prefixed). Confirmed from Linux kernel brcmfmac/cfg80211.c at function brcmf_update_pmklist.
  • Chip compatibility: CYW43439 is explicitly handled by brcmfmac as CY_CC_43439_CHIP_ID alongside legacy-family siblings BCM43430 / BCM4345 / BCM43454 (confirmed from brcmfmac/feature.c).
  • API version: our blob is firmware 7.95.61 from 2023-01-11 (verified via strings src/cyw43/firmware/43439A0_combined.bin). Firmware 7.x pre-dates WLC version 12.0 — so we use the legacy API. V2 is never implemented in brcmfmac; V3 requires WLC ≥ 13.0. The 43439A0_combined.bin filename is a label; the inside is 7.95.61 (identical to Embassy's bundled firmware). Earlier drafts of this plan incorrectly said 7.95.49.00 — that was the pico-sdk reference's bundled version at our dd7568 audit SHA, not what our repo actually ships. Corrected 2026-04-20.
  • Legacy payload: exactly 356 bytes = __le32 npmk (LE u32) + 16 × {u8 bssid[6]; u8 pmkid[16]}. No padding, no version/length header.
  • MAXPMKID = 16, PMKID_LEN = 16. 802.11 standard values.
  • Flush operation: zero the whole 356-byte buffer and send it. npmk = 0 means "clear all entries."
  • Endianness: only npmk is LE-encoded; everything else is byte-arrays.
  • License boundary: brcmfmac is GPL-2.0. The research document reads brcmfmac behavior as protocol evidence (not copied code). Zig implementation written from the research spec without referencing brcmfmac source directly. §10 rules apply.
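
The payload facts above can be sketched as a small encoder. This is an illustration written from the bullet list only (4-byte LE npmk, then 16 fixed 22-byte slots); the names PmkEntry and encodePmkidInfo are hypothetical, not the planned ctrl/pmksa.zig API.

```zig
const std = @import("std");

const MAXPMKID = 16;
const BSSID_LEN = 6;
const PMKID_LEN = 16;
const ENTRY_LEN = BSSID_LEN + PMKID_LEN; // 22 bytes per slot
pub const PMKID_INFO_LEN = 4 + MAXPMKID * ENTRY_LEN; // 4 + 352 = 356

comptime {
    std.debug.assert(PMKID_INFO_LEN == 356);
}

pub const PmkEntry = struct {
    bssid: [BSSID_LEN]u8,
    pmkid: [PMKID_LEN]u8,
};

/// Encode up to 16 entries into the legacy pmkid_info payload.
/// An empty slice yields the all-zero flush buffer (npmk = 0).
pub fn encodePmkidInfo(entries: []const PmkEntry) error{TooManyEntries}![PMKID_INFO_LEN]u8 {
    if (entries.len > MAXPMKID) return error.TooManyEntries;
    var buf = [_]u8{0} ** PMKID_INFO_LEN;
    // npmk is the only little-endian integer. It never exceeds 16, so only
    // the low byte is non-zero; bytes 1..3 stay 0.
    buf[0] = @intCast(entries.len);
    for (entries, 0..) |e, i| {
        const off = 4 + i * ENTRY_LEN;
        @memcpy(buf[off .. off + BSSID_LEN], &e.bssid);
        @memcpy(buf[off + BSSID_LEN .. off + ENTRY_LEN], &e.pmkid);
    }
    return buf;
}
```

Calling encodePmkidInfo with an empty slice produces exactly the buffer the flush operation sends.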

The committed docs/CYW43-PMKSA-RESEARCH.md is the Phase 3 P3a pre-work artifact. Its presence in the repo satisfies the §11.3 go/no-go checkbox "pre-work artifact present."

5.9.3 Runtime behavior spec

clear_on_boot (the default, mandatory):

  • Lifecycle point (precise). The clear runs in this exact position within wifiOn():

    wifiOn(country) flow:
      1. cyw43_ll_wifi_on prerequisites (country, antenna, iovars, 150ms gate)
      2. Event mask set via bsscfg:event_msgs  ──── events can now arrive
      3. WLC_UP ioctl                          ──── interface is up
      4. cyw43_delay_ms(50)                    ──── reference's post-UP settle
      5. >>> PMKSA clear_on_boot here <<<      ──── our insertion point
      6. Driver state transitions to wifi_up_pmksa_cleared
      7. wifiOn() returns
    

    Placement rationale (too-early / too-late race windows):

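    • Too early (before WLC_UP): firmware may not accept the iovar before the interface is up; a negative response status such as 0xffffffe2 is observable. (As a signed value that is -30, which Broadcom's BCME error table names NOTFOUND; NOTASSOCIATED is -17, i.e. 0xffffffef.)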
    • Too late (after first join attempt): the whole point is to ensure the first join sees an empty cache. If the clear lands mid-join or post-join, it's worse than useless — it discards the PMKID we just established.
    • Our chosen point: after WLC_UP + 50ms settle (firmware is fully up and accepting iovars) but before wifiOn() returns (so join() cannot possibly be called yet — caller doesn't even have control).
  • join() checks state == wifi_up_pmksa_cleared and errors with WifiError.PmksaNotCleared if the clear has not completed — this prevents a caller from racing wifiOn() and join() in a way that would allow stale PMK reuse on the first association.

  • Issue the PMKSA-clear iovar: pmkid_info SET with npmk = 0 in a zeroed 356-byte buffer (per §5.9.2 and the research doc).

  • Failure policy on a supposedly-supported blob. A non-zero response from the iovar call is a Phase 3 validation failure — it must not be silently tolerated during the P3b hardware-verification gate. For runtime after verification has passed, an unexpected non-zero response logs a warning and continues (degraded reliability rather than refusing to boot — devices need to remain debuggable). If the failure is on a blob that research already flagged as unsupported, fall through to §5.9.4 fallback; do not silently continue.

  • No per-join action required in this mode; firmware starts each join with an empty PMKID cache.
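
The ordering guarantee described above (clear inside wifiOn, gate inside join) can be sketched with a miniature model. MiniDriver, its fields, and setPmkidInfo are hypothetical stand-ins; only the state name wifi_up_pmksa_cleared and the error PmksaNotCleared come from this section.

```zig
const std = @import("std");

const WifiError = error{ PmksaNotCleared, IovarFailed };

const DriverState = enum { off, wifi_up, wifi_up_pmksa_cleared };

/// Hypothetical miniature of the driver: only the state gate is modeled.
const MiniDriver = struct {
    state: DriverState = .off,
    iovar_sets: u32 = 0,

    /// Stand-in for the real iovar write; records the call and succeeds.
    fn setPmkidInfo(self: *MiniDriver, payload: *const [356]u8) WifiError!void {
        _ = payload;
        self.iovar_sets += 1;
    }

    fn wifiOn(self: *MiniDriver) WifiError!void {
        // Steps 1-4 (prerequisites, event mask, WLC_UP, 50 ms settle) elided.
        self.state = .wifi_up;
        const zeros = [_]u8{0} ** 356; // npmk = 0 means "clear all entries"
        try self.setPmkidInfo(&zeros); // step 5
        self.state = .wifi_up_pmksa_cleared; // step 6
    }

    fn join(self: *MiniDriver) WifiError!void {
        // Refuse to associate until the clear has completed.
        if (self.state != .wifi_up_pmksa_cleared) return WifiError.PmksaNotCleared;
        // ... association would start here
    }
};
```

This sketch propagates an iovar failure out of wifiOn, which corresponds to the strict behavior wanted during the P3b gate; the post-verification "warn and continue" policy would need a different state transition and is not modeled here.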

cache_in_boot (stretch):

  • On every successful joined transition (§6.1), compute or request the PMKID for the current BSSID and add (bssid, pmkid) to the host-side cache.
    • PMKID is produced by the firmware as part of the 4-way handshake. The iovar to read the current-session PMKID is (per brcmfmac) pmkid_info with GET direction, returning the whole cache list.
  • On DEAUTH_IND / DISASSOC_IND / EV_ICV_ERROR for the current BSSID: evict that BSSID's entry from the host cache and from firmware by re-sending the pmkid_info list with that entry removed. (The legacy API in §5.9.2 is a whole-list SET; del_pmksa in brcmfmac is a cfg80211 operation that rewrites the list, not a separate firmware iovar.) Eviction is the critical correctness step: without it, the cache causes the reliability issue it was supposed to solve.
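  • Cache is bounded (pmksa_cache_capacity, default 4). Eviction policy on overflow: drop the oldest-added entry and always keep the newest. This is insertion-order (FIFO) eviction; a strict LRU would also refresh an entry's age each time it is reused for a reconnect.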
  • Cache is not persisted across reboots in V1. Persistent PMKSA requires flash-write support and bindings/storage.zig flash-write isn't implemented yet (ISSUES.md open item #3). Mark as future work in UPSTREAM.md.
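
A possible shape for the bounded host-side cache, as a sketch only. It reads the overflow rule as insertion-order eviction (oldest entry dropped, newest always kept) and treats re-adding a known BSSID as a refresh. PmksaCache, Entry, and the method names are hypothetical.

```zig
const std = @import("std");

pub const Entry = struct { bssid: [6]u8, pmkid: [16]u8 };

/// Fixed-capacity host-side (BSSID → PMKID) cache. Entries are kept in
/// insertion order, index 0 being the oldest. No allocation.
pub fn PmksaCache(comptime capacity: usize) type {
    return struct {
        const Self = @This();
        comptime {
            std.debug.assert(capacity > 0);
        }

        items: [capacity]Entry = undefined,
        len: usize = 0,

        fn indexOf(self: *const Self, bssid: [6]u8) ?usize {
            for (self.items[0..self.len], 0..) |e, i| {
                if (std.mem.eql(u8, &e.bssid, &bssid)) return i;
            }
            return null;
        }

        /// Remove the entry for `bssid`; returns true if one was present.
        pub fn evict(self: *Self, bssid: [6]u8) bool {
            const i = self.indexOf(bssid) orelse return false;
            var j = i;
            while (j + 1 < self.len) : (j += 1) self.items[j] = self.items[j + 1];
            self.len -= 1;
            return true;
        }

        /// Insert or refresh. When full, the oldest entry is dropped.
        pub fn put(self: *Self, e: Entry) void {
            _ = self.evict(e.bssid); // a refresh moves the entry to the newest slot
            if (self.len == capacity) _ = self.evict(self.items[0].bssid);
            self.items[self.len] = e;
            self.len += 1;
        }

        pub fn get(self: *const Self, bssid: [6]u8) ?[16]u8 {
            const i = self.indexOf(bssid) orelse return null;
            return self.items[i].pmkid;
        }
    };
}
```

With the default capacity of 4 the linear scans are trivially cheap; the DEAUTH_IND / DISASSOC_IND / ICV_ERROR handlers would call evict before updating firmware.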

disabled (debug only):

  • No iovars sent. Behavior identical to cyw43-driver. Used only for A/B debugging against the reference.

5.9.4 Fallback if the iovar is absent on our blob vintage

The §5.9.2 research is source-based, so on-hardware testing might still reveal that our 2023-era 7.95.61 blob lacks the pmkid_info iovar entirely. (Extremely unlikely: 7.95.61 is recent, the chip family has had pmkid_info for years per brcmfmac, and Embassy's shipped firmware is identical to ours.) The Phase 3 task must still test this on real hardware (issue the iovar, check response status; ~30 minutes of work). Three outcomes:

  1. Iovar works as specified — land §5.9.3 as designed. (Expected outcome.)

  2. Iovar returns firmware error — upgrade the blob vintage. soypat ships 7.95.62 (Apr 2023), which is the newest public vintage for this chip family. This triggers risk R17 (re-run golden traces + hardware matrix against the new blob) but unblocks PMKSA. Document the vintage upgrade in UPSTREAM.md.

  3. Blob upgrade is infeasible — in this specific failure mode only, the plan may fall back to an alternate primitive that must itself be documented in brcmfmac or WHD as clearing the same supplicant-side cache state the missing iovar would have cleared. The research document must name the alternate iovar/command and cite the source evidence before Phase 3 accepts this fallback path. Ad-hoc sequences invented locally (e.g. "maybe toggling bsscfg:sup_wpa flushes the cache") are not acceptable as an outcome-(3) implementation — they re-introduce the exact "invent-undocumented-behavior" risk that the pre-work research was meant to eliminate. If no alternate primitive exists in either reference, the correct answer is outcome (2): upgrade the blob.

    cache_in_boot is deferred to Phase B in the outcome-(3) scenario.

Outcome (1) is expected. Outcome (2) is acceptable but triggers R17. Outcome (3) exists only as a documented escape hatch backed by external evidence; the researcher must not preemptively assume it.

5.9.5 cache_in_boot synchronization via PMKID_CACHE event

Original §5.9.3 design polled firmware via pmkid_info GET after each join to refresh the host-side cache. After examining Embassy and soypat event enums, a cleaner design uses the firmware-originated PMKID_CACHE event (type 21) as the sync trigger:

// ctrl/pmksa.zig (sketch):
// Firmware emits EV_PMKID_CACHE whenever its internal PMKSA state changes
// (new cache entry after 4-way handshake, entry expired, etc.). We mirror
// to host-side cache reactively rather than polling.

fn onPmkidCacheEvent(self: *Driver, ev: *const Event) void {
    _ = ev; // payload not needed; Zig rejects unused parameters
    // Read firmware's current cache via pmkid_info GET.
    var buf: [356]u8 = undefined;
    self.ioctl.getIovar("pmkid_info", &buf) catch return;
    self.pmksa_cache.syncFromFirmware(&buf);
    self.stats.pmksa_cache_syncs += 1;
}

Benefits over polling:

  • No redundant GETs after non-state-changing events.
  • Cache tracks firmware state change-by-change rather than only at join boundaries (the only lag is the GET issued in response to each event).
  • Observable in logs: every PMKSA cache change is traceable.
  • Still ships DEAUTH_IND / DISASSOC_IND / ICV_ERROR → evict-and-SET to proactively remove entries on known-bad sessions (belt + suspenders).

This adjustment is recorded here but the cache_in_boot mode remains a Phase 3 stretch (§5.9.3). clear_on_boot does not require this event; it's a Phase-3-P3c-only consideration.
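
The syncFromFirmware step in the sketch above has to decode the 356-byte GET response. One way that decode could look, assuming the GET response uses the same legacy layout as the SET payload in §5.9.2; decodePmkidInfo and Entry are hypothetical names:

```zig
const std = @import("std");

pub const Entry = struct { bssid: [6]u8, pmkid: [16]u8 };

/// Decode a legacy pmkid_info buffer into `out` and return the entry count.
/// A count above 16 is treated as a malformed response rather than trusted.
pub fn decodePmkidInfo(buf: *const [356]u8, out: *[16]Entry) error{Malformed}!usize {
    // npmk: little-endian u32 at offset 0.
    const npmk: u32 = @as(u32, buf[0]) |
        (@as(u32, buf[1]) << 8) |
        (@as(u32, buf[2]) << 16) |
        (@as(u32, buf[3]) << 24);
    if (npmk > 16) return error.Malformed;
    const n: usize = npmk;
    var i: usize = 0;
    while (i < n) : (i += 1) {
        const off = 4 + i * 22; // 22-byte slots: 6-byte BSSID, 16-byte PMKID
        @memcpy(&out[i].bssid, buf[off .. off + 6]);
        @memcpy(&out[i].pmkid, buf[off + 6 .. off + 22]);
    }
    return n;
}
```

Rejecting npmk > 16 keeps a corrupted response from turning into an out-of-bounds read on the host.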

5.10 WPA3-SAE support (Phase 3 deliverable)

Decision: YES, implement WPA3-SAE as a shipped feature in Phase 3.

5.10.1 Feasibility check — all four preconditions satisfied

| Precondition | Evidence |
|---|---|
| Firmware supports SAE | Our 7.95.61 blob capability string contains mfp-sae. Verified via strings src/cyw43/firmware/43439A0_combined.bin. |
| Firmware supports MFP | Same string contains mfp (Management Frame Protection; mandatory for WPA3). |
| Protocol knowledge | pico-sdk C reference implements SAE join at cyw43_ll.c:2065-2111 via sae_password iovar + WLC_SET_AUTH=3 (AUTH_TYPE_SAE) + wpa_auth = CYW43_WPA3_AUTH_SAE_PSK (0x40000). |
| Cross-verified in other drivers | Embassy added WPA3 support in PR #3323 (merged). Capability observed in misc/embassy/cyw43/src/control.rs via grep for sae. |

No blob upgrade needed. No chip feature gap. The only reason earlier drafts marked WPA3 as "optional" was conservative scope-cutting, not a feature gap.

5.10.2 Wire-level implementation spec

Extend the join flow (§2.4.12) with a WPA3 branch:

if auth_type ∈ {WPA3_SAE_AES_PSK, WPA3_WPA2_AES_PSK}:
    wpa_auth = CYW43_WPA3_AUTH_SAE_PSK            // 0x40000
    (for WPA3_WPA2 transition mode: wpa_auth |= CYW43_WPA2_AUTH_PSK)
    auth_cmd = AUTH_TYPE_SAE                      // 3 for WLC_SET_AUTH
    mfp_val  = MFP_REQUIRED                       // 2 for pure WPA3
                                                  // MFP_CAPABLE (1) for WPA3+WPA2 mixed
    key_length_max = 128                          // CYW43_WPA_SAE_MAX_PASSWORD_LEN

    # Use "sae_password" iovar, not WLC_SET_WSEC_PMK:
    buf[0..2] = LE u16 key_len
    buf[2..130] = key bytes (zero-padded to 128)
    cyw43_delay_ms(2)                             # firmware needs prep time
    iovar_set("sae_password", buf, 130, STA)

For WPA3-SAE: WLC_SET_WSEC gets auth_type & 0xff (same as WPA2 path — AES is 4).
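
The sae_password buffer described in the pseudocode can be sketched in Zig as follows. Only the layout (LE u16 length, then 128 zero-padded bytes, 130 total) comes from the spec above; buildSaePassword and the constant names are hypothetical.

```zig
const std = @import("std");

pub const SAE_MAX_PASSWORD_LEN = 128;
pub const SAE_BUF_LEN = 2 + SAE_MAX_PASSWORD_LEN; // 130

/// Build the 130-byte "sae_password" payload: little-endian u16 length,
/// then the password zero-padded to 128 bytes.
pub fn buildSaePassword(key: []const u8) error{KeyTooLong}![SAE_BUF_LEN]u8 {
    if (key.len > SAE_MAX_PASSWORD_LEN) return error.KeyTooLong;
    var buf = [_]u8{0} ** SAE_BUF_LEN;
    const len: u16 = @intCast(key.len);
    buf[0] = @truncate(len); // low byte
    buf[1] = @truncate(len >> 8); // high byte, always 0 for len <= 128
    @memcpy(buf[2 .. 2 + key.len], key);
    return buf;
}
```

Rejecting over-long keys up front means the 2 ms delay and the iovar_set call are never reached with a truncated password.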

5.10.3 Config surface

// in Config:
pub const Wpa3Mode = enum {
    off,           // skip WPA3 iovars entirely; reject wpa3_* AuthTypes at join()
    wpa2_only,     // compile WPA3 code but default to WPA2 for all joins unless
                   // caller explicitly passes a wpa3_* AuthType
    auto,          // DEFAULT: try SAE first if AuthType is wpa3_*; on specific
                   // failure signals (AUTH FAIL reason=16 auth_type=3, or
                   // sequence of 3 SAE timeouts), attempt WPA2 fallback IFF
                   // wpa3_wpa2_aes_psk (transition-mode) was the AuthType.
                   // Pure wpa3_sae_aes_psk does NOT fall back to WPA2.
    prefer_sae,    // always prefer SAE; never fall back (for WPA3-only AP)
};
wpa3_mode: Wpa3Mode = .auto,

Fallback semantics (.auto mode):

| AuthType | Initial attempt | On WPA3 failure | Failure signals that trigger fallback |
|---|---|---|---|
| open / wpa_tkip_psk / wpa2_aes_psk / wpa2_mixed_psk | Always WPA2 path | n/a | n/a |
| wpa3_wpa2_aes_psk (transition) | SAE | WPA2 | AUTH FAIL reason=16 auth_type=3, or 3 consecutive SAE timeouts in one join call |
| wpa3_sae_aes_psk (pure) | SAE | No fallback; returns JoinBadAuth | Same signals, but fallback not attempted |

Fallback decision is made entirely inside ctrl/join.zig without exposing the mode-switch to the caller (the caller asked for wpa3_wpa2_aes_psk specifically because they want either to work). Fallback counts against the same join timeout_ms budget; only one fallback attempt per join call.
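
The fallback rule can be expressed as one pure function, which makes it easy to unit-test. This is a sketch under assumptions: SaeFailure and shouldFallBackToWpa2 are hypothetical names, and every mode other than .auto is assumed never to fall back (the text above only specifies .auto and .prefer_sae).

```zig
const std = @import("std");

pub const Wpa3Mode = enum { off, wpa2_only, auto, prefer_sae };

pub const AuthType = enum(u32) {
    open = 0,
    wpa_tkip_psk = 0x00200002,
    wpa2_aes_psk = 0x00400004,
    wpa2_mixed_psk = 0x00400006,
    wpa3_sae_aes_psk = 0x01000004,
    wpa3_wpa2_aes_psk = 0x01400004,
};

/// The two failure signals named in the fallback table.
pub const SaeFailure = union(enum) {
    auth_fail: struct { reason: u32, auth_type: u32 },
    sae_timeouts: u8, // consecutive SAE timeouts within this join call
};

/// True when the join should retry once on the WPA2 path.
pub fn shouldFallBackToWpa2(
    mode: Wpa3Mode,
    auth: AuthType,
    failure: SaeFailure,
    already_fell_back: bool,
) bool {
    if (already_fell_back) return false; // only one fallback per join call
    if (mode != .auto) return false; // assumption: only .auto falls back
    if (auth != .wpa3_wpa2_aes_psk) return false; // pure SAE never falls back
    return switch (failure) {
        .auth_fail => |f| f.reason == 16 and f.auth_type == 3,
        .sae_timeouts => |n| n >= 3,
    };
}
```

The timeout_ms budget is not modeled; the caller would still need to check the remaining time before starting the WPA2 attempt.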

Rationale for .auto as default, not a plain bool = true: WPA3 support is binary at the firmware level but trinary at the policy level (do we force SAE, allow fallback, or refuse WPA3 entirely?). GPT-5.4 turn-6 review flagged that a plain boolean conflates the two — .auto gives integrators explicit control without requiring them to reason about WPA3 semantics.

AuthType enum in types.zig already enumerates WPA3 (per §2.2 and §5.1):

pub const AuthType = enum(u32) {
    open                 = 0,
    wpa_tkip_psk         = 0x00200002,
    wpa2_aes_psk         = 0x00400004,
    wpa2_mixed_psk       = 0x00400006,
    wpa3_sae_aes_psk     = 0x01000004,  // pure WPA3
    wpa3_wpa2_aes_psk    = 0x01400004,  // WPA3 + WPA2 transition mode
};

When wpa3_mode = .off, passing one of the WPA3 values to join() returns WifiError.UnsupportedAuthType. This provides a flash-saving escape hatch (~1–2 KB of SAE code+config) for WPA2-only deployments. When .wpa2_only, the code is compiled but the driver defaults to the WPA2 path unless the caller explicitly passes a wpa3_* AuthType.

5.10.4 AGENTS.md gotcha #28 status

AGENTS.md §CYW43 gotchas #28 currently documents: "WPA3/mixed-mode APs break WPA2 join." This was accurate for the old driver which has mfp=1 (MFP_CAPABLE) hard-coded and doesn't do the SAE handshake. The new driver implements SAE properly and should close this gotcha — WPA3 APs become first-class supported. Plan a final gotcha-list update in Phase 4.

5.10.5 WPA3 validation — hardware matrix additions

New entries for §7.3:

| # | Scenario | Pass criteria |
|---|---|---|
| H11 | WPA3-SAE join (pure) | Join AP configured WPA3-only; 4-way handshake completes using SAE. Existing table entry; now mandatory. |
| H13 | WPA3 failed-password detection | AUTH FAIL reason=16 auth_type=3 observed; returns JoinBadAuth within 5 s (§6.1 WPA3-specific rule). |
| H14 | WPA3-WPA2 transition AP | Join AP in transition mode; observe which auth path the firmware chose; both WPA2 fallback and WPA3 primary are acceptable. |

WPA3 validation is NOT a Phase 4 cutover blocker (phasing: ships with Phase 3 code + validated in Phase 3 soak; additional hardening in Phase 4). -Dcyw43=new_shadow builds must compile WPA3 code by default; any regressed WPA2-only test indicates a WPA3 implementation leaking into the WPA2 path and must be fixed.

5.11 Power-save mode

pub const PmValue = struct {
    pub const none = pack(.{ .mode = .none });
    pub const aggressive = pack(.{ .mode = .pm1 });
    pub const performance = pack(.{ .mode = .pm2, .sleep_ret_ms = 200, .li_beacon = 1, .li_dtim = 1, .li_assoc = 10 });
    // ...
};

pub fn setPowerSave(self: *Driver, pm: u32) WifiError!void;
pub fn getPowerSave(self: *const Driver) WifiError!u32;

Packed u32 mirrors the reference’s encoding so the value round-trips through calling code.
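
For reference, a sketch of what pack could look like if it follows the bit layout of the reference's cyw43_pm_value() helper: mode in bits 0-3, sleep_ret_ms / 10 in bits 4-11, li_beacon in bits 12-15, li_dtim in bits 16-19, li_assoc from bit 20. PmMode, PmFields, the field widths, and the 8-bit mask on the sleep field are assumptions of this sketch.

```zig
const std = @import("std");

pub const PmMode = enum(u4) { none = 0, pm1 = 1, pm2 = 2 };

pub const PmFields = struct {
    mode: PmMode,
    sleep_ret_ms: u16 = 0,
    li_beacon: u4 = 0,
    li_dtim: u4 = 0,
    li_assoc: u8 = 0,
};

/// Pack the fields into one u32 using the reference's layout.
pub fn pack(f: PmFields) u32 {
    return (@as(u32, f.li_assoc) << 20) |
        (@as(u32, f.li_dtim) << 16) |
        (@as(u32, f.li_beacon) << 12) |
        // Stored in units of 10 ms; masked to the 8 bits this field occupies.
        ((@as(u32, f.sleep_ret_ms / 10) & 0xff) << 4) |
        @as(u32, @intFromEnum(f.mode));
}
```

With mode = pm2, sleep_ret_ms = 200, li_beacon = 1, li_dtim = 1, li_assoc = 10, this layout yields 0xa11142, the same packed value the reference produces for those five arguments.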

5.12 Ethernet TX/RX

pub fn sendEthernet(self: *Driver, itf: ItfId, frame: []const u8) WifiError!void;

RX goes through HostHooks.onEthernetRx — no recv() API in the driver (async / poll-driven).

5.13 Country / regulatory

pub fn setCountry(self: *Driver, country: Country) WifiError!void;
// exposed as a field in Config; setCountry is only for runtime changes

5.14 Compat façade

File cyw43.zig exports (alongside the new API) the pre-rewrite module-level surface so src/bindings/wifi.zig compiles unchanged during migration:

// Legacy module-level API kept as compatibility facade.
// Delegates to the default-configured Driver instance owned by this module.
// Phase 4 of the migration removes this section.

pub var default_driver: Driver = undefined;
pub var default_initted: bool = false;

pub fn init(board: Board) WifiError!void {
    if (!default_initted) {
        default_driver = try Driver.init(...);
        default_initted = true;
    }
}
pub fn ledSet(on: bool) WifiError!void { return default_driver.gpioSet(CYW43_GPIO_LED, on); }
pub fn joinWpa2(ssid: []const u8, key: []const u8) WifiError!void { /* delegate */ }
pub fn service() void { default_driver.pollOnce(); }
pub fn getIpAddress() [4]u8 { /* unchanged behavior */ }
pub fn hasIpAddress() bool { /* unchanged behavior */ }
// etc.

Section 6 — State-machine design

Four state machines, each with ASCII diagram + transition enumeration.

6.1 Join state machine

Model choice — three-flag, not bitmask. After comparing Embassy and soypat drivers (misc/embassy/cyw43/src/runner.rs:1171-1240, misc/cyw43439/ioctl.go:520-610), the plan adopts the three-flag link-state model rather than pico-sdk's wifi_join_state bitmask. The three flags independently track each phase of association; computed link state is simply:

link_up = auth_ok and join_ok and (!secure_network or keyed)

Cleaner to reason about, easier to test, and matches how two independent clean-room ports converged. The pico-sdk bitmask approach (§2.3.1) is the behavioral reference for which events transition which flag — not the storage model.

pub const JoinState = enum {
    idle,
    scanning,
    associating,         // join() issued, awaiting events
    joined,              // auth_ok && join_ok && (!secure || keyed)
    rejoining,           // pend_rejoin processing
    disassoc_pending,    // pend_disassoc processing
    failed_badauth,      // terminal unless retry_badauth enabled
    failed_nonet,        // terminal unless retry_nonet enabled
    failed_general,      // retried on transient-backoff schedule
};

pub const JoinFlags = struct {
    /// auth_ok: 802.11 authentication / SAE handshake succeeded.
    /// - WPA2: EV_AUTH status=SUCCESS.
    /// - WPA3 SAE: EV_AUTH status=SUCCESS (emitted after SAE, before 4-way).
    /// - Open: EV_AUTH status=SUCCESS (firmware still emits for open nets).
    auth_ok: bool = false,

    /// join_ok: association/link-layer complete (EV_JOIN status=SUCCESS).
    /// NOT "full success" — use isJoined().
    join_ok: bool = false,

    /// keyed: 4-way handshake + GTK installed.
    /// - WPA2/WPA3: EV_PSK_SUP status=UNSOLICITED flags=0 reason=0.
    /// - Open: NEVER set (no PSK_SUP for open networks) — isJoined()
    ///         accounts for this via `secure` parameter.
    keyed: bool = false,

    pub fn isJoined(self: JoinFlags, secure: bool) bool {
        return self.join_ok and self.auth_ok and (!secure or self.keyed);
    }
    pub fn clear(self: *JoinFlags) void {
        self.* = .{};
    }
};

Flag-semantics edge cases (from GPT-5.4 turn-6 pin-downs):

  1. Open network join. secure=false → isJoined = join_ok && auth_ok. PSK_SUP is never received; keyed stays false forever and that is correct. §7.1 test matrix item 4 includes an open-network synthetic event sequence that MUST reach joined without any PSK_SUP event.

  2. WPA3 vs WPA2 auth_ok. Both paths use EV_AUTH status=SUCCESS to set the flag; the sequence before it differs but the outcome event is the same. Integrator-visible AuthType tells us which path is in play; auth_ok is the same bool either way.

  3. Out-of-order flag updates (JOIN before AUTH): order-independent by design. Test matrix includes the permutation.

  4. Failure-latch clearing on new join. Critical. join() entry point clears ALL stale state (flags + pend-flags + threshold counters) before accepting new join intent:

    pub fn join(self: *Driver, config: JoinConfig) WifiError!void {
        self.flags.clear();                              // auth_ok = join_ok = keyed = false
        self.join_state = .associating;
        self.pend_disassoc = false;
        self.pend_rejoin = false;
        self.pend_rejoin_wpa = false;
        self.mcast_err_count = 0;                        // §6.1.1 threshold counter
        self.mcast_err_window_start_ms = 0;
        // ... proceed with association
    }

    Missing any of these clears causes "previous session's failure contaminates new session" intermittent bugs. Regression guard: test matrix exercises failed-then-retried join and asserts flags == {} + all pend-flags false at start of second attempt.

  5. Link-down event while already rejoining → coalesced by §6.1.5 below; does not restart backoff timer.
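
Edge cases 1 and 3 lend themselves to a synthetic-event replay test. A minimal sketch, with a three-event stand-in enum (Ev) and helper names (apply, replay) that are hypothetical; the JoinFlags logic is copied from the definition above:

```zig
const std = @import("std");

pub const JoinFlags = struct {
    auth_ok: bool = false,
    join_ok: bool = false,
    keyed: bool = false,

    pub fn isJoined(self: JoinFlags, secure: bool) bool {
        return self.join_ok and self.auth_ok and (!secure or self.keyed);
    }
};

/// Reduced synthetic event set, enough to exercise the three flags.
pub const Ev = enum { auth_success, join_success, psk_sup_keyed };

/// Apply one event. keyed is only latched once auth_ok is already set,
/// mirroring the "(AND auth_ok)" note in the state diagram.
pub fn apply(flags: *JoinFlags, ev: Ev) void {
    switch (ev) {
        .auth_success => flags.auth_ok = true,
        .join_success => flags.join_ok = true,
        .psk_sup_keyed => if (flags.auth_ok) {
            flags.keyed = true;
        },
    }
}

/// Replay a sequence from cleared flags and report whether it reaches joined.
pub fn replay(events: []const Ev, secure: bool) bool {
    var flags = JoinFlags{};
    for (events) |ev| apply(&flags, ev);
    return flags.isJoined(secure);
}
```

An open network reaches joined with only AUTH and JOIN in either order, while a secure network with the same two events does not, which is the behavior the test matrix is meant to pin down.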

State diagram:

                    [idle]
                       | join(...)
                       ▼
                 [associating]
     events drive flag updates (order-independent):
        auth_ok ← EV_AUTH status=SUCCESS   (EV_AUTH status=UNSOLICITED is ignored and never sets the flag)
        join_ok ← EV_JOIN status=SUCCESS
        keyed   ← EV_PSK_SUP status=UNSOLICITED flags=0 reason=0 (AND auth_ok)

     when flags.isJoined(secure_network) → [joined]
                       |
                       ├── "link-down class" events  → clear flags, [rejoining] or [idle]
                       │   ├── EV_LINK status=0 flags=0 (reason=1: loss of signal; reason=2: controlled shutdown)
                       │   ├── EV_DEAUTH status=SUCCESS (AP deauthed us)
                       │   ├── EV_DISASSOC
                       │   ├── EV_DEAUTH_IND reason=2 (bad password → pend_disassoc, not rejoin)
                       │   └── [WPA3] EV_AUTH status=FAIL reason=16 auth_type=3
                       │
                       ├── "crypto drift class" events → pend_rejoin
                       │   ├── EV_ICV_ERROR (49)      ── single event is trigger
                       │   ├── EV_MIC_ERROR (17)      ── single event is trigger
                       │   ├── EV_UNICAST_DECODE_ERROR (50)  ── single event is trigger
                       │   └── EV_MULTICAST_DECODE_ERROR (51) ── threshold (3 within 5 s)
                       │
                       ├── "supplicant failure class" → pend_rejoin
                       │   └── EV_PSK_SUP status∈{4,8,10} reason=15 (timeout)
                       │
                       ├── "firmware distress" → pend_rejoin + log warn
                       │   └── EV_PSM_WATCHDOG (41)
                       │
                       ├── "RSN mismatch" → pend_rejoin_wpa (WPA1 fallback) + pend_rejoin
                       │   └── EV_PRUNE status=0 reason=8
                       │
                       ├── positive heartbeat — NO state change, update health counter
                       │   └── EV_GTK_PLUMBED (84): group key successfully installed
                       │
                       └── IGNORE — must not trigger rejoin
                           ├── EV_PSK_SUP reason=14 (CCX_FAST_ROAM — common during roams)
                           └── EV_AUTH status=UNSOLICITED (unsolicited auth packet noise)

                  [rejoining]
                       | pend_rejoin_wpa processed first (if set) → set_wpa_auth(WPA1)
                       | pend_rejoin processed → WLC_SET_SSID w/ cached last_ssid_joined
                       | backoff_index++ if reconnect.enabled; otherwise [failed_general]
                       ▼
                  [associating]   (reconnect flow)

     Special: AUTH bad→good recovery
        EV_AUTH status=SUCCESS while state==failed_badauth → back to associating, preserve flags.

     Special: EV_SET_SSID decoding at top level
        status=SUCCESS                        → join_ok=true (open/WPA3)
        status=NO_NETWORKS (3) reason=0       → [failed_nonet]
        any other status                      → [failed_general]

Event-to-transition table (authoritative; every row is specified in the event decoder per §6.4 ban on else => {}):

| Event | Status | Reason | Flags | Auth type | Action |
|---|---|---|---|---|---|
| AUTH | SUCCESS | | | | auth_ok = true; if in failed_badauth → associating |
| AUTH | FAIL | 16 | | 3 | WPA3-specific: clear flags, link down |
| AUTH | FAIL | | | | failed_badauth |
| AUTH | UNSOLICITED (6) | | | | IGNORE (noise) |
| JOIN | SUCCESS | | | | join_ok = true |
| SET_SSID | SUCCESS | | | | (join flow ok; open/WPA3 path) |
| SET_SSID | NO_NETWORKS (3) | 0 | | | failed_nonet |
| SET_SSID | other | | | | failed_general |
| LINK | SUCCESS | | 0 (down) | | link-down class: clear flags, rejoin or [idle] |
| LINK | SUCCESS | | 1 (up) | | (handled by LinkState §6.2) |
| DEAUTH | SUCCESS | | | | link-down class |
| DEAUTH_IND | | 2 (bad pw) | | | pend_disassoc = true |
| DEAUTH_IND | | other | | | link-down class |
| DISASSOC | | | | | link-down class |
| DISASSOC_IND | | | | | link-down class (same handling) |
| PSK_SUP | UNSOLICITED (6) | 0 | 0 | | keyed = true (if auth_ok) |
| PSK_SUP | | 14 | | | IGNORE (CCX_FAST_ROAM, roam noise; Embassy runner.rs:1216) |
| PSK_SUP | 4, 8, 10 | 15 | | | pend_rejoin (edge-of-cell timeout) |
| PSK_SUP | other | | | | failed_badauth |
| PRUNE | 0 | 8 (RSN mismatch) | | | pend_rejoin_wpa + pend_rejoin |
| PRUNE | other | | | | log, no state change |
| ICV_ERROR (49) | | | | | pend_rejoin (crypto drift) |
| MIC_ERROR (17) | | | | | pend_rejoin (crypto drift) |
| UNICAST_DECODE_ERROR (50) | | | | | pend_rejoin (crypto drift) |
| MULTICAST_DECODE_ERROR (51) | | | | | threshold-triggered pend_rejoin (3 within 5 s; see §6.1.1) |
| PSM_WATCHDOG (41) | | | | | Loud log + stats; pend_rejoin (firmware distress) |
| GTK_PLUMBED (84) | | | | | Positive health signal: update last_healthy_ms timestamp, no state change |
| BCNLOST_MSG (31) | | | | | Link degraded (LinkState §6.2 degraded transition) |
| PMKID_CACHE (21) | | | | | Firmware-originated PMKSA change; §5.9.5 cache_in_boot sync trigger |
| ESCAN_RESULT | PARTIAL (8) | | | | Dispatch to scan sink (onResult) |
| ESCAN_RESULT | SUCCESS (0) | | | | Scan complete (onComplete) |
| CSA_COMPLETE_IND (80) | | | | | Channel switch complete; log |
| ASSOC_REQ_IE (87), ASSOC_RESP_IE (88) | | | | | Log only |
| ROAM_PREP (32), ROAM_START (37), JOIN_START (36), ASSOC_START (38), RESET_COMPLETE (35) | | | | | Log only |
| (unknown) | | | | | EventLog.record(UnknownEvent{…}, …); see §6.1.4 authoritative spec |
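
One way to get the "every row is specified" property from the compiler is a non-exhaustive enum with a switch that has no else prong. The sketch below covers only a subset of the rows; the event numbers come from the table, while Event, Action, and classify are hypothetical names.

```zig
const std = @import("std");

pub const Action = enum { pend_rejoin, threshold_rejoin, health_tick, log_only, record_unknown };

/// Subset of the event numbers from the table. The trailing `_` makes the
/// enum non-exhaustive, so any u32 from firmware is a valid value.
pub const Event = enum(u32) {
    mic_error = 17,
    psm_watchdog = 41,
    icv_error = 49,
    unicast_decode_error = 50,
    multicast_decode_error = 51,
    csa_complete_ind = 80,
    gtk_plumbed = 84,
    _,
};

pub fn classify(raw: u32) Action {
    const ev: Event = @enumFromInt(raw);
    // No `else` prong: every named tag must appear here or compilation
    // fails. The `_` prong only catches values that have no name.
    return switch (ev) {
        .icv_error, .mic_error, .unicast_decode_error, .psm_watchdog => .pend_rejoin,
        .multicast_decode_error => .threshold_rejoin,
        .gtk_plumbed => .health_tick,
        .csa_complete_ind => .log_only,
        _ => .record_unknown,
    };
}
```

Adding a new named tag to Event without adding a row to the switch is a compile error, which is the property the ban on else => {} is after.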

6.1.1 Multicast decode-error rate limiting

Embassy comment (paraphrased): single mcast decode failures occur naturally on mixed-client networks (different clients using different encryption, broadcast traffic encrypted for the original connecting client). A single event is not a reliable rejoin trigger. Implementation:

// Fixed-window counter: 3 errors within one 5 s window triggers pend_rejoin.
mcast_err_count: u8 = 0,
mcast_err_window_start_ms: u32 = 0,

fn onMulticastDecodeError(self: *Driver, now_ms: u32) void {
    // Wrapping subtraction keeps the elapsed time correct across u32 tick rollover.
    if (now_ms -% self.mcast_err_window_start_ms > 5_000) {
        self.mcast_err_count = 1;
        self.mcast_err_window_start_ms = now_ms;
    } else {
        self.mcast_err_count +%= 1;
        if (self.mcast_err_count >= 3) {
            self.pend_rejoin = true;
            self.mcast_err_count = 0;
        }
    }
}

Unicast decode errors DO trigger pend_rejoin on first event — they indicate our session key is wrong for traffic directed specifically at us.

6.1.2 PSM_WATCHDOG handling

Firmware's internal PSM (Programmable State Machine, the microcode engine) watchdog fires when the microcode is stuck. None of Embassy, soypat, or pico-sdk handles this event explicitly: all three define it in their enum and ignore it. Our plan treats it as a loud diagnostic AND a pend_rejoin trigger:

// Increment stats counter, log at WARN level with uptime.
stats.psm_watchdog_events += 1;
log.puts("[cyw43] WARN: firmware PSM_WATCHDOG event — forcing rejoin\n");
self.pend_rejoin = true;

Rationale: if firmware has hit its own watchdog, we cannot rely on it recovering on its own. A forced rejoin is the safest response.

6.1.3 GTK_PLUMBED as positive health signal

This event fires after every successful group-key rekey. Most embedded drivers don't track it, but it is the single most reliable "firmware and AP are still talking correctly" signal available. Track as:

last_healthy_ms: u32 = 0,  // updated on GTK_PLUMBED or successful ioctl

fn onGtkPlumbed(self: *Driver) void {
    self.last_healthy_ms = self.hooks.ticksMs();
    self.stats.gtk_rekeys += 1;
}

The last_healthy_ms timestamp complements the crypto-error-class events: if we see MIC/ICV errors but last_healthy_ms is recent, the error might be an ordinary glitch; if last_healthy_ms is old, the errors are more concerning.

6.1.4 Unknown-event logging mechanism (authoritative spec)

This subsection is the single source of truth for how the driver handles events not in the §2.2 specific-handler table. It supersedes earlier scattered fragments in §3.2, §3.4, §4.1, §5.8, §6.1 event-table's last row, §8.2 P2 validation, and §11.3 go/no-go — those remain valid pointers, but the spec lives here.

Design goals (ranked):

  1. Never drop an event silently — every unknown arrives somewhere observable.
  2. Never let unknown events flood UART (they carry the potential to cause the issue they're meant to diagnose, per ISSUES.md #25's 180 s UART-corruption hypothesis).
  3. Make ISSUES.md #25 resolvable: the next session's Phase 2 soak must be able to identify which event fires at ~180 s cadence by running one shell command.
  4. Integrator-controllable: pico may want to push events to JS/Ruby, persist to flash, or dump on shell command. The driver records locally and exposes a query API.

Event.unknown struct (single source of truth)

/// Replaces the earlier §3.2 placeholder with a full field set.
pub const UnknownEvent = struct {
    event_type: u32,           // raw BE→host-decoded event type value
    event_name: []const u8,    // from 89-entry name table; "unnamed" for gaps, "unknown" if >= 89
    status: u32,
    reason: u32,
    flags: u16,
    auth_type: u32,
    ifidx: u8,                 // interface index from the event wrapper
    bsscfgidx: u8,             // bsscfg index if present; 0 otherwise
    payload_len: u16,          // length of wrapper payload past the 32-byte header
    payload_prefix: [16]u8,    // first 16 bytes of wrapper payload (zero-padded)
};

pub const Event = union(EventKind) {
    // ... specific variants for handled events ...
    unknown: UnknownEvent,
};

Every unknown event gets this full field set. The fields match exactly what GPT-5.4's peer review (turn 1, landmine D) specified as the minimum for resolving R14 + ISSUES.md #25.

The 89-entry event-name table

Lives in ll/events.zig:

pub const event_names: [89]?[]const u8 = blk: {
    var t = [_]?[]const u8{null} ** 89;
    t[0]  = "SET_SSID";      t[1]  = "JOIN";             t[3]  = "AUTH";
    t[5]  = "DEAUTH";        t[6]  = "DEAUTH_IND";       t[7]  = "ASSOC";
    t[11] = "DISASSOC";      t[12] = "DISASSOC_IND";     t[16] = "LINK";
    t[17] = "MIC_ERROR";     t[21] = "PMKID_CACHE";      t[23] = "PRUNE";
    t[31] = "BCNLOST_MSG";   t[41] = "PSM_WATCHDOG";     t[46] = "PSK_SUP";
    t[49] = "ICV_ERROR";     t[50] = "UNICAST_DECODE_ERROR";
    t[51] = "MULTICAST_DECODE_ERROR";                    t[69] = "ESCAN_RESULT";
    t[80] = "CSA_COMPLETE_IND";                          t[84] = "GTK_PLUMBED";
    t[87] = "ASSOC_REQ_IE";  t[88] = "ASSOC_RESP_IE";
    // Descriptive-only entries (not actively handled; named for log clarity):
    t[19] = "ROAM";          t[20] = "TXFAIL";           t[32] = "ROAM_PREP";
    t[36] = "JOIN_START";    t[37] = "ROAM_START";       t[38] = "ASSOC_START";
    t[40] = "RADIO";         t[44] = "PROBREQ_MSG";      t[54] = "IF";
    t[71] = "PROBRESP_MSG";  t[47] = "COUNTRY_CODE_CHANGED";
    // ... (full table from §12.3 mask-layout and reference cyw43_ll.h) ...
    break :blk t;
};

fn lookupEventName(event_type: u32) []const u8 {
    if (event_type >= event_names.len) return "unknown";
    return event_names[event_type] orelse "unnamed";
}

Rate-limiter and ring buffer

Lives in ctrl/state.zig as part of Driver:

pub const EventLog = struct {
    pub const CAP = 16;                 // max distinct DedupeKey tuples (see below)
    pub const WINDOW_MS: u32 = 5_000;   // rate-limit coalesce window

    pub const Entry = struct {
        unknown: UnknownEvent,          // full data from the first occurrence
        first_seen_ms: u32,
        last_seen_ms: u32,
        count: u16,                     // total occurrences (including first)
        emitted_in_window: bool,        // true after the first-seen log line fired
    };

    entries: [CAP]Entry = undefined,
    count: u8 = 0,
    total_unknown_count: u32 = 0,       // lifetime; never reset
    window_start_ms: u32 = 0,

    pub fn record(self: *EventLog, ev: UnknownEvent, now_ms: u32, log: Logger) void;
    pub fn rollWindow(self: *EventLog, now_ms: u32, log: Logger) void;
    pub fn dumpAll(self: *const EventLog, log: Logger) void;
    pub fn clear(self: *EventLog) void;
};

Dedupe key — precise definition:

const DedupeKey = struct {
    event_type: u32,
    status: u32,
    reason: u32,
    auth_type: u32,
    ifidx: u8,
};
// flags, bsscfgidx, payload_prefix are NOT part of the key.
// Rationale: flags/bsscfgidx may vary slightly across otherwise-identical
// events; payload bytes definitely vary. Including them would defeat
// coalescing. Auth_type IS included because a WPA2 vs WPA3 event with
// otherwise-identical fields is meaningfully different diagnostic data.
// Ifidx is included because STA vs AP events should never coalesce.

record() algorithm:

1. total_unknown_count += 1
2. Compute key = DedupeKey{ event_type, status, reason, auth_type, ifidx }.
3. Search entries[0..count] for a matching key:
   - match found:
     - entry.last_seen_ms = now_ms
     - entry.count += 1 (saturating at u16 max)
     - (silent — no log line for duplicates within window)
   - no match, and count < CAP:
     - Append new entry with first_seen_ms = last_seen_ms = now_ms, count = 1
     - Store the full UnknownEvent (including flags/bsscfgidx/payload_prefix)
       for later inspection — only the KEY is shared across duplicates
     - Emit first-seen log line (see format below)
     - entry.emitted_in_window = true
   - no match, and count >= CAP:
     - Emit one "log full" line once per window (guarded by total_unknown_count)
     - Do not store the event; it is already counted by the step 1 increment of
       total_unknown_count, so do not increment again
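
The record() steps above can be modelled on the host to check the dedupe and overflow behaviour. This is a sketch under stated assumptions: `EventLogModel`, the `lines` list (standing in for Logger output), and the `full_reported` flag (one possible "log full" guard) are hypothetical names, not part of the spec.

```python
from dataclasses import dataclass, field

CAP = 16
U16_MAX = 0xFFFF


@dataclass
class Entry:
    first_seen_ms: int
    last_seen_ms: int
    count: int = 1


@dataclass
class EventLogModel:
    entries: dict = field(default_factory=dict)  # DedupeKey tuple -> Entry
    total_unknown_count: int = 0
    lines: list = field(default_factory=list)    # stands in for Logger output
    full_reported: bool = False                  # assumed "log full" guard

    def record(self, event_type, status, reason, auth_type, ifidx, now_ms):
        self.total_unknown_count += 1            # step 1: counted exactly once
        key = (event_type, status, reason, auth_type, ifidx)
        entry = self.entries.get(key)
        if entry is not None:
            # Duplicate: update silently, saturating at the u16 maximum.
            entry.last_seen_ms = now_ms
            entry.count = min(entry.count + 1, U16_MAX)
        elif len(self.entries) < CAP:
            # First occurrence: store it and emit one first-seen line.
            self.entries[key] = Entry(now_ms, now_ms)
            self.lines.append(f"first-seen type={event_type} status={status} reason={reason}")
        elif not self.full_reported:
            # Table full: report once, do not store.
            self.full_reported = True
            self.lines.append(f"event log full ({CAP} distinct tuples)")
```

Duplicates never add a log line, and an overflowing event still contributes to `total_unknown_count` through step 1 only.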

rollWindow() algorithm (called once per Driver.pollOnce()):

1. If (now_ms - window_start_ms) < WINDOW_MS: return.
2. For each entry with count > 1 AND emitted_in_window:
     Emit summary line: "coalesced (<count>×) over last 5s"
3. Reset all entries (or: compact — keep high-count entries for visibility).
4. window_start_ms = now_ms.

Emitted log-line format (single-line, ≤ 112 chars):

[cyw43] evt ??? type=41(PSM_WATCHDOG) status=0 reason=0 flags=0x0000 ifidx=0 bsscfg=0 plen=6 payload=01 00 00 00 ab cd ...

For events outside the name table (type >= 89):

[cyw43] evt ??? type=107(unknown) status=0 reason=0 flags=0x0000 ifidx=0 bsscfg=0 plen=0 payload=(empty)

Coalesced summary:

[cyw43] evt ??? type=41(PSM_WATCHDOG) coalesced 3× over last 5s

Ring-buffer full:

[cyw43] evt ??? event log full (16 distinct tuples); total unknowns this boot: 142

Integration points

  • Logger: record() always emits to Logger if provided; otherwise silently accumulates in the ring buffer. No-logger mode is supported (freestanding-strict builds may skip the pretty-formatter to save flash).
  • HostHooks.onEvent: fires for every decoded event including Event.unknown. Receives the full UnknownEvent struct. Integrators who want richer handling (persist to flash, push to MQTT, expose via HTTP) hook here.
  • Driver query API:
    pub fn getEventLog(self: *const Driver) *const EventLog { ... }
    pub fn clearEventLog(self: *Driver) void { ... }
  • Pico UART shell command (Phase 3 deliverable): wifi events dumps the ring buffer via EventLog.dumpAll(). Implementation goes in src/bindings/wifi.zig (not the driver itself — the driver exposes the API; the shell is pico-integration layer).

Specific use in ISSUES.md #25 resolution

The 180 s UART-corruption burst is hypothesized to be an unhandled event firing periodically. With §6.1.4 in place, Phase 2 validation becomes mechanical:

  1. Run 30-min soak under active TCP workload (reproduces pico-sdk ISSUES.md #25). Logger enabled.
  2. Issue wifi events shell command after soak.
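  3. Look at ring buffer for entries with count in the range 5–10 (a 30-min run at 180 s cadence gives at most 1800 s / 180 s = 10 occurrences). This assumes rollWindow() retains entries across windows (the "compact" option in its step 3) rather than resetting them every 5 s.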
  4. Those entries' type fields are the candidate events.
  5. Cross-reference against event_names table to identify. Likely candidates based on research: ROAM (19), BCNLOST_MSG (31), PSM_WATCHDOG (41), or TXFAIL (20).
  6. Logger-disabled confirmation run (per GPT-5.4 turn-6 review, addresses the circularity of using UART to diagnose UART corruption):
     a. Rerun the same 30-min soak with Logger set to a no-op but EventLog still recording.
     b. Observe: does the UART still corrupt at 180 s cadence? (Note: something other than our event decoder is using UART, such as puts from the pico-level superloop, so UART still carries normal traffic we can observe for bursts.)
     c. If corruption persists: the event exists (ring buffer has the data) but it is not the cause of UART corruption. SOMETHING ELSE is happening on the SPI bus around the same cadence. Pivot to alt-instrumentation paths below.
     d. If corruption disappears: our own log-line emission at first-seen was the proximate cause. Ring buffer still tells us which event; we decide whether to log it differently or ignore it silently.
  7. Once identified, decide: (a) decode specifically and ignore, (b) decode specifically and act, or (c) disable via event-mask bit clear. Document decision in UPSTREAM.md.

If the ring buffer is empty after soak, ISSUES.md #25's "unhandled-event hypothesis" is falsified — the burst is something else (e.g. SPI bus re-sync pattern, flash-XIP contention with CYW43 traffic, UART DMA underrun). Phase 2 gate then flips from "identify the event" to "identify the actual cause via alternative instrumentation." Either outcome is progress.

Invariants a reviewer must check
  • Decoder never falls through to else => {}. Unknown events always land in EventLog.record().
  • record() never allocates, never blocks, never calls back into the driver. Safe to call from any decode context.
  • Ring buffer is bounded by CAP — cannot overflow memory regardless of event flood rate.
  • Rate-limiter is bounded by WINDOW_MS — UART output rate ≤ 1 line per distinct tuple per 5 s window + 1 coalescing summary per window.
  • getEventLog() returns a const pointer — consumers cannot mutate (only clearEventLog() mutates).

6.1.5 Rejoin-storm coalescing

With the event-handler expansion in §6.1 (MIC_ERROR, ICV_ERROR, UNICAST_DECODE_ERROR, MULTICAST_DECODE_ERROR threshold, PSM_WATCHDOG, PSK_SUP timeout, and multiple link-down-class events), a flaky link can fire many pend_rejoin triggers in quick succession. Without coalescing, each trigger could reset the backoff timer or stack another pending rejoin — livelocking the reconnect policy under noise.

Rule: while join_state == .rejoining, additional pend_rejoin triggers are coalesced, not stacked. Concretely:

fn requestRejoin(self: *Driver, trigger: RejoinTrigger) void {
    self.stats.rejoin_triggers_by_class[@intFromEnum(trigger)] += 1;
    self.last_rejoin_trigger = trigger;     // for diagnostics
    if (self.pend_rejoin) return;           // already pending; coalesce
    if (self.join_state == .rejoining) return;  // already processing
    self.pend_rejoin = true;
    // Backoff timer is NOT reset by this call — it only advances when the
    // previous rejoin attempt actually completes (success or failure).
}

pub const RejoinTrigger = enum {
    icv_error, mic_error, unicast_decode_error, multicast_decode_error_threshold,
    psk_sup_timeout, psm_watchdog, deauth, disassoc, ap_reboot_detected,
    app_requested,
};

What this preserves:

  • stats counters still record EVERY trigger event (so diagnosis can see "we got 47 ICV_ERRORs in the last 5 minutes" even if only one rejoin was attempted).
  • last_rejoin_trigger shows which event class caused the currently-processing rejoin — useful for UPSTREAM.md M-entries when we see new failure patterns.
  • Backoff schedule (§6.1 ReconnectPolicy.backoff_ms) advances cleanly: one entry per completed rejoin attempt, not one per trigger event.

What this prevents:

  • Livelock under noisy crypto errors.
  • Backoff timer getting repeatedly stomped back to 1 s.
  • Multiple pend_rejoin being "true" when only one rejoin is semantically possible at a time.

Test matrix §7.1 item 4 regression: drive a sequence of 10 ICV_ERROR events in rapid succession and assert exactly ONE rejoin attempt issued, with stats.rejoin_triggers_by_class[icv_error] == 10.
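
The regression described above can be prototyped on the host before the Zig test exists. A minimal sketch: `RejoinModel` and its `poll_once` step are hypothetical stand-ins for the driver and its poll loop, and trigger names are plain strings rather than the RejoinTrigger enum.

```python
from collections import Counter


class RejoinModel:
    """Host-side model of requestRejoin() coalescing (hypothetical helper)."""

    def __init__(self) -> None:
        self.pend_rejoin = False
        self.join_state = "joined"
        self.last_rejoin_trigger = None
        self.triggers_by_class = Counter()
        self.rejoin_attempts = 0

    def request_rejoin(self, trigger: str) -> None:
        # Stats and diagnostics are updated for EVERY trigger.
        self.triggers_by_class[trigger] += 1
        self.last_rejoin_trigger = trigger
        # Coalesce: nothing more to do if a rejoin is pending or in progress.
        if self.pend_rejoin or self.join_state == "rejoining":
            return
        self.pend_rejoin = True

    def poll_once(self) -> None:
        # Assumed poll step: consume the pending flag and start one attempt.
        if self.pend_rejoin:
            self.pend_rejoin = False
            self.join_state = "rejoining"
            self.rejoin_attempts += 1
```

Ten `request_rejoin("icv_error")` calls followed by `poll_once()` produce one attempt while the per-class counter reads 10; further triggers during `rejoining` only bump the counter.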

Transition-order discipline. Success flags (auth_ok, join_ok, keyed) are order-independent — the state machine accepts them in any order and promotes to joined when the composite condition is met. Failure classification and deferred-action scheduling remain event-specific: a DEAUTH_IND reason=2 always maps to pend_disassoc, not pend_rejoin; a PRUNE reason=8 always sets pend_rejoin_wpa first. Test matrix §7.1 item 4 includes ordered, out-of-order, AUTH-recovery, crypto-drift, and roam-noise (reason=14) sequences.

Timeouts. join() with a timeout_ms (default 15 s) polls until joined or failed_*, then returns. Internally, assoc_pending + rejoin_pending both count against the same timeout budget. Three rejoin_pending entries inside a single join() call exhaust auto-retry; further recovery is only triggered by new events from within another join() call.

Auto-reconnect policy. Fully configurable:

pub const ReconnectPolicy = struct {
    enabled: bool = true,
    backoff_ms: []const u32 = &.{ 1_000, 2_000, 4_000, 8_000, 16_000 },  // retry schedule
    retry_transient: bool = true,  // DEAUTH_IND, DISASSOC, ICV_ERROR, PSK timeout → retry
    retry_badauth: bool = false,   // permanent auth failure — default OFF to avoid retry spam
    retry_nonet: bool = false,     // SSID not found — default OFF (probably out of range)
};
// in Config:
pub reconnect: ReconnectPolicy = .{},

Backoff reset triggers:

  • Any successful transition to joined clears the backoff index to 0.
  • A manual join(...) call with a new SSID clears the index (treated as a fresh session).
  • A manual leave() → future join(...) clears the index.

Special-cased failures:

  • failed_badauth / failed_nonet: gated behind retry_badauth / retry_nonet respectively. Default OFF — a permanent auth failure should not spin the CPU retrying. User code can respond to the onLinkDown callback with explicit reconfiguration.
  • failed_general: retried on the standard backoff schedule (transient IO/timing faults).
  • Exhausted backoff (past last entry): driver stays in reconnecting, heartbeat-pings once per last-backoff-interval. Host retains ability to trigger manual recovery.
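
The backoff rules above (advance once per completed attempt, stay on the last interval once exhausted, reset on a successful join) can be sketched as follows. The `Backoff` class and its method names are hypothetical; only the default schedule comes from ReconnectPolicy.

```python
class Backoff:
    """Host-side model of the reconnect backoff index (hypothetical helper)."""

    def __init__(self, schedule=(1_000, 2_000, 4_000, 8_000, 16_000)) -> None:
        self.schedule = tuple(schedule)
        self.index = 0

    def next_delay_ms(self) -> int:
        # Past the last entry, keep returning the last interval.
        i = min(self.index, len(self.schedule) - 1)
        self.index += 1
        return self.schedule[i]

    @property
    def exhausted(self) -> bool:
        # True once every schedule entry has been used at least once.
        return self.index >= len(self.schedule)

    def reset(self) -> None:
        # Called on a successful transition to joined, or a fresh manual join.
        self.index = 0
```

The design choice worth noting is that `next_delay_ms` is the only place the index advances, so trigger floods cannot move the schedule.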

6.2 Link state machine

                     [down]
                        |  join() triggers internal state machine
                        ▼
                  [associating]
                        |  join state machine reaches `joined`
                        ▼
              [associated_no_ip] ─── onLinkUp(STA)
                        |  host TCP/IP layer calls driver.markIpReady()
                        ▼
                      [up]
              ┌─────────┼──────────┐
              |         |          |
        beacon_loss  signal_low  EV_DEAUTH_IND / EV_DISASSOC
              |         |          |
              ▼         ▼          ▼
         [degraded]  [degraded]  [reconnecting]
              |          |          |
              | rx frame |          | join state machine re-runs
              | signal recovers     | (uses cached credentials)
              ▼          ▼          ▼
            [up]      [up]      [associating] on success
                                       ─► [up]
                                       fail: [down]

Triggers:

  • associating → associated_no_ip: internal join_state == .joined AND flags.isJoined(secure_network) (see §6.1 three-flag model).
  • associated_no_ip → up: host calls markIpReady().
  • up → degraded: EV_BCNLOST_MSG (31) — beacon-loss event from firmware. This is the specific trigger (was "(future) heuristic" in earlier drafts; research identified the concrete event).
  • degraded → up: any successful RX frame OR EV_GTK_PLUMBED event (positive health signal per §6.1.3).
  • up|degraded → reconnecting: any "link-down class" event (§6.1 table) with auto_reconnect enabled.
  • * → down: leave() called, or reconnect exhausted and auto_reconnect disabled.
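
The trigger list above maps naturally onto a transition table. The sketch below is illustrative only: the event labels (`join`, `joined`, `ip_ready`, `bcnlost`, `rx_frame`, `gtk_plumbed`, `link_down`, `leave`) are hypothetical names for the triggers, `link_down` assumes auto-reconnect is enabled, and the reconnect success/failure edges of the diagram are left out.

```python
TRANSITIONS = {
    ("down", "join"): "associating",
    ("associating", "joined"): "associated_no_ip",
    ("associated_no_ip", "ip_ready"): "up",
    ("up", "bcnlost"): "degraded",
    ("degraded", "rx_frame"): "up",
    ("degraded", "gtk_plumbed"): "up",
    ("up", "link_down"): "reconnecting",
    ("degraded", "link_down"): "reconnecting",
}


def step(state: str, event: str) -> str:
    """Return the next link state; unknown (state, event) pairs leave it unchanged."""
    if event == "leave":
        return "down"  # "* -> down" applies from any state
    return TRANSITIONS.get((state, event), state)


def run(events, state: str = "down") -> str:
    """Fold a sequence of events over the table, starting from `state`."""
    for ev in events:
        state = step(state, ev)
    return state
```

For example, `run(["join", "joined", "ip_ready", "bcnlost", "gtk_plumbed"])` walks down → associating → associated_no_ip → up → degraded → up.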

6.3 SDPCM TX queue state machine

         [idle_no_credits]  (tx_seq == last_credit)
              | event/data RX updates last_credit
              ▼
         [has_credits]
              | cyw43_send_ioctl / _send_ethernet called
              | increment tx_seq
              | write frame to WLAN_FUNCTION
              ▼
         [waiting_credits_or_flow]
              | poll_device drains RX packets
              | each RX updates last_credit and wlan_flow_control
              ▼
         [has_credits]  (if last_credit != tx_seq and !wlan_flow_control)
              |
              ▼
         [idle]

Stall recovery. If waiting_credits_or_flow exceeds 1 s, the stall times out with WifiError.SdpcmCreditStall. During the wait, RX drain processes events only — no data RX callback invocation (reentrancy hazard — §2.4.6).

Credit arithmetic. tx_seq and last_credit are u8 with wraparound. hasCredit() = (last_credit -% tx_seq) != 0. The "accept only if credit_delta <= 20" check in the reference (cyw43_ll.c:845-848) protects against stale/misordered credit headers; replicate it.
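
A host-side model makes the u8 wraparound cases easy to check. This is a sketch under assumptions: `accept_credit` models the "credit_delta <= 20" acceptance rule, and `has_credit` applies the same 20-wide window to the send decision, which is one reading of the §7.1 credit table (the plain `!= 0` formula alone would report credit for tx_seq=100, last_credit=90). Both function names are hypothetical.

```python
MAX_CREDIT_DELTA = 20


def accept_credit(last_credit: int, delivered: int) -> int:
    """Return the new last_credit after seeing a delivered credit byte."""
    delta = (delivered - last_credit) & 0xFF  # u8 wrapping subtraction
    return delivered if delta <= MAX_CREDIT_DELTA else last_credit


def has_credit(tx_seq: int, last_credit: int) -> bool:
    """True when at least one credit is available within the accepted window."""
    delta = (last_credit - tx_seq) & 0xFF
    return 0 < delta <= MAX_CREDIT_DELTA
```

With these definitions a delivered credit of 0x05 against last_credit=0xFD is accepted (delta 8), while 0x10 against 0xF0 is rejected (delta 32) and last_credit stays at 0xF0.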

6.4 IOCTL dispatch state machine

              [send]
                 |  build SDPCM+CDC+payload, increment ioctl_id
                 |  wait for credits (§6.3)
                 |  transmit via bus.writeBytes
                 ▼
              [pending_response]
                 | poll_device in 1 ms tick up to 500 ms
                 |
                 ├─ RX CONTROL with matching ioc_id ─► copy response, [complete]
                 ├─ RX CONTROL with mismatch ─────── ─► drop, keep waiting
                 ├─ RX ASYNCEVENT ──────────────────► dispatch via events module
                 ├─ RX DATA ────────────────────────► dispatch via eth RX hook
                 └─ timeout 500 ms ────────────────── ─► [timeout]
                 ▼
              [complete] or [timeout]

Reentrancy discipline. The event dispatch inside pending_response is allowed to update driver state (including pend_rejoin flags) but must not call doIoctl synchronously. Reentrant-ioctl is forbidden. Events that want to trigger an ioctl use the pend-flag mechanism; the poll loop processes pends after returning from the outer ioctl.


Section 7 — Testing strategy

Three-layer test strategy: host, mock-transport, hardware.

7.1 Host-side tests (zig test)

Target host, not firmware. Exercises the decoder/encoder logic that doesn’t require hardware:

  1. Endianness round-trips. For each header kind (SDPCM, CDC, BDC, event wrapper), encode a struct → bytes → decode → compare fields. Specifically:
    • SDPCM header with size=0x1234: bytes should be {0x34,0x12,0xcb,0xed,...} (size_com = ~size & 0xffff).
    • CDC header: cmd=263, len=0x18, flags=0xABCD0002, status=0.
    • Event wrapper: event_type=69 encoded as BE {0x00,0x00,0x00,0x45}.
  2. IE walker. Feed: (a) empty IE list, (b) single RSN IE (type=48), (c) WPA vendor-specific IE (type=221 with \x00\x50\xF2\x01), (d) WEP via capability bit, (e) malformed length field that overruns buffer. Verify auth_mode output matches reference bit-encoding.
  3. Credit arithmetic. hasCredit(tx_seq, last_credit) for: (0,0)=false, (0,1)=true, (255,0)=true, (100,120)=true, (100,90)=false (requires the 20-delta accept rule). Named wrap tests (mandatory; separate from the basic table):
    • credit_wrap_forward: tx_seq=0xFE, then four successive sends. Credits must be accepted as {0xFF, 0x00, 0x01, 0x02} in sequence without a false stall.
    • credit_wrap_stale: a delivered credit of 0x05 when last_credit=0xFD. credit_delta = 0x05 -% 0xFD = 0x08 (8 decimal), which is <= 20, so the rule accepts it.
    • credit_wrap_stale_reject: a delivered credit of 0x10 when last_credit=0xF0 and tx_seq=0x00. The delta is 0x20 (32 decimal), which is > 20, so it must be rejected.
  4. Join state machine. Drive a synthetic event stream and verify JoinState + three flags (auth_ok, join_ok, keyed) match the §6.1 event-to-transition table. Sequences to test:
    • Happy path: AUTH ok → JOIN ok → PSK_SUP KEYED → joined.
    • BADAUTH then AUTH ok recovery.
    • PRUNE RSN-mismatch → WPA1 fallback → rejoin.
    • DEAUTH_IND reason=2 → pend_disassoc → disassoc issued.
    • PSK_SUP timeout (status=4 r=15) → pend_rejoin.
    • Out-of-order: JOIN arrives before AUTH.
    • PSK_SUP reason=14 (roam noise) — must NOT trigger rejoin. Regression guard against the bug Embassy documented at runner.rs:1216.
    • WPA3 AUTH FAIL reason=16 auth_type=3 (with wpa3_mode != .off) — link-down, not badauth-retry. Under .auto + wpa3_wpa2_aes_psk AuthType, triggers WPA2 fallback attempt per §5.10.
    • Crypto drift class: single ICV_ERROR, MIC_ERROR, UNICAST_DECODE_ERROR each independently trigger pend_rejoin. 3 × MULTICAST_DECODE_ERROR within 5 s triggers pend_rejoin; 2 events within 5 s does NOT (boundary).
    • PSM_WATCHDOG single event triggers pend_rejoin + stats increment.
    • GTK_PLUMBED updates last_healthy_ms counter but does not change JoinState or flags.
    • BCNLOST_MSG while up → degraded; next successful RX → up.
  5. Event decoder on captured payloads. 10–20 hex-encoded event payloads captured from real hardware, decoded and asserted. Tests belong in src/cyw43_new/tests/events.zig. Because these captures don't exist at Phase 1 start, §7.1.1 below specifies the procedure for producing them in Phase 2.

§7.1.1 Event-payload capture procedure (Phase 2 deliverable)

Once Phase 2 lands the event pipeline (§6.1 decoder), shadow-mode logging on real hardware produces the hex captures that §7.1 item 5 needs for regression testing.

Capture tool: add to src/cyw43_new/tests/hardware_bringup.zig (the Phase 1 -Dcyw43=new_shadow test exe) an event-dump mode:

// Compile with -DCAPTURE_EVENTS=1 to enable.
// Dumps each event as a hex line over UART:
//   EVT_HEX: aabbccdd... <event_name> <status=N reason=M flags=F>

fn onEventDump(ctx: *anyopaque, ev: *const Event, raw_wrapper_bytes: []const u8) void {
    _ = ctx;
    console.puts("EVT_HEX: ");
    for (raw_wrapper_bytes[0..@min(raw_wrapper_bytes.len, 48)]) |b| {
        console.putHex8(b);
    }
    console.puts(" ");
    console.puts(@tagName(std.meta.activeTag(ev.*)));
    console.puts("\n");
}

Capture scenarios (run each, save picocom UART log):

| Scenario | Expected events observed |
|---|---|
| Scan-only | ESCAN_RESULT × N, CSA_COMPLETE_IND |
| WPA2 clean join | SET_SSID, AUTH, JOIN, LINK, PSK_SUP KEYED |
| WPA2 wrong password | SET_SSID, AUTH FAIL, DEAUTH_IND reason=2 |
| SSID not found | SET_SSID status=3 reason=0 |
| WPA3 join | AUTH (SAE), SET_SSID, JOIN, LINK, PSK_SUP KEYED |
| Router power-cycle | DEAUTH / DISASSOC, BCNLOST_MSG, then reconnect sequence |
| Forced hostapd DEAUTH | DEAUTH_IND reason (varies by AP) |
| Power-save ICV_ERROR | repeated ICV_ERROR (49), MIC_ERROR possibly, after induced rekey-miss |
| Roam between APs | ROAM, ROAM_PREP, ROAM_START, possibly PSK_SUP reason=14 |

Each scenario produces one or more EVT_HEX: lines. Extract into tests/fixtures/events/*.hex:

# tests/fixtures/events/wpa2_join_set_ssid_ok.hex
# Event: SET_SSID, status=0 (success)
# Captured: 2026-05-15, Pico W against Asus RT-AX88U running WPA2-PSK
00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...

Regression tests in src/cyw43_new/tests/events.zig then:

test "SET_SSID success decodes cleanly" {
    const hex = @embedFile("fixtures/events/wpa2_join_set_ssid_ok.hex");
    const bytes = try parseHex(hex);
    const ev = try events.decode(bytes);
    try std.testing.expect(ev == .set_ssid);
    try std.testing.expectEqual(@as(u32, 0), ev.set_ssid.status);
}

This gives us a regression net over every real-world event shape we've observed. Captures accumulate across phases; by end of Phase 3 we have a corpus covering every event in §2.2's handler table.

  6. SPI command word packing. packCmd(true, true, 2, 0, 64) should produce 0xE0000040 (write in bit 31, incr in bit 30, fn=2 in bits 29:28, size 64 in the low bits). The pre-mode-switch swapped form, applying the {b1,b0,b3,b2} swap of §7.2.4 to the little-endian bytes {0x40,0x00,0x00,0xe0}, should produce {0x00,0x40,0xe0,0x00}.

All of these live in their respective .zig module’s test "..." blocks. The host build already supports zig build test and has the 78+ nanoruby tests as baseline (§11 gates).
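
The expected byte patterns in item 1 can be cross-checked with Python's `struct` module before the Zig encoders exist. This is a sketch, not the driver's encoder: the function names are hypothetical, and the CDC layout of four little-endian u32 fields (cmd, len, flags, status) is an assumption taken from the field list in item 1.

```python
import struct


def sdpcm_size_fields(size: int) -> bytes:
    """First four SDPCM header bytes: size and its one's complement, little-endian."""
    return struct.pack("<HH", size, ~size & 0xFFFF)


def cdc_header(cmd: int, length: int, flags: int, status: int) -> bytes:
    """CDC header as four little-endian u32 fields (assumed layout)."""
    return struct.pack("<IIII", cmd, length, flags, status)


def event_type_be(event_type: int) -> bytes:
    """Event wrapper event_type field, big-endian u32."""
    return struct.pack(">I", event_type)
```

A round-trip test then unpacks each result with the same format string and compares fields, which is the shape item 1 asks for.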

7.2 Mock-transport integration tests

A Zig mock implementation of the Spi interface in §5.1 that plays back canned transaction logs. This is the wire-format conformance gate parallel to byte-identity.

7.2.1 Log format

Plain-text, line-oriented, #-prefixed comments. Each line is one SPI transaction seen from the host side:

# format: <direction> <fn> <addr>  <len>  <hex-payload>   [# comment]
tx  0   0x0014    4   00000000                # read SPI_TEST_REG command word (pre-mode-swap)
rx  0   0x0014    4   adbeedfe                # device returns FEEDBEAD (byte-swapped)
tx  0   0x0000    4   40000120                # SPI_BUS_CONTROL write (mode switch)
tx  1   0x1000e   1   08                      # backplane: request ALP
rx  1   0x1000e   1   48                      # readback: ALP available (bit 6)

Conventions:

  • direction: tx (host → CYW43) or rx (CYW43 → host).
  • fn: function 0/1/2 per §2.4.1.
  • addr: hex, 0x-prefixed.
  • len: decimal bytes.
  • hex-payload: lowercase hex, no spaces, no 0x prefix. For tx this is what the host is expected to send; for rx it is what the mock will return.
  • Lines beginning with # or blank: ignored.

Groups of transactions representing a single logical operation are separated by a ## <label> line:

## bus_init: check test register
tx  0  0x0014  4  00000000
rx  0  0x0014  4  adbeedfe

## bus_init: mode switch
tx  0  0x0000  4  40000120

Labels appear in assertion-failure messages so diff output is navigable.
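
A parser for this format is short enough to sketch here. The function name `parse_mock_log` and the dictionary keys are hypothetical; the field order, comment handling, and `##` group labels follow the conventions listed above.

```python
def parse_mock_log(text: str) -> list:
    """Parse the line-oriented mock log into a list of transaction dicts."""
    entries, label = [], ""
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("##"):
            label = line[2:].strip()  # group label applies to following lines
            continue
        if not line or line.startswith("#"):
            continue                  # blank line or comment
        fields = line.split("#", 1)[0].split()  # drop trailing comment
        if len(fields) != 5:
            raise ValueError(f"line {lineno}: expected 5 fields, got {len(fields)}")
        direction, fn, addr, length, payload = fields
        if direction not in ("tx", "rx"):
            raise ValueError(f"line {lineno}: bad direction {direction!r}")
        data = bytes.fromhex(payload)
        if len(data) != int(length):
            raise ValueError(f"line {lineno}: len {length} != payload {len(data)}")
        entries.append({
            "direction": direction, "fn": int(fn), "addr": int(addr, 16),
            "len": int(length), "bytes": data,
            "label": label, "source_line": lineno,
        })
    return entries
```

Keeping `source_line` and `label` on every entry is what lets a mismatch message point back at the exact golden-log line.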

7.2.2 Golden transcripts needed

| Path | Captured from | Scope |
|---|---|---|
| tests/golden/boot_bus_init.log | old driver | SPI test reg through CLM load + WLC_UP |
| tests/golden/scan_escan.log | old driver | escan iovar + all-channels results |
| tests/golden/join_wpa2_psk.log | old driver | Full WPA2 join: passphrase → SET_SSID → PSK_SUP KEYED |
| tests/golden/join_wpa3_sae.log | new driver | WPA3 SAE join (Phase 3 deliverable; captured from new driver in shadow mode) |
| tests/golden/reconnect_icv_error.log | new driver | Induced power-save ICV_ERROR + auto-rejoin |
| tests/golden/reconnect_after_deauth.log | new driver | Hostapd-injected DEAUTH + auto-rejoin |
| tests/golden/pmksa_clear_on_boot.log | new driver | pmkid_info iovar at end of wifiOn |
| tests/golden/eth_tx_tcp_ack.log | old driver | Single TCP ACK via BDC |

Phase 1 produces the first three. Phase 2/3 produce the rest.

7.2.3 Logic-analyzer capture procedure

Hardware: Saleae Logic 2 (16-channel, 24 MHz minimum) is the reference tool. DSLogic, sigrok with fx2lafw, or any 4+ channel analyzer ≥ 50 MS/s also works.

Pin mapping on Pico W (CYW43 SPI + control lines):

| Signal | GP# | Analyzer channel |
|---|---|---|
| WL_CLK | GP29 (adapt if board differs) | Ch0 |
| WL_DIO (bidirectional) | GP24 | Ch1 |
| WL_CS | GP25 | Ch2 |
| WL_IRQ (= WL_SDIO_1) | GP24 when CS high | (shared with DIO) |
| WL_REG_ON | GP23 | Ch3 |
| GND | | probe ground |

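Sample rate: 24 MS/s minimum. A logic analyzer needs roughly 4 samples per SPI clock period to decode reliably, so 24 MS/s only suffices when the PIO-SPI clock is slowed to about 6 MHz or below for the capture. Pico W PIO-SPI runs at 33 MHz theoretical max; capturing at that rate needs 125 MS/s or more.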

Trigger: rising edge on WL_REG_ON channel. Captures the full bring-up from chip power-on.

Capture duration:

  • boot_bus_init: ~600 ms after trigger (covers firmware upload and CLM load).
  • scan_escan: ~3 s (active scan + 2.4G result delivery).
  • join_wpa2_psk: ~5 s (association + 4-way handshake).
  • reconnect_*: ~30 s with induced disruption in middle.

Export: File → Export data → CSV, "Time" + all channels, "Full precision time". Column order Time [s], WL_CLK, WL_DIO, WL_CS, WL_REG_ON.

7.2.4 tools/spi_trace_to_mock.py specification

Phase 1 deliverable. Python 3.9+, standard library only. CLI:

spi_trace_to_mock.py <input.csv> <output.log>
  [--start-offset <seconds>]    # skip initial noise
  [--label <prefix>]            # prefix every ## group label

Logic:

  1. Parse CSV, build a time-sorted event list.
  2. Detect gSPI command frames by CS-low + CLK sequences. The first 4 bytes on DIO after CS-low are the command word per §2.4.2.
  3. Parse command word: {write, incr, fn, addr, sz}. Follow with sz bytes of data (TX if write, RX if read — direction inferable from cmd-word bit 31).
  4. Emit a log line per transaction. Group with ## <label> at natural SPI inactivity gaps (>1 ms).
  5. Special-case pre-mode-switch byte-swap: if the SPI_BUS_CONTROL write hasn't happened yet, apply the {b1,b0,b3,b2} swap so the log reflects logical values, not wire-byte-order.

The tool is intentionally one-way (capture → log). Reverse direction ("log → test-driver") is the mock SPI framework itself, §7.2.5.
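
Steps 3 and 5 of the logic above reduce to two small helpers. This is a sketch under assumptions: the field widths (2-bit function, 17-bit address, 11-bit size) follow the reference driver's command-word layout rather than anything stated in this section, and the names `pack_cmd`, `parse_cmd`, and `swap_b1b0b3b2` are hypothetical.

```python
def pack_cmd(write: bool, incr: bool, fn: int, addr: int, size: int) -> int:
    """Build a 32-bit gSPI command word (assumed layout: 1/1/2/17/11 bits)."""
    return ((int(write) << 31) | (int(incr) << 30) | ((fn & 0x3) << 28)
            | ((addr & 0x1FFFF) << 11) | (size & 0x7FF))


def parse_cmd(word: int) -> dict:
    """Split a command word back into {write, incr, fn, addr, size}."""
    return {
        "write": bool((word >> 31) & 1),   # bit 31 gives the data direction
        "incr": bool((word >> 30) & 1),
        "fn": (word >> 28) & 0x3,
        "addr": (word >> 11) & 0x1FFFF,
        "size": word & 0x7FF,
    }


def swap_b1b0b3b2(data: bytes) -> bytes:
    """Swap the two bytes inside every 16-bit half-word: {b1,b0,b3,b2,...}."""
    out = bytearray(data)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)
```

The swap is its own inverse, so the converter can apply the same helper to both command words and payloads captured before the SPI_BUS_CONTROL write.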

7.2.5 Mock SPI framework sketch (Zig)

Concrete skeleton for the Phase 1 test harness. src/cyw43_new/tests/mock_spi.zig:

pub const MockSpi = struct {
    log_lines: []const LogLine,
    cursor: usize = 0,
    rx_buf_next: ?[]const u8 = null,  // staged for the next transferRx

    pub const LogLine = struct {
        direction: enum { tx, rx },
        fn_id: u8,
        addr: u32,
        len: u32,
        bytes: []const u8,
        label: []const u8 = "",
        source_line: u32,  // for assertion-failure messages
    };

    /// Implements the Spi vtable interface from §5.1.
    pub fn transferTx(ctx: *anyopaque, cmd_word: u32, payload: []const u8) WifiError!void {
        const self: *MockSpi = @ptrCast(@alignCast(ctx));
        if (self.cursor >= self.log_lines.len) return error.MockExhausted;
        const expected = self.log_lines[self.cursor];
        if (expected.direction != .tx) {
            std.debug.panic("mock: expected rx at line {d} ({s}), got tx",
                .{ expected.source_line, expected.label });
        }
        // Parse cmd_word; confirm fn/addr/len match expected.
        // Confirm payload bytes match expected.bytes; emit hex diff on mismatch.
        self.cursor += 1;
        // If next line is an rx paired with this tx (read command), stage its bytes.
        if (self.cursor < self.log_lines.len and self.log_lines[self.cursor].direction == .rx) {
            self.rx_buf_next = self.log_lines[self.cursor].bytes;
            self.cursor += 1;
        }
    }

    pub fn transferRx(ctx: *anyopaque, out: []u8) WifiError!void {
        const self: *MockSpi = @ptrCast(@alignCast(ctx));
        const staged = self.rx_buf_next orelse return error.MockUnexpectedRx;
        if (staged.len != out.len) return error.MockLenMismatch;
        @memcpy(out, staged);
        self.rx_buf_next = null;
    }

    pub fn asSpi(self: *MockSpi) Spi {
        return .{ .ctx = self, .vt = &mock_vtable };
    }
};

Usage in a test:

test "bus_init matches golden trace byte-for-byte" {
    const log = try parseLog(@embedFile("golden/boot_bus_init.log"));
    var mock = MockSpi{ .log_lines = log };
    const config = Config{
        .transport = mock.asSpi().ctx,
        .transport_vt = mock.asSpi().vt,
        // minimal hooks
    };
    var drv = try Driver.init(std.testing.allocator, config);
    defer drv.deinit();
    try drv.wifiOn(.worldwide);
    try std.testing.expectEqual(mock.log_lines.len, mock.cursor);  // all transactions consumed
}

Failure-mode assertion messages include the golden-log source line, the expected hex, the actual hex, the first byte position where they differ, and the preceding group label:

MockSpi mismatch at tests/golden/boot_bus_init.log:47 ("bus_init: mode switch")
  expected (host→chip, fn=0, addr=0x0000, len=4):
    40 00 01 20
  actual:
    40 00 01 40
             ^ diff at byte 3 (expected 0x20, got 0x40)
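The "first byte position where they differ" part of that message can be computed with a small helper. This is a minimal sketch, not the harness's actual code; the name `firstDiff` is hypothetical.

```zig
const std = @import("std");

/// Returns the index of the first differing byte, or null when the slices are identical.
/// A length mismatch counts as a difference at the end of the shorter slice.
pub fn firstDiff(expected: []const u8, actual: []const u8) ?usize {
    const n = @min(expected.len, actual.len);
    var i: usize = 0;
    while (i < n) : (i += 1) {
        if (expected[i] != actual[i]) return i;
    }
    return if (expected.len == actual.len) null else n;
}

test "firstDiff locates the mismatching byte" {
    const expected = [_]u8{ 0x40, 0x00, 0x01, 0x20 };
    const actual = [_]u8{ 0x40, 0x00, 0x01, 0x40 };
    try std.testing.expectEqual(@as(?usize, 3), firstDiff(&expected, &actual));
    try std.testing.expectEqual(@as(?usize, null), firstDiff(&expected, &expected));
}
```

Returning an optional index lets the caller distinguish "identical" from "differs at byte 0" without a sentinel value.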

7.2.6 Fixture metadata schema (mandatory sidecar)

Every golden trace and event-payload fixture carries a sidecar metadata file so future contributors can tell whether the fixture is still valid after a blob upgrade or hardware change.

Path convention: alongside the fixture, with .meta suffix.

tests/golden/boot_bus_init.log
tests/golden/boot_bus_init.log.meta

Metadata schema (plain-text key=value, parseable by trivial tools):

# tests/golden/boot_bus_init.log.meta
capture_date = 2026-06-02
scenario = boot_bus_init
firmware_sha256 = <sha256 of src/cyw43/firmware/43439A0_combined.bin at capture time>
firmware_version = 7.95.61
nvram_sha256 = <sha256 of src/cyw43/firmware/43439A0_nvram.bin>
clm_blob_len = 984
board = pico_w_revA
board_rev = 1.3
host_cpu_freq_mhz = 125
logic_analyzer = Saleae Logic 2 Pro 16, sw v2.4.17, HW rev E
sample_rate_mhz = 24
capture_duration_ms = 620
trigger = rising_edge on WL_REG_ON
driver_source_sha = <git rev-parse HEAD of pico repo when capture taken>
driver_config = -Dcyw43=old -Dengine=js  (or whatever was running)
logger_enabled = true
notes = First Phase 1 capture; baseline boot sequence.

Mandatory fields (test harness refuses to load a fixture without these): capture_date, firmware_sha256, nvram_sha256, driver_source_sha, scenario, driver_config, logger_enabled.

Validation on replay: before running a test that uses a fixture, the harness reads .meta and asserts firmware_sha256 / nvram_sha256 match current blob SHAs. If a blob was upgraded without regenerating fixtures, the test fails loudly with a clear message (not silently passes against stale data).
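A sketch of how the harness might read the sidecar and apply both checks (mandatory fields, then SHA comparison). The function names `metaValue`, `firstMissing`, and `blobMatches` are hypothetical, and the caller is assumed to have already computed the current blob's SHA-256 as a lowercase hex string.

```zig
const std = @import("std");

/// Fields the harness refuses to run without (list taken from the schema above).
const mandatory = [_][]const u8{
    "capture_date", "firmware_sha256", "nvram_sha256", "driver_source_sha",
    "scenario",     "driver_config",   "logger_enabled",
};

/// Looks up `key` in plain-text `key = value` metadata; lines starting with '#' are comments.
pub fn metaValue(text: []const u8, key: []const u8) ?[]const u8 {
    var lines = std.mem.splitScalar(u8, text, '\n');
    while (lines.next()) |raw| {
        const line = std.mem.trim(u8, raw, " \t\r");
        if (line.len == 0 or line[0] == '#') continue;
        // Split at the first '=' only, so values such as "-Dcyw43=old" survive intact.
        var eq: usize = 0;
        while (eq < line.len and line[eq] != '=') : (eq += 1) {}
        if (eq == line.len) continue;
        const k = std.mem.trim(u8, line[0..eq], " \t");
        if (std.mem.eql(u8, k, key)) return std.mem.trim(u8, line[eq + 1 ..], " \t");
    }
    return null;
}

/// Returns the first mandatory field that is missing or empty, or null if all are present.
pub fn firstMissing(text: []const u8) ?[]const u8 {
    for (mandatory) |key| {
        const value = metaValue(text, key) orelse return key;
        if (value.len == 0) return key;
    }
    return null;
}

/// Replay gate: the recorded hash must equal the hash of the blob in the tree today.
pub fn blobMatches(text: []const u8, key: []const u8, current_sha_hex: []const u8) bool {
    const recorded = metaValue(text, key) orelse return false;
    return std.mem.eql(u8, recorded, current_sha_hex);
}
```

Reporting the name of the missing field, rather than a bare boolean, is what lets the harness fail with a message a future maintainer can act on.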

This closes the R17 feedback loop: blob upgrades CANNOT silently invalidate wire-format tests. It also keeps the corpus readable for the next maintainer two years from now.

7.2.7 Why this is the primary wire-correctness gate

Byte-identity (§7.4) only catches regressions in the -Dcyw43=old build — it doesn't validate the new driver. The mock replay validates:

  • Every ioctl's wire payload bytes exactly.
  • Every iovar name + suffix bytes.
  • Every SDPCM/CDC/BDC header field.
  • Endianness at every layer.
  • Sequence-number and credit arithmetic.
  • Command-word packing (including pre-mode-switch byte-swap).

If a Zig translation has an off-by-one in any of these, replay fails loudly with hex diff. This is the gate that turns R1 (translation bug) from "medium-probability silent failure" into "low-probability loud failure" — dramatically cheaper to debug.
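To make "command-word packing" and "endianness at every layer" concrete, here is a minimal sketch of the kind of encoder the replay exercises. The bit layout is an assumption taken from the reference driver's `make_cmd` helper (write, auto-increment, function, 17-bit address, 11-bit length); `packCmd` and `cmdBytes` are hypothetical names, and the pre-mode-switch swap is deliberately left out.

```zig
const std = @import("std");

/// Packs a gSPI command word. Assumed layout (mirrors the reference's make_cmd):
/// bit 31 write, bit 30 auto-increment, bits 29..28 function,
/// bits 27..11 address, bits 10..0 length.
pub fn packCmd(write: bool, incr: bool, func: u2, addr: u17, len: u11) u32 {
    return (@as(u32, @intFromBool(write)) << 31) |
        (@as(u32, @intFromBool(incr)) << 30) |
        (@as(u32, func) << 28) |
        (@as(u32, addr) << 11) |
        @as(u32, len);
}

/// Serialises the word with an explicit byte order, so host endianness never leaks onto the wire.
pub fn cmdBytes(word: u32, endian: std.builtin.Endian) [4]u8 {
    var out: [4]u8 = undefined;
    std.mem.writeInt(u32, &out, word, endian);
    return out;
}

test "a shifted field shows up as a byte diff" {
    const word = packCmd(true, true, 0, 0x18, 4);
    try std.testing.expectEqual(@as(u32, 0xC000C004), word);
    const le = cmdBytes(word, .little);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x04, 0xC0, 0x00, 0xC0 }, &le);
}
```

Using `u2`, `u17`, and `u11` for the fields means an out-of-range value is rejected by the type system instead of silently bleeding into a neighbouring field.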

7.3 Hardware validation matrix

Must-pass scenarios before flipping default (§8.4). Run each on real Pico W with the NVRAM/CLM version committed to this repo. All must pass clean (no UART corruption beyond ISSUES.md #25 tolerance; no watchdog reset; no driver panic).

| #   | Scenario | Steps | Pass criteria |
|-----|----------|-------|---------------|
| H1  | Fresh join | Power-cycle Pico; join(ssid, key) | DHCP up within 15 s; MQTT broker ping works |
| H2  | Re-join after router reboot | H1 succeed; reboot router; wait for AP to come back | Driver auto-reconnects within 60 s of AP return |
| H3  | Re-join after explicit DEAUTH | H1 succeed; send DEAUTH from hostapd test script | Auto-reconnect within 10 s |
| H4  | Sustained traffic soak | H1 + MQTT pub every 10 s for 60 min | No UART burst beyond ISSUES.md #25 tolerance; MQTT uninterrupted |
| H5  | DHCP lease renewal | H1 + let lease expire and renew (~50% of lease time) | Lease renews without reconnect |
| H6  | Roam between APs on same SSID | Two APs same SSID, power one off mid-session | Driver roams or at worst reconnects |
| H7  | Bad password | join(ssid, "wrong") | Returns JoinBadAuth within 5 s; no retry spam |
| H8  | SSID not found | join("nonexistent", "x") | Returns JoinNoNetwork within 10 s |
| H9  | Scan during idle | scan(.{}) with running TCP session | Scan completes without disrupting session |
| H10 | Power-save cycle | Set PM2, idle 30 s, then TX | First TX after idle wakes chip cleanly |
| H11 | WPA3 (if enabled) | H1 but WPA3 SSID | Up within 15 s |
| H12 | ISSUES.md #25 diagnostic | H4 + exhaustive event logging | Event type driving the burst is logged |

7.4 Regression gates (parallel to the matrix)

Every commit in the Phase 1–4 migration must pass:

  • -Dengine=js UF2 byte-identical to .preflight-baseline/pico-preintegration.uf2. Verify: sha256sum zig-out/firmware/pico.uf2 against the pinned hash.
  • strings zig-out/firmware/pico.uf2 | grep -cE 'std\.debug\.print|std\.Io\.File|std\.Io\.Threaded' returns 0.
  • All 78+ existing nanoruby host tests still pass.
  • zig build test exercises new driver’s host tests.
  • zig build test-hal, zig build test-uart, zig build test-main still green.

Byte-identity scope statement (per GPT-5.4 landmine A). Byte-identity holds when:

  • Zig version: exactly 0.16.0 per build.zig.zon.
  • -Dcyw43=old is the default during Phase 1–3 (so the "default" shipping build keeps hitting the old tree).
  • Build-selection isolation: the -Dcyw43=old -Dengine=js build traverses the same old-driver source set as the pre-rewrite baseline and produces identical UF2 bytes. New-driver code (src/cyw43_new/) must be unreachable from the old-driver build selection — not merely guarded behind an if, but not present at all in the root source graph of the old build.
  • Linker script, boot2, firmware blobs unchanged.

The build-selection isolation rule is the enforceable invariant. Any mechanism that achieves it (separate root files per build flag, module gating, or anything equivalent to how docs/NANORUBY.md §A2 separates -Dengine=js from -Dengine=ruby) is acceptable.

Byte-identity does not hold (and is not expected to hold) for:

  • -Dcyw43=new build (that’s the point).
  • Any engine other than js.
  • Builds with different -D flags (-DSSID, -DUSB_HOST, etc.) — those have their own baselines (see M2 entry in src/ruby/nanoruby/UPSTREAM.md).

Section 8 — Migration plan

Four phases. Each phase is landable as a small sequence of commits; inter-phase boundaries are hardware-verified checkpoints.

8.0 Cross-phase infrastructure and peer-review discipline

Before Phase 1 starts, the next session should understand the infrastructure decisions that apply across all phases.

8.0.1 build.zig integration — concrete pattern

The -Dcyw43=old|new|new_shadow flag is added with the same root-source-file selection pattern that docs/NANORUBY.md M2 used for the engine gate. Build-selection isolation (per §7.4) is achieved by making the root source file depend on the flag, not by conditional imports inside a shared root file — this is what guarantees byte-identity under -Dcyw43=old.

Concrete sketch to adapt (not literal final code — build.zig is 22 KB and has existing structure):

// build.zig additions

const Cyw43 = enum { old, new, new_shadow };

const cyw43_sel = b.option(
    Cyw43,
    "cyw43",
    "CYW43 driver selection: old (shipping), new (Phase 4 cutover), new_shadow (both compiled for dev)"
) orelse .old;

// Expose to source as a build_config import (mirror of nanoruby pattern)
const build_options = b.addOptions();
build_options.addOption(Cyw43, "cyw43", cyw43_sel);
fw_mod.addImport("build_config", build_options.createModule());

// Root source file selection — mirrors nanoruby -Dengine gate.
// This is what preserves byte-identity for -Dcyw43=old.
const fw_root = switch (cyw43_sel) {
    .old => b.path("src/main.zig"),              // current, untouched
    .new => b.path("src/main_cyw43_new.zig"),    // wires Driver to bindings/wifi.zig
    .new_shadow => b.path("src/main.zig"),       // default root; new tree is reachable
                                                 // only via a dedicated test exe, see below
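};
// fw_root then replaces the hard-coded root_source_file where fw_mod is created
// (existing build.zig code, not shown in this sketch).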

// For .new_shadow, add a dedicated test exe that instantiates the new driver
// without swapping the production path:
if (cyw43_sel == .new_shadow) {
    const test_new = b.addExecutable(.{
        .name = "test-cyw43-new",
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/cyw43_new/tests/hardware_bringup.zig"),
            .target = fw_target,
            .optimize = fw_optimize,
        }),
    });
    b.installArtifact(test_new);
}

Consumer pattern (in source code):

const build_config = @import("build_config");
// compile-time branch; zero runtime cost
if (build_config.cyw43 == .new) {
    // use new driver
}

Key rules:

  1. The .old arm's source graph is identical to pre-rewrite — no if (cyw43 == .old) branches inside main.zig itself. The flag only gates root-source-file at the build step. This is what makes byte-identity trivially preserved.
  2. .new_shadow keeps .old as the production root and adds a dedicated test executable. Both drivers get compiled (for their respective trees), but the shipping UF2 still uses old.
  3. .new switches the root source file to a version that wires the new driver to bindings/wifi.zig. Only flipped at Phase 4 cutover.

8.0.2 Peer-review checkpoints

The user-ai MCP's discuss tool with conversation_id: pico-cyw43-rewrite-plan-2026 is the designated peer review channel (see §1 Companion references). Invoke the peer at every checkpoint below, not just when stuck. A quick 2–3-sentence status summary is enough to trigger a sanity check; the peer has the full plan history in the conversation.

| Checkpoint | Requirement | What to review |
|------------|-------------|----------------|
| Before Phase 1 starts | REQUIRED | Scope-interpretation sanity check. Does the next AI understand §8.1 as intended? |
| Before Phase 1 PR merge | Conditional (invoke if any deviation from §2.4.4 bring-up sequence or new-hardware-state observed) | Transport + boot working on hardware. |
| Before Phase 2 starts | Conditional (invoke if mock-framework design differs from §7.2.5 sketch) | Strategy for event-decoder tests with Phase-2 captures (§7.1 item 5). |
| Before Phase 2 PR merge | REQUIRED | ISSUES.md #25 findings; the unknown-event type identified (or falsification). |
| Before Phase 3 starts | REQUIRED | Three-flag model interpretation; reconnect policy defaults sane; rejoin-storm coalescing §6.1.5 implemented. |
| After PMKSA research re-read (P3b start) | Conditional (invoke if iovar returns non-zero on our blob; §5.9.4 fallback kicks in) | docs/CYW43-PMKSA-RESEARCH.md re-read; iovar response on our blob tested first. |
| After WPA3 wire-spec cross-ref (P3c start) | Conditional (invoke if deviating from Embassy PR #3323 pattern) | §5.10 wire-level spec cross-checked against Embassy. |
| Before Phase 4 cutover | REQUIRED | Full hardware matrix §7.3 complete. All go/no-go §11 checkboxes green. Last-look. |
| On any unexpected hardware behavior | Conditional (invoke BEFORE adding a workaround) | Post the symptom and proposed workaround to the peer. |

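Required checkpoints (4): Phase-1 start, Phase-2 PR merge, Phase-3 start, Phase-4 cutover. Conditional checkpoints (5): Phase-1 PR merge, Phase-2 start, the PMKSA research re-read, the WPA3 wire-spec cross-reference, and any unexpected hardware behavior. A conditional checkpoint can be skipped when its trigger does not fire (the PR merges without deviation, or a re-read aligns with prior findings). Required checkpoints are hard gates; conditional ones are escape hatches when risk is elevated.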

Peer review is cheap; it has caught real bugs in this plan (the PSK_SUP reason=14 bug, the time-boxed-PMKSA ambiguity, the too-brittle byte-identity rule, the WPA3 bool-vs-enum conflation, the rejoin-storm livelock risk). Budget ~20 min per required checkpoint.

8.0.3 PIO SPI carryover

Our existing src/cyw43/transport/pio_spi.zig (406 lines) is hardware-proven — the chip bring-up sequence in §2.4.4 works on real Pico W through this transport. Phase 1 ports it with cleanups, not a rewrite:

  • Keep verbatim: PIO program assembly (the CYW43 gSPI timing is fixed; our PIO instructions are known-good).
  • Keep verbatim: Pin-mapping constants and GPIO setup.
  • Clean: any std.debug.print → Logger; anyerror returns → explicit WifiError; raw pointers → slices per §3.1.
  • Rename: the new home is src/cyw43_new/transport/pio_spi.zig with the same public surface (transferRx/transferTx/setPolarity/reset) but accessed through the Spi vtable interface (§5.1) rather than as module-global functions.
  • Preserve: the specific setPolarity(self, 0) call after mode-switch (cyw43_spi.c:84 equivalent). Do not optimise away — this is part of the documented bring-up.

This is intentionally ~80% line-for-line preservation, not a clean-room reimplementation. The PIO timing is firmware-and-hardware-defined; the existing code is the canonical Zig version of it. Rewriting from scratch would invite regressions in a well-tested component.

8.0.4 No optimization before parity

Rule: through end of Phase 2 (wire-byte parity established via golden-trace regression), no throughput or latency micro-optimizations are merged.

Allowed in P1/P2:

  • Carrying forward proven optimizations from the existing tree (e.g. the PIO SPI program — hardware-tuned, don't touch).
  • Compile-time constant folding that falls out naturally from comptime usage.
  • Dead-code elimination from unused feature flags.

NOT allowed in P1/P2:

  • SDPCM credit prediction / speculative TX.
  • Backplane-window batching beyond what §2.4.3 specifies.
  • Bus bypass shortcuts that skip ioctl-response wait.
  • Custom-written memcpy/memset variants (Zig @memcpy/@memset only).
  • Rewriting the PIO SPI program for "faster" timing.

Rationale: a bug introduced by a performance optimization is at least 10× harder to debug than one introduced by a protocol translation. P1/P2 is about establishing correctness; P3+ can consider perf if measurement demands it. In practice, the reference C driver's perf profile is adequate for Pico W workloads and our Zig port should match it; further optimization is premature.

P4 soak may reveal specific bottlenecks — at that point a performance PR is acceptable with: (a) benchmarks showing the delta, (b) golden trace regression still passing, (c) no change to semantic behavior.


8.1 Phase 1 — Parallel tree scaffolding + transport + boot

Scope:

  • Create src/cyw43_new/ tree.
  • Add -Dcyw43=old|new|new_shadow build option; old is default.
    • old = current src/cyw43 path; must remain byte-identical.
    • new_shadow = both drivers compiled; old still wired to bindings/wifi.zig; new driver exposed via zig build test-cyw43-new for off-line exercise.
    • new = new driver wired to bindings/wifi.zig (reserved for Phase 4).
  • Implement transport/spi.zig, transport/pio_spi.zig, bus/regs.zig, bus/cmd.zig, bus/bus.zig, bus/backplane.zig, hal.zig, config.zig, errors.zig, firmware.zig, types.zig.
  • Implement ll/boot.zig + ll/clm.zig + ll/power.zig through the CLM-load point.
  • Implement mock SPI transport (host-side, under src/cyw43_new/tests/).
  • Capture golden SPI traces from old driver: boot/bus_init.
  • Host-interrupt-pin behavioral verification (added per GPT-5.4 turn-6 review — this is exactly where "works in happy path" drivers get flaky later). Phase 1 validates on real hardware:
    • Polarity: on Pico W, WL_HOST_WAKE (via CYW43_PIN_WL_IRQ in reference; effectively a shared line with WL_DIO when CS is high) is active at what logical level? Reference says host_interrupt_pin_active = 0 for SPI mode (cyw43_ll.c:86). Verify via scoped capture of WL_IRQ during ioctl response: interrupt assertion should be clearly observable with the documented polarity.
    • Level vs edge: our Zig driver polls via cb_read_host_interrupt_pin() returning the current level (not edge-triggered). Verify that polling behavior tolerates missed transitions — i.e. if WL_IRQ goes active and back to idle between two has_work() checks, the subsequent poll must still discover the pending packet via SPI_INTERRUPT_REGISTER / SPI_STATUS_REGISTER read. This is what cyw43_ll.c:1007-1063 handles via the spi_int check when the pin itself isn't currently asserted. Our port must preserve this fallback.
    • Missed-interrupt tolerance: deliberately induce a scenario where WL_IRQ fires between polls (e.g. rapid ioctl sequence). Confirm the had_successful_packet + SPI_INTERRUPT_REGISTER fallback pattern recovers without packet loss. Reference at cyw43_ll.c:1014-1064.
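The level-polled check with a register fallback, as described in the interrupt-pin bullets above, can be sketched as below. This is an illustration under stated assumptions, not the reference's exact control flow: `hasWork` and `FakeBus` are hypothetical names, the `0x0100` bit value for F2_PACKET_AVAILABLE is assumed from the reference driver's register definitions, and the `had_successful_packet` bookkeeping is omitted.

```zig
const std = @import("std");

/// Assumption: F2_PACKET_AVAILABLE bit in SPI_INTERRUPT_REGISTER, as named in the reference.
const F2_PACKET_AVAILABLE: u16 = 0x0100;

/// Returns true when the chip has something for the host.
/// The register is read only when the pin is idle, so the common case costs no bus traffic.
pub fn hasWork(pin_level: u1, active_level: u1, bus: anytype) bool {
    if (pin_level == active_level) return true;
    // Pin is idle: it may have pulsed between two polls, so ask the chip directly.
    return (bus.readSpiInterrupt() & F2_PACKET_AVAILABLE) != 0;
}

/// Host-side stand-in for the bus; counts register reads so tests can assert on them.
const FakeBus = struct {
    spi_int: u16,
    reads: usize = 0,

    fn readSpiInterrupt(self: *FakeBus) u16 {
        self.reads += 1;
        return self.spi_int;
    }
};

test "idle pin still discovers a pending packet" {
    var bus = FakeBus{ .spi_int = F2_PACKET_AVAILABLE };
    try std.testing.expect(hasWork(1, 0, &bus)); // pin idle, register says pending
    try std.testing.expectEqual(@as(usize, 1), bus.reads);
    try std.testing.expect(hasWork(0, 0, &bus)); // pin asserted, no register read needed
    try std.testing.expectEqual(@as(usize, 1), bus.reads);
}
```

Keeping the decision a pure function of the pin level and one register read makes the missed-transition case testable on the host before it is verified on hardware.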

Validation:

  • zig build -Dcyw43=old byte-identical to baseline.
  • zig build -Dcyw43=new_shadow compiles + links for firmware target.
  • zig build test includes new driver host tests; all pass.
  • Mock SPI replay of boot/bus_init matches golden byte-for-byte.
  • On hardware (via a dedicated test entry, not via bindings/wifi.zig): new driver can execute busInit end-to-end on Pico W, print OK, and halt. Takes ~400 ms.

Rollback: delete src/cyw43_new/ tree, remove build-option branch; -Dcyw43=old remains unchanged.

Commit size: ~1,300 LOC added across ~10 files. 3–4 commits.

Expected time: 1–2 coding sessions (best case ~8 h uninterrupted; realistic 12–16 h including trace-capture harness + hardware firmware-upload verification).

Dependencies: none (just the repo’s current state).

8.2 Phase 2 — Frame + ioctl + event pipeline + wifi_on

Scope:

  • Implement ll/frame.zig, ll/ioctl.zig, ll/events.zig, ll/scan.zig.
  • Implement ctrl/state.zig, ctrl/country.zig.
  • Implement wifiOn() (the full cyw43_ll_wifi_on sequence with broad event mask).
  • Implement exhaustive event decoder (all 16+ event types + unknown-logger).
  • Capture golden SPI traces: scan/escan_request, events/startup_event_stream.

Validation:

  • -Dengine=js still byte-identical (old driver untouched).

  • Mock-replay scan passes wire-for-wire.

  • New driver can: init, wifi_on, issue scan, log every event decoded.

  • Host unit tests pass for event decoder (§7.1 tests).

  • On hardware: new driver scan returns real AP list including correct auth_mode.

  • Event-mask effectiveness gate (P2-mandatory). The bsscfg:event_msgs write alone is not a guarantee of delivery. Verify by observing that each of the following known events is received by the decoder during a scripted sequence:

    • SET_SSID, AUTH, LINK, PSK_SUP (from a successful WPA2 join)
    • ESCAN_RESULT (from a scan)
    • DEAUTH_IND (induced via hostapd or iw disconnection on a test AP)
    • DISASSOC (from leave())

    If any expected event fails to arrive, debug the mask construction (bsscfg:event_msgs payload shape — name buffer length, bsscfgidx endianness, exact mask byte count for this firmware vintage) before proceeding. Broadcom iovars fail silently — this gate catches the silent-failure class.

  • Rate-limit unknown-event logging per §6.1.4 authoritative spec. Ring buffer bounded at 16 distinct {type,status,reason} tuples; 5-second coalescing window; UART output capped at ~1 line per distinct tuple per window plus one coalescing summary. Unbounded logging under a flood would itself perturb UART timing and muddy the ISSUES.md #25 diagnostic (which is exactly the kind of diagnostic the mechanism is meant to resolve — self-awareness matters here).

  • ISSUES.md #25 diagnostic progress: run a 30-min soak with new driver in shadow mode, collect event log, identify the 180-s-cadence event (likely roam-related). Findings go into UPSTREAM.md M-entry.
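The unknown-event rate limit in the list above (16 distinct tuples, 5-second window, one line per tuple per window) could be structured roughly as follows. This is a sketch only; §6.1.4 remains the authoritative spec, the names `Key` and `Coalescer` are hypothetical, and the timestamp is assumed to come from a monotonic millisecond clock.

```zig
const std = @import("std");

pub const Key = struct { ev_type: u32, status: u32, reason: u32 };

pub const Coalescer = struct {
    const slots_max = 16;
    const window_ms: u64 = 5_000;
    const Slot = struct { key: Key, count: u32 };

    slots: [slots_max]Slot = undefined,
    used: usize = 0,
    window_start_ms: u64 = 0,
    overflow: u32 = 0, // events dropped because all 16 slots were taken

    /// Returns true when the caller should emit a log line for this event.
    /// `now_ms` must be monotonic (never earlier than a previous call).
    pub fn record(self: *Coalescer, now_ms: u64, key: Key) bool {
        if (now_ms - self.window_start_ms >= window_ms) {
            // Window rolled over; a real logger would print the per-tuple counts
            // and the overflow total as its single summary line here.
            self.used = 0;
            self.overflow = 0;
            self.window_start_ms = now_ms;
        }
        for (self.slots[0..self.used]) |*slot| {
            if (std.meta.eql(slot.key, key)) {
                slot.count += 1;
                return false; // already logged once in this window
            }
        }
        if (self.used == slots_max) {
            self.overflow += 1;
            return false; // table full: count it, stay silent
        }
        self.slots[self.used] = .{ .key = key, .count = 1 };
        self.used += 1;
        return true;
    }
};
```

Because the table is a fixed array, the worst case under an event flood is 16 log lines plus one summary per window, with no allocation in the event path.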

Rollback: -Dcyw43=new_shadow remains experimental; all production builds unaffected.

Commit size: ~1,100 LOC added across ~7 files. 3–5 commits.

Expected time: 1–2 coding sessions (best case 16 h; event-pipeline bugs absorb time, realistic 20–30 h including mask-effectiveness debug).

Dependencies: Phase 1.

8.3 Phase 3 — Join / link / reconnect / PMKSA

Scope:

  • Implement ctrl/join.zig, ctrl/link.zig, ctrl/poll.zig.
  • Implement the full join state machine (§6.1) and deferred-action processing.
  • P3 reliability deliverables (ordered by value/cost per docs/CYW43-PMKSA-RESEARCH.md):
    • P3a — ICV_ERROR handler (highest value, 3 lines). Event 49 (CYW43_EV_ICV_ERROR) sets pend_rejoin = true and schedules poll. Addresses pico-sdk #2153. This lands as part of the exhaustive event decoder commit — no separate step; it's the specific event-49 branch. Verify on hardware: reproduce the #2153 power-save-induced ICV_ERROR flood scenario; observe that the first event 49 triggers rejoin within ~5 s instead of silent flood.
    • P3b — PMKSA clear_on_boot (mandatory, ~20 LOC). At end of wifiOn() after WLC_UP, send the 356-byte pmkid_info iovar (see docs/CYW43-PMKSA-RESEARCH.md for exact payload). State transitions to wifi_up_pmksa_cleared; join() requires that state. Verify on hardware via H2 (router power-cycle) A/B. If the iovar is unsupported on our blob vintage (unexpected per research), execute §5.9.4 fallback.
    • P3c — PMKSA cache_in_boot (stretch, optional). Implement host-side cache + DEAUTH eviction per §5.9.3. If time-constrained, defer to Phase B; clear_on_boot remains the Phase 4 default either way.
  • Add compat façade in cyw43.zig (§5.10).
  • Capture golden SPI traces: join/wpa2_psk, reconnect/after_deauth, reconnect/after_icv_error, pmksa/clear_on_boot.
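For P3b, the legacy payload shape stated in docs/CYW43-PMKSA-RESEARCH.md (a little-endian 32-bit count followed by 16 entries of 6-byte BSSID plus 16-byte PMKID, 356 bytes total) can be pinned down at compile time. A minimal sketch; the type and function names are hypothetical, and sending the buffer through the iovar path is not shown.

```zig
const std = @import("std");

/// One legacy cache entry: 6-byte BSSID then 16-byte PMKID (byte arrays only, so no padding).
const PmkidEntry = extern struct {
    bssid: [6]u8,
    pmkid: [16]u8,
};

/// Legacy `pmkid_info` payload: little-endian entry count, then 16 fixed slots.
const PmkidList = extern struct {
    npmk_le: [4]u8,
    entries: [16]PmkidEntry,
};

comptime {
    // 4 + 16 * (6 + 16) = 356; fail the build if the layout ever drifts.
    std.debug.assert(@sizeOf(PmkidEntry) == 22);
    std.debug.assert(@sizeOf(PmkidList) == 356);
}

/// Flush payload: every byte zero, so npmk == 0 and all slots are blank.
pub fn flushPayload() [@sizeOf(PmkidList)]u8 {
    return [_]u8{0} ** @sizeOf(PmkidList);
}

test "flush payload is 356 zero bytes" {
    const payload = flushPayload();
    try std.testing.expectEqual(@as(usize, 356), payload.len);
    try std.testing.expectEqual(@as(u32, 0), std.mem.readInt(u32, payload[0..4], .little));
}
```

Storing the count as `[4]u8` rather than `u32` keeps the struct alignment at 1, which is what guarantees the absence of padding between the count and the first entry.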

Validation:

  • -Dengine=js byte-identical (old driver untouched).
  • Mock-replay join/wpa2_psk matches.
  • New driver under -Dcyw43=new_shadow passes hardware matrix H1, H2, H3, H7, H8.
  • ICV_ERROR validation (P3a). Reproduce pico-sdk #2153 scenario: set PM to PM_POWERSAVE, idle for >2 minutes to trigger a rekey event, observe whether event 49 fires. New driver must detect the event, queue pend_rejoin, and re-establish the link within ~10 s. Old driver under the same setup shows the event-49 flood.
  • PMKSA validation (P3b). H2 (router power-cycle) demonstrably succeeds with the new driver AND demonstrably failed-or-degraded with the old driver on the same physical setup (A/B). Plus the clear-effectiveness negative proof from §11.3 (on-wire / AP-log / timing evidence that the clear actually took effect).
  • 60-min soak under MQTT workload: no watchdog, link stays up, auto-reconnect works.
  • Event-mask fully opened; all events decoded. Any remaining "unknown" events are logged with type+status+reason.
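The P3a behaviour validated above (event 49 sets a pending flag; the rejoin itself happens later from the poll loop) might look like this in outline. The names `Pending`, `onEvent`, and `takeRejoin` are hypothetical; the real state lives in the join state machine of §6.1.

```zig
const std = @import("std");

/// Event number for ICV_ERROR as cited in this plan.
const EV_ICV_ERROR: u32 = 49;

pub const Pending = struct {
    rejoin: bool = false,
    poll_scheduled: bool = false,
};

/// Event-context handler: only records intent; it must never issue an ioctl itself.
pub fn onEvent(pending: *Pending, event_type: u32) void {
    if (event_type == EV_ICV_ERROR) {
        pending.rejoin = true; // repeated events during a flood collapse into one flag
        pending.poll_scheduled = true;
    }
}

/// Poll-context consumer: returns true at most once per pending rejoin.
pub fn takeRejoin(pending: *Pending) bool {
    const wanted = pending.rejoin;
    pending.rejoin = false;
    pending.poll_scheduled = false;
    return wanted;
}

test "an event-49 flood yields a single rejoin" {
    var p = Pending{};
    onEvent(&p, EV_ICV_ERROR);
    onEvent(&p, EV_ICV_ERROR);
    onEvent(&p, EV_ICV_ERROR);
    try std.testing.expect(takeRejoin(&p));
    try std.testing.expect(!takeRejoin(&p));
}
```

A boolean flag rather than a counter is the point of the design: however many ICV_ERROR events arrive between two polls, the driver performs one rejoin.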

Rollback: revert to end-of-Phase-2 state. Hardware shipping keeps using old driver.

Commit size: ~700 LOC added across ~4 files. 3–4 commits.

Expected time: multi-session + hardware soak (best case 16 h coding + 1 week soak; realistic 2–3 sessions + 1–2 weeks hardware iteration).

Dependencies: Phase 2 complete and mock-soak-validated.

8.4 Phase 4 — Flip default, soak, retire old tree

Scope:

  • Change build default to -Dcyw43=new. bindings/wifi.zig now calls into the new driver via the compat façade.
  • Update .preflight-baseline/pico-preintegration.uf2 only when the flip happens (this is the one commit where byte-identity is intentionally broken; capture the new baseline).
  • After 2 weeks of daily use / soak validation without regressions: delete src/cyw43/ entirely.
  • At that point, the compat façade can start gradually thinning (optional Phase 5).

Validation:

  • Hardware matrix §7.3 passes against -Dcyw43=new as default.
  • 2-week soak: no regression reports, no watchdog, no reconnect failure that the old driver handled.
  • Commit adding the new baseline is reviewed against:
    • .text + .rodata size delta (expect ±5 KB; document whichever direction).
    • peak .bss delta.
  • Old src/cyw43/ deletion commit passes all gates on its own.

Rollback: revert the baseline-update commit and the default-flip commit. Old driver comes back.

Commit size: tiny (baseline update + 1-line build.zig change), but has existential risk — plan to ship on a Monday with picocom open.

Expected time: 1 coding session + 2 weeks elapsed soak.

Dependencies: Phase 3 passes all hardware matrix + 60-min soak.

8.5 Migration summary table

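| Phase   | Added LOC           | Time (realistic)                         | Default build | Risk                                 |
|---------|---------------------|------------------------------------------|---------------|--------------------------------------|
| 1       | ~1,300              | 1–2 sessions                             | old           | Low (parallel tree, no cutover)      |
| 2       | ~1,100              | 1–2 sessions                             | old           | Low                                  |
| 3       | ~700                | 2–3 sessions + 1–2 weeks hardware soak   | old           | Medium (new driver on hardware soak) |
| 4       | ~50                 | 1 session + 2 weeks soak                 | new           | High (cutover moment)                |
| 5 (opt) | -2,200 (delete old) | 1 session                                | new           | Low (cleanup)                        |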

Hour estimates are best-case / uninterrupted. Real elapsed time stretches for bus-bringup debugging, event-mask-effectiveness debugging, and hardware iteration windows — these are the three known time-sinks in a driver rewrite.


Section 9 — Risk register

Ranked by probability × impact. Each has a mitigation and a detection strategy.

| ID | Description | P | I | Mitigation | Detection |
|----|-------------|---|---|------------|-----------|
| R1 | C→Zig translation bug that compiles but fails at runtime (silent wire-format corruption) | M | H | Mock-replay golden SPI traces (§7.2) at each phase; host unit tests for encoders/decoders | Replay byte diff |
| R2 | Endianness mismatch in event wrapper (BE) vs SDPCM header (LE) | M | H | Explicit endianness matrix §2.4.2 embedded in code comments; std.mem.readInt(T, slice, .big\|.little) called with explicit endianness | Host tests on captured payloads |
| R3 | Alignment hard-fault on Cortex-M0+ from struct-cast over RX buffer | M | H | §3.5 forbids packed-struct casts; reviewer enforced | Crash on first event; easy to catch |
| R4 | Credit arithmetic wrap bug (u8 modular) | L | H | Tests §7.1 item 3; mirror reference's credit_delta <= 20 accept rule | Credit-stall timeout; test case |
| R5 | 150 ms startup timing miss → intermittent boot hang | M | M | Preserve exactly; mark as "known-required timing", do not optimise away | Hangs in boot on ~10% of power-cycles |
| R6 | PMKSA iovar returns error on our blob vintage → clear_on_boot cannot be implemented as specified | Very Low | M | Pre-Phase-3 research completed (docs/CYW43-PMKSA-RESEARCH.md): iovar pmkid_info is confirmed for CYW43439 family (CY_CC_43439_CHIP_ID explicit in brcmfmac). Legacy API applies to our blob. Our actual firmware is 7.95.61 (2023-01-11, identical to Embassy's). Residual risk very low; validated via <30-min hardware test early in P3b. If iovar returns error: blob upgrade to soypat's 7.95.62 (triggers R17) OR §5.9.4 documented-alternate-primitive fallback. | H2 (router power-cycle) comparison. Firmware error response on the iovar call is observable. |
| R6b | PMKSA cache semantics wrong → stale PMKID used on reconnect, failure worse than no cache | L | M | cache_in_boot mode must evict on DEAUTH_IND, DISASSOC_IND, ICV_ERROR (§5.9.3); unit test covers eviction; cache_in_boot ships only after H3 (forced DEAUTH) demonstrates clean recovery | H3 fails on cache_in_boot but not on clear_on_boot |
| R7 | Firmware blob version mismatch with the reference's tested vintage | L | M | Pin firmware SHA to firmware/UPSTREAM.md; sanity check runs on every boot | cyw43_check_valid_chipset_firmware-equivalent fails loudly |
| R8 | Undocumented Broadcom iovar used by reference we don't understand (e.g. apsta, ampdu_*) | L | M | Reproduce the reference's iovar sequence line-for-line in ll/boot.zig; don't omit unknowns | wifiOn returns success but device is deaf |
| R9 | Event ordering differs from reference's assumptions | M | M | State machine §6.1 handles flag-bit sets independently (order-agnostic); explicit test with out-of-order sequences | Join succeeds or fails non-deterministically |
| R10 | Reentrancy bug: event handler triggers ioctl synchronously | L | H | Architectural invariant §2.4.7 + §6.4; pend-flag mechanism only way to trigger ioctl from event | Stack corruption; intermittent crash |
| R11 | Backplane window cache drift (raw write to HIGH/MID/LOW somewhere) | L | H | Single-owner invariant §2.4.3; grep-enforceable in CI (no SDIO_BACKPLANE_ADDRESS outside bus/backplane.zig) | FW upload corruption; CLM load timeout |
| R12 | LICENSE.RP attribution gap for a translated file | L | M | Per-file header rule §10.3; UPSTREAM.md mapping table; CI lint that every file under src/cyw43_new/ll/ and ctrl/ has the header | Manual audit in Phase 4 |
| R13 | Hidden dependency on pico-sdk RTOS primitives (mutex/semaphore) | L | M | §2.4 audit confirms reference uses CYW43_THREAD_ENTER/EXIT only as opt-in mutex; our cooperative single-poll model is equivalent (poll serialises access) | Concurrency bug appears only under TCP retransmit + scan in parallel |
| R14 | ISSUES.md #25 root cause is not an unhandled event (hypothesis wrong) | M | L | Phase 2 event logging gives us signal either way; if hypothesis is wrong, we at least know what IS happening at 180 s | Compare before/after burst pattern with exhaustive event log |
| R15 | Byte-identity gate slips on an unintended import during new-tree development | L | H | CI check run on every commit; fail-fast | sha256sum mismatch at commit time |
| R16 | P2P/WFD action frame events appear unexpectedly even in STA-only | L | L | Event decoder's Event.unknown branch handles gracefully; no crash | Log output |
| R17 | Future firmware blob / NVRAM / CLM revision swap invalidates wire-format assumptions (event layout, mask length, iovar payload shape) | L | H | All on-wire format assumptions in this plan are validated only against the specific firmware/NVRAM/CLM revisions committed under src/cyw43/firmware/ (and mirrored to src/cyw43_new/firmware.zig unchanged). Changing blobs requires rerunning §7.2 golden-trace regression fixtures and §7.3 H1/H2/H3/H12 hardware scenarios. A blob swap is treated as a Phase-3-equivalent re-validation, not a drop-in replacement. | Regression: golden-trace byte mismatch on fresh capture |

Highest-residual-risk items (P+I both at least M): R1, R2, R3, R5, R9. Each of these has a specific gate in §7 that must pass before the next phase ships.
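As an illustration of the R4 mitigation, the u8 modular credit rule can be written with Zig's wrapping subtraction. This is a sketch under assumptions: the "accept when the delta is at most 20" rule is quoted from the register entry above, while the `hasCredit` window test is an assumed shape modelled on the reference driver, and both function names are hypothetical.

```zig
const std = @import("std");

/// Accepts a new credit value only when it is at most 20 ahead of the last one (mod 256).
pub fn acceptCredit(last: u8, incoming: u8) u8 {
    const delta = incoming -% last; // wrapping subtraction keeps u8 modular semantics
    return if (delta <= 20) incoming else last;
}

/// True while the next TX sequence number has not caught up with the credit ceiling.
/// Assumed shape: non-zero distance that is still in the "ahead" half of the u8 ring.
pub fn hasCredit(tx_seq: u8, credit: u8) bool {
    const room = credit -% tx_seq;
    return room != 0 and room < 0x80;
}

test "credit arithmetic survives the 255 -> 0 wrap" {
    try std.testing.expectEqual(@as(u8, 5), acceptCredit(250, 5)); // delta 11 across the wrap
    try std.testing.expectEqual(@as(u8, 10), acceptCredit(10, 5)); // delta 251: stale, rejected
    try std.testing.expect(hasCredit(254, 2)); // 4 frames of room across the wrap
    try std.testing.expect(!hasCredit(7, 7)); // no room
}
```

A plain `-` on `u8` would trap on underflow in safe builds, which is why the wrap case deserves its own test rather than relying on happy-path traffic.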


Section 10 — Attribution & upstream-tracking strategy

10.1 License compliance

LICENSE.RP (reproduced in full at src/cyw43_new/LICENSE-REFERENCE.md) grants use on RP2040-family semiconductors. Key obligations on redistribution:

  1. Retain the George Robotics / Raspberry Pi Ltd copyright notice.
  2. Retain the list of conditions + disclaimer.
  3. Redistributions in binary form: reproduce the notice in documentation/materials accompanying the binary.

Our rewrite is a derivative work informed by protocol behavior, control-flow, and reference identifiers. It does not incorporate verbatim copies of reference C code or translated bytecode. The Zig implementation itself is the pico project’s work and is licensed under this project’s license (pending license selection; currently unstated — flag for repo owners: choose a license before Phase 4 ship).

10.2 Top-level artifacts

At the root of src/cyw43_new/:

  • LICENSE-REFERENCE.md — full text of misc/pico-sdk/lib/cyw43-driver/LICENSE.RP, verbatim, with a preamble explaining that the file is reproduced for compliance under terms 2/3.
  • UPSTREAM.md — reference SHA (dd7568229f3bf7a37737b9e1ef250c26efe75b23), snapshot date, and the running log of intentional algorithmic deviations from the reference (M1, M2, ...). Same style as src/ruby/nanoruby/UPSTREAM.md. Phase 1 lands M1; Phases 2–4 add entries as they introduce local behaviors.

10.3 Per-file provenance header

Every Zig file under src/cyw43_new/ll/ and src/cyw43_new/ctrl/ (the modules whose algorithms are directly traceable to specific C files) begins with:

// This file is part of pico's pure-Zig CYW43 driver.
//
// Implementation informed by the protocol behavior and control-flow of
// cyw43-driver's <reference_file>.c. Any direct borrowings of identifiers,
// constants, or comments are attributed inline.
//
// Reference driver: https://github.com/georgerobotics/cyw43-driver
// Reference SHA:    dd7568229f3bf7a37737b9e1ef250c26efe75b23
// Reference file/function lineage: see UPSTREAM.md § "Source mapping".
//   (A single Zig file may draw from multiple reference files when the
//    underlying logic is cross-cutting; the lineage table enumerates
//    each borrowing.)
//
// Copyright (C) 2019-2022 George Robotics Pty Ltd (reference driver;
// licensed under LICENSE-REFERENCE.md terms)
// Copyright (C) 2026 pico project contributors (this Zig translation)

Files under src/cyw43_new/transport/, bus/, and utility modules (config.zig, errors.zig, firmware.zig, hal.zig, types.zig) carry a reduced header that cites LICENSE-REFERENCE.md without per-file lineage (they are ports of constants and port-abstractions, not translated algorithms).
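The CI lint for the full per-file header (the R12 mitigation) could be as small as the check below. This is a hypothetical sketch: it tests only the first header line and the reference SHA line, and walking the src/cyw43_new/ll/ and ctrl/ directories is left to the caller.

```zig
const std = @import("std");

const first_line = "// This file is part of pico's pure-Zig CYW43 driver.";
const reference_sha = "dd7568229f3bf7a37737b9e1ef250c26efe75b23";

/// True when `source` opens with the provenance header and its leading
/// comment block cites the pinned reference SHA.
pub fn hasProvenanceHeader(source: []const u8) bool {
    if (!std.mem.startsWith(u8, source, first_line)) return false;
    var lines = std.mem.splitScalar(u8, source, '\n');
    while (lines.next()) |raw| {
        const line = std.mem.trim(u8, raw, " \t\r");
        if (!std.mem.startsWith(u8, line, "//")) break; // header ends at the first non-comment line
        if (std.mem.endsWith(u8, line, reference_sha)) return true;
    }
    return false;
}

test "header lint accepts the template and rejects bare files" {
    const good =
        \\// This file is part of pico's pure-Zig CYW43 driver.
        \\//
        \\// Reference SHA:    dd7568229f3bf7a37737b9e1ef250c26efe75b23
        \\const x = 1;
    ;
    try std.testing.expect(hasProvenanceHeader(good));
    try std.testing.expect(!hasProvenanceHeader("const x = 1;\n"));
}
```

Matching on the SHA as well as the first line means a reference-SHA bump that misses a file is caught by the same lint.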

10.4 Lineage mapping appendix in UPSTREAM.md

A table mapping every Zig function with algorithmic lineage → reference C function + line range. Example fragment:

| Zig symbol                        | Reference symbol                   | Ref lines    |
|-----------------------------------|------------------------------------|--------------|
| ll/boot.zig :: busInit            | cyw43_ll.c :: cyw43_ll_bus_init    | 1424-1794    |
| ll/clm.zig  :: clmLoad            | cyw43_ll.c :: cyw43_clm_load       | 1351-1396    |
| ll/ioctl.zig :: doIoctl           | cyw43_ll.c :: cyw43_do_ioctl       | 1154-1185    |
| ll/frame.zig :: packSdpcmHeader   | cyw43_ll.c :: cyw43_sdpcm_send_common header-fill | 700-717 |
| ll/events.zig :: decode           | cyw43_ll.c :: cyw43_ll_parse_async_event + cyw43_ctrl.c :: cyw43_cb_process_async_event | 592-621, 333-439 |
| ll/scan.zig :: parseResult        | cyw43_ll.c :: cyw43_ll_wifi_parse_scan_result | 538-590 |
| ctrl/join.zig :: joinWpa2         | cyw43_ll.c :: cyw43_ll_wifi_join   | 2051-2177    |
| ctrl/join.zig :: rejoin           | cyw43_ll.c :: cyw43_ll_wifi_rejoin | 2184-2187    |
| ctrl/poll.zig :: pollOnce         | cyw43_ctrl.c :: cyw43_poll_func    | 218-271      |
| ctrl/country.zig :: tbl           | cyw43_country.h (whole)            | whole file   |

The appendix is populated incrementally — each phase’s commits update it as functions land.

10.4.1 Additional consulted references (beyond the primary pico-sdk reference)

These are non-primary references consulted for cross-verification, interop details, and alternative-implementation comparison. The Zig code is not derived directly from them; it is written from behavioral understanding plus the cross-referenced specs. Citations appear in UPSTREAM.md M-entries where a specific claim was supported by one of these sources.

| Reference | SHA / source | License | Primary use |
|-----------|--------------|---------|-------------|
| pico-sdk cyw43-driver | dd7568229f3bf7a37737b9e1ef250c26efe75b23 (April 2024 vintage); misc/pico-sdk/lib/cyw43-driver/ | LICENSE.RP (restrictive) | Primary reference. Behavioral port target. §2. |
| Linux brcmfmac | master (read from torvalds/linux via WebFetch on 2026-04-20); paths drivers/net/wireless/broadcom/brcm80211/brcmfmac/{cfg80211.c, fwil_types.h, feature.c} | GPL-2.0 (read-only; behavioral evidence) | PMKSA iovar spec (pmkid_info), chip-compat confirmation for CYW43439 (CY_CC_43439_CHIP_ID), WLC-version feature gating. See docs/CYW43-PMKSA-RESEARCH.md. |
| Embassy cyw43 crate | 3c70a9bf540802365ea42a8183d55c9e2a7a1ddb (cloned 2026-04-20); misc/embassy/cyw43/src/ | Apache-2.0 OR MIT (permissive) | Second-opinion reference: event decoder coverage, SDPCM/CDC wire structs (structs.rs), WPA3/SAE handshake, RP2040 PIO SPI in misc/embassy/embassy-rp/src/. Grep-confirmed: no PMKSA support. |
| soypat cyw43439 driver | 045049fee0b1c8812652318a4d15c85938c87c61 (cloned 2026-04-20); misc/cyw43439/ | MIT (permissive) | Third-opinion reference: baremetal-heapless state-machine patterns (TinyGo), alternative PIO SPI (bus_pico_pio.go), WHD-translated protocol/event definitions under whd/. Grep-confirmed: no PMKSA support. |
| Infineon WHD (docs only) | Public API reference at infineon.github.io/wifi-host-driver/ | Apache-2.0 (docs consulted, source not cloned) | Cross-reference for PMKSA (confirmed no public set_pmksa API exposed), whd_wifi_set_pmk signature. |

Discipline when consulting these:

  1. License boundaries per §10.3. brcmfmac is GPL-2.0: read-only, no copying. Embassy and soypat are permissive: behavioral evidence preferred over copying, but copying with attribution is legally OK.
  2. Cite in UPSTREAM.md when a claim rests on a specific source. e.g. "M5: event-49 handler: lineage — cyw43-driver L428-432 (PR #130); cross-verified with Embassy events.rs and soypat whd/asyncevent.go."
  3. Prefer the primary pico-sdk reference when it is definitive. The other sources are tie-breakers and gap-fillers.
  4. Staleness. These are snapshots. Re-clone and diff if a future issue suggests one of them has fixed something we're hitting.

10.5 UPSTREAM.md M-log convention

The local modification log works exactly like src/ruby/nanoruby/UPSTREAM.md:

### M1 — Parallel-tree scaffolding (Phase 1)

Problem: n/a (initial port).
Edits: created `src/cyw43_new/` with the 10 Phase-1 files.
Upstream intent: n/a (fork by design).
Acceptance: -Dengine=js byte-identical; -Dcyw43=new_shadow compiles.

### M2 — Exhaustive event decoder (Phase 2)

Problem: reference driver's non-exhaustive event handling silently drops events,
blocking diagnosis of ISSUES.md #25.
Edits: `ll/events.zig` includes the full 89-name table and logs every unknown
event with type+status+reason+flags+ifidx+payload-hex-prefix (first 16 bytes).
Difference from reference: exhaustive vs. opportunistic.
Acceptance: event log from 30-min hardware soak identifies 180 s cadence event.

### M3 — PMKSA clear-on-boot (Phase 3, mandatory improvement over reference)

Problem: stale firmware-side PMKID state after AP power-cycle causes
repeated 4-way handshake failures (H2 scenario). Reference driver does
not implement PMKSA management.

Pre-work: `docs/CYW43-PMKSA-RESEARCH.md` cross-references
brcmfmac (Linux GPL, read-only reference) and Infineon WHD (Apache 2.0).
Identified iovar: `pmkid_info`, plain (no bsscfg prefix), with the legacy
356-byte payload (`__le32 npmk` + 16 × 22-byte entries).
See research doc for full wire-format citations.

Edits: `ctrl/pmksa.zig` (new, ~180 LOC), cache data structure with
LRU eviction, iovar pack/unpack, hook in `wifiOn()` for clear-on-boot
and in the event handler for DEAUTH-triggered eviction.

Acceptance: H2 passes on new driver with `.clear_on_boot`, demonstrably
fails on `-Dcyw43=old`. `.cache_in_boot` ships with DEAUTH eviction
verified via captured 802.11 fast-reauth exchange on same-BSSID
reconnect within a boot.
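
As a rough illustration of the iovar pack step named in M3, the legacy `pmkid_info` payload described in the research doc (a little-endian `npmk` count followed by 16 entries of 6-byte BSSID plus 16-byte PMKID, 356 bytes in total, sent without a bsscfg prefix) could be packed as below. This is a C sketch for clarity only, not the planned `ctrl/pmksa.zig`; the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Legacy pmkid_info layout per the research doc:
 * __le32 npmk, then 16 entries of { u8 bssid[6]; u8 pmkid[16] }. */
enum {
    PMKSA_MAX_ENTRIES = 16,
    PMKSA_ENTRY_LEN   = 6 + 16,                                  /* 22 bytes  */
    PMKSA_PAYLOAD_LEN = 4 + PMKSA_MAX_ENTRIES * PMKSA_ENTRY_LEN  /* 356 bytes */
};

/* Flush payload: npmk = 0 and every entry zeroed. Returns bytes written. */
size_t pmksa_pack_flush(uint8_t *out, size_t cap)
{
    assert(out != NULL && cap >= PMKSA_PAYLOAD_LEN);
    memset(out, 0, PMKSA_PAYLOAD_LEN);
    return PMKSA_PAYLOAD_LEN;
}

/* Append one entry to an already-initialised payload and bump the
 * little-endian npmk counter. Returns 0 on success, -1 if the table is full. */
int pmksa_pack_add(uint8_t *buf, const uint8_t bssid[6], const uint8_t pmkid[16])
{
    uint32_t npmk = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                    ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    if (npmk >= PMKSA_MAX_ENTRIES) return -1;

    uint8_t *entry = buf + 4 + npmk * PMKSA_ENTRY_LEN;
    memcpy(entry, bssid, 6);
    memcpy(entry + 6, pmkid, 16);

    npmk++;
    buf[0] = (uint8_t)npmk;
    buf[1] = (uint8_t)(npmk >> 8);
    buf[2] = (uint8_t)(npmk >> 16);
    buf[3] = (uint8_t)(npmk >> 24);
    return 0;
}
```

For clear-on-boot only `pmksa_pack_flush` is needed; the whole 356-byte buffer is then written with the `pmkid_info` iovar. The add helper only matters if the `cache_in_boot` stretch goal is pursued.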

10.6 Handling future upstream fixes

When the reference driver publishes a fix (e.g. a CYW43 firmware vintage bump or a new iovar sequence for an edge case):

  1. git -C misc/pico-sdk fetch && git log dd7568229f3bf7a37737b9e1ef250c26efe75b23..HEAD -- lib/cyw43-driver/ to see the delta.
  2. Per-commit: examine the diff, decide whether it affects our port.
  3. If yes: add an M-entry describing the upstream change and our port response.
  4. Update the reference SHA in UPSTREAM.md only when a batch of upstream commits has been evaluated.

Section 11 — Go/no-go criteria for the next session

The coding AI that executes this plan must verify every item below before writing code. If any fails, iterate on the plan — don’t proceed.

11.1 Plan itself

  • This document exists at docs/CYW43-REWRITE.md and was peer-reviewed by GPT-5.4 via conversation pico-cyw43-rewrite-plan-2026. Peer critique log accessible via the user-ai MCP’s discuss tool with that conversation ID.
  • Every section referenced from §1 (overview) to §11 (this checklist) is populated.
  • Sections 3 (Zig-idiom style guide) and 10.3 (per-file attribution rules) have been re-read.

11.2 Audit depth

  • Every public function in cyw43.h has a row in §2.1's table.
  • Every state transition in cyw43_cb_process_async_event (cyw43_ctrl.c:333-439) is reflected in §2.3.1 + §6.1.
  • Every event type enumerated in cyw43_ll.h:67-82 has an entry in §2.2's event table.
  • Every major SDPCM / CDC / BDC code path in cyw43_ll.c is covered in §2.4.
  • The endianness matrix §2.4.2 has entries for every on-wire field the driver touches.

11.3 Reliability deliverables

  • Auto-reconnect on DEAUTH/DISASSOC is defined in §6.1 and surfaced as a Config.reconnect policy with explicit retry_badauth / retry_nonet gates defaulting OFF.

  • PMKSA is a mandatory Phase 3 deliverable per §5.9. clear_on_boot is the default and must land before Phase 4 cutover. cache_in_boot is a Phase 3 stretch; OK to defer the full cache implementation, not OK to defer clear_on_boot.

  • PMKSA pre-work artifact present: docs/CYW43-PMKSA-RESEARCH.md exists, cites specific brcmfmac source lines, specifies the iovar name (pmkid_info), wire payload (356 bytes = __le32 npmk + 16 × 22-byte entries), chip compatibility (CY_CC_43439_CHIP_ID in brcmfmac), and API version (legacy for our blob). Produced in the planning session alongside this document; Phase 3 P3b coding can proceed directly.

  • ICV_ERROR (event 49) handler present in the event dispatcher (§6.1). On receipt: set pend_rejoin = true; pend_rejoin_wpa = false; and schedule poll. Addresses pico-sdk #2153.

  • ICV_ERROR hardware verification: reproduce the pico-sdk #2153 power-save-induced ICV_ERROR scenario; new driver recovers within ~10 s; old driver shows silent flood.

  • PMKSA hardware verification: §7.3 H2 (router power-cycle) demonstrably succeeds with the new driver + clear_on_boot AND demonstrably fails or degrades with the old driver on the same physical setup. Captured as UPSTREAM.md M-entry evidence.

  • Three-flag link-state model in §6.1 adopted; bitmask model abandoned per decision log (2026-04-20).

  • Crypto-drift class expanded beyond ICV_ERROR. §6.1 event-to-transition table includes pend_rejoin on MIC_ERROR (17), UNICAST_DECODE_ERROR (50), PSM_WATCHDOG (41). MULTICAST_DECODE_ERROR (51) gated behind the §6.1.1 threshold. All have host-side unit tests.

  • PSK_SUP reason=14 IGNORE rule is in §6.1 table AND in §7.1 test matrix item 4. Regression guard documented.

  • WPA3-SAE implemented per §5.10. wpa3_mode: Wpa3Mode = .auto default. AUTH FAIL reason=16 auth_type=3 triggers link-down (not badauth-retry). Fallback behavior per §5.10 table: wpa3_wpa2_aes_psk falls back to WPA2; wpa3_sae_aes_psk does NOT. sae_password iovar used for WPA3 join; WLC_SET_AUTH=3 for WLC auth cmd.

  • Beacon-loss → degraded transition (§6.2) driven by BCNLOST_MSG (31) event. GTK_PLUMBED or successful RX transitions degraded → up.

  • Firmware blob version recorded. firmware.zig exposes FW_VERSION = "7.95.61", FW_DATE = "2023-01-11", FW_BUILD = "abcd531 CY", FW_SHA256 = "..." as compile-time pub consts. The on-device banner line at boot should mention the firmware version so support issues have concrete data.

  • PMKID_CACHE event wired to cache_in_boot sync (§5.9.5) if Phase 3 P3c is pursued; otherwise explicitly deferred to Phase B with UPSTREAM.md note.

  • PMKSA clear-effectiveness negative proof. "We called the iovar" is insufficient. Capture observable evidence that the clear actually took effect. At least one of:

    • On-wire evidence: a 4-way-handshake exchange captured by an 802.11 monitor (Wireshark with a monitor-mode NIC, or hostapd logs) after a reconnect that would otherwise have reused a PMKID. Absence of the PMKID TLV in the Association Request confirms the cache was cleared.
    • AP-side evidence: hostapd log lines showing "PMKSA cache entry not found" for our BSSID on the reconnect.
    • Timing evidence (weakest): reconnect latency measurably longer than a cached-reauth would produce, consistent with full 4-way handshake, across at least 10 trials.

    At least the timing evidence path must be captured in every case, because it requires no special tooling; the on-wire or AP-side evidence is strongly preferred if a monitor-mode capture rig or AP log access is available. Paste the evidence into UPSTREAM.md M3.

  • The ISSUES.md #25 diagnostic path is explicitly called out in §2.8 (G13) and §8.2 (validation + rate-limited logging).

  • Unknown-event logging mechanism implemented per §6.1.4 (authoritative spec). Event.unknown struct has all 10 fields enumerated. EventLog ring buffer bounded at 16 distinct tuples with 5-second coalescing window. 89-entry event_names table populated. Driver.getEventLog() + clearEventLog() API exposed. wifi events UART shell command implemented at src/bindings/wifi.zig (Phase 3). Invariants (no allocation, no blocking, no driver-callback reentry) verified by reviewer.

  • Event-mask effectiveness gate (§8.2) is listed as a P2 hard requirement, not a wish.

  • P2 sign-off: at least one induced DEAUTH_IND and one induced DISASSOC have been captured on hardware and decoded with expected status/reason fields matching the reference tables (§2.2).

  • P3 sign-off: reconnect policy verified on hardware for each of the four trigger classes separately: transient loss (DEAUTH/DISASSOC), BADAUTH, NONET, and manual leave+rejoin. Each transition observed to match §6.1.

  • Wire-format gate: the first golden-trace diff review (§7.2) includes byte-level verification of (a) packCmd output for three representative read/write/backplane commands, (b) SDPCM header bytes for one CONTROL and one DATA frame, (c) CDC header bytes for one ioctl, (d) bsscfg:event_msgs payload bytes.

  • State-safety gate: no callback code path reachable from HostHooks (onEvent, onEthernetRx, onLinkUp, onLinkDown) can synchronously invoke doIoctl or sendEthernet while the driver is in an active RX/IOCTL dispatch. Audited via grep + code review.

  • Blob-coupling record: every golden trace fixture captured in §7.2 has a metadata header recording sha256(firmware/43439A0_combined.bin) and sha256(firmware/43439A0_nvram.bin) at capture time. Re-use of a trace requires current blob hashes to match.
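
As a rough illustration of the ICV_ERROR and crypto-drift items in the checklist above, the event-to-flag mapping could look like the following C sketch. The event numbers and the ICV_ERROR actions come from the checklist; the struct, the function name, and the choice to schedule a poll for the other three events are assumptions made for illustration, and the thresholded MULTICAST_DECODE_ERROR (51) path is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Event numbers as listed in the checklist. */
enum {
    EV_MIC_ERROR            = 17,
    EV_PSM_WATCHDOG         = 41,
    EV_ICV_ERROR            = 49,
    EV_UNICAST_DECODE_ERROR = 50
};

/* Hypothetical host-side flags; names follow the checklist wording. */
typedef struct {
    bool pend_rejoin;
    bool pend_rejoin_wpa;
    bool poll_scheduled;
} link_flags_t;

/* Returns true if the event was treated as crypto drift and a rejoin queued. */
bool on_crypto_drift_event(link_flags_t *st, uint32_t event_type)
{
    assert(st != NULL);
    switch (event_type) {
    case EV_ICV_ERROR:
        st->pend_rejoin = true;
        st->pend_rejoin_wpa = false;
        st->poll_scheduled = true;
        return true;
    case EV_MIC_ERROR:
    case EV_UNICAST_DECODE_ERROR:
    case EV_PSM_WATCHDOG:
        st->pend_rejoin = true;
        st->poll_scheduled = true;  /* assumption: same poll scheduling as ICV_ERROR */
        return true;
    default:
        return false;               /* not a crypto-drift event */
    }
}
```

The handler only sets flags and never issues an ioctl itself, which keeps it compatible with the state-safety gate: the rejoin happens later from the poll path.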

11.4 Licensing

  • LICENSE-REFERENCE.md strategy defined (§10.2).
  • Per-file provenance header defined (§10.3).
  • Lineage mapping table scaffold present in §10.4.
  • Repo license selection flagged for repo-owner decision (§10.1). This is a true blocker for Phase 4 ship but not for Phase 1 start.

11.5 Risk register has mitigations for all H×H entries

  • R1 (translation bug): mock-replay golden traces.
  • R2 (endianness): explicit matrix + std.mem.readInt enforcement.
  • R3 (alignment): §3.5 packed-struct ban.
  • R10 (reentrancy): §2.4.7 + §6.4 queue-not-call.
  • R11 (backplane window): single-owner invariant §2.4.3.
  • R15 (byte-identity slip): CI SHA check.

11.6 Zig 0.16 idiom sanity

  • Every idiom in §3 has been cross-referenced against ZIG-0.16.0-REFERENCE.md. In particular: std.mem.readInt with explicit endianness, callconv(std.builtin.CallingConvention.c), enum(u16) with trailing _, error unions, *anyopaque + vtable pattern.
  • The freestanding-stdlib discipline is preserved: no std.debug.print, no std.Io.File, no std.Io.Threaded.

11.7 Migration plan executable

  • Phase 1 scope (§8.1) fits in ~1 coding session without requiring hardware cutover.
  • Each phase has explicit validation criteria distinct from the next.
  • Rollback plan for each phase does not require data migration.

11.8 The hard stops

Do not write code if any of these is true:

  • The byte-identity baseline .preflight-baseline/pico-preintegration.uf2 does not exist or the SHA is not recorded in this repo.
  • The -Dcyw43=old|new|new_shadow build option has not been added (that’s Phase 1 step 1 — OK if not yet, but Phase 2+ hinges on it).
  • The project license is unresolved AND we are at or past Phase 4.
  • The firmware/NVRAM/CLM blobs under src/cyw43/firmware/ have changed since the last src/cyw43/device.zig::FW_LEN was measured, without §7.2 golden traces being re-captured. Blob changes are a Phase-3-equivalent re-validation event (R17).

Section 12 — Appendix: reference quick-lookup tables

12.1 WLC commands (used)

| WLC | Value | Set/Get | Purpose |
|-----|-------|---------|---------|
| UP | 2 | Set | Bring interface up |
| DOWN | 3 | Set | Bring interface down |
| SET_INFRA | 20 | Set | Set infrastructure mode (1 for STA) |
| SET_AUTH | 22 | Set | Auth type (OPEN/SAE) |
| GET_BSSID | 23 | Get | Read BSSID |
| GET_SSID | 25 | Get | Read SSID |
| SET_SSID | 26 | Set | Initiate association |
| SET_CHANNEL | 30 | Set | Force channel |
| DISASSOC | 52 | Set | Disassociate |
| SET_DTIMPRD | 78 | Set | DTIM period |
| GET_PM / SET_PM | 85 / 86 | Both | Power save mode |
| SET_GMODE | 110 | Set | G-mode (2.4GHz modulation config) |
| SET_WSEC | 134 | Set | Security cipher |
| SET_BAND | 142 | Set | Band select (auto = 0) |
| GET_ASSOCLIST | 159 | Get | AP: associated STAs |
| SET_WPA_AUTH | 165 | Set | WPA/WPA2/WPA3 auth method |
| GET_VAR | 262 | Get | Read iovar |
| SET_VAR | 263 | Set | Write iovar |
| SET_WSEC_PMK | 268 | Set | Set PMK (passphrase) |

12.2 Key iovars used

| Iovar | Type | Purpose |
|-------|------|---------|
| country | 20-byte blob | Set regulatory domain |
| clmver | string | Read CLM version |
| clmload | chunked upload | CLM blob upload |
| clmload_status | u32 | CLM upload status |
| bus:txglom | u32 | TX glomming (off=0) |
| apsta | u32 | AP+STA concurrent mode |
| ampdu_ba_wsize | u32 | AMPDU block-ack window size |
| ampdu_mpdu | u32 | AMPDU max MPDU |
| ampdu_rx_factor | u32 | AMPDU RX factor |
| cur_etheraddr | 6 bytes | MAC address |
| mcast_list | list | Multicast filter |
| pm2_sleep_ret | u32 | PM2 sleep retention time (ms × 10) |
| bcn_li_bcn | u32 | Beacon listen interval (beacons) |
| bcn_li_dtim | u32 | Beacon listen interval (DTIMs) |
| assoc_listen | u32 | Listen interval sent to AP |
| escan | struct | Start escan |
| join | struct | Join specific BSSID+chanspec |
| sae_password | 2+128 bytes | WPA3-SAE password |
| sae_max_pwe_loop | u32 | SAE PWE loop limit |
| mfp | u32 | Management frame protection |
| bsscfg:sup_wpa | bsscfgidx u32 + u32 | Enable supplicant |
| bsscfg:sup_wpa2_eapver | bsscfgidx u32 + u32 | EAP version (-1 = auto) |
| bsscfg:sup_wpa_tmo | bsscfgidx u32 + u32 | Supplicant timeout (ms) |
| bsscfg:event_msgs | bsscfgidx u32 + mask | Event mask |
| bsscfg:ssid | bsscfgidx u32 + u32 + 32 bytes | AP SSID |
| bsscfg:wsec | bsscfgidx u32 + u32 | AP security cipher |
| bsscfg:wpa_auth | bsscfgidx u32 + u32 | AP auth |
| 2g_mrate | u32 | 2.4G multicast rate |
| gpioout | u32 + u32 | CYW43 GPIO set (mask, value) |
| ccgpioin | u32 | CYW43 GPIO read |

12.3 Event mask (bsscfg:event_msgs) default construction

The reference starts with all 19 mask bytes set to 0xff, then clears 6 specific bits. Event N lives at mask byte[N/8], bit N%8. The C reference (cyw43_ll.c:1895-1902) uses:

memset(buf + 18 + 4, 0xff, 19);
#define CLR_EV(b, i) b[18 + 4 + i / 8] &= ~(1 << (i % 8))
CLR_EV(buf, 19); CLR_EV(buf, 20); CLR_EV(buf, 40);
CLR_EV(buf, 44); CLR_EV(buf, 54); CLR_EV(buf, 71);

Concrete bit positions and final 19-byte payload:

| Event | # | Mask byte | Bit | Reason cleared |
|-------|---|-----------|-----|----------------|
| ROAM | 19 | byte[2] | 3 | Noise (periodic roam-attempt reports) |
| TXFAIL | 20 | byte[2] | 4 | Noise (per-packet failure counter, not failure-event) |
| RADIO | 40 | byte[5] | 0 | Rare, not action-relevant |
| PROBREQ_MSG | 44 | byte[5] | 4 | AP-side noise we don't use |
| IF | 54 | byte[6] | 6 | We manage our own interfaces |
| PROBRESP_MSG | 71 | byte[8] | 7 | Very noisy |

After the 6 clears, the 19-byte mask payload sent with the bsscfg:event_msgs iovar is:

offset  byte  bits cleared  value  events covered
  0     0xff   —            0xff   SET_SSID JOIN START AUTH AUTH_IND DEAUTH DEAUTH_IND ASSOC
  1     0xff   —            0xff   ASSOC_IND REASSOC REASSOC_IND DISASSOC DISASSOC_IND QUIET_START QUIET_END BEACON_RX
  2     0xff   bit 3, bit 4 0xe7   LINK MIC_ERROR NDIS_LINK [ROAM] [TXFAIL] PMKID_CACHE RETROGRADE_TSF PRUNE
  3     0xff   —            0xff   AUTOAUTH EAPOL_MSG SCAN_COMPLETE ADDTS_IND DELTS_IND BCNSENT_IND BCNRX_MSG BCNLOST_MSG
  4     0xff   —            0xff   ROAM_PREP PFN_NET_FOUND PFN_NET_LOST RESET_COMPLETE JOIN_START ROAM_START ASSOC_START IBSS_ASSOC
  5     0xff   bit 0, bit 4 0xee   [RADIO] PSM_WATCHDOG CCX_ASSOC_START CCX_ASSOC_ABORT [PROBREQ_MSG] SCAN_CONFIRM_IND PSK_SUP COUNTRY_CODE_CHANGED
  6     0xff   bit 6        0xbf   EXCEEDED_MEDIUM_TIME ICV_ERROR UNICAST_DECODE_ERROR MULTICAST_DECODE_ERROR TRACE BTA_HCI_EVENT [IF] P2P_DISC_LISTEN_COMPLETE
  7     0xff   —            0xff   RSSI PFN_BEST_BATCHING EXTLOG_MSG ACTION_FRAME ACTION_FRAME_COMPLETE PRE_ASSOC_IND PRE_REASSOC_IND CHANNEL_ADOPTED
  8     0xff   bit 7        0x7f   AP_STARTED DFS_AP_STOP DFS_AP_RESUME WAI_STA_EVENT WAI_MSG ESCAN_RESULT ACTION_FRAME_OFF_CHAN_COMPLETE [PROBRESP_MSG]
  9     0xff   —            0xff   P2P_PROBREQ_MSG DCS_REQUEST FIFO_CREDIT_MAP ACTION_FRAME_RX WAKE_EVENT RM_COMPLETE HTSFSYNC OVERLAY_REQ
  10    0xff   —            0xff   CSA_COMPLETE_IND EXCESS_PM_WAKE_EVENT PFN_SCAN_NONE PFN_SCAN_ALLGONE GTK_PLUMBED ASSOC_IND_NDIS REASSOC_IND_NDIS ASSOC_REQ_IE
  11    0xff   —            0xff   ASSOC_RESP_IE ASSOC_RECREATED ACTION_FRAME_RX_NDIS AUTH_REQ (reserved) (reserved) SPEEDY_RECREATE_FAIL NATIVE
  12–18 0xff   —            0xff   reserved / future expansion

Final hex payload:

ff ff e7 ff ff ee bf ff 7f ff ff ff ff ff ff ff ff ff ff
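
The mask can also be derived mechanically rather than copied as hex. The following C sketch rebuilds it from the six cleared event numbers using the byte[N/8], bit N%8 rule; the function names are hypothetical and the code is an illustration, not the reference driver's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EVENT_MASK_LEN 19

/* Clear the bit for event `ev`: byte ev / 8, bit ev % 8. */
static void mask_clear_event(uint8_t mask[EVENT_MASK_LEN], unsigned ev)
{
    assert(ev / 8 < EVENT_MASK_LEN);
    mask[ev / 8] &= (uint8_t)~(1u << (ev % 8));
}

/* Returns nonzero if event `ev` is still enabled in the mask. */
int mask_event_enabled(const uint8_t mask[EVENT_MASK_LEN], unsigned ev)
{
    assert(ev / 8 < EVENT_MASK_LEN);
    return (mask[ev / 8] >> (ev % 8)) & 1u;
}

/* Start from all-ones, then clear ROAM, TXFAIL, RADIO, PROBREQ_MSG, IF
 * and PROBRESP_MSG (event numbers from the table above). */
void build_default_event_mask(uint8_t mask[EVENT_MASK_LEN])
{
    static const unsigned cleared[] = { 19, 20, 40, 44, 54, 71 };
    memset(mask, 0xff, EVENT_MASK_LEN);
    for (size_t i = 0; i < sizeof cleared / sizeof cleared[0]; i++)
        mask_clear_event(mask, cleared[i]);
}
```

A host-side unit test can compare the output of `build_default_event_mask` against the hex payload above, and use `mask_event_enabled` to assert that every event in the handler table is still delivered.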

Blob-coupling note (R17 applies here). This 19-byte mask and the event-number-to-bit mapping are validated against the committed 7.95.61 firmware blob family (Cypress/Infineon CYW43439 family). A blob upgrade (e.g. to 7.95.62 or a post-WLC-12.0 vintage) may: (a) add new event numbers past 91, (b) re-assign existing event numbers (unlikely but possible), (c) change how mask bits are interpreted. Changing blobs requires re-running §7.2 golden-trace regression and verifying the mask still disables the intended events (ROAM, TXFAIL, RADIO, PROBREQ_MSG, IF, PROBRESP_MSG). Do not cargo-cult this hex into a blob upgrade without re-verification.

Sanity check against the events we specifically handle (§2.2):

All events in our §2.2 handler table have their mask bit ENABLED after the 6 standard clears:

  • MIC_ERROR (17) → byte[2] bit 1 = 1 ✓
  • PMKID_CACHE (21) → byte[2] bit 5 = 1 ✓
  • BCNLOST_MSG (31) → byte[3] bit 7 = 1 ✓
  • PSM_WATCHDOG (41) → byte[5] bit 1 = 1 ✓
  • ICV_ERROR (49) → byte[6] bit 1 = 1 ✓
  • UNICAST_DECODE_ERROR (50) → byte[6] bit 2 = 1 ✓
  • MULTICAST_DECODE_ERROR (51) → byte[6] bit 3 = 1 ✓
  • GTK_PLUMBED (84) → byte[10] bit 4 = 1 ✓

No mask changes needed to enable our new handlers; the broad default mask already delivers them. For future events added to the handler table: check this table, and if a required bit is in the cleared set, remove the corresponding CLR_EV equivalent in our mask-construction code.

Full iovar wire format (bsscfg:event_msgs with bsscfgidx prefix):

name:                  "bsscfg:event_msgs\0"      (18 bytes including NUL)
bsscfgidx (LE u32):    00 00 00 00                (4 bytes; STA interface)
mask:                  ff ff e7 ff ff ee bf ff 7f ff ff ff ff ff ff ff ff ff ff  (19 bytes)
total payload:         41 bytes

See §2.4.11 for the iovar send sequence.
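
The 41-byte layout above (NUL-terminated name, little-endian bsscfg index, then data) can be sketched as a small packing helper. This is a minimal C illustration with a hypothetical function name; it covers only the iovar payload, not the SDPCM/CDC framing around it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack: NUL-terminated iovar name, little-endian u32 bsscfg index, then data.
 * Returns bytes written, or 0 if `cap` is too small. */
size_t pack_bsscfg_iovar(uint8_t *out, size_t cap, const char *name,
                         uint32_t bsscfgidx, const uint8_t *data, size_t data_len)
{
    assert(out != NULL && name != NULL);
    size_t name_len = strlen(name) + 1;          /* include the NUL */
    size_t total = name_len + 4 + data_len;
    if (total > cap) return 0;

    memcpy(out, name, name_len);
    out[name_len + 0] = (uint8_t)(bsscfgidx);
    out[name_len + 1] = (uint8_t)(bsscfgidx >> 8);
    out[name_len + 2] = (uint8_t)(bsscfgidx >> 16);
    out[name_len + 3] = (uint8_t)(bsscfgidx >> 24);
    if (data_len != 0) memcpy(out + name_len + 4, data, data_len);
    return total;
}
```

Called with the name `"bsscfg:event_msgs"`, index 0 and the 19-byte mask, the helper returns 41, with the mask starting at offset 22 (the same `18 + 4` offset the C reference uses).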

12.4 Firmware blob layout

File: firmware/43439A0_combined.bin  (~227 KB)
  offset 0            ─ WiFi firmware (CYW43_WIFI_FW_LEN bytes)
  offset ALIGN(fw,512)─ CLM blob (CYW43_CLM_LEN bytes)

File: firmware/43439A0_nvram.bin  (~742 bytes)
  offset 0 ─ NVRAM blob
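
The CLM offset in the combined image follows from the firmware length alone. A minimal C sketch of that calculation, with hypothetical helper names, assuming `ALIGN(fw,512)` means rounding the firmware length up to the next multiple of 512:

```c
#include <assert.h>
#include <stddef.h>

/* Round `n` up to the next multiple of `align` (which must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Offset of the CLM blob inside the combined image, given the firmware length. */
size_t clm_offset(size_t wifi_fw_len)
{
    return align_up(wifi_fw_len, 512);
}
```

With the firmware length of 231077 bytes currently hard-coded in src/cyw43/device.zig, this places the CLM blob at offset 231424.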

Actual firmware vintage (verified 2026-04-20 via version-string dump of committed blob):

chip:       43439A0
variant:    sdio with btsdio (WiFi+BT coexist)
capability: pool p2p idsup idauth pktfilter keepalive aoe lpc swdiv
            srfast fuart btcx noclminc clm_min fbt mfp sae wowlpf tko
            nvd btsdio
version:    7.95.61
build:      abcd531 CY
crc:        4528a809
date:       Wed 2023-01-11 10:29:38 PST
ucode ver:  1043.2169
FWID:       01-7afb0879

Key capabilities relevant to the plan:

  • mfp — Management Frame Protection (required for WPA3; §5.10).
  • sae — WPA3-Personal (SAE) supported (§5.10).
  • btsdio — Bluetooth-over-SDIO. Not used in this rewrite (BT out of scope per §1).
  • wowlpf / tko / keepalive / pktfilter — power-save and offload features available for future phases.

Alternative vintages audited but not adopted:

| Vintage | Delta | Reason not adopted |
|---------|-------|--------------------|
| pico-sdk 7.95.49.00 | Older (~2021) | Behind our current blob. |
| pico-sdk 7.95.59 (1YN Murata variant) | Older module variant | Murata 1YN hardware, not Pico W. |
| soypat 7.95.62 (Apr 2023) | Newer by 3 months, no btsdio tag | Marginal improvement; cost/benefit of R17 revalidation not justified by known delta. Consider if Phase 3 hardware surfaces a 7.95.61-specific bug. |

CYW43_WIFI_FW_LEN is currently hard-coded in src/cyw43/device.zig as 231077; the new firmware.zig must expose the same length as a pub const and record the version string + SHA256 of the binary as compile-time constants so the on-device banner can report what vintage is running.

12.4.1 NVRAM variant verification checklist

Our NVRAM file is src/cyw43/firmware/43439A0_nvram.bin (742 bytes). The CYW43439 ships with module-specific NVRAMs — the wrong NVRAM results in degraded RF performance, wrong MAC OUI, or failed regulatory compliance. Phase 1 must confirm we have the correct Pico-W variant.

Reference candidates (from Embassy misc/embassy/cyw43-firmware/):

| NVRAM file | Module | Use case |
|------------|--------|----------|
| nvram_rp2040.bin | Pico W (Murata on-board module) | This is what we should match |
| nvram_murata_2bc.bin | Murata 2BC standalone module | Different module |
| nvram_sterling_lwb+.bin | Laird Sterling LWB+ | Different vendor |

Verification (Phase 1, before first boot attempt):

# 1. Verify byte-for-byte or SHA256 match:
shasum -a 256 src/cyw43/firmware/43439A0_nvram.bin \
              misc/embassy/cyw43-firmware/nvram_rp2040.bin

# If SHA256 matches: confirmed Pico-W variant. Done.

# 2. If SHA256 differs but files are both 742 bytes: likely the same variant with
#    different text-encoding (embassy's is typically the original Cypress text NVRAM
#    converted to binary form — check if ours is too).
#    Use `strings src/cyw43/firmware/43439A0_nvram.bin | head -10` to see:
#    the expected content starts with something like:
#       "NVRAMRev=$Rev$"
#       "manfid=0x2d0"
#       "prodid=0x0727"
#       "vendid=0x14e4"
#       "devid=0x43e2"       # CYW43439 device id
#       "boardtype=0x0887"   # Murata 1YN or similar

# 3. If text-encoding differs, spot-check key fields match rp2040.bin equivalent:
#    - boardtype
#    - macaddr (if baked)
#    - regulatory (country/region code)
#    - crystal freq (xtalfreq=37400 for Pico W)

Acceptance: either byte-identical to Embassy's nvram_rp2040.bin, or identical boardtype + xtalfreq + regulatory fields with documented differences in a Phase 1 UPSTREAM.md M-entry.

On a mismatch: do not boot Pico W with a mismatched NVRAM; RF tuning is module-specific and a wrong NVRAM can produce FCC-non-compliant emissions or cause PA overdrive. Use Embassy's nvram_rp2040.bin as the authoritative Pico W NVRAM.

12.5 Documented gotchas (AGENTS.md §"CYW43 Gotchas") — reaffirmed

  1. #15: SPI backplane block writes MUST use 64-byte chunks. ll/boot.zig upload loop preserves this.
  2. #16: Bulk firmware payload words must be LE-packed. bus/bus.zig writeBytes preserves this.
  3. #17: Backplane window registers WRITE-ONLY from SPI. bus/backplane.zig owns the cache (§2.4.3).
  4. #18: SDPCM credit check before every IOCTL send. ll/ioctl.zig enforces via frame.hasCredit().
  5. #19: bsscfg: iovars encode extra u32 interface index. ll/ioctl.zig::setBsscfgIovarU32 encapsulates.
  6. #20: pollDevice must drain ALL pending packets. ctrl/poll.zig loop-until-none.
  7. #21: BDC TX header must use version 2 (0x20). ll/frame.zig::packBdc hard-codes this with a comment citing the gotcha.

End of plan.

Plan produced 2026-04-20 for the pico project. Peer-reviewed by GPT-5.4 in conversation pico-cyw43-rewrite-plan-2026. Execution target: subsequent coding sessions. See §11 for pre-execution checklist.

CYW43439 gSPI Protocol Reference

Hard-won findings from bring-up on Pico W hardware, April 2026. Cross-referenced against four implementations: Pico SDK, Embassy (Rust), PicoWi (C bit-bang), and our Zig driver.

The Pico W SPI Interface

The CYW43439 on the Pico W uses a nonstandard half-duplex SPI on a single shared data line:

| Signal | RP2040 GPIO | Function |
|--------|-------------|----------|
| WL_REG_ON | GPIO23 | Power enable (active high) |
| WL_D | GPIO24 | Shared: MOSI + MISO + IRQ |
| WL_CS | GPIO25 | Chip select (active low) |
| WL_CLK | GPIO29 | SPI clock |

GPIO24 is shared via resistor network:

  • SDIO_CMD (SPI MOSI) connected directly
  • SDIO_DATA0 (SPI MISO) connected via 470 ohm protection resistor
  • SDIO_DATA1 (IRQ) connected via 10K resistor
  • SDIO_DATA2 (mode select) determines SPI vs SDIO at power-up

Power-Up Sequence (Critical)

The DATA pin state at power-up selects SPI vs SDIO mode:

  1. WL_D (GPIO24) must be driven OUTPUT LOW before WL_REG_ON goes high; this selects SPI mode
  2. Hold WL_REG_ON LOW for >= 20ms (power down)
  3. Drive WL_REG_ON HIGH (power up; WL_D = LOW selects SPI)
  4. Wait 250ms (the SDK uses 250ms, not 50ms)
  5. Switch WL_D to input for SPI operation

If DATA floats high during power-up, the chip enters SDIO mode and will not respond to gSPI commands.

Source: Pico SDK cyw43_spi_gpio_setup() + cyw43_spi_reset(), PicoWi blog.

CYW43 Clock Modes (ALP and HT)

The CYW43439 has two internal clock states that gate what the host can do:

  • ALP (Active Low Power) — a slow clock sufficient for SPI bus access, register reads/writes, and backplane windowing. ALP is available shortly after power-up. With ALP, the host can read chip ID, program the backplane window, and upload firmware to RAM. But the WLAN ARM core cannot execute firmware at full speed on ALP alone.

  • HT (High Throughput) — the full-speed clock required for firmware execution, packet processing, and radio operation. HT becomes available only after the firmware has been uploaded, the WLAN core is released from reset, and the firmware successfully boots. The firmware itself switches the chip from ALP to HT and sets the HT_AVAIL bit in the chip clock CSR (0x1000E).

The host bring-up sequence interacts with these clocks as follows:

  1. After power-up, request ALP by writing ALP_AVAIL_REQ (0x08) to the clock CSR
  2. Poll until ALP_AVAIL (0x40) appears — the chip is now awake enough for bus access
  3. Upload firmware and NVRAM, write the NVRAM token
  4. Release the WLAN core from reset
  5. Poll the clock CSR for HT_AVAIL (0x80) — this means the firmware has booted
  6. Once HT is available, the firmware is running and ready for IOCTL commands
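
The two polling steps above reduce to decoding bits of the clock CSR. The following C sketch shows that decoding and a bounded poll loop. The bit values are the ones quoted in the steps; the names, the callback-based register read, and the absence of a delay between polls are simplifications made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLK_CSR_ALP_AVAIL_REQ 0x08u  /* written by the host to request ALP */
#define CLK_CSR_ALP_AVAIL     0x40u  /* ALP clock is available */
#define CLK_CSR_HT_AVAIL      0x80u  /* HT clock is available (firmware booted) */

typedef enum { CLK_NONE = 0, CLK_ALP = 1, CLK_HT = 2 } clk_state_t;

/* Classify a clock CSR value; HT takes precedence over ALP. */
clk_state_t clk_state_from_csr(uint8_t csr)
{
    if (csr & CLK_CSR_HT_AVAIL)  return CLK_HT;
    if (csr & CLK_CSR_ALP_AVAIL) return CLK_ALP;
    return CLK_NONE;
}

/* Hypothetical register-read hook supplied by the bus layer. */
typedef uint8_t (*read_csr_fn)(void *ctx);

/* Poll until at least `want` is reached. Returns 0 on success, -1 on timeout.
 * A real driver would sleep between polls; that is omitted here. */
int wait_for_clock(read_csr_fn read_csr, void *ctx, clk_state_t want,
                   unsigned max_polls)
{
    assert(read_csr != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (clk_state_from_csr(read_csr(ctx)) >= want) return 0;
    }
    return -1;
}
```

In this sketch step 2 corresponds to `wait_for_clock(..., CLK_ALP, ...)` and step 5 to `wait_for_clock(..., CLK_HT, ...)`.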

"HT-ready" in this project's documentation means: the firmware has booted and is running at full speed, ready to accept control commands.

gSPI Command Word Format

32-bit command, packed as a C bitfield on little-endian ARM:

typedef struct {
    uint32_t len:11,   // bits [10:0]  — byte count
             addr:17,  // bits [27:11] — register address
             func:2,   // bits [29:28] — function (0=bus, 1=backplane, 2=WLAN)
             incr:1,   // bit  [30]    — auto-increment address
             wr:1;     // bit  [31]    — 1=write, 0=read
} SPI_MSG_HDR;

Functions: 0 = SPI bus core, 1 = backplane (AHB), 2 = WLAN data.
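
Because C bitfield layout is compiler-dependent, the same command word can be assembled with explicit shifts. A minimal sketch following the bit positions in the struct above; the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { GSPI_FUNC_BUS = 0, GSPI_FUNC_BACKPLANE = 1, GSPI_FUNC_WLAN = 2 };

/* Build the 32-bit gSPI command word:
 * wr[31] | incr[30] | func[29:28] | addr[27:11] | len[10:0]. */
uint32_t gspi_make_cmd(bool write, bool incr, uint32_t func,
                       uint32_t addr, uint32_t len)
{
    assert(func <= 3 && addr <= 0x1ffffu && len <= 0x7ffu);
    return ((uint32_t)write << 31) | ((uint32_t)incr << 30) |
           (func << 28) | (addr << 11) | len;
}
```

For example, `gspi_make_cmd(false, true, GSPI_FUNC_BUS, 0x14, 4)` yields `0x4000A004`, the test-register read used in the next section.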

Wire Byte Order (THE Key Insight)

gSPI sends command and data bytes in LITTLE-ENDIAN order (LSByte first), with MSbit-first within each byte.

For command word 0x4000A004 (read, incr, func0, addr=0x14, len=4):

  • Memory on LE ARM: [04, A0, 00, 40]
  • Wire order: 04 first, then A0, then 00, then 40
  • Each byte sent bit 7 first

This was confirmed by cross-referencing three independent implementations:

PicoWi (C bit-bang, definitive proof)

spi_write((uint8_t *)&msg, 32);  // sends raw struct bytes, byte 0 first

On LE ARM, byte 0 of a u32 is the LSByte. PicoWi's spi_write starts from byte 0, sending MSbit-first within each byte.

Pico SDK (PIO + DMA)

buf[0] = SWAP32(make_cmd(false, true, fn, reg, 4));
// DMA with BSWAP=true transfers to PIO TX FIFO
  • SWAP32 is ARM rev16 (swap bytes within each halfword)
  • DMA BSWAP is a full byte reverse for 32-bit transfers (0xAABBCCDD -> 0xDDCCBBAA)
  • Combined: rev16 then bswap32 produce the correct byte order in the PIO FIFO
  • PIO shifts MSBit-first, producing LSByte-first on wire

Critical correction: DMA BSWAP for 32-bit words is a full byte reverse, NOT rev16. This was the source of initial confusion. The RP2040 datasheet says "the two bytes of the two halfwords are each reversed" which is misleading — for word transfers it's a complete reversal.

Embassy (Rust PIO)

Uses DMA with byte-swap and shift_out.direction = ShiftDirection::Left (MSB-first). The net effect matches: LSByte-first on wire.

Implication for our Zig PIO driver

Since our PIO shifts bit 31 first (MSB-first), we must swapEndian (full byte reverse) the command word before pushing to the TX FIFO:

txPut(swapEndian(cmd));  // 0x4000A004 -> 0x04A00040 -> PIO sends 04,A0,00,40

Response Byte Order

Responses also arrive in LE byte order. The PIO captures 32 bits MSB-first into the ISR. After swapEndian, the correct host-native value is recovered.

For the test register:

  • Wire: AD BE ED FE (LSByte of 0xFEEDBEAD first)
  • PIO ISR: 0xADBEEDFE
  • After swapEndian: 0xFEEDBEAD
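
The same full byte reverse applies in both directions. The following C sketch reproduces the two worked examples: the command word pushed to an MSB-first shifter and the test-register value recovered from the ISR. The helper names are hypothetical, and `msb_first_bytes` only models the order in which a 32-bit MSB-first shifter emits bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full 32-bit byte reverse (0xAABBCCDD -> 0xDDCCBBAA). */
uint32_t swap_endian32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Bytes in the order an MSB-first 32-bit shifter puts them on the wire. */
void msb_first_bytes(uint32_t fifo_word, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(fifo_word >> 24);
    out[1] = (uint8_t)(fifo_word >> 16);
    out[2] = (uint8_t)(fifo_word >> 8);
    out[3] = (uint8_t)(fifo_word);
}
```

Here `swap_endian32(0x4000A004)` gives `0x04A00040`, which the shifter emits as 04, A0, 00, 40, and `swap_endian32(0xADBEEDFE)` recovers `0xFEEDBEAD`.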

SPI Clock Phase (SPI Mode)

CYW43 gSPI uses CPOL=0, CPHA=0 (SPI Mode 0) with a half-duplex twist:

TX Phase (host to device)

  • Host drives data while CLK is LOW
  • CYW43 samples on CLK RISING edge

RX Phase (device to host) — SUBTLE

The device drives data on the CLK rising edge. The host must sample after the data settles:

| Implementation | Sample point | PIO instruction |
| --- | --- | --- |
| PicoWi (bit-bang) | Before CLK cycle (CLK is LOW from previous) | read; CLK high; CLK low |
| Pico SDK (high speed) | CLK LOW (falling edge) | in pins, 1 side 0 |
| Embassy (low speed) | CLK HIGH (after rising) | in pins, 1 side 1 |
| Embassy (high speed) | CLK LOW (falling edge) | in pins, 1 side 0 |

At low SPI speeds (~1 MHz), either edge works because data is stable for a long time. At high speeds (>30 MHz), falling-edge sampling is preferred.

Turnaround (Direction Switch)

After the 32-bit command, the host releases the DATA line and the CYW43 starts driving it for the response. The turnaround gap between TX and RX is implementation-dependent:

| Implementation | Built-in gap clocks | Configurable? |
| --- | --- | --- |
| Pico SDK spi_gap01_sample0 | 1 (nop side 1) | No |
| Pico SDK spi_gap010_sample1 | 2 | No |
| PicoWi (bit-bang) | 0 (just a usdelay) | N/A |
| Embassy (overclock) | 2 | No |
| Embassy (high speed) | 1 | No |
| Embassy (low speed) | 1 (nop side 0) | No |

The CYW43's SPI_RESP_DELAY_Fx registers add additional device-side delay. These must be coordinated with the host turnaround:

  • Before bus config: RESP_DELAY defaults to 0. Use minimal host turnaround.
  • After bus config: Set RESP_DELAY to match host turnaround.

For backplane reads (function 1), additional response padding is inserted before the response data. The SDK defines CYW43_BACKPLANE_READ_PAD_LEN_BYTES = 16 for SPI (4 words), but the current proven Zig path uses 4-byte padding because the SPI_RESP_DELAY_F1 write was not yet shown to take effect reliably. Treat 16 bytes as the reference/SDK behavior, and 4 bytes as the currently working implementation detail.

PIO Pin Configuration (RP2040-specific)

Side-set drives value, NOT output enable

PIO side-set controls the pin value but does NOT set the output enable. You must explicitly set pindirs for the CLK pin:

// SET_BASE targets data_pin — set data OE
execImmediate(pioSet(DST_PINDIRS, 1));

// Temporarily retarget SET_BASE to clk_pin — set clock OE
hal.regWrite(pinctrl_addr, modified_pinctrl_with_clk_as_set_base);
execImmediate(pioSet(DST_PINDIRS, 1));
hal.regWrite(pinctrl_addr, original_pinctrl);  // restore

Without this, the CLK pin stays as input and no clock signal reaches the CYW43.

PINCTRL must include IN_BASE

The in pins, 1 instruction reads from IN_BASE, not from OUT_BASE or SET_BASE. If IN_BASE is not set to the data pin, reads sample the wrong GPIO.

FSTAT bit positions

FSTAT register for SM0:
  Bit 0:  RXFULL
  Bit 8:  RXEMPTY  <-- use this for "has data" check
  Bit 16: TXFULL
  Bit 24: TXEMPTY

Common bug: checking RXFULL (bit 0) instead of RXEMPTY (bit 8). On an empty FIFO, RXFULL=0 which makes drainRx() loop forever.
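
A minimal C sketch of the check, operating on a snapshot of the FSTAT value so it can be exercised off-target. The macro and function names are illustrative, and only the SM0 bit positions listed above are used.

```c
#include <stdbool.h>
#include <stdint.h>

/* SM0 bit positions from the FSTAT layout above. */
#define FSTAT_SM0_RXFULL   (1u << 0)
#define FSTAT_SM0_RXEMPTY  (1u << 8)
#define FSTAT_SM0_TXFULL   (1u << 16)
#define FSTAT_SM0_TXEMPTY  (1u << 24)

/* Correct "has data" test: the RX FIFO holds at least one word
 * exactly when RXEMPTY is clear. */
bool rx_has_data(uint32_t fstat) {
    return (fstat & FSTAT_SM0_RXEMPTY) == 0;
}

/* RXFULL is only set when the FIFO is completely full, so it
 * cannot stand in for "has data". */
bool rx_is_full(uint32_t fstat) {
    return (fstat & FSTAT_SM0_RXFULL) != 0;
}

/* A word can be pushed as long as TXFULL is clear. */
bool tx_has_room(uint32_t fstat) {
    return (fstat & FSTAT_SM0_TXFULL) == 0;
}
```

A drain loop would then read the RX FIFO while `rx_has_data(fstat)` holds, re-reading FSTAT on every iteration.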

Pull Configuration

| Implementation | DATA pin pull |
| --- | --- |
| Pico SDK | Pull-DOWN |
| Embassy | No pull |
| PicoWi | External pull-up on module |

The SDK uses pull-down. The CYW43 module may have its own pull-ups. For debugging, pull-down is recommended — it distinguishes "line undriven" (reads 0) from "device driving high" (reads 1).

Register Access Patterns

Function 0 (bus core) reads

  • No response padding
  • RESP_DELAY applies directly

Function 1 (backplane) reads

  • 4 extra padding bytes before response data
  • SDK: if (func == BACKPLANE_FUNCTION) msg.hdr.len += 4; and reads 4 extra bytes

All register reads use incr=true

The SDK sets the auto-increment bit for ALL register reads, not just block transfers.

Initial Bus Handshake

  1. Read SPI_TEST_REGISTER (func 0, addr 0x14) → expect 0xFEEDBEAD
  2. Configure bus: WORD_LENGTH_32 | HIGH_SPEED | INTERRUPT_POLARITY_HIGH | WAKE_UP
  3. Set response delays to match host turnaround
  4. Enable status register

The SDK uses read_reg_u32_swap() for the initial test register read, which applies SWAP32 (rev16) on both command and response. This is because the initial bus state may have different byte ordering before WORD_LENGTH_32 is configured.
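
For reference, the command word for step 1 can be built with a helper shaped like the SDK's `make_cmd(false, true, fn, reg, 4)` call quoted earlier. The field layout below is an assumption based on the commonly published gSPI command format (write flag in bit 31, auto-increment in bit 30, function in bits 29:28, a 17-bit address in bits 27:11, and an 11-bit length in bits 10:0); it reproduces the 0x4000A004 word used in this document, but the widths should be checked against the SDK source before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed gSPI command layout (see the hedge in the lead-in):
 *   bit  31     write (1) / read (0)
 *   bit  30     auto-increment
 *   bits 29:28  function (0 = bus, 1 = backplane, 2 = WLAN)
 *   bits 27:11  register address
 *   bits 10:0   transfer length in bytes */
uint32_t make_cmd(bool write, bool incr, uint32_t func,
                  uint32_t addr, uint32_t len) {
    return ((uint32_t)write << 31) |
           ((uint32_t)incr << 30) |
           ((func & 0x3u) << 28) |
           ((addr & 0x1FFFFu) << 11) |
           (len & 0x7FFu);
}
```

A 4-byte read of SPI_TEST_REGISTER (function 0, address 0x14) with auto-increment set comes out as 0x4000A004 under this layout.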

Proven Zig Bring-Up Configuration

The current proven Zig path uses two distinct SPI access modes:

  1. Pre-config (16-bit halfword mode): swap16x2 + swapEndian on commands and responses
  2. Post-config (32-bit word mode): raw commands and raw 32-bit register access, with bulk payload words packed little-endian

The mode switch in bus.initBus() is:

  • Phase 1: readReg32Swap() reads the test register and verifies 0xFEEDBEAD
  • Phase 2: writeReg32Swap() enables WORD_LENGTH_32
  • Phase 3: all subsequent access uses raw helpers (cmdReadRaw, cmdWriteRaw)

The proven RP2040 PIO program matches the SDK spi_gap01_sample0 shape:

0: out pins, 1     side 0   ; TX bit, CLK LOW
1: jmp x--, 0      side 1   ; CLK HIGH, loop
2: set pindirs, 0  side 0   ; turnaround: DATA=input
3: nop             side 1   ; 1 gap clock
4: in pins, 1      side 0   ; RX sample, CLK LOW
5: jmp y--, 4      side 1   ; CLK HIGH, loop

Operational notes from the proven path:

  • host preloads X = tx_bits - 1 and Y = rx_bits - 1 for each transfer
  • autopull/autopush use 32-bit thresholds
  • current proven backplane read path uses 1 padding word (4 bytes)
  • backplane block writes use 64-byte chunks
  • STATUS_ENABLE remains disabled on the current Zig path because it prepends a status word to every response and complicates parsing during bring-up
  • the current proven clock pad config matches the SDK: 12 mA drive + fast slew
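
The X/Y preload arithmetic from the first note can be sketched in C as follows. The struct and function names are hypothetical; the sketch assumes whole 32-bit words in both directions and at least one RX word, since write-only transfers take a different PIO path.

```c
#include <stdint.h>

/* Loop counters preloaded into the PIO scratch registers. */
typedef struct {
    uint32_t x; /* tx_bits - 1 */
    uint32_t y; /* rx_bits - 1 */
} pio_counts;

/* tx_words includes the 32-bit command word; rx_words includes any
 * padding words. Both must be at least 1 in this sketch. */
pio_counts pio_loop_counts(uint32_t tx_words, uint32_t rx_words) {
    pio_counts c;
    c.x = tx_words * 32u - 1u;
    c.y = rx_words * 32u - 1u;
    return c;
}

/* Backplane (function 1) read: one command word out, then the
 * padding words followed by the data words in. */
pio_counts backplane_read_counts(uint32_t data_words, uint32_t pad_words) {
    return pio_loop_counts(1u, pad_words + data_words);
}
```

With the current 1-word padding, a single 32-bit backplane register read preloads X = 31 and Y = 63; with the SDK's 16-byte padding the same read would need Y = 159.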

Key Test Values

| Register | Address | Expected Value |
| --- | --- | --- |
| SPI_TEST_REGISTER | 0x14 | 0xFEEDBEAD |
| SPI_TEST_RW | 0x18 | Write/readback |
| CHIPCOMMON_CHIPID | backplane 0x18000000 | raw word 0x1545A9AF; low 16 bits 0xA9AF = 43439 decimal = CYW43439 |

Firmware Blobs Required

The current Zig build uses two embedded files:

  1. 43439A0_combined.bin (~227 KB) — combined WLAN firmware + CLM blob in the SDK/Embassy combined layout
  2. 43439A0_nvram.bin (~742 B) — board-specific config (antenna, crystal, power)

Internally, core.zig slices 43439A0_combined.bin into:

  • firmware payload
  • CLM payload

The older separate 43439A0.bin and 43439A0_clm.bin files are useful only as reference/source artifacts and are not required by the current build.

Source lineage: pico-sdk/lib/cyw43-driver/firmware/ and Embassy's matched combined blob layout.

8-bit Register Access in 32-bit Word Mode

In 32-bit word mode, ALL SPI transfers are 32-bit words, even for 8-bit register accesses.

Empirically working path (proven on Pico W hardware):

  • Write: cmdWriteRaw(cmd, &[_]u32{@as(u32, val)}) — value in LSByte of u32
  • Read: @truncate(result[0]) — extract LSByte from raw PIO result

The CYW43 direct backplane registers (0x1000x range) appear to handle byte-lane positioning internally. The earlier hypothesis that 8-bit values needed val << 24 (MSByte positioning) was investigated but the LSByte path works empirically for all tested registers including the backplane window bytes and clock CSR.

The critical companion fix was PIO TXSTALL wait: without waiting for the PIO shift engine to finish before CS release, write-only transactions could be truncated on the wire, causing register writes to silently fail.

The successful bring-up path ended up being:

  • ALP available (csr_raw=0x48)
  • backplane window readback low=0x00 mid=0x00 high=0x18
  • chipcommon register 0 raw word 0x1545A9AF
  • firmware verify OK (231KB Embassy-matched pair, 64-byte chunks, LE packing)
  • HT clock OK after firmware upload
  • F2 ready wait before first IOCTL
  • MAC read via cur_etheraddr iovar: 28:CD:C1:10:3E:1B
  • CLM upload via clmload iovar: status 0
  • LED blink via gpioout iovar: visually confirmed
  • Wi-Fi UP via WLC_DOWN → country → event_msgs → WLC_UP
  • Wi-Fi scan via escan iovar: 56 ESCAN_RESULT events, real SSIDs discovered

The CHIPCOMMON_CHIPID register uses the standard Broadcom Silicon Backplane format:

  • bits [15:0] = chip ID (decimal chip number as hex: 43439 = 0xA9AF)
  • bits [19:16] = chip revision (0x5 for our CYW43439)
  • bits [31:20] = package/other info

Full raw word 0x1545A9AF breaks down as: ID=0xA9AF, rev=0x5, pkg=0x154.
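
The same breakdown as a small C sketch, using the three bit ranges listed above. The struct and function names are illustrative only.

```c
#include <stdint.h>

/* Fields of the CHIPCOMMON_CHIPID word as described above. */
typedef struct {
    uint16_t id;  /* bits [15:0]:  chip ID */
    uint8_t  rev; /* bits [19:16]: chip revision */
    uint16_t pkg; /* bits [31:20]: package/other info */
} chip_id_fields;

chip_id_fields decode_chipid(uint32_t raw) {
    chip_id_fields f;
    f.id  = (uint16_t)(raw & 0xFFFFu);
    f.rev = (uint8_t)((raw >> 16) & 0xFu);
    f.pkg = (uint16_t)(raw >> 20);
    return f;
}
```

A bring-up check can compare `decode_chipid(raw).id` against 43439 directly, since the low 16 bits hold the decimal part number.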

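This encoding is standard across the Broadcom SBP family: four-digit parts store their number as hex digits (BCM4329 stores 0x4329), while five-digit parts store the decimal part number (a part number of 43438 encodes as 0xA9AE, and CYW43439 stores 43439 = 0xA9AF). The marketing name 0x4373 is NOT the chipcommon register value.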

Bugs Found During Bring-Up

  1. CLK pin OE not set — side-set drives value only; must explicitly set pindirs
  2. FSTAT RXEMPTY vs RXFULL — bit 8, not bit 0; wrong check causes infinite drain loop
  3. Command byte order — must be LE on wire; requires swapEndian before PIO TX
  4. swap16x2 is required before WORD_LENGTH_32 — the initial 16-bit halfword mode swaps bytes within each halfword; the test-register path must account for this.
  5. 1-bit alignment behavior differs at low speed — the SDK gap program can produce a 1-bit response shift around ~1 MHz, while Embassy's no-gap path works there. At >30 MHz, the proven path is the SDK-style gap program.
  6. DATA pin must be LOW at power-up — selects SPI mode; floating high = SDIO mode
  7. DMA BSWAP is full byte reverse — not rev16 as RP2040 docs misleadingly suggest
  8. STATUS_ENABLE prepends a status word — enabling it adds an extra 32-bit word to every response. The current proven Zig path leaves it disabled.
  9. PIO TXSTALL wait required for write-only transactions — without waiting for FDEBUG.TXSTALL, CS releases before final bits leave the wire, causing backplane window writes to silently fail. Copied from the SDK's write-only PIO path.
  10. Backplane window write order matters — must be HIGH/MID/LOW (matching SDK), not LOW/MID/HIGH.
  11. CHIPCOMMON_CHIPID uses decimal chip number in low 16 bits — CYW43439 reports 0xA9AF in low 16 bits, not 0x4373.
  12. SPI backplane block writes must use 64-byte chunks — SDK defines CYW43_BUS_MAX_BLOCK_SIZE = 64 for SPI. This is a hardware constraint of the CYW43's SPI-to-backplane bridge FIFO. Writing 512-byte blocks silently corrupts firmware uploads even if small synthetic test writes appear fine.
  13. Bulk firmware payload words must be little-endian packed — the final firmware boot blocker was a host-side byte swap inside each 32-bit bulk payload word. Full-image verification caught this at offset 0x1000: expected 0x00801BD4, got 0xD41B8000. Little-endian payload packing fixed the upload and allowed the firmware to boot.
  14. Backplane window registers are write-only from SPI — cannot read back 0x1000A/B/C to verify. Track window state in software and only write changed bytes. Force-write all three bytes after any error recovery. The SDK resets to CHIPCOMMON_BASE_ADDRESS after each backplane access.
  15. SDK documents 16-byte SPI backplane read padding, but the current proven path uses 4 bytes — keep this distinction explicit in docs until the SPI_RESP_DELAY_F1 configuration path is independently verified.
  16. Firmware and CLM must be a matched pair — using a 224KB firmware with a 984-byte CLM from a different release produced clmload_status=3 (BCME_BADOPTION). Switching to Embassy's matched pair (231KB FW + 984B CLM from the same wb43439A0_7_95_49_00_combined.h) gave status 0.
  17. F2 ready wait is required before first IOCTL — after HT clock, poll STATUS_F2_RX_READY (bit 5 of SPI status register) before sending any IOCTLs. Without this, the first IOCTL times out.
  18. Event mask must be configured before scan — set event_msgs iovar to enable ESCAN_RESULT delivery. Without this, scan events are never generated and the scan poll loop sees zero packets.
  19. pollDevice() must drain all pending packets — event and control responses can arrive back-to-back. Reading only one packet per poll loses the second.
  20. BDC TX header must use version 2 (0x20) — version 0 silently drops data-channel frames; both Pico SDK and Embassy use BDC v2.
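
Items 12 and 13 can be illustrated together with a short C sketch of the upload arithmetic. The names are hypothetical, `SPI_BLOCK_BYTES` mirrors the 64-byte limit from item 12, and the byte sequence D4 1B 80 00 for offset 0x1000 is an assumption inferred from the expected word 0x00801BD4 in item 13.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the SDK's 64-byte SPI block limit (item 12). */
#define SPI_BLOCK_BYTES 64u

/* Pack four image bytes into one payload word, first byte in the
 * low lane (little-endian), as item 13 requires. */
uint32_t pack_le32(const uint8_t b[4]) {
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Number of block writes needed for an image of len bytes. */
size_t chunk_count(size_t len) {
    return (len + SPI_BLOCK_BYTES - 1u) / SPI_BLOCK_BYTES;
}

/* Length of chunk i; only the last chunk may be short.
 * Caller must ensure i < chunk_count(len). */
size_t chunk_len(size_t len, size_t i) {
    size_t left = len - i * SPI_BLOCK_BYTES;
    return left < SPI_BLOCK_BYTES ? left : SPI_BLOCK_BYTES;
}
```

Packing the bytes the other way round gives 0xD41B8000 for the same input, which is the mismatch the full-image verification reported.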