Date: 2026-05-31
Machine: HP ZBook Ultra G1a 14" (AMD Strix Halo / Ryzen AI Max), BIOS X89 Ver. 01.04.03 (2025-12-03)
Crashing kernel: linux-image-7.1-amd64 version 7.1~rc5-1~exp1 (Debian experimental release candidate)
Working kernel (reverted to): 6.19.14+deb14-amd64
The panics are a software bug in the in‑tree MediaTek mt7925/mt76 Wi‑Fi driver in the 7.1‑rc5 kernel — not a hardware fault. The driver corrupts an internal linked list (sta_poll_list) while processing Wi‑Fi TX‑status reports; the kernel's list‑hardening check catches the corruption and, because it happens in interrupt/NAPI context, it escalates to an unrecoverable panic.
Reverting to 6.19.14 is the correct mitigation — that kernel does not have the regression.
From the EFI‑pstore crash dump (/var/lib/systemd/pstore/…, archived copies in /tmp/panic-report/):
slab kmalloc-8k start ffff8a7ae2bf2000 pointer offset 4160 size 8192
list_add corruption. prev->next should be next (ffff8a74cfe088f8),
but was ffff8a7ae2bf3040. (prev=ffff8a7ae2bf3040).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:32!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 12 Comm: napi/phy0-0 Tainted/Not tainted 7.1-amd64 Debian 7.1~rc5-1~exp1
RIP: 0010:__list_add_valid_or_report+0xa6/0xb0
Call Trace:
__list_add_valid_or_report <- list-hardening BUG (CONFIG_DEBUG_LIST)
mt76_wcid_add_poll [mt76] <- adds station to dev->sta_poll_list
mt7925_mac_add_txs.part.0 [mt7925_common] <- while handling a TX-status report
mt7925_rx_check [mt7925_common]
mt76_dma_rx_poll [mt76]
mt792x_poll_rx [mt792x_lib]
__napi_poll → napi_threaded_poll_loop → kthread (threaded NAPI, softirq ctx)
Kernel panic - not syncing: Fatal exception in interrupt
Mechanism: mt76_wcid_add_poll() adds a station's poll_list node to sta_poll_list, but that node is already linked: prev->next points back at the node itself, which indicates a double‑add. CONFIG_DEBUG_LIST (lib/list_debug.c:32) detects it and executes ud2 (invalid opcode 0f 0b). Because the fault is inside the threaded‑NAPI RX poll, which runs with bottom halves disabled and therefore counts as interrupt context, the Oops becomes "Kernel panic - not syncing: Fatal exception in interrupt".
Why journalctl showed nothing: a panic in interrupt context never lets journald flush to disk. The only record is the firmware's EFI‑pstore, which systemd-pstore archived under /var/lib/systemd/pstore/. journalctl --list-boots only shows the 6.19 recovery boots, not the 7.1 crash boots.
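To check whether an archived pstore record carries this signature, a small helper along the following lines could be used. It is a sketch, not part of the original investigation: the function name find_list_corruption is hypothetical, and it assumes the records are plain-text files somewhere below the directory passed as the first argument (by default /var/lib/systemd/pstore, which is readable by root only).

```shell
#!/bin/sh
# Hypothetical helper: list every file under a pstore archive directory
# that contains the list-hardening message seen in this crash.
# Assumption: records are stored as plain text below the given directory.
find_list_corruption() {
    dir="${1:-/var/lib/systemd/pstore}"
    # -r: recurse, -l: print matching file names only, -F: fixed string
    grep -rlF 'list_add corruption' "$dir" 2>/dev/null
}

# Usage (as root, because the archive is root-only):
#   find_list_corruption
#   find_list_corruption /tmp/panic-report
```

The helper exits non-zero when no record matches, so it can also serve as a quick yes/no check in a script.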
Both panics correlate with Wi‑Fi roaming — repeated re‑association between two BSSIDs of the same SSID (f4:92:bf:2d:45:55 ↔ f4:92:bf:2e:45:55) immediately precedes the corruption. Station‑table / MLO‑link churn during roaming exercises the buggy TXS → poll‑list path.
- RustDesk (mouce-library-fake-mouse, RustDesk UInput Keyboard) floods the log with x86/split lock detection … bus_lock trap warnings and set the W taint in one crash; this is harmless noise.
- The clean reproduction proves the Wi‑Fi driver is the cause: a second crash was marked Not tainted and panicked only 248 s after boot with the identical RIP and mt76/mt7925 call trace, with no RustDesk and no prior warning.
| Component | Details |
| --- | --- |
| Wi‑Fi | MediaTek MT7925 (Filogic 360, Wi‑Fi 7), PCI 14c3:7925, drivers mt7925e / mt7925_common / mt792x_lib / mt76 |
| GPU | AMD Strix Halo Radeon 8050S/8060S (1002:1586) |
| NVMe | SanDisk WD_BLACK SN7100 |
| Kernel install | 7.1‑rc5 installed 2026‑05‑29 18:08, upgraded rc4→rc5, from Debian experimental |
linux-image-amd64 itself was upgraded to 7.1~rc5-1~exp1, so apt is tracking experimental — the next apt upgrade will pull another RC kernel and could reintroduce the crash.
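Which suite apt is following can be checked with apt-cache policy linux-image-amd64. One possible way to keep experimental kernels from being selected is an apt preferences stanza with a negative priority; the sketch below only prints such a stanza so that it can be reviewed first. The function name emit_kernel_pin and the target file name /etc/apt/preferences.d/no-experimental-kernel are assumptions, as is the choice to match every linux-image-* and linux-headers-* package.

```shell
#!/bin/sh
# Hypothetical helper: print an apt_preferences(5) stanza that gives kernel
# packages from the experimental archive a negative priority, which tells
# apt never to install those versions.
# Assumption: the RC kernels come from a release whose archive is "experimental".
emit_kernel_pin() {
    cat <<'EOF'
Package: linux-image-* linux-headers-*
Pin: release a=experimental
Pin-Priority: -1
EOF
}

# Review the output, then install it under a file name of your choice
# (the name below is hypothetical):
#   emit_kernel_pin | sudo tee /etc/apt/preferences.d/no-experimental-kernel
#   apt-cache policy linux-image-amd64
```

Printing the stanza instead of writing it directly keeps the step reversible: nothing changes until the output is placed under /etc/apt/preferences.d/.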
- The bug class is known and being actively fixed, but NOT merged into 7.0 or 7.1‑rc5.
- Zac Bowling, "wifi: mt76: mt7925: MLO stability fixes" (PATCH v7 0/6, linux‑wireless/LKML, 2026‑01‑29) — patch 1/6 = "fix double wcid initialization race condition", the strongest match for this double‑add corruption.
- Out‑of‑tree fixes + DKMS: https://github.com/zbowling/mt7925 (still maintained precisely because the fixes aren't upstream yet).
- Already in 7.1 (so NOT the fix here): "do not add non‑sta wcid entries to the poll list" (AUTOSEL 6.16) and Aug‑2025 "fix list corruption" patches.
- This exact signature is undocumented: not in zbowling's KNOWN_ISSUES.md, not OpenWrt mt76 #909 (paging fault in mt7925_mac_sta_add; different) or #1023 (firmware‑load strnlen overflow; different). The clean mt7925_mac_add_txs → mt76_wcid_add_poll list_add trace appears to be a new data point. (Debian's RC kernels build with CONFIG_DEBUG_LIST, which is why the corruption surfaces as a precise BUG here rather than a random later crash.)
- Stay on 6.19.14 (already done). Set it as the GRUB default so a stray reboot doesn't land on 7.1.
- Stop experimental from auto‑installing RC kernels: remove 7.1 and/or pin linux-image-amd64 back to stable/trixie:

      sudo apt remove linux-image-7.1-amd64 linux-headers-7.1-amd64
      # then fix apt sources/preferences so linux-image-amd64 tracks stable, not experimental

- Confirm the match (optional, decisive): test Zac Bowling's v7 series / DKMS on 7.1‑rc5.
  - Crash stops → it is the double‑wcid‑init race; report "confirmed" to help it merge.
  - Crash persists → distinct unfixed variant; file a fresh upstream report.
- Reporting: don't open a generic "mt7925 crashes" bug (covered). Do:
  - contribute this backtrace to the active effort (github.com/zbowling/mt7925 issue, and/or the linux‑wireless thread; CC mt76 maintainers Felix Fietkau / Lorenzo Bianconi and MediaTek Sean Wang / Deren Wu);
  - file a Debian bug (reportbug linux-image-7.1-amd64): an experimental RC kernel hard‑panicking on common HP hardware; they can hold the RC or backport the fix.
- If you must run 7.1, the only reliable workaround until patched is to avoid the mt7925 RX path (blacklist mt7925e, use a USB Wi‑Fi adapter). No tunable fixes the list corruption itself.
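
The blacklist workaround in the last item could be sketched as follows. The helper name emit_mt7925_blacklist and the file name /etc/modprobe.d/blacklist-mt7925e.conf are hypothetical. A blacklist line only stops automatic loading through device aliases, so the sketch also adds an install override, on the assumption that the module should not load even on an explicit request.

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe.d(5) fragment that keeps the
# mt7925e PCIe driver from loading.
emit_mt7925_blacklist() {
    cat <<'EOF'
# Keep the MT7925 PCIe driver from binding to the card.
blacklist mt7925e
install mt7925e /bin/false
EOF
}

# Review the output, then install it (the file name is hypothetical):
#   emit_mt7925_blacklist | sudo tee /etc/modprobe.d/blacklist-mt7925e.conf
#   sudo update-initramfs -u   # only matters if the module is in the initramfs
#   lsmod | grep mt7925        # after a reboot: expect no output
```

Removing the file and rebooting restores the built‑in Wi‑Fi once a fixed kernel is installed.
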
- Raw firmware crash records: /var/lib/systemd/pstore/<epoch>/…/dmesg.txt (root‑only)
- World‑readable copies: /tmp/panic-report/crash-*.txt
- Collection script: /tmp/panic-investigate.sh
- Full extracted dump: /tmp/dump.log