@sritchie
Last active May 21, 2026 02:28
OnSpeed Gen3 PERF report — sample from archived bench capture (PR #612 schema demo)

OnSpeed PERF Report — master-2026-05-20

Generated by tools/perf-report/capture_perf_report.py. Stable schema — diff this file against other PERF reports.
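Because the schema is stable, two reports can be compared line by line. The following is a minimal sketch using Python's standard `difflib`, not part of `capture_perf_report.py`; the helper name `diff_reports`, the file names, and the `candidate` row values are hypothetical and invented purely for illustration.

```python
import difflib


def diff_reports(old_text: str, new_text: str,
                 old_name: str = "baseline.md",
                 new_name: str = "candidate.md") -> str:
    """Return a unified diff of two PERF report texts (hypothetical helper)."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    )
    return "\n".join(diff)


# Two core-summary rows from this report ...
baseline = "Core 0 1.52% 98.48% 98.11%\nCore 1 26.88% 73.12% 0.00%"
# ... and a made-up second capture (illustrative values, not measured data).
candidate = "Core 0 1.52% 98.48% 98.11%\nCore 1 28.10% 71.90% 0.00%"

print(diff_reports(baseline, candidate))
```

Only the rows whose numbers changed show up as `-`/`+` pairs, which is what makes a fixed row and column order useful.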

Capture conditions

  • Label: master-2026-05-20
  • Captured: 2026-05-20 20:27:53 -0600
  • Git commit: f0e25a68-dirty (branch feat/perf-report-skill)
  • Build env: esp32s3-v4p-perf
  • Hardware: V4P
  • Duration: 60s (60 snapshots aggregated)
  • SD card: in
  • WiFi clients: 0
  • M5 attached: no
  • IMU rate (configured): 208 Hz
  • Log rate (configured): 208 Hz
  • Notes: Baseline = master + PR #615 (TLS-fix, ArduinoLoop removed, EFIS/Boom binding restored). Idle bench, SD card in, WiFi AP up no clients, no M5.

Core summary (CPU% rollup)

One core = 1,000,000 µs/sec. Used (reliable) sums the CPU% of every instrumented task whose scope bounds only real work (it excludes vTaskDelay / xRingbufferReceive blocking time). Headroom is 100% minus Used: what's left for new features at the current bench configuration.

Tasks where the PerfLoop scope still includes blocking time are reported separately as Blocking-included — their loops × avg figure overstates real CPU consumption and shouldn't be summed into Used. For honest CPU% of those tasks, look at their subsystem-level scopes (e.g. log_write + log_sync for the Log task). See issue #611 for the planned scope reshape.

| Core | Used (reliable) | Headroom | Blocking-included |
| --- | --- | --- | --- |
| Core 0 | 1.52% | 98.48% | 98.11% |
| Core 1 | 26.88% | 73.12% | 0.00% |
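The rollup above can be reproduced from the per-task figures in this report. This is a minimal sketch, not the code in `capture_perf_report.py`: the `TASKS` tuple layout and the `core_summary` helper are hypothetical, and the loops/s and avg µs values are the medians from the per-task table.

```python
from collections import defaultdict

# (task, core, loops_per_s, avg_us, blocking_included)
# Medians copied from the per-task table; the tuple layout is an assumption.
TASKS = [
    ("Log",          0, 208, 4717, True),   # scope still includes blocking time
    ("Imu",          1, 208, 1163, False),
    ("Sensors",      1,  50,  404, False),
    ("WebServer",    0,  20,  655, False),
    ("Display",      1,  20,  242, False),
    ("DataServer",   0,  20,  103, False),
    ("Switch",       1,  92,   17, False),
    ("Housekeeping", 1,  10,   24, False),
    ("Audio",        1,  10,    4, False),
]


def core_summary(tasks):
    """Roll per-task work up into Used / Headroom / Blocking-included per core."""
    used_us = defaultdict(int)      # µs of work per second, reliable scopes only
    blocking_us = defaultdict(int)  # µs per second from blocking-included scopes
    for _name, core, loops, avg_us, blocking_included in tasks:
        bucket = blocking_us if blocking_included else used_us
        bucket[core] += loops * avg_us
    summary = {}
    for core in sorted(set(used_us) | set(blocking_us)):
        used_pct = used_us[core] / 10_000  # 1,000,000 µs per core-second = 100%
        summary[core] = {
            "used": round(used_pct, 2),
            "headroom": round(100 - used_pct, 2),
            "blocking_included": round(blocking_us[core] / 10_000, 2),
        }
    return summary


for core, row in core_summary(TASKS).items():
    print(f"Core {core}: {row}")
```

Summing unrounded per-task values before rounding is an assumption; it is what makes Core 1 come out at 26.88% rather than the 26.87% obtained by adding the already-rounded CPU% cells.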

Per-task CPU (work-only)

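Values are the median across all snapshots, with (min..max) shown in parentheses when the range is non-trivial. loops/s is the actual measured iterations per second (not the configured rate). CPU% is loops × avg ÷ 10,000: loops/s × avg µs gives µs of work per second, and dividing by the 1,000,000 µs in one core-second, then multiplying by 100, yields the percentage of one core consumed by this task's instrumented work.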

| Task | Core | Loops/s | Avg µs | CPU% | p50 | p95 | p99 | Max | Stack free | Drops |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Log | 0 | 208 (111..208) | 4,717 (4,697..8,917) | 98.11%† | 6,144 | 8,192 (8,192..24,576) | 8,192 (8,192..32,768) | 11,907 (8,909..136,899) | 2,000 | 0 |
| Imu | 1 | 208 (206..210) | 1,163 (1,159..1,205) | 24.19% | 1,536 (1,522..1,536) | 1,536 (1,522..1,536) | 1,536 (1,522..1,617) | 1,566 (1,522..1,672) | 1,172 | 0 |
| Sensors | 1 | 50 (49..51) | 404 (346..451) | 2.02% | 208 (208..224) | 944 (912..1,382) | 1,301 (1,277..1,385) | 1,301 (1,277..1,385) | 3,804 | 0 |
| WebServer | 0 | 20 (19..20) | 655 (619..889) | 1.31% | 640 (637..640) | 656 (637..2,048) | 656 (637..2,048) | 1,120 (637..2,119) | 8,720 | 0 |
| Display | 1 | 20 | 242 (235..253) | 0.48% | 256 (240..256) | 260 (253..288) | 260 (253..288) | 260 (253..480) | 1,160 | 0 |
| DataServer | 0 | 20 | 103 (101..119) | 0.21% | 110 (108..112) | 110 (108..208) | 110 (108..208) | 110 (108..330) | 6,440 (6,408..6,440) | 0 |
| Switch | 1 | 92 (91..93) | 17 (16..18) | 0.16% | 28 (23..32) | 28 (23..32) | 28 (23..40) | 28 (23..44) | 4,212 | 0 |
| Housekeeping | 1 | 10 (9..10) | 24 (22..37) | 0.02% | 32 (16..32) | 47 (43..64) | 47 (43..64) | 47 (43..137) | 2,416 | 0 |
| Audio | 1 | 10 | 4 (3..5) | 0.00% | 8 (5..10) | 8 (5..10) | 8 (5..10) | 8 (5..10) | 1,996 | 0 |

† PerfLoop scope includes blocking time; CPU% is overstated. See subsystem-level scopes for honest cost.
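Table cells in this report use the form `median` or `median (min..max)`, with thousands separators. The sketch below shows one way a script consuming the report might parse such a cell and re-derive a task's CPU% from its median loops/s and avg µs. The names `parse_cell` and `task_cpu_percent` are hypothetical and are not taken from the generator script.

```python
import re

# Matches "20", "4,717", or "4,717 (4,697..8,917)".
CELL = re.compile(r"^\s*([\d,]+)(?:\s*\(([\d,]+)\.\.([\d,]+)\))?\s*$")


def _to_int(text: str) -> int:
    """Convert '4,717' to 4717."""
    return int(text.replace(",", ""))


def parse_cell(cell: str):
    """Return (median, min, max); min and max equal the median when no range is shown."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    median = _to_int(match.group(1))
    low = _to_int(match.group(2)) if match.group(2) else median
    high = _to_int(match.group(3)) if match.group(3) else median
    return median, low, high


def task_cpu_percent(loops_cell: str, avg_us_cell: str) -> float:
    """CPU% = median loops/s x median avg µs / 10,000."""
    loops, _, _ = parse_cell(loops_cell)
    avg_us, _, _ = parse_cell(avg_us_cell)
    return loops * avg_us / 10_000


# Imu row: 208 loops/s at 1,163 µs average.
print(round(task_cpu_percent("208 (206..210)", "1,163 (1,159..1,205)"), 2))
```

Cells carrying a suffix such as `98.11%†` are deliberately rejected by this pattern; it only targets the plain numeric columns.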

Per-subsystem timing

CPU% is total ÷ 10,000 — fraction of one core spent in this subsystem per second. Sorted high to low. Subsystems nested inside a task contribute to that task's CPU% above (not additive — these are slices of, not extras on top of, the task totals).

| Subsystem | Calls/s | Total/s µs | CPU% | p50 | p95 | p99 | Max |
| --- | --- | --- | --- | --- | --- | --- | --- |
| log_write | 145 (89..146) | 235,172 (224,838..514,634) | 23.52% | 2,048 | 2,048 (2,048..16,384) | 4,096 (2,048..24,576) | 6,658 (2,950..96,168) |
| ekfq.correct | 208 (206..210) | 90,216 (88,881..95,663) | 9.02% | 432 (432..480) | 528 (480..688) | 704 (695..752) | 722 (695..794) |
| ekfq.predict | 208 (206..210) | 15,119 (14,824..15,421) | 1.51% | 80 | 80 (80..96) | 96 (80..112) | 330 (96..378) |
| ekfq.alpha | 208 (206..210) | 5,754 (5,694..6,741) | 0.58% | 32 (32..48) | 48 (32..48) | 48 (39..59) | 49 (39..333) |
| log_sync | 1 | 3,781 (3,594..10,775) | 0.38% | 16 | 16 | 16 | 3,781 (3,594..10,775) |
| efis_read | 91 (89..93) | 1,007 (970..1,087) | 0.10% | 16 (14..16) | 16 (14..17) | 16 (14..27) | 17 (14..27) |
| boom_read | 91 (89..93) | 933 (897..974) | 0.09% | 15 (14..16) | 15 (14..16) | 15 (14..23) | 15 (14..24) |
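The report points to log_write + log_sync as the honest measure of the Log task's cost. As a rough illustration, the sketch below applies the total ÷ 10,000 rule to those two rows. The dictionary and function names are hypothetical; the µs totals are the medians from the table above.

```python
# Median "Total/s µs" values for the two Log-task subsystems in this report.
LOG_SUBSYSTEM_TOTAL_US = {
    "log_write": 235_172,
    "log_sync": 3_781,
}


def subsystem_cpu_percent(total_us_per_s: int) -> float:
    """CPU% = µs spent per second / 10,000 (one core-second is 1,000,000 µs)."""
    return total_us_per_s / 10_000


# Sum the raw µs first, then convert, to avoid compounding rounding error.
log_work_pct = subsystem_cpu_percent(sum(LOG_SUBSYSTEM_TOTAL_US.values()))

print(f"Log work-only estimate: {log_work_pct:.2f}% of one core")
```

Under these figures the work-only estimate is roughly a quarter of a core, against the 98.11% shown for the blocking-included Log scope. Any Log-task work outside these two scopes is not captured by this estimate.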

SPI bus

| Bus | Bytes/s | Xfers/s | Max xfer µs |
| --- | --- | --- | --- |
| spi.aoa | 100 (98..102) | 50 (49..51) | 19 (18..30) |
| spi.imu | 2,952 (2,910..2,994) | 436 (430..442) | 67 (62..227) |
| spi.pitot | 100 (98..102) | 50 (49..51) | 68 (65..80) |
| spi.static | 416 (410..422) | 208 (205..211) | 201 (24..208) |

System health

  • Heap free: 8,111,156 bytes
  • Heap min since boot: 8,094,756 bytes
  • Largest free block: 21,492 bytes