Skip to content

Instantly share code, notes, and snippets.

@pdp7
Last active May 2, 2026 13:14
Show Gist options
  • Select an option

  • Save pdp7/c253ccd475795f582cdacda51eb2169c to your computer and use it in GitHub Desktop.

Select an option

Save pdp7/c253ccd475795f582cdacda51eb2169c to your computer and use it in GitHub Desktop.
review.md
Patch sizes vs resctrl norms
┌───────────────────────────────────────────────┬──────────────────────┬───────────────────┬──────────────────┐
│ Series │ Patches │ Median │ Max lines/patch │
│ │ │ lines/patch │ │
├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤
│ Recent fs/resctrl (last 30) │ small │ ~30 │ ~50 │
│ │ fixes/refactors │ │ │
├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤
│ Reinette's own commits │ bug fixes │ ~10 │ ~55 │
├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤
│ Babu's RDT_RESOURCE_PERF_PKG telemetry │ 8-10 patches │ ~70 │ ~200 │
│ feature │ │ │ │
├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤
│ Our v4 driver patches (06-10) │ 5 patches │ ~620 │ 975 │
│ │ │ │ (cache_alloc) │
└───────────────────────────────────────────────┴──────────────────────┴───────────────────┴──────────────────┘
Driver patches are 5-10× the resctrl norm. The five-way feature-axis split was meant to address this but each piece
is still ~600-1000 lines.
Likely Reinette flags
1. Patch 10 (bw_mon) is +736 / −252. That −252 is the in-series rework I struggled with — qos_register_ctrl_domain,
qos_attach_l3_mon_domain, qos_register_cap_controller, qos_register_bw_controller,
qos_resctrl_add_controller_domain, qos_domain_update_cpu all introduced earlier in the series and removed here.
Reinette will notice and call it out as "this commit is reworking earlier patches in the series, not adding
bandwidth monitoring."
2. Patch 06 scaffolding still introduces is_cdp_l{2,3}_enabled / exposed_cdp_l{2,3}_capable globals that patch 07
removes. Same smell, smaller scale.
3. Patches 06-09 each around 500-1000 lines. Even granting the feature axis, Reinette's preference is much smaller
units. She may suggest splitting cbqri_apply_cache_config (CDP-off mirror) or cbqri_probe_cc (cache_size + mon
detection) out as separate patches.
4. Item 05 (MBA emulated controls) is still open. v4 ships RDT_RESOURCE_MB_MIN / RDT_RESOURCE_MB_WGHT as top-level
resources. Reinette's reply on v3 said she'd post a PoC for "emulated controls" that lets MB_MIN / MB_WGHT live as
child schemata under MB. The cover acknowledges this as "provisional". She may say "wait for the PoC" or "this v4
design contradicts the direction we agreed".
5. mbm_total_bytes exposed at L3 scope for a BC. The pairing assumption ("all memory traffic flows through one BC")
is in a comment but not enforced — multi-BC platforms with the matching topology would still get the wrong answer.
Reinette's preference is to refuse-rather-than-misreport. Currently the driver does refuse via
cbqri_find_only_mon_bc(), but the criterion is "exactly one mon-capable BC", not "one BC per L3" — these can differ
on real SoCs.
6. CDP-off mirror in cbqri_apply_cache_config is a behavior we added because a v3 review caught it. The fix: when
CDP is disabled but the CC supports AT, write CONFIG_LIMIT twice (DATA and CODE) per schemata write. Functionally
correct but doubles the latency on every alloc. Reinette may ask whether resctrl could be taught to invoke the arch
callback once-per-AT-half instead, or whether AT=0 should mean "both halves" in CBQRI v1.1 (the spec issue).
7. qos_resctrl_online_cpu on error breaks the loop without rolling back. If the 3rd controller fails to attach, the
first two are left attached for this cpu. resctrl-arch contract says cpuhp callbacks are best-effort but the partial
state may surface as inconsistency between cpus_list and the actually-online cpus.
8. resctrl_arch_get_cdp_enabled returns false for unknown rids; resctrl_arch_set_cdp_enabled returns -ENODEV.
Asymmetric. x86 returns false and -EINVAL (not -ENODEV).
9. max_rmid is global, narrowed across all mon-capable BCs. Reinette has historically pushed for per-resource
num_rmid (her own commit x86,fs/resctrl: Compute number of RMIDs as minimum across resources is the cross-resource
handling for x86). Our model assumes one global narrowing; she may ask for per-resource.
10. The 62-bit OVF-aware accumulator (mbm_total_bytes) deserves a kunit test. x86's MBM has equivalent code in
mbm_overflow_count(). We borrow the trick but don't have a test that proves the chunk math handles a wrap correctly.
11. pr_warn_once("CC @%pa: monitor-only controller skipped...") — the code emits this from
qos_resctrl_accumulate_caps(). monitor-only CCs are a CBQRI-spec-legal configuration (Section 4.2). Reinette may
want this either gated on a Kconfig debug or replaced with a structured limitation note in the cover and a pr_info
once.
12. qos_init_domain_ctrlval triggers the boot warning qos: resctrl: RBWB sum 820 exceeds MRBWB 819 / err=-22.
Pre-existing on this branch, not from my fixes — but it shows up in dmesg every boot, which Reinette will see.
13. DT support is "possible" but absent. The cover says "no platform drivers are included" but doesn't explain why
DT is left out. Reinette won't necessarily push back, but a DT maintainer (Conor, Rob) might.
Commit message length
┌──────────────────────────────────┬────────────┬───────────────────────────────┐
│ Patch │ Body lines │ Verdict │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ dt-bindings (02) │ 7 │ A bit terse, but tracks norm │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ Detect Ssqosid (03) │ 9 │ Same │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ srmcfg CSR (04) │ 28 │ Good │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ MB_MIN/MB_WGHT (05) │ 34 │ Good │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ Driver scaffolding (06) │ 49 │ Slightly long but justified │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ Driver alloc/mon patches (07-10) │ 42-55 │ Good — within Reinette's norm │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ ACPI/RQSC (11-13) │ 18-27 │ Good │
├──────────────────────────────────┼────────────┼───────────────────────────────┤
│ Selftests (15-17) │ 21-30 │ Good │
└──────────────────────────────────┴────────────┴───────────────────────────────┘
Commit message lengths are fine. They tend toward the longer end of resctrl norm (10-25), but that's appropriate for
new-feature commits vs bug fixes.
Cover letter length
- v3 (sent) cover body: 329 lines
- v4 cover body: 366 lines
- Reinette didn't complain about v3 cover length
Cover letter length is fine. ~10% longer than v3 because of the v4 changelog additions — expected.
Summary of items that warrant decisions before sending
In rough priority order:
- (1, 2) In-series rework (patches 06 introducing dead globals, patch 10 removing 252 lines added by 06-09). This is
the user's earlier complaint. Either accept and ship as-is, or invest in the proper sequential-rebuild I deferred
earlier.
- (4) Whether to send v4 with MB_MIN/MB_WGHT as top-level, or wait for Reinette's PoC.
- (7) Hotplug error rollback in qos_resctrl_online_cpu.
- (5) Tighten the BC pairing criterion to "one mon-capable BC at L3 scope" rather than "one mon-capable BC
system-wide".
- (10) Add a kunit test for the 62-bit accumulator.
- (8, 9) Cosmetic / arch-callback contract alignment with x86.
Want me to pursue any of these? Or any other angle?
✻ Sautéed for 3m 49s · 1 shell still running
───────────────────────────────────────────
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment