Last active
May 2, 2026 13:14
-
-
Save pdp7/c253ccd475795f582cdacda51eb2169c to your computer and use it in GitHub Desktop.
review.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Patch sizes vs resctrl norms | |
| ┌───────────────────────────────────────────────┬──────────────────────┬───────────────────┬──────────────────┐ | |
| │ Series │ Patches │ Median │ Max lines/patch │ | |
| │ │ │ lines/patch │ │ | |
| ├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤ | |
| │ Recent fs/resctrl (last 30) │ small │ ~30 │ ~50 │ | |
| │ │ fixes/refactors │ │ │ | |
| ├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤ | |
| │ Reinette's own commits │ bug fixes │ ~10 │ ~55 │ | |
| ├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤ | |
| │ Babu's RDT_RESOURCE_PERF_PKG telemetry │ 8-10 patches │ ~70 │ ~200 │ | |
| │ feature │ │ │ │ | |
| ├───────────────────────────────────────────────┼──────────────────────┼───────────────────┼──────────────────┤ | |
| │ Our v4 driver patches (06-10) │ 5 patches │ ~620 │ 975 │ | |
| │ │ │ │ (cache_alloc) │ | |
| └───────────────────────────────────────────────┴──────────────────────┴───────────────────┴──────────────────┘ | |
| Driver patches are 5-10× the resctrl norm. The five-way feature-axis split was meant to address this but each piece | |
| is still ~600-1000 lines. | |
| Likely Reinette flags | |
| 1. Patch 10 (bw_mon) is +736 / −252. That −252 is the in-series rework I struggled with — qos_register_ctrl_domain, | |
| qos_attach_l3_mon_domain, qos_register_cap_controller, qos_register_bw_controller, | |
| qos_resctrl_add_controller_domain, qos_domain_update_cpu all introduced earlier in the series and removed here. | |
| Reinette will notice and call it out as "this commit is reworking earlier patches in the series, not adding | |
| bandwidth monitoring." | |
| 2. Patch 06 scaffolding still introduces is_cdp_l{2,3}_enabled / exposed_cdp_l{2,3}_capable globals that patch 07 | |
| removes. Same smell, smaller scale. | |
| 3. Patches 06-09 each around 500-1000 lines. Even granting the feature axis, Reinette's preference is much smaller | |
| units. She may suggest splitting cbqri_apply_cache_config (CDP-off mirror) or cbqri_probe_cc (cache_size + mon | |
| detection) out as separate patches. | |
| 4. Item 05 (MBA emulated controls) is still open. v4 ships RDT_RESOURCE_MB_MIN / RDT_RESOURCE_MB_WGHT as top-level | |
| resources. Reinette's reply on v3 said she'd post a PoC for "emulated controls" that lets MB_MIN / MB_WGHT live as | |
| child schemata under MB. The cover acknowledges this as "provisional". She may say "wait for the PoC" or "this v4 | |
| design contradicts the direction we agreed". | |
| 5. mbm_total_bytes exposed at L3 scope for a BC. The pairing assumption ("all memory traffic flows through one BC") | |
| is in a comment but not enforced — multi-BC platforms with the matching topology would still get the wrong answer. | |
| Reinette's preference is to refuse-rather-than-misreport. Currently the driver does refuse via | |
| cbqri_find_only_mon_bc(), but the criterion is "exactly one mon-capable BC", not "one BC per L3" — these can differ | |
| on real SoCs. | |
| 6. CDP-off mirror in cbqri_apply_cache_config is a behavior we added because a v3 review caught it. The fix: when | |
| CDP is disabled but the CC supports AT, write CONFIG_LIMIT twice (DATA and CODE) per schemata write. Functionally | |
| correct but doubles the latency on every alloc. Reinette may ask whether resctrl could be taught to invoke the arch | |
| callback once-per-AT-half instead, or whether AT=0 should mean "both halves" in CBQRI v1.1 (the spec issue). | |
| 7. qos_resctrl_online_cpu on error breaks the loop without rolling back. If the 3rd controller fails to attach, the | |
| first two are left attached for this cpu. resctrl-arch contract says cpuhp callbacks are best-effort but the partial | |
| state may surface as inconsistency between cpus_list and the actually-online cpus. | |
| 8. resctrl_arch_get_cdp_enabled returns false for unknown rids; resctrl_arch_set_cdp_enabled returns -ENODEV. | |
| Asymmetric. x86 returns false and -EINVAL (not -ENODEV). | |
| 9. max_rmid is global, narrowed across all mon-capable BCs. Reinette has historically pushed for per-resource | |
| num_rmid (her own commit x86,fs/resctrl: Compute number of RMIDs as minimum across resources is the cross-resource | |
| handling for x86). Our model assumes one global narrowing; she may ask for per-resource. | |
| 10. The 62-bit OVF-aware accumulator (mbm_total_bytes) deserves a kunit test. x86's MBM has equivalent code in | |
| mbm_overflow_count(). We borrow the trick but don't have a test that proves the chunk math handles a wrap correctly. | |
| 11. pr_warn_once("CC @%pa: monitor-only controller skipped...") — the code emits this from | |
| qos_resctrl_accumulate_caps(). monitor-only CCs are a CBQRI-spec-legal configuration (Section 4.2). Reinette may | |
| want this either gated on a Kconfig debug or replaced with a structured limitation note in the cover and a pr_info | |
| once. | |
| 12. qos_init_domain_ctrlval triggers the boot warning qos: resctrl: RBWB sum 820 exceeds MRBWB 819 / err=-22. | |
| Pre-existing on this branch, not from my fixes — but it shows up in dmesg every boot, which Reinette will see. | |
| 13. DT support is "possible" but absent. The cover says "no platform drivers are included" but doesn't explain why | |
| DT is left out. Reinette won't necessarily push back, but a DT maintainer (Conor, Rob) might. | |
| Commit message length | |
| ┌──────────────────────────────────┬────────────┬───────────────────────────────┐ | |
| │ Patch │ Body lines │ Verdict │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ dt-bindings (02) │ 7 │ A bit terse, but tracks norm │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ Detect Ssqosid (03) │ 9 │ Same │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ srmcfg CSR (04) │ 28 │ Good │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ MB_MIN/MB_WGHT (05) │ 34 │ Good │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ Driver scaffolding (06) │ 49 │ Slightly long but justified │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ Driver alloc/mon patches (07-10) │ 42-55 │ Good — within Reinette's norm │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ ACPI/RQSC (11-13) │ 18-27 │ Good │ | |
| ├──────────────────────────────────┼────────────┼───────────────────────────────┤ | |
| │ Selftests (15-17) │ 21-30 │ Good │ | |
| └──────────────────────────────────┴────────────┴───────────────────────────────┘ | |
| Commit message lengths are fine. They tend toward the longer end of resctrl norm (10-25), but that's appropriate for | |
| new-feature commits vs bug fixes. | |
| Cover letter length | |
| - v3 (sent) cover body: 329 lines | |
| - v4 cover body: 366 lines | |
| - Reinette didn't complain about v3 cover length | |
| Cover letter length is fine. ~10% longer than v3 because of the v4 changelog additions — expected. | |
| Summary of items that warrant decisions before sending | |
| In rough priority order: | |
| - (1, 2) In-series rework (patches 06 introducing dead globals, patch 10 removing 252 lines added by 06-09). This is | |
| the user's earlier complaint. Either accept and ship as-is, or invest in the proper sequential-rebuild I deferred | |
| earlier. | |
| - (4) Whether to send v4 with MB_MIN/MB_WGHT as top-level, or wait for Reinette's PoC. | |
| - (7) Hotplug error rollback in qos_resctrl_online_cpu. | |
| - (5) Tighten the BC pairing criterion to "one mon-capable BC at L3 scope" rather than "one mon-capable BC | |
| system-wide". | |
| - (10) Add a kunit test for the 62-bit accumulator. | |
| - (8, 9) Cosmetic / arch-callback contract alignment with x86. | |
| Want me to pursue any of these? Or any other angle? | |
| ✻ Sautéed for 3m 49s · 1 shell still running | |
| ─────────────────────────────────────────── |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment