The 6 "levers" in docs/conformance/levers.md are detection infrastructure — they help find gaps faster but don't close them. Building a better microscope doesn't fix the specimen. The actual conformance improvements come from fixing known protocol gaps.
| File | Content |
|---|---|
| `pipeline.md` | Master spec. 6-step process: mine segment → templatize → puzzle → engine → bind IDs → hydrate+diff. 3 comparison layers. |
| `levers.md` | 6 scaling improvements, prioritized: sequence comparison (#1), closed-loop tests (#2), seat 2 (#3), structural binding (#4), lossless puzzle gen (#5), regression signal (#6) |
| `debugging.md` | Proto debugging cookbook: annotation ordering, category codes, instanceId lifecycle, gsId chain, action consistency |
| `workflow.md` | Diagnose-fix-verify pipeline with 3 phases and 3 gates |
| `diagnosis-schema.md` | Structured bug diagnosis format; every claim must cite evidence |
| `wire-spec-schema.md` | Phase 0 artifact schema for wire specs from recordings |
| `sealed-wire-conformance.md` | Real analysis report from a Sealed proxy session: 6 FD endpoints with field-level gaps |
~35+ test classes covering:
- Wire protocol: `DealHandConformanceTest`, `ActionFieldConformanceTest`, `ShouldStopConformanceTest`, `GsIdChainTest`, `ValidatingMessageSinkTest`
- Zone transitions: `ZoneTransitionConformanceTest`, `InstanceIdReallocTest`
- Annotations: `AnnotationOrderingTest`, `AttachmentAnnotationTest`, `RevealAnnotationTest`, `StructuralFingerprintTest`
- AI turn: `AiTurnConformanceTest`, `AiCombatAutoPassTest`, `AiFirstTurnShapeTest`, `AiLandPlayOrderTest`
- Integration: `CombatFlowTest`, `TargetingFlowTest`, `PvpBridgeEndToEndTest`
- Mechanics: `KickerTest`, `ScryETBFlowTest`, `ModalETBFlowTest`, `TreasureTokenTest`, `SbaDeathTest`, `CounteredSpellTest`, etc.
- Pipeline: `ConformancePipelineTest` (engine run → JSON dump → Python binding + diff)
| # | Title | Labels |
|---|---|---|
| 182 | Conformance: sequence comparison (lever #1) | conformance, enhancement |
| 183 | Conformance: ObjectTracker for recording-matched puzzles (lever #5) | conformance, tooling |
| 142 | Conformance: document and track protocol field gaps | protocol |
| 119 | Treasure token: sacrifice annotation conformance gaps | — |
| 117 | Review and evolve conformance testing strategy | — |
| 173 | Cast adventure: 5 code gaps blocking the mechanic | protocol, feat |
| 141 | Combat damage VFX shows generic projectile instead of melee hit | — |
| Mechanism | Files | Mechanics covered |
|---|---|---|
| Golden .bin field coverage | 12 .bin files | ConnectResp, DieRoll, MulliganReq, InitialGSM, SelectTargetsReq, DeclareAttackers/Blockers, ActionsAvailableReq, IntermissionReq, SetSettingsResp |
| JSON templates | 5 files | DeclareAttackers, DeclareBlockers, PlayLand, Kicker |
| Inline Kotest assertions | 4 test classes | PlayLand fields, TwoPhase targeting, Kicker CTO shape, Annotation ordering |
| Pipeline (Python) | `tape conform` | DeclareAttackers, DeclareBlockers |
- DeclareAttackersReq
- DeclareBlockersReq
- DieRollResultsResp
- MulliganReq
- GroupReq (London mulligan tuck)
- SetSettingsResp
- IntermissionReq
- ShouldStop evaluation: 100% match
- Annotation shapes: 39 types, all matching
These are the actual field gaps the client sees today.
| Field path | Impact |
|---|---|
| `actions[].manaCost[].color[]` | Client can't show mana pip overlays on castable cards |
| `actions[].manaCost[].count` | Same |
| `inactiveActions[].actionType` | Client can't grey out unplayable cards |
| `inactiveActions[].grpId` | Same |
| `inactiveActions[].instanceId` | Same |
| `inactiveActions[].facetId` | Same |
| `inactiveActions[].manaCost[].color[]` | Same |
| `inactiveActions[].manaCost[].count` | Same |
These gaps fall into two buckets: the `manaCost` display field is missing (the client gets the richer `manaPaymentOptions` but not the simpler overlay field), and `inactiveActions` is entirely absent (unplayable cards are invisible to the client).
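One way to surface gaps like these mechanically is to flatten a recorded message and an engine message into field paths and take the set difference. The sketch below is illustrative only: `field_paths` is a hypothetical helper, not the pipeline's actual diff code, and the two sample messages use made-up values.

```python
def field_paths(msg, prefix=""):
    """Collect the set of field paths present in a decoded message.

    List elements collapse into a single `[]` segment, so every element
    of `actions` contributes to the same `actions[].<field>` paths.
    """
    paths = set()
    if isinstance(msg, dict):
        for key, value in msg.items():
            child = f"{prefix}.{key}" if prefix else key
            paths |= field_paths(value, child)
    elif isinstance(msg, list):
        for item in msg:
            paths |= field_paths(item, f"{prefix}[]")
    else:
        # Scalar leaf: the path itself is the field.
        paths.add(prefix)
    return paths


# Hypothetical sample messages; IDs and values are invented for illustration.
recorded = {
    "actions": [
        {"actionType": 1, "grpId": 100, "manaCost": [{"color": [2], "count": 1}]}
    ],
    "inactiveActions": [{"actionType": 1, "grpId": 200}],
}
engine = {"actions": [{"actionType": 1, "grpId": 100}]}

# Fields the recording carries that the engine never emits.
missing = sorted(field_paths(recorded) - field_paths(engine))
for path in missing:
    print(path)
```

Collapsing list indices into `[]` keeps the output in the same notation as the tables in this document, at the cost of hiding which element a field was missing from.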
| Field path | Impact |
|---|---|
| `targets[].prompt.promptId` | Unknown client dependency |
| `targets[].prompt.parameters[].parameterName` | Unknown |
| `targets[].prompt.parameters[].type` | Unknown |
| `targets[].prompt.parameters[].numberValue` | Unknown |
| `targets[].targetingAbilityGrpId` | May affect targeting UI |
| `abilityGrpId` | May affect targeting UI |
| Field path | Impact |
|---|---|
| `diffDeletedInstanceIds[]` | Incremental update optimization |
| `gameInfo.maxPipCount` | Timer UI broken |
| `gameInfo.maxTimeoutCount` | Timer UI broken |
| `gameInfo.timeoutDurationSec` | Timer UI broken |
| `pendingMessageCount` | Flow control; may cause visual stalls |
| `timers[].elapsedMs` | Timer UI |
| `zones[].objectInstanceIds[]` | Zone membership shortcut |
| Field path | Impact |
|---|---|
| `deckMessage.deckCards[]` | Likely redundant |
| `greChangelist` | Cosmetic |
| `grpChangelist` | May affect client caching |
| `skins[].catalogId` | Card art cosmetics |
| `skins[].skinCode` | Card art cosmetics |
Recording treats mana activation as a full lifecycle:
- Separate instanceId for the mana ability
- `AbilityInstanceCreated` / `AbilityInstanceDeleted`
- `UserActionTaken` with `actionType=4`
Engine collapses it into the spell cast — no separate ability instance, no mana-specific UserActionTaken.
Causes 4 of 6 CastSpell annotation diffs. Affects every spell cast that taps lands.
| # | Gap | Recording | Engine |
|---|---|---|---|
| 1 | Mana ability instance ID | Separate ID | Same as spell |
| 2 | AbilityInstanceCreated affectorId | Points to mana source (Island) | Missing (0) |
| 3 | Annotation order | Tap→UserAction(mana)→ManaPaid→Delete→UserAction(cast) | ManaPaid→Delete→UserAction(cast)→Tap |
| 4 | Missing UserActionTaken | Two: actionType=4 (mana) + actionType=1 (cast) | One: actionType=1 only |
| 5 | ManaPaid details | id=3, color=2 (blue) | id=0, color="" (empty) |
| 6 | TappedUntappedPermanent affectorId | Mana ability instance | The Island itself |
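Gaps #3 and #4 can be told apart by comparing the two annotation sequences once with order and once without. The following is a minimal sketch using the shorthand labels from row 3 of the table; `first_divergence` is a hypothetical helper, not existing pipeline code.

```python
from collections import Counter

# Annotation sequences copied from row 3 of the table above.
recording = ["Tap", "UserAction(mana)", "ManaPaid", "Delete", "UserAction(cast)"]
engine = ["ManaPaid", "Delete", "UserAction(cast)", "Tap"]


def first_divergence(expected, actual):
    """Return (index, expected_item, actual_item) at the first mismatch, or None."""
    for i in range(max(len(expected), len(actual))):
        e = expected[i] if i < len(expected) else None
        a = actual[i] if i < len(actual) else None
        if e != a:
            return i, e, a
    return None


# Order-sensitive view (gap #3): where the sequences first disagree.
divergence = first_divergence(recording, engine)

# Order-insensitive view (gap #4): annotations present on one side only.
missing = list((Counter(recording) - Counter(engine)).elements())
extra = list((Counter(engine) - Counter(recording)).elements())

print(divergence, missing, extra)
```

The multiset difference isolates the missing mana `UserActionTaken` regardless of ordering, while the positional check reports the ordering problem separately.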
- `ColorProduction` persistent annotation: never emitted on land play
- Treasure sacrifice annotations (#119): token lifecycle gaps
- Adventure mechanic (#173): 5 code gaps blocking cast-adventure
The 6 levers from levers.md make it easier to detect new gaps. They're force multipliers, not fixes.
| Lever | What it does |
|---|---|
| #1 Sequence comparison | Compare message sequences, not single frames |
| #2 Close the loop | Port binding+diff to Kotlin, CI regression gate |
| #3 Seat 2 conformance | Test AI-turn message paths |
| #4 Structural binding | Match annotations by composite key, not position |
| #5 Lossless puzzles | ObjectTracker for any recording segment |
| #6 Regression signal | Golden diffs, conformance index, CI gate |
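To make lever #4 concrete, here is a rough sketch of pairing annotations by a composite key instead of by list position. The choice of key (`type` plus `affectedIds`), the dict shape, and the instance IDs are all assumptions made for illustration; `levers.md` may define the key differently, and the sketch assumes IDs have already been bound between recording and engine.

```python
from collections import defaultdict


def bind_by_key(recorded, engine, key_fields=("type", "affectedIds")):
    """Pair recorded and engine annotations that share a composite key.

    Returns (pairs, unmatched_recorded, leftover_engine).
    """
    def key(ann):
        # Lists are converted to tuples so the key is hashable.
        return tuple(
            tuple(ann.get(f)) if isinstance(ann.get(f), list) else ann.get(f)
            for f in key_fields
        )

    pool = defaultdict(list)
    for ann in engine:
        pool[key(ann)].append(ann)

    pairs, unmatched = [], []
    for ann in recorded:
        candidates = pool.get(key(ann))
        if candidates:
            pairs.append((ann, candidates.pop(0)))
        else:
            unmatched.append(ann)

    leftover = [ann for anns in pool.values() for ann in anns]
    return pairs, unmatched, leftover


# Hypothetical annotations: same content, different order, one affectorId differs.
recorded = [
    {"type": "TappedUntappedPermanent", "affectedIds": [101], "affectorId": 300},
    {"type": "ManaPaid", "affectedIds": [200], "affectorId": 101},
]
engine = [
    {"type": "ManaPaid", "affectedIds": [200], "affectorId": 101},
    {"type": "TappedUntappedPermanent", "affectedIds": [101], "affectorId": 101},
]

pairs, unmatched, leftover = bind_by_key(recorded, engine)
affector_diffs = [
    (r["type"], r["affectorId"], e["affectorId"])
    for r, e in pairs
    if r["affectorId"] != e["affectorId"]
]
print(affector_diffs)
```

A positional comparison would flag both annotations here as mismatched; keyed binding reduces the report to the single `affectorId` difference, which is the shape of gap #6 in the mana lifecycle table.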
Ranked by player-visible impact:
| Priority | What | Why | Scope |
|---|---|---|---|
| 1 | `inactiveActions` in ActionsAvailableReq | Every turn, client can't show unplayable cards greyed out | Medium: need to compute what's uncastable |
| 2 | `manaCost` on actions | Mana pip overlays missing on all castable cards | Small: data available, just not wired |
| 3 | Mana ability lifecycle | Every spell cast is structurally wrong (4 annotation diffs per cast) | Large: structural change to how engine reports mana |
| 4 | Timer fields in initial GSM | Timer UI completely broken | Small: static values |
| 5 | `abilityGrpId` in SelectTargetsReq | Targeting UI may show wrong ability | Medium: needs Forge ability ID mapping |
| 6 | `pendingMessageCount` | Client flow control; may cause visual stalls | Medium: need to track message queue depth |
| 7 | `ColorProduction` persistent annotation | Missing on every land play | Small: emit during land play |
| 8 | Close the loop (lever #2) | Only infrastructure lever worth doing now; makes conformance a CI regression gate | Medium: port ~100 lines of Python to Kotlin |
Fix known gaps = ship better protocol fidelity now. Direct player-visible improvements. Items 1-7 above.
Build the pipeline = find unknown gaps systematically. Lever #2 (close the loop) is the only one worth doing short-term — it prevents regressions. The rest (#1, #3-#6) are scaling infrastructure for when you have dozens of mechanics to validate.
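As a rough illustration of what a regression gate could check, the sketch below compares the current set of gap field paths against a stored baseline and fails only when a new gap appears. The function name, the baseline format, and the sample paths are hypothetical; the real lever #2 work is described as a Kotlin port of the existing Python binding and diff.

```python
def regression_check(known_gaps, current_gaps):
    """Split the difference between a baseline and the current run.

    'new' gaps should fail the gate; 'fixed' gaps mean the baseline
    can be tightened.
    """
    known, current = set(known_gaps), set(current_gaps)
    return {"new": sorted(current - known), "fixed": sorted(known - current)}


# Hypothetical baseline and run; field paths borrowed from the tables above.
baseline = ["gameInfo.maxPipCount", "pendingMessageCount"]
this_run = ["pendingMessageCount", "timers[].elapsedMs"]

result = regression_check(baseline, this_run)
gate_passes = not result["new"]
print(result, gate_passes)
```

Treating known gaps as an allowlist lets the gate go green today while still blocking any change that introduces a gap not already on the list.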
Recommended order:
- Fix items 1-2 (ActionsAvailableReq fields) — biggest visual impact, moderate effort
- Fix item 4 (timer fields) — small effort, fixes broken UI
- Fix item 7 (ColorProduction) — small effort
- Build lever #2 (close the loop) — prevents regressions on everything above
- Tackle item 3 (mana lifecycle) — largest effort, largest conformance gain
- Then scale pipeline (levers #1, #4) as you add more mechanics