@delebedev
Created March 22, 2026 01:48
Leyline Conformance Audit — 2026-03-22

TL;DR

The 6 "levers" in docs/conformance/levers.md are detection infrastructure — they help find gaps faster but don't close them. Building a better microscope doesn't fix the specimen. The actual conformance improvements come from fixing known protocol gaps.


What exists today

Docs (docs/conformance/)

| File | Content |
| --- | --- |
| pipeline.md | Master spec: 6-step process (mine segment → templatize → puzzle → engine → bind IDs → hydrate+diff). 3 comparison layers. |
| levers.md | 6 scaling improvements, prioritized: sequence comparison (#1), closed-loop tests (#2), seat 2 (#3), structural binding (#4), lossless puzzle gen (#5), regression signal (#6) |
| debugging.md | Proto debugging cookbook: annotation ordering, category codes, instanceId lifecycle, gsId chain, action consistency |
| workflow.md | Diagnose-fix-verify pipeline with 3 phases and 3 gates |
| diagnosis-schema.md | Structured bug diagnosis format; every claim must cite evidence |
| wire-spec-schema.md | Phase 0 artifact schema for wire specs from recordings |
| sealed-wire-conformance.md | Real analysis report from Sealed proxy session: 6 FD endpoints with field-level gaps |

Tests (matchdoor/src/test/kotlin/leyline/conformance/)

35+ test classes, covering:

  • Wire protocol: DealHandConformanceTest, ActionFieldConformanceTest, ShouldStopConformanceTest, GsIdChainTest, ValidatingMessageSinkTest
  • Zone transitions: ZoneTransitionConformanceTest, InstanceIdReallocTest
  • Annotations: AnnotationOrderingTest, AttachmentAnnotationTest, RevealAnnotationTest, StructuralFingerprintTest
  • AI turn: AiTurnConformanceTest, AiCombatAutoPassTest, AiFirstTurnShapeTest, AiLandPlayOrderTest
  • Integration: CombatFlowTest, TargetingFlowTest, PvpBridgeEndToEndTest
  • Mechanics: KickerTest, ScryETBFlowTest, ModalETBFlowTest, TreasureTokenTest, SbaDeathTest, CounteredSpellTest, etc.
  • Pipeline: ConformancePipelineTest — engine run → JSON dump → Python binding + diff

Open GitHub Issues

| # | Title | Labels |
| --- | --- | --- |
| 182 | Conformance: sequence comparison (lever #1) | conformance, enhancement |
| 183 | Conformance: ObjectTracker for recording-matched puzzles (lever #5) | conformance, tooling |
| 142 | Conformance: document and track protocol field gaps | protocol |
| 119 | Treasure token: sacrifice annotation conformance gaps | |
| 117 | Review and evolve conformance testing strategy | |
| 173 | Cast adventure: 5 code gaps blocking the mechanic | protocol, feat |
| 141 | Combat damage VFX shows generic projectile instead of melee hit | |

Four conformance mechanisms (not well integrated)

| Mechanism | Files | Mechanics covered |
| --- | --- | --- |
| Golden .bin field coverage | 12 .bin files | ConnectResp, DieRoll, MulliganReq, InitialGSM, SelectTargetsReq, DeclareAttackers/Blockers, ActionsAvailableReq, IntermissionReq, SetSettingsResp |
| JSON templates | 5 files | DeclareAttackers, DeclareBlockers, PlayLand, Kicker |
| Inline Kotest assertions | 4 test classes | PlayLand fields, TwoPhase targeting, Kicker CTO shape, Annotation ordering |
| Pipeline (Python) | tape conform | DeclareAttackers, DeclareBlockers |

Fully covered message types (zero field gaps)

  • DeclareAttackersReq
  • DeclareBlockersReq
  • DieRollResultsResp
  • MulliganReq
  • GroupReq (London mulligan tuck)
  • SetSettingsResp
  • IntermissionReq
  • ShouldStop evaluation: 100% match
  • Annotation shapes: 39 types, all matching

Two different problems

Problem A: Known protocol gaps (fixes conformance directly)

These are the actual field gaps the client sees today.

ActionsAvailableReq — 8 missing fields

| Field path | Impact |
| --- | --- |
| actions[].manaCost[].color[] | Client can't show mana pip overlays on castable cards |
| actions[].manaCost[].count | Same |
| inactiveActions[].actionType | Client can't grey out unplayable cards |
| inactiveActions[].grpId | Same |
| inactiveActions[].instanceId | Same |
| inactiveActions[].facetId | Same |
| inactiveActions[].manaCost[].color[] | Same |
| inactiveActions[].manaCost[].count | Same |

Two buckets: the manaCost display field is missing (the client has the richer manaPaymentOptions but not the simpler overlay field), and inactiveActions is entirely absent (unplayable cards are invisible to the client).
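
The field-path notation used in these tables (`[]` for a repeated level, `.` for nesting) can be derived mechanically from two decoded messages. The following is a minimal sketch, assuming both the recording and the engine output have already been decoded into plain dicts and lists; `field_paths`, `missing_fields`, and the sample values (`"Cast"`, `"Blue"`, `1001`) are hypothetical and not taken from the project.

```python
def field_paths(value, prefix=""):
    """Collect schema-style field paths, writing each list level as `[]`."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths |= field_paths(child, path)
    elif isinstance(value, list):
        for child in value:
            paths |= field_paths(child, prefix + "[]")
    else:
        # Scalar leaf: the accumulated prefix is the full field path.
        paths.add(prefix)
    return paths


def missing_fields(recorded, engine):
    """Field paths present in the recording but absent from the engine output."""
    return sorted(field_paths(recorded) - field_paths(engine))


# Made-up sample messages, shaped after the gap table above.
recorded = {
    "actions": [{"actionType": "Cast", "manaCost": [{"color": ["Blue"], "count": 1}]}],
    "inactiveActions": [{"actionType": "Cast", "grpId": 1001}],
}
engine = {"actions": [{"actionType": "Cast"}]}

for path in missing_fields(recorded, engine):
    print(path)
```

Note that an empty list or empty message contributes no paths in this sketch, so a field that is present but empty would still be reported as missing.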

SelectTargetsReq — 6 missing fields

| Field path | Impact |
| --- | --- |
| targets[].prompt.promptId | Unknown client dependency |
| targets[].prompt.parameters[].parameterName | Unknown |
| targets[].prompt.parameters[].type | Unknown |
| targets[].prompt.parameters[].numberValue | Unknown |
| targets[].targetingAbilityGrpId | May affect targeting UI |
| abilityGrpId | May affect targeting UI |

Initial Full GameStateMessage — 7 missing fields

| Field path | Impact |
| --- | --- |
| diffDeletedInstanceIds[] | Incremental update optimization |
| gameInfo.maxPipCount | Timer UI broken |
| gameInfo.maxTimeoutCount | Timer UI broken |
| gameInfo.timeoutDurationSec | Timer UI broken |
| pendingMessageCount | Flow control; may cause visual stalls |
| timers[].elapsedMs | Timer UI |
| zones[].objectInstanceIds[] | Zone membership shortcut |

ConnectResp — 5 missing fields

| Field path | Impact |
| --- | --- |
| deckMessage.deckCards[] | Likely redundant |
| greChangelist | Cosmetic |
| grpChangelist | May affect client caching |
| skins[].catalogId | Card art cosmetics |
| skins[].skinCode | Card art cosmetics |

Mana ability lifecycle — THE biggest structural gap

The recording treats mana activation as a full lifecycle:

  • Separate instanceId for the mana ability
  • AbilityInstanceCreated / AbilityInstanceDeleted
  • UserActionTaken with actionType=4

The engine collapses it into the spell cast: no separate ability instance and no mana-specific UserActionTaken.

This accounts for 4 of the 6 CastSpell annotation diffs and affects every spell cast that taps lands.

| # | Gap | Recording | Engine |
| --- | --- | --- | --- |
| 1 | Mana ability instance ID | Separate ID | Same as spell |
| 2 | AbilityInstanceCreated affectorId | Points to mana source (Island) | Missing (0) |
| 3 | Annotation order | Tap→UserAction(mana)→ManaPaid→Delete→UserAction(cast) | ManaPaid→Delete→UserAction(cast)→Tap |
| 4 | Missing UserActionTaken | Two: actionType=4 (mana) + actionType=1 (cast) | One: actionType=1 only |
| 5 | ManaPaid details | id=3, color=2 (blue) | id=0, color="" (empty) |
| 6 | TappedUntappedPermanent affectorId | Mana ability instance | The Island itself |
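
Gaps 3 and 4 can be separated mechanically: one is a missing annotation, the other a reordering of annotations both sides emit. The sketch below uses the two orderings from row 3 verbatim as short labels; `order_diff` is a hypothetical helper, and it assumes each label appears at most once per sequence.

```python
def order_diff(recorded, engine):
    """Report labels absent from the engine and whether shared labels are reordered."""
    missing = [label for label in recorded if label not in engine]
    # Restrict both sequences to the labels they share, then compare order.
    shared_recorded = [label for label in recorded if label in engine]
    shared_engine = [label for label in engine if label in recorded]
    return {"missing": missing, "reordered": shared_recorded != shared_engine}


# Orderings copied from row 3 of the table above.
recorded = ["Tap", "UserAction(mana)", "ManaPaid", "Delete", "UserAction(cast)"]
engine = ["ManaPaid", "Delete", "UserAction(cast)", "Tap"]

result = order_diff(recorded, engine)
print(result["missing"])    # labels the engine never emits
print(result["reordered"])  # True when the shared labels arrive in a different order
```

Here the engine is missing `UserAction(mana)` and also emits `Tap` last instead of first, so both kinds of diff show up for a single spell cast.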

Other known gaps

  • ColorProduction persistent annotation — never emitted on land play
  • Treasure sacrifice annotations (#119) — token lifecycle gaps
  • Adventure mechanic (#173) — 5 code gaps blocking cast-adventure

Problem B: Pipeline levers (detection infrastructure)

The 6 levers from levers.md make it easier to detect new gaps. They're force multipliers, not fixes.

| Lever | What it does |
| --- | --- |
| #1 Sequence comparison | Compare message sequences, not single frames |
| #2 Close the loop | Port binding+diff to Kotlin, CI regression gate |
| #3 Seat 2 conformance | Test AI-turn message paths |
| #4 Structural binding | Match annotations by composite key, not position |
| #5 Lossless puzzles | ObjectTracker for any recording segment |
| #6 Regression signal | Golden diffs, conformance index, CI gate |
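
To illustrate lever #4, here is a rough sketch of binding annotations by composite key rather than by list position. The key used here (annotation type, affector role, affected roles) is an assumption for illustration; levers.md may define the key differently. The role names (`"manaAbility"`, `"Island"`, `"spell"`) and the `composite_key` and `bind` helpers are hypothetical.

```python
from collections import defaultdict


def composite_key(annotation):
    """Assumed key: type plus the roles of the affector and affected objects."""
    return (
        annotation["type"],
        annotation.get("affector"),
        tuple(annotation.get("affected", ())),
    )


def bind(recorded, engine):
    """Pair annotations that share a composite key, regardless of position."""
    pool = defaultdict(list)
    for ann in engine:
        pool[composite_key(ann)].append(ann)

    pairs, unmatched_recorded = [], []
    for ann in recorded:
        candidates = pool[composite_key(ann)]
        if candidates:
            pairs.append((ann, candidates.pop(0)))
        else:
            unmatched_recorded.append(ann)

    # Whatever is left in the pool was emitted by the engine only.
    unmatched_engine = [ann for anns in pool.values() for ann in anns]
    return pairs, unmatched_recorded, unmatched_engine


recorded = [
    {"type": "TappedUntappedPermanent", "affector": "manaAbility", "affected": ["Island"]},
    {"type": "ManaPaid", "affector": "Island", "affected": ["spell"]},
]
engine = [
    {"type": "ManaPaid", "affector": "Island", "affected": ["spell"]},
    {"type": "TappedUntappedPermanent", "affector": "Island", "affected": ["Island"]},
]

pairs, only_recorded, only_engine = bind(recorded, engine)
print(len(pairs), only_recorded, only_engine)
```

A positional comparison would flag both annotations as different because the engine emits them in the opposite order. Keyed binding pairs the ManaPaid annotations and isolates the one real difference, the TappedUntappedPermanent affector.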

What would drastically improve conformance

Ranked by player-visible impact:

| Priority | What | Why | Scope |
| --- | --- | --- | --- |
| 1 | inactiveActions in ActionsAvailableReq | Every turn, client can't show unplayable cards greyed out | Medium: need to compute what's uncastable |
| 2 | manaCost on actions | Mana pip overlays missing on all castable cards | Small: data available, just not wired |
| 3 | Mana ability lifecycle | Every spell cast is structurally wrong (4 annotation diffs per cast) | Large: structural change to how engine reports mana |
| 4 | Timer fields in initial GSM | Timer UI completely broken | Small: static values |
| 5 | abilityGrpId in SelectTargetsReq | Targeting UI may show wrong ability | Medium: needs Forge ability ID mapping |
| 6 | pendingMessageCount | Client flow control; may cause visual stalls | Medium: need to track message queue depth |
| 7 | ColorProduction persistent annotation | Missing on every land play | Small: emit during land play |
| 8 | Close the loop (lever #2) | Only infrastructure lever worth doing now; makes conformance a CI regression gate | Medium: port ~100 lines of Python to Kotlin |
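
For priority 2, the missing fields are `manaCost[].color[]` and `manaCost[].count`, which suggests a list of entries that each carry a colour list and a count. The sketch below shows one way such entries could be grouped from a flat list of pips. It is only an illustration of the shape: the pip labels are placeholders (the real protocol presumably uses enum values), `mana_cost_entries` is a hypothetical helper, and how hybrid or multi-colour costs populate `color[]` is not covered by this audit.

```python
from collections import Counter


def mana_cost_entries(pips):
    """Group a flat list of pips into {color, count} entries in first-seen order."""
    counts = Counter(pips)  # Counter keeps insertion order on Python 3.7+
    return [{"color": [color], "count": n} for color, n in counts.items()]


# Placeholder pip labels for a cost of two blue and one generic mana.
entries = mana_cost_entries(["Blue", "Blue", "Generic"])
print(entries)
```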

The endgame question

Fix known gaps = ship better protocol fidelity now. Direct player-visible improvements. Items 1-7 above.

Build the pipeline = find unknown gaps systematically. Lever #2 (close the loop) is the only one worth doing short-term — it prevents regressions. The rest (#1, #3-#6) are scaling infrastructure for when you have dozens of mechanics to validate.
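
The regression-prevention idea behind lever #2 reduces to a set comparison between a stored golden list of known gaps and the gaps found in the current run. The sketch below is in Python only to show the logic; the lever itself calls for a Kotlin port. `regression_gate` is a hypothetical name, and the sample sets reuse field paths from the gap tables above.

```python
def regression_gate(golden_gaps, current_gaps):
    """Return (new, fixed): gaps that appeared and gaps that were closed."""
    golden, current = set(golden_gaps), set(current_gaps)
    return sorted(current - golden), sorted(golden - current)


# Sample data: one gap was fixed since the golden snapshot, one new gap appeared.
golden = {"gameInfo.maxPipCount", "pendingMessageCount", "abilityGrpId"}
current = {"pendingMessageCount", "abilityGrpId", "zones[].objectInstanceIds[]"}

new, fixed = regression_gate(golden, current)
print("new gaps:", new)
print("fixed gaps:", fixed)
# A CI job would fail when `new` is non-empty and refresh the golden file when only `fixed` is.
```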

Recommended order:

  1. Fix items 1-2 (ActionsAvailableReq fields) — biggest visual impact, moderate effort
  2. Fix item 4 (timer fields) — small effort, fixes broken UI
  3. Fix item 7 (ColorProduction) — small effort
  4. Build lever #2 (close the loop) — prevents regressions on everything above
  5. Tackle item 3 (mana lifecycle) — largest effort, largest conformance gain
  6. Then scale pipeline (levers #1, #4) as you add more mechanics