@jleechan2015
Created May 23, 2026 00:59

Evidence Summary: Timeline Splicing & Content Deletion (RED Phase)

Test Execution Details

  • Test ID: timeline_integrity-001-20260522T173500
  • Total Scenarios: 1
  • Passed Scenarios: 0
  • Failed Scenarios: 1 (RED Reproduction)

Scenario Breakdown

  1. Compound Player Input Handling: FAIL
    • Expected: Narration of the treaty signing at the Annex, with the docks tracking handled in the background through delegation or a messenger report.
    • Observed: Timeline/location splicing and content dropout. The LLM narrated the paperwork signing at the Annex but spliced a flashback to the harbor prisoner assessment into the same turn, and it dropped the instruction to tail Grog-Mar entirely.
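
The two failure modes above (a dropped instruction and a spliced-in scene) could be flagged mechanically with a crude keyword heuristic. The sketch below is only an illustration under stated assumptions, not the harness that produced this evidence: `TurnCheck`, `check_turn`, the keyword lists, and the sample narration are all hypothetical, and substring matching is a rough proxy for whether an instruction was actually honored.

```python
from dataclasses import dataclass, field


@dataclass
class TurnCheck:
    """Result of a keyword-based check on one narrated turn (hypothetical helper)."""
    missing: list = field(default_factory=list)     # required terms never mentioned
    unexpected: list = field(default_factory=list)  # terms that should not appear

    @property
    def passed(self) -> bool:
        return not self.missing and not self.unexpected


def check_turn(narration: str, required: list, forbidden: list) -> TurnCheck:
    """Flag dropped instructions (missing terms) and spliced scenes (forbidden terms)."""
    text = narration.lower()
    return TurnCheck(
        missing=[t for t in required if t.lower() not in text],
        unexpected=[t for t in forbidden if t.lower() in text],
    )


# Illustrative narration only; it is not taken from artifacts/repro.log.
sample = "You sign the treaty at the Annex. Earlier, at the harbor, you assessed the prisoners."
result = check_turn(sample, required=["Grog-Mar"], forbidden=["harbor"])
print(result.passed, result.missing, result.unexpected)
```

A check like this would only catch the coarse symptoms; a narration that mentions Grog-Mar without acting on the instruction would still pass.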

Claim → Artifact Map

| Claim | File | Key Field(s) |
| --- | --- | --- |
| Content collapse/deletion & splicing | artifacts/repro.log | Console output showing ignored Grog-Mar instruction and harbor flashback |
| Organic LLM request/response capture | artifacts/llm_request_responses.jsonl | Full raw prompt and response |
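
A capture file in JSONL form can be scanned one line at a time, since each line is an independent JSON object. The sketch below assumes each record stores the model's text under a `response` key; that key name is a guess, because the schema of `artifacts/llm_request_responses.jsonl` is not documented here, and `responses_missing` is a hypothetical helper.

```python
import json
from typing import Iterable, List


def responses_missing(lines: Iterable[str], term: str, key: str = "response") -> List[int]:
    """Return 1-based line numbers of records whose response text omits `term`.

    `key` is an assumed field name; adjust it to the real capture schema.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if term.lower() not in str(record.get(key, "")).lower():
            hits.append(lineno)
    return hits


# In-memory stand-in for the capture file; these records are invented for illustration.
demo = [
    json.dumps({"request": "Sign at the Annex; have Grog-Mar tailed.", "response": "You sign the treaty."}),
    json.dumps({"request": "Any word?", "response": "A messenger reports on Grog-Mar."}),
]
print(responses_missing(demo, "Grog-Mar"))
```

For the real artifact, an open file handle can be passed in place of `demo`, since iterating a text file yields its lines.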

What This Evidence Proves vs. Does NOT Prove

Proves:

  • gemini-3-flash-preview fails to correctly handle a compound input that spans geographically disparate locations when it is given no strict system prompt guidance.
  • The LLM will silently collapse or delete user instructions and splice disparate timelines and locations together.

Does NOT Prove:

  • Behavior in combat or other character routes.