@jleechan2015
Created May 21, 2026 01:41
PR 6968 Evidence — level_up_finish_return_bug_red (iter_021, 2/2 PASS)

Evidence Summary: level_up_finish_return_bug_red

Test Results

  • Total Scenarios: 2

  • Scenario Validation Passed: 2

  • Scenario Validation Failed: 0

  • Scenario Validation Pass Rate: 100.0%

  • Raw LLM Layer Passed: 1/1 (100.0%)

  • Post-Processing Campaign Capture Passed: 1

  • Post-Processing Campaign Capture Failed: 0

  • Post-Processing Campaign Capture Pass Rate: 100.0%
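As a cross-check on the counts above, here is a minimal sketch of how the pass rate could be recomputed from scenario records. It assumes each record carries the passed and errors fields that the Claim → Artifact Map lists for run.json. The function names, and the rule that a non-empty errors list counts as a failure, are illustrative assumptions, not the harness's actual logic.

```python
def pass_rate(passed: int, failed: int) -> str:
    """Format a pass rate as a percentage with one decimal place."""
    total = passed + failed
    if total == 0:
        return "n/a"  # avoid dividing by zero when nothing ran
    return f"{100.0 * passed / total:.1f}%"


def summarize(scenarios: list[dict]) -> dict:
    """Summarize scenario records shaped like run.json's scenarios[] entries.

    Assumption: a scenario passes only if `passed` is truthy AND `errors`
    is empty or absent. The real harness may apply a different rule.
    """
    passed = sum(1 for s in scenarios if s.get("passed") and not s.get("errors"))
    failed = len(scenarios) - passed
    return {
        "total": len(scenarios),
        "passed": passed,
        "failed": failed,
        "pass_rate": pass_rate(passed, failed),
    }


if __name__ == "__main__":
    # Hypothetical records mirroring the 2/2 result reported above.
    records = [{"passed": True, "errors": []}, {"passed": True, "errors": []}]
    print(summarize(records))
```

Treating a non-empty errors list as a failure is deliberately strict, so a record flagged as passed but carrying errors would surface as a mismatch against the reported totals.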

Scenario Results

level_up_finish_return_bug_red

  • Status: ✅ PASS
  • Campaign ID: JLEQMvPrn5AactPqyVAe

EVIDENCE_SIGNATURE_GUARD

  • Status: ✅ PASS

Provenance Chain

  • Git HEAD: 918094769af813a8daf71c7fc5f9c4a097a1489b
  • Test Timestamp: 2026-05-20T21:52:49.176453+00:00
  • Server PID: 46163
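The Git HEAD above is a full 40-character hash, while other parts of this bundle record it in abbreviated form (for example 91809476...). A minimal sketch of how the two could be compared, assuming the abbreviated value is a plain hex prefix optionally followed by dots. The function name head_matches is hypothetical.

```python
import re


def head_matches(full_head: str, recorded: str) -> bool:
    """Return True if `recorded` is a hex prefix of `full_head`.

    `recorded` may be abbreviated and may end in '...' or a Unicode
    ellipsis, as abbreviated hashes often do in reports.
    """
    prefix = recorded.strip().rstrip(".\u2026").lower()
    # Require at least 7 hex characters so a trivially short prefix
    # cannot match by accident.
    if not re.fullmatch(r"[0-9a-f]{7,40}", prefix):
        return False
    return full_head.strip().lower().startswith(prefix)


if __name__ == "__main__":
    full = "918094769af813a8daf71c7fc5f9c4a097a1489b"
    print(head_matches(full, "91809476..."))
```

The full hash could also be compared against the output of `git rev-parse HEAD` in the checked-out repository, which this sketch leaves out to stay self-contained.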

Claim → Artifact Map

| Claim | File | Key Field(s) |
| --- | --- | --- |
| Scenario validation passed: 2/2 | run.json | scenarios[].passed, scenarios[].errors |
| Campaign post-processing capture passed: 1/1 | run.json | campaign_capture_status[*].status |
| Streaming evidence normalized | streaming_evidence.json | summary., scenarios[].chunk_count_observed |
| Bundle artifact inventory | artifacts/collection_log.txt | core_files, jsonl_captures, campaigns_dir |
| MCP request/response captured | request_responses.jsonl | Full request/response pairs |
| Local server HTTP request/response captured | http_request_responses.jsonl | http_request/http_response entries |
| LLM request/response stream captured | llm_request_responses.jsonl | request/response entries (type field) |
| Gemini HTTP transport captured | gemini_http_request_responses.jsonl | http_request/http_response/transport_error entries |
| Server execution log | artifacts/server.log | Raw server output |
| Git provenance | metadata.json | git_provenance.git_head = 91809476... |
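Several of the artifacts above are JSONL captures whose entries are distinguished by kind (http_request, http_response, transport_error). A minimal sketch of tallying entries by kind, assuming each line is a JSON object and that the kind lives in a field named type, as the map notes for llm_request_responses.jsonl. Whether the other captures use the same field name is an assumption, and tally_entry_types is a hypothetical helper.

```python
import json
from collections import Counter
from typing import Iterable


def tally_entry_types(lines: Iterable[str], field: str = "type") -> Counter:
    """Count JSONL entries by the value of `field`.

    `lines` can be any iterable of strings, including an open file.
    Blank lines are skipped; unparseable lines and entries without the
    field are counted under placeholder keys rather than dropped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            counts["<invalid json>"] += 1
            continue
        if isinstance(entry, dict) and field in entry:
            counts[str(entry[field])] += 1
        else:
            counts["<missing>"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; real captures live in the bundle's .jsonl files.
    sample = ['{"type": "http_request"}', '{"type": "http_response"}']
    print(tally_entry_types(sample))
```

To run it against a real capture, pass an open file handle, for example `tally_entry_types(open("http_request_responses.jsonl", encoding="utf-8"))`.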

Coverage Matrix

| Scenario | Status | Campaign ID |
| --- | --- | --- |
| level_up_finish_return_bug_red | ✅ Pass | JLEQMvPr... |
| EVIDENCE_SIGNATURE_GUARD | ✅ Pass | N/A |

Evidence Integrity

  • All files in this bundle have corresponding .sha256 checksum files

  • Checksums use local basename paths so per-file verification works from each artifact directory

  • ⚠️ Server warnings detected (see artifacts/server.log):
      • ACTION_RESOLUTION_MISSING_FIELDS
      • CRITICAL_SAFEGUARD
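Because the checksums use basename paths, each artifact directory can be verified on its own. A minimal sketch of that per-directory check, assuming each .sha256 file holds a single line in the common `sha256sum` layout (hex digest, whitespace, file name). The bundle's actual checksum file layout is not shown here, so treat the parsing as an assumption. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def verify_checksum(sha_file) -> bool:
    """Check one .sha256 file against the artifact sitting next to it.

    Assumed layout: '<hex digest>  <basename>' on a single line, with an
    optional '*' before the name (the sha256sum binary-mode marker).
    """
    sha_path = Path(sha_file)
    expected, _, name = sha_path.read_text(encoding="utf-8").strip().partition(" ")
    name = name.strip().lstrip("*")
    # The name is a basename, so resolve it relative to the checksum
    # file's own directory.
    target = sha_path.parent / name
    actual = hashlib.sha256(target.read_bytes()).hexdigest()
    return actual == expected.lower()


def verify_directory(directory) -> dict:
    """Verify every .sha256 file in one artifact directory."""
    return {
        p.name: verify_checksum(p)
        for p in sorted(Path(directory).glob("*.sha256"))
    }


if __name__ == "__main__":
    # Prints a {checksum file name: bool} mapping for the current directory.
    print(verify_directory("."))
```

Where the files do follow the `sha256sum` layout, running `sha256sum -c <file>.sha256` from inside the artifact directory performs the same check.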

What This Evidence Proves vs. Does NOT Prove

Proves:

  • The core logic exercised by level_up_finish_return_bug_red passes scenario validation
  • Both scenarios executed and passed (2/2)

Does NOT Prove:

  • Production server behavior (tested on local server unless otherwise noted)
  • Performance under load (single-request tests)
  • Edge cases not covered by scenarios