@jleechan2015
Created May 23, 2026 01:32
PR 7020 E2E comparative evidence

Evidence Package: test_campaign_upgrade_modal_lock_red_green

Package Manifest

  • Test Name: test_campaign_upgrade_modal_lock_red_green
  • Run ID: test_campaign_upgrade_modal_lock_red_green-014-20260523T013145
  • Iteration: 14
  • Bundle Version: 1.2.0
  • Collected At (UTC): 2026-05-23T01:31:45.859382+00:00
  • Repository: worldarchitect.ai
  • Branch: fix-campaign-upgrade-hang
  • Commit: 8f8eb9719a8c42b9883cb2898c52b01b87cbb779
  • Merge Base: c15f7895bad49e7c03042ad943ea0ec3ba743512
  • Commits Ahead of Main: 13

Git Provenance

.beads/issues.jsonl                       |   3 +
 mvp_site/agents.py                        | 897 +++++++++++++++++++++++++++---
 mvp_site/llm_parser.py                    |   3 +-
 mvp_site/tests/test_agents.py             | 154 +++++
 mvp_site/tests/test_dice_provably_fair.py |  13 +-
 mvp_site/tests/test_intent_classifier.py  |   8 +-
 mvp_site/tests/test_world_logic.py        | 167 +++++-
 mvp_site/world_logic.py                   | 462 ++-------------
 8 files changed, 1196 insertions(+), 511 deletions(-)

Server Runtime

  • Port: 51204
  • PID: 25375
  • Command: /opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python -m gunicorn mvp_site.main:app --bind 0.0.0.0:51204 --workers 1 --worker-class gthread --threads 4 --timeout 600 --max-requests 50 --access-logfile - --error-logfile - --log-level info

Environment Variables

  • WORLDAI_DEV_MODE: true
  • TESTING: None
  • MOCK_SERVICES_MODE: false
  • GOOGLE_APPLICATION_CREDENTIALS: [SET - file:serviceAccountKey.json]
  • WORLDAI_GOOGLE_APPLICATION_CREDENTIALS: [SET - file:serviceAccountKey.json]
  • FIRESTORE_EMULATOR_HOST: None
  • PORT: 51204
  • FIREBASE_PROJECT_ID: worldarchitecture-ai
  • GEMINI_API_KEY: [SET - 39 chars]
  • LLM_REQUEST_RESPONSE_CAPTURE_PATH: /tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/llm_request_responses_1779499653080.jsonl
  • HTTP_REQUEST_RESPONSE_CAPTURE_PATH: /tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/http_request_responses_1779499653080.jsonl
  • GEMINI_HTTP_REQUEST_RESPONSE_CAPTURE_PATH: /tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/gemini_http_request_responses_1779499653080.jsonl
  • MCP_TEST_PROVIDER_HTTP_CAPTURE_PATH: /tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/provider_http_request_responses_1779499653080.jsonl

Files in This Bundle

  • README.md - This manifest
  • methodology.md - Testing methodology
  • evidence.md - Evidence summary with Claim→Artifact Map and Coverage Matrix
  • notes.md - Additional context, TODOs, follow-ups
  • metadata.json - Machine-readable metadata
  • assertions.json - Strict before/after parity assertions (if present)
  • run.json - Test results
  • streaming_evidence.json - Normalized streaming evidence summary
  • request_responses.jsonl - Raw MCP request/response payloads (if present)
  • llm_request_responses.jsonl - Raw LLM request/response payloads (if present)
  • http_request_responses.jsonl - Raw local-server HTTP request/response payloads (if present)
  • gemini_http_request_responses.jsonl - Raw Gemini transport HTTP traces (if present)
  • artifacts/ - Additional evidence files

Cross-Process Persistence Evidence: test_campaign_upgrade_modal_lock_red_green

Summary

This document provides explicit evidence of cross-process persistence: data written to one server process can be read back by a completely different server process.

Server Process Changes

| Phase | PID | Base URL | Timestamp |
|---|---|---|---|
| Before Restart | 8777 | http://127.0.0.1:8083 | 2026-05-23T01:27:48.873679+00:00 |
| After Restart | 25375 | http://127.0.0.1:51204 | 2026-05-23T01:29:04.611965+00:00 |

PID Change Proof

  • Old Server PID: 8777
  • New Server PID: 25375
  • PIDs Different: True

This proves the server was completely restarted with a new process, not just hot-reloaded.

Timeline of Events

  • 2026-05-23T01:27:48.778323+00:00 - SERVER START (PID 8777, URL: http://127.0.0.1:8083)
  • 2026-05-23T01:27:48.873686+00:00 - SERVER STOP (PID 8777, reason: Enable RED state (disable fix))
  • 2026-05-23T01:27:58.013642+00:00 - SERVER START (PID 11484, URL: http://127.0.0.1:50774)
  • 2026-05-23T01:28:55.062396+00:00 - SERVER STOP (PID 11484, reason: Enable GREEN state (enable fix))
  • 2026-05-23T01:29:04.611993+00:00 - SERVER START (PID 25375, URL: http://127.0.0.1:51204)

Evidence Chain

  1. Data Written - Settings saved via MCP update_user_settings tool to Server PID 8777
  2. Server Stopped - Process 8777 terminated cleanly
  3. Server Restarted - New process 25375 started on a different port, after an intermediate RED-state process (PID 11484) was started and stopped
  4. Data Read - Settings retrieved via MCP get_user_settings from Server PID 25375
  5. Data Verified - Values match original write (cross-process persistence confirmed)

Raw Server History

[
  {
    "event": "start",
    "pid": 8777,
    "base_url": "http://127.0.0.1:8083",
    "timestamp": "2026-05-23T01:27:48.778311+00:00"
  },
  {
    "event": "stop",
    "reason": "Enable RED state (disable fix)",
    "pid": 8777,
    "base_url": "http://127.0.0.1:8083",
    "timestamp": "2026-05-23T01:27:48.873679+00:00"
  },
  {
    "event": "start",
    "pid": 11484,
    "base_url": "http://127.0.0.1:50774",
    "timestamp": "2026-05-23T01:27:58.013611+00:00"
  },
  {
    "event": "stop",
    "reason": "Enable GREEN state (enable fix)",
    "pid": 11484,
    "base_url": "http://127.0.0.1:50774",
    "timestamp": "2026-05-23T01:28:55.062392+00:00"
  },
  {
    "event": "start",
    "pid": 25375,
    "base_url": "http://127.0.0.1:51204",
    "timestamp": "2026-05-23T01:29:04.611965+00:00"
  }
]
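
The restart claim can be re-checked mechanically from the server history. The following is a minimal sketch, assuming the history is available as a JSON array with the `event`, `pid`, and `timestamp` fields shown above; the helper name `restart_gaps` is hypothetical and is not part of the test harness.

```python
import json
from datetime import datetime

# Trimmed copy of the server history above (base_url and reason omitted).
SERVER_HISTORY = json.loads("""
[
  {"event": "start", "pid": 8777,  "timestamp": "2026-05-23T01:27:48.778311+00:00"},
  {"event": "stop",  "pid": 8777,  "timestamp": "2026-05-23T01:27:48.873679+00:00"},
  {"event": "start", "pid": 11484, "timestamp": "2026-05-23T01:27:58.013611+00:00"},
  {"event": "stop",  "pid": 11484, "timestamp": "2026-05-23T01:28:55.062392+00:00"},
  {"event": "start", "pid": 25375, "timestamp": "2026-05-23T01:29:04.611965+00:00"}
]
""")


def restart_gaps(history):
    """Return (old_pid, new_pid, downtime_seconds) for each stop -> start pair."""
    gaps = []
    for prev, nxt in zip(history, history[1:]):
        if prev["event"] == "stop" and nxt["event"] == "start":
            stopped = datetime.fromisoformat(prev["timestamp"])
            started = datetime.fromisoformat(nxt["timestamp"])
            gaps.append((prev["pid"], nxt["pid"], (started - stopped).total_seconds()))
    return gaps


gaps = restart_gaps(SERVER_HISTORY)
for old_pid, new_pid, downtime in gaps:
    # A real restart must hand over to a different process ID.
    assert old_pid != new_pid
    print(f"PID {old_pid} -> PID {new_pid}: {downtime:.3f}s between stop and start")
```

A differing PID across each stop/start pair only shows that a new process was launched; it says nothing about whether the data read afterwards matches what was written, which is covered by the request/response captures.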

Raw Cross-Process Events

[
  {
    "type": "server_start",
    "pid": 8777,
    "base_url": "http://127.0.0.1:8083",
    "timestamp": "2026-05-23T01:27:48.778323+00:00"
  },
  {
    "type": "server_stop",
    "pid": 8777,
    "reason": "Enable RED state (disable fix)",
    "timestamp": "2026-05-23T01:27:48.873686+00:00"
  },
  {
    "type": "server_start",
    "pid": 11484,
    "base_url": "http://127.0.0.1:50774",
    "timestamp": "2026-05-23T01:27:58.013642+00:00"
  },
  {
    "type": "server_stop",
    "pid": 11484,
    "reason": "Enable GREEN state (enable fix)",
    "timestamp": "2026-05-23T01:28:55.062396+00:00"
  },
  {
    "type": "server_start",
    "pid": 25375,
    "base_url": "http://127.0.0.1:51204",
    "timestamp": "2026-05-23T01:29:04.611993+00:00"
  }
]

Evidence Summary: test_campaign_upgrade_modal_lock_red_green

Test Results

  • Total Scenarios: 3
  • Scenario Validation Passed: 3
  • Scenario Validation Failed: 0
  • Scenario Validation Pass Rate: 100.0%
  • Raw LLM Layer Passed: 2/2 (100.0%)
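
The pass-rate figures above are plain ratios over the scenario list. As a rough illustration, assuming each scenario is a dict with a boolean `passed` field (as in run.json), they could be recomputed as follows; `pass_rate` is a hypothetical helper, not the harness's own function.

```python
def pass_rate(scenarios):
    """Return (total, passed, failed, percentage string) for a scenario list."""
    total = len(scenarios)
    passed = sum(1 for s in scenarios if s.get("passed"))
    pct = 100.0 * passed / total if total else 0.0
    return total, passed, total - passed, f"{pct:.1f}%"


# Scenario names and outcomes taken from this bundle.
scenarios = [
    {"name": "red_infinite_loop", "passed": True},
    {"name": "green_stale_flag_routing", "passed": True},
    {"name": "EVIDENCE_SIGNATURE_GUARD", "passed": True},
]
result = pass_rate(scenarios)
print(result)
```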

Scenario Results

red_infinite_loop

  • Status: ✅ PASS

green_stale_flag_routing

  • Status: ✅ PASS

EVIDENCE_SIGNATURE_GUARD

  • Status: ✅ PASS

Provenance Chain

  • Git HEAD: 8f8eb9719a8c42b9883cb2898c52b01b87cbb779
  • Test Timestamp: 2026-05-23T01:31:45.859382+00:00
  • Server PID: 25375

Claim → Artifact Map

| Claim | File | Key Field(s) |
|---|---|---|
| Scenario validation passed: 3/3 | run.json | scenarios[].passed, scenarios[].errors |
| Streaming evidence normalized | streaming_evidence.json | summary., scenarios[].chunk_count_observed |
| Bundle artifact inventory | artifacts/collection_log.txt | core_files, jsonl_captures, campaigns_dir |
| MCP request/response captured | request_responses.jsonl | Full request/response pairs |
| Local server HTTP request/response captured | http_request_responses.jsonl | http_request/http_response entries |
| LLM request/response stream captured | llm_request_responses.jsonl | request/response entries (type field) |
| Gemini HTTP transport captured | gemini_http_request_responses.jsonl | http_request/http_response/transport_error entries |
| Server execution log | artifacts/server.log | Raw server output |
| Cross-process persistence | artifacts/pre_restart_*.txt | ps/lsof/server.log before restart |
| Git provenance | metadata.json | git_provenance.git_head = 8f8eb971... |

Coverage Matrix

| Scenario | Status | Campaign ID |
|---|---|---|
| red_infinite_loop | ✅ Pass | N/A |
| green_stale_flag_routing | ✅ Pass | N/A |
| EVIDENCE_SIGNATURE_GUARD | ✅ Pass | N/A |

Evidence Integrity

  • All files in this bundle have corresponding .sha256 checksum files

  • Checksums use local basename paths so per-file verification works from each artifact directory

  • ⚠️ Server warnings detected (see artifacts/server.log)

  • Warning: ACTION_RESOLUTION_MISSING_FIELDS
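
Per-file verification against the checksum files could look like the sketch below. It assumes each sidecar is named `<artifact>.sha256` and holds a single `sha256sum`-style line (`<hex digest>  <basename>`); the bundle does not show the sidecar format, so that layout and the helper name `verify_sidecar` are assumptions.

```python
import hashlib
from pathlib import Path


def verify_sidecar(artifact: Path) -> bool:
    """Check an artifact against its assumed `<name>.sha256` sidecar file."""
    sidecar = artifact.with_name(artifact.name + ".sha256")
    # Assumed line format: "<hex digest>  <basename>" (sha256sum style).
    expected, _, name = sidecar.read_text().strip().partition(" ")
    name = name.strip().lstrip("*")  # tolerate the binary-mode "*" marker
    actual = hashlib.sha256(artifact.read_bytes()).hexdigest()
    # Basename comparison mirrors the "local basename paths" note above.
    return name == artifact.name and actual == expected.lower()
```

Because the sidecar is assumed to record only the basename, the check works from inside any artifact directory, for example `verify_sidecar(Path("artifacts/server.log"))`.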

What This Evidence Proves vs. Does NOT Prove

Proves:

  • Core logic and scenario validation for test_campaign_upgrade_modal_lock_red_green
  • Scenario execution pass rates (3/3)

Does NOT Prove:

  • Production server behavior (tested on local server unless otherwise noted)
  • Performance under load (single-request tests)
  • Edge cases not covered by scenarios
{
"test_name": "test_campaign_upgrade_modal_lock_red_green",
"run_id": "test_campaign_upgrade_modal_lock_red_green-014-20260523T013145",
"iteration": 14,
"bundle_version": "1.2.0",
"timestamp": "2026-05-23T01:31:45.859382+00:00",
"bundle_timestamp": "2026-05-23T01:31:45.859382+00:00",
"evidence_mode": "lightweight_prompt_tracking",
"evidence_mode_notes": "System instruction captured as filenames + char_count (not full text). Raw LLM request/response payloads captured in request_responses.jsonl. Server logs in artifacts/. Bundle file inventory in artifacts/collection_log.txt.",
"git_provenance": {
"git_head": "8f8eb9719a8c42b9883cb2898c52b01b87cbb779",
"git_branch": "fix-campaign-upgrade-hang",
"merge_base": "c15f7895bad49e7c03042ad943ea0ec3ba743512",
"commits_ahead_of_main": 13,
"diff_stat_vs_main": ".beads/issues.jsonl | 3 +\n mvp_site/agents.py | 897 +++++++++++++++++++++++++++---\n mvp_site/llm_parser.py | 3 +-\n mvp_site/tests/test_agents.py | 154 +++++\n mvp_site/tests/test_dice_provably_fair.py | 13 +-\n mvp_site/tests/test_intent_classifier.py | 8 +-\n mvp_site/tests/test_world_logic.py | 167 +++++-\n mvp_site/world_logic.py | 462 ++-------------\n 8 files changed, 1196 insertions(+), 511 deletions(-)",
"working_tree_dirty": false,
"working_tree_staged_changes": 0,
"working_tree_unstaged_changes": 0,
"working_tree_changed_files": [
"scratch/fetch_raw_state.py",
"testing_mcp/test_campaign_upgrade_modal_lock_red_green.py"
],
"working_tree_diff_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
},
"server": {
"base_url": "http://127.0.0.1:51204",
"hostname": "127.0.0.1",
"mode": "local",
"port": "51204",
"pid": 25375,
"process_cmdline": "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python -m gunicorn mvp_site.main:app --bind 0.0.0.0:51204 --workers 1 --worker-class gthread --threads 4 --timeout 600 --max-requests 50 --access-logfile - --error-logfile - --log-level info",
"env_vars": {
"WORLDAI_DEV_MODE": "true",
"TESTING": null,
"MOCK_SERVICES_MODE": "false",
"GOOGLE_APPLICATION_CREDENTIALS": "[SET - file:serviceAccountKey.json]",
"WORLDAI_GOOGLE_APPLICATION_CREDENTIALS": "[SET - file:serviceAccountKey.json]",
"FIRESTORE_EMULATOR_HOST": null,
"PORT": "51204",
"FIREBASE_PROJECT_ID": "worldarchitecture-ai",
"GEMINI_API_KEY": "[SET - 39 chars]",
"LLM_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/llm_request_responses_1779499653080.jsonl",
"HTTP_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/http_request_responses_1779499653080.jsonl",
"GEMINI_HTTP_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/gemini_http_request_responses_1779499653080.jsonl",
"MCP_TEST_PROVIDER_HTTP_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/provider_http_request_responses_1779499653080.jsonl"
},
"lsof_output": "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME\nPython 25375 jleechan 5u IPv4 0x1ae633624a03d9d6 0t0 TCP *:51204 (LISTEN)\nPython 25404 jleechan 5u IPv4 0x1ae633624a03d9d6 0t0 TCP *:51204 (LISTEN)",
"ps_output": "PID USER ELAPSED ARGS\n25375 jleechan 02:43 /opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python -m gunicorn mvp_site.main:app --bind 0.0.0.0:51204 --workers 1 --worker-class gthread --threads 4 --timeout 600 --max-requests 50 --access-logfile - --error-logfile - --log-level info"
},
"provenance": {
"git_fetch_origin_main": {
"returncode": 0,
"stdout": null,
"stderr": "From https://github.com/jleechanorg/worldarchitect.ai\n * branch main -> FETCH_HEAD\nAuto packing the repository in background for optimum performance.\nSee \"git help gc\" for manual housekeeping.\nwarning: The last gc run reported the following. Please correct the root cause\nand remove /Users/jleechan/projects/worldarchitect.ai/.git/worktrees/fix-campaign-upgrade-hang/gc.log\nAutomatic cleanup will not be performed until the file is removed.\n\nwarning: There are too many unreachable loose objects; run 'git prune' to remove them."
},
"git_head": "8f8eb9719a8c42b9883cb2898c52b01b87cbb779",
"git_branch": "fix-campaign-upgrade-hang",
"merge_base": "c15f7895bad49e7c03042ad943ea0ec3ba743512",
"commits_ahead_of_main": 13,
"diff_stat_vs_main": ".beads/issues.jsonl | 3 +\n mvp_site/agents.py | 897 +++++++++++++++++++++++++++---\n mvp_site/llm_parser.py | 3 +-\n mvp_site/tests/test_agents.py | 154 +++++\n mvp_site/tests/test_dice_provably_fair.py | 13 +-\n mvp_site/tests/test_intent_classifier.py | 8 +-\n mvp_site/tests/test_world_logic.py | 167 +++++-\n mvp_site/world_logic.py | 462 ++-------------\n 8 files changed, 1196 insertions(+), 511 deletions(-)",
"working_tree_staged_changes": 0,
"working_tree_unstaged_changes": 0,
"working_tree_untracked_files": 2,
"working_tree_changed_files": [
"scratch/fetch_raw_state.py",
"testing_mcp/test_campaign_upgrade_modal_lock_red_green.py"
],
"working_tree_diff_sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"working_tree_dirty": false,
"server": {
"base_url": "http://127.0.0.1:51204",
"hostname": "127.0.0.1",
"mode": "local",
"port": "51204",
"pid": 25375,
"process_cmdline": "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python -m gunicorn mvp_site.main:app --bind 0.0.0.0:51204 --workers 1 --worker-class gthread --threads 4 --timeout 600 --max-requests 50 --access-logfile - --error-logfile - --log-level info",
"env_vars": {
"WORLDAI_DEV_MODE": "true",
"TESTING": null,
"MOCK_SERVICES_MODE": "false",
"GOOGLE_APPLICATION_CREDENTIALS": "[SET - file:serviceAccountKey.json]",
"WORLDAI_GOOGLE_APPLICATION_CREDENTIALS": "[SET - file:serviceAccountKey.json]",
"FIRESTORE_EMULATOR_HOST": null,
"PORT": "51204",
"FIREBASE_PROJECT_ID": "worldarchitecture-ai",
"GEMINI_API_KEY": "[SET - 39 chars]",
"LLM_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/llm_request_responses_1779499653080.jsonl",
"HTTP_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/http_request_responses_1779499653080.jsonl",
"GEMINI_HTTP_REQUEST_RESPONSE_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/gemini_http_request_responses_1779499653080.jsonl",
"MCP_TEST_PROVIDER_HTTP_CAPTURE_PATH": "/tmp/worldarchitect.ai/fix-campaign-upgrade-hang/test_campaign_upgrade_modal_lock_red_green/iteration_014/provider_http_request_responses_1779499653080.jsonl"
},
"lsof_output": "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME\nPython 25375 jleechan 5u IPv4 0x1ae633624a03d9d6 0t0 TCP *:51204 (LISTEN)\nPython 25404 jleechan 5u IPv4 0x1ae633624a03d9d6 0t0 TCP *:51204 (LISTEN)",
"ps_output": "PID USER ELAPSED ARGS\n25375 jleechan 02:43 /opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python -m gunicorn mvp_site.main:app --bind 0.0.0.0:51204 --workers 1 --worker-class gthread --threads 4 --timeout 600 --max-requests 50 --access-logfile - --error-logfile - --log-level info"
},
"timestamp": "2026-05-23T01:31:45.842240+00:00",
"test_file": "/Users/jleechan/.gemini/antigravity/worktrees/worldarchitect.ai/fix-campaign-upgrade-hang/testing_mcp/test_campaign_upgrade_modal_lock_red_green.py"
},
"summary": {
"total_scenarios": 3,
"passed": 3,
"failed": 0,
"campaign_capture_total": 0,
"campaign_capture_passed": 0,
"campaign_capture_failed": 0,
"raw_passed": 2,
"raw_total": 2,
"raw_pass_rate": "100.0%"
}
}
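
One detail in the metadata above can be checked independently: the recorded `working_tree_diff_sha256` equals the SHA-256 digest of empty input. That is consistent with an empty tracked diff (`working_tree_dirty: false`, zero staged and unstaged changes). How the harness computes this digest is not shown in the bundle, so reading it as "hash of an empty `git diff`" is an assumption.

```python
import hashlib

# Value recorded in metadata.json under git_provenance.working_tree_diff_sha256.
RECORDED = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# SHA-256 of zero bytes, i.e. what hashing an empty diff would produce.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()

print(EMPTY_SHA256 == RECORDED)  # True
```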

Comparative Red/Green runs testing the campaign upgrade modal lock routing loop.

Red Scenario (with DISABLE_CAMPAIGN_UPGRADE_FIX=true): With the modal lock fix disabled, reaching universe_control >= 70 triggers the CampaignUpgradeAgent and traps the user in an infinite loop where subsequent non-upgrade actions are repeatedly routed to CampaignUpgradeAgent.

Green Scenario (with DISABLE_CAMPAIGN_UPGRADE_FIX=false/unset): Selecting the ceremony entry choice locks the user to the CampaignUpgradeAgent. Completing the ceremony (advancing to tier divine/sovereign) releases the lock, allowing subsequent actions to route back to StoryModeAgent.
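
The routing difference between the two scenarios can be pictured with a toy model. This is only an illustrative sketch of the behavior described above, not the code in mvp_site/agents.py: the `CampaignState` fields, the action strings, the `"mortal"` starting tier, and the `route` function are all hypothetical. Only the agent names, the `universe_control >= 70` threshold, and the `divine` tier come from the description.

```python
from dataclasses import dataclass


@dataclass
class CampaignState:
    universe_control: int = 0
    upgrade_in_progress: bool = False  # hypothetical modal-lock flag
    tier: str = "mortal"               # hypothetical starting tier


def route(state: CampaignState, action: str, fix_disabled: bool) -> str:
    """Pick an agent name for one user action (toy model)."""
    if fix_disabled:
        # RED: the threshold alone decides, so every later action is trapped.
        if state.universe_control >= 70:
            return "CampaignUpgradeAgent"
        return "StoryModeAgent"

    # GREEN: only the explicit ceremony entry choice takes the lock.
    if action == "enter_ceremony" and state.universe_control >= 70:
        state.upgrade_in_progress = True
    if state.upgrade_in_progress:
        if action == "complete_ceremony":
            # Completing the ceremony advances the tier and releases the lock.
            state.tier = "divine"
            state.upgrade_in_progress = False
        return "CampaignUpgradeAgent"
    return "StoryModeAgent"


red = CampaignState(universe_control=75)
print([route(red, "look_around", fix_disabled=True) for _ in range(3)])

green = CampaignState(universe_control=75)
for action in ["enter_ceremony", "look_around", "complete_ceremony", "look_around"]:
    print(action, "->", route(green, action, fix_disabled=False))
```

In this model the lock is an explicit flag that is set on entry and cleared on completion, which is what lets the final action leave CampaignUpgradeAgent. The sketch follows the description in naming StoryModeAgent as the post-ceremony agent; the run recorded in this bundle reports RewardsAgent for its follow-up and post-ceremony checks.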

{
"scenarios": [
{
"name": "red_infinite_loop",
"passed": true,
"errors": [],
"checks": {
"normal_agent": "CampaignUpgradeAgent",
"disable_fix": true
},
"user_id": "test-test_campaign_upgrade_modal_lock_red_green-1779499652",
"user_email": "[email protected]"
},
{
"name": "green_stale_flag_routing",
"passed": true,
"errors": [],
"checks": {
"followup_agent": "RewardsAgent",
"ceremony_agent": "CampaignUpgradeAgent",
"in_progress_flag_after_completion": false,
"post_ceremony_agent": "RewardsAgent",
"disable_fix": false
},
"user_id": "test-test_campaign_upgrade_modal_lock_red_green-1779499652",
"user_email": "[email protected]"
},
{
"name": "EVIDENCE_SIGNATURE_GUARD",
"passed": true,
"signed_count": 6,
"user_id": "test-test_campaign_upgrade_modal_lock_red_green-1779499652"
}
],
"summary": {
"total": 3,
"passed": 3,
"failed": 0,
"pass_rate": "3/3 (100%)",
"raw_total": 2,
"raw_passed": 2,
"raw_pass_rate": "100.0%",
"raw_data_complete": true
}
}
{
"version": "1.0.0",
"generated_at": "2026-05-23T01:31:46.002366+00:00",
"summary": {
"scenarios_with_streaming_evidence": 0,
"total_chunk_events_observed": 0,
"stream_http_calls_captured": 16,
"process_action_calls_captured": 16,
"mcp_process_action_calls_captured": 0,
"route_stream_process_action_calls_captured": 16,
"process_action_calls_with_raw_response_text": 8
},
"scenarios": []
}