@grahama1970
Created September 27, 2025 21:43
All-smokes split still timing out: focused diffs (TTL 15m)

All‑Smokes Gate Still Timing Out/Failing in Split: Targeted Debug + Patch Requests

Created: 2025-09-27
TTL: Private, delete within 15 minutes after review

Summary

  • We split the composite all_smokes gate into all_smokes_core + all_smokes_nd and added per‑check timeouts, xdist and PYTEST_ADDOPTS pass‑through.
  • The orchestrator no longer fails on every run, but we still see:
    1. Harness timeouts on some runs (improved, but still possible on slow hosts).
    2. True FAILs in all_smokes_nd caused by env/base mismatches (see below); the same tests pass when run in isolation.

Key Observations (from runs on this host)

  • When the shim sees 8788 already bound, it starts on a free port (ma_port). Our configured checks point MINI_AGENT_API_HOST/PORT at ma_host/ma_port, but CODEX_AGENT_API_BASE in readiness.yml is still hard‑coded to http://127.0.0.1:8788, so codex‑agent tests can hit the wrong server and fail with HTTP 500 or connection errors.
  • In long runs, occasional port collisions and docker tools‑stub mapping :8791 caused non‑deterministic behavior. We added preflight port freeing and a docker stop for tools‑stub, which helped.
  • After deduping pytest.ini sections, deterministic suites are clean. Remaining red items concentrate in ND/E2E where bases or models weren’t fully aligned to the shim’s dynamic port.
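
The port drift described above can be sketched as follows. This is a minimal illustration, not the shim's actual code: `port_is_free`, `resolve_shim_port`, and `shim_env` are hypothetical helpers, and only the environment variable names and the 8788 default come from the notes above. The idea is that every base the tests read is derived from one resolved (host, port) pair, so nothing can keep pointing at 8788 after the shim has moved.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a plain TCP bind to (host, port) succeeds right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def resolve_shim_port(host: str = "127.0.0.1", preferred: int = 8788) -> int:
    """Prefer the default port; otherwise ask the OS for any free one."""
    if port_is_free(host, preferred):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def shim_env(host: str, port: int) -> dict[str, str]:
    """Derive all three variables from the same resolved (host, port)."""
    return {
        "MINI_AGENT_API_HOST": host,
        "MINI_AGENT_API_PORT": str(port),
        "CODEX_AGENT_API_BASE": f"http://{host}:{port}",
    }
```

There is an unavoidable race between the probe and the server's real bind, so the shim should still handle a bind failure at startup.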

Failing tests (latest all_smokes_nd pass, trimmed)

  • tests/ndsmoke/test_codex_agent_live_optional.py::test_codex_agent_live_optional
  • tests/ndsmoke/test_loop_exec_python_ndsmoke.py::test_loop_exec_python_ndsmoke
  • tests/ndsmoke/test_mini_agent_api_live_minimal_ndsmoke.py::test_agent_api_live_minimal_optional
  • tests/ndsmoke/test_mini_agent_docker_live_optional.py::test_mini_agent_docker_codex_code_live_optional
  • tests/ndsmoke/test_mini_agent_lang_ndsmoke.py::test_lang_javascript_live_optional
  • tests/ndsmoke/test_ollama_generate_ndsmoke.py::test_ollama_generate_optional
  • tests/ndsmoke_e2e/test_codex_agent_e2e_low.py::test_codex_agent_router_low_optional
  • tests/ndsmoke_e2e/test_mini_agent_e2e_high_escalation.py::test_mini_agent_escalation_high_optional
  • tests/ndsmoke_e2e/test_mini_agent_e2e_low.py::test_mini_agent_finalize_via_api_low

Hypotheses

  • Codex‑agent path: wrong base URL (using 8788) when the shim moved to a free port; solution is to override CODEX_AGENT_API_BASE to http://{ma_host}:{ma_port} for configured checks (same as we already do for MINI_AGENT_API_*).
  • Mini‑agent API minimal/low E2E: same root cause (hitting the wrong port when 8788 is occupied).
  • JavaScript live optional: Node is likely not detectable, or the path check is being bypassed. Options: ensure Node is installed, force a skip via tool detection, or fix the “which” probe path in the ND test environment so the test skips cleanly when Node is missing.
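
For the Node hypothesis, a tool probe along these lines would skip only when Node is truly absent. This is a sketch under assumptions: `find_node` is a hypothetical helper, and the candidate binary names `node` and `nodejs` are guesses about the host, not something the ND tests are known to use.

```python
import shutil
import subprocess
from typing import Optional


def find_node() -> Optional[str]:
    """Return the path of a Node binary that actually runs, else None."""
    for name in ("node", "nodejs"):  # assumed candidate names
        path = shutil.which(name)
        if path is None:
            continue
        try:
            result = subprocess.run(
                [path, "--version"],
                capture_output=True,
                text=True,
                timeout=10,
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # on PATH but not runnable; try the next candidate
        if result.returncode == 0:
            return path
    return None


# Possible use inside a test module (requires pytest):
#   pytestmark = pytest.mark.skipif(find_node() is None, reason="Node not installed")
```

Running `--version` rather than trusting `shutil.which` alone guards against a stale or broken entry on PATH.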

Please review and supply clean diffs for:

  1. scripts/mvp_check.py: override CODEX_AGENT_API_BASE with shim port for configured checks
@@ def main():
             env.update({k: str(v) for (k, v) in env_add.items()})
             # If the configured check targets the mini-agent API, prefer the shim port we resolved above
-            if name == 'mini_agent_e2e_low' or 'MINI_AGENT_API_PORT' in env:
+            if name in ('mini_agent_e2e_low','all_smokes','all_smokes_core','all_smokes_nd') or 'MINI_AGENT_API_PORT' in env:
                 try:
                     env['MINI_AGENT_API_HOST'] = locals().get('ma_host', env.get('MINI_AGENT_API_HOST','127.0.0.1'))
                     env['MINI_AGENT_API_PORT'] = str(locals().get('ma_port', env.get('MINI_AGENT_API_PORT','8788')))
+                    # Ensure codex-agent base hits the same shim
+                    env['CODEX_AGENT_API_BASE'] = f"http://{env['MINI_AGENT_API_HOST']}:{env['MINI_AGENT_API_PORT']}"
                 except Exception:
                     pass
  2. readiness.yml: remove hard‑coded CODEX_AGENT_API_BASE=…8788 from split checks or set it dynamically (mvp_check will override anyway). If you prefer to keep it:
-      CODEX_AGENT_API_BASE: http://127.0.0.1:8788
+      # Base will be overridden at runtime to the shim port by mvp_check
+      CODEX_AGENT_API_BASE: http://127.0.0.1:8788
  3. scripts/run_all_smokes.py: ensure consistent env for codex endpoint and allow xdist (already present on our branch but included for completeness)
@@
     cmd = ["pytest", "-q"] + targets
     if importlib.util.find_spec("xdist") and os.environ.get("NO_XDIST") != "1":
         workers = os.environ.get("PYTEST_XDIST_AUTO_NUM_WORKERS") or "auto"
         cmd += ["-n", workers]
     extra = shlex.split(os.environ.get("PYTEST_ADDOPTS", "")) if os.environ.get("PYTEST_ADDOPTS") else []
     cmd += extra
  4. pytest.ini: keep a single [pytest] section (we fixed a duplicate section locally; this note is included to avoid recurrence)

  5. Optional: Node tool detection patch

  • To stop lang_javascript_live_optional from failing when Node is missing, either ensure Node is available or add a more robust PATH probe so the test skips cleanly. Per your policy we prefer working features, so it should skip only when Node is truly absent.
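
To keep the duplicate [pytest] section from coming back, a small preflight check could run before the gates. The sketch below is an assumption about how such a guard might look, not part of the existing scripts; `duplicate_section` is a hypothetical name. It relies on the standard library's `configparser`, which in strict mode rejects a section that appears twice in one source.

```python
import configparser
from typing import Optional


def duplicate_section(ini_text: str) -> Optional[str]:
    """Return the name of the first duplicated section, or None if clean."""
    # interpolation=None so '%' characters in option values are left alone.
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    try:
        parser.read_string(ini_text)
    except configparser.DuplicateSectionError as exc:
        return exc.section
    return None


# Possible use in a preflight step:
#   with open("pytest.ini", encoding="utf-8") as fh:
#       dup = duplicate_section(fh.read())
#   if dup is not None:
#       raise SystemExit(f"pytest.ini has a duplicate [{dup}] section")
```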

Run Plan After Patch

  • Strict split: ALL_SMOKES_CORE_TIMEOUT=720 ALL_SMOKES_ND_TIMEOUT=1500 make project-ready-all-split
  • If CI supports xdist: set PYTEST_XDIST_AUTO_NUM_WORKERS (e.g., 6–8) and PYTEST_ADDOPTS="--durations=25".
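
How the make target consumes ALL_SMOKES_CORE_TIMEOUT and ALL_SMOKES_ND_TIMEOUT is not shown here, so the following is only a sketch of one way a per-check timeout could be read from the environment and enforced. `run_with_timeout` is a hypothetical helper, and the exit code 124 is borrowed from the GNU `timeout` convention rather than from our scripts.

```python
import os
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout_var: str, default_seconds: float) -> int:
    """Run cmd with a wall-clock limit taken from an environment variable.

    Returns the command's exit code, or 124 if the limit was exceeded.
    """
    timeout = float(os.environ.get(timeout_var, default_seconds))
    try:
        # subprocess.run kills the child process when the timeout expires.
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return 124


if __name__ == "__main__":
    # Example: bound the ND split at ALL_SMOKES_ND_TIMEOUT seconds (default 1500).
    code = run_with_timeout(
        [sys.executable, "-c", "print('nd split placeholder')"],
        "ALL_SMOKES_ND_TIMEOUT",
        1500,
    )
    sys.exit(code)
```

Note that this only terminates the direct child; anything that child spawned (servers, docker containers) still needs its own cleanup, which is where the preflight port freeing comes in.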

What I did (already applied locally)

  • Port-freeing logic, docker tools-stub stop, echo shim for the codex-agent OpenAI endpoint, parallel + timeout fixes, storage trace schema addition, Agent Proxy echo via router, HTTP invoker with tolerant POST.

Ask

  • Provide surgical diffs for (1) the CODEX_AGENT_API_BASE override in configured checks and (2) the readiness.yml base, either kept with the explanatory comment or set dynamically.
  • Optional: confirm a maximum end‑to‑end wall‑clock you expect for the full suite so we can set conservative timeouts.