Prove that we can use a background Claude Code process to verify results and block task progression until verification completes.
This is a self-contained POC. All required command files are in the commands/ subdirectory.
All tasks should be executed with the current working directory as:
/home/graham/workspace/experiments/llm_call/src/llm_call/usage/docs/tasks/claude_poll_poc/
All file paths are relative to this directory unless otherwise specified.
All generated files (scripts, logs, outputs) should be created in this POC directory:
- simple_add.py - Generated Python script
- add_results.txt - Output from simple_add.py
- verification_status.json - Status file from verification
- final_answer.txt - Final output file
- execution_summary.txt - Summary for Gemini
- task_execution.log - All logging output
- setup_logging.py - Initial log setup script
- verify_script.py - Generated verification script
- poll_script.py - Generated polling script
- Any archived logs
- Read /home/graham/.claude/CLAUDE.md and /home/graham/workspace/experiments/llm_call/CLAUDE.md
- Verify virtual environment is active (which python contains .venv)
- To archive existing logs and initialize fresh logging, first create a Python script named setup_logging.py. This script will handle both archiving the old log and configuring the new one. Use the code snippets in commands/logging-setup.md to build this script.
- Execute the setup script: python setup_logging.py
- Log: "[TASK1] Environment setup complete."
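
For orientation, the archive-then-initialize behaviour described for setup_logging.py could look roughly like the sketch below. It is not the content of commands/logging-setup.md, which remains the authoritative source. The function names, the logger name `claude_poll_poc`, the timestamped archive naming, and the log format are all assumptions made for illustration.

```python
import logging
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive_log(log_path: Path) -> Optional[Path]:
    """Move an existing log aside under a timestamped name (assumed scheme)."""
    if not log_path.exists():
        return None
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    archived = log_path.with_name(f"{log_path.stem}_{stamp}{log_path.suffix}")
    shutil.move(str(log_path), str(archived))
    return archived


def init_logging(log_path: Path) -> logging.Logger:
    """Attach a fresh file handler so every task logs to the same file."""
    logger = logging.getLogger("claude_poll_poc")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated setup
    handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A script built this way would call `archive_log(Path("task_execution.log"))` first, then `init_logging(...)`, and finally log the "[TASK1] Environment setup complete." message.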
- Write simple_add.py that adds 2 + 3
- The script should NOT log its own source code (prevents recursive logging)
- Execute the script and capture output
- Verify add_results.txt contains "The sum of 2 and 3 is 5"
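
A minimal sketch of what simple_add.py might contain is shown here. It assumes the script itself is responsible for writing add_results.txt; the helper name `write_sum` is hypothetical. It prints only the result line and never its own source.

```python
from pathlib import Path


def write_sum(a: int, b: int, out_path: Path) -> str:
    """Write the sentence the verification step looks for and return it."""
    line = f"The sum of {a} and {b} is {a + b}"
    out_path.write_text(line + "\n", encoding="utf-8")
    return line


if __name__ == "__main__":
    # Only the result is printed; the source code is never logged.
    print(write_sum(2, 3, Path("add_results.txt")))
```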
- Create verification_status.json with initial content: {"status": "in-progress"}
- Launch background verification:
  - Generate a new script named verify_script.py by following the instructions in commands/claude-verify.md with these parameters:
    - code_file=simple_add.py, result_file=add_results.txt, status_file=verification_status.json, log_file=task_execution.log, expected_result=5
  - Launch verify_script.py as a background process (non-blocking).
- Wait for verification to complete:
  - Create a new script named poll_script.py. Use the full Python code provided in commands/claude-poll.md as its content, replacing the placeholder variables with these parameters:
    - status_file=verification_status.json, expected_status=pass, timeout=600, log_file=task_execution.log
  - Execute python poll_script.py and wait for it to finish. This script is designed to block until verification completes or times out.
  - After poll_script.py finishes, you MUST check its exit code. If the exit code is 0, log "[TASK3] Verification confirmed successfully." and proceed to Task 4. If the exit code is not 0 (e.g., 1 for timeout), you must stop all further tasks immediately and log the failure (e.g., '[EXECUTION_HALTED] Polling failed or timed out.').
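
The launch-then-poll pattern in this task can be sketched as follows. This is an illustration under stated assumptions, not the code from commands/claude-verify.md or commands/claude-poll.md. The function names are hypothetical, the polling interval is an assumed default, and returning 2 for a final status other than the expected one is an assumption (the task only specifies 0 for success and gives 1 as the timeout example).

```python
import json
import subprocess
import sys
import time
from pathlib import Path
from typing import Optional


def launch_background(script: Path, *args: str) -> subprocess.Popen:
    """Start a Python script without waiting for it to finish."""
    return subprocess.Popen([sys.executable, str(script), *args])


def read_status(status_file: Path) -> Optional[str]:
    """Return the "status" field, or None if the file is missing or mid-write."""
    try:
        data = json.loads(status_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    return data.get("status") if isinstance(data, dict) else None


def poll_status(status_file: Path, expected_status: str = "pass",
                timeout: float = 600.0, interval: float = 1.0) -> int:
    """Block until the status is final or the timeout expires; return an exit code."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = read_status(status_file)
        if status == expected_status:
            return 0
        if status not in (None, "in-progress"):
            return 2  # assumption: a final status other than the expected one
        time.sleep(interval)
    return 1  # timed out while still in progress
```

A polling script built on this would end with `sys.exit(poll_status(Path("verification_status.json")))`, so the caller can gate the next task on the exit code. Treating an unreadable or half-written file as "not ready yet" keeps the poller from failing while the verifier is still rewriting the status file.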
- This task must only be executed if Task 3 completed with an exit code of 0.
- Write "Paris" to final_answer.txt
- Log: "[TASK4] Final answer written after successful verification"
- This task proves sequential execution: it only runs after polling completes successfully
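
The exit-code gate between polling and this task could be expressed as in the sketch below. The helper name `run_gated` and the injectable `log` callable are assumptions for illustration; in the actual run the messages would go to task_execution.log rather than standard output.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Sequence


def run_gated(poll_cmd: Sequence[str], answer_file: Path,
              answer: str = "Paris",
              log: Callable[[str], None] = print) -> bool:
    """Run the polling command; write the answer only if it exits with 0."""
    result = subprocess.run(list(poll_cmd))
    if result.returncode != 0:
        log("[EXECUTION_HALTED] Polling failed or timed out.")
        return False  # stop here: the final answer must not be written
    log("[TASK3] Verification confirmed successfully.")
    answer_file.write_text(answer, encoding="utf-8")
    log("[TASK4] Final answer written after successful verification")
    return True
```

For example, `run_gated([sys.executable, "poll_script.py"], Path("final_answer.txt"))` would block on the poller and leave final_answer.txt absent whenever polling fails.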
- Create execution_summary.txt by reading the generated files and populating the following template:
  POC Execution Summary:
  - simple_add.py created: [yes/no]
  - add_results.txt contains: [file content]
  - verification_status.json status: [final status value from file]
  - final_answer.txt contains: [file content, or "not created" if Task 4 did not run]
  - All files were created sequentially: [Answer 'yes' only if Task 4 was successfully executed, otherwise answer 'no']
- To simulate the Gemini call, create a new Python script for this task. Use the full Python code from commands/ask-gemini-flash.md as its content, then execute the script.
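
Populating the summary template from the generated files could be sketched as below. The helper names are hypothetical. One simplification is an assumption and is marked in the code: the presence of final_answer.txt stands in for "Task 4 was successfully executed". The Gemini call itself is not sketched here, since its code lives in commands/ask-gemini-flash.md.

```python
import json
from pathlib import Path


def read_or(path: Path, default: str) -> str:
    """Return the stripped file content, or a default when the file is absent."""
    return path.read_text(encoding="utf-8").strip() if path.exists() else default


def build_summary(poc_dir: Path) -> str:
    """Fill the summary template from the files in the POC directory."""
    try:
        raw = (poc_dir / "verification_status.json").read_text(encoding="utf-8")
        status = json.loads(raw).get("status", "unknown")
    except (OSError, ValueError, AttributeError):
        status = "unknown"
    created = "yes" if (poc_dir / "simple_add.py").exists() else "no"
    # Assumption: final_answer.txt existing means Task 4 ran successfully.
    sequential = "yes" if (poc_dir / "final_answer.txt").exists() else "no"
    lines = [
        "POC Execution Summary:",
        f"- simple_add.py created: {created}",
        f"- add_results.txt contains: {read_or(poc_dir / 'add_results.txt', 'not created')}",
        f"- verification_status.json status: {status}",
        f"- final_answer.txt contains: {read_or(poc_dir / 'final_answer.txt', 'not created')}",
        f"- All files were created sequentially: {sequential}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result is then a single call: `Path("execution_summary.txt").write_text(build_summary(Path(".")), encoding="utf-8")`.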
- All tasks execute in sequence (no parallel execution)
- Task 4 only runs after Task 3's polling confirms verification passed
- Gemini confirms all results are real