Ralph Loop is a Hermes skill that turns any PRD (Product Requirements Document) into an autonomous, self-correcting build pipeline. It implements a "Ralph Wiggum Loop" -- the same high-level task is executed repeatedly against the current repository state until a strict verification condition is met (<RALPH_DONE>).
Key invariant: Progress lives 100% in files and git -- never in conversation context. Each iteration starts with fresh context, eliminating context rot on 100+ phase projects.
Read PRD.md + progress.json + git status
↓
Implement one atomic verifiable chunk (via subagent swarm)
↓
Run tests + linter + self-critique
↓
Git commit with descriptive message
↓
If verification passes and all phases done → <RALPH_DONE>
else → schedule next iteration
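The control flow above can be sketched in a few lines of Python. This is a minimal illustration, not the orchestrator's actual code: `IterationResult`, `ralph_loop`, and the `max_iterations` cap are hypothetical names, and the real skill schedules the next iteration externally rather than looping in one process.

```python
from dataclasses import dataclass
from typing import Callable

DONE_MARKER = "<RALPH_DONE>"


@dataclass
class IterationResult:
    verified: bool         # tests, linter, and self-critique all passed
    all_phases_done: bool  # every PRD phase is checked off


def ralph_loop(run_iteration: Callable[[int], IterationResult],
               max_iterations: int = 100) -> str:
    """Repeat the same task until verification passes and all phases are done."""
    for i in range(1, max_iterations + 1):
        # Each call is expected to rebuild its view of the project from
        # files and git, not from any state carried over in memory.
        result = run_iteration(i)
        if result.verified and result.all_phases_done:
            return DONE_MARKER
    return "MAX_ITERATIONS_REACHED"
```

Because `run_iteration` receives only the iteration number, anything it needs to know about earlier work has to come from disk, which is the invariant the loop relies on.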
State files (external, not in context):
- PRD.md -- Numbered phases with checkboxes: 1. [ ] Setup, 2. [x] Core, etc.
- progress.json -- Iteration counter, completed phases, last commit SHA, auto-improve counter
- ralph_log.md -- Full history of every iteration with timestamps and results
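As an illustration of how the PRD checkbox format could be read back on each iteration, here is a small sketch. The regular expression and the helper names `parse_phases` and `next_open_phase` are assumptions made for this example; the orchestrator's real parser may differ.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as "1. [ ] Setup" or "2. [x] Core".
PHASE_RE = re.compile(r"^\s*(\d+)\.\s*\[( |x|X)\]\s*(.+?)\s*$")


def parse_phases(prd_text: str) -> List[Tuple[int, bool, str]]:
    """Return (number, done, title) for every numbered checkbox line."""
    phases = []
    for line in prd_text.splitlines():
        m = PHASE_RE.match(line)
        if m:
            done = m.group(2).lower() == "x"
            phases.append((int(m.group(1)), done, m.group(3)))
    return phases


def next_open_phase(phases: List[Tuple[int, bool, str]]) -> Optional[Tuple[int, str]]:
    """Return the first unchecked phase, or None when all are complete."""
    for number, done, title in phases:
        if not done:
            return number, title
    return None
```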
Config files (per-project overrides):
- ~/.hermes/skills/ralph_orchestrator/config/agents.yaml -- Default agent definitions
- <project>/.ralph/agents.yaml -- Project-specific agent overrides (adds roles like "graphics")
Each agent has:
- description / when_to_use -- Routing logic for which agent handles what
- personality -- Injected into subagent system prompt
- toolsets -- Which tools the subagent can access (e.g., ["terminal", "file", "web"])
- llm_endpoint -- Model override (default: deepseek/deepseek-v3.2)
- temperature -- Sampling temperature (coder: 0.7, tester: 0.3, reviewer: 0.5)
- enabled -- Toggle agents on/off per-project
Default agents:
- coder -- Implements the current phase
- tester -- Writes and runs tests
- reviewer -- Self-critique for correctness, security, PRD compliance
Custom agents can be added per-project. For example, a game project might add a "graphics" agent for sprite/texture work.
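One plausible way to combine the default definitions with a project's overrides is a field-level merge, sketched below with plain dictionaries standing in for the parsed YAML. The function name `merge_agents` and the merge semantics (overrides update individual fields, disabled agents are dropped) are assumptions for illustration; the source does not specify how the two files are combined.

```python
from copy import deepcopy
from typing import Dict


def merge_agents(defaults: Dict[str, dict], overrides: Dict[str, dict]) -> Dict[str, dict]:
    """Overlay project overrides on the defaults and drop disabled agents."""
    merged = deepcopy(defaults)  # never mutate the shared defaults
    for name, fields in overrides.items():
        # New agents are added; existing agents get field-level updates.
        merged.setdefault(name, {}).update(fields)
    return {name: cfg for name, cfg in merged.items() if cfg.get("enabled", True)}


defaults = {
    "coder": {"temperature": 0.7, "enabled": True},
    "tester": {"temperature": 0.3, "enabled": True},
    "reviewer": {"temperature": 0.5, "enabled": True},
}
# Hypothetical project file: adds a "graphics" agent and disables the reviewer.
overrides = {
    "graphics": {"toolsets": ["terminal", "file"], "enabled": True},
    "reviewer": {"enabled": False},
}
active = merge_agents(defaults, overrides)
```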
Ralph has three execution modes, auto-detected at runtime:
| Mode | Detection | Behavior |
|---|---|---|
| Real | delegate_task available in hermes_tools | Spawns actual AI subagents via delegate_task |
| Simulation | In sandbox but delegate_task unavailable | Creates placeholder files, always fails verification |
| Standalone | Not in sandbox (direct Python) | Simulation mode |
Mode detection (real_mode/phase_detector.py):
- Check for HERMES_RPC_SOCKET env var (set in execute_code sandbox)
- Try from hermes_tools import delegate_task
- If both succeed → real mode
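The detection steps can be sketched as follows. This is not the contents of phase_detector.py: the function name `detect_mode` and its injectable `environ` and `importer` parameters are hypothetical, added so the logic can be exercised outside the sandbox.

```python
import importlib
import os
from typing import Callable, Mapping, Optional


def detect_mode(environ: Optional[Mapping[str, str]] = None,
                importer: Callable[[str], object] = importlib.import_module) -> str:
    """Return "real", "simulation", or "standalone" per the table above."""
    env = os.environ if environ is None else environ

    # Step 1: the sandbox is assumed to set HERMES_RPC_SOCKET.
    if "HERMES_RPC_SOCKET" not in env:
        return "standalone"

    # Step 2: try to obtain delegate_task from hermes_tools.
    try:
        tools = importer("hermes_tools")
    except ImportError:
        return "simulation"

    # Step 3: both checks succeeded only if the function is actually exposed.
    return "real" if hasattr(tools, "delegate_task") else "simulation"
```

Run directly with Python and no sandbox variables set, `detect_mode()` returns "standalone", which the orchestrator treats as simulation.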
For real subagents, run via Hermes CLI:
hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"
The Python script alone (without Hermes) cannot spawn real subagents because delegate_task is not exposed in the execute_code sandbox by default.
Sandbox modifications (code_execution_tool.py):
- Added delegate_task to SANDBOX_ALLOWED_TOOLS (alongside terminal, read_file, etc.)
- Added delegate_task stub to auto-generated hermes_tools.py in sandbox
- Sandbox timeout: 900 seconds (15 min)
- Max tool calls: 50 per script
Delegate task limits (delegate_tool.py):
- MAX_CONCURRENT_CHILDREN = 10 (parallel subagents)
- MAX_DEPTH = 2 (no recursive delegation: parent→child→grandchild blocked)
- DEFAULT_MAX_ITERATIONS = 500 per subagent
- Blocked tools for children: delegate_task, clarify, memory, send_message, execute_code, restart_chat
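To make the limits concrete, the sketch below shows how they might be applied before spawning a child. The constants are taken from the list above; the helper names `child_tools` and `can_delegate`, and the assumption that the top-level parent sits at depth 0, are illustrative and not taken from delegate_tool.py.

```python
from typing import Iterable, List

MAX_CONCURRENT_CHILDREN = 10
MAX_DEPTH = 2
BLOCKED_CHILD_TOOLS = frozenset({
    "delegate_task", "clarify", "memory",
    "send_message", "execute_code", "restart_chat",
})


def child_tools(requested: Iterable[str]) -> List[str]:
    """Strip tools that children are never allowed to use."""
    return [tool for tool in requested if tool not in BLOCKED_CHILD_TOOLS]


def can_delegate(parent_depth: int, active_children: int) -> bool:
    """Allow parent (depth 0) -> child (depth 1); reject a grandchild (depth 2)."""
    child_depth = parent_depth + 1
    return child_depth < MAX_DEPTH and active_children < MAX_CONCURRENT_CHILDREN
```

Since delegate_task is itself in the blocked set, a child cannot spawn further subagents even if the depth check were bypassed.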
Safety manager (real_mode/safety_manager.py):
- Tracks agent counts across nested delegations
- Enforces 10 agents per script, 50 total across hierarchy
- Singleton pattern for cross-invocation tracking
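A minimal sketch of a singleton that enforces these two limits is shown below. It assumes tracking within a single Python process; the class layout and the `register` method are hypothetical and not taken from safety_manager.py.

```python
import threading
from typing import Dict


class SafetyManager:
    """Process-wide counter of spawned agents (illustrative only)."""

    MAX_PER_SCRIPT = 10
    MAX_TOTAL = 50
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Every call returns the same object, so counts are shared.
        with cls._lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst.total = 0
                inst.per_script: Dict[str, int] = {}
                cls._instance = inst
        return cls._instance

    def register(self, script_id: str) -> bool:
        """Record one new agent; return False if a limit would be exceeded."""
        with self._lock:
            used = self.per_script.get(script_id, 0)
            if used >= self.MAX_PER_SCRIPT or self.total >= self.MAX_TOTAL:
                return False
            self.per_script[script_id] = used + 1
            self.total += 1
            return True
```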
Every 10 successful iterations or on any failure, Ralph appends a lesson to its own SKILL.md:
## Lesson learned (iteration 42)
- Time: 2026-03-15T...
- Verification: PASS/FAIL
- Issues: (if any)
- Summary: Auto-generated improvement note
The orchestrator literally evolves based on its own execution experience.
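The trigger rule and the lesson format can be sketched like this. The function names and the `successes_since_last` counter are assumptions for illustration; the field layout mirrors the template above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def should_record_lesson(successes_since_last: int, verification_passed: bool) -> bool:
    """Record on any failure, or after every 10 successful iterations."""
    return (not verification_passed) or successes_since_last >= 10


def format_lesson(iteration: int, passed: bool, issues: List[str],
                  summary: str, now: Optional[datetime] = None) -> str:
    """Render one lesson entry in the template shown above."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    lines = [
        f"## Lesson learned (iteration {iteration})",
        f"- Time: {stamp}",
        f"- Verification: {'PASS' if passed else 'FAIL'}",
    ]
    if issues:  # the Issues line is optional
        lines.append(f"- Issues: {'; '.join(issues)}")
    lines.append(f"- Summary: {summary}")
    return "\n".join(lines) + "\n"


def append_lesson(skill_md: Path, lesson: str) -> None:
    """Append the entry to SKILL.md without touching existing content."""
    with skill_md.open("a", encoding="utf-8") as fh:
        fh.write("\n" + lesson)
```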
Before accepting an iteration, ALL must pass:
- Test suite runs without failures
- Linter passes
- All phases marked complete ([x])
- Self-critique passes
- Git working tree clean
If any check fails, the loop continues but does not commit.
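The all-or-nothing gate can be expressed as a small data structure. This is a sketch only: the `Checks` class and its field names are hypothetical, and how each boolean is computed (test runner, linter, git status) is left out.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Checks:
    tests_pass: bool
    linter_pass: bool
    all_phases_complete: bool
    self_critique_pass: bool
    git_tree_clean: bool


def failed_checks(checks: Checks) -> List[str]:
    """Names of the checks that did not pass, in declaration order."""
    return [f.name for f in fields(checks) if not getattr(checks, f.name)]


def accept_iteration(checks: Checks) -> bool:
    """An iteration is accepted only when every check passes."""
    return not failed_checks(checks)
```

Returning the names of the failed checks, rather than a bare boolean, gives the next iteration something concrete to write into ralph_log.md.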
Working:
- PRD parsing, progress tracking, git integration
- Configurable agent system with per-project overrides
- Phase detection and mode switching
- Sandbox delegate_task integration (code_execution_tool patches applied)
- Safety limits (concurrent agents, depth, iteration caps)
- Self-improvement via SKILL.md lessons
- Cron scheduling for continuous operation
Requires Hermes restart to activate:
- Patches to code_execution_tool.py (adding delegate_task to SANDBOX_ALLOWED_TOOLS and hermes_tools stub generator)
Not yet tested end-to-end:
- Full real-mode execution: PRD → real subagents → verification → commit → next phase
~/.hermes/skills/ralph_orchestrator/
├── SKILL.md                      # Self-improving docs + lesson log
├── config/
│   └── agents.yaml               # Default agent definitions
├── scripts/
│   └── ralph_orchestrator.py     # Main orchestrator (~1100 lines)
└── real_mode/
    ├── phase_detector.py         # Detect real/sim/standalone mode
    ├── safety_manager.py         # Agent count limits
    ├── real_subagent_swarm.py    # Bridges to delegate_task
    └── test_*.py                 # Integration tests
# Modified Hermes files (merged):
~/.hermes/hermes-agent/tools/
├── code_execution_tool.py # Added delegate_task to SANDBOX_ALLOWED_TOOLS
└── delegate_tool.py # Subagent spawning (upstream)
# Initialize a project
python3 ralph_orchestrator.py init "# My Project
## Phases
1. [ ] Setup project structure
2. [ ] Implement core algorithm
3. [ ] Write tests
4. [ ] Documentation"
# Run (simulation mode - for testing orchestration logic)
python3 ralph_orchestrator.py run
# Run (real mode - via Hermes CLI)
hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"
# Schedule continuous operation
python3 ralph_orchestrator.py create-cron "every 10 minutes"
# Monitor
python3 ralph_orchestrator.py status
tail -f ralph_log.md