@danieljue
Created March 17, 2026 20:48
Hermes Ralph Loop v1

Ralph Loop: A Persistent Multi-Agent Build Orchestrator for Hermes

What It Is

Ralph Loop is a Hermes skill that turns any PRD (Product Requirements Document) into an autonomous, self-correcting build pipeline. It implements a "Ralph Wiggum Loop" -- the same high-level task is executed repeatedly against the current repository state until a strict verification condition is met (<RALPH_DONE>).

Key invariant: Progress lives 100% in files and git -- never in conversation context. Each iteration starts with fresh context, eliminating context rot on 100+ phase projects.

Core Cycle

Read PRD.md + progress.json + git status
    ↓
Implement one atomic verifiable chunk (via subagent swarm)
    ↓
Run tests + linter + self-critique
    ↓
Git commit with descriptive message
    ↓
If verification passes and all phases done → <RALPH_DONE>
else → schedule next iteration
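
The cycle above can be sketched as a plain loop. This is only an illustration, not the orchestrator's actual code: `run_iteration` and `all_done` are hypothetical callbacks standing in for the implement/verify/commit steps and for re-reading state from disk, and the cap of 200 is the default documented for `--max-iterations`.

```python
from typing import Callable

RALPH_DONE = "<RALPH_DONE>"


def ralph_loop(run_iteration: Callable[[int], bool],
               all_done: Callable[[], bool],
               max_iterations: int = 200) -> str:
    """Repeat iterations until verification passes and every phase is done.

    run_iteration(n): hypothetical callback; implements one chunk, runs the
        checks, and returns True if verification passed.
    all_done(): hypothetical callback; re-reads PRD/progress state from disk
        and reports whether every phase is complete.
    """
    for n in range(1, max_iterations + 1):
        passed = run_iteration(n)
        # State is re-read on every pass; nothing is carried in memory.
        if passed and all_done():
            return RALPH_DONE
    # Cap reached: pause instead of looping forever.
    return "PAUSED"
```

Because both callbacks consult files and git rather than loop-local variables, restarting the loop after a pause picks up exactly where the previous run stopped.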

Architecture

State files (external, not in context):

  • PRD.md -- Numbered phases with checkboxes: 1. [ ] Setup, 2. [x] Core, etc.
  • progress.json -- Iteration counter, completed phases, last commit SHA, auto-improve counter
  • ralph_log.md -- Full history of every iteration with timestamps and results

Config files (per-project overrides):

  • ~/.hermes/skills/ralph_orchestrator/config/agents.yaml -- Default agent definitions
  • <project>/.ralph/agents.yaml -- Project-specific agent overrides (adds roles like "graphics")

Configurable Agent System

Each agent has:

  • description / when_to_use -- Routing logic for which agent handles what
  • personality -- Injected into subagent system prompt
  • toolsets -- Which tools the subagent can access (e.g., ["terminal", "file", "web"])
  • llm_endpoint -- Model override (default: deepseek/deepseek-v3.2)
  • temperature -- Sampling temperature (coder: 0.7, tester: 0.3, reviewer: 0.5)
  • enabled -- Toggle agents on/off per-project

Default agents:

  • coder -- Implements the current phase
  • tester -- Writes and runs tests
  • reviewer -- Self-critique for correctness, security, PRD compliance

Custom agents can be added per-project. For example, a game project might add a "graphics" agent for sprite/texture work.
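
One way the per-project override could work is a field-level overlay of the project file on top of the defaults. The sketch below assumes that merge rule (the source does not specify it) and uses plain dicts in place of parsed YAML; the default toolsets shown are illustrative, while the temperatures are the ones listed above.

```python
from copy import deepcopy

# Hypothetical defaults mirroring the fields listed above.
DEFAULT_AGENTS = {
    "coder": {"temperature": 0.7, "toolsets": ["terminal", "file"], "enabled": True},
    "tester": {"temperature": 0.3, "toolsets": ["terminal", "file"], "enabled": True},
    "reviewer": {"temperature": 0.5, "toolsets": ["file"], "enabled": True},
}


def merge_agents(defaults: dict, overrides: dict) -> dict:
    """Overlay project overrides on the defaults, one agent at a time."""
    merged = deepcopy(defaults)
    for name, fields in overrides.items():
        # New agents are added; existing agents only have the given fields replaced.
        merged.setdefault(name, {}).update(deepcopy(fields))
    return merged


def enabled_agents(agents: dict) -> list:
    """Names of enabled agents (a missing 'enabled' key counts as True here)."""
    return [name for name, cfg in agents.items() if cfg.get("enabled", True)]


# Stand-in for <project>/.ralph/agents.yaml after parsing.
project = {
    "graphics": {"description": "sprite/texture work", "toolsets": ["file"]},
    "reviewer": {"enabled": False},
}
agents = merge_agents(DEFAULT_AGENTS, project)
print(enabled_agents(agents))  # ['coder', 'tester', 'graphics']
```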

Execution Modes

Ralph has three execution modes, auto-detected at runtime:

| Mode | Detection | Behavior |
|------|-----------|----------|
| Real | delegate_task available in hermes_tools | Spawns actual AI subagents via delegate_task |
| Simulation | In sandbox but delegate_task unavailable | Creates placeholder files, always fails verification |
| Standalone | Not in sandbox (direct Python) | Simulation mode |

Mode detection (real_mode/phase_detector.py):

  1. Check for HERMES_RPC_SOCKET env var (set in execute_code sandbox)
  2. Try from hermes_tools import delegate_task
  3. If both succeed → real mode
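
The detection steps could look roughly like this. It is a sketch under the assumptions stated in this section (the env var name and the `hermes_tools` module come from the text above); the real `phase_detector.py` may differ, and the `env` parameter is added here only to make the function easy to exercise.

```python
import importlib
import os


def detect_mode(env=None) -> str:
    """Return 'real', 'simulation', or 'standalone'."""
    env = os.environ if env is None else env
    # Step 1: the sandbox is assumed to set HERMES_RPC_SOCKET.
    if not env.get("HERMES_RPC_SOCKET"):
        return "standalone"
    # Step 2: try to import the tool module and look for delegate_task.
    try:
        tools = importlib.import_module("hermes_tools")
    except ImportError:
        return "simulation"
    # Step 3: both conditions hold, so real subagents can be spawned.
    return "real" if hasattr(tools, "delegate_task") else "simulation"
```

Outside Hermes this returns 'standalone', which behaves like simulation mode.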

For real subagents, run via Hermes CLI:

hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"

The Python script alone (without Hermes) cannot spawn real subagents because delegate_task is not exposed in the execute_code sandbox by default.

Safety Systems

Sandbox modifications (code_execution_tool.py):

  • Added delegate_task to SANDBOX_ALLOWED_TOOLS (alongside terminal, read_file, etc.)
  • Added delegate_task stub to auto-generated hermes_tools.py in sandbox
  • Sandbox timeout: 900 seconds (15 min)
  • Max tool calls: 50 per script

Delegate task limits (delegate_tool.py):

  • MAX_CONCURRENT_CHILDREN = 10 (parallel subagents)
  • MAX_DEPTH = 2 (no recursive delegation: parent→child→grandchild blocked)
  • DEFAULT_MAX_ITERATIONS = 500 per subagent
  • Blocked tools for children: delegate_task, clarify, memory, send_message, execute_code, restart_chat

Safety manager (real_mode/safety_manager.py):

  • Tracks agent counts across nested delegations
  • Enforces 10 agents per script, 50 total across hierarchy
  • Singleton pattern for cross-invocation tracking
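
A counter of this kind might be structured as below. This is a hypothetical sketch, not the contents of `safety_manager.py`: the class name `AgentBudget` and the `try_spawn` method are invented for illustration, and a Python singleton like this only tracks counts within a single process.

```python
import threading


class AgentBudget:
    """Process-wide counter for spawned agents (hypothetical sketch)."""

    _instance = None
    _lock = threading.Lock()

    PER_SCRIPT_LIMIT = 10  # agents per script, from the limits above
    TOTAL_LIMIT = 50       # agents across the whole hierarchy

    def __new__(cls):
        # Always hand back the same object so every caller shares one count.
        with cls._lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst.total = 0
                inst.per_script = {}
                cls._instance = inst
        return cls._instance

    def try_spawn(self, script_id: str) -> bool:
        """Reserve one agent slot; return False if either limit is reached."""
        with self._lock:
            used = self.per_script.get(script_id, 0)
            if used >= self.PER_SCRIPT_LIMIT or self.total >= self.TOTAL_LIMIT:
                return False
            self.per_script[script_id] = used + 1
            self.total += 1
            return True
```

Callers would check `AgentBudget().try_spawn(script_id)` before each delegation and skip or queue the task when it returns False.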

Self-Improvement

Every 10 successful iterations or on any failure, Ralph appends a lesson to its own SKILL.md:

## Lesson learned (iteration 42)
- Time: 2026-03-15T...
- Verification: PASS/FAIL
- Issues: (if any)
- Summary: Auto-generated improvement note

Over time, SKILL.md accumulates lessons drawn from the orchestrator's own execution history.
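
The trigger rule and the lesson format can be illustrated as follows. Both function names are hypothetical, and the reading of "every 10 successful iterations" as a success counter that fires on multiples of 10 is an assumption.

```python
from datetime import datetime, timezone


def should_append_lesson(success_counter: int, passed: bool, every: int = 10) -> bool:
    """True on any failure, or when the success counter reaches a multiple of `every`."""
    if not passed:
        return True
    return success_counter > 0 and success_counter % every == 0


def format_lesson(iteration: int, passed: bool, issues: str, summary: str, now=None) -> str:
    """Render one lesson entry in the layout shown above."""
    now = now or datetime.now(timezone.utc)
    return "\n".join([
        f"## Lesson learned (iteration {iteration})",
        f"- Time: {now.isoformat()}",
        f"- Verification: {'PASS' if passed else 'FAIL'}",
        f"- Issues: {issues or 'none'}",
        f"- Summary: {summary}",
    ])
```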

Verification Checklist

Before accepting an iteration, ALL must pass:

  • Test suite runs without failures
  • Linter passes
  • All phases marked complete ([x])
  • Self-critique passes
  • Git working tree clean

If any check fails, the loop continues but does not commit.

Current Status

Working:

  • PRD parsing, progress tracking, git integration
  • Configurable agent system with per-project overrides
  • Phase detection and mode switching
  • Sandbox delegate_task integration (code_execution_tool patches applied)
  • Safety limits (concurrent agents, depth, iteration caps)
  • Self-improvement via SKILL.md lessons
  • Cron scheduling for continuous operation

Requires Hermes restart to activate:

  • Patches to code_execution_tool.py (adding delegate_task to SANDBOX_ALLOWED_TOOLS and hermes_tools stub generator)

Not yet tested end-to-end:

  • Full real-mode execution: PRD → real subagents → verification → commit → next phase

Files

~/.hermes/skills/ralph_orchestrator/
├── SKILL.md                          # Self-improving docs + lesson log
├── config/
│   └── agents.yaml                   # Default agent definitions
├── scripts/
│   └── ralph_orchestrator.py         # Main orchestrator (~1100 lines)
└── real_mode/
    ├── phase_detector.py             # Detect real/sim/standalone mode
    ├── safety_manager.py             # Agent count limits
    ├── real_subagent_swarm.py        # Bridges to delegate_task
    └── test_*.py                     # Integration tests

# Modified Hermes files (merged):
~/.hermes/hermes-agent/tools/
├── code_execution_tool.py            # Added delegate_task to SANDBOX_ALLOWED_TOOLS
└── delegate_tool.py                  # Subagent spawning (upstream)

Usage Examples

# Initialize a project
python3 ralph_orchestrator.py init "# My Project

## Phases
1. [ ] Setup project structure
2. [ ] Implement core algorithm
3. [ ] Write tests
4. [ ] Documentation"

# Run (simulation mode - for testing orchestration logic)
python3 ralph_orchestrator.py run

# Run (real mode - via Hermes CLI)
hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"

# Schedule continuous operation
python3 ralph_orchestrator.py create-cron "every 10 minutes"

# Monitor
python3 ralph_orchestrator.py status
tail -f ralph_log.md
---
name: ralph_orchestrator
description: "Ralph Wiggum Loop orchestrator for long-run autonomous development. Infinite loop: PRD.md + progress.json -> subagent swarm -> git commit -> verify -> repeat until <RALPH_DONE>. Self-improves."
category: autonomous-ai-agents
---

Ralph Orchestrator

Persistent Ralph Loop for 100+ phase projects. External state only (no context rot).

What is the Ralph Wiggum Loop?

A self-referential infinite loop that executes the same high-level task against the current repository state until a strict verification condition is met. Progress lives 100% in files/git (never in conversation context). Each iteration starts with fresh context to prevent context rot.

Core cycle:

Read PRD.md + progress.json + git status + test results
    ↓
Implement one atomic verifiable chunk
    ↓
Run tests + linter + self-critique
    ↓
Git commit with descriptive message
    ↓
If verification passes and all phases done:
    Output "<RALPH_DONE>" and exit
else:
    Schedule next iteration (repeat)

Features

  • ✅ File-based iteration state (PRD.md, progress.json, ralph_log.md)
  • ✅ Subagent swarm for parallel implementation/testing/review
  • ✅ Trajectory compression to prevent context overflow
  • ✅ Git integration with auto-commits
  • ✅ Full verification: tests, linter, self-critique
  • ✅ Max iteration cap (default 200) with pause/resume
  • ✅ Auto-self-improvement: appends lessons every 10 successful iterations or on failure
  • ✅ ralph_log.md with full iteration history
  • ✅ Natural-language cron triggers

⚠️ IMPORTANT: Simulation Mode vs Real Execution

The Python script (ralph_orchestrator.py) runs in SIMULATION MODE by default because it cannot call delegate_task from the execute_code sandbox.

What Simulation Mode Does:

  • Creates placeholder files to test orchestration logic
  • Does NOT actually implement phases
  • Always fails verification to prevent false progress claims
  • Phases are never marked complete in simulation

For Real Implementation (with AI subagents):

# Run via Hermes CLI to enable delegate_task
hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"

Why this separation exists:

  • delegate_task is not available in execute_code sandbox
  • Python script can only simulate orchestration logic
  • Real subagent delegation requires Hermes CLI execution

Quick Start

# 1. Initialize your project (creates PRD.md + progress.json)
cd your-project
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py init "Your high-level PRD with numbered phases here"

# 2. Edit PRD.md to add proper phases with checkboxes:
#    1. [ ] Setup: Initialize project structure
#    2. [ ] Core: Implement main algorithm
#    3. [ ] Tests: Write unit tests

# 3. Run the Ralph loop (runs until done or max iterations)
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py run

# 4. Monitor progress
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py status

Command-Line Interface

ralph_orchestrator.py <command> [args] [options]

Commands:
  init "PRD content"    Create PRD.md and progress.json
  run                  Start the Ralph loop (runs until done or --max-iterations is reached)
  status               Show current state and progress
  create-cron "schedule"  Print cron command for scheduled execution

Options:
  --repo PATH          Repository root (default: current directory)
  --max-iterations N   Maximum iterations before pause (default: 200)

Examples:
  python3 ralph_orchestrator.py init "# My Project\n\n## Phases\n1. [ ] Setup\n2. [ ] Implementation"
  python3 ralph_orchestrator.py run --max-iterations 500
  python3 ralph_orchestrator.py create-cron "every 10 minutes"

PRD.md Format

The PRD must contain numbered phases with checkboxes:

# Project PRD

## Phases

1. [ ] Setup: Initialize git repository, install dependencies, create structure
2. [ ] Core: Implement the main algorithm in src/main.py
3. [ ] API: Build REST endpoints in src/api.py
4. [ ] Tests: Write unit tests for core module (target 90% coverage)
5. [ ] Docs: Create README.md with usage examples
6. [ ] Deployment: Configure production environment

## Verification

- All tests must pass (pytest -q)
- Linter clean (flake8)
- No uncommitted changes
- Each phase must produce expected outputs (see phase descriptions)

Phases are parsed by regex: ^(\d+)\.\s+\[([ xX])\]\s+(.+?)(?::\s+(.+))?$
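
A small sketch of how that pattern splits a PRD into phases. The pattern is the one quoted above; applying it with `re.MULTILINE` over the whole file, the function name, and the output field names are assumptions made for this example.

```python
import re

# Pattern as documented; MULTILINE lets ^ and $ match at each line boundary.
PHASE_RE = re.compile(r"^(\d+)\.\s+\[([ xX])\]\s+(.+?)(?::\s+(.+))?$", re.MULTILINE)


def parse_phases(prd_text: str) -> list:
    """Return one dict per numbered checkbox line in the PRD text."""
    phases = []
    for num, mark, title, detail in PHASE_RE.findall(prd_text):
        phases.append({
            "number": int(num),
            "done": mark.lower() == "x",
            "title": title,
            # findall yields "" when the optional ": description" part is absent.
            "description": detail or None,
        })
    return phases


prd = """## Phases
1. [ ] Setup: Initialize project structure
2. [x] Core
- All tests must pass
"""
for phase in parse_phases(prd):
    print(phase)
```

The title is everything up to the first colon followed by whitespace, so a line such as "1. [ ] Setup: Initialize project structure" yields the title "Setup" and the rest as the description.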

progress.json Structure

Created automatically. Tracks:

{
  "version": "1.0",
  "created_at": "2025-03-14T...",
  "iteration": 42,
  "phases": [...],
  "completed_phases": ["phase_1", "phase_2"],
  "logs": [...],
  "last_commit": "abc123...",
  "verification_passed": true,
  "auto_improve_counter": 4,
  "settings": {
    "max_iterations": 200,
    "subagent_parallel": true,
    "self_critique_enabled": true
  }
}
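
Since all progress lives in this file, a crash mid-write would be costly. The sketch below shows one common way to guard against that (write to a temporary file, then atomically replace); whether the orchestrator does this is not stated in the source, and both function names are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def load_progress(path) -> dict:
    """Read progress.json from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_progress(path, progress: dict) -> None:
    """Write to a temp file in the same directory, then atomically swap it in."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(progress, fh, indent=2)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the partial temp file; the old progress.json is untouched.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```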

Subagent Swarm

During each iteration, the following subagents are spawned (if Hermes integration available):

  1. coder: Implements the current phase
  2. tester: Designs and runs tests for the implementation
  3. reviewer: Self-critique for correctness, performance, security, PRD compliance

Subagents run in parallel (when possible) and their outputs are compressed if context exceeds 50%.

Verification Checklist

Before accepting an iteration, the following must pass:

  • Test suite runs without failures (run_tests())
  • Linter passes (run_linter())
  • All phases marked complete (all_phases_completed())
  • Self-critique passes (self_critique())
  • Git working tree clean (git status)

If any check fails, the loop continues (does not commit).
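
The all-must-pass rule can be expressed as a small aggregator. This is an illustrative sketch: `verify` is a hypothetical helper, and the lambdas in the usage example stand in for the real `run_tests()`, `run_linter()`, and related functions named above.

```python
from typing import Callable, Dict, List, Tuple


def verify(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every check and return (all_passed, names_of_failed_checks).

    A check that raises is counted as a failure rather than aborting the run,
    so the log can report every failing check in one pass.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return (not failed, failed)


# Stand-in checks for illustration only.
ok, failed = verify({
    "tests": lambda: True,
    "linter": lambda: False,
    "self_critique": lambda: True,
})
print(ok, failed)  # False ['linter']
```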

Git Integration

  • Auto-commits after each successful iteration
  • Commit format: Ralph iter N: completed phase X - title
  • Stores last commit SHA in progress.json
  • Requires git repository (initialized automatically if missing)
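
A minimal sketch of the commit step, assuming the orchestrator stages everything with `git add -A` before committing (the source does not say how files are staged). The helper names are hypothetical; the git subcommands used are standard.

```python
import subprocess


def commit_message(iteration: int, phase_number: int, title: str) -> str:
    """Build a message in the documented format."""
    return f"Ralph iter {iteration}: completed phase {phase_number} - {title}"


def commit_all(repo: str, message: str) -> str:
    """Stage all changes, commit, and return the new HEAD SHA.

    Raises subprocess.CalledProcessError if any git command fails
    (for example, when there is nothing to commit).
    """
    subprocess.run(["git", "add", "-A"], cwd=repo, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo, check=True)
    out = subprocess.run(["git", "rev-parse", "HEAD"], cwd=repo,
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()


print(commit_message(42, 5, "Tests"))  # Ralph iter 42: completed phase 5 - Tests
```

The returned SHA is what would be stored as `last_commit` in progress.json.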

Auto-Improvement

After every 10 successful iterations or on any failure, the skill appends a lesson to its own SKILL.md:

## Lesson learned (iteration 42)
- Time: 2025-03-14T...
- Verification: PASS/FAIL
- Issues: (if any)
- Summary: Auto-generated improvement note

This is the self-improving aspect: the orchestration skill evolves based on its own execution experience.

Cron Scheduling

Schedule continuous operation until done:

# Natural language conversion to cron
python3 ralph_orchestrator.py create-cron "every 10 minutes"
# Output: */10 * * * * cd /path/to/project && python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py run

# Or use schedule_cronjob via Hermes
schedule_cronjob 'cd /path && python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py run' every 10m

The cron job will invoke a fresh agent session each time with no context carryover (as required by Ralph Loop). State is persisted in progress.json.
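
The natural-language conversion might be implemented along these lines. This sketch handles only "every N minutes" and "every N hours"; the phrases `create-cron` actually accepts are not documented here, and `to_cron` is a hypothetical name.

```python
import re


def to_cron(schedule: str) -> str:
    """Convert 'every N minutes' or 'every N hours' to a five-field cron expression."""
    m = re.fullmatch(r"every\s+(\d+)\s+(minute|hour)s?", schedule.strip().lower())
    if not m:
        raise ValueError(f"unsupported schedule: {schedule!r}")
    n, unit = int(m.group(1)), m.group(2)
    if unit == "minute":
        if not 1 <= n <= 59:
            raise ValueError("minutes must be between 1 and 59")
        return f"*/{n} * * * *"
    if not 1 <= n <= 23:
        raise ValueError("hours must be between 1 and 23")
    # Fire at minute 0 of every n-th hour.
    return f"0 */{n} * * *"


print(to_cron("every 10 minutes"))  # */10 * * * *
```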

Logging

Full log to ralph_log.md:

## Iteration 42 [] - 2025-03-14T23:10:00Z

Iteration 42 results:
- coder: completed
- tester: completed
- reviewer: completed
✓ Verification passed
✓ Committed: [abc123] Ralph iter 42: completed phase 5 - Tests...

---

Also logs to stderr with timestamps.

Limits and Safety

  • Maximum iterations: 200 by default (configurable via --max-iterations)
  • If the maximum is reached, the loop pauses and prints a message. Re-run the run command to continue.
  • The loop sleeps 2 seconds between iterations to avoid a tight loop
  • KeyboardInterrupt (Ctrl+C) saves state and exits cleanly
  • Git working tree is checked before commits; if dirty, iteration fails and retries

Architecture Notes

External state only: No conversation memory, no in-memory accumulation across iterations. Fresh state every time via RalphState.load().

Subagent delegation: The Python script cannot use delegate_task (not available in execute_code sandbox). For real subagents, run Ralph through Hermes chat: hermes -s ralph_orchestrator chat -q "run Ralph loop in /path". Otherwise the script falls back to sequential simulation.

Trajectory compression: Summarizes previous iteration context to fit within token limits. Full history lives in ralph_log.md.

Self-critique: Customizable per-phase validation. Can specify expected_files in phase dict for existence checks.
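
An existence check driven by `expected_files` could be as simple as the sketch below. The function name and the assumption that paths are relative to the repository root are illustrative; only the `expected_files` key comes from the description above.

```python
from pathlib import Path


def missing_expected_files(phase: dict, repo) -> list:
    """Return the expected_files entries that do not exist under the repo root."""
    root = Path(repo)
    return [rel for rel in phase.get("expected_files", [])
            if not (root / rel).is_file()]
```

A phase would pass this part of self-critique when the returned list is empty; a phase without `expected_files` passes trivially.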

Troubleshooting

No phases found: Ensure PRD.md has numbered phases with checkboxes like 1. [ ] Phase title

Subagents not spawning: delegate_task is not available in the execute_code sandbox. For real subagents, run Ralph through Hermes chat instead of the Python script. Use: hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"

Git errors: Ensure .git exists and you have commit rights. Ralph auto-initializes git if missing (first run only).

Loop not exiting: Check that all phases are marked [x] and verification passes. Use status command.

Example Workflow

# Create a new project
mkdir my-project && cd my-project
echo "# Test Project" > README.md
git init && git add . && git commit -m "Initial"

# Initialize Ralph
# $'...' makes bash/zsh expand \n into real newlines before the script sees the text
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py init $'# Test PRD\n\n## Phases\n1. [ ] Add hello world\n2. [ ] Write tests\n3. [ ] Document'

# Edit PRD.md to flesh out details
$EDITOR PRD.md

# Start loop (runs in foreground)
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py run

# In another terminal, monitor
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py status
tail -f ralph_log.md

# When done, you'll see <RALPH_DONE> printed

Real Subagent Workflow (Hermes Chat)

The ralph_orchestrator.py script runs in the execute_code sandbox, which does not have access to delegate_task. For real subagent delegation, you must orchestrate Ralph through Hermes Agent directly:

Manual Orchestration Example

# 1. Parse PRD.md to find next incomplete phase
# 2. delegate_task for CODER: "Implement Phase X: ..."
# 3. delegate_task for TESTER: "Test Phase X implementation"
# 4. delegate_task for REVIEWER: "Review Phase X code"
# 5. Update progress.json with results
# 6. Git commit
# 7. Repeat for next phase

Using Hermes CLI

# Run Ralph coordination from within Hermes
hermes -s ralph_orchestrator chat -q "run Ralph loop in /path/to/project"

Current Limitations

  1. delegate_task unavailable in sandbox: The hermes_tools module in execute_code only exposes basic tools (terminal, read_file, write_file, etc.), not delegate_task.
  2. Hermes CLI hangs non-interactively: subprocess.run(["hermes", "chat", "-q", ...]) times out because Hermes expects interactive input.
  3. Solution: Run Ralph coordination from within Hermes chat, not from standalone Python.

Workaround Until Fix

Use the Python script for PRD parsing and progress tracking, but manually delegate tasks:

# Initialize project
python3 ~/.hermes/skills/ralph_orchestrator/scripts/ralph_orchestrator.py init "Your PRD"

# Then for each phase, manually run:
hermes chat -q "delegate_task goal='Implement Phase 1...' context='...' toolsets=['terminal','file','web']"
# ... tester, reviewer, update progress.json, commit

Notes

  • Designed for massive projects with 100+ phases
  • Works with any language/toolchain (tests/linter are extensible)
  • Does not require Hermes integration but leverages it when available
  • For production use, consider adding custom phase_validators and CI integration
