| name | orchestrate-plan |
|---|---|
| description | Use when executing a locked PLAN.md (e.g. produced by grill-with-docs-codex / codex-review) by delegating the implementation to cmux worker pane(s) ONE PHASE at a time — splitting the plan into context-window-sized phases, fully clearing the worker's context between phases, and (when the plan touches INDEPENDENT repos) running one worker pane per repo IN PARALLEL with a ~45s monitor/unblock loop. Triggers on "execute the plan", "implement PLAN.md", "orchestrate this implementation", "run the plan with a worker", "split the plan into phases and build it", "build the backend and frontend in parallel panes". Builds on cmux-orchestrate. Not for edits you can finish inline, and not outside a cmux workspace. |
You are the orchestrator (manager). You do NOT write the implementation yourself.
You split a locked plan into context-window-sized phases and drive one or more worker
Claudes in separate cmux panes: feed each worker one phase at a time, /clear its context
before every phase, verify each phase, then move on — while keeping your own context clean
so you can run the whole plan without drowning in tool output.
When the plan touches independent repos (e.g. a backend api service + a web
frontend that both code to a contract the plan already fixes), run one worker pane per repo
in parallel and babysit them all with a single ~45s monitor loop.
Three non-negotiables:
- One phase per context window. A phase is a coherent slice of the plan sized to fit one worker context (it may be several commits). Split anything bigger; cluster anything trivially small.
- Fully clear the worker before each phase (`/clear`, never `/compact`). Every phase starts pristine; the worker re-grounds from disk (PLAN.md + the brief + `git log`). This is strictly more deterministic than carrying a lossy summary.
- Keep the orchestrator's context clean. Never Read worker source files or full scrollback into your window. Verify with git plumbing + narrow greps. Summarize to the human; never paste screens.
Prerequisites: a locked PLAN.md exists (run grill-with-docs-codex / codex-review first if
not). You are inside a cmux workspace — this skill drives cmux panes; the CLI cheat sheet lives in
the cmux-orchestrate skill (references/cmux-cli.md). Verbatim templates are in references/templates.md.
The worker pane runs Claude Code (default, launcher x) or OpenAI Codex (cdx). The user
picks. Everything about the orchestration (one phase per /clear, verify each phase, keep your
context clean, GATE irreversible phases, the === PHASE N DONE === marker, parallel lanes) is
identical — only the launch command and effort/fast-mode mechanics differ.
| | Claude worker (default) | Codex worker (optional) |
|---|---|---|
| Launcher | `x` (= `claude --dangerously-skip-permissions`) | `cdx` (= `codex --dangerously-bypass-approvals-and-sandbox`, YOLO) |
| Medium effort | `/effort medium` slash command, re-applied after every `/clear` (Claude reverts effort on clear) | launch flag `cdx -c model_reasoning_effort=medium`. Codex has no `/effort` command; the flag is process-level so it persists across `/clear`. Valid: minimal, low, medium, high, xhigh. |
| Fast mode | NEVER: not for a Claude worker, and not for you, the orchestrator | `/fast` slash command (Codex only). Send once after launch; after `/clear`, read-screen to confirm it's still ON, re-toggle only if it reverted. |
| Clear between phases | `/clear` | `/clear` (works in Codex too) |
| Perms prompts | none (skip-perms) | none (YOLO bypass) |
Effort is ALWAYS medium for implementation. Building against a locked, well-specified plan
does NOT need high/xhigh — they are markedly slower for negligible gain here. Fast mode is
only ever for a Codex worker — never the orchestrator, never a Claude worker.

    [ -n "$CMUX_SURFACE_ID" ] && echo "manager surface=$CMUX_SURFACE_ID" || echo "NOT in cmux - stop"

If not in cmux, tell the user and stop (the worker-pane model requires it; see Fallback).
Resolve inputs: PLAN_FILE as an ABSOLUTE path (the per-phase prompt hands this path to the
worker, so it must be absolute), the target repo dir(s), and read PLAN.md (and CONTEXT.md/ADRs if
present) ONCE to derive the phase list — then rely on disk, not memory.
Parse PLAN.md into an ordered list of phases, each sized to fit ONE worker context window (a coherent slice — usually one feature-area or one repo's slice; may be several commits). Don't over-fragment into one-commit micro-tasks, and don't bundle two unrelated areas into one phase. For each phase record:
- number + one-line title + which PLAN section(s) it covers;
- repo it touches (the lane it belongs to);
- a phase description: the concrete scope + an explicit stop point ("…and nothing else") so the worker knows where this phase ends;
- GATED? — true if irreversible/outward-facing (deletes code, pushes, deploys, runs migrations, hits staging/prod). Gated phases need explicit human OK before they run;
- any watch-items (risks to carry forward).
Detect parallel lanes. Group phases by repo. If the plan spans multiple repos that can progress independently — each side codes to a contract the plan already pins down (e.g. the wire shapes for a new endpoint), with no phase in repo A needing a not-yet-built phase in repo B — mark them as parallel lanes: one worker pane per repo. If phases have cross-repo ordering dependencies (B can't start until A's API exists and isn't yet contract-frozen), keep them sequential (single lane, or stage the dependent lane to start later). When unsure, default to sequential and tell the human why.
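The "group phases by repo" step can be sketched in shell. This is a minimal illustration that assumes a hypothetical two-column phase list (phase number, then repo name); the skill itself does not define such a format. It only groups phases into candidate lanes. Whether those lanes are genuinely independent still needs the judgment described above.

```shell
# group_lanes: read "<phase-number> <repo>" lines (hypothetical format) on stdin
# and print one line per repo, "<repo>: <phase numbers in plan order>".
group_lanes() {
  awk '
    {
      if (!($2 in lanes)) { order[++n] = $2 }   # remember repos in first-seen order
      lanes[$2] = lanes[$2] (lanes[$2] == "" ? "" : " ") $1
    }
    END { for (i = 1; i <= n; i++) print order[i] ": " lanes[order[i]] }
  '
}

# Example: a four-phase plan touching two repos yields two candidate lanes.
printf '%s\n' '1 api' '2 web' '3 api' '4 web' | group_lanes
# api: 1 3
# web: 2 4
```

A single output line means a single lane; several lines are only candidates for parallel panes until the cross-repo dependency check passes.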
Create a todo list. Call TaskCreate once per phase, in execution order, so progress is visible
and survives compaction. Within a lane, wire each phase to block the next (TaskUpdate addBlockedBy)
so order is enforced; phases in different parallel lanes are NOT blocked on each other. If a list
already exists (you were resumed mid-run), reuse it — check TaskList first; never duplicate.
Then present the phase breakdown + the lane plan (sequential vs N parallel panes) to the human and
get confirmation before spawning anything (human-led planning before fan-out — cmux golden rule 6).
Keep the list in lockstep with reality: mark a phase in_progress when you hand it to a worker, and
completed only after it's verified — at most one in_progress PER LANE.
Pick the branch base by topology, NOT by label (per repo). The "main branch" label is often wrong; base on the live trunk (recent integration commits / the code the plan references):

    git rev-list --count <labelMain>..<candidate>   # candidate ahead of labelMain?
    git rev-list --count <candidate>..<labelMain>   # ...and behind?

A candidate thousands of commits ahead and zero behind IS the trunk regardless of its name.
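The two counts can be wrapped in a small helper so the ahead/behind comparison is explicit. This is a sketch: the function names `ahead_behind` and `looks_like_trunk` are hypothetical, and the "ahead by at least one, behind by zero" test is a simplified stand-in for the judgment call described above.

```shell
# ahead_behind <labelMain> <candidate>
# Prints "<ahead> <behind>": commits the candidate has that labelMain lacks,
# then commits labelMain has that the candidate lacks.
ahead_behind() {
  local label_main=$1 candidate=$2 ahead behind
  ahead=$(git rev-list --count "${label_main}..${candidate}") || return 1
  behind=$(git rev-list --count "${candidate}..${label_main}") || return 1
  echo "$ahead $behind"
}

# looks_like_trunk <labelMain> <candidate>
# Hypothetical heuristic: succeeds when the candidate is ahead and not behind.
looks_like_trunk() {
  local counts
  counts=$(ahead_behind "$1" "$2") || return 1
  set -- $counts            # split "<ahead> <behind>" into $1 and $2
  [ "$1" -gt 0 ] && [ "$2" -eq 0 ]
}
```

Run it inside the repo, for example `looks_like_trunk main develop && echo "base on develop"`, where `main` and `develop` are placeholder branch names.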
For EACH lane (one repo → one pane; a single-repo plan is just one lane):
1. Create the feature branch off the verified base. Never work on main/master; never force-push.
2. Write `WORKER-BRIEF.md` in that repo (template T1): standing orders the worker re-reads every phase (repo + branch, what's done/out-of-scope, conventions, per-phase workflow, stop points, the `=== PHASE N DONE ===` marker, watch-items). Keep it untracked.
3. Spawn + name the pane and capture its surface ref (`$W_ai`, `$W_fe`, …; one variable per lane):

       W=$(cmux --json new-split right --focus false | sed -n 's/.*"surface_ref" : "\(surface:[0-9]*\)".*/\1/p')
       cmux rename-tab --surface "$W" "worker-<repo>"

4. Prep the shell + launch the worker (Claude default):

       cmux send --surface "$W" -- 'cd <repo> && <env setup e.g. nvm use>\n'
       cmux send --surface "$W" -- 'x\n'   # x = claude --dangerously-skip-permissions

   Confirm boot (Claude banner + `❯`):

       cmux read-screen --surface "$W" | tail -8

5. Set `medium` effort:

       cmux send --surface "$W" -- '/effort medium'; cmux send-key --surface "$W" enter

   `/effort` may pop a cache-warning confirm: `send-key enter` on the highlighted "Yes". Confirm the bottom-right reads `● medium · /effort`. Gotcha: `/clear` reverts effort to the launch default, so re-apply `/effort medium` after every clear (Phase 3 step 2).
For parallel lanes, do steps 1–5 for each pane before starting Phase 3, so all workers are armed.
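Because every later command depends on the captured surface ref, it may be worth failing loudly when the capture comes back empty. The sketch below reuses the `sed` expression from the spawn step. The helper names and the sample JSON line are assumptions inferred from that expression, not documented cmux output.

```shell
# extract_surface_ref: pull the first "surface:<n>" value out of JSON on stdin,
# using the same sed expression as the spawn step.
extract_surface_ref() {
  sed -n 's/.*"surface_ref" : "\(surface:[0-9]*\)".*/\1/p' | head -n 1
}

# require_surface_ref: like extract_surface_ref, but fail when nothing matched
# so an empty pane variable is never passed to later commands.
require_surface_ref() {
  local ref
  ref=$(extract_surface_ref)
  [ -n "$ref" ] || { echo "no surface_ref found in input" >&2; return 1; }
  echo "$ref"
}

# Example with an assumed JSON fragment:
printf '{\n  "surface_ref" : "surface:12",\n  "focused" : false\n}\n' | require_surface_ref
# surface:12
```

In the spawn step this would replace the bare `sed`, for example `W_api=$(cmux --json new-split right --focus false | require_surface_ref) || exit 1`, with one variable per lane (`W_api` is a placeholder name).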
Run this loop for each lane. Lanes run concurrently — drive them round-robin and let the Phase 4 monitor tell you which one needs attention.
1. Confirm idle: `read-screen | tail -8`; no spinner.
2. Fully clear, then re-set effort: `cmux send --surface "$W" -- '/clear\n'`, then `read-screen | tail` to confirm a fresh transcript (if a `/` palette shows, an extra `send-key enter` runs `/clear`). `/clear` reverts effort, so immediately re-apply `/effort medium` + `send-key enter`; confirm `● medium`.
3. Send the per-phase prompt (template T2). It carries (a) the ABSOLUTE path to the full PLAN.md and (b) this phase's description + stop point, so the worker reads the whole plan from disk but builds ONLY this phase. The prompt also instructs the worker to run `/ponytail-review` and apply reasonable simplifications before finishing (the brief, T1, makes this a standing gate). Because it's long, send the text THEN a separate Enter:

       cmux send --surface "$W" -- '<phase prompt text, single line, no apostrophes/double-quotes/tabs>'
       cmux send-key --surface "$W" enter

   - Ghost autocomplete: a greyed Tab-suggestion is NOT typed text; backspace/Esc/Ctrl+U won't clear it, so type your prompt over it. Never send `\t`/Tab (it accepts the ghost).
4. Confirm it submitted: `read-screen | tail` shows the spinner running (not the prompt sitting in the box).
5. Wait for idle. Single lane: arm the idle watcher (snippet W1, `run_in_background: true`). Multiple lanes: rely on the Phase 4 multi-pane monitor instead of one watcher per pane.
6. On idle, confirm the phase-done marker. Read the worker's tail (`read-screen --scrollback --lines 60 | tail -40`) and look for the exact line `=== PHASE <N> DONE ===`. If present, verify independently and lightly (snippet V1: `git log`, `git show --stat`, narrow `grep -n`). Do NOT Read source files into your context.
7. Verified → mark the phase `completed` (TaskUpdate), set the lane's next phase `in_progress`, loop to phase N+1 in that lane. No marker / problem → see Phase 4 (nudge / correct / escalate). Lane out of phases → that worker is done; leave it idle or close the pane.
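The "single line, no apostrophes/double-quotes/tabs" constraint on the phase prompt can be enforced mechanically before sending. This is a minimal sketch with a hypothetical helper name, `sanitize_prompt`. It deletes quote characters rather than rewording around them, so possessives such as "api's" come out as "apis"; phrasing the prompt without quotes in the first place is cleaner.

```shell
# sanitize_prompt: read prompt text on stdin and emit a single line with
# tabs/newlines turned into spaces, quote characters removed, and runs of
# spaces collapsed.
sanitize_prompt() {
  tr '\t\n' '  ' |        # tabs and newlines become spaces
    tr -d "'\"" |         # drop apostrophes and double quotes
    tr -s ' ' |           # squeeze repeated spaces
    sed 's/^ //; s/ $//'  # trim one leading/trailing space
}

# Example:
printf 'Build the\thealth route\nfor the "v2" API and nothing else\n' | sanitize_prompt
# Build the health route for the v2 API and nothing else
```

The result can then go into the send from step 3, for example `cmux send --surface "$W" -- "$(sanitize_prompt < phase-prompt.txt)"` followed by the separate `send-key enter`; `phase-prompt.txt` is a placeholder file name.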
Every ~45 seconds, walk every active pane (cmux --json tree for current refs, then
read-screen --surface "$W" | tail -16 per lane) and classify each — then act (snippet W2 automates
the poll). Always read the screen first, then act (cmux golden rule):
- `=== PHASE N DONE ===` + idle → verify (V1), advance that lane: `/clear` → re-apply effort → send next phase prompt. If the lane has no more phases, mark it complete.
- Error / API failure / crash / "code is done" stall / sitting idle mid-phase → nudge: send a short `please continue` (`cmux send --surface "$W" -- 'please continue\n'`). If it's clearly lost its task (e.g. context got cleared unexpectedly), re-send the phase prompt instead.
- Worker is asking a question (an "ask user"-style prompt, a clarification, or a permission prompt despite bypass) → read it. If it's safely answerable from PLAN.md / the brief / obvious convention, pick the most appropriate answer and send it to unblock (free-text `send`, or `send-key` for a TUI menu; see cmux-orchestrate "Answering agents in workers"). If it's a real decision, a gated step, or otherwise not safely unblockable → escalate to the human (`cmux notify` + pause that lane; keep its phase `in_progress`). Do not guess on irreversible/ambiguous choices.
- Runaway / wrong direction → `cmux send-key --surface "$W" ctrl+c`, then redirect.
Keep looping until all lanes are complete or a hard escalation. Throughout, keep YOUR context clean
(tail reads only; never full scrollback). Surface progress to the human in a few lines, optionally via
cmux set-progress / cmux notify.
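The first classification pass (busy, done, or idle without a marker) can be sketched as a filter over a screen tail. This is not snippet W2, which lives in references/templates.md; it is a hypothetical `classify_pane` helper that reuses the spinner patterns and marker named in this skill. It assumes the marker sits alone on its line with no padding, and the bare word `thinking` anywhere on screen will read as busy.

```shell
# classify_pane <phase-number>: read a screen tail on stdin and print
# "busy", "phase-done", or "idle-no-marker".
classify_pane() {
  local phase=$1 screen
  screen=$(cat)
  # Busy check first: match only ephemeral spinner text, never the marker.
  if printf '%s\n' "$screen" | grep -Eq '[0-9]+s ·|esc to interrupt|thinking'; then
    echo busy
  # Marker check: exact whole-line match, so a prompt echo that merely
  # mentions the marker mid-sentence does not count.
  elif printf '%s\n' "$screen" | grep -Fxq "=== PHASE ${phase} DONE ==="; then
    echo phase-done
  else
    echo idle-no-marker
  fi
}
```

Per lane this would be fed from the tail read, for example `cmux read-screen --surface "$W" | tail -16 | classify_pane 3`. An `idle-no-marker` result still needs a human-style read of the screen to tell a stall from a question.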
- Read screens with `| tail -N` (8–40 lines). Never dump full scrollback.
- Verify with `git show --stat`, `git log --oneline`, targeted `grep -n`; never `Read` whole source files.
- Trust the worker's own gates: the brief requires it to type-check + test + lint, run `/ponytail-review` and apply reasonable simplifications, and self-report before printing the phase-done marker. You spot-check; you don't re-run everything.
- For heavy verification, spawn a throwaway `Explore` subagent to check a phase and return one-line PASS/FAIL; its file reads stay in its context, not yours.
- Summarize to the human; don't relay screens.
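A light verification pass in the spirit of these rules might look like the sketch below. It is not snippet V1 from references/templates.md; `verify_phase_light`, its arguments, and the output caps are assumptions chosen to keep the orchestrator's context small.

```shell
# verify_phase_light <repo-dir> <base-ref> <pattern>
# Prints a capped summary: commits since the base, the file stat of the last
# commit, and a handful of grep hits for one expected symbol or string.
verify_phase_light() {
  local repo=$1 base=$2 pattern=$3
  echo "commits since ${base}:"
  git -C "$repo" log --oneline "${base}..HEAD" | head -n 20
  echo "files in last commit:"
  git -C "$repo" show --stat --format= HEAD | tail -n 15
  echo "matches for ${pattern}:"
  git -C "$repo" grep -n -e "$pattern" | head -n 10
}
```

Every section is truncated with `head` or `tail`, so even a large phase adds only a few dozen lines to the orchestrator's window.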
- You orchestrate; you do not implement. One phase per `/clear` (full reset, never `/compact`); each phase prompt is self-contained = absolute PLAN.md path + this phase's description.
- The worker ends every phase by printing exactly `=== PHASE <N> DONE ===` so you can grep completion. The busy-watcher regex must match ONLY ephemeral spinner text (`[0-9]+s ·`, `esc to interrupt`, `thinking`), NEVER the static status bar or the `=== PHASE N DONE ===` marker text (matching the marker fires the watcher instantly off the prompt echo). Detect the marker separately, after idle.
- Worker effort is always `medium`, re-applied after every `/clear` (Claude reverts it). Fast mode only ever for a Codex worker.
- Parallel only for genuinely independent repo lanes. If phases cross-depend, stay sequential. At most one `in_progress` phase PER LANE.
- GATE before any irreversible/outward-facing phase: explicit human confirmation. Don't trust branch labels; confirm topology. Don't fake convergence; report real failures. Worker stages only changed files; planning docs (PLAN.md, CONTEXT.md, WORKER-BRIEF.md, PLAN-REVIEW-LOG.md) stay untracked.
- Don't read full files or scrollback into your context "just to be sure" — that defeats the point.
- Don't parallelize repos that actually cross-depend; don't let one lane outrun a gated step.
- Don't let a worker batch phases or skip its self-verification.
- Don't auto-answer a worker's question when it's a real/irreversible decision — escalate instead.
The same shape works with the Agent tool: each phase = one fresh general-purpose subagent (a new
subagent IS a pristine context, satisfying "clear before each phase"), and independent repo lanes =
subagents dispatched in parallel. The Agent return value is just a summary (keeping your context clean).
You lose live pane visibility and mid-phase intervention, so prefer the cmux path when available.