A spec for turning cmux into the control plane for coordinated multi-agent
development — visible, steerable, auditable.
The Idea
cmux is already a terminal multiplexer with 130+ CLI methods, browser surfaces,
sidebar status, notifications, and workspace management. Pi is already an
extensible coding agent with RPC mode, lifecycle events, tool registration,
session forking, and an EventBus. pi-tasks already provides structured task
tracking with dependency DAGs, subagent spawning via EventBus RPC, auto-cascade,
and file-locked shared task stores.
None of these systems know about each other. This spec bridges them.
What This Enables
You sit in one pi session (the orchestrator). You say:
"Deploy the new auth system across all three repos."
The orchestrator:
Creates a task DAG via pi-tasks (4 tasks, 3 repos + integration tests)
Spawns 3 worker pi sessions in separate cmux workspaces via spawn-pi
Each worker gets a focused prompt, a scoped cwd, and a model assignment
The fleet dashboard widget shows live status of all agents
cmux sidebar/tab indicators show which workspaces need attention
Workers complete → pi-tasks auto-cascades to the integration test task
You can switch to any workspace and take over at any time
Everything is in your terminal — not a web UI, not an abstraction layer
Design Principles
Visible by default. Every agent runs in a real cmux workspace you can see,
read, and interact with. No invisible background processes unless you opt in.
Steerable. The orchestrator can steer workers mid-flight. You can take over
from any agent. Workers can be aborted, retried, or redirected.
Composable. Each piece (cmux extension, pi-tasks, RPC bridge, fleet
dashboard) works independently and compounds when combined.
Safe. Workers load pi-cmux (standalone package) for full visibility —
never --no-extensions blindness. PI_CMUX_ROLE=worker disables subprocess
spawns while keeping sidebar/notifications/tool activity. RPC workers are
sandboxed. Shared task stores use file locking.
Key capability for orchestration: new-workspace --cwd <path> --command <cmd>
can spawn a pi session in one call. read-screen can observe any surface.
send can inject text into any surface.
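The snippets in this spec call `cmux(...)` and `cmuxSafe(...)` helpers without defining them. A minimal sketch of what they might look like, assuming the `cmux` binary is on PATH and prints its result to stdout. The `makeCli` and `makeSafeCli` factories are hypothetical names, introduced so the wrapper can be exercised without cmux installed:

```typescript
import { execFileSync } from "node:child_process";

type Cli = (...args: string[]) => string;

// Hypothetical factory: returns a function that runs `bin` with the given
// arguments and returns trimmed stdout. Throws if the process exits non-zero.
function makeCli(bin: string): Cli {
  return (...args) =>
    execFileSync(bin, args, {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "pipe"],
    }).trim();
}

// Hypothetical wrapper: same call, but failures yield "" instead of throwing,
// which suits polling loops where a closed surface is not fatal.
function makeSafeCli(cli: Cli): Cli {
  return (...args) => {
    try {
      return cli(...args);
    } catch {
      return "";
    }
  };
}

const cmux = makeCli("cmux");
const cmuxSafe = makeSafeCli(cmux);
```

Using `execFileSync` with an argument array avoids a shell entirely, so prompts and paths need no quoting at this layer.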
Layer 1: Pi (Agent Runtime)
Each workspace runs an independent pi process. Pi provides:
Interactive mode — full TUI with tools, extensions, skills
RPC mode (--mode rpc) — headless JSON-over-stdin control with
prompt/steer/follow_up/abort/get_state/get_messages/compact/fork/new_session
The highest-leverage addition. Lets the orchestrator create worker agents
in visible cmux workspaces with a single tool call.
Interface
pi.registerTool({
  name: "spawn_pi",
  label: "Spawn Pi Agent",
  description:
    "Spawn a new pi agent in a cmux workspace. The agent runs " +
    "in a visible terminal you can switch to at any time.",
  parameters: Type.Object({
    prompt: Type.String({ description: "Initial prompt for the agent" }),
    cwd: Type.Optional(Type.String({ description: "Working directory (default: current)" })),
    model: Type.Optional(Type.String({ description: "Model to use (default: claude-sonnet-4)" })),
    task_id: Type.Optional(Type.String({ description: "pi-tasks task ID to bind to this agent" })),
    workspace_name: Type.Optional(Type.String({ description: "cmux workspace label" })),
    direction: Type.Optional(Type.Union([
      Type.Literal("workspace"), // new workspace (default)
      Type.Literal("right"), // split right in current workspace
      Type.Literal("down"), // split down in current workspace
    ])),
    skills: Type.Optional(Type.Array(Type.String(), {
      description: "Skill names to load (e.g., ['next-best-practices'])",
    })),
    session: Type.Optional(Type.String({
      description: "Session file to continue (for resuming previous work)",
    })),
  }),
});
Behavior
1. Create the surface
if (direction === "workspace" || !direction) {
  // New workspace
  const result = cmux("new-workspace", "--cwd", cwd);
  // Parse workspace/surface refs from result
} else {
  // Split in current workspace
  const result = cmux("new-split", direction);
}
2. Build the pi command
const piArgs = [
  "pi",
  "--model", model || "claude-sonnet-4",
  "--print", // non-interactive: run prompt and keep running for follow-ups
];
if (skills?.length) {
  for (const skill of skills) piArgs.push("--skill", skill);
}
if (session) {
  piArgs.push("--continue", session);
}
// Prompt goes via stdin after launch, or as trailing arg
piArgs.push(JSON.stringify(prompt));
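One caveat with the trailing-arg approach: JSON.stringify produces a double-quoted string, and a POSIX shell still expands `$`, backticks, and backslashes inside double quotes. If the command line passes through a shell (for example via --command), single-quote escaping is the safer choice. A minimal sketch, where `shellQuote` and `toCommandLine` are hypothetical helpers and not part of pi or cmux:

```typescript
// Hypothetical helper: quote one argument for a POSIX shell.
// Everything inside single quotes is literal; an embedded single quote is
// written by closing the quotes, adding an escaped quote, and reopening.
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Join an argv array into a single command string that is safe to hand to sh.
function toCommandLine(argv: string[]): string {
  return argv.map(shellQuote).join(" ");
}
```

With this, a prompt such as `it's $HOME` reaches pi unchanged instead of having `$HOME` expanded by the shell.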
parameters: Type.Object({
  agent_id: Type.String({ description: "Agent ID from spawn_pi" }),
  lines: Type.Optional(Type.Number({ description: "Lines to read (default: 30)" })),
})
// Implementation: cmux read-screen --surface <surfaceRef> --lines <n>
// Parse output to extract: last assistant response, status, model info
send_agent — Steer a worker mid-flight
parameters: Type.Object({
  agent_id: Type.String({ description: "Agent ID" }),
  message: Type.String({ description: "Message to send" }),
})
// Implementation: cmux send --surface <surfaceRef> "<message>\n"
// For RPC workers: { type: "steer", message } via stdin
Workers are real pi sessions with curated extensions — not lobotomized
--no-extensions drones. The extension system is an asset. The orchestrator
decides which extensions each worker gets.
Curated Extension Sets
The spawn_pi tool accepts an extensions parameter — a list of extension
paths or package names to load. The orchestrator picks the right set per task:
# Full-featured worker (most tasks)
pi -e pi-cmux/cmux.ts \
  -e pi-tasks/index.ts \
  -e agent-secrets/agent-secrets.ts \
  --model claude-sonnet-4 \
  "refactor the auth middleware"

# Lightweight worker (lint, format)
pi -e pi-cmux/cmux.ts \
  --tools read,bash \
  --model claude-haiku-4-5 \
  "lint src/ and fix all errors"

# Worker with MCQ for interactive decisions
pi -e pi-cmux/cmux.ts \
  -e mcq/index.ts \
  --model claude-sonnet-4 \
  "set up the new project — ask me about stack choices"
--no-extensions only disables auto-discovery (prevents the pi-tools
megapackage from loading everything). Explicit -e paths still load.
Extension Presets
Common combinations for spawn_pi:
| Preset | Extensions | Use Case |
| --- | --- | --- |
| standard | pi-cmux, pi-tasks, agent-secrets | Most work |
| minimal | pi-cmux | Lint, format, simple fixes |
| interactive | pi-cmux, mcq | Tasks needing user choices |
| full | (all of pi-tools) | Orchestrator-grade agents |
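A preset could be expanded into CLI flags along these lines. This is a sketch, not the spec's implementation: `presetArgs` is a hypothetical helper, the extension paths are copied from the example commands above, and treating "full" as "leave auto-discovery on" is an assumption.

```typescript
type Preset = "standard" | "minimal" | "interactive" | "full";

// Extension entry points as they appear in the example commands above.
const PRESET_EXTENSIONS: Record<Exclude<Preset, "full">, string[]> = {
  standard: ["pi-cmux/cmux.ts", "pi-tasks/index.ts", "agent-secrets/agent-secrets.ts"],
  minimal: ["pi-cmux/cmux.ts"],
  interactive: ["pi-cmux/cmux.ts", "mcq/index.ts"],
};

// Hypothetical helper: expand a preset into pi CLI flags.
// Assumption: "full" relies on auto-discovery, so it passes no flags; the
// other presets disable auto-discovery and list their extensions explicitly.
function presetArgs(preset: Preset): string[] {
  if (preset === "full") return [];
  return [
    "--no-extensions",
    ...PRESET_EXTENSIONS[preset].flatMap((ext) => ["-e", ext]),
  ];
}
```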
Worker Mode (PI_CMUX_ROLE=worker)
pi-cmux detects PI_CMUX_ROLE=worker and adjusts:
| Feature | Orchestrator | Worker |
| --- | --- | --- |
| Sidebar status + tool activity | ✅ | ✅ |
| Notifications on agent_end | ✅ | ✅ (focus-aware) |
| mark-read/mark-unread | ✅ | ✅ |
| cmux tools | ✅ | ✅ |
| Session naming (subprocess) | ✅ | ❌ |
| Turn summary (subprocess) | ✅ | ❌ |
Worker mode keeps ALL visibility while disabling the two features that spawn
child pi processes. The fork-bomb risk is eliminated without losing anything.
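The role check itself can be small. A minimal sketch of how the table above might translate into feature flags, assuming the extension reads the role once at startup. `CmuxFeatures` and `featuresForEnv` are hypothetical names, not pi-cmux's actual API:

```typescript
interface CmuxFeatures {
  sidebarStatus: boolean;
  notifications: boolean;
  markReadUnread: boolean;
  cmuxTools: boolean;
  sessionNaming: boolean; // spawns a child pi process
  turnSummary: boolean; // spawns a child pi process
}

// Hypothetical helper: derive the feature set from the environment.
// Only the two subprocess-spawning features depend on the role.
function featuresForEnv(env: Record<string, string | undefined>): CmuxFeatures {
  const isWorker = env.PI_CMUX_ROLE === "worker";
  return {
    sidebarStatus: true,
    notifications: true,
    markReadUnread: true,
    cmuxTools: true,
    sessionNaming: !isWorker,
    turnSummary: !isWorker,
  };
}
```

In an extension this would typically be called as `featuresForEnv(process.env)`.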
Additional Safety
The orchestrator tracks all spawned surfaces and cleans up on session_shutdown.
Max concurrent agent limit (configurable, default 5) prevents runaway spawning.
Workers can be sandboxed via --tools read,bash for restricted operations.
PI_CMUX_CHILD=1 env guard remains as belt-and-suspenders for haiku spawns.
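The concurrency ceiling could be enforced with a small guard that spawn_pi consults before creating a surface. A sketch under the assumption that agents are identified by string IDs; `FleetLimiter` is a hypothetical name:

```typescript
// Hypothetical guard around the fleet map: refuses to register a new agent
// once the number of live agents reaches the configured ceiling.
class FleetLimiter {
  private readonly live = new Set<string>();
  private readonly maxConcurrent: number;

  constructor(maxConcurrent = 5) {
    this.maxConcurrent = maxConcurrent;
  }

  // Returns true and records the agent if there is room, false otherwise.
  tryAcquire(agentId: string): boolean {
    if (this.live.has(agentId)) return true;
    if (this.live.size >= this.maxConcurrent) return false;
    this.live.add(agentId);
    return true;
  }

  // Call when an agent completes, fails, or its surface is closed.
  release(agentId: string): void {
    this.live.delete(agentId);
  }

  get count(): number {
    return this.live.size;
  }
}
```

Returning false rather than throwing lets the tool report "fleet is full" to the LLM, which can then wait or reprioritize.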
The dashboard combines three data sources polled on a timer:
1. Fleet tracker (in-memory)
The spawn_pi tool registers agents in a Map<string, AgentInfo>:
interface AgentInfo {
  id: string; // Short UUID
  surfaceRef: string; // cmux surface ref
  workspaceRef: string; // cmux workspace ref
  taskId?: string; // pi-tasks task ID
  model: string;
  cwd: string;
  prompt: string; // First 200 chars
  status: "starting" | "running" | "idle" | "completed" | "failed";
  spawnedAt: number;
  lastActivity?: string; // Latest tool activity from read-screen
}
2. cmux tree + per-workspace read-screen
// Get workspace layout
const tree = cmux("tree", "--all");

// For each tracked agent, read the last few lines
for (const agent of fleet.values()) {
  const screen = cmuxSafe("read-screen", "--surface", agent.surfaceRef, "--lines", "3");
  agent.lastActivity = parseActivity(screen);
  agent.status = inferStatus(screen);
}
Activity parsing heuristics:
If screen contains the pi spinner/streaming indicator → running
If screen contains the pi input prompt → idle
If screen contains an error trace → failed
Extract the most recent tool use line for lastActivity
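These heuristics might be sketched as follows. The regular expressions are placeholders: the actual spinner glyphs, prompt marker, and tool-line format of the pi TUI are not specified here, so every pattern in `DEFAULT_PATTERNS` is an assumption to be replaced after inspecting real output. Checking for errors before the prompt is also an assumption, made because a failed run usually returns to the prompt.

```typescript
type ScreenStatus = "running" | "idle" | "failed";

// Hypothetical patterns; adjust after inspecting real pi output.
interface StatusPatterns {
  spinner: RegExp;
  prompt: RegExp;
  error: RegExp;
  toolLine: RegExp; // must carry the g and m flags
}

const DEFAULT_PATTERNS: StatusPatterns = {
  spinner: /[⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏]/,
  prompt: /^\s*>\s*$/m,
  error: /^\s*(Error|Traceback|Unhandled)\b/m,
  toolLine: /^\s*(read|bash|edit|write)\b.*$/gm,
};

// Apply the heuristics in priority order: a live spinner wins, then an
// error trace, then the input prompt. Unknown screens return undefined.
function inferStatus(
  screen: string,
  p: StatusPatterns = DEFAULT_PATTERNS,
): ScreenStatus | undefined {
  if (p.spinner.test(screen)) return "running";
  if (p.error.test(screen)) return "failed";
  if (p.prompt.test(screen)) return "idle";
  return undefined;
}

// Return the last tool-use line on screen, if any.
function parseActivity(
  screen: string,
  p: StatusPatterns = DEFAULT_PATTERNS,
): string | undefined {
  const matches = screen.match(p.toolLine);
  return matches ? matches[matches.length - 1].trim() : undefined;
}
```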
const POLL_INTERVAL = 3000; // 3s when agents are active
let fleetTimer: ReturnType<typeof setInterval> | undefined;

function startFleetPolling() {
  if (fleetTimer) return;
  fleetTimer = setInterval(() => {
    updateFleetStatus();
    tui?.requestRender();
  }, POLL_INTERVAL);
}

function stopFleetPolling() {
  if (fleetTimer) {
    clearInterval(fleetTimer);
    fleetTimer = undefined;
  }
}
Adaptive polling
When agents are active: poll every 3s
When all agents idle/completed: poll every 15s
When no agents tracked: stop polling, hide widget
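The three rules above reduce to one decision function. A minimal sketch, where `nextPollInterval` is a hypothetical helper and counting "failed" agents as quiet is an assumption:

```typescript
type AgentStatus = "starting" | "running" | "idle" | "completed" | "failed";

const ACTIVE_INTERVAL_MS = 3_000;
const QUIET_INTERVAL_MS = 15_000;

// Hypothetical helper: pick the next poll delay from current agent statuses.
// Returns undefined when nothing is tracked, meaning "stop polling".
// Assumption: "failed" counts as quiet, like idle and completed.
function nextPollInterval(statuses: AgentStatus[]): number | undefined {
  if (statuses.length === 0) return undefined;
  const anyActive = statuses.some((s) => s === "starting" || s === "running");
  return anyActive ? ACTIVE_INTERVAL_MS : QUIET_INTERVAL_MS;
}
```

Because setInterval uses a fixed period, an adaptive loop would more likely reschedule itself with setTimeout after each poll, using the value returned here.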
cmux sidebar sync
In addition to the widget, update the cmux sidebar status with a compact fleet summary:
Register shortcuts in the orchestrator for fleet management:
pi.registerShortcut("ctrl+shift+f", {
  description: "Toggle fleet dashboard",
  handler: () => toggleFleetWidget(),
});

pi.registerShortcut("ctrl+shift+n", {
  description: "Switch to next active agent workspace",
  handler: () => {
    const next = getNextActiveAgent();
    if (next) cmux("select-workspace", "--workspace", next.workspaceRef);
  },
});
Integration with pi-tasks Widget
Both pi-tasks and the fleet dashboard use ctx.ui.setWidget(). They coexist:
pi-tasks widget (key: "tasks") — shows task list with dependency state
Fleet dashboard (key: "fleet") — shows agent/workspace state
Both render above the editor. The fleet dashboard appears above the task list
when agents are active, creating a unified orchestration view:
pi.registerTool({
  name: "rpc_spawn",
  label: "RPC Worker",
  description:
    "Spawn a headless pi worker for quick parallel tasks. " +
    "Cheaper than spawn-pi (no workspace). Results delivered as messages.",
  parameters: Type.Object({
    prompt: Type.String({ description: "Task prompt" }),
    cwd: Type.Optional(Type.String()),
    model: Type.Optional(Type.String({ description: "Default: claude-haiku-4-5" })),
    task_id: Type.Optional(Type.String({ description: "pi-tasks task ID" })),
    tools: Type.Optional(Type.Array(Type.String(), {
      description: "Allowed tools (default: read,bash,edit,write)",
    })),
  }),
});
Cleanup
On orchestrator session_shutdown:
pi.on("session_shutdown", async () => {
  rpcPool.killAll();
  // Visible agents: optionally close their workspaces
  for (const agent of fleet.values()) {
    cmuxSafe("close-workspace", "--workspace", agent.workspaceRef);
  }
});
How agents in different cmux workspaces communicate without shared memory.
The Problem
Each pi process is independent. Pi's EventBus is per-process. When agent A
in workspace:2 completes a task, agent B in workspace:1 doesn't know about it.
Three Relay Mechanisms
1. cmux as Message Bus (simplest, recommended)
cmux surfaces can send text to each other. The orchestrator can inject messages
into worker sessions:
// Orchestrator tells worker to do something
cmux("send", "--surface", worker.surfaceRef, "Now run the integration tests\n");

// Orchestrator reads worker output
const screen = cmux("read-screen", "--surface", worker.surfaceRef, "--lines", "20");
This is crude but effective. The orchestrator polls workers periodically and
can steer them based on what it sees. No new infrastructure required.
Enhancement: structured completion signals. Workers can echo a structured
completion marker that the orchestrator's polling loop detects:
// Worker's session_shutdown or agent_end hook writes a marker
cmuxSafe("log", "--source", "worker", "--", JSON.stringify({
  type: "agent_complete",
  agentId,
  taskId,
  status: "completed",
  summary: lastAssistantText.slice(0, 200),
}));
The orchestrator's extension polls cmux sidebar-state for log entries
from other workspaces.
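On the orchestrator side, the polling loop needs to pick these markers out of whatever the log query returns. The output format of cmux sidebar-state is not shown here, so the sketch below assumes only that each log entry arrives as a line of text containing the JSON payload. `parseCompletionMarkers` is a hypothetical helper:

```typescript
interface AgentCompleteEvent {
  type: "agent_complete";
  agentId: string;
  taskId?: string;
  status: string;
  summary: string;
}

// Hypothetical parser: scan log lines for JSON completion markers and ignore
// anything that is not valid JSON or lacks the expected fields.
function parseCompletionMarkers(lines: string[]): AgentCompleteEvent[] {
  const events: AgentCompleteEvent[] = [];
  for (const line of lines) {
    const start = line.indexOf("{");
    if (start === -1) continue;
    try {
      const obj = JSON.parse(line.slice(start));
      if (obj?.type === "agent_complete" && typeof obj.agentId === "string") {
        events.push(obj as AgentCompleteEvent);
      }
    } catch {
      // Not a marker; skip.
    }
  }
  return events;
}
```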
2. Shared Task Store (already built)
pi-tasks supports project-scoped stores and PI_TASKS env variable for
shared file paths. Multiple pi sessions can coordinate via the same task file:
# All workers use the same task list
export PI_TASKS=sprint-1
The file store uses file locking (O_EXCL + stale PID detection). Any session
can read/write tasks. The orchestrator creates tasks, workers pick them up
and mark them complete.
This is the pi-tasks way. No custom relay needed — the store IS the
shared state. Workers poll the store (pi-tasks already re-reads on every
get() and list() for file-backed stores).
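For readers unfamiliar with the locking technique named above, here is a generic sketch of an O_EXCL lock file with stale PID detection in Node. It illustrates the idea only and is not pi-tasks' actual code; `tryLock`, `unlock`, and `isAlive` are hypothetical names.

```typescript
import * as fs from "node:fs";

// Is the process that wrote the lock still alive? Signal 0 performs the
// existence check without delivering a signal.
function isAlive(pid: number): boolean {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Try to take the lock. The "wx" flag maps to O_CREAT | O_EXCL, so creation
// is atomic: the open fails with EEXIST while the file exists.
function tryLock(lockPath: string): boolean {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const fd = fs.openSync(lockPath, "wx");
      fs.writeSync(fd, String(process.pid));
      fs.closeSync(fd);
      return true;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
      let owner = Number.NaN;
      try {
        owner = Number.parseInt(fs.readFileSync(lockPath, "utf8"), 10);
      } catch {
        // Lock vanished between open and read; fall through and retry.
      }
      if (isAlive(owner)) return false; // held by a live process
      fs.rmSync(lockPath, { force: true }); // stale: remove and retry once
    }
  }
  return false;
}

function unlock(lockPath: string): void {
  fs.rmSync(lockPath, { force: true });
}
```

This sketch has a known gap: a reader that catches the file between creation and the PID write sees an empty file and treats it as stale. A hardened version would write the PID to a temporary file and link it into place, or add an age check.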
Start with mechanism 2 (shared task store). It's already built, battle-tested
with file locking, and requires zero new infrastructure. The orchestrator polls
the task store (it already does this via store.list()), and workers update
their task status via the same store.
Add mechanism 1 (cmux send/read) for steering. When the orchestrator needs
to redirect a worker or inject new instructions, it uses cmux send. When it
needs to read worker output for decision-making, it uses cmux read-screen.
Graduate to mechanism 3 (Unix socket) only if polling latency matters. For
most development workflows, 3-5 second polling is fine. Real-time relay is
overkill until you're running 10+ agents and need sub-second coordination.
Event Types
Regardless of relay mechanism, standardize on event types:
How pi-tasks evolves from single-session task tracking to multi-agent
coordination engine — and what to fork/extend from the existing codebase.
What pi-tasks Already Has (v1)
| Feature | Status | Notes |
| --- | --- | --- |
| Task CRUD with IDs | ✅ | TaskStore with Map-backed CRUD |
| Dependency DAG | ✅ | Bidirectional blocks/blockedBy with cycle warnings |
| File-backed shared stores | ✅ | PI_TASKS env, file locking, stale PID detection |
| Background process tracking | ✅ | ProcessTracker: spawn, buffer output, wait, stop |
| Subagent spawning via EventBus | ✅ | subagents:rpc:spawn with scoped reply channels |
| Auto-cascade on completion | ✅ | Completed tasks trigger unblocked dependents |
| TUI widget with spinners | ✅ | Star spinner, token counts, elapsed time, blocked-by |
| System-reminder injection | ✅ | Periodic nudges via tool_result event |
| /tasks interactive command | ✅ | View, create, clear, settings panel |
| Session/project/memory scoping | ✅ | Configurable via settings or env |
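Auto-cascade is the feature the orchestration flow leans on most, so it is worth making concrete. The sketch below shows the general idea over a bidirectional blocks/blockedBy graph. The `Task` shape and `unblockedBy` helper are assumptions for illustration, not pi-tasks' actual types:

```typescript
interface Task {
  id: string;
  status: "pending" | "in_progress" | "completed";
  blocks: string[];
  blockedBy: string[];
}

// Hypothetical helper: after `completedId` finishes, return the IDs of its
// dependents that are now runnable (pending, with every blocker completed).
function unblockedBy(tasks: Map<string, Task>, completedId: string): string[] {
  const done = tasks.get(completedId);
  if (!done || done.status !== "completed") return [];
  return done.blocks.filter((id) => {
    const t = tasks.get(id);
    return (
      t !== undefined &&
      t.status === "pending" &&
      t.blockedBy.every((dep) => tasks.get(dep)?.status === "completed")
    );
  });
}
```

In the three-repo example, the integration test task stays blocked until the last of the three repo tasks completes, at which point it is the only ID returned.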
What's Missing for Multi-Agent Orchestration
1. cmux-Aware Task Execution
pi-tasks spawns subagents via pi.events (EventBus RPC to @tintinweb/pi-subagents).
These run as background agents within the same pi process. For the orchestration rig,
we need tasks to spawn as separate pi processes in cmux workspaces.
Fork point: Replace or augment the spawnSubagent() function to support
cmux-based spawning:
TaskExecute reads this config to decide HOW to execute each task.
2. Fleet State in Task Store
Currently, agent state lives in the in-memory agentTaskMap and the fleet
tracker. For persistence across orchestrator restarts, store fleet state in
the task metadata:
On orchestrator restart (session_start), scan the task store for tasks with
status: "in_progress" and metadata.workspaceRef. Verify the workspace still
exists via cmux tree. If it does, resume monitoring. If not, mark the task
as failed.
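The restart scan might look like the following sketch. It assumes the set of live workspace refs has already been extracted from cmux tree; that parsing is left out because the output format is not shown here. `StoredTask` and `reconcileFleet` are hypothetical names:

```typescript
interface StoredTask {
  id: string;
  status: string;
  metadata?: { workspaceRef?: string };
}

// Hypothetical helper: split in-progress tasks into those whose workspace
// still exists (resume monitoring) and those whose workspace is gone (fail).
function reconcileFleet(
  tasks: StoredTask[],
  liveWorkspaces: Set<string>,
): { resume: string[]; failed: string[] } {
  const resume: string[] = [];
  const failed: string[] = [];
  for (const task of tasks) {
    const ref = task.metadata?.workspaceRef;
    if (task.status !== "in_progress" || !ref) continue;
    (liveWorkspaces.has(ref) ? resume : failed).push(task.id);
  }
  return { resume, failed };
}
```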
3. Shared Store Protocol
For multi-agent coordination, all agents need to use the same task store.
The orchestrator sets up the store, then passes the store path to workers:
# Orchestrator sets the env
export PI_TASKS=/path/to/shared-tasks.json
# Workers inherit it (or it's passed via spawn-pi)
cmux new-workspace --command "PI_TASKS=/path/to/shared-tasks.json pi ..."
Workers can then use TaskList/TaskGet/TaskUpdate to claim tasks,
report progress, and mark completion. The file-locked store handles concurrency.
Worker task claim protocol:
1. Worker calls TaskList → finds pending tasks
2. Worker calls TaskUpdate { taskId, status: "in_progress", owner: agentId }
3. If store.update succeeds and owner is set → task is claimed
4. Worker executes the task
5. Worker calls TaskUpdate { taskId, status: "completed" }
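Race condition prevention: the file lock ensures only one writer at a time.
The first worker to lock and set the owner field wins the claim, provided the
update re-checks under the lock that the task is still pending and unowned.
Without that check, a later writer would silently overwrite the earlier claim.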
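The claim step is a compare-and-set. A minimal in-memory sketch of the check that would run while the store's file lock is held; `ClaimableTask` and `tryClaim` are hypothetical names, and the real TaskUpdate tool may behave differently:

```typescript
interface ClaimableTask {
  id: string;
  status: "pending" | "in_progress" | "completed";
  owner?: string;
}

// Hypothetical claim step, meant to run while the store's file lock is held.
// It re-reads the task and only writes if nobody has claimed it yet, so a
// worker that lost the race sees `false` instead of overwriting the winner.
function tryClaim(
  tasks: Map<string, ClaimableTask>,
  taskId: string,
  agentId: string,
): boolean {
  const task = tasks.get(taskId);
  if (!task || task.status !== "pending" || task.owner !== undefined) return false;
  task.status = "in_progress";
  task.owner = agentId;
  return true;
}
```

A worker that gets false simply moves on to the next pending task from TaskList.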
4. Enhanced Widget for Fleet View
The existing pi-tasks widget shows task status. Extend it with fleet context:
What we found digging through dmux, pi-coordination, pi-boomerang,
jido_symphony, pi-verbosity-control, executor, and nicobailon's catalog.
1. LLM-Based Pane State Detection (dmux)
dmux's PaneAnalyzer is the pattern we need for fleet monitoring. Instead
of fragile regex parsing of terminal output, it captures the last 50 lines
of a pane and sends them to a cheap LLM (Gemini Flash / GPT-4o-mini via
OpenRouter) to classify the state:
Parallel model fallback — races Gemini Flash, Grok, GPT-4o-mini via
Promise.any(). First success wins. Typical latency <1s.
Content-hash cache — MD5 of captured content → cached result (5s TTL).
Identical screen = skip the API call entirely.
Request deduplication — if an analysis is already in-flight for a pane,
concurrent requests share the same Promise.
Two-stage analysis — Stage 1 determines state (20 tokens). Stage 2
extracts details only if needed: options for dialogs (360 tokens), summary
for idle agents (180 tokens).
Autopilot — when an option dialog is detected and the first option has
no risk (potentialHarm.hasRisk === false), auto-sends the accept key.
What we steal: The fleet dashboard should use this pattern instead of
regex-based screen parsing. When a worker goes idle, the analyzer extracts
a summary for the dashboard and notification. When it detects a dialog, the
orchestrator can auto-accept or escalate to the user.
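The content-hash cache and request deduplication can be combined in one small wrapper. The sketch below is our own illustration of the pattern, not dmux's code: `CachedPaneAnalyzer` is a hypothetical name, the classifier is injected (it would wrap the LLM call), and the TTL is counted from when the request starts.

```typescript
import { createHash } from "node:crypto";

type PaneState = { state: string; summary?: string };
type Classifier = (screen: string) => Promise<PaneState>;

// Hypothetical wrapper around an injected classifier. Identical screen
// content for the same pane reuses the pending or recent result instead of
// issuing another request.
class CachedPaneAnalyzer {
  private readonly entries = new Map<
    string,
    { promise: Promise<PaneState>; expiresAt: number }
  >();
  private readonly classify: Classifier;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(classify: Classifier, ttlMs = 5_000, now: () => number = Date.now) {
    this.classify = classify;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Not declared async on purpose: concurrent callers must receive the very
  // same Promise object so that only one request is in flight.
  analyze(paneId: string, screen: string): Promise<PaneState> {
    const key = paneId + ":" + createHash("md5").update(screen).digest("hex");
    const hit = this.entries.get(key);
    if (hit && this.now() < hit.expiresAt) return hit.promise;

    const promise = this.classify(screen);
    this.entries.set(key, { promise, expiresAt: this.now() + this.ttlMs });
    // Drop failed lookups so the next poll retries instead of caching an error.
    promise.catch(() => {
      if (this.entries.get(key)?.promise === promise) this.entries.delete(key);
    });
    return promise;
  }
}
```

Expired entries are only replaced, never evicted, so a long-running orchestrator would want a periodic sweep. MD5 is acceptable here because the hash is a cache key, not a security boundary.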
2. Smart Attention Service (dmux)
dmux's DmuxAttentionService has a much better attention model than our
simple mark-read/mark-unread:
Armed state — panes only generate notifications after at least one
working status is observed. Prevents startup noise.
Fingerprinting — each attention event gets a fingerprint
(${status}:${title}:${body}). Same fingerprint = don't re-notify.
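The two rules above compose into a per-pane tracker. A minimal sketch of our reading of the pattern, not dmux's implementation: `AttentionTracker` is a hypothetical name, the "working" status string is assumed, and clearing the fingerprint when work resumes is our own choice so that a repeated identical stop still notifies.

```typescript
interface AttentionEvent {
  status: string;
  title: string;
  body: string;
}

// Hypothetical per-pane tracker combining the armed-state and
// fingerprinting rules described above.
class AttentionTracker {
  private armed = false;
  private lastFingerprint: string | undefined;

  // Returns true when the event should produce a notification.
  shouldNotify(event: AttentionEvent): boolean {
    if (event.status === "working") {
      // Seeing work arms the pane and resets deduplication for the next stop.
      this.armed = true;
      this.lastFingerprint = undefined;
      return false;
    }
    if (!this.armed) return false; // startup noise: nothing has run yet
    const fingerprint = `${event.status}:${event.title}:${event.body}`;
    if (fingerprint === this.lastFingerprint) return false;
    this.lastFingerprint = fingerprint;
    return true;
  }
}
```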
Each phase is a distinct stage. The coordinator spawns workers from the task
queue, priority-aware. Workers run in parallel with dependency resolution.
Two modes: subprocess (separate pi process, default) or SDK (in-process via
createAgentSession()). SDK mode enables steering mid-flight but shares
process (crash = everything crashes).
What we steal:
The task queue + dependency resolution model is more mature than pi-tasks
File reservations prevent multi-worker conflicts
Supervisor loop pattern for stuck agent recovery
The phase pipeline concept for structuring orchestration
A2A messaging as an alternative to cmux send/read-screen
Worker context persistence for crash recovery
4. Boomerang: Execute-and-Collapse (pi-boomerang)
The /boomerang pattern: run a task autonomously, then collapse the entire
exchange into a brief summary using navigateTree. The summary preserves
WHAT was done without the step-by-step details.
Key mechanism:
// Before boomerang
pi.on("before_agent_start", async (event, ctx) => {
  // Inject "BOOMERANG MODE ACTIVE" into system prompt
  // Agent works fully autonomous (no clarifying questions)
});

// After boomerang
pi.on("agent_end", async (_event, ctx) => {
  // Collapse via navigateTree back to anchor point
  await commandCtx.navigateTree(targetId, { summarize: true });
});

// Custom summary via session_before_tree
pi.on("session_before_tree", async (event) => {
  // Generate structured summary from entries
  // Files read, files written, commands run, outcome
  return { summary: { summary: generatedSummary } };
});
The key insight from pi-coordination: the orchestrator itself is an agent
with tools, not a script. It has spawn_workers, check_status,
broadcast, escalate_to_user as LLM-callable tools. The LLM decides
when to spawn more workers, when to intervene, when to escalate.
The key insight from dmux: status detection should be LLM-based, not
regex. Screen content → cheap model → structured state. This handles the
infinite variety of terminal output without brittle pattern matching.
The key insight from boomerang: token efficiency is critical for workers.
Execute-and-collapse keeps context windows manageable across many tasks.
The key insight from jido_symphony: the polling/dispatch/monitor loop is
battle-tested with proper retry backoff, stall detection, and rate limiting.
Don't reinvent — translate to TypeScript.