| title | Parallel Execution Primitives for Claude Code Operator Agents |
|---|---|
| date | 2026-04-12 |
| project | kampus |
| feature | claude-code-parallel-primitives |
| type | research |
| status | complete |
Claude Code has exactly three primitives for parallel execution -- subagents, git worktrees, and Agent Teams -- and the most important thing about all three is what they do not provide: any implicit safety net for concurrent writes. Parallel subagents sharing a working directory produce silent file corruption. No warnings. No conflict markers. No errors. Last write wins at the OS level, and the Edit tool's string replacement fails unpredictably when line counts shift between agents. The git index lock (`fatal: Unable to create '.git/index.lock': File exists`) fires even when agents write to completely disjoint files. Claude Code will not protect you. The operator must enforce isolation boundaries, or the operator ships corruption.
This is the central finding and it has a direct corollary: the only sanctioned patterns for parallel code generation are strict file partitioning (disjoint file ownership per agent) or worktree isolation (each agent gets its own branch and working directory). Read-only parallelism is safe without either. Everything else is a race condition waiting to surface.
For the operator specifically, the lowest-risk path is subagent fan-out with worktree isolation for independent tasks, where the state machine's dependency graph determines what can run concurrently. This preserves every guarantee the operator already makes -- circuit breaker, retry logic, state machine authority -- while cutting wall-clock time proportional to the parallelism factor. Agent Teams, the peer-to-peer coordination layer shipped in February 2026 behind an experimental flag, is architecturally interesting and operationally fragile: no session resumption, task status synchronization bugs, delegation compliance failures, and 20 documented issues across 10 official and 10 community-discovered reports. Anthropic's launch of Managed Agents (April 2026, public beta) as the production-grade multi-agent offering suggests Agent Teams may remain a power-user local feature indefinitely.
Token costs scale N-times for N parallel agents, compounded by prompt cache misses -- parallel agents cold-start independently rather than sharing cache. Five parallel subagents each processing 100K tokens of shared system prompt pay $3.125 versus $0.825 for sequential execution on that prefix alone. That is a 3.8x penalty on cache alone, before counting anything else. Model tiering -- Haiku for exploration, Sonnet for implementation, Opus only for orchestration -- is the highest-leverage cost optimization, yielding 40-50% savings. The 20x cost premium documented in Anthropic's harness design research comes from longer autonomous sessions with evaluation loops, not from parallelism itself; adding fan-out to the harness pattern costs roughly 1.5-2x the sequential harness baseline.
The operator's hybrid architecture -- deterministic state machine core plus agentic skill execution shell -- is exactly the pattern the industry has converged on for production multi-agent systems. David Fetterman named it "deterministic core, agentic shell." Stately AI ships XState bindings for it. Every serious production deployment in 2025-2026 uses some version of it. Parallelism is not an architectural change for the operator. It is a scheduling optimization within the existing architecture. The state machine continues to own task ordering and transitions; the operator gains the ability to dispatch multiple independent transitions simultaneously.
Agent Teams shipped in February 2026 as a research preview, gated behind `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1`. One Claude Code session (the team lead) spawns independent teammate agents, each with its own context window and full tool access. Four components: a team lead, teammates, a shared task list (JSON files at `~/.claude/tasks/{team-name}/`), and a per-agent mailbox system (JSON inboxes at `~/.claude/teams/{team-name}/inboxes/`).
What makes Agent Teams fundamentally different from subagents is not scale but topology. Subagents are spokes reporting to a hub. Agent Teams are a mesh.
- **Peer-to-peer messaging** -- teammates communicate directly, not just through the parent. This enables adversarial debugging, collaborative research, and self-organizing patterns that subagents structurally cannot support.
- **Shared task list with dependency tracking** -- tasks have pending/in-progress/completed states, file-locking for claims, and automatic unblocking when dependencies complete.
- **Persistent teammates** -- each teammate is a full Claude Code session that persists for the team's duration, unlike ephemeral subagent invocations.
- **Quality gates via hooks** -- TeammateIdle, TaskCreated, and TaskCompleted hooks enable programmatic governance (e.g., requiring tests to pass before task closure).
- **Plan approval workflow** -- teammates can be required to plan before implementing, with the lead reviewing and approving plans.
The stability picture is honest and unflattering. Agent Teams works well for parallel research, independent module development, and competing-hypothesis debugging. It works badly for anything requiring reliability. Twenty documented issues span session management fragility, VS Code integration breakage, tmux race conditions, delegation compliance failures, and task status synchronization bugs. The feature remains behind an experimental flag with no announced GA timeline. The recommended sweet spot from community practice is 3-5 teammates with 5-6 tasks each.
Cost profile: A 3-agent team uses 3-4x the tokens of a single session. Plan approval phases push this to ~7x. The cost is not the problem. The reliability is.
Git worktrees are how you make parallel code generation safe. They create independent working directories with their own branch, index, and files while sharing the same repository history. Everything else -- file partitioning, careful prompting, hoping agents stay in their lane -- is a brittle approximation of what worktrees give you by construction.
Three entry points:
| Method | Use Case |
|---|---|
| `claude --worktree <name>` | User sessions |
| `isolation: worktree` in subagent frontmatter | Automated parallel code generation |
| `EnterWorktree` tool | Mid-session isolation |
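The frontmatter entry point is the one the operator would automate. A minimal sketch of an agent definition file -- the name, description, tools, and instructions are placeholders; `isolation: worktree` is the load-bearing line this section relies on:

```markdown
---
name: module-builder
description: Implements one independent task from the operator's plan
tools: Read, Edit, Write, Bash
model: sonnet
isolation: worktree
---
Implement only the task described in your prompt. Commit your changes to the
worktree branch; do not attempt to merge.
```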
Worktrees are created at `<repo>/.claude/worktrees/<name>/` on a branch named `worktree-<name>`, branching from `origin/HEAD`. Base branch selection is not configurable via flag -- it requires `git remote set-head` or a WorktreeCreate hook. This is a real limitation the operator must work around.
Merge semantics: There is no automatic merge. This is a feature, not a missing feature. Worktree branches must be integrated manually via `git merge`, `git cherry-pick`, `gh pr create`, or by asking Claude in the main session. Conflicts surface only at merge time and use standard git conflict resolution. The team lead (or operator) coordinates sequential merging.
Cleanup: Worktrees with no changes are auto-removed when the subagent finishes. Changed worktrees persist for manual review. Orphaned worktrees are cleaned up at startup after 30 days (configurable) if they have no uncommitted changes, untracked files, or unpushed commits.
Monorepo considerations: Worktrees need dependency installation since `node_modules` is absent. `worktree.symlinkDirectories` symlinks specified directories from the main repo. `worktree.sparsePaths` uses git sparse-checkout for large monorepos. Known bug: Claude Code's atomic write pattern replaces symlinks with regular files -- tracked in issue #40857.
The constraint operators must internalize: Subagent results return as text summaries, not diffs or commits. The parent must explicitly interact with the worktree's branch to integrate changes. The operator needs explicit merge logic after parallel subagent completion. There is no shortcut here.
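A minimal sketch of what that explicit merge logic could look like, assuming a Node-based operator shelling out to git; the branch names and the conflict policy are illustrative, not prescribed:

```typescript
import { execFileSync } from "node:child_process";

// Integrate worktree branches one at a time after parallel subagents finish.
// Serial merging makes .git/index.lock contention structurally impossible.
function mergeWorktreeBranches(branches: string[]): void {
  for (const branch of branches) {
    try {
      execFileSync("git", ["merge", "--no-ff", branch], { stdio: "inherit" });
    } catch {
      // Conflicts surface here, at merge time. Abort and hand the branch to a
      // resolution step (human review or a dedicated conflict-resolution agent).
      execFileSync("git", ["merge", "--abort"], { stdio: "inherit" });
      throw new Error(`Merge conflict integrating ${branch}; manual resolution required`);
    }
  }
}

// Example: branch names follow the documented worktree-<name> convention.
mergeWorktreeBranches(["worktree-auth-module", "worktree-billing-module"]);
```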
Parallel subagents in Claude Code are genuinely concurrent -- they fire simultaneously, not sequentially. Without isolation, they share the same working directory. This is where every naive "just run N agents" approach breaks.
Three findings, in order of severity:
1. Concurrent writes to the same file produce silent corruption. Last-write-wins at the OS level. The Edit tool's string replacement fails unpredictably when line counts change between agents. No warnings, no errors, no conflict markers. The file is simply wrong, and nobody tells you.
2. Git index lock contention causes commit failures. Git's `.git/index.lock` mutex means concurrent `git add` or `git commit` calls fail with `fatal: Unable to create '.git/index.lock': File exists` -- even when agents write to completely disjoint files. This is not a Claude Code bug. It is how git works. Issue #28823 documents the pattern.
3. Claude Code provides no implicit safety net. No filesystem-level locking. No copy-on-write. No transaction semantics. No warnings when you fire multiple subagents at the same working directory. The system trusts you to know what you are doing.
Five sanctioned patterns for safe parallel code generation:
| Pattern | How It Works | When to Use |
|---|---|---|
| Strict file partitioning | Each agent owns disjoint files; commits serialized after completion | Different modules, no shared files |
| Worktree isolation | Each agent gets its own branch/directory/index | Any overlapping writes |
| Read-only parallel, sequential writes | Parallel research with read-only tools; parent applies changes | Multiple perspectives on same files |
| File-based coordination with post-merge | Agents in separate clones; git sync forces conflict resolution | Large-scale (Anthropic's 16-agent C compiler approach) |
| Agent Teams with shared task list | Teammates in own worktrees; file-locked task claims | Complex multi-agent coordination |
The lesson from Anthropic's own 16-agent C compiler build is that even Anthropic does not trust its agents to share a working directory. They used separate clones with git sync. The operator should not be more trusting than its maker.
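A sketch of how the operator could enforce the first two patterns at dispatch time: refuse shared-directory fan-out unless planned file ownership is provably disjoint. The `PlannedTask` shape is hypothetical; in practice the file lists would come from the task plan:

```typescript
// Decide between cheap file partitioning and safer worktree isolation by
// checking whether any two tasks claim the same file.
interface PlannedTask {
  id: string;
  files: string[]; // files this task expects to create or modify
}

function chooseIsolation(tasks: PlannedTask[]): "file-partitioning" | "worktree" {
  const owner = new Map<string, string>();
  for (const task of tasks) {
    for (const file of task.files) {
      if (owner.has(file)) return "worktree"; // overlap detected: isolate
      owner.set(file, task.id);
    }
  }
  return "file-partitioning"; // disjoint ownership: a shared directory is safe
}
```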
The orchestration landscape divides into three categories, and the third is eating the other two.
State-machine-driven orchestration -- what the operator currently does -- provides explicit transitions, full auditability, and testable-without-LLM determinism, but cannot adapt to unforeseen situations. LLM-driven orchestration handles unstructured problems but compounds failure rates: a 10-step process at 99% per-step reliability delivers 90.4% overall reliability. Stack ten agents and the system fails roughly once in ten runs. The hybrid pattern -- deterministic core with agentic shell -- is the dominant 2025-2026 production architecture, and the operator already implements it.
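The arithmetic behind that compounding claim:

```typescript
// Per-step reliability compounds multiplicatively across a sequential pipeline.
const overall = Math.pow(0.99, 10); // ≈ 0.904 -- roughly one failed run in ten
```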
Anthropic's own research delivers the sharpest indictment of pure multi-agent orchestration: 57% of multi-agent project failures originate in orchestration design, not agent capability. The synthesis/fan-in step is where systems die -- an aggregator without explicit merge policies produces bloated or arbitrary output. The industry is learning what the operator already knows: the machine must own the rules.
Cross-framework fan-out/fan-in comparison:
| System | Fan-Out | Fan-In | Key Differentiator |
|---|---|---|---|
| Claude Code subagents | Multiple Agent tool calls in same message | Parent reads return strings, synthesizes | Simplest; no inter-child communication |
| Claude Code Agent Teams | Team lead spawns teammates + shared task list | Lead synthesizes; peers message each other | Peer-to-peer communication |
| LangGraph | `Send` API to multiple nodes | Reducer functions merge typed state | Structured state with explicit reducers |
| OpenAI Agents SDK | `asyncio.gather()` for parallel calls | Manager combines outputs | Handoffs + agents-as-tools |
| Google ADK | `ParallelAgent` named type | Synthesizer agent post-parallel | Most explicit primitives |
| Anthropic Managed Agents | Multiple stateless brains on same session | Shared event log | Brain/Hands/Session decoupling |
The XState/Stately Agent pattern maps directly to the operator's design. The machine enforces rules while the agent handles creative work. Two bridging tools -- get_current_state and take_action -- translate between the agent and machine domains. David Fetterman's "Deterministic Core, Agentic Shell" blog post names the pattern the operator has been running. The operator is not an experiment. It is the architecture the industry converged on.
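A minimal sketch of that bridge, assuming XState v5. The machine, its states, and its events are placeholders rather than the operator's real definitions; only the two tool names come from the pattern itself:

```typescript
import { createActor, createMachine } from "xstate";

// Deterministic core: the machine owns which transitions are legal.
const taskMachine = createMachine({
  id: "task",
  initial: "planned",
  states: {
    planned: { on: { IMPLEMENT: "implementing" } },
    implementing: { on: { PASS_QA: "done", FAIL_QA: "planned" } },
    done: { type: "final" },
  },
});

const actor = createActor(taskMachine).start();

// Agentic shell, tool 1: the agent can read machine state but never mutate it.
function get_current_state() {
  return { state: actor.getSnapshot().value };
}

// Agentic shell, tool 2: the agent proposes an event; the machine accepts or
// rejects it according to its own transition table.
function take_action(type: "IMPLEMENT" | "PASS_QA" | "FAIL_QA") {
  if (!actor.getSnapshot().can({ type })) {
    return { accepted: false, ...get_current_state() };
  }
  actor.send({ type });
  return { accepted: true, ...get_current_state() };
}
```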
Cost scaling is multiplicative, not additive. This is the fact that every "just parallelize it" proposal must confront honestly.
N parallel subagents cost roughly N times a single agent, plus coordination overhead. The overhead is not fixed -- it depends on which primitive you use and how you manage the prompt cache.
| Scenario | Relative Token Cost | Wall-Clock Speed | Cache Efficiency |
|---|---|---|---|
| Single agent, sequential | 1.0x (baseline) | Slowest | Best |
| N subagents, parallel | ~Nx + overhead | Fastest | Poor (parallel cold-starts) |
| N subagents, staggered | ~Nx, better cache | Moderate | Good (serial cache hits) |
| Agent team (3 members) | 3-7x | Fast | Independent per agent |
| Agent team (5 members) | 5-15x | Fastest | Independent per agent |
| Sequential multi-agent harness | 10-22x | Slow but thorough | Moderate |
Prompt cache behavior is the hidden cost multiplier that nobody talks about. Parallel agents cold-start independently rather than sharing cache. Five parallel subagents each processing 100K tokens of shared system prompt pay ~3.8x more than sequential execution on that prefix alone ($3.125 vs. $0.825). Staggering launches -- waiting for the first response before firing subsequent agents -- restores cache benefits but sacrifices the wall-clock speed that was the entire point of parallelizing. This is a genuine tradeoff with no free lunch.
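The figures above reconstructed as a sketch, assuming Anthropic's standard prompt-cache multipliers (cache writes at 1.25x and cache reads at 0.1x the base input price) and the roughly $5 per million input tokens that the quoted dollar amounts imply:

```typescript
const baseUsdPerMTok = 5;   // base input price implied by the quoted figures
const prefixMTok = 0.1;     // 100K-token shared system prompt
const agents = 5;

// Full parallel: every agent cold-starts and pays the cache-write rate.
const parallelUsd = agents * prefixMTok * 1.25 * baseUsdPerMTok;             // $3.125

// Sequential (or staggered): one cache write, then four cache reads.
const sequentialUsd =
  prefixMTok * 1.25 * baseUsdPerMTok +
  (agents - 1) * prefixMTok * 0.1 * baseUsdPerMTok;                          // $0.825

const cachePenalty = parallelUsd / sequentialUsd;                            // ≈ 3.8x
```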
Model tiering is where the real savings live:
| Role | Recommended Model | Cost vs. Opus |
|---|---|---|
| Orchestrator / complex reasoning | Opus 4.6 | 1.0x |
| Code generation / coordination | Sonnet 4.6 | 0.6x |
| Exploration / search / simple tasks | Haiku 4.5 | 0.2x |
A 40-50% cost reduction from model tiering alone. Routing exploration subagents to Haiku at $0.25/$1.25 per million tokens versus Opus at $15/$75 is not an optimization. It is the difference between sustainable and ruinous.
Practical strategies for the operator:
- Keep spawn prompts minimal -- everything in the prompt inflates every subagent's context
- Cap extended thinking (`MAX_THINKING_TOKENS=10000`) for routine tasks
- Move stable instructions to skills (loaded on-demand vs. CLAUDE.md loaded at session start)
- Clean up agent teams promptly -- idle teammates still consume tokens via polling
- Use subagents over Agent Teams for focused tasks (lower overhead, higher reliability)
The operator's primitive selection is not a preference. It is a safety decision.
| Use Case | Primitive | Isolation | Model | Risk Level |
|---|---|---|---|---|
| Parallel research / analysis (no writes) | Subagents | None needed | Haiku | Low |
| Parallel code generation, disjoint files | Subagents | File partitioning | Sonnet | Low-Medium |
| Parallel code generation, possible overlap | Subagents | Worktree | Sonnet | Medium |
| QA on multiple completed tasks | Subagents | Worktree (read from branches) | Sonnet | Low |
| Complex multi-step with inter-agent discussion | Agent Teams | Worktree (implicit) | Mixed | High (experimental) |
| Single-file changes from multiple perspectives | Sequential subagents | None | Sonnet | Low |
| Large-scale parallel (10+ agents) | Manual worktrees or Docker containers | Full isolation | Mixed | Medium-High |
The question is not which primitive is "best." The question is which primitive the operator can trust with its reliability contract.
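The decision table reads naturally as a dispatch-time selection function. A sketch; the `TaskProfile` fields and thresholds are hypothetical, and the operator would derive them from the task plan:

```typescript
interface TaskProfile {
  writes: boolean;            // does the task modify files at all?
  filesMayOverlap: boolean;   // could its file set intersect another task's?
  agentCount: number;         // planned fan-out width
  needsPeerDiscussion: boolean;
}

type Dispatch =
  | { primitive: "subagents"; isolation: "none" | "file-partitioning" | "worktree" }
  | { primitive: "agent-teams"; isolation: "worktree" }    // experimental
  | { primitive: "manual-worktrees"; isolation: "full" };

function choosePrimitive(t: TaskProfile): Dispatch {
  if (!t.writes) return { primitive: "subagents", isolation: "none" };
  if (t.agentCount >= 10) return { primitive: "manual-worktrees", isolation: "full" };
  if (t.needsPeerDiscussion) return { primitive: "agent-teams", isolation: "worktree" };
  return {
    primitive: "subagents",
    isolation: t.filesMayOverlap ? "worktree" : "file-partitioning",
  };
}
```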
| Dimension | Subagents | Agent Teams | Manual Worktrees |
|---|---|---|---|
| Readiness | Production | Experimental | Production |
| Coordination | Parent manages all | Self-coordinating | Human manages |
| Communication | Results to parent only | Peer-to-peer + task list | None |
| Cost | ~Nx + overhead | 3-7x for a 3-agent team | Variable |
| Operator integration | Natural (Agent tool calls) | Requires env var opt-in | Manual orchestration |
| Reliability | High | Medium (known issues) | High |
| Session resumption | N/A (ephemeral) | Not supported | N/A |
| Merge burden | Parent merges worktree branches | Lead merges | Human merges |
Subagents with worktree isolation are the only combination that is both production-ready and naturally integrates with the operator's Agent tool calls. Agent Teams becomes interesting when -- and only when -- it exits experimental status and gains session resumption. Manual worktrees are the escape hatch for anything subagents cannot handle.
These use production-ready primitives and well-understood patterns. No open research questions block them.
- **Dependency graph analysis before dispatch.** Parse `tasks.md` for independent task sets. Tasks with no shared dependencies can be fanned out in a single Agent tool response. The state machine already knows the dependency graph. Use it (see the sketch after this list).
- **Subagent fan-out for independent tasks with worktree isolation.** Each parallel subagent gets `isolation: worktree`. The operator fires multiple Agent tool calls in one message. Results return as text summaries; the operator merges worktree branches sequentially. This is the bread-and-butter pattern.
- **Read-only parallel research.** Subagents with `tools` restricted to `Read`, `Grep`, `Glob` can safely run in parallel without isolation. No writes, no races. Use for codebase analysis, audit, or information gathering phases.
- **Model tiering for cost optimization.** Route exploration subagents to Haiku, implementation to Sonnet, keep Opus for the operator itself. This is not optional if the operator runs at any scale.
- **Sequential commit serialization.** After parallel subagents complete (in their worktrees), the operator merges branches one at a time. Git index lock contention is structurally impossible when merges are serial.
- **Circuit breaker per parallel branch.** Each parallel subagent's failure is isolated. If one fails, the operator retries that task independently without affecting completed parallel tasks. The circuit breaker pattern the operator already owns extends naturally to parallel branches.
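A sketch of the dependency-wave analysis, assuming tasks parsed from `tasks.md` carry an explicit dependency list; the field names are hypothetical:

```typescript
interface Task {
  id: string;
  dependsOn: string[]; // ids of tasks that must complete first
}

// Group tasks into waves: every task in a wave depends only on tasks completed
// in earlier waves, so each wave can be dispatched as one set of parallel
// Agent tool calls, followed by sequential merges.
function planWaves(tasks: Task[]): Task[][] {
  const done = new Set<string>();
  let remaining = [...tasks];
  const waves: Task[][] = [];
  while (remaining.length > 0) {
    const wave = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (wave.length === 0) throw new Error("Dependency cycle in task list");
    waves.push(wave);
    for (const t of wave) done.add(t.id);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return waves;
}
```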
These are feasible but unvalidated in the operator context. Each carries an empirical question that cannot be answered by reading docs.
- **Fan-in merge policies.** When parallel tasks return, what happens if one fails and others succeed? What if results conflict? The operator needs explicit rules. "2 of 3 succeeded" is not a state the current machine represents. These policies must be designed, implemented, and stress-tested with real runs.
- **Worktree merge conflict resolution at scale.** How often do tasks the operator categorizes as "independent" actually produce merge conflicts? How well does Claude resolve them? Nobody knows. This needs empirical data from real operator runs, not assumptions.
- **Staggered subagent launches for cache optimization.** Waiting for the first subagent's response before launching subsequent ones restores cache benefits. But it sacrifices wall-clock time -- the one thing parallelism was supposed to buy. The crossover point where staggering beats full parallelism depends on task duration and prompt size, both of which vary.
- **Agent Teams as an operator execution backend.** The operator could spawn a team for complex features instead of sequential skill invocations. But Agent Teams' experimental status, session management fragility, and delegation compliance issues need stress-testing before this is viable. The operator's reliability contract is non-negotiable.
- **Dynamic isolation selection.** Can the operator detect file overlap before dispatching and choose between file partitioning (cheaper) and worktree isolation (safer)? This requires analyzing task descriptions to predict file sets -- which is asking an LLM to predict another LLM's behavior. Approach with skepticism.
- **Parallel state machine transitions.** The XState machine needs to support concurrent transitions for tasks in different states. This may require machine redesign (parallel states or multiple active state nodes; see the sketch after this list). The machine currently assumes one active transition at a time. Changing this touches the operator's core invariant.
- **Cost-bounded parallelism.** Set per-agent token limits and abort parallel branches that exceed budget. The right thresholds are unknown. Too low and you get premature termination on legitimate work. Too high and you have no bound at all.
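One possible shape for the machine-redesign option, assuming XState v5 parallel states. The regions and events are placeholders, and whether this fits the operator's existing invariants is exactly the open question:

```typescript
import { createMachine } from "xstate";

// A parallel state gives each in-flight task its own region with its own
// retry/failure sub-states; the batch completes only when every region
// reaches a final state, which is the natural fan-in point.
const parallelBatchMachine = createMachine({
  id: "batch",
  type: "parallel",
  states: {
    taskA: {
      initial: "running",
      states: {
        running: { on: { TASK_A_DONE: "merged", TASK_A_FAILED: "failed" } },
        failed: { on: { RETRY_A: "running" } },
        merged: { type: "final" },
      },
    },
    taskB: {
      initial: "running",
      states: {
        running: { on: { TASK_B_DONE: "merged", TASK_B_FAILED: "failed" } },
        failed: { on: { RETRY_B: "running" } },
        merged: { type: "final" },
      },
    },
  },
});
```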
These are the questions that do not have answers in the documentation, the community, or Anthropic's own engineering posts. They require building, measuring, and deciding.
- **How does the XState machine represent concurrent execution?** Parallel states? Multiple active nodes? External tracking of subagent handles? The current machine assumes sequential transitions. Every answer here changes the operator's core contract.
- **What is the real-world merge conflict rate for "independent" tasks?** The operator categorizes tasks as independent based on dependency analysis. How often is it wrong? Is file partitioning sufficient for a typical feature build, or does worktree isolation always pay for itself? This determines whether the cheaper option is viable.
- **What does partial fan-in failure look like?** Two of three parallel tasks succeed and one fails. Advance the successes and retry the failure? Roll back all three? The answer depends on whether the successful tasks' outputs are valid in isolation -- and that depends on the specific feature, not on any general rule.
- **What is the cost-optimal parallelism factor?** At what N does cache miss overhead outweigh wall-clock savings? For five agents with a 100K shared prefix, the cache penalty is 3.8x. For ten agents, it is worse. The crossover point depends on task duration, and nobody has published measurements.
- **Can the circuit breaker extend to parallel execution?** Trip the circuit breaker for the entire parallel batch if the failure rate exceeds a threshold? Or per-branch only? The operator's existing circuit breaker was designed for sequential execution. The semantics under parallelism are genuinely ambiguous.
- **Staggered or full parallel?** Staggering preserves cache. Full parallel preserves speed. The right answer depends on whether you are cost-constrained or time-constrained, and that changes per run. Is this a runtime decision the operator should make dynamically, or a configuration choice?
- **What is the minimum viable task duration for worktree overhead?** Worktree creation and teardown are not free. For a 30-second subagent task, worktree overhead may dominate execution time. Where is the breakpoint below which you should just use file partitioning?
- **When will Agent Teams exit experimental?** Anthropic shipped Managed Agents in April 2026 as the production multi-agent offering. Will Agent Teams remain the local/power-user feature? Will Managed Agents subsume it? The answer determines whether investing in Agent Teams integration is a bet on the platform or a bet against it.
- Create custom subagents - Claude Code Docs
- Subagents in the SDK - Claude Code Docs
- Orchestrate teams of Claude Code sessions - Claude Code Docs
- Common workflows - Claude Code Docs
- Hooks reference - Claude Code Docs
- Claude Code settings - Claude Code Docs
- Manage costs effectively - Claude Code Docs
- Prompt caching - Claude API Docs
- Compaction - Claude API Docs
- Pricing - Claude API Docs
- Building Effective Agents
- Multi-Agent Research System
- Harness Design for Long-Running Apps
- Building Agents with the Claude Agent SDK
- Managed Agents
- Building a C Compiler with Parallel Claudes
- From Tasks to Swarms: Agent Teams in Claude Code
- Agent Teams: The Switch Got Flipped
- Agent Teams Controls Guide
- Claude Code Sub-Agents: Parallel vs Sequential Patterns
- Claude Code Worktrees: Parallel Sessions Without Conflicts
- Claude Code Changelog
- How the Task Tool Actually Distributes Work
- Git Worktrees for Parallel Claude Code Sessions
- Parallelizing AI Coding Agents
- Extending Claude Code Worktrees for True Database Isolation
- pnpm + Git Worktrees for Multi-Agent Development
- Agent Teams: How I Learned to Stop Worrying (Intility Engineering)
- Claude Code Swarm Orchestration Skill (Gist)
- Claude Code Sub Agents - Burn Out Your Tokens (DEV Community)
- Claude Code Cost Optimisation Guide (systemprompt.io)
- Boris Cherny: Built-in Git Worktree Support Announcement
- Addy Osmani: The Code Agent Orchestra
- OpenAI Agents SDK: Multi-Agent Orchestration
- Google ADK: Parallel Agents
- Google Developers Blog: Multi-Agent Patterns in ADK
- Deterministic Core, Agentic Shell (David Fetterman)
- Stately AI Agent (XState)
- HatchWorks: Orchestrating AI Agents in Production
- Swarm vs. Supervisor Architecture Guide
- DataCamp: CrewAI vs LangGraph vs AutoGen
- Anthropic: AI Agent Orchestration Patterns (2026 Guide)
- Shipyard: Multi-Agent Orchestration for Claude Code
- #42856: Model Overrides Delegation
- #33043: IPC Socket Hang
- #40168: tmux Race Condition
- #28048: VS Code Tools Unavailable
- #25254: VS Code Message Delivery Broken
- #23676: CLAUDE_CONFIG_DIR Not Inherited
- #26511: Shift+Up/Down Auto-Sends
- #23561: Bedrock Model Mismatch
- #23629: TaskUpdate Status Not Synced
- #28823: Race Condition with Git index.lock
- #3013: Feature Request: Parallel Agent Execution Mode
- #34275: Worktree Settings Missing Prose
- #40857: symlinkDirectories Write Bug
- #1052: Field Notes: Git Worktree Pattern
- EnterWorktree Tool Description (System Prompt)