The workflows.md doc explains what a workflow is and when to reach for one. It never documents the runtime API the script is written against. This guide fills that gap: every global, primitive, and pattern, derived by reverse-engineering
examples/deep-research.js.Treat the API shapes here as observed-from-the-example, not an official spec. Where behaviour is inferred rather than directly shown, it's marked (inferred).
A workflow is one JavaScript file that the workflow runtime executes in an isolated environment (not in Claude's conversation). The script is the orchestrator: it holds the loop, the branching, and the intermediate results in plain variables. The only way it "does work" is by spawning subagents — the script itself has no filesystem, shell, or network access. Agents read/write/fetch/run; the script coordinates them.
your script ──spawns──▶ subagents (do the actual work, return structured JSON)
│ │
└── keeps results in ◀──────┘
plain variables
│
└── returns ONE final object ──▶ lands in Claude's context as the result
Two consequences drive everything below:
- Intermediate data never touches Claude's context — it lives in script variables.
Only the final
returnvalue is surfaced. So you can fan out to hundreds of agents cheaply (context-wise) and distil aggressively before returning. - The script is top-level
await'd code, not a function body — but you canreturnfrom the top level to end the run early with a result (see §6).
A workflow file has exactly two parts: a meta export, then the script body.
export const meta = {
name: 'deep-research', // command name → /deep-research
description: '…', // shown in /workflows and / autocomplete
whenToUse: '…', // guidance for when Claude should pick it
phases: [ // declared phases, shown in the progress UI
{ title: 'Scope', detail: 'Decompose question into 5 angles' },
{ title: 'Search', detail: '5 parallel WebSearch agents' },
// …
],
}
// ── script body: runs top-to-bottom, can `await` and `return` ──
phase('Scope')
const x = await agent('…', { label: 'scope', schema: SCOPE_SCHEMA })
// …
return { /* final result object */ }| Field | Type | Purpose |
|---|---|---|
name |
string | The /command name once saved to .claude/workflows/. |
description |
string | One-liner in the workflows list and / autocomplete. |
whenToUse |
string | Tells Claude when to auto-select this workflow. Can include preconditions (e.g. "ask clarifying questions first if underspecified"). |
phases |
{title, detail}[] |
The phases the progress UI renders. Each title should match what you pass to phase() / the phase: agent option. |
The
phasesarray is declarative UI metadata — it tells the runtime which phases exist and in what order. The script then marks progress into them at runtime withphase()or thephase:option (§3.2).
These names are injected globals inside the script — you do not import them.
The argument passed at launch (Workflow({ name, args }) or /deep-research <text>)
arrives as a global args. It may be undefined/non-string, so guard it:
const QUESTION = (typeof args === 'string' && args.trim()) || ''
if (!QUESTION) {
return { error: 'No research question provided. Pass it as args.' }
}args is a single string, not parsed flags. Decompose it yourself (or, per the
example's whenToUse, have the caller refine it before launch).
The core primitive. Spawns a subagent with prompt, returns a Promise that resolves
to the agent's structured output.
const scope = await agent(promptString, {
label: 'scope', // short id shown in the progress UI
schema: SCOPE_SCHEMA, // JSON Schema the output MUST conform to
phase: 'Search', // (optional) assigns this agent to a declared phase
})| Option | Required | Meaning |
|---|---|---|
label |
yes (by convention) | Human-readable id in the /workflows progress view. Keep it short and unique-ish; the example embeds data: "fetch:" + host, "v" + v + ":" + claim.slice(0,40). |
schema |
yes (by convention) | A JSON Schema (§4). The runtime constrains the agent to return JSON matching it, and resolves the promise to the parsed object. |
phase |
no | Associates the agent with a meta.phases title. Use this inside pipeline/parallel where you can't call phase() inline. |
Return value:
- On success → the parsed object matching
schema. - On user-skip or agent error →
null. Every consumer must handle null — the example doesif (!r) return null,.filter(Boolean), and treats null votes as abstentions. Do not assume a result came back.
The prompts in the example always end with "Structured output only." — a strong instruction reinforcing the schema. Mirror that.
Call it to advance the progress UI to a declared phase. Used for sequential, top-level stages:
phase('Scope') // … scope work …
phase('Verify') // … verify work …
phase('Synthesize') // … synthesis work …Inside pipeline/parallel, agents overlap in time, so you can't mark phases
sequentially — pass phase: 'Search' in the agent options instead.
Runs many tasks concurrently and waits for all of them (a barrier). Returns a Promise of the results array, in input order.
const verdicts = await parallel(
Array.from({ length: 3 }, (_, v) => () => // ← array of THUNKS, not promises
agent(VERIFY_PROMPT(claim, v), { label: 'v' + v, schema: VERDICT_SCHEMA })
)
)Critical detail: parallel takes an array of functions that return promises
(() => agent(...)), not an array of already-started promises. The thunks are lazy so
the runtime can throttle to the concurrency cap (≤16 agents) instead of starting all at
once. If you pass started promises, you lose that bounding.
parallel nests — the example runs parallel(claims.map(claim => () => parallel(votes)))
to get a claims-×-votes grid, all bounded by the same cap.
Pushes each item through a sequence of stage functions, streaming: as soon as an item clears stage 1 it flows into stage 2, without waiting for the other items' stage 1 to finish. Returns a Promise of the per-item results.
const searchResults = await pipeline(
scope.angles, // items
angle => agent(SEARCH_PROMPT(angle), { // stage 1: search per angle
label: 'search:' + angle.label, phase: 'Search', schema: SEARCH_SCHEMA
}).then(r => r && { angle: angle.label, results: r.results }),
searchResult => { // stage 2: dedup + fan-out fetch
const novel = dedupAndBudget(searchResult.results) // mutates shared state
return parallel(novel.map(source => () => agent(FETCH_PROMPT(source), {...})))
}
)Why streaming matters: stage 2 here mutates shared accumulator state (seen,
fetchSlots, dupes). Because items arrive at stage 2 as they complete, the dedup map
and fetch budget fill up incrementally and naturally throttle later arrivals — a
first-come-first-served budget. A barrier-style "search all, then dedup all" would lose
that interleaving.
Shape note: a stage can return a value or a promise or a
parallel(...)of sub-agents. The result for that item is whatever the last stage resolves to — here an array (fromparallel), which is why the caller doessearchResults.flat().
Writes a human-readable line into the run's progress view. Used liberally for observability — counts, decisions, per-claim vote tallies:
log('Decomposed into ' + scope.angles.length + ' angles: ' + scope.angles.map(a => a.label).join(', '))
log('Fetched ' + allSources.length + ' sources → ' + allClaims.length + ' claims')log is for humans watching the run; it does not feed back into agents or the result.
Every agent call pairs a prompt with a JSON Schema. The runtime forces the agent's
output to validate against it and hands you the parsed object — so downstream code can
trust the shape (within the null-check caveat).
Schemas are plain JSON Schema objects defined as top-level consts:
const SCOPE_SCHEMA = {
type: 'object',
required: ['question', 'angles', 'summary'],
properties: {
question: { type: 'string' },
summary: { type: 'string' },
angles: {
type: 'array', minItems: 3, maxItems: 6,
items: {
type: 'object', required: ['label', 'query'],
properties: {
label: { type: 'string' },
query: { type: 'string' },
rationale: { type: 'string' }, // optional (not in `required`)
},
},
},
},
}Observed conventions:
- Use
requiredto mark must-have fields; omit optional ones from it (rationale,publishDate,counterSource,openQuestionsare all optional in the example). - Use
enumfor controlled vocabularies (relevance: ['high','medium','low'],sourceQuality: ['primary','secondary','blog','forum','unreliable']). The script then uses rank maps ({ high: 0, medium: 1, low: 2 }) to sort by them. - Bound arrays with
maxItemsto cap how much each agent can return (claims maxItems: 5,results maxItems: 6) — this is your per-agent firehose limiter.
| You want… | Use |
|---|---|
| One agent, wait for it | await agent(...) |
| N independent agents, all at once, wait for all | await parallel(items.map(x => () => agent(...))) |
| Multi-stage flow where stage N+1 should start per-item as soon as stage N finishes | await pipeline(items, stage1, stage2, …) |
| A grid (N claims × M votes) | nested parallel |
| Post-process one agent's result inline | .then() on the agent(...) promise |
| Tolerate a failing agent without killing the run | .catch() on the promise + return a sentinel |
Barrier vs no-barrier is the key design decision:
pipeline= no barrier between stages → items stream through. Choose it when later items can start working while earlier ones finish, or when shared state should fill incrementally (the fetch-budget example).await parallel(...)= barrier → all must finish before the next line. The example deliberately puts a barrier beforeVerifybecause the entire claim pool must be assembled and ranked before voting starts. The code comments this explicitly.
The script's top-level return is the workflow's deliverable — the one thing that lands
in Claude's context. The example returns a rich object: { question, ...report, refuted, sources, stats }.
Early returns are a primary control-flow tool. Every dead-end gets its own structured return so the run always produces a usable, self-describing result instead of throwing:
if (!QUESTION) return { error: 'No research question provided.' }
if (!scope) return { error: 'Scope agent returned no result.' }
if (rankedClaims.length === 0) return { question, summary: 'No claims extracted…', findings: [], stats: {…} }
if (confirmed.length === 0) return { question, summary: 'All claims refuted…', findings: [], refuted: [...], stats: {…} }
if (!report) return { question, summary: 'Synthesis failed — returning raw claims…', confirmed: [...], stats: {…} }
return { question, ...report, refuted, sources, stats } // happy pathNote the pattern: degrade gracefully. When synthesis fails, salvage the verified
claims raw rather than discarding the whole run. Include a stats block so the reader can
see what happened (angles, sourcesFetched, claimsExtracted, confirmed, killed,
agentCalls, …).
- Prompt builders as functions:
const SEARCH_PROMPT = (angle) => "…". Keep prompt construction out of the orchestration logic; pass data in. - End prompts with "Structured output only." to reinforce the schema.
- Embed data in
labelfor a readable progress view:"fetch:" + host,"v" + v + ":" + claim.slice(0, 40). - Rank maps for enum fields:
const relRank = { high: 0, medium: 1, low: 2 }then.sort((a,b) => relRank[a.relevance] - relRank[b.relevance]). - Tunable constants at the top:
VOTES_PER_CLAIM,REFUTATIONS_REQUIRED,MAX_FETCH,MAX_VERIFY_CLAIMS— make the orchestration's knobs visible and editable. - Adversarial quality patterns: the real payoff of "plan in code" is patterns a single
pass can't do — e.g. spawn
VOTES_PER_CLAIMindependent skeptics per claim and require a quorum (valid.length >= REFUTATIONS_REQUIRED && refuted < REFUTATIONS_REQUIRED). Abstentions (null votes) are counted and must not silently pass a claim. .flat().filter(Boolean)to collapse a pipeline's nested arrays and drop nulls.- Cap the work:
.slice(0, MAX_VERIFY_CLAIMS)after ranking, so cost stays bounded even if upstream produced a lot.
| Limit | Value |
|---|---|
| Concurrent agents | ≤16 (fewer on low-CPU machines) — why parallel takes thunks |
| Total agents per run | 1,000 (hard cap against runaway loops) |
| Mid-run user input | none — design each sign-off point as its own workflow |
| Script's own FS/shell/network | none — only agents touch the outside world |
| Model | session model unless a stage routes to another |
The example tracks its own agentCalls in stats —
1 + angles + sources + (verified × votes) + 1 — a good habit for staying under the cap
and reasoning about cost.
export const meta = {
name: 'my-workflow',
description: 'One line.',
whenToUse: 'When …',
phases: [
{ title: 'Plan', detail: 'Decompose the task' },
{ title: 'Work', detail: 'Fan out workers' },
{ title: 'Report', detail: 'Synthesize' },
],
}
const ITEM_SCHEMA = { type: 'object', required: ['items'], properties: { items: { type: 'array', items: { type: 'string' } } } }
const RESULT_SCHEMA = { type: 'object', required: ['finding'], properties: { finding: { type: 'string' } } }
const REPORT_SCHEMA = { type: 'object', required: ['summary'], properties: { summary: { type: 'string' }, details: { type: 'array', items: { type: 'string' } } } }
// Phase 1 — plan
phase('Plan')
const INPUT = (typeof args === 'string' && args.trim()) || ''
if (!INPUT) return { error: 'No input provided.' }
const plan = await agent(
'Decompose this task into items.\n\nTask: ' + INPUT + '\n\nStructured output only.',
{ label: 'plan', schema: ITEM_SCHEMA }
)
if (!plan) return { error: 'Planning failed.' }
log('Planned ' + plan.items.length + ' items')
// Phase 2 — fan out, wait for all (barrier)
const results = (await parallel(
plan.items.map(item => () =>
agent('Do the work for: ' + item + '\n\nStructured output only.',
{ label: 'work:' + item.slice(0, 30), phase: 'Work', schema: RESULT_SCHEMA })
)
)).filter(Boolean) // drop nulls (skipped/errored agents)
log('Completed ' + results.length + '/' + plan.items.length + ' items')
if (results.length === 0) return { input: INPUT, summary: 'No items completed.', details: [] }
// Phase 3 — synthesize
phase('Report')
const report = await agent(
'Synthesize these findings:\n' + results.map(r => '- ' + r.finding).join('\n') + '\n\nStructured output only.',
{ label: 'report', schema: REPORT_SCHEMA }
)
if (!report) return { input: INPUT, summary: 'Synthesis failed.', findings: results.map(r => r.finding) }
return {
input: INPUT,
...report,
stats: { items: plan.items.length, completed: results.length, agentCalls: 1 + plan.items.length + 1 },
}| API | Returns | Blocks? | Notes |
|---|---|---|---|
args |
string | undefined | — | Single input string; guard before use. |
agent(prompt, {label, schema, phase}) |
Promise<obj | null> | await it | null on skip/error — always check. |
phase(title) |
— | no | Marks UI phase for sequential top-level stages. |
parallel([() => p, …]) |
Promise<results[]> | barrier | Thunks, not promises. ≤16 at a time. Order preserved. |
pipeline(items, s1, s2, …) |
Promise<results[]> | streams | No barrier between stages; good for shared accumulators. |
log(msg) |
— | no | Progress view only; not seen by agents. |
return value |
— | ends run | The single object surfaced to Claude. |