Skip to content

Instantly share code, notes, and snippets.

@qpleple
Created June 3, 2026 12:04
Show Gist options
  • Select an option

  • Save qpleple/a6ba93c585765db9f93e0b7cd4a3fe7e to your computer and use it in GitHub Desktop.

Select an option

Save qpleple/a6ba93c585765db9f93e0b7cd4a3fe7e to your computer and use it in GitHub Desktop.

Workflow implementation guide (JS reference)

The workflows.md doc explains what a workflow is and when to reach for one. It never documents the runtime API the script is written against. This guide fills that gap: every global, primitive, and pattern, derived by reverse-engineering examples/deep-research.js.

Treat the API shapes here as observed-from-the-example, not an official spec. Where behaviour is inferred rather than directly shown, it's marked (inferred).


1. Mental model

A workflow is one JavaScript file that the workflow runtime executes in an isolated environment (not in Claude's conversation). The script is the orchestrator: it holds the loop, the branching, and the intermediate results in plain variables. The only way it "does work" is by spawning subagents — the script itself has no filesystem, shell, or network access. Agents read/write/fetch/run; the script coordinates them.

your script  ──spawns──▶  subagents (do the actual work, return structured JSON)
     │                          │
     └── keeps results in ◀──────┘
         plain variables
     │
     └── returns ONE final object  ──▶  lands in Claude's context as the result

Two consequences drive everything below:

  1. Intermediate data never touches Claude's context — it lives in script variables. Only the final return value is surfaced. So you can fan out to hundreds of agents cheaply (context-wise) and distil aggressively before returning.
  2. The script is top-level await'd code, not a function body — but you can return from the top level to end the run early with a result (see §6).

2. File anatomy

A workflow file has exactly two parts: a meta export, then the script body.

export const meta = {
  name: 'deep-research',                    // command name → /deep-research
  description: '…',                          // shown in /workflows and / autocomplete
  whenToUse: '…',                            // guidance for when Claude should pick it
  phases: [                                  // declared phases, shown in the progress UI
    { title: 'Scope',      detail: 'Decompose question into 5 angles' },
    { title: 'Search',     detail: '5 parallel WebSearch agents' },
    // …
  ],
}

// ── script body: runs top-to-bottom, can `await` and `return` ──
phase('Scope')
const x = await agent('…', { label: 'scope', schema: SCOPE_SCHEMA })
// …
return { /* final result object */ }

meta fields

Field Type Purpose
name string The /command name once saved to .claude/workflows/.
description string One-liner in the workflows list and / autocomplete.
whenToUse string Tells Claude when to auto-select this workflow. Can include preconditions (e.g. "ask clarifying questions first if underspecified").
phases {title, detail}[] The phases the progress UI renders. Each title should match what you pass to phase() / the phase: agent option.

The phases array is declarative UI metadata — it tells the runtime which phases exist and in what order. The script then marks progress into them at runtime with phase() or the phase: option (§3.2).


3. Runtime API

These names are injected globals inside the script — you do not import them.

3.1 args — the input string

The argument passed at launch (Workflow({ name, args }) or /deep-research <text>) arrives as a global args. It may be undefined/non-string, so guard it:

const QUESTION = (typeof args === 'string' && args.trim()) || ''
if (!QUESTION) {
  return { error: 'No research question provided. Pass it as args.' }
}

args is a single string, not parsed flags. Decompose it yourself (or, per the example's whenToUse, have the caller refine it before launch).

3.2 agent(prompt, options) — spawn one subagent

The core primitive. Spawns a subagent with prompt, returns a Promise that resolves to the agent's structured output.

const scope = await agent(promptString, {
  label: 'scope',          // short id shown in the progress UI
  schema: SCOPE_SCHEMA,     // JSON Schema the output MUST conform to
  phase: 'Search',          // (optional) assigns this agent to a declared phase
})
Option Required Meaning
label yes (by convention) Human-readable id in the /workflows progress view. Keep it short and unique-ish; the example embeds data: "fetch:" + host, "v" + v + ":" + claim.slice(0,40).
schema yes (by convention) A JSON Schema (§4). The runtime constrains the agent to return JSON matching it, and resolves the promise to the parsed object.
phase no Associates the agent with a meta.phases title. Use this inside pipeline/parallel where you can't call phase() inline.

Return value:

  • On success → the parsed object matching schema.
  • On user-skip or agent error → null. Every consumer must handle null — the example does if (!r) return null, .filter(Boolean), and treats null votes as abstentions. Do not assume a result came back.

The prompts in the example always end with "Structured output only." — a strong instruction reinforcing the schema. Mirror that.

3.3 phase(title) — mark the current phase

Call it to advance the progress UI to a declared phase. Used for sequential, top-level stages:

phase('Scope')      // … scope work …
phase('Verify')     // … verify work …
phase('Synthesize') // … synthesis work …

Inside pipeline/parallel, agents overlap in time, so you can't mark phases sequentially — pass phase: 'Search' in the agent options instead.

3.4 parallel(thunks) — bounded concurrent fan-out

Runs many tasks concurrently and waits for all of them (a barrier). Returns a Promise of the results array, in input order.

const verdicts = await parallel(
  Array.from({ length: 3 }, (_, v) => () =>      // ← array of THUNKS, not promises
    agent(VERIFY_PROMPT(claim, v), { label: 'v' + v, schema: VERDICT_SCHEMA })
  )
)

Critical detail: parallel takes an array of functions that return promises (() => agent(...)), not an array of already-started promises. The thunks are lazy so the runtime can throttle to the concurrency cap (≤16 agents) instead of starting all at once. If you pass started promises, you lose that bounding.

parallel nests — the example runs parallel(claims.map(claim => () => parallel(votes))) to get a claims-×-votes grid, all bounded by the same cap.

3.5 pipeline(items, stage1, stage2, …) — streaming stages (no barrier)

Pushes each item through a sequence of stage functions, streaming: as soon as an item clears stage 1 it flows into stage 2, without waiting for the other items' stage 1 to finish. Returns a Promise of the per-item results.

const searchResults = await pipeline(
  scope.angles,                              // items

  angle => agent(SEARCH_PROMPT(angle), {     // stage 1: search per angle
    label: 'search:' + angle.label, phase: 'Search', schema: SEARCH_SCHEMA
  }).then(r => r && { angle: angle.label, results: r.results }),

  searchResult => {                          // stage 2: dedup + fan-out fetch
    const novel = dedupAndBudget(searchResult.results)   // mutates shared state
    return parallel(novel.map(source => () => agent(FETCH_PROMPT(source), {...})))
  }
)

Why streaming matters: stage 2 here mutates shared accumulator state (seen, fetchSlots, dupes). Because items arrive at stage 2 as they complete, the dedup map and fetch budget fill up incrementally and naturally throttle later arrivals — a first-come-first-served budget. A barrier-style "search all, then dedup all" would lose that interleaving.

Shape note: a stage can return a value or a promise or a parallel(...) of sub-agents. The result for that item is whatever the last stage resolves to — here an array (from parallel), which is why the caller does searchResults.flat().

3.6 log(message) — progress line

Writes a human-readable line into the run's progress view. Used liberally for observability — counts, decisions, per-claim vote tallies:

log('Decomposed into ' + scope.angles.length + ' angles: ' + scope.angles.map(a => a.label).join(', '))
log('Fetched ' + allSources.length + ' sources → ' + allClaims.length + ' claims')

log is for humans watching the run; it does not feed back into agents or the result.


4. Schemas (structured output)

Every agent call pairs a prompt with a JSON Schema. The runtime forces the agent's output to validate against it and hands you the parsed object — so downstream code can trust the shape (within the null-check caveat).

Schemas are plain JSON Schema objects defined as top-level consts:

const SCOPE_SCHEMA = {
  type: 'object',
  required: ['question', 'angles', 'summary'],
  properties: {
    question: { type: 'string' },
    summary:  { type: 'string' },
    angles: {
      type: 'array', minItems: 3, maxItems: 6,
      items: {
        type: 'object', required: ['label', 'query'],
        properties: {
          label:     { type: 'string' },
          query:     { type: 'string' },
          rationale: { type: 'string' },   // optional (not in `required`)
        },
      },
    },
  },
}

Observed conventions:

  • Use required to mark must-have fields; omit optional ones from it (rationale, publishDate, counterSource, openQuestions are all optional in the example).
  • Use enum for controlled vocabularies (relevance: ['high','medium','low'], sourceQuality: ['primary','secondary','blog','forum','unreliable']). The script then uses rank maps ({ high: 0, medium: 1, low: 2 }) to sort by them.
  • Bound arrays with maxItems to cap how much each agent can return (claims maxItems: 5, results maxItems: 6) — this is your per-agent firehose limiter.

5. Concurrency patterns — when to use which

You want… Use
One agent, wait for it await agent(...)
N independent agents, all at once, wait for all await parallel(items.map(x => () => agent(...)))
Multi-stage flow where stage N+1 should start per-item as soon as stage N finishes await pipeline(items, stage1, stage2, …)
A grid (N claims × M votes) nested parallel
Post-process one agent's result inline .then() on the agent(...) promise
Tolerate a failing agent without killing the run .catch() on the promise + return a sentinel

Barrier vs no-barrier is the key design decision:

  • pipeline = no barrier between stages → items stream through. Choose it when later items can start working while earlier ones finish, or when shared state should fill incrementally (the fetch-budget example).
  • await parallel(...) = barrier → all must finish before the next line. The example deliberately puts a barrier before Verify because the entire claim pool must be assembled and ranked before voting starts. The code comments this explicitly.

6. The return value (and early returns)

The script's top-level return is the workflow's deliverable — the one thing that lands in Claude's context. The example returns a rich object: { question, ...report, refuted, sources, stats }.

Early returns are a primary control-flow tool. Every dead-end gets its own structured return so the run always produces a usable, self-describing result instead of throwing:

if (!QUESTION)            return { error: 'No research question provided.' }
if (!scope)               return { error: 'Scope agent returned no result.' }
if (rankedClaims.length === 0) return { question, summary: 'No claims extracted…', findings: [], stats: {} }
if (confirmed.length === 0)    return { question, summary: 'All claims refuted…', findings: [], refuted: [...], stats: {} }
if (!report)              return { question, summary: 'Synthesis failed — returning raw claims…', confirmed: [...], stats: {} }
return { question, ...report, refuted, sources, stats }   // happy path

Note the pattern: degrade gracefully. When synthesis fails, salvage the verified claims raw rather than discarding the whole run. Include a stats block so the reader can see what happened (angles, sourcesFetched, claimsExtracted, confirmed, killed, agentCalls, …).


7. Idioms worth copying

  • Prompt builders as functions: const SEARCH_PROMPT = (angle) => "…". Keep prompt construction out of the orchestration logic; pass data in.
  • End prompts with "Structured output only." to reinforce the schema.
  • Embed data in label for a readable progress view: "fetch:" + host, "v" + v + ":" + claim.slice(0, 40).
  • Rank maps for enum fields: const relRank = { high: 0, medium: 1, low: 2 } then .sort((a,b) => relRank[a.relevance] - relRank[b.relevance]).
  • Tunable constants at the top: VOTES_PER_CLAIM, REFUTATIONS_REQUIRED, MAX_FETCH, MAX_VERIFY_CLAIMS — make the orchestration's knobs visible and editable.
  • Adversarial quality patterns: the real payoff of "plan in code" is patterns a single pass can't do — e.g. spawn VOTES_PER_CLAIM independent skeptics per claim and require a quorum (valid.length >= REFUTATIONS_REQUIRED && refuted < REFUTATIONS_REQUIRED). Abstentions (null votes) are counted and must not silently pass a claim.
  • .flat().filter(Boolean) to collapse a pipeline's nested arrays and drop nulls.
  • Cap the work: .slice(0, MAX_VERIFY_CLAIMS) after ranking, so cost stays bounded even if upstream produced a lot.

8. Runtime limits (from workflows.md)

Limit Value
Concurrent agents ≤16 (fewer on low-CPU machines) — why parallel takes thunks
Total agents per run 1,000 (hard cap against runaway loops)
Mid-run user input none — design each sign-off point as its own workflow
Script's own FS/shell/network none — only agents touch the outside world
Model session model unless a stage routes to another

The example tracks its own agentCalls in stats1 + angles + sources + (verified × votes) + 1 — a good habit for staying under the cap and reasoning about cost.


9. Minimal skeleton

export const meta = {
  name: 'my-workflow',
  description: 'One line.',
  whenToUse: 'When …',
  phases: [
    { title: 'Plan',   detail: 'Decompose the task' },
    { title: 'Work',   detail: 'Fan out workers' },
    { title: 'Report', detail: 'Synthesize' },
  ],
}

const ITEM_SCHEMA   = { type: 'object', required: ['items'], properties: { items: { type: 'array', items: { type: 'string' } } } }
const RESULT_SCHEMA = { type: 'object', required: ['finding'], properties: { finding: { type: 'string' } } }
const REPORT_SCHEMA = { type: 'object', required: ['summary'], properties: { summary: { type: 'string' }, details: { type: 'array', items: { type: 'string' } } } }

// Phase 1 — plan
phase('Plan')
const INPUT = (typeof args === 'string' && args.trim()) || ''
if (!INPUT) return { error: 'No input provided.' }

const plan = await agent(
  'Decompose this task into items.\n\nTask: ' + INPUT + '\n\nStructured output only.',
  { label: 'plan', schema: ITEM_SCHEMA }
)
if (!plan) return { error: 'Planning failed.' }
log('Planned ' + plan.items.length + ' items')

// Phase 2 — fan out, wait for all (barrier)
const results = (await parallel(
  plan.items.map(item => () =>
    agent('Do the work for: ' + item + '\n\nStructured output only.',
          { label: 'work:' + item.slice(0, 30), phase: 'Work', schema: RESULT_SCHEMA })
  )
)).filter(Boolean)                      // drop nulls (skipped/errored agents)
log('Completed ' + results.length + '/' + plan.items.length + ' items')

if (results.length === 0) return { input: INPUT, summary: 'No items completed.', details: [] }

// Phase 3 — synthesize
phase('Report')
const report = await agent(
  'Synthesize these findings:\n' + results.map(r => '- ' + r.finding).join('\n') + '\n\nStructured output only.',
  { label: 'report', schema: REPORT_SCHEMA }
)
if (!report) return { input: INPUT, summary: 'Synthesis failed.', findings: results.map(r => r.finding) }

return {
  input: INPUT,
  ...report,
  stats: { items: plan.items.length, completed: results.length, agentCalls: 1 + plan.items.length + 1 },
}

10. Quick reference

API Returns Blocks? Notes
args string | undefined Single input string; guard before use.
agent(prompt, {label, schema, phase}) Promise<obj | null> await it null on skip/error — always check.
phase(title) no Marks UI phase for sequential top-level stages.
parallel([() => p, …]) Promise<results[]> barrier Thunks, not promises. ≤16 at a time. Order preserved.
pipeline(items, s1, s2, …) Promise<results[]> streams No barrier between stages; good for shared accumulators.
log(msg) no Progress view only; not seen by agents.
return value ends run The single object surfaced to Claude.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment