Status: Draft v1 (language-agnostic)
Purpose: Define a service that orchestrates coding agents to get project work done.
Symphony is a long-running automation service that continuously reads work from an issue tracker (Linear in this specification version), creates an isolated workspace for each issue, and runs a coding agent session for that issue inside the workspace.
The service solves four operational problems:
- It turns issue execution into a repeatable daemon workflow instead of manual scripts.
- It isolates agent execution so agent commands run only inside per-issue workspace directories.
- It keeps the workflow policy in-repo (`WORKFLOW.md`) so teams version the agent prompt and runtime settings with their code.
- It provides enough observability to operate and debug multiple concurrent agent runs.
Implementations are expected to document their trust and safety posture explicitly. This specification does not require a single approval, sandbox, or operator-confirmation policy; some implementations may target trusted environments with a high-trust configuration, while others may require stricter approvals or sandboxing.
Important boundary:
- Symphony is a scheduler/runner and tracker reader.
- Ticket writes (state transitions, comments, PR links) are typically performed by the coding agent using tools available in the workflow/runtime environment.
- A successful run may end at a workflow-defined handoff state (for example `Human Review`), not necessarily `Done`.
- Poll the issue tracker on a fixed cadence and dispatch work with bounded concurrency.
- Maintain a single authoritative orchestrator state for dispatch, retries, and reconciliation.
- Create deterministic per-issue workspaces and preserve them across runs.
- Stop active runs when issue state changes make them ineligible.
- Recover from transient failures with exponential backoff.
- Load runtime behavior from a repository-owned `WORKFLOW.md` contract.
- Expose operator-visible observability (at minimum structured logs).
- Support restart recovery without requiring a persistent database.
- Rich web UI or multi-tenant control plane.
- Prescribing a specific dashboard or terminal UI implementation.
- General-purpose workflow engine or distributed job scheduler.
- Built-in business logic for how to edit tickets, PRs, or comments. (That logic lives in the workflow prompt and agent tooling.)
- Mandating strong sandbox controls beyond what the coding agent and host OS provide.
- Mandating a single default approval, sandbox, or operator-confirmation posture for all implementations.
- `Workflow Loader`
  - Reads `WORKFLOW.md`.
  - Parses YAML front matter and prompt body.
  - Returns `{config, prompt_template}`.
- `Config Layer`
  - Exposes typed getters for workflow config values.
  - Applies defaults and environment variable indirection.
  - Performs validation used by the orchestrator before dispatch.
- `Issue Tracker Client`
  - Fetches candidate issues in active states.
  - Fetches current states for specific issue IDs (reconciliation).
  - Fetches terminal-state issues during startup cleanup.
  - Normalizes tracker payloads into a stable issue model.
- `Orchestrator`
  - Owns the poll tick.
  - Owns the in-memory runtime state.
  - Decides which issues to dispatch, retry, stop, or release.
  - Tracks session metrics and retry queue state.
- `Workspace Manager`
  - Maps issue identifiers to workspace paths.
  - Ensures per-issue workspace directories exist.
  - Runs workspace lifecycle hooks.
  - Cleans workspaces for terminal issues.
- `Agent Runner`
  - Creates the workspace.
  - Builds the prompt from issue + workflow template.
  - Launches the coding agent app-server client.
  - Streams agent updates back to the orchestrator.
- `Status Surface` (optional)
  - Presents human-readable runtime status (for example terminal output, dashboard, or other operator-facing view).
- `Logging`
  - Emits structured runtime logs to one or more configured sinks.
Symphony is easiest to port when kept in these layers:
- `Policy Layer` (repo-defined)
  - `WORKFLOW.md` prompt body.
  - Team-specific rules for ticket handling, validation, and handoff.
- `Configuration Layer` (typed getters)
  - Parses front matter into typed runtime settings.
  - Handles defaults, environment tokens, and path normalization.
- `Coordination Layer` (orchestrator)
  - Polling loop, issue eligibility, concurrency, retries, reconciliation.
- `Execution Layer` (workspace + agent subprocess)
  - Filesystem lifecycle, workspace preparation, coding-agent protocol.
- `Integration Layer` (Linear adapter)
  - API calls and normalization for tracker data.
- `Observability Layer` (logs + optional status surface)
  - Operator visibility into orchestrator and agent behavior.
- Issue tracker API (Linear for `tracker.kind: linear` in this specification version).
- Local filesystem for workspaces and logs.
- Optional workspace population tooling (for example Git CLI, if used).
- Coding-agent executable that supports JSON-RPC-like app-server mode over stdio.
- Host environment authentication for the issue tracker and coding agent.
Normalized issue record used by orchestration, prompt rendering, and observability output.
Fields:
- `id` (string) - Stable tracker-internal ID.
- `identifier` (string) - Human-readable ticket key (example: `ABC-123`).
- `title` (string)
- `description` (string or null)
- `priority` (integer or null) - Lower numbers are higher priority in dispatch sorting.
- `state` (string) - Current tracker state name.
- `branch_name` (string or null) - Tracker-provided branch metadata if available.
- `url` (string or null)
- `labels` (list of strings) - Normalized to lowercase.
- `blocked_by` (list of blocker refs) - Each blocker ref contains:
  - `id` (string or null)
  - `identifier` (string or null)
  - `state` (string or null)
- `created_at` (timestamp or null)
- `updated_at` (timestamp or null)
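The normalized record above can be sketched as a typed structure. This is illustrative only; field names come from this section, and the concrete layout is implementation-defined:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlockerRef:
    id: Optional[str] = None
    identifier: Optional[str] = None
    state: Optional[str] = None


@dataclass
class Issue:
    id: str                            # stable tracker-internal ID
    identifier: str                    # human-readable key, e.g. "ABC-123"
    title: str
    state: str                         # current tracker state name
    description: Optional[str] = None
    priority: Optional[int] = None     # lower = higher priority in dispatch sorting
    branch_name: Optional[str] = None
    url: Optional[str] = None
    labels: list[str] = field(default_factory=list)           # normalized to lowercase
    blocked_by: list[BlockerRef] = field(default_factory=list)
    created_at: Optional[str] = None
    updated_at: Optional[str] = None
```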
Parsed WORKFLOW.md payload:
- `config` (map) - YAML front matter root object.
- `prompt_template` (string) - Markdown body after front matter, trimmed.
Typed runtime values derived from WorkflowDefinition.config plus environment resolution.
Examples:
- poll interval
- workspace root
- active and terminal issue states
- concurrency limits
- coding-agent executable/args/timeouts
- workspace hooks
Filesystem workspace assigned to one issue identifier.
Fields (logical):
- `path` (workspace path; current runtime typically uses absolute paths, but relative roots are possible if configured without path separators)
- `workspace_key` (sanitized issue identifier)
- `created_now` (boolean, used to gate the `after_create` hook)
One execution attempt for one issue.
Fields (logical):
- `issue_id`
- `issue_identifier`
- `attempt` (integer or null; `null` for first run, `>=1` for retries/continuation)
- `workspace_path`
- `started_at`
- `status`
- `error` (optional)
State tracked while a coding-agent subprocess is running.
Fields:
- `session_id` (string, `<thread_id>-<turn_id>`)
- `thread_id` (string)
- `turn_id` (string)
- `codex_app_server_pid` (string or null)
- `last_codex_event` (string/enum or null)
- `last_codex_timestamp` (timestamp or null)
- `last_codex_message` (summarized payload)
- `codex_input_tokens` (integer)
- `codex_output_tokens` (integer)
- `codex_total_tokens` (integer)
- `last_reported_input_tokens` (integer)
- `last_reported_output_tokens` (integer)
- `last_reported_total_tokens` (integer)
- `turn_count` (integer) - Number of coding-agent turns started within the current worker lifetime.
Scheduled retry state for an issue.
Fields:
- `issue_id`
- `identifier` (best-effort human ID for status surfaces/logs)
- `attempt` (integer, 1-based for retry queue)
- `due_at_ms` (monotonic clock timestamp)
- `timer_handle` (runtime-specific timer reference)
- `error` (string or null)
Single authoritative in-memory state owned by the orchestrator.
Fields:
- `poll_interval_ms` (current effective poll interval)
- `max_concurrent_agents` (current effective global concurrency limit)
- `running` (map `issue_id -> running entry`)
- `claimed` (set of issue IDs reserved/running/retrying)
- `retry_attempts` (map `issue_id -> RetryEntry`)
- `completed` (set of issue IDs; bookkeeping only, not dispatch gating)
- `codex_totals` (aggregate tokens + runtime seconds)
- `codex_rate_limits` (latest rate-limit snapshot from agent events)
- `Issue ID` - Use for tracker lookups and internal map keys.
- `Issue Identifier` - Use for human-readable logs and workspace naming.
- `Workspace Key` - Derive from `issue.identifier` by replacing any character not in `[A-Za-z0-9._-]` with `_`. Use the sanitized value for the workspace directory name.
- `Normalized Issue State` - Compare states after lowercasing.
- `Session ID` - Compose from coding-agent `thread_id` and `turn_id` as `<thread_id>-<turn_id>`.
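These derivations are mechanical. A minimal sketch, assuming the `[A-Za-z0-9._-]` replacement rule and the `<thread_id>-<turn_id>` composition described above:

```python
import re


def workspace_key(identifier: str) -> str:
    # Replace any character outside [A-Za-z0-9._-] with "_".
    return re.sub(r"[^A-Za-z0-9._-]", "_", identifier)


def normalize_state(state: str) -> str:
    # States are compared after lowercasing.
    return state.lower()


def session_id(thread_id: str, turn_id: str) -> str:
    return f"{thread_id}-{turn_id}"
```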
Workflow file path precedence:
- Explicit application/runtime setting (set by CLI startup path).
- Default: `WORKFLOW.md` in the current process working directory.
Loader behavior:
- If the file cannot be read, return a `missing_workflow_file` error.
- The workflow file is expected to be repository-owned and version-controlled.
WORKFLOW.md is a Markdown file with optional YAML front matter.
Design note:
`WORKFLOW.md` should be self-contained enough to describe and run different workflows (prompt, runtime settings, hooks, and tracker selection/config) without requiring out-of-band service-specific configuration.
Parsing rules:
- If the file starts with `---`, parse lines until the next `---` as YAML front matter.
- Remaining lines become the prompt body.
- If front matter is absent, treat the entire file as prompt body and use an empty config map.
- YAML front matter must decode to a map/object; non-map YAML is an error.
- Prompt body is trimmed before use.
Returned workflow object:
- `config`: front matter root object (not nested under a `config` key).
- `prompt_template`: trimmed Markdown body.
Top-level keys:
- `tracker`
- `polling`
- `workspace`
- `hooks`
- `agent`
- `codex`
Unknown keys should be ignored for forward compatibility.
Note:
- The workflow front matter is extensible. Optional extensions may define additional top-level keys (for example `server`) without changing the core schema above.
- Extensions should document their field schema, defaults, validation rules, and whether changes apply dynamically or require restart.
- Common extension: `server.port` (integer) enables the optional HTTP server described in Section 13.7.
Fields:
- `kind` (string)
  - Required for dispatch.
  - Current supported value: `linear`
- `endpoint` (string)
  - Default for `tracker.kind == "linear"`: `https://api.linear.app/graphql`
- `api_key` (string)
  - May be a literal token or `$VAR_NAME`.
  - Canonical environment variable for `tracker.kind == "linear"`: `LINEAR_API_KEY`.
  - If `$VAR_NAME` resolves to an empty string, treat the key as missing.
- `project_slug` (string)
  - Required for dispatch when `tracker.kind == "linear"`.
- `active_states` (list of strings)
  - Default: `Todo`, `In Progress`
- `terminal_states` (list of strings)
  - Default: `Closed`, `Cancelled`, `Canceled`, `Duplicate`, `Done`
Fields:
- `interval_ms` (integer or string integer)
  - Default: `30000`
  - Changes should be re-applied at runtime and affect future tick scheduling without restart.
Fields:
- `root` (path string or `$VAR`)
  - Default: `<system-temp>/symphony_workspaces`
  - `~` and strings containing path separators are expanded.
  - Bare strings without path separators are preserved as-is (relative roots are allowed but discouraged).
Fields:
- `after_create` (multiline shell script string, optional)
  - Runs only when a workspace directory is newly created.
  - Failure aborts workspace creation.
- `before_run` (multiline shell script string, optional)
  - Runs before each agent attempt, after workspace preparation and before launching the coding agent.
  - Failure aborts the current attempt.
- `after_run` (multiline shell script string, optional)
  - Runs after each agent attempt (success, failure, timeout, or cancellation) once the workspace exists.
  - Failure is logged but ignored.
- `before_remove` (multiline shell script string, optional)
  - Runs before workspace deletion if the directory exists.
  - Failure is logged but ignored; cleanup still proceeds.
- `timeout_ms` (integer, optional)
  - Default: `60000`
  - Applies to all workspace hooks.
  - Non-positive values should be treated as invalid and fall back to the default.
  - Changes should be re-applied at runtime for future hook executions.
Fields:
- `max_concurrent_agents` (integer or string integer)
  - Default: `10`
  - Changes should be re-applied at runtime and affect subsequent dispatch decisions.
- `max_retry_backoff_ms` (integer or string integer)
  - Default: `300000` (5 minutes)
  - Changes should be re-applied at runtime and affect future retry scheduling.
- `max_concurrent_agents_by_state` (map `state_name -> positive integer`)
  - Default: empty map.
  - State keys are normalized (lowercased) for lookup.
  - Invalid entries (non-positive or non-numeric) are ignored.
Fields:
For Codex-owned config values such as `approval_policy`, `thread_sandbox`, and `turn_sandbox_policy`, supported values are defined by the targeted Codex app-server version. Implementors should treat them as pass-through Codex config values rather than relying on a hand-maintained enum in this spec. To inspect the installed Codex schema, run `codex app-server generate-json-schema --out <dir>` and inspect the relevant definitions referenced by `v2/ThreadStartParams.json` and `v2/TurnStartParams.json`. Implementations may validate these fields locally if they want stricter startup checks.
- `command` (string shell command)
  - Default: `codex app-server`
  - The runtime launches this command via `bash -lc` in the workspace directory.
  - The launched process must speak a compatible app-server protocol over stdio.
- `approval_policy` (Codex `AskForApproval` value)
  - Default: implementation-defined.
- `thread_sandbox` (Codex `SandboxMode` value)
  - Default: implementation-defined.
- `turn_sandbox_policy` (Codex `SandboxPolicy` value)
  - Default: implementation-defined.
- `turn_timeout_ms` (integer)
  - Default: `3600000` (1 hour)
- `read_timeout_ms` (integer)
  - Default: `5000`
- `stall_timeout_ms` (integer)
  - Default: `300000` (5 minutes)
  - If `<= 0`, stall detection is disabled.
The Markdown body of WORKFLOW.md is the per-issue prompt template.
Rendering requirements:
- Use a strict template engine (Liquid-compatible semantics are sufficient).
- Unknown variables must fail rendering.
- Unknown filters must fail rendering.
Template input variables:
- `issue` (object)
  - Includes all normalized issue fields, including labels and blockers.
- `attempt` (integer or null)
  - `null`/absent on first attempt.
  - Integer on retry or continuation run.
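Strictness is the key requirement here. The following is a minimal stand-in, not a Liquid engine; it only demonstrates the required fail-on-unknown-variable behavior with flat variable names (a real implementation would use a strict Liquid-compatible engine with dotted lookups and filters):

```python
import re


class TemplateRenderError(Exception):
    pass


def render_strict(template: str, variables: dict) -> str:
    """Substitute {{ name }} placeholders, failing on unknown variables."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            # Unknown variables must fail rendering, never render empty.
            raise TemplateRenderError(f"template_render_error: unknown variable {name!r}")
        return str(variables[name])

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", replace, template)
```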
Fallback prompt behavior:
- If the workflow prompt body is empty, the runtime may use a minimal default prompt (`You are working on an issue from Linear.`).
- Workflow file read/parse failures are configuration/validation errors and should not silently fall back to a prompt.
Error classes:
- `missing_workflow_file`
- `workflow_parse_error`
- `workflow_front_matter_not_a_map`
- `template_parse_error` (during prompt rendering)
- `template_render_error` (unknown variable/filter, invalid interpolation)
Dispatch gating behavior:
- Workflow file read/YAML errors block new dispatches until fixed.
- Template errors fail only the affected run attempt.
Configuration precedence:
- Workflow file path selection (runtime setting -> cwd default).
- YAML front matter values.
- Environment indirection via `$VAR_NAME` inside selected YAML values.
- Built-in defaults.
Value coercion semantics:
- Path/command fields support:
  - `~` home expansion
  - `$VAR` expansion for env-backed path values
- Apply expansion only to values intended to be local filesystem paths; do not rewrite URIs or arbitrary shell command strings.
Dynamic reload is required:
- The software should watch `WORKFLOW.md` for changes.
- On change, it should re-read and re-apply workflow config and prompt template without restart.
- The software should attempt to adjust live behavior to the new config (for example polling cadence, concurrency limits, active/terminal states, codex settings, workspace paths/hooks, and prompt content for future runs).
- Reloaded config applies to future dispatch, retry scheduling, reconciliation decisions, hook execution, and agent launches.
- Implementations are not required to restart in-flight agent sessions automatically when config changes.
- Extensions that manage their own listeners/resources (for example an HTTP server port change) may require restart unless the implementation explicitly supports live rebind.
- Implementations should also re-validate/reload defensively during runtime operations (for example before dispatch) in case filesystem watch events are missed.
- Invalid reloads should not crash the service; keep operating with the last known good effective configuration and emit an operator-visible error.
This validation is a scheduler preflight run before attempting to dispatch new work. It validates the workflow/config needed to poll and launch workers, not a full audit of all possible workflow behavior.
Startup validation:
- Validate configuration before starting the scheduling loop.
- If startup validation fails, fail startup and emit an operator-visible error.
Per-tick dispatch validation:
- Re-validate before each dispatch cycle.
- If validation fails, skip dispatch for that tick, keep reconciliation active, and emit an operator-visible error.
Validation checks:
- Workflow file can be loaded and parsed.
- `tracker.kind` is present and supported.
- `tracker.api_key` is present after `$` resolution.
- `tracker.project_slug` is present when required by the selected tracker kind.
- `codex.command` is present and non-empty.
This section is intentionally redundant so a coding agent can implement the config layer quickly.
- `tracker.kind`: string, required, currently `linear`
- `tracker.endpoint`: string, default `https://api.linear.app/graphql` when `tracker.kind=linear`
- `tracker.api_key`: string or `$VAR`, canonical env `LINEAR_API_KEY` when `tracker.kind=linear`
- `tracker.project_slug`: string, required when `tracker.kind=linear`
- `tracker.active_states`: list of strings, default `["Todo", "In Progress"]`
- `tracker.terminal_states`: list of strings, default `["Closed", "Cancelled", "Canceled", "Duplicate", "Done"]`
- `polling.interval_ms`: integer, default `30000`
- `workspace.root`: path, default `<system-temp>/symphony_workspaces`
- `worker.ssh_hosts` (extension): list of SSH host strings, optional; when omitted, work runs locally
- `worker.max_concurrent_agents_per_host` (extension): positive integer, optional; shared per-host cap applied across configured SSH hosts
- `hooks.after_create`: shell script or null
- `hooks.before_run`: shell script or null
- `hooks.after_run`: shell script or null
- `hooks.before_remove`: shell script or null
- `hooks.timeout_ms`: integer, default `60000`
- `agent.max_concurrent_agents`: integer, default `10`
- `agent.max_turns`: integer, default `20`
- `agent.max_retry_backoff_ms`: integer, default `300000` (5m)
- `agent.max_concurrent_agents_by_state`: map of positive integers, default `{}`
- `codex.command`: shell command string, default `codex app-server`
- `codex.approval_policy`: Codex `AskForApproval` value, default implementation-defined
- `codex.thread_sandbox`: Codex `SandboxMode` value, default implementation-defined
- `codex.turn_sandbox_policy`: Codex `SandboxPolicy` value, default implementation-defined
- `codex.turn_timeout_ms`: integer, default `3600000`
- `codex.read_timeout_ms`: integer, default `5000`
- `codex.stall_timeout_ms`: integer, default `300000`
- `server.port` (extension): integer, optional; enables the optional HTTP server, `0` may be used for ephemeral local bind, and CLI `--port` overrides it
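Assembled from the reference above, a minimal `WORKFLOW.md` might look like this. All values are illustrative (including the project slug and the prompt body), and the prompt uses the Liquid-style variables defined in the prompt template contract:

```markdown
---
tracker:
  kind: linear
  api_key: $LINEAR_API_KEY
  project_slug: my-project
polling:
  interval_ms: 30000
agent:
  max_concurrent_agents: 4
codex:
  command: codex app-server
---
You are working on an issue from Linear.

Issue: {{ issue.identifier }} ({{ issue.title }})
```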
The orchestrator is the only component that mutates scheduling state. All worker outcomes are reported back to it and converted into explicit state transitions.
This is not the same as tracker states (Todo, In Progress, etc.). This is the service's internal
claim state.
- `Unclaimed`
  - Issue is not running and has no retry scheduled.
- `Claimed`
  - Orchestrator has reserved the issue to prevent duplicate dispatch.
  - In practice, claimed issues are either `Running` or `RetryQueued`.
- `Running`
  - Worker task exists and the issue is tracked in the `running` map.
- `RetryQueued`
  - Worker is not running, but a retry timer exists in `retry_attempts`.
- `Released`
  - Claim removed because the issue is terminal, non-active, missing, or the retry path completed without re-dispatch.
Important nuance:
- A successful worker exit does not mean the issue is done forever.
- The worker may continue through multiple back-to-back coding-agent turns before it exits.
- After each normal turn completion, the worker re-checks the tracker issue state.
- If the issue is still in an active state, the worker should start another turn on the same live coding-agent thread in the same workspace, up to `agent.max_turns`.
- The first turn should use the full rendered task prompt.
- Continuation turns should send only continuation guidance to the existing thread, not resend the original task prompt that is already present in thread history.
- Once the worker exits normally, the orchestrator still schedules a short continuation retry (about 1 second) so it can re-check whether the issue remains active and needs another worker session.
A run attempt transitions through these phases:
- `PreparingWorkspace`
- `BuildingPrompt`
- `LaunchingAgentProcess`
- `InitializingSession`
- `StreamingTurn`
- `Finishing`
- `Succeeded`
- `Failed`
- `TimedOut`
- `Stalled`
- `CanceledByReconciliation`
Distinct terminal reasons are important because retry logic and logs differ.
- `Poll Tick`
  - Reconcile active runs.
  - Validate config.
  - Fetch candidate issues.
  - Dispatch until slots are exhausted.
- `Worker Exit (normal)`
  - Remove running entry.
  - Update aggregate runtime totals.
  - Schedule continuation retry (attempt `1`) after the worker exhausts or finishes its in-process turn loop.
- `Worker Exit (abnormal)`
  - Remove running entry.
  - Update aggregate runtime totals.
  - Schedule exponential-backoff retry.
- `Codex Update Event`
  - Update live session fields, token counters, and rate limits.
- `Retry Timer Fired`
  - Re-fetch active candidates and attempt re-dispatch, or release the claim if no longer eligible.
- `Reconciliation State Refresh`
  - Stop runs whose issue states are terminal or no longer active.
- `Stall Timeout`
  - Kill the worker and schedule a retry.
- The orchestrator serializes state mutations through one authority to avoid duplicate dispatch.
- `claimed` and `running` checks are required before launching any worker.
- Reconciliation runs before dispatch on every tick.
- Restart recovery is tracker-driven and filesystem-driven (no durable orchestrator DB required).
- Startup terminal cleanup removes stale workspaces for issues already in terminal states.
At startup, the service validates config, performs startup cleanup, schedules an immediate tick, and
then repeats every `polling.interval_ms`.
The effective poll interval should be updated when workflow config changes are re-applied.
Tick sequence:
- Reconcile running issues.
- Run dispatch preflight validation.
- Fetch candidate issues from tracker using active states.
- Sort issues by dispatch priority.
- Dispatch eligible issues while slots remain.
- Notify observability/status consumers of state changes.
If per-tick validation fails, dispatch is skipped for that tick, but reconciliation still happens first.
An issue is dispatch-eligible only if all are true:
- It has `id`, `identifier`, `title`, and `state`.
- Its state is in `active_states` and not in `terminal_states`.
- It is not already in `running`.
- It is not already in `claimed`.
- Global concurrency slots are available.
- Per-state concurrency slots are available.
- Blocker rule for `Todo` state passes:
  - If the issue state is `Todo`, do not dispatch when any blocker is non-terminal.
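The eligibility rules above compose into a single predicate. A sketch, assuming dict-shaped issues and precomputed slot availability flags (the function name and signature are illustrative):

```python
def is_dispatch_eligible(issue: dict, config: dict, running: set, claimed: set,
                         slots_available: bool, state_slots_available: bool) -> bool:
    """Apply the dispatch eligibility rules to one candidate issue."""
    if not all(issue.get(k) for k in ("id", "identifier", "title", "state")):
        return False
    state = issue["state"].lower()
    active = {s.lower() for s in config["active_states"]}
    terminal = {s.lower() for s in config["terminal_states"]}
    if state not in active or state in terminal:
        return False
    if issue["id"] in running or issue["id"] in claimed:
        return False
    if not (slots_available and state_slots_available):
        return False
    if state == "todo":
        # Blocker rule: a Todo issue waits until every blocker is terminal.
        for blocker in issue.get("blocked_by", []):
            if (blocker.get("state") or "").lower() not in terminal:
                return False
    return True
```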
Sorting order (stable intent):
- `priority` ascending (1..4 are preferred; null/unknown sorts last)
- `created_at` oldest first
- `identifier` lexicographic tie-breaker
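The sorting intent maps directly to a composite sort key. A sketch, assuming ISO-8601 `created_at` strings (which compare chronologically) and leaving null `created_at` ordering, which this section does not specify, at first position:

```python
def dispatch_sort_key(issue: dict):
    """Stable dispatch ordering: priority asc (null last), oldest first, then identifier."""
    priority = issue.get("priority")
    return (
        priority is None,               # known priorities sort before null/unknown
        priority if priority is not None else 0,
        issue.get("created_at") or "",  # ISO-8601 strings compare chronologically
        issue.get("identifier") or "",
    )


# Usage: candidates.sort(key=dispatch_sort_key)
```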
Global limit:
`available_slots = max(max_concurrent_agents - running_count, 0)`
Per-state limit:
- `max_concurrent_agents_by_state[state]` if present (state key normalized)
- otherwise fall back to the global limit
The runtime counts issues by their current tracked state in the running map.
Optional SSH host limit:
- When `worker.max_concurrent_agents_per_host` is set, each configured SSH host may run at most that many concurrent agents at once.
- Hosts at that cap are skipped for new dispatch until capacity frees up.
Retry entry creation:
- Cancel any existing retry timer for the same issue.
- Store `attempt`, `identifier`, `error`, `due_at_ms`, and the new timer handle.
Backoff formula:
- Normal continuation retries after a clean worker exit use a short fixed delay of `1000` ms.
- Failure-driven retries use `delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms)`.
- The exponential growth is capped by the configured max retry backoff (default `300000` / 5m).
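The backoff formula above can be implemented directly (the function name is illustrative):

```python
AGENT_MAX_RETRY_BACKOFF_MS = 300_000  # default cap (5 minutes)


def retry_delay_ms(attempt: int, clean_exit: bool,
                   max_backoff_ms: int = AGENT_MAX_RETRY_BACKOFF_MS) -> int:
    """Continuation retries use a short fixed delay; failures back off exponentially."""
    if clean_exit:
        return 1000
    # delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms)
    return min(10_000 * 2 ** (attempt - 1), max_backoff_ms)
```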
Retry handling behavior:
- Fetch active candidate issues (not all issues).
- Find the specific issue by `issue_id`.
- If not found, release the claim.
- If found and still candidate-eligible:
  - Dispatch if slots are available.
  - Otherwise requeue with error `no available orchestrator slots`.
- If found but no longer active, release the claim.
Note:
- Terminal-state workspace cleanup is handled by startup cleanup and active-run reconciliation (including terminal transitions for currently running issues).
- Retry handling mainly operates on active candidates and releases claims when the issue is absent, rather than performing terminal cleanup itself.
Reconciliation runs every tick and has two parts.
Part A: Stall detection
- For each running issue, compute `elapsed_ms` since `last_codex_timestamp` if any event has been seen, else since `started_at`.
- If `elapsed_ms > codex.stall_timeout_ms`, terminate the worker and queue a retry.
- If `stall_timeout_ms <= 0`, skip stall detection entirely.
Part B: Tracker state refresh
- Fetch current issue states for all running issue IDs.
- For each running issue:
- If tracker state is terminal: terminate worker and clean workspace.
- If tracker state is still active: update the in-memory issue snapshot.
- If tracker state is neither active nor terminal: terminate worker without workspace cleanup.
- If state refresh fails, keep workers running and try again on the next tick.
When the service starts:
- Query tracker for issues in terminal states.
- For each returned issue identifier, remove the corresponding workspace directory.
- If the terminal-issues fetch fails, log a warning and continue startup.
This prevents stale terminal workspaces from accumulating after restarts.
Workspace root:
`workspace.root` (normalized path; the current config layer expands path-like values and preserves bare relative names)
Per-issue workspace path:
`<workspace.root>/<sanitized_issue_identifier>`
Workspace persistence:
- Workspaces are reused across runs for the same issue.
- Successful runs do not auto-delete workspaces.
Input: `issue.identifier`
Algorithm summary:
- Sanitize the identifier to `workspace_key`.
- Compute the workspace path under the workspace root.
- Ensure the workspace path exists as a directory.
- Mark `created_now=true` only if the directory was created during this call; otherwise `created_now=false`.
- If `created_now=true`, run the `after_create` hook if configured.
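The algorithm above is a few lines of filesystem code. A sketch (hook execution is left to the caller, gated on `created_now`; the function name is illustrative):

```python
import os
import re


def ensure_workspace(workspace_root: str, identifier: str) -> tuple[str, bool]:
    """Return (workspace_path, created_now) for one issue identifier."""
    # Sanitize the identifier to the workspace key.
    key = re.sub(r"[^A-Za-z0-9._-]", "_", identifier)
    path = os.path.join(workspace_root, key)
    # created_now is true only if the directory is created during this call.
    created_now = not os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    return path, created_now
```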
Notes:
- This section does not assume any specific repository/VCS workflow.
- Workspace preparation beyond directory creation (for example dependency bootstrap, checkout/sync, code generation) is implementation-defined and is typically handled via hooks.
The spec does not require any built-in VCS or repository bootstrap behavior.
Implementations may populate or synchronize the workspace using implementation-defined logic and/or
hooks (for example after_create and/or before_run).
Failure handling:
- Workspace population/synchronization failures return an error for the current attempt.
- If failure happens while creating a brand-new workspace, implementations may remove the partially prepared directory.
- Reused workspaces should not be destructively reset on population failure unless that policy is explicitly chosen and documented.
Supported hooks:
- `hooks.after_create`
- `hooks.before_run`
- `hooks.after_run`
- `hooks.before_remove`
Execution contract:
- Execute in a local shell context appropriate to the host OS, with the workspace directory as `cwd`.
- On POSIX systems, `sh -lc <script>` (or a stricter equivalent such as `bash -lc <script>`) is a conforming default.
- Hook timeout uses `hooks.timeout_ms`; default: `60000` ms.
- Log hook start, failures, and timeouts.
Failure semantics:
- `after_create` failure or timeout is fatal to workspace creation.
- `before_run` failure or timeout is fatal to the current run attempt.
- `after_run` failure or timeout is logged and ignored.
- `before_remove` failure or timeout is logged and ignored.
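The execution contract and failure semantics can be sketched together. This assumes a POSIX host with `sh` available; the `fatal` flag models the fatal hooks (`after_create`, `before_run`) versus the logged-and-ignored ones (`after_run`, `before_remove`):

```python
import subprocess


def run_hook(name: str, script: str, workspace_path: str,
             timeout_ms: int = 60_000, fatal: bool = True) -> bool:
    """Run one workspace hook via a POSIX login shell in the workspace directory."""
    try:
        result = subprocess.run(
            ["sh", "-lc", script],
            cwd=workspace_path,
            timeout=timeout_ms / 1000.0,
        )
        ok = result.returncode == 0
    except subprocess.TimeoutExpired:
        ok = False
    if not ok and fatal:
        # after_create / before_run: failure or timeout aborts the operation.
        raise RuntimeError(f"hook {name} failed or timed out")
    # after_run / before_remove: failure is reported but ignored by the caller.
    return ok
```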
This is the most important portability constraint.
Invariant 1: Run the coding agent only in the per-issue workspace path.
- Before launching the coding-agent subprocess, validate:
`cwd == workspace_path`
Invariant 2: Workspace path must stay inside workspace root.
- Normalize both paths to absolute.
- Require `workspace_path` to have `workspace_root` as a prefix directory.
- Reject any path outside the workspace root.
Invariant 3: Workspace key is sanitized.
- Only `[A-Za-z0-9._-]` is allowed in workspace directory names.
- Replace all other characters with `_`.
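Invariant 2 deserves an explicit check, since naive prefix comparison is fooled by `..` traversal. A sketch (the function name is illustrative):

```python
import os


def assert_workspace_containment(workspace_root: str, workspace_path: str) -> str:
    """Enforce Invariant 2: the workspace path must stay inside the workspace root."""
    root = os.path.abspath(workspace_root)
    path = os.path.abspath(workspace_path)
    # commonpath rejects traversal like "<root>/../elsewhere" after normalization,
    # and the path must be a strict subdirectory, not the root itself.
    if path == root or os.path.commonpath([root, path]) != root:
        raise ValueError(f"workspace path escapes workspace root: {workspace_path}")
    return path
```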
This section defines the language-neutral contract for integrating a coding agent app-server.
Compatibility profile:
- The normative contract is message ordering, required behaviors, and the logical fields that must be extracted (for example session IDs, completion state, approval handling, and usage/rate-limit telemetry).
- Exact JSON field names may vary slightly across compatible app-server versions.
- Implementations should tolerate equivalent payload shapes when they carry the same logical meaning, especially for nested IDs, approval requests, user-input-required signals, and token/rate-limit metadata.
Subprocess launch parameters:
- Command: `codex.command`
- Invocation: `bash -lc <codex.command>`
- Working directory: workspace path
- Stdout/stderr: separate streams
- Framing: line-delimited protocol messages on stdout (JSON-RPC-like JSON per line)
Notes:
- The default command is `codex app-server`.
- Approval policy, cwd, and prompt are expressed in the protocol messages in Section 10.2.
Recommended additional process settings:
- Max line size: 10 MB (for safe buffering)
Reference: https://developers.openai.com/codex/app-server/
The client must send these protocol messages in order:
Illustrative startup transcript (equivalent payload shapes are acceptable if they preserve the same semantics):
```json
{"id":1,"method":"initialize","params":{"clientInfo":{"name":"symphony","version":"1.0"},"capabilities":{}}}
{"method":"initialized","params":{}}
{"id":2,"method":"thread/start","params":{"approvalPolicy":"<implementation-defined>","sandbox":"<implementation-defined>","cwd":"/abs/workspace"}}
{"id":3,"method":"turn/start","params":{"threadId":"<thread-id>","input":[{"type":"text","text":"<rendered prompt-or-continuation-guidance>"}],"cwd":"/abs/workspace","title":"ABC-123: Example","approvalPolicy":"<implementation-defined>","sandboxPolicy":{"type":"<implementation-defined>"}}}
```

- `initialize` request
  - Params include:
    - `clientInfo` object (for example `{name, version}`)
    - `capabilities` object (may be empty)
  - If the targeted Codex app-server requires capability negotiation for dynamic tools, include the necessary capability flag(s) here.
  - Wait for the response (`read_timeout_ms`).
- `initialized` notification
- `thread/start` request
  - Params include:
    - `approvalPolicy` = implementation-defined session approval policy value
    - `sandbox` = implementation-defined session sandbox value
    - `cwd` = absolute workspace path
  - If optional client-side tools are implemented, include their advertised tool specs using the protocol mechanism supported by the targeted Codex app-server version.
- `turn/start` request
  - Params include:
    - `threadId`
    - `input` = single text item containing the rendered prompt for the first turn, or continuation guidance for later turns on the same thread
    - `cwd`
    - `title` = `<issue.identifier>: <issue.title>`
    - `approvalPolicy` = implementation-defined turn approval policy value
    - `sandboxPolicy` = implementation-defined object-form sandbox policy payload when required by the targeted app-server version
Session identifiers:
- Read `thread_id` from the `thread/start` result (`result.thread.id`)
- Read `turn_id` from each `turn/start` result (`result.turn.id`)
- Emit `session_id = "<thread_id>-<turn_id>"`
- Reuse the same `thread_id` for all continuation turns inside one worker run
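Composing the session identifier from the two results is mechanical; a sketch assuming the results are decoded JSON maps:

```python
def make_session_id(thread_start_result, turn_start_result):
    """Build the session identifier from protocol results."""
    thread_id = thread_start_result["thread"]["id"]  # result.thread.id
    turn_id = turn_start_result["turn"]["id"]        # result.turn.id
    return f"{thread_id}-{turn_id}"
```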
The client reads line-delimited messages until the turn terminates.
Completion conditions:
- `turn/completed` -> success
- `turn/failed` -> failure
- `turn/cancelled` -> failure
- turn timeout (`turn_timeout_ms`) -> failure
- subprocess exit -> failure
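The terminal-method-to-outcome mapping can be kept as a small table; a sketch (timeouts and subprocess exit are detected out-of-band, not from a protocol message):

```python
def turn_outcome(method):
    """Map a terminal turn notification method to a run outcome.

    Returns None for non-terminal methods; turn timeouts and subprocess
    exit are handled separately and also count as failure.
    """
    return {"turn/completed": "success",
            "turn/failed": "failure",
            "turn/cancelled": "failure"}.get(method)
```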
Continuation processing:
- If the worker decides to continue after a successful turn, it should issue another `turn/start` on the same live `threadId`.
- The app-server subprocess should remain alive across those continuation turns and be stopped only when the worker run is ending.
Line handling requirements:
- Read protocol messages from stdout only.
- Buffer partial stdout lines until newline arrives.
- Attempt JSON parse on complete stdout lines.
- Stderr is not part of the protocol stream:
- ignore it or log it as diagnostics
- do not attempt protocol JSON parsing on stderr
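The line-handling requirements above amount to a newline-delimited JSON decoder over stdout. A minimal sketch (the `("malformed", raw)` event shape is an assumption matching the `malformed` event named below):

```python
import json

class LineDecoder:
    """Buffer partial stdout chunks and yield parsed JSON protocol messages.

    Incomplete lines are held until a newline arrives; complete lines that
    fail JSON parsing are surfaced as malformed events rather than crashing.
    """
    def __init__(self):
        self._buf = b""

    def feed(self, chunk):
        self._buf += chunk
        while b"\n" in self._buf:
            line, self._buf = self._buf.split(b"\n", 1)
            if not line.strip():
                continue
            try:
                yield ("message", json.loads(line))
            except json.JSONDecodeError:
                yield ("malformed", line.decode(errors="replace"))
```

Stderr never passes through this decoder; it is logged or ignored as diagnostics.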
The app-server client emits structured events to the orchestrator callback. Each event should include:
- `event` (enum/string)
- `timestamp` (UTC timestamp)
- `codex_app_server_pid` (if available)
- optional `usage` map (token counts)
- payload fields as needed
Important emitted events may include:
- `session_started`
- `startup_failed`
- `turn_completed`
- `turn_failed`
- `turn_cancelled`
- `turn_ended_with_error`
- `turn_input_required`
- `approval_auto_approved`
- `unsupported_tool_call`
- `notification`
- `other_message`
- `malformed`
Approval, sandbox, and user-input behavior is implementation-defined.
Policy requirements:
- Each implementation should document its chosen approval, sandbox, and operator-confirmation posture.
- Approval requests and user-input-required events must not leave a run stalled indefinitely. An implementation should either satisfy them, surface them to an operator, auto-resolve them, or fail the run according to its documented policy.
Example high-trust behavior:
- Auto-approve command execution approvals for the session.
- Auto-approve file-change approvals for the session.
- Treat user-input-required turns as hard failure.
Unsupported dynamic tool calls:
- Supported dynamic tool calls that are explicitly implemented and advertised by the runtime should be handled according to their extension contract.
- If the agent requests a dynamic tool call (`item/tool/call`) that is not supported, return a tool failure response and continue the session.
- This prevents the session from stalling on unsupported tool execution paths.
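A dispatcher for dynamic tool calls that fails unknown tool names without ending the session might look like this sketch (the `params` field names are assumptions; the failure payload mirrors the illustrative response below):

```python
def handle_dynamic_tool_call(msg, supported_tools):
    """Respond to an item/tool/call request.

    Supported tools are dispatched to their handler; unknown tool names get
    a failure result so the session keeps going instead of stalling.
    """
    name = msg.get("params", {}).get("tool")  # field name is an assumption
    handler = supported_tools.get(name)
    if handler is None:
        return {"id": msg["id"],
                "result": {"success": False, "error": "unsupported_tool_call"}}
    return {"id": msg["id"], "result": handler(msg["params"])}
```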
Optional client-side tool extension:
- An implementation may expose a limited set of client-side tools to the app-server session.
- Current optional standardized tool: `linear_graphql`.
- If implemented, supported tools should be advertised to the app-server session during startup using the protocol mechanism supported by the targeted Codex app-server version.
- Unsupported tool names should still return a failure result and continue the session.
`linear_graphql` extension contract:

- Purpose: execute a raw GraphQL query or mutation against Linear using Symphony's configured tracker auth for the current session.
- Availability: only meaningful when `tracker.kind == "linear"` and valid Linear auth is configured.
- Preferred input shape:

  ```json
  {
    "query": "single GraphQL query or mutation document",
    "variables": { "optional": "graphql variables object" }
  }
  ```

- `query` must be a non-empty string.
- `query` must contain exactly one GraphQL operation.
- `variables` is optional and, when present, must be a JSON object.
- Implementations may additionally accept a raw GraphQL query string as shorthand input.
- Execute one GraphQL operation per tool call.
- If the provided document contains multiple operations, reject the tool call as invalid input.
- `operationName` selection is intentionally out of scope for this extension.
- Reuse the configured Linear endpoint and auth from the active Symphony workflow/runtime config; do not require the coding agent to read raw tokens from disk.
- Tool result semantics:
  - transport success + no top-level GraphQL `errors` -> `success=true`
  - top-level GraphQL `errors` present -> `success=false`, but preserve the GraphQL response body for debugging
  - invalid input, missing auth, or transport failure -> `success=false` with an error payload
- Return the GraphQL response or error payload as structured tool output that the model can inspect in-session.
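The input-validation rules of this contract can be sketched as follows. The operation count here is a regex heuristic, not a real GraphQL parse; a production implementation should use a GraphQL parser:

```python
import re

def validate_linear_graphql_input(payload):
    """Validate linear_graphql tool input; returns (ok, error_message).

    Accepts the preferred {"query": ..., "variables": ...} shape and the
    optional raw-query-string shorthand.
    """
    if isinstance(payload, str):  # shorthand: raw GraphQL document
        payload = {"query": payload}
    if not isinstance(payload, dict):
        return False, "input must be an object or a raw query string"
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        return False, "query must be a non-empty string"
    variables = payload.get("variables")
    if variables is not None and not isinstance(variables, dict):
        return False, "variables must be a JSON object when present"
    # Heuristic: count named operations plus a possible anonymous shorthand
    # document starting with "{". A GraphQL parser would be exact.
    named = len(re.findall(r"\b(query|mutation|subscription)\b", query))
    anonymous = 1 if query.lstrip().startswith("{") else 0
    if named + anonymous != 1:
        return False, "document must contain exactly one GraphQL operation"
    return True, None
```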
Illustrative responses (equivalent payload shapes are acceptable if they preserve the same outcome):

```json
{"id":"<approval-id>","result":{"approved":true}}
{"id":"<tool-call-id>","result":{"success":false,"error":"unsupported_tool_call"}}
```

Hard failure on user input requirement:
- If the agent requests user input, fail the run attempt immediately.
- The client detects this via:
  - explicit method (`item/tool/requestUserInput`), or
  - turn methods/flags indicating input is required.
Timeouts:
- `codex.read_timeout_ms`: request/response timeout during startup and sync requests
- `codex.turn_timeout_ms`: total turn stream timeout
- `codex.stall_timeout_ms`: enforced by orchestrator based on event inactivity
Error mapping (recommended normalized categories):
- `codex_not_found`
- `invalid_workspace_cwd`
- `response_timeout`
- `turn_timeout`
- `port_exit`
- `response_error`
- `turn_failed`
- `turn_cancelled`
- `turn_input_required`
The Agent Runner wraps workspace + prompt + app-server client.
Behavior:
- Create/reuse workspace for issue.
- Build prompt from workflow template.
- Start app-server session.
- Forward app-server events to orchestrator.
- On any error, fail the worker attempt (the orchestrator will retry).
Note:
- Workspaces are intentionally preserved after successful runs.
An implementation must support these tracker adapter operations:
- `fetch_candidate_issues()`
  - Return issues in configured active states for a configured project.
- `fetch_issues_by_states(state_names)`
  - Used for startup terminal cleanup.
- `fetch_issue_states_by_ids(issue_ids)`
  - Used for active-run reconciliation.
Linear-specific requirements for `tracker.kind == "linear"`:

- GraphQL endpoint (default `https://api.linear.app/graphql`)
- Auth token sent in the `Authorization` header
- `tracker.project_slug` maps to the Linear project `slugId`
- Candidate issue query filters project using `project: { slugId: { eq: $projectSlug } }`
- Issue-state refresh query uses GraphQL issue IDs with variable type `[ID!]`
- Pagination required for candidate issues
- Page size default: `50`
- Network timeout: `30000 ms`
Important:
- Linear GraphQL schema details can drift. Keep query construction isolated and test the exact query fields/types required by this specification.
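The pagination requirement, together with the `linear_missing_end_cursor` integrity check from the recommended error categories, can be sketched as a loop over `pageInfo`. This is a sketch: the injected `post` callable (one authenticated HTTP round trip returning the decoded response body) and the `issues` connection name are assumptions about the surrounding adapter:

```python
def fetch_all_pages(post, query, variables, page_size=50):
    """Paginate a Linear GraphQL connection query.

    Assumes the query selects pageInfo { hasNextPage endCursor } and accepts
    $first/$after variables.
    """
    nodes, cursor = [], None
    while True:
        payload = post({"query": query,
                        "variables": {**variables, "first": page_size, "after": cursor}})
        if payload.get("errors"):
            raise RuntimeError("linear_graphql_errors")
        connection = payload["data"]["issues"]
        nodes.extend(connection["nodes"])
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            return nodes
        cursor = page_info.get("endCursor")
        if cursor is None:
            # pagination integrity error from the recommended categories
            raise RuntimeError("linear_missing_end_cursor")
```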
A non-Linear implementation may change transport details, but the normalized outputs must match the domain model in Section 4.
Candidate issue normalization should produce fields listed in Section 4.1.1.
Additional normalization details:
- `labels` -> lowercase strings
- `blocked_by` -> derived from inverse relations where relation type is `blocks`
- `priority` -> integer only (non-integers become null)
- `created_at` and `updated_at` -> parse ISO-8601 timestamps
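These normalization rules can be sketched as a single mapping function. The field names on the raw payload (`inverse_relations`, `createdAt`, and so on) are assumptions about the adapter's raw shape, not part of this specification:

```python
from datetime import datetime

def normalize_issue(raw):
    """Normalize a raw issue payload into the domain shape (sketch)."""
    def parse_ts(value):
        # ISO-8601; tolerate a trailing "Z" UTC designator
        return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None

    priority = raw.get("priority")
    return {
        "labels": [label.lower() for label in raw.get("labels", [])],
        # blocked_by: inverse relations whose relation type is "blocks"
        "blocked_by": [rel["issue_id"] for rel in raw.get("inverse_relations", [])
                       if rel.get("type") == "blocks"],
        # integer only; anything else (including booleans) becomes null
        "priority": priority if isinstance(priority, int)
                    and not isinstance(priority, bool) else None,
        "created_at": parse_ts(raw.get("createdAt")),
        "updated_at": parse_ts(raw.get("updatedAt")),
    }
```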
Recommended error categories:
- `unsupported_tracker_kind`
- `missing_tracker_api_key`
- `missing_tracker_project_slug`
- `linear_api_request` (transport failures)
- `linear_api_status` (non-200 HTTP)
- `linear_graphql_errors`
- `linear_unknown_payload`
- `linear_missing_end_cursor` (pagination integrity error)
Orchestrator behavior on tracker errors:
- Candidate fetch failure: log and skip dispatch for this tick.
- Running-state refresh failure: log and keep active workers running.
- Startup terminal cleanup failure: log warning and continue startup.
Symphony does not require first-class tracker write APIs in the orchestrator.
- Ticket mutations (state transitions, comments, PR metadata) are typically handled by the coding agent using tools defined by the workflow prompt.
- The service remains a scheduler/runner and tracker reader.
- Workflow-specific success often means "reached the next handoff state" (for example `Human Review`) rather than the tracker terminal state `Done`.
- If the optional `linear_graphql` client-side tool extension is implemented, it is still part of the agent toolchain rather than orchestrator business logic.
Inputs to prompt rendering:
- `workflow.prompt_template`
- normalized `issue` object
- optional `attempt` integer (retry/continuation metadata)
- Render with strict variable checking.
- Render with strict filter checking.
- Convert issue object keys to strings for template compatibility.
- Preserve nested arrays/maps (labels, blockers) so templates can iterate.
`attempt` should be passed to the template because the workflow prompt may provide different instructions for:

- first run (`attempt` null or absent)
- continuation run after a successful prior session
- retry after error/timeout/stall
If prompt rendering fails:
- Fail the run attempt immediately.
- Let the orchestrator treat it like any other worker failure and decide retry behavior.
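Strict variable checking can be as simple as letting an unknown placeholder raise. A minimal sketch using Python's built-in formatting (not a full template engine, so strict filter checking is out of scope here):

```python
def render_prompt(template, issue, attempt=None):
    """Render the workflow prompt with strict variable checking.

    An unknown {placeholder} raises KeyError, which the worker treats as an
    immediate run-attempt failure for the orchestrator to handle.
    """
    context = {str(key): value for key, value in issue.items()}  # string keys
    context["attempt"] = attempt
    return template.format_map(context)
```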
Required context fields for issue-related logs:
- `issue_id`
- `issue_identifier`
Required context for coding-agent session lifecycle logs:
- `session_id`
Message formatting requirements:
- Use stable `key=value` phrasing.
- Include action outcome (`completed`, `failed`, `retrying`, etc.).
- Include concise failure reason when present.
- Avoid logging large raw payloads unless necessary.
The spec does not prescribe where logs must go (stderr, file, remote sink, etc.).
Requirements:
- Operators must be able to see startup/validation/dispatch failures without attaching a debugger.
- Implementations may write to one or more sinks.
- If a configured log sink fails, the service should continue running when possible and emit an operator-visible warning through any remaining sink.
If the implementation exposes a synchronous runtime snapshot (for dashboards or monitoring), it should return:
- `running` (list of running session rows)
  - each running row should include `turn_count`
- `retrying` (list of retry queue rows)
- `codex_totals`
  - `input_tokens`
  - `output_tokens`
  - `total_tokens`
  - `seconds_running` (aggregate runtime seconds as of snapshot time, including active sessions)
- `rate_limits` (latest coding-agent rate limit payload, if available)
Recommended snapshot error modes:
- `timeout`
- `unavailable`
A human-readable status surface (terminal output, dashboard, etc.) is optional and implementation-defined.
If present, it should draw from orchestrator state/metrics only and must not be required for correctness.
Token accounting rules:
- Agent events may include token counts in multiple payload shapes.
- Prefer absolute thread totals when available, such as:
  - `thread/tokenUsage/updated` payloads
  - `total_token_usage` within token-count wrapper events
- Ignore delta-style payloads such as `last_token_usage` for dashboard/API totals.
- Extract input/output/total token counts leniently from common field names within the selected payload.
- For absolute totals, track deltas relative to last reported totals to avoid double-counting.
- Do not treat generic `usage` maps as cumulative totals unless the event type defines them that way.
- Accumulate aggregate totals in orchestrator state.
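The delta-tracking rule for absolute totals can be sketched as a small accumulator keyed by thread (the reset branch is an assumption about how re-used thread identifiers should be handled):

```python
class TokenAccumulator:
    """Accumulate aggregate token totals from absolute per-thread totals.

    Absolute totals can be re-reported; only the delta since the last report
    for each thread is added, which avoids double-counting.
    """
    def __init__(self):
        self.last = {}   # thread_id -> last absolute total seen
        self.total = 0   # aggregate total held in orchestrator state

    def on_absolute_total(self, thread_id, total_tokens):
        prev = self.last.get(thread_id, 0)
        if total_tokens >= prev:
            self.total += total_tokens - prev
        else:
            # counter went backwards: treat as a reset (assumption)
            self.total += total_tokens
        self.last[thread_id] = total_tokens
        return self.total
```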
Runtime accounting:
- Runtime should be reported as a live aggregate at snapshot/render time.
- Implementations may maintain a cumulative counter for ended sessions and add active-session elapsed time derived from `running` entries (for example `started_at`) when producing a snapshot/status view.
- Add run duration seconds to the cumulative ended-session runtime when a session ends (normal exit or cancellation/termination).
- Continuous background ticking of runtime totals is not required.
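The snapshot-time aggregate described above needs no background ticking; a sketch (the `started_at` epoch-seconds field on running rows is an assumption):

```python
import time

def runtime_seconds(ended_total, running, now=None):
    """Aggregate runtime at snapshot time: cumulative seconds from ended
    sessions plus elapsed time of each active session."""
    if now is None:
        now = time.time()
    return ended_total + sum(now - row["started_at"] for row in running)
```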
Rate-limit tracking:
- Track the latest rate-limit payload seen in any agent update.
- Any human-readable presentation of rate-limit data is implementation-defined.
Humanized summaries of raw agent protocol events are optional.
If implemented:
- Treat them as observability-only output.
- Do not make orchestrator logic depend on humanized strings.