Atmosphere — Unified @Agent API roadmap — autonomous run report

Session: 2026-04-10 → 2026-04-11 · main branch · worktree-per-phase workflow

Latest push: 19aa501d81 (P0 AiPipeline HITL gap fix + D-9 drafts + P1/P2 follow-ups)

Previous milestones: e271e89ab6 (Phase 12 SK runtime #6, 7/7 CI green), c879ae1779 (Phases 8/10/11 SPI shells, 7/7 CI green)

Commits (chronological)

| # | SHA | Scope | Title |
|---|-----|-------|-------|
| 1 | `5fb5aac966` | Phase 0 | feat(ai): unify HITL approval path across runtime bridges |
| 2 | `db7520a0e3` | Phase 0+docs | docs sweeps + BuiltInAgentRuntime STRUCTURED_OUTPUT README restore |
| 3 | `3ceb2a9679` | ADK HITL | route ADK tool invocation through executeWithApproval (per-request runner rebuild) |
| 4 | `2380ef8e4f` | Phase 1 | promote ai.tokens.* to typed StreamingSession.usage() |
| 5 | `0e01c5e3c7` | Phase 2 | ExecutionHandle SPI + Spring AI Reactor Disposable wrapper |
| 6 | `27cdc9c69d` | Phase 3 | AgentLifecycleListener SPI + fireXxx helpers |
| 7 | `c6b63508e1` | Phase 4 | Content.Audio variant + AgentExecutionContext.parts() |
| 8 | `392565fe68` | Phase 5 | StreamingSession.toolCallDelta default sink |
| 9 | `c83469a478` | Phase 6 | ToolApprovalPolicy sealed interface (AllowAll/DenyAll/Annotated/Custom) |
| 10 | `05a4d25e99` | Phase 7 | capability flags (PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE) |
| 11 | `c879ae1779` | Phase 8/10/11 | EmbeddingRuntime SPI + ToolArgumentValidator + AgentRuntime.models() |
| 12 | `a0ebcc6b6a` | D-1 + D-6 ADK | AgentExecutionResult record + ADK native cancel via AdkEventAdapter |
| 13 | `e271e89ab6` | Phase 12 | Microsoft Semantic Kernel runtime #6 (new `modules/semantic-kernel/`) |
| 14 | `19aa501d81` | P0 review fixes | AiPipeline HITL gap + D-9 + P1 @Deprecated + P2 Embabel note + P2 contract assertion |

Phase 9 (RetryPolicy) was already in tree — see D-8.

CI verification

`e271e89ab6` (Phase 12) — 7/7 green ✅

| Workflow | Run ID | Status |
|----------|--------|--------|
| CI: Core (JDK 21/26) | — | success |
| CI: E2E (Playwright) | — | success |
| CI: Samples | — | success |
| CI: Benchmarks | — | success |
| CI: Dependency Graph | — | success |
| Security: CodeQL | — | success |
| Deploy: SNAPSHOT | — | success |

`19aa501d81` (P0 fix) — in progress

7 workflows running at push time; stop rule in effect — the 09:52 EDT wakeup handles the green check.

Coverage

~60 new tests across the roadmap:
- 15 HITL approval tests (4 runtimes × 3 outcomes + shared helper)
- 6 TokenUsage / StreamingSession sink tests
- 4 ADK approval bridge tests
- 5 ExecutionHandle tests
- 5 AgentLifecycleListener tests
- 5 ToolApprovalPolicy tests
- 4 AgentExecutionResult tests (D-1)
- 4 SemanticKernel smoke tests (Phase 12)
- 3 AiPipeline HITL regression tests (P0 fix, 2026-04-11)

- 7 new SPI types: TokenUsage, ExecutionHandle, AgentLifecycleListener, ToolApprovalPolicy, EmbeddingRuntime, AgentExecutionResult, Content.Audio
- 3 new helpers: ToolArgumentValidator, Content.audio() factory, AiPipeline.tryResolveApproval()
- 5 new capability flags: PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE
- ApprovalGateExecutor deleted: single source of truth via ToolExecutionHelper.executeWithApproval
- 6/6 runtime bridges (LC4j / Spring AI / Koog / ADK / Built-in / Semantic Kernel): 5 route through unified HITL at the bridge layer; Embabel documented as HITL-exempt
- 1 new module: modules/semantic-kernel/ with SK-Java 1.4.0 (semantickernel-api + semantickernel-aiservices-openai, both provided scope)

Decisions queued during the run

D-1 — `AgentExecutionResult` record

Status: RESOLVED in a0ebcc6b6a. Record lives at modules/ai/.../AgentExecutionResult.java. AgentRuntime.generateResult(context) wraps CollectingSession via composition and captures usage() events alongside text. 4 unit tests.
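
The composition described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the tree: `UsageSketch`, `ResultSketch`, `CollectingSessionSketch`, and `GenerateResultSketch` are hypothetical stand-ins, and the real `AgentExecutionResult` and `CollectingSession` shapes are not shown in this report.

```java
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical stand-ins; the real Atmosphere types may differ.
record UsageSketch(long inputTokens, long outputTokens) {}

record ResultSketch(String text, Optional<UsageSketch> usage) {}

/** Collects streamed text chunks and remembers the last usage event seen. */
final class CollectingSessionSketch {
    private final StringBuilder text = new StringBuilder();
    private UsageSketch usage; // stays null when the runtime reports no usage

    void send(String chunk) { text.append(chunk); }

    void usage(UsageSketch u) { usage = u; }

    ResultSketch toResult() {
        return new ResultSketch(text.toString(), Optional.ofNullable(usage));
    }
}

final class GenerateResultSketch {
    /** Wraps a collecting session (composition) and returns text plus usage. */
    static ResultSketch generateResult(Consumer<CollectingSessionSketch> runtime) {
        CollectingSessionSketch session = new CollectingSessionSketch();
        runtime.accept(session); // the runtime streams into the session
        return session.toResult();
    }
}
```

Keeping usage as an `Optional` in the sketch lets a runtime that emits no usage event (such as Embabel, per D-3) still produce a valid result.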

D-2 — Default `usage()` sink translates to legacy keys

Wire-format compatible — existing Micrometer / budget interceptors unchanged.
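
One way to picture the default sink is an interface default method that re-emits untyped keys. The sketch below is an assumption-laden illustration: the `.input`, `.output`, and `.total` key suffixes are invented for the example (the report only names the `ai.tokens.*` prefix), and `sendMetadata`, `TokenUsageSketch`, and both session types are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins. The key suffixes below are assumed for illustration;
// the report only names the ai.tokens.* prefix.
record TokenUsageSketch(long input, long output) {
    long total() { return input + output; }
}

interface StreamingSessionSketch {
    /** Legacy path: untyped metadata keys that existing interceptors read. */
    void sendMetadata(String key, Object value);

    /** Typed path: the default sink re-emits the legacy keys. */
    default void usage(TokenUsageSketch usage) {
        sendMetadata("ai.tokens.input", usage.input());
        sendMetadata("ai.tokens.output", usage.output());
        sendMetadata("ai.tokens.total", usage.total());
    }
}

/** Test double that records whatever reaches the legacy path. */
final class RecordingSessionSketch implements StreamingSessionSketch {
    final Map<String, Object> metadata = new LinkedHashMap<>();

    @Override
    public void sendMetadata(String key, Object value) { metadata.put(key, value); }
}
```

Because the translation lives in the default method, a session implementation that predates the typed API needs no change, and a consumer that reads the untyped keys sees the same values as before.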

D-3 — Skip Embabel for Phase 1 token reporting

Still a follow-up. Reliable wiring requires Embabel JAR inspection; process.blackboard.lastResult() doesn't expose usage directly.

D-4 — Add Koog token reporting (gap fix folded into Phase 1)

Koog's onLLMCallCompleted now reads ResponseMetaInfo.{input,output,total}TokensCount. 5/6 runtimes now report token usage honestly; Embabel is the exception (see D-3).

D-5 — `TOKEN_USAGE` capability flag reversal

Initially skipped (Runtime Truth); Phase 7 re-added alongside the other flags for routing/discovery use cases.

D-6 — Native cancel rollout per runtime

Status: Spring AI shipped in Phase 2; ADK shipped in a0ebcc6b6a via AdkEventAdapter.whenDone() + doExecuteWithHandle. LC4j, Built-in, Koog still on the default ExecutionHandle.completed() — documented per-runtime refactors.

D-7 — `doExecuteWithHandle` template method on `AbstractAgentRuntime`

Single override point shared by execute and executeWithHandle.
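
A minimal sketch of the template-method shape, assuming simplified signatures: `HandleSketch`, `AbstractRuntimeSketch`, and `EchoRuntimeSketch` are hypothetical, and treating `completed()` as a handle whose `cancel()` is a no-op is an assumption about the default described in D-6.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins; signatures are assumed, not copied from the tree.
interface HandleSketch {
    void cancel();

    boolean isCancelled();

    /** Assumed default for runtimes without native cancel: cancel is a no-op. */
    static HandleSketch completed() {
        return new HandleSketch() {
            @Override public void cancel() { }
            @Override public boolean isCancelled() { return false; }
        };
    }
}

abstract class AbstractRuntimeSketch {
    /** Fire-and-forget entry point: same code path, handle dropped. */
    public final void execute(String prompt, Consumer<String> sink) {
        doExecuteWithHandle(prompt, sink);
    }

    /** Entry point for callers that want to cancel. */
    public final HandleSketch executeWithHandle(String prompt, Consumer<String> sink) {
        return doExecuteWithHandle(prompt, sink);
    }

    /** The single override point for subclasses. */
    protected abstract HandleSketch doExecuteWithHandle(String prompt, Consumer<String> sink);
}

/** A runtime with no native cancel returns the completed() default. */
final class EchoRuntimeSketch extends AbstractRuntimeSketch {
    @Override
    protected HandleSketch doExecuteWithHandle(String prompt, Consumer<String> sink) {
        sink.accept("echo: " + prompt);
        return HandleSketch.completed();
    }
}
```

Making both public methods `final` in the sketch keeps a subclass from diverging the two paths, which is the point of having one override.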

D-8 — Phase 9 RetryPolicy was already shipped

Discovered when the worktree's RetryPolicy.java already implemented the interface.

D-9 — Contract parity assertion + BuiltInAgentRuntime STRUCTURED_OUTPUT fix

Status: RESOLVED in 19aa501d81. ChefFamille's drafts (runtimeWithSystemPromptAlsoDeclaresStructuredOutput + BuiltInAgentRuntime.capabilities() declaring STRUCTURED_OUTPUT) pulled from stash@{0} into the P0 commit.

D-10 — 5 doc sweeps landed mid-run

Rebased cleanly.

2026-04-11 Phase 0 review findings → resolution

ChefFamille reviewed the Phase 0 commits end-to-end and found a critical gap that the per-runtime bridge tests and the e2e wire-format test did not catch.

P0 — AiPipeline HITL gap ✅ FIXED in `19aa501d81`

Finding: AiStreamingSession was the only main-source producer of ApprovalStrategy. Every non-websocket entry point (AgentProcessor, CoordinatorProcessor, AgUiHandler, ChannelAiBridge) built AgentExecutionContext via AiPipeline.execute() → the 14-arg constructor → approvalStrategy = null. Runtime bridges correctly routed through executeWithApproval, but the helper fell through to direct execution when the strategy was null. So @RequiresApproval tools silently bypassed gating on A2A / @Coordinator / AG-UI / Slack / Telegram / Discord / WhatsApp / Messenger paths. Violation of Correctness Invariant #7 (Mode Parity).

Fix: AiPipeline now owns an ApprovalRegistry and threads ApprovalStrategy.virtualThread(approvalRegistry) into the 15-arg context constructor on every execute() call. New tryResolveApproval(String) + approvalRegistry() accessors let callers route protocol-specific approval messages back through the parked VT. Regression test AiPipelineHitlTest (3 assertions, 6s total) proves:

- Pipeline threads a non-null strategy into the bridge.
- @RequiresApproval tools hit the gate and time out safely when no client responds.
- Non-@RequiresApproval tools still bypass the gate unchanged.
- tryResolveApproval accepts /__approval/<id>/{approve,deny} messages.

Default behavior on non-websocket paths: safe timeout — tools don't execute unless a channel-specific approval UX wires responses through tryResolveApproval.
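
The gap and the fix can be illustrated with a much-reduced model. Everything in the sketch is hypothetical (`HitlGateSketch`, `Registry`, `Strategy`, `blocking`, the `"DENIED"` return value); it blocks an ordinary thread rather than using `ApprovalStrategy.virtualThread(...)`, and only the null fall-through, the safe timeout, and the `/__approval/<id>/{approve,deny}` message shape are taken from the text above.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins for the types named above.
final class HitlGateSketch {

    /** Pending approvals keyed by id; the parked caller waits on the future. */
    static final class Registry {
        private final Map<String, CompletableFuture<Boolean>> pending = new ConcurrentHashMap<>();

        CompletableFuture<Boolean> register(String id) {
            return pending.computeIfAbsent(id, k -> new CompletableFuture<>());
        }

        void forget(String id) { pending.remove(id); }

        /** Accepts "/__approval/<id>/approve" or ".../deny"; false for anything else. */
        boolean tryResolve(String message) {
            String prefix = "/__approval/";
            if (message == null || !message.startsWith(prefix)) return false;
            String[] parts = message.substring(prefix.length()).split("/");
            if (parts.length != 2) return false;
            boolean approve = parts[1].equals("approve");
            if (!approve && !parts[1].equals("deny")) return false;
            CompletableFuture<Boolean> waiter = pending.remove(parts[0]);
            return waiter != null && waiter.complete(approve);
        }
    }

    interface Strategy { boolean awaitApproval(String id); }

    /** Blocks the caller until a response arrives; a timeout counts as a denial. */
    static Strategy blocking(Registry registry, Duration timeout) {
        return id -> {
            try {
                return registry.register(id).get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            } catch (TimeoutException | ExecutionException e) {
                return false;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            } finally {
                registry.forget(id);
            }
        };
    }

    static String executeWithApproval(String id, boolean requiresApproval,
                                      Strategy strategy, Supplier<String> tool) {
        // The gap: a null strategy fell through to direct execution.
        if (!requiresApproval || strategy == null) return tool.get();
        return strategy.awaitApproval(id) ? tool.get() : "DENIED";
    }
}
```

In this model, calling `executeWithApproval("a", true, null, tool)` runs the tool ungated, which mirrors the pre-fix behavior on non-websocket paths. Passing a strategy built from a registry makes the same call wait, then deny on timeout unless `tryResolve` delivers an approval first.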

P1 — `@Deprecated(forRemoval=true)` on 14-arg constructor ✅ FIXED

Added to AgentExecutionContext with Javadoc pointing at the Phase 0 review gap. Under -Werror, every newly added `new AgentExecutionContext(...)` call with 14 args surfaces as a compile error at PR review. Fixed the one legitimate main-source consumer (OnSessionCloseStrategy: no tools; uses 15-arg with explicit null). 11 legitimate non-HITL callers (LLM judge, test utilities, contract tests, routing tests) got class-level @SuppressWarnings({"deprecation", "removal"}) so the warning only fires on new code.
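
The pattern can be shown with a reduced example. `ContextSketch` and `LegacyCallerSketch` are hypothetical, and the two- and three-argument constructors stand in for the 14- and 15-argument ones; only the annotation and the suppression set are taken from the text above.

```java
// Hypothetical two/three-argument stand-in for the 14/15-argument constructors.
final class ContextSketch {
    final String prompt;
    final String model;
    final Object approvalStrategy; // null means "no HITL gate"

    /** Legacy shape: cannot carry a strategy, so it is flagged for removal. */
    @Deprecated(forRemoval = true)
    ContextSketch(String prompt, String model) {
        this(prompt, model, null);
    }

    /** Preferred shape: the caller must state the strategy, even if it is null. */
    ContextSketch(String prompt, String model, Object approvalStrategy) {
        this.prompt = prompt;
        this.model = model;
        this.approvalStrategy = approvalStrategy;
    }
}

/** A legitimate non-HITL caller opts out at class level. */
@SuppressWarnings({"deprecation", "removal"})
final class LegacyCallerSketch {
    static ContextSketch build() {
        return new ContextSketch("judge this", "some-model");
    }
}
```

javac reports use of a `forRemoval = true` API under its `removal` lint category, which is on by default, so an unsuppressed call to the short constructor becomes a build failure once `-Werror` is set.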

P1 — D-9 drafts ✅ LANDED

ChefFamille's stash@{0} drafts landed in the P0 commit. BuiltInAgentRuntime.capabilities() now declares STRUCTURED_OUTPUT with a Javadoc explaining both mechanisms (pipeline wrapping + native jsonMode). The new contract assertion runtimeWithSystemPromptAlsoDeclaresStructuredOutput() enforces that every runtime declaring SYSTEM_PROMPT also declares STRUCTURED_OUTPUT (Runtime Truth Invariant #5).
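
The parity rule reduces to a simple implication over the declared capability set. The sketch below is hypothetical (`CapabilitySketch` and `ParityAssertionSketch` are invented names, and the real assertion lives in a contract test whose body is not shown here); only the flag names come from this report.

```java
import java.util.Set;

// Hypothetical stand-ins; only the flag names come from this report.
enum CapabilitySketch { TEXT_STREAMING, SYSTEM_PROMPT, STRUCTURED_OUTPUT, TOOL_CALLING }

final class ParityAssertionSketch {
    /** Holds when declaring SYSTEM_PROMPT implies declaring STRUCTURED_OUTPUT. */
    static boolean holds(Set<CapabilitySketch> declared) {
        return !declared.contains(CapabilitySketch.SYSTEM_PROMPT)
                || declared.contains(CapabilitySketch.STRUCTURED_OUTPUT);
    }
}
```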

P2 — Embabel README HITL gap note ✅ DONE

One paragraph in modules/embabel/README.md documenting that Embabel delegates tool execution to its own AgentPlatform so @RequiresApproval is not honored on Embabel-backed flows. Points to workarounds (enforce at @AgentAction level, or route sensitive tools through a different runtime).

P2 — `hitlPendingApprovalEmitsProtocolEvent` contract assertion ✅ DONE

Cross-runtime assertion in AbstractAgentRuntimeContractTest. Any runtime declaring TOOL_CALLING that fails to route through executeWithApproval lights this up. Runtimes without tool calling (Embabel, SK Phase 12) skip cleanly.

Review observations ChefFamille flagged as excellent (kept as-is)

- Spring AI per-request callback capture over ThreadLocal: cleaner than the plan proposed; recommended pattern for future runtimes.
- ADK per-request runner rebuild: noted in Javadoc at AdkAgentRuntime:132-135 as the only way to bind HITL context per invocation. Flagged for a P3 performance benchmark.
- ApprovalGateExecutor fully deleted: zero references remain, confirmed by grep.

Deferred — follow-up work

Phase 12 — SK tool calling

SemanticKernelToolBridge intentionally not shipped in the first Phase 12 commit. SK's @DefineKernelFunction-driven plugin system needs either a compile-time annotation processor or runtime bytecode synthesis to map Atmosphere's dynamic ToolDefinition shape. Phase 12 declares TEXT_STREAMING + SYSTEM_PROMPT + CONVERSATION_MEMORY + TOKEN_USAGE (4/4 for which SK has direct API support). Tool calling is a dedicated follow-up.

Native cancel — LC4j / Built-in / Koog (D-6)

SPI in since Phase 2; Spring AI wired in 0e01c5e3c7; ADK wired in a0ebcc6b6a. Remaining three have per-runtime refactoring notes in the blockers log:

- LC4j: StreamingChatResponseHandler has no direct HTTP cancel; either subclass OpenAiStreamingChatModel or flip a polled flag (soft-cancel at next token boundary).
- Built-in: OpenAiCompatibleClient.doStreamWithToolLoop uses blocking HttpClient.send(…, ofInputStream()); track + close the InputStream from another thread to interrupt SSE.
- Koog: runBlocking { agent.run(...) } → CoroutineScope { async { ... } } + captured Job.cancel().
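
The polled-flag option mentioned for LC4j could look roughly like this. It is a generic sketch, not LangChain4j code: `SoftCancelSketch` is hypothetical and an `Iterator<String>` stands in for the token callbacks.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch of the polled-flag option; not LangChain4j code.
final class SoftCancelSketch {
    private final AtomicBoolean cancelled = new AtomicBoolean();

    /** Safe to call from any thread. */
    void cancel() { cancelled.set(true); }

    /** Forwards tokens until the flag is set; returns how many were forwarded. */
    int stream(Iterator<String> tokens, Consumer<String> sink) {
        int forwarded = 0;
        while (tokens.hasNext()) {
            if (cancelled.get()) break; // checked at each token boundary
            sink.accept(tokens.next());
            forwarded++;
        }
        return forwarded;
    }
}
```

This only stops forwarding; the underlying HTTP stream keeps running until the provider finishes, which is why the note above calls it a soft cancel.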

D-3 — Embabel token reporting

Still unresolved. Requires Embabel JAR inspection that was too expensive in previous runs.

ADK per-request runner rebuild cost (P3)

Rebuilding Gemini + LlmAgent + InMemoryRunner per tool-calling request is correct for HITL context binding but expensive at high QPS. Candidate for a benchmark, and for caching the runner while rebuilding only the tools.

`AgentExecutionContext.retryPolicy()` field threading

RetryPolicy type exists (Phase 9 pre-shipped) but isn't yet referenced from context.

`ToolApprovalPolicy` consumption

Sealed interface shipped in Phase 6 but not yet wired into executeWithApproval — current default is "annotated" (tool.requiresApproval()). Phase 6 follow-up.
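
One possible shape for the eventual wiring is a dispatch over the sealed variants. The sketch is speculative: the four variant names come from the Phase 6 commit title, but their fields, the `DecisionSketch` enum, and the meaning assigned to each variant (in particular `DenyAll` rejecting outright) are assumptions made for illustration.

```java
import java.util.function.Predicate;

// Hypothetical shapes: the four variant names come from the Phase 6 commit
// title, but their fields and semantics here are assumptions.
record ToolSketch(String name, boolean requiresApproval) {}

sealed interface PolicySketch
        permits PolicySketch.AllowAll, PolicySketch.DenyAll,
                PolicySketch.Annotated, PolicySketch.Custom {
    record AllowAll() implements PolicySketch {}
    record DenyAll() implements PolicySketch {}
    record Annotated() implements PolicySketch {}
    record Custom(Predicate<ToolSketch> needsApproval) implements PolicySketch {}
}

enum DecisionSketch { EXECUTE, ASK, REJECT }

final class PolicyDispatchSketch {
    static DecisionSketch decide(PolicySketch policy, ToolSketch tool) {
        if (policy instanceof PolicySketch.AllowAll) return DecisionSketch.EXECUTE;
        if (policy instanceof PolicySketch.DenyAll) return DecisionSketch.REJECT;
        if (policy instanceof PolicySketch.Custom custom) {
            return custom.needsApproval().test(tool) ? DecisionSketch.ASK : DecisionSketch.EXECUTE;
        }
        // Annotated: mirrors today's default, driven by tool.requiresApproval()
        return tool.requiresApproval() ? DecisionSketch.ASK : DecisionSketch.EXECUTE;
    }
}
```

With `Annotated` as the fallthrough branch, a caller that passes the annotated policy gets the same gating decision the current default produces.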

Phase 5 — Native `toolCallDelta` emission

SPI default sink ships; per-runtime native delta wiring (Spring AI partial frames, LC4j onPartialToolExecutionRequest, Koog StreamFrame.ToolCallDelta, ADK streaming events) is per-runtime follow-up work.

State of `main` after `19aa501d81`

- 6/6 runtimes integrated: Built-in, LangChain4j, Spring AI, Koog, ADK, Semantic Kernel
- 12/12 phases + P0 review fixes landed
- All Correctness Invariants satisfied for the websocket + non-websocket paths (Mode Parity closed)
- phase_roadmap_blockers.md is the living log of judgment calls, in-tree memory
- Two secret gists: this report + the Phase 0 plan (c17c13d2bdc3c70f14af82446671cba4)

jfarcand/unified-agent-api-phase-report.md Secret
