@jfarcand
Last active April 11, 2026 13:34
Atmosphere: Unified @agent API phased roadmap — Phases 0-12 + 2026-04-11 Phase 0 review fixes (P0 AiPipeline HITL gap, D-1 AgentExecutionResult, D-6 ADK cancel, D-9 STRUCTURED_OUTPUT, Phase 12 Semantic Kernel #6)

Atmosphere — Unified @Agent API roadmap — autonomous run report

Session: 2026-04-10 → 2026-04-11 · main branch · worktree-per-phase workflow
Latest push: 19aa501d81 (P0 AiPipeline HITL gap fix + D-9 drafts + P1/P2 follow-ups)
Previous milestones: e271e89ab6 (Phase 12 SK runtime #6, 7/7 CI green), c879ae1779 (Phases 8/10/11 SPI shells, 7/7 CI green)


Commits (chronological)

# | SHA | Scope | Title
1 | 5fb5aac966 | Phase 0 | feat(ai): unify HITL approval path across runtime bridges
2 | db7520a0e3 | Phase 0+docs | docs sweeps + BuiltInAgentRuntime STRUCTURED_OUTPUT README restore
3 | 3ceb2a9679 | ADK HITL | route ADK tool invocation through executeWithApproval (per-request runner rebuild)
4 | 2380ef8e4f | Phase 1 | promote ai.tokens.* to typed StreamingSession.usage()
5 | 0e01c5e3c7 | Phase 2 | ExecutionHandle SPI + Spring AI Reactor Disposable wrapper
6 | 27cdc9c69d | Phase 3 | AgentLifecycleListener SPI + fireXxx helpers
7 | c6b63508e1 | Phase 4 | Content.Audio variant + AgentExecutionContext.parts()
8 | 392565fe68 | Phase 5 | StreamingSession.toolCallDelta default sink
9 | c83469a478 | Phase 6 | ToolApprovalPolicy sealed interface (AllowAll/DenyAll/Annotated/Custom)
10 | 05a4d25e99 | Phase 7 | capability flags (PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE)
11 | c879ae1779 | Phase 8/10/11 | EmbeddingRuntime SPI + ToolArgumentValidator + AgentRuntime.models()
12 | a0ebcc6b6a | D-1 + D-6 ADK | AgentExecutionResult record + ADK native cancel via AdkEventAdapter
13 | e271e89ab6 | Phase 12 | Microsoft Semantic Kernel runtime #6 (new modules/semantic-kernel/)
14 | 19aa501d81 | P0 review fixes | AiPipeline HITL gap + D-9 + P1 @Deprecated + P2 Embabel note + P2 contract assertion

Phase 9 (RetryPolicy) was already in tree — see D-8.


CI verification

e271e89ab6 (Phase 12) — 7/7 green ✅

Workflow Run ID Status
CI: Core (JDK 21/26) success
CI: E2E (Playwright) success
CI: Samples success
CI: Benchmarks success
CI: Dependency Graph success
Security: CodeQL success
Deploy: SNAPSHOT success

19aa501d81 (P0 fix) — in progress

7 workflows running at push time; stop rule in effect — the 09:52 EDT wakeup handles the green check.


Coverage

  • ~60 new tests across the roadmap:
    • 15 HITL approval tests (4 runtimes × 3 outcomes + shared helper)
    • 6 TokenUsage / StreamingSession sink tests
    • 4 ADK approval bridge tests
    • 5 ExecutionHandle tests
    • 5 AgentLifecycleListener tests
    • 5 ToolApprovalPolicy tests
    • 4 AgentExecutionResult tests (D-1)
    • 4 SemanticKernel smoke tests (Phase 12)
    • 3 AiPipeline HITL regression tests (P0 fix, 2026-04-11)
  • 7 new SPI types: TokenUsage, ExecutionHandle, AgentLifecycleListener, ToolApprovalPolicy, EmbeddingRuntime, AgentExecutionResult, Content.Audio
  • 3 new helpers: ToolArgumentValidator, Content.audio() factory, AiPipeline.tryResolveApproval()
  • 5 new capability flags: PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE
  • ApprovalGateExecutor deleted — single source of truth via ToolExecutionHelper.executeWithApproval
  • 6/6 runtime bridges (LC4j / Spring AI / Koog / ADK / Built-in / Semantic Kernel) — 5 route through unified HITL at the bridge layer; Embabel documented as HITL-exempt
  • 1 new module: modules/semantic-kernel/ with SK-Java 1.4.0 (semantickernel-api + semantickernel-aiservices-openai, both provided scope)

Decisions queued during the run

D-1 — AgentExecutionResult record

Status: RESOLVED in a0ebcc6b6a. Record lives at modules/ai/.../AgentExecutionResult.java. AgentRuntime.generateResult(context) wraps CollectingSession via composition and captures usage() events alongside text. 4 unit tests.
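The composition described above can be sketched roughly as follows. This is a minimal illustration, not the code in a0ebcc6b6a: the field layout of TokenUsage and AgentExecutionResult, the send() method, and the ResultCapturingSession wrapper name are all assumptions made for the example.

```java
import java.util.Optional;

// Hypothetical stand-ins for the real Atmosphere types; shapes are assumed.
record TokenUsage(long input, long output, long total) {}

record AgentExecutionResult(String text, Optional<TokenUsage> usage) {}

interface StreamingSession {
    void send(String chunk);

    // Typed usage sink; a no-op unless a session chooses to record it.
    default void usage(TokenUsage usage) {}
}

// Collects streamed text only; it ignores usage events.
final class CollectingSession implements StreamingSession {
    private final StringBuilder text = new StringBuilder();

    @Override public void send(String chunk) { text.append(chunk); }

    String text() { return text.toString(); }
}

// Wraps CollectingSession by composition and records the last usage event.
final class ResultCapturingSession implements StreamingSession {
    private final CollectingSession delegate = new CollectingSession();
    private TokenUsage lastUsage;

    @Override public void send(String chunk) { delegate.send(chunk); }

    @Override public void usage(TokenUsage usage) { lastUsage = usage; }

    AgentExecutionResult toResult() {
        return new AgentExecutionResult(delegate.text(), Optional.ofNullable(lastUsage));
    }
}
```

Composition keeps the text collector unchanged; a runtime that never reports usage simply yields an empty Optional.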

D-2 — Default usage() sink translates to legacy keys

Wire-format compatible — existing Micrometer / budget interceptors unchanged.
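A sketch of what such a default sink could look like. Only the ai.tokens.* prefix comes from this report; the key suffixes (input, output, total), the sendMetadata method, and the RecordingSession helper are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape; the real TokenUsage may carry other fields.
record TokenUsage(long input, long output, long total) {}

interface StreamingSession {
    void sendMetadata(String key, Object value);

    // Default sink: re-emit the typed usage under legacy metadata keys so
    // consumers that read those keys keep working. Key suffixes are assumed.
    default void usage(TokenUsage usage) {
        sendMetadata("ai.tokens.input", usage.input());
        sendMetadata("ai.tokens.output", usage.output());
        sendMetadata("ai.tokens.total", usage.total());
    }
}

// Test double that records whatever metadata the session emits.
final class RecordingSession implements StreamingSession {
    final Map<String, Object> metadata = new LinkedHashMap<>();

    @Override public void sendMetadata(String key, Object value) {
        metadata.put(key, value);
    }
}
```

Because the translation lives in a default method, a session that wants the typed event can override usage() while everything else stays on the legacy wire format.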

D-3 — Skip Embabel for Phase 1 token reporting

Still a follow-up. Reliable wiring requires Embabel JAR inspection; process.blackboard.lastResult() doesn't expose usage directly.

D-4 — Add Koog token reporting (gap fix folded into Phase 1)

Koog's onLLMCallCompleted now reads ResponseMetaInfo.{input,output,total}TokensCount. 5 of 6 runtimes now report token usage; Embabel is the exception (see D-3).

D-5 — TOKEN_USAGE capability flag reversal

Initially skipped (Runtime Truth); Phase 7 re-added alongside the other flags for routing/discovery use cases.

D-6 — Native cancel rollout per runtime

Status: Spring AI shipped in Phase 2; ADK shipped in a0ebcc6b6a via AdkEventAdapter.whenDone() + doExecuteWithHandle. LC4j, Built-in, Koog still on the default ExecutionHandle.completed() — documented per-runtime refactors.

D-7 — doExecuteWithHandle template method on AbstractAgentRuntime

Single override point shared by execute and executeWithHandle.
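D-6 and D-7 together suggest a template-method shape along these lines. It is a simplified sketch: the String prompt parameter, the isDone() method, and the two example runtimes are assumptions; only ExecutionHandle.completed(), execute, executeWithHandle, and doExecuteWithHandle are names taken from this report.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

interface ExecutionHandle {
    void cancel();
    boolean isDone();

    // Default handle for runtimes without native cancel: already finished.
    static ExecutionHandle completed() {
        return new ExecutionHandle() {
            @Override public void cancel() {}
            @Override public boolean isDone() { return true; }
        };
    }
}

abstract class AbstractAgentRuntime {
    // Both entry points funnel into the single override point.
    final void execute(String prompt) { doExecuteWithHandle(prompt); }

    final ExecutionHandle executeWithHandle(String prompt) {
        return doExecuteWithHandle(prompt);
    }

    protected abstract ExecutionHandle doExecuteWithHandle(String prompt);
}

// A runtime still on the default: work is done by the time it returns.
final class BlockingRuntime extends AbstractAgentRuntime {
    @Override protected ExecutionHandle doExecuteWithHandle(String prompt) {
        return ExecutionHandle.completed();
    }
}

// A runtime with native cancel: the handle controls in-flight work.
final class CancellableRuntime extends AbstractAgentRuntime {
    final AtomicBoolean cancelled = new AtomicBoolean();

    @Override protected ExecutionHandle doExecuteWithHandle(String prompt) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        return new ExecutionHandle() {
            @Override public void cancel() {
                cancelled.set(true);
                done.cancel(true);
            }
            @Override public boolean isDone() { return done.isDone(); }
        };
    }
}
```

With one override point, moving a runtime from the default handle to native cancel touches a single method.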

D-8 — Phase 9 RetryPolicy was already shipped

Discovered when the worktree's RetryPolicy.java already implemented the interface.

D-9 — Contract parity assertion + BuiltInAgentRuntime STRUCTURED_OUTPUT fix

Status: RESOLVED in 19aa501d81. ChefFamille's drafts (runtimeWithSystemPromptAlsoDeclaresStructuredOutput + BuiltInAgentRuntime.capabilities() declaring STRUCTURED_OUTPUT) pulled from stash@{0} into the P0 commit.

D-10 — 5 doc sweeps landed mid-run

Rebased cleanly.


2026-04-11 Phase 0 review findings → resolution

ChefFamille reviewed the Phase 0 commits end-to-end and found a critical gap that the per-runtime bridge tests and the e2e wire-format test did not catch.

P0 — AiPipeline HITL gap ✅ FIXED in 19aa501d81

Finding: AiStreamingSession was the only main-source producer of ApprovalStrategy. Every non-websocket entry point (AgentProcessor, CoordinatorProcessor, AgUiHandler, ChannelAiBridge) built AgentExecutionContext via AiPipeline.execute() → the 14-arg constructor → approvalStrategy = null. Runtime bridges correctly routed through executeWithApproval, but the helper fell through to direct execution when the strategy was null. So @RequiresApproval tools silently bypassed gating on A2A / @Coordinator / AG-UI / Slack / Telegram / Discord / WhatsApp / Messenger paths. Violation of Correctness Invariant #7 (Mode Parity).

Fix: AiPipeline now owns an ApprovalRegistry and threads ApprovalStrategy.virtualThread(approvalRegistry) into the 15-arg context constructor on every execute() call. New tryResolveApproval(String) + approvalRegistry() accessors let callers route protocol-specific approval messages back through the parked virtual thread. Regression test AiPipelineHitlTest (3 assertions, 6s total) proves:

  1. Pipeline threads a non-null strategy into the bridge.
  2. @RequiresApproval tools hit the gate and time out safely when no client responds.
  3. Non-@RequiresApproval tools still bypass the gate unchanged.
  4. tryResolveApproval accepts /__approval/<id>/{approve,deny} messages.

Default behavior on non-websocket paths: safe timeout — tools don't execute unless a channel-specific approval UX wires responses through tryResolveApproval.
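The gate and its safe-timeout default can be illustrated with a small self-contained sketch. It is not the AiPipeline code: the ApprovalRegistry internals, the ApprovalGate class, the string outcomes, and the method signatures are assumptions; only the /__approval/<id>/{approve,deny} message shape and the timeout behavior come from this report. A plain CompletableFuture stands in for the parked virtual thread.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical registry of pending approvals, keyed by approval id.
final class ApprovalRegistry {
    private final Map<String, CompletableFuture<Boolean>> pending = new ConcurrentHashMap<>();

    CompletableFuture<Boolean> register(String id) {
        return pending.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    void forget(String id) { pending.remove(id); }

    // Accepts "/__approval/<id>/approve" or "/__approval/<id>/deny".
    // Returns true only if the message resolved a pending approval.
    boolean tryResolve(String message) {
        String prefix = "/__approval/";
        if (message == null || !message.startsWith(prefix)) return false;
        String[] parts = message.substring(prefix.length()).split("/");
        if (parts.length != 2) return false;
        boolean approve = parts[1].equals("approve");
        if (!approve && !parts[1].equals("deny")) return false;
        CompletableFuture<Boolean> decision = pending.get(parts[0]);
        return decision != null && decision.complete(approve);
    }
}

final class ApprovalGate {
    // Blocks the calling thread (a virtual thread in the report's design)
    // until a decision arrives or the timeout elapses.
    static String executeWithApproval(ApprovalRegistry registry, String id,
                                      boolean requiresApproval, Duration timeout,
                                      Supplier<String> tool) {
        if (!requiresApproval) return tool.get();   // ungated tools run directly
        CompletableFuture<Boolean> decision = registry.register(id);
        try {
            boolean approved = decision.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return approved ? tool.get() : "denied";
        } catch (TimeoutException e) {
            return "timed-out";                      // safe default: tool never ran
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed";
        } finally {
            registry.forget(id);
        }
    }
}
```

The important property is that the gated branch has no path to tool.get() without an explicit approval, so a channel with no approval UX degrades to a timeout rather than to silent execution.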

P1 — @Deprecated(forRemoval=true) on 14-arg constructor ✅ FIXED

Added to AgentExecutionContext with Javadoc pointing at the Phase 0 review gap. Under -Werror, any newly added call to new AgentExecutionContext(...) with 14 args fails compilation, so it is caught at PR review. Fixed the one legitimate main-source consumer (OnSessionCloseStrategy, which has no tools and now uses the 15-arg constructor with an explicit null). 11 legitimate non-HITL callers (LLM judge, test utilities, contract tests, routing tests) got class-level @SuppressWarnings({"deprecation", "removal"}) so the warning only fires on new code.
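The pattern can be shown on a reduced example. The two-field record below is a hypothetical stand-in for the real 14/15-arg context, and the caller class names are invented for illustration; only the annotation usage mirrors what this section describes.

```java
// Hypothetical two-field stand-in for the real 14/15-arg context.
record AgentExecutionContext(String message, Object approvalStrategy) {

    /**
     * @deprecated leaves approvalStrategy null, which is how gated tools
     *             slipped past approval; pass the strategy explicitly.
     */
    @Deprecated(forRemoval = true)
    AgentExecutionContext(String message) {
        this(message, null);
    }
}

// A legitimate non-HITL caller opts out at class level.
@SuppressWarnings({"deprecation", "removal"})
final class LlmJudgeFixture {
    static AgentExecutionContext context(String message) {
        return new AgentExecutionContext(message);
    }
}

// New code uses the full constructor and states its intent, even when null.
final class SessionCloseCaller {
    static AgentExecutionContext context(String message) {
        return new AgentExecutionContext(message, null);
    }
}
```

Compiled with warnings promoted to errors, any class that calls the short constructor without the suppression fails the build, which is the effect the P1 fix relies on.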

P1 — D-9 drafts ✅ LANDED

ChefFamille's stash@{0} drafts landed in the P0 commit. BuiltInAgentRuntime.capabilities() now declares STRUCTURED_OUTPUT with a Javadoc explaining both mechanisms (pipeline wrapping + native jsonMode). The new contract assertion runtimeWithSystemPromptAlsoDeclaresStructuredOutput() enforces that every runtime declaring SYSTEM_PROMPT also declares STRUCTURED_OUTPUT (Runtime Truth Invariant #5).
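In spirit, the contract assertion is an implication check over each runtime's declared capability set. A minimal sketch, assuming a hypothetical AiCapability enum and a standalone helper rather than the real contract test class:

```java
import java.util.Set;

// Hypothetical subset of the capability flags named in this report.
enum AiCapability { TEXT_STREAMING, SYSTEM_PROMPT, STRUCTURED_OUTPUT, TOOL_CALLING }

final class CapabilityContract {
    // Fails if a runtime declares SYSTEM_PROMPT without STRUCTURED_OUTPUT.
    static void runtimeWithSystemPromptAlsoDeclaresStructuredOutput(
            String runtimeName, Set<AiCapability> declared) {
        if (declared.contains(AiCapability.SYSTEM_PROMPT)
                && !declared.contains(AiCapability.STRUCTURED_OUTPUT)) {
            throw new AssertionError(
                    runtimeName + " declares SYSTEM_PROMPT but not STRUCTURED_OUTPUT");
        }
    }
}
```

A runtime that declares neither flag passes trivially; the check only constrains runtimes that claim SYSTEM_PROMPT.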

P2 — Embabel README HITL gap note ✅ DONE

One paragraph in modules/embabel/README.md documenting that Embabel delegates tool execution to its own AgentPlatform so @RequiresApproval is not honored on Embabel-backed flows. Points to workarounds (enforce at @AgentAction level, or route sensitive tools through a different runtime).

P2 — hitlPendingApprovalEmitsProtocolEvent contract assertion ✅ DONE

Cross-runtime assertion in AbstractAgentRuntimeContractTest. Any runtime declaring TOOL_CALLING that fails to route through executeWithApproval lights this up. Runtimes without tool calling (Embabel, SK Phase 12) skip cleanly.

Review observations ChefFamille flagged as excellent (kept as-is)

  • Spring AI per-request callback capture over ThreadLocal — cleaner than the plan proposed; recommended pattern for future runtimes.
  • ADK per-request runner rebuild — noted in Javadoc at AdkAgentRuntime:132-135 as the only way to bind HITL context per invocation. Flagged for a P3 performance benchmark.
  • ApprovalGateExecutor fully deleted — zero references remain, confirmed by grep.

Deferred — follow-up work

Phase 12 — SK tool calling

SemanticKernelToolBridge intentionally not shipped in the first Phase 12 commit. SK's @DefineKernelFunction-driven plugin system needs either a compile-time annotation processor or runtime bytecode synthesis to map Atmosphere's dynamic ToolDefinition shape. Phase 12 declares TEXT_STREAMING + SYSTEM_PROMPT + CONVERSATION_MEMORY + TOKEN_USAGE (4/4 for which SK has direct API support). Tool calling is a dedicated follow-up.

Native cancel — LC4j / Built-in / Koog (D-6)

SPI in since Phase 2; Spring AI wired in 0e01c5e3c7; ADK wired in a0ebcc6b6a. Remaining three have per-runtime refactoring notes in the blockers log:

  • LC4j: StreamingChatResponseHandler has no direct HTTP cancel; either subclass OpenAiStreamingChatModel or flip a polled flag (soft-cancel at next token boundary).
  • Built-in: OpenAiCompatibleClient.doStreamWithToolLoop uses blocking HttpClient.send(…, ofInputStream()); track + close the InputStream from another thread to interrupt SSE.
  • Koog: replace runBlocking { agent.run(...) } with CoroutineScope { async { ... } } plus a captured Job.cancel().
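The polled-flag option mentioned for LC4j amounts to checking a shared flag at each token boundary. The sketch below shows only that idea with invented names (SoftCancel, stream); it does not use any LangChain4j API, and a list of strings stands in for the token stream.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class SoftCancel {
    // Emits tokens until the flag flips; the check happens at each token
    // boundary, so cancellation takes effect before the next token is sent.
    // Returns the number of tokens actually emitted.
    static int stream(List<String> tokens, AtomicBoolean cancelled, Consumer<String> sink) {
        int emitted = 0;
        for (String token : tokens) {
            if (cancelled.get()) break;
            sink.accept(token);
            emitted++;
        }
        return emitted;
    }
}
```

This stops emission to the client but, as the LC4j note says, does not abort the underlying HTTP request; the provider may keep generating until the stream ends.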

D-3 — Embabel token reporting

Still unresolved. Requires Embabel JAR inspection that was too expensive in previous runs.

ADK per-request runner rebuild cost (P3)

Rebuilding Gemini + LlmAgent + InMemoryRunner per tool-calling request is correct for HITL context binding but expensive at high QPS. Next steps: benchmark the cost, then evaluate caching the runner and rebuilding only the tools per request.

AgentExecutionContext.retryPolicy() field threading

RetryPolicy type exists (Phase 9 pre-shipped) but isn't yet referenced from context.

ToolApprovalPolicy consumption

Sealed interface shipped in Phase 6 but not yet wired into executeWithApproval — current default is "annotated" (tool.requiresApproval()). Phase 6 follow-up.
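One way the four variants could eventually feed the gate is sketched below. The variant names come from Phase 6; everything else is an assumption, including the three-way Decision enum, the meaning given to AllowAll and DenyAll, the decide method, and the reduced ToolDefinition record. The exhaustive pattern switch needs JDK 21 or later.

```java
import java.util.function.Function;

// Hypothetical outcome type: run directly, ask a human, or refuse.
enum Decision { ALLOW, ASK, DENY }

// Reduced stand-in for the real tool definition.
record ToolDefinition(String name, boolean requiresApproval) {}

sealed interface ToolApprovalPolicy {
    record AllowAll() implements ToolApprovalPolicy {}
    record DenyAll() implements ToolApprovalPolicy {}
    record Annotated() implements ToolApprovalPolicy {}
    record Custom(Function<ToolDefinition, Decision> rule) implements ToolApprovalPolicy {}

    // Sealed + records lets the compiler verify every variant is handled.
    default Decision decide(ToolDefinition tool) {
        return switch (this) {
            case AllowAll p -> Decision.ALLOW;
            case DenyAll p -> Decision.DENY;
            // Mirrors the current default: gate only annotated tools.
            case Annotated p -> tool.requiresApproval() ? Decision.ASK : Decision.ALLOW;
            case Custom p -> p.rule().apply(tool);
        };
    }
}
```

Under this reading, today's behavior corresponds to the Annotated variant, so wiring the policy in would not change defaults.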

Phase 5 — Native toolCallDelta emission

SPI default sink ships; per-runtime native delta wiring (Spring AI partial frames, LC4j onPartialToolExecutionRequest, Koog StreamFrame.ToolCallDelta, ADK streaming events) is per-runtime follow-up work.


State of main after 19aa501d81

  • 6/6 runtimes integrated: Built-in, LangChain4j, Spring AI, Koog, ADK, Semantic Kernel
  • 12/12 phases + P0 review fixes landed
  • All Correctness Invariants satisfied for the websocket + non-websocket paths (Mode Parity closed)
  • phase_roadmap_blockers.md is the living log of judgment calls and serves as the in-tree memory of the run
  • Two secret gists: this report + the Phase 0 plan (c17c13d2bdc3c70f14af82446671cba4)