Session: 2026-04-10 → 2026-04-11 · main branch · worktree-per-phase workflow
Latest push: 19aa501d81 (P0 AiPipeline HITL gap fix + D-9 drafts + P1/P2 follow-ups)
Previous milestones: e271e89ab6 (Phase 12 SK runtime #6 — 7/7 CI green), c879ae1779 (Phases 8/10/11 SPI shells — 7/7 CI green)
| # | SHA | Scope | Title |
|---|---|---|---|
| 1 | 5fb5aac966 | Phase 0 | feat(ai): unify HITL approval path across runtime bridges |
| 2 | db7520a0e3 | Phase 0+docs | docs sweeps + BuiltInAgentRuntime STRUCTURED_OUTPUT README restore |
| 3 | 3ceb2a9679 | ADK HITL | route ADK tool invocation through executeWithApproval (per-request runner rebuild) |
| 4 | 2380ef8e4f | Phase 1 | promote ai.tokens.* to typed StreamingSession.usage() |
| 5 | 0e01c5e3c7 | Phase 2 | ExecutionHandle SPI + Spring AI Reactor Disposable wrapper |
| 6 | 27cdc9c69d | Phase 3 | AgentLifecycleListener SPI + fireXxx helpers |
| 7 | c6b63508e1 | Phase 4 | Content.Audio variant + AgentExecutionContext.parts() |
| 8 | 392565fe68 | Phase 5 | StreamingSession.toolCallDelta default sink |
| 9 | c83469a478 | Phase 6 | ToolApprovalPolicy sealed interface (AllowAll/DenyAll/Annotated/Custom) |
| 10 | 05a4d25e99 | Phase 7 | capability flags (PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE) |
| 11 | c879ae1779 | Phase 8/10/11 | EmbeddingRuntime SPI + ToolArgumentValidator + AgentRuntime.models() |
| 12 | a0ebcc6b6a | D-1 + D-6 ADK | AgentExecutionResult record + ADK native cancel via AdkEventAdapter |
| 13 | e271e89ab6 | Phase 12 | Microsoft Semantic Kernel runtime #6 (new modules/semantic-kernel/) |
| 14 | 19aa501d81 | P0 review fixes | AiPipeline HITL gap + D-9 + P1 @Deprecated + P2 Embabel note + P2 contract assertion |
Phase 9 (RetryPolicy) was already in tree — see D-8.
| Workflow | Run ID | Status |
|---|---|---|
| CI: Core (JDK 21/26) | — | success |
| CI: E2E (Playwright) | — | success |
| CI: Samples | — | success |
| CI: Benchmarks | — | success |
| CI: Dependency Graph | — | success |
| Security: CodeQL | — | success |
| Deploy: SNAPSHOT | — | success |
7 workflows running at push time; stop rule in effect — the 09:52 EDT wakeup handles the green check.
- ~60 new tests across the roadmap:
  - 15 HITL approval tests (4 runtimes × 3 outcomes + shared helper)
  - 6 TokenUsage / StreamingSession sink tests
  - 4 ADK approval bridge tests
  - 5 ExecutionHandle tests
  - 5 AgentLifecycleListener tests
  - 5 ToolApprovalPolicy tests
  - 4 AgentExecutionResult tests (D-1)
  - 4 SemanticKernel smoke tests (Phase 12)
  - 3 AiPipeline HITL regression tests (P0 fix, 2026-04-11)
- 7 new SPI types: TokenUsage, ExecutionHandle, AgentLifecycleListener, ToolApprovalPolicy, EmbeddingRuntime, AgentExecutionResult, Content.Audio
- 3 new helpers: ToolArgumentValidator, Content.audio() factory, AiPipeline.tryResolveApproval()
- 5 new capability flags: PROMPT_CACHING, MULTI_AGENT_HANDOFF, CANCELLATION, MODEL_ENUMERATION, TOKEN_USAGE
- ApprovalGateExecutor deleted: single source of truth via ToolExecutionHelper.executeWithApproval
- 6/6 runtime bridges (LC4j / Spring AI / Koog / ADK / Built-in / Semantic Kernel): 5 route through unified HITL at the bridge layer; Embabel documented as HITL-exempt
- 1 new module: modules/semantic-kernel/ with SK-Java 1.4.0 (semantickernel-api + semantickernel-aiservices-openai, both provided scope)
Status: RESOLVED in a0ebcc6b6a. Record lives at modules/ai/.../AgentExecutionResult.java. AgentRuntime.generateResult(context) wraps CollectingSession via composition and captures usage() events alongside text. 4 unit tests.
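As a rough illustration of the "wraps CollectingSession via composition" idea, the sketch below collects text through a delegate and remembers the last usage event next to it. Every shape here (the two-field TokenUsage, the send/usage method names, the ResultSession wrapper) is a simplified assumption for illustration, not the signatures in the tree.

```java
final class ResultSketch {

    /** Hypothetical shapes; the real records likely carry more fields. */
    record TokenUsage(long inputTokens, long outputTokens) {}
    record AgentExecutionResult(String text, TokenUsage usage) {}

    /** Minimal stand-in for the streaming sink a runtime writes into. */
    interface StreamingSession {
        void send(String chunk);
        void usage(TokenUsage usage);
    }

    /** Collects text only, the way a plain collector might. */
    static final class CollectingSession implements StreamingSession {
        private final StringBuilder text = new StringBuilder();
        @Override public void send(String chunk) { text.append(chunk); }
        @Override public void usage(TokenUsage usage) { /* ignored by the plain collector */ }
        String text() { return text.toString(); }
    }

    /** Wraps the collector by composition and also remembers the last usage event. */
    static final class ResultSession implements StreamingSession {
        private final CollectingSession delegate = new CollectingSession();
        private TokenUsage lastUsage; // stays null if the runtime never reports usage

        @Override public void send(String chunk) { delegate.send(chunk); }
        @Override public void usage(TokenUsage usage) {
            lastUsage = usage;
            delegate.usage(usage);
        }
        AgentExecutionResult result() {
            return new AgentExecutionResult(delegate.text(), lastUsage);
        }
    }
}
```

Composition keeps the collector untouched, so existing callers that only want text see no behavior change.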
Wire-format compatible — existing Micrometer / budget interceptors unchanged.
Still a follow-up. Reliable wiring requires Embabel JAR inspection; process.blackboard.lastResult() doesn't expose usage directly.
Koog's onLLMCallCompleted now reads ResponseMetaInfo.{input,output,total}TokensCount. 5/6 runtimes honest.
Initially skipped (Runtime Truth); Phase 7 re-added alongside the other flags for routing/discovery use cases.
Status: Spring AI shipped in Phase 2; ADK shipped in a0ebcc6b6a via AdkEventAdapter.whenDone() + doExecuteWithHandle. LC4j, Built-in, Koog still on the default ExecutionHandle.completed() — documented per-runtime refactors.
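The difference between the default handle and a natively wired one can be sketched as follows. The method names cancel() and isDone() and the wrapping(Runnable) factory are assumptions made for this example; only ExecutionHandle.completed() is named in this report, and the real Spring AI wrapper holds a Reactor Disposable rather than a Runnable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ExecutionHandleSketch {

    /** Hypothetical minimal handle; the real SPI may expose more. */
    interface ExecutionHandle {
        void cancel();
        boolean isDone();

        /** Default for runtimes without native cancel: already finished, cancel is a no-op. */
        static ExecutionHandle completed() {
            return new ExecutionHandle() {
                @Override public void cancel() { /* nothing to cancel */ }
                @Override public boolean isDone() { return true; }
            };
        }
    }

    /** Wraps a runtime-specific cancel action (for example a subscription's dispose). */
    static ExecutionHandle wrapping(Runnable nativeCancel) {
        AtomicBoolean done = new AtomicBoolean();
        return new ExecutionHandle() {
            @Override public void cancel() {
                // Run the native cancel at most once, even under repeated calls.
                if (done.compareAndSet(false, true)) {
                    nativeCancel.run();
                }
            }
            @Override public boolean isDone() { return done.get(); }
        };
    }
}
```

A runtime still on completed() accepts cancel() calls but cannot stop in-flight work, which is why the remaining three are tracked as refactors.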
Single override point shared by execute and executeWithHandle.
Discovered when the worktree's RetryPolicy.java already implemented the interface.
Status: RESOLVED in 19aa501d81. ChefFamille's drafts (runtimeWithSystemPromptAlsoDeclaresStructuredOutput + BuiltInAgentRuntime.capabilities() declaring STRUCTURED_OUTPUT) pulled from stash@{0} into the P0 commit.
Rebased cleanly.
ChefFamille reviewed the Phase 0 commits end-to-end and found a critical gap that the per-runtime bridge tests and the e2e wire-format test did not catch.
Finding: AiStreamingSession was the only main-source producer of ApprovalStrategy. Every non-websocket entry point (AgentProcessor, CoordinatorProcessor, AgUiHandler, ChannelAiBridge) built AgentExecutionContext via AiPipeline.execute() → the 14-arg constructor → approvalStrategy = null. Runtime bridges correctly routed through executeWithApproval, but the helper fell through to direct execution when the strategy was null. So @RequiresApproval tools silently bypassed gating on A2A / @Coordinator / AG-UI / Slack / Telegram / Discord / WhatsApp / Messenger paths. Violation of Correctness Invariant #7 (Mode Parity).
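The fall-through described in the finding can be reproduced in a few lines. This is a simplified model, assuming a one-method ApprovalStrategy and a string-returning tool body; the real helper and strategy types are richer.

```java
import java.util.function.Supplier;

final class NullStrategyGapSketch {

    /** Hypothetical one-method strategy: true means the call was approved. */
    interface ApprovalStrategy {
        boolean awaitApproval(String toolName);
    }

    /** Models the helper's pre-fix behavior as described in the finding. */
    static String executeWithApproval(String toolName, boolean requiresApproval,
                                      ApprovalStrategy strategy, Supplier<String> tool) {
        if (!requiresApproval) {
            return tool.get(); // ungated tools always run directly
        }
        if (strategy == null) {
            return tool.get(); // the gap: a gated tool runs because nobody supplied a strategy
        }
        return strategy.awaitApproval(toolName) ? tool.get() : "DENIED";
    }
}
```

With a null strategy a gated call returns the tool's result; with any non-null strategy that declines, the same call returns "DENIED". That is why bridge tests that always supplied a strategy could not see the problem.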
Fix: AiPipeline now owns an ApprovalRegistry and threads ApprovalStrategy.virtualThread(approvalRegistry) into the 15-arg context constructor on every execute() call. New tryResolveApproval(String) + approvalRegistry() accessors let callers route protocol-specific approval messages back through the parked VT. Regression test AiPipelineHitlTest (3 assertions, 6s total) proves:
- Pipeline threads a non-null strategy into the bridge.
- @RequiresApproval tools hit the gate and time out safely when no client responds.
- Non-@RequiresApproval tools still bypass the gate unchanged.
- tryResolveApproval accepts /__approval/<id>/{approve,deny} messages.
Default behavior on non-websocket paths: safe timeout — tools don't execute unless a channel-specific approval UX wires responses through tryResolveApproval.
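A minimal sketch of how /__approval/<id>/{approve,deny} messages could be matched against parked approvals, with the safe-timeout default. The registry internals here (a map of CompletableFuture, a register method, deny-on-timeout via completeOnTimeout) are assumptions for illustration, not the actual ApprovalRegistry.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ApprovalMessageSketch {

    private static final Pattern APPROVAL =
            Pattern.compile("^/__approval/([^/]+)/(approve|deny)$");

    private final Map<String, CompletableFuture<Boolean>> pending = new ConcurrentHashMap<>();

    /** Parks an approval; it completes with false (deny) if nobody answers in time. */
    CompletableFuture<Boolean> register(String id, long timeoutMillis) {
        CompletableFuture<Boolean> future = new CompletableFuture<Boolean>()
                .completeOnTimeout(false, timeoutMillis, TimeUnit.MILLISECONDS);
        pending.put(id, future);
        return future;
    }

    /** Returns true only if the message was an approval message for a pending id. */
    boolean tryResolveApproval(String message) {
        if (message == null) {
            return false;
        }
        Matcher m = APPROVAL.matcher(message.trim());
        if (!m.matches()) {
            return false; // ordinary chat message, let the caller handle it
        }
        CompletableFuture<Boolean> future = pending.remove(m.group(1));
        if (future == null) {
            return false; // unknown or already resolved id
        }
        future.complete("approve".equals(m.group(2)));
        return true;
    }
}
```

A channel bridge would call tryResolveApproval on each inbound message first and fall back to normal processing when it returns false.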
@Deprecated added to the 14-arg AgentExecutionContext constructor, with Javadoc pointing at the Phase 0 review gap. Under -Werror, any newly written new AgentExecutionContext(...) call with 14 args surfaces a compile error at PR review. Fixed the one legitimate main-source consumer (OnSessionCloseStrategy, which has no tools and now uses the 15-arg constructor with an explicit null). 11 legitimate non-HITL callers (LLM judge, test utilities, contract tests, routing tests) got class-level @SuppressWarnings({"deprecation", "removal"}) so the warning only fires on new code.
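The pattern can be sketched with a two-field stand-in for the real context. The use of forRemoval = true is an assumption inferred from the "removal" suppression token; the field names and the LegacyCaller class are hypothetical.

```java
final class DeprecatedCtorSketch {

    /** Two-field stand-in for the real 14/15-argument context. */
    static final class Context {
        final String prompt;
        final Object approvalStrategy; // hypothetical placeholder type

        /** Preferred constructor: the caller must state the approval strategy, even if null. */
        Context(String prompt, Object approvalStrategy) {
            this.prompt = prompt;
            this.approvalStrategy = approvalStrategy;
        }

        /**
         * @deprecated leaves approvalStrategy null, so approval gating is silently skipped.
         *             Use the constructor that takes the strategy explicitly.
         */
        @Deprecated(forRemoval = true) // forRemoval is an assumption in this sketch
        Context(String prompt) {
            this(prompt, null);
        }
    }

    /** A legitimate non-HITL caller opts out once, at class level. */
    @SuppressWarnings({"deprecation", "removal"})
    static final class LegacyCaller {
        static Context build() {
            return new Context("judge this");
        }
    }
}
```

In this single-file sketch the suppression is redundant, since javac does not warn about uses inside the same outermost class. It matters once the caller lives in another file, which is the situation described above.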
ChefFamille's stash@{0} drafts landed in the P0 commit. BuiltInAgentRuntime.capabilities() now declares STRUCTURED_OUTPUT with a Javadoc explaining both mechanisms (pipeline wrapping + native jsonMode). The new contract assertion runtimeWithSystemPromptAlsoDeclaresStructuredOutput() enforces that every runtime declaring SYSTEM_PROMPT also declares STRUCTURED_OUTPUT (Runtime Truth Invariant #5).
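The contract assertion reduces to a logical implication over the declared capability set. A minimal sketch, assuming a trimmed Capability enum and a plain boolean check rather than the real contract-test harness:

```java
import java.util.Set;

final class CapabilityContractSketch {

    /** Trimmed, illustrative subset of the capability flags. */
    enum Capability { TEXT_STREAMING, SYSTEM_PROMPT, STRUCTURED_OUTPUT, TOOL_CALLING }

    /** True when the set satisfies "SYSTEM_PROMPT implies STRUCTURED_OUTPUT". */
    static boolean systemPromptImpliesStructuredOutput(Set<Capability> declared) {
        return !declared.contains(Capability.SYSTEM_PROMPT)
                || declared.contains(Capability.STRUCTURED_OUTPUT);
    }
}
```

A runtime that declares neither flag passes trivially; only SYSTEM_PROMPT without STRUCTURED_OUTPUT fails.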
One paragraph in modules/embabel/README.md documenting that Embabel delegates tool execution to its own AgentPlatform so @RequiresApproval is not honored on Embabel-backed flows. Points to workarounds (enforce at @AgentAction level, or route sensitive tools through a different runtime).
Cross-runtime assertion in AbstractAgentRuntimeContractTest. Any runtime declaring TOOL_CALLING that fails to route through executeWithApproval lights this up. Runtimes without tool calling (Embabel, SK Phase 12) skip cleanly.
- Spring AI per-request callback capture over ThreadLocal: cleaner than the plan proposed; recommended pattern for future runtimes.
- ADK per-request runner rebuild: noted in Javadoc at AdkAgentRuntime:132-135 as the only way to bind HITL context per invocation. Flagged for a P3 performance benchmark.
- ApprovalGateExecutor fully deleted: zero references remain, confirmed by grep.
SemanticKernelToolBridge intentionally not shipped in the first Phase 12 commit. SK's @DefineKernelFunction-driven plugin system needs either a compile-time annotation processor or runtime bytecode synthesis to map Atmosphere's dynamic ToolDefinition shape. Phase 12 declares TEXT_STREAMING + SYSTEM_PROMPT + CONVERSATION_MEMORY + TOKEN_USAGE (4/4 for which SK has direct API support). Tool calling is a dedicated follow-up.
SPI in since Phase 2; Spring AI wired in 0e01c5e3c7; ADK wired in a0ebcc6b6a. Remaining three have per-runtime refactoring notes in the blockers log:
- LC4j: StreamingChatResponseHandler has no direct HTTP cancel; either subclass OpenAiStreamingChatModel or flip a polled flag (soft-cancel at next token boundary).
- Built-in: OpenAiCompatibleClient.doStreamWithToolLoop uses blocking HttpClient.send(…, ofInputStream()); track + close the InputStream from another thread to interrupt SSE.
- Koog: runBlocking { agent.run(...) } → CoroutineScope { async { ... } } + captured Job.cancel().
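The "polled flag" option from the LC4j note can be sketched independently of any LLM library. This is an illustration of soft-cancel at token boundaries only; the class and method names are hypothetical, and it does not abort the underlying HTTP request.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class SoftCancelSketch {

    private final AtomicBoolean cancelled = new AtomicBoolean();

    /** Safe to call from any thread; takes effect at the next token boundary. */
    void cancel() {
        cancelled.set(true);
    }

    /** Forwards tokens until cancel() is observed; returns how many were forwarded. */
    int stream(Iterable<String> tokens, Consumer<String> sink) {
        int forwarded = 0;
        for (String token : tokens) {
            if (cancelled.get()) {
                break; // stop forwarding; the upstream producer may still be running
            }
            sink.accept(token);
            forwarded++;
        }
        return forwarded;
    }
}
```

The trade-off is latency and cost: tokens already in flight are still produced upstream, which is what the subclassing option would avoid.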
Still unresolved. Requires Embabel JAR inspection that was too expensive in previous runs.
Rebuilding Gemini + LlmAgent + InMemoryRunner per tool-calling request is correct for HITL context binding but expensive at high QPS. Benchmark + cache-the-runner-and-only-rebuild-tools candidate.
RetryPolicy type exists (Phase 9 pre-shipped) but isn't yet referenced from context.
Sealed interface shipped in Phase 6 but not yet wired into executeWithApproval — current default is "annotated" (tool.requiresApproval()). Phase 6 follow-up.
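One way the four variants could drive a decision once wired in, sketched with Java 21 pattern matching. The semantics assumed here (AllowAll runs without asking, DenyAll rejects, Annotated defers to tool.requiresApproval(), Custom delegates to a predicate) and the Decision enum are guesses for illustration; the real interface may define them differently.

```java
import java.util.function.Predicate;

final class ApprovalPolicySketch {

    /** Hypothetical tool shape carrying the annotation-derived flag. */
    record Tool(String name, boolean requiresApproval) {}

    sealed interface ToolApprovalPolicy permits AllowAll, DenyAll, Annotated, Custom {}
    record AllowAll() implements ToolApprovalPolicy {}
    record DenyAll() implements ToolApprovalPolicy {}
    record Annotated() implements ToolApprovalPolicy {}
    record Custom(Predicate<Tool> needsApproval) implements ToolApprovalPolicy {}

    enum Decision { RUN, ASK, REJECT }

    /** Exhaustive switch: adding a fifth variant fails compilation until handled here. */
    static Decision decide(ToolApprovalPolicy policy, Tool tool) {
        return switch (policy) {
            case AllowAll allowAll -> Decision.RUN;
            case DenyAll denyAll -> Decision.REJECT;
            case Annotated annotated -> tool.requiresApproval() ? Decision.ASK : Decision.RUN;
            case Custom custom -> custom.needsApproval().test(tool) ? Decision.ASK : Decision.RUN;
        };
    }
}
```

Under these assumptions the Annotated branch reproduces today's default, so wiring the policy in would be behavior-preserving unless a caller picks another variant.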
SPI default sink ships; per-runtime native delta wiring (Spring AI partial frames, LC4j onPartialToolExecutionRequest, Koog StreamFrame.ToolCallDelta, ADK streaming events) is per-runtime follow-up work.
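A default sink in this sense is an interface default method that existing sessions inherit as a no-op. The sketch below assumes a two-argument toolCallDelta(toolCallId, argumentsFragment) signature, which is a guess rather than the shipped one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ToolCallDeltaSketch {

    /** Hypothetical minimal session interface. */
    interface StreamingSession {
        void send(String text);

        /** Default sink: partial tool-call arguments are dropped unless overridden. */
        default void toolCallDelta(String toolCallId, String argumentsFragment) {
            // intentionally empty
        }
    }

    /** Existing implementation: compiles unchanged and ignores deltas. */
    static final class TextOnlySession implements StreamingSession {
        final StringBuilder text = new StringBuilder();
        @Override public void send(String t) { text.append(t); }
    }

    /** Opt-in implementation: accumulates argument fragments per tool call. */
    static final class DeltaAwareSession implements StreamingSession {
        final Map<String, StringBuilder> partialArgs = new LinkedHashMap<>();
        @Override public void send(String t) { /* text not needed in this sketch */ }
        @Override public void toolCallDelta(String toolCallId, String argumentsFragment) {
            partialArgs.computeIfAbsent(toolCallId, k -> new StringBuilder())
                       .append(argumentsFragment);
        }
    }
}
```

This is why the SPI could ship ahead of the per-runtime wiring: runtimes that never emit deltas and sessions that never consume them both keep working.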
- 6/6 runtimes integrated: Built-in, LangChain4j, Spring AI, Koog, ADK, Semantic Kernel
- 12/12 phases + P0 review fixes landed
- All Correctness Invariants satisfied for the websocket + non-websocket paths (Mode Parity closed)
- phase_roadmap_blockers.md is the living log of judgment calls, in-tree memory
- Two secret gists: this report + the Phase 0 plan (c17c13d2bdc3c70f14af82446671cba4)