Date: July 2025
Scope: Atmosphere 4.0.x vs. 18+ agent frameworks (Python, TypeScript, Java, .NET) + personal agents
Method: Line-by-line source code reading of Atmosphere + actual GitHub source of competing frameworks
- Executive Summary
- Atmosphere Architecture Deep Dive
- Claude Code Agent SDK Deep Dive
- Competing Framework Source-Level Analysis
- atmosphere-skills Ecosystem
- GAP Analysis (22 Gaps)
- What Atmosphere Leads On
- Recommended Roadmap
Atmosphere is best-in-class for: runtime portability (7 agent backends swappable via Maven), multi-protocol convergence (MCP + A2A + AG-UI + gRPC + WebTransport), durable HITL with virtual-thread-parked approval gates, multi-transport streaming with auto-fallback, fan-out strategies (AllResponses/FirstComplete/FastestStreamingTexts), and circuit-breaker model routing.
22 gaps identified after exhaustive code-level comparison against LangGraph, CrewAI, AutoGen/AG2, Pydantic AI, Smolagents, Google ADK, OpenAI Agents SDK, Mastra, BeeAgent, Vercel AI SDK, Spring AI, LangChain4j, Semantic Kernel, Claude Code Agent SDK, OpenClaw, and NagaAgent.
Key finding: Atmosphere's core implementation depth (tool approval, agent loop, checkpoint store) is more sophisticated than initial analysis suggested. The gaps are real but narrower than surface-level feature comparison implies.
File: modules/ai/src/main/java/org/atmosphere/ai/llm/OpenAiCompatibleClient.java
Lines: 191–364 (174 lines of core loop logic)
The agent loop is a recursive streaming loop with hard limits:
MAX_TOOL_ROUNDS = 5 (line 64)
Execution sequence (traced from source):
- Request Phase (lines 197–214): Decides between the OpenAI Responses API (if `responseIdCache` has a prior ID for this conversation) or the Chat Completions API. The Responses API path enables stateful multi-turn where OpenAI manages message history server-side – a significant optimization that no other Java framework implements.
- HTTP Send (line 215): `sendWithRetry(requestBody, endpoint, session, request.retryPolicy())` – per-request retry with `Retry-After` header respect on 429s. The retry policy is injectable per-call (lines 368–442), not just global.
- SSE Streaming (lines 239–287): Line-by-line SSE parsing with `ToolCallAccumulator` for incremental assembly of tool call arguments that arrive across multiple SSE chunks. The in-flight `InputStream` is threaded to the caller via an `AtomicReference<Closeable> streamSink` (lines 249–251) – this is how hard cancellation works: the caller closes the HTTP stream from another thread.
- Tool Execution (lines 295–364): For each accumulated tool call:
  - Emits `AiEvent.ToolStart(toolName, args)` (line 326)
  - Fires `AgentLifecycleListener.fireToolCall()` (lines 327–328) – listeners can observe but not block
  - Calls `ToolExecutionHelper.executeWithApproval()` (lines 341–343) – the approval gate
  - Emits `AiEvent.ToolResult(toolName, resultStr)` (line 344)
  - Adds the tool result as `ChatMessage.tool()` to message history
- Recursion (lines 350–361): Re-invokes `doStreamWithToolLoop()` with round+1, carrying updated messages. Terminates when round >= MAX_TOOL_ROUNDS or no tool calls appear in the response.
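The termination logic of this loop can be sketched as follows. This is a minimal illustration with hypothetical names (`ToolLoopSketch`, `runRound`, scripted responses standing in for model output), not Atmosphere's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the recursive tool loop traced above.
// Names are illustrative, not Atmosphere's API.
public class ToolLoopSketch {
    static final int MAX_TOOL_ROUNDS = 5; // hard limit, as in OpenAiCompatibleClient

    // Each "model response" is a list of tool calls; an empty list = plain text answer.
    public static int runRound(List<List<String>> scriptedResponses, int round, List<String> executed) {
        if (round >= MAX_TOOL_ROUNDS) return round;             // hard stop at the round limit
        List<String> toolCalls = scriptedResponses.get(Math.min(round, scriptedResponses.size() - 1));
        if (toolCalls.isEmpty()) return round;                  // no tool calls -> terminate
        for (String call : toolCalls) {
            executed.add(call);                                 // stand-in for approval + execution
        }
        return runRound(scriptedResponses, round + 1, executed); // recurse with round+1
    }
}
```

The key property is that a model which never stops requesting tools is cut off after MAX_TOOL_ROUNDS rather than looping forever.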
Hard cancellation mechanism: BuiltInAgentRuntime.doExecuteWithHandle() (lines 91–122) creates an AtomicBoolean cancelled + AtomicReference<Closeable> inFlightStream. The cancel() method sets the flag AND closes the HTTP stream, causing the SSE loop to exit with an IOException, which is caught at lines 163–188 and completed cleanly.
Multi-modal support: The last user message is transformed into OpenAI multi-content array format (lines 638–650) supporting Content.Image, Content.Audio, and mixed content.
Prompt caching: CacheHint → prompt_cache_key JSON field (lines 161–166), forwarding cached_tokens in usage (lines 564–580).
File: modules/ai/src/main/java/org/atmosphere/ai/tool/ToolExecutionHelper.java
Lines: 144–209 (66 lines)
This is Atmosphere's most security-critical code path. Traced step-by-step:
- Argument validation (lines 149–153): `ToolArgumentValidator.validate(tool, args)` runs at the boundary before any execution. Validation errors return structured JSON so the LLM can retry with corrected arguments.
- Policy resolution (line 154): Uses the supplied `ToolApprovalPolicy` or defaults to `annotated()` (checks the `@RequiresApproval` annotation).
- Fast-path (lines 156–158): If the policy says no approval is needed → execute directly. No overhead.
- DenyAll (lines 165–168): If the policy is `DenyAll` → reject immediately (security: fail-closed).
- Fail-closed when no strategy (lines 176–182): Tool requires approval but no `ApprovalStrategy` is wired → returns error JSON. This was explicitly changed from previous fail-open behavior. This is more secure than any competing framework.
- Virtual thread parking (lines 184–192): Creates a `PendingApproval` with a unique ID + timeout (default 300s from `@RequiresApproval.timeout()`). Calls `strategy.awaitApproval(approval, session)`, which parks the virtual thread via `CompletableFuture.get(timeout)` – cheap on Loom, doesn't pin an OS thread.
- Outcome handling (lines 195–208): `APPROVED` → execute tool. `DENIED` → return cancellation JSON. `TIMED_OUT` → return timeout JSON.
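The fail-closed ordering above can be condensed into a sketch. Everything here (`ApprovalGateSketch`, `ApprovalStrategy`, the JSON strings) is a hypothetical simplification of the traced flow, not ToolExecutionHelper's real signatures:

```java
import java.util.concurrent.*;

// Sketch of the fail-closed approval gate described above.
public class ApprovalGateSketch {

    public interface ApprovalStrategy {
        CompletableFuture<Boolean> awaitApproval(String toolName);
    }

    // Mirrors the traced order: fast path, then fail-closed when no strategy is wired.
    public static String execute(String toolName, boolean requiresApproval,
                                 ApprovalStrategy strategy, long timeoutMs) {
        if (!requiresApproval) return run(toolName);            // fast path: no overhead
        if (strategy == null)                                   // fail-closed, never fail-open
            return "{\"error\":\"approval required but no strategy wired\"}";
        try {
            // On a virtual thread this get() parks cheaply instead of pinning an OS thread.
            boolean approved = strategy.awaitApproval(toolName).get(timeoutMs, TimeUnit.MILLISECONDS);
            return approved ? run(toolName) : "{\"status\":\"cancelled\"}";
        } catch (TimeoutException e) {
            return "{\"status\":\"timed_out\"}";
        } catch (Exception e) {
            return "{\"status\":\"cancelled\"}";
        }
    }

    private static String run(String toolName) { return "{\"result\":\"" + toolName + " ok\"}"; }
}
```

Note how every error path returns structured JSON rather than throwing, so the LLM receives something it can act on.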
ApprovalRegistry (lines 51–150): The client-facing approval flow works via message pattern matching: /__approval/{id}/approve or /__approval/{id}/deny. The resolve() method (lines 83–108) validates the prefix, extracts the ID, removes it from the pending map, and completes the CompletableFuture<Boolean>. Thread-safe via ConcurrentHashMap.
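The message-routing convention can be sketched as below. This is a hypothetical condensation of the described resolve() behavior (remove-then-complete for one-shot semantics), not the real ApprovalRegistry:

```java
import java.util.Map;
import java.util.concurrent.*;

// Sketch of the /__approval/{id}/approve message-routing convention described above.
public class ApprovalRoutingSketch {
    private final Map<String, CompletableFuture<Boolean>> pending = new ConcurrentHashMap<>();

    public CompletableFuture<Boolean> register(String id) {
        CompletableFuture<Boolean> f = new CompletableFuture<>();
        pending.put(id, f);
        return f;
    }

    // Returns true if the message matched a pending approval and resolved it.
    public boolean resolve(String message) {
        if (!message.startsWith("/__approval/")) return false;
        String[] parts = message.split("/");                     // ["", "__approval", id, verb]
        if (parts.length != 4) return false;
        CompletableFuture<Boolean> f = pending.remove(parts[2]); // remove-then-complete: one shot
        if (f == null) return false;
        f.complete("approve".equals(parts[3]));
        return true;
    }
}
```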
Comparison: OpenAI Agents SDK has MCPToolApprovalFunction (synchronous blocking). Claude Code has 6 permission modes. LangGraph has NO approval framework. CrewAI has NO approval framework. Atmosphere's is the most sophisticated: fail-closed default, virtual-thread-aware, timeout-bounded, structured denial responses.
File: modules/ai/src/main/java/org/atmosphere/ai/PromptLoader.java
Lines: 108–254 (147 lines of resolution logic)
Three-tier resolution with SHA-256 integrity:
Tier 1 β Classpath: META-INF/skills/{name}/SKILL.md, prompts/{name}-skill.md, prompts/{name}.md
Tier 2 β Disk cache: ~/.atmosphere/skills/{name}/SKILL.md
Tier 3 β GitHub: raw.githubusercontent.com/{repo}/{branch}/skills/{name}/SKILL.md
SHA-256 integrity (lines 216–254): When fetching from GitHub, computes the SHA-256 hash of the downloaded content and verifies it against registry.json hashes. Configurable via system properties:
- `atmosphere.skills.repo` → GitHub org/repo (default: `Atmosphere/atmosphere-skills`)
- `atmosphere.skills.branch` → branch (default: `main`)
- `atmosphere.skills.offline` → disable GitHub fetch
Cache behavior (lines 106–113): ConcurrentHashMap with a NOT_FOUND_SENTINEL pattern – prevents repeated failed lookups from hammering GitHub.
Graceful degradation (line 137): If skill not found anywhere, falls back to "You are a helpful assistant." with warning log. This means agents always start, even with missing skill files.
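The tiered lookup, hash check, and final fallback can be sketched together. The tier lookups are modeled as plain functions here; all names (`SkillResolutionSketch`, the `Function` parameters) are illustrative, not PromptLoader's actual signatures:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.function.Function;

// Sketch of three-tier skill resolution with SHA-256 verification, as described above.
public class SkillResolutionSketch {
    public static String resolve(String name,
                                 Function<String, String> classpath,
                                 Function<String, String> diskCache,
                                 Function<String, String> github,
                                 Map<String, String> registryHashes) {
        String content = classpath.apply(name);                 // Tier 1: classpath
        if (content == null) content = diskCache.apply(name);   // Tier 2: ~/.atmosphere cache
        if (content == null) {
            content = github.apply(name);                       // Tier 3: GitHub raw
            if (content != null) {
                String expected = registryHashes.get(name);     // registry.json hash
                if (expected != null && !expected.equals(sha256(content)))
                    throw new SecurityException("skill hash mismatch: " + name);
            }
        }
        // Graceful degradation: agents always start, even with missing skills.
        return content != null ? content : "You are a helpful assistant.";
    }

    static String sha256(String s) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : d) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the GitHub tier is hash-verified, which matches the threat model: classpath and local cache are already trusted.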
File: modules/agent/src/main/java/org/atmosphere/agent/skill/SkillFileParser.java
Lines: 69–178 (110 lines)
Key design decision: The entire raw file content IS the system prompt (systemPrompt() returns rawContent verbatim at lines 130–132). Sections (## Tools, ## Skills, ## Channels, etc.) are extracted for metadata wiring but don't modify the prompt text.
Section parsing (lines 105–112): Tracks ## headers outside fenced code blocks (backtick detection at lines 95–98). Uses LinkedHashMap to preserve document order.
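A minimal sketch of this fence-aware section scan, using hypothetical names (`SectionParserSketch`) rather than SkillFileParser's real API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of header tracking outside fenced code blocks, as described above.
public class SectionParserSketch {
    public static Map<String, StringBuilder> sections(String markdown) {
        Map<String, StringBuilder> out = new LinkedHashMap<>(); // preserves document order
        boolean inFence = false;
        String current = null;
        for (String line : markdown.split("\n", -1)) {
            if (line.startsWith("```")) { inFence = !inFence; continue; } // toggle fence state
            if (!inFence && line.startsWith("## ")) {           // only headers outside fences count
                current = line.substring(3).trim();
                out.put(current, new StringBuilder());
            } else if (current != null) {
                out.get(current).append(line).append('\n');
            }
        }
        return out;
    }
}
```

A `## Heading` inside a fenced example is treated as section body text, not as a new section.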
Tool cross-referencing (AgentProcessor lines 384–399): crossReferenceTools() validates that tools mentioned in the skill file's ## Tools section match registered @AiTool methods. This catches drift between skill documentation and actual tool implementations at startup.
Notable gap: No YAML frontmatter support despite the atmosphere-skills README mentioning it. The parser only handles Markdown sections.
File: modules/agent/src/main/java/org/atmosphere/agent/processor/AgentProcessor.java
Lines: 65–161 (97 lines of registration logic)
12-step registration (traced from source):
1. Get the `@Agent` annotation (line 68)
2. Extract agent name → path `/atmosphere/agent/{name}` (line 73)
3. Parse skill file via `parseSkillFile(annotation)` (line 88)
4. Extract system prompt from skill file (line 89)
5. Scan for `@AiTool` methods → build tool registry (lines 95–100)
6. Scan for `@Prompt` methods → register handler (lines 102–106)
7. Cross-reference tools with skill file (line 108)
8. Build A2A `Skill` objects from the skill file's `## Skills` section (lines 477–497)
9. Register at MCP if configured (line 140)
10. Register at AG-UI if configured (line 143)
11. Register channels from the skill file's `## Channels` section (line 145)
12. Create `AiEndpointHandler` with all wired components (line 150)
Auto-generated @Prompt (line 295, SyntheticPrompt inner class): When an agent has @AiTool methods but no @Prompt method, a synthetic prompt handler is created that simply calls session.stream(message). This means agents can be tool-only.
Headless mode (lines 80–84, 168–189): Agents with @Skill methods but no @Prompt are headless – they expose capabilities via A2A/MCP protocols but have no direct user-facing endpoint. This is how agents compose in a fleet.
File: modules/coordinator/src/main/java/org/atmosphere/coordinator/fleet/DefaultAgentFleet.java
Lines: 67–250 (184 lines)
parallel() (lines 137–191): Uses Executors.newVirtualThreadPerTaskExecutor() – one virtual thread per agent call. Creates CompletableFuture.supplyAsync() per call, joins with allOf().get(timeoutMs, MILLISECONDS). Per-agent timeout from DefaultAgentProxy.limits(). Cancels siblings on first failure.
pipeline() (lines 193–220): Sequential execution. Each call receives _previous_result merged into its args. Aborts on first failure.
route() (lines 222–250): Evaluates RoutingSpec conditions against input. Short-circuits on first match. Has an otherwise fallback.
evaluate() (separate method): Runs all ResultEvaluator instances, catching individual evaluator exceptions to prevent one broken evaluator from aborting the pipeline.
TODO in code (lines 129–135): Comment about replacing with StructuredTaskScope when JEP 525 finalizes.
Transport layer: AgentProxy dispatches to LocalAgentTransport (in-process) or A2aAgentTransport (remote via A2A JSON-RPC). This means fleet agents can be local or remote β transparent to the coordinator.
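The parallel() pattern described above is a common Loom idiom; a self-contained sketch follows (hypothetical `ParallelFanOutSketch`, Java 21+ for virtual threads, not the actual DefaultAgentFleet code):

```java
import java.util.List;
import java.util.concurrent.*;
import java.util.function.Supplier;

// Sketch of the virtual-thread fan-out + bounded join pattern described above.
public class ParallelFanOutSketch {
    public static List<String> parallel(List<Supplier<String>> calls, long timeoutMs) {
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            List<CompletableFuture<String>> futures = calls.stream()
                    .map(c -> CompletableFuture.supplyAsync(c, exec))  // one virtual thread per call
                    .toList();
            try {
                CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new))
                        .get(timeoutMs, TimeUnit.MILLISECONDS);        // bounded join
            } catch (ExecutionException | TimeoutException e) {
                futures.forEach(f -> f.cancel(true));                  // cancel siblings on failure
                throw new CompletionException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new CompletionException(e);
            }
            return futures.stream().map(CompletableFuture::join).toList();
        }
    }
}
```

One caveat worth noting: CompletableFuture.cancel(true) marks the future cancelled but does not interrupt the running task, which is presumably why the in-tree TODO mentions StructuredTaskScope.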
DefaultModelRouter (modules/ai): Circuit-breaker pattern with configurable maxConsecutiveFailures (default 3) and cooldownPeriod (default 1 minute). Strategies: NONE, FAILOVER, ROUND_ROBIN, CONTENT_BASED. Health tracked via ConcurrentHashMap<String, BackendHealth>.
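The per-backend health tracking can be sketched as below. Field and method names are illustrative; only the defaults (3 consecutive failures, 1 minute cooldown) come from the text:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the circuit-breaker health state described for DefaultModelRouter.
public class BackendHealthSketch {
    private final int maxConsecutiveFailures;
    private final Duration cooldown;
    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    public BackendHealthSketch(int maxConsecutiveFailures, Duration cooldown) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.cooldown = cooldown;
    }

    public synchronized void recordFailure(Instant now) {
        if (++consecutiveFailures >= maxConsecutiveFailures) openedAt = now; // trip the breaker
    }

    public synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = null;                                                     // close the breaker
    }

    // Healthy if the breaker never tripped, or the cooldown has elapsed (retry allowed).
    public synchronized boolean isHealthy(Instant now) {
        return openedAt == null || now.isAfter(openedAt.plus(cooldown));
    }
}
```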
FanOutStrategy (sealed interface, 3 records):
- AllResponses: Streams all model responses in parallel on separate child session IDs
- FirstComplete: Keeps the first model to finish, cancels all others
- FastestStreamingTexts(int threshold): Observes the first N streaming events, keeps the fastest producer
No competing framework has fan-out strategies. This is unique to Atmosphere.
DurableSession (record): token, resourceId, rooms, broadcasters, metadata, createdAt, lastSeen. Immutable with withX() copy methods. The token is sent to the client via X-Atmosphere-Session-Token header.
DurableSessionInterceptor (lines 96–146): On WebSocket connect, extracts the token from the header/query param. If found → restores room/broadcaster state from SessionStore. If not → generates a UUID token, creates a session, saves it to the store, and returns the token in the response header.
SessionStore SPI: save(), restore(), remove(), touch(), removeExpired(Duration ttl). Implementations: InMemory, SQLite, Redis.
CheckpointStore SPI (separate from sessions): save(), load(), fork(), list(), delete(), deleteCoordination(). fork() creates child snapshot with parent chain β enables branching workflow execution. SqliteCheckpointStore schema: id TEXT PK, parent_id TEXT, coordination_id TEXT, agent_name TEXT, state_json TEXT, metadata_json TEXT, created_at TEXT.
Key gap: DurableSession stores transport-level state (rooms, broadcasters) but not AI conversation history. CheckpointStore stores workflow state but not conversation state. There's no unified "persistent AI session" that combines: conversation history + tool execution state + working memory + checkpoint. The pieces exist but aren't connected.
LongTermMemory SPI: saveFact(userId, fact), getFacts(userId, maxFacts), clear(userId). Only InMemoryLongTermMemory implementation exists in-tree. The Javadoc mentions "FactStore (in-memory, Redis, SQLite)" but no Redis or SQLite LongTermMemory implementations ship today β those backends exist only for SessionStore and ConversationPersistence.
SemanticRecallInterceptor (lines 57–88): Implements AiInterceptor.preProcess(). Calls ContextProvider.transformQuery() → retrieve() → rerank(). Augments the system prompt with retrieved context. Graceful no-op if no ContextProvider is available.
EmbeddingRuntime SPI: Exists but not wired into the memory system. LongTermMemory and SemanticRecallInterceptor operate independently β facts are stored as strings, recall uses a separate ContextProvider.
ContentSafetyFilter: Pluggable SafetyChecker SPI. Sentence-boundary buffering – doesn't cut mid-sentence. Redaction mode available.
StreamingTextBudgetManager: Per-user/org token budgets enforced during streaming. No competing framework has this built-in.
AiGuardrail SPI: Pre/post processing guardrails. Wired via @AiEndpoint(guardrails = {...}).
ToolArgumentValidator: Schema-based validation at the tool execution boundary. Returns structured errors for LLM retry.
Two execution modes (from src/claude_agent_sdk/):
- `query()` (stateless): Unidirectional, fire-and-forget. Signature: `async def query(*, prompt: str | AsyncIterable[dict[str, Any]], options: ClaudeAgentOptions | None = None, transport: Transport | None = None) -> AsyncIterator[Message]`
- `ClaudeSDKClient` (bidirectional): Stateful, supports interruption, hooks, and real-time interaction. 23.6 KB implementation file.
- PreToolUse, PostToolUse, PostToolUseFailure
- UserPromptSubmit, Stop
- SubagentStart, SubagentStop
- PreCompact, PostCompact
- PermissionRequest
- SessionStart, SessionEnd
- Notification
- PreAgentTurn, PostAgentTurn
- ToolError
- EditApproval
Hooks use a HookEvent + HookMatcher pattern – matchers can filter by tool name, permission type, or custom predicates. Return values can modify behavior (e.g., PreToolUse can return deny to block a tool).
Atmosphere comparison: AgentLifecycleListener has 5 events: onStart, onToolCall, onToolResult, onCompletion, onError. No matcher system, no behavior modification from listeners. AiInterceptor provides pre/post processing but isn't as granular as Claude's per-tool hooks.
"default" | "acceptEdits" | "plan" | "bypassPermissions" | "dontAsk" | "auto"
Defined as Literal type in types.py (40.8 KB file). Each mode changes which tools require approval vs. auto-execute. Rule-based updates allow dynamic permission changes during execution.
Atmosphere comparison: ToolApprovalPolicy has 4 modes: annotated(), DenyAll, AllowAll, and custom Predicate<String>. Per-tool via @RequiresApproval annotation. Missing: no session-level permission mode, no dynamic permission changes during execution, no plan-mode equivalent.
- Defined in `.claude/agents/*.md` files
- Fresh context windows (no shared memory)
- Tool restrictions (whitelist)
- Model override per subagent
- Background mode with git worktree isolation
- `SubagentStart`/`SubagentStop` hooks
Atmosphere comparison: @Agent with @Coordinator + AgentFleet. Local agents share JVM memory. Remote agents via A2A. No per-agent tool restrictions (all tools registered globally). No background mode with isolation. However, Atmosphere's fleet orchestration (parallel/pipeline/route) is more sophisticated than Claude's subagent dispatch.
- `SKILL.md` format (agentskills.io)
- Argument substitution: `$ARGUMENTS`, `$0`, `$1`
- Path-scoping: skill only available in certain directories
- `context:fork` → creates a fresh context window for skill execution
- 4 scopes: project, user, global, system
Atmosphere comparison: Atmosphere supports the SKILL.md format but is missing: argument substitution ($ARGUMENTS), path-scoping, context:fork semantics, scoped resolution. Atmosphere adds: SHA-256 integrity verification, three-tier resolution (classpath → disk → GitHub), a skill registry with curated hashes.
- 10,000+ tool definitions indexed
- Definitions withheld from LLM context window
- Loaded on demand when needed
- Semantic search over tool descriptions
Atmosphere has nothing equivalent. Tools are registered at startup and all injected into every request. For agents with many tools, this bloats the context window.
- Each file change gets a UUID
- `rewindFiles()` API to undo changes
- Full change history per file
- Granular undo (specific files, not all-or-nothing)
Atmosphere's CheckpointStore is workflow-state-level, not file-level. fork() creates branching snapshots but of serialized state, not file diffs.
- `CLAUDE.md` files at project/directory/user/system levels
- Auto-memory: Claude learns and updates its own memory files
- Path-scoped rules: different instructions for different directories
- Hierarchical merge: deeper files override shallower ones
Atmosphere's LongTermMemory is flat per-user string facts. No hierarchy, no path-scoping, no auto-memory.
Repo: langchain-ai/langgraph
Core file: libs/langgraph/langgraph/graph/state.py
```python
class StateGraph(Generic[StateT, ContextT, InputT, OutputT]):
    """Graph whose nodes communicate via shared state.

    Node signature: State -> Partial<State>
    Optional reducers: (Value, Value) -> Value for multi-writer keys
    """
```

Key implementation details:
- State-driven: All node communication via immutable state dict
- Nodes are pure functions: `State -> Partial<State>` – no side effects in the graph engine
- Reducers: Handle concurrent writes to the same key (associative + commutative)
- Channels: `LastValue`, `BinaryOperator`, `NamedBarrier`, `Ephemeral` – different merge semantics
- Interrupt native: First-class feature at channel boundaries via transactional checkpointing
- Checkpointing: `Checkpointer` ABC with `InMemorySaver`, `SQLiteSaver`, `PostgresSaver` – most mature persistence system
What LangGraph does that Atmosphere can't: Cycles (reflection loops), typed state passing between nodes, subgraph nesting, interrupt-and-resume at arbitrary graph points, visualization built-in.
What Atmosphere does that LangGraph can't: Multi-protocol exposure (A2A/MCP/AG-UI), multi-transport streaming, fan-out strategies, content safety pipeline, messaging channels (Slack/Telegram/etc), tool approval gates.
Key insight: LangGraph is a workflow engine, not an agent framework. You build agents ON TOP of it. Atmosphere is an agent framework with workflow capabilities. These are complementary architectures.
Repo: openai/openai-agents-python
Core file: src/agents/run.py (Runner class)
```python
async def run(cls, starting_agent: Agent[TContext],
              input: str | list[TResponseInputItem] | RunState[TContext],
              *, context: TContext | None = None, max_turns: int = DEFAULT_MAX_TURNS,
              hooks: RunHooks[TContext] | None = None,
              session: Session | None = None) -> RunResult
```

Agent loop (traced from source):
- Invoke agent → get response
- If final output → terminate
- If handoff → switch agent and loop
- Execute tool calls → re-invoke agent
- Check max_turns
Session management (3 implementations):
- `Session` ABC: Abstract base
- `OpenAIConversationsSession`: Memory-backed
- `OpenAIResponsesCompactionSession`: Compaction-aware – drops old items to save the context window
Tool types: ApplyPatchTool, CodeInterpreterTool, ShellTool, FunctionTool, MCPTool
Guardrails: Input/output tripwires that can halt execution
Handoff system: Agent.handoff_to() → transfers control to another agent with context. Atmosphere's closest equivalent, StreamingSession.handoff(), is agent-code-initiated rather than detected from LLM output; fleet orchestration remains coordinator-directed (see the handoff gap analysis below).
Approval: MCPToolApprovalFunction – a synchronous blocking call. Less sophisticated than Atmosphere's async virtual-thread-parked approval.
Repo: pydantic/pydantic-ai
Key innovations from source:
- `AgentSpec`: Composable agent configuration object – tools, instructions, model, structured output, all declarative
- Internal graph (`_agent_graph.py`): Execution represented as a typed node graph: `ModelRequestNode`, `CallToolsNode`, `UserPromptNode`
- Result validation: Pydantic models for output types → automatic structured output
- OpenTelemetry native: First-class tracing with Logfire integration
- Capabilities: `capabilities=[Thinking(), WebSearch(), MCP()]` – composable units that bundle tools, hooks, instructions
What Pydantic AI has that Atmosphere lacks: Type-safe tool signatures with Pydantic validation, composable capabilities, declarative agent spec, graph-based execution model, built-in eval framework.
Repo: crewAIInc/crewAI
Core file: lib/crewai/src/crewai/flow/flow.py
Decorator-based graph definition:
```python
@start(condition: str | FlowCondition | Callable | None)
@listen(to: str | list[str])
@router(routes: dict[str, str])
def method(self): ...
```

Event system: crewai_event_bus pub/sub with typed events: FlowStartedEvent, MethodExecutionFinishedEvent, MethodExecutionPausedEvent.
Memory system: MemoryScope (isolation levels) + MemorySlice (scoped within flow) + unified memory shared across agents.
Condition system: Composable AND_CONDITION, OR_CONDITION, nested conditions for complex routing.
What CrewAI has that Atmosphere lacks: Decorator-based flow definition (more readable than imperative orchestration), event bus for decoupled communication, scoped memory isolation, flow visualization.
Repo: huggingface/smolagents
Key innovation: The LLM writes Python code as actions instead of JSON tool calls – reported 30% fewer steps.
5 sandbox backends: E2B (cloud), Blaxel, Modal, Docker, Pyodide+Deno.
What Smolagents has that Atmosphere lacks: Code execution sandbox, code-as-action paradigm.
Key features: Native Gemini Live audio streaming, built-in search grounding, multi-agent pipeline with SequentialAgent/ParallelAgent/LoopAgent.
What ADK has that Atmosphere lacks: Native audio streaming, search grounding, loop agent with exit conditions.
Key features: useVoice() hook for voice agents, generative UI (LLM streams React components), useTools() composable with automatic tool result handling, streaming token-by-token UI updates.
What Vercel AI SDK has that Atmosphere lacks: Voice pipeline, generative UI components, frontend-first streaming primitives.
Key features: Process framework with Dapr integration, plugin/function composability, planner (auto-plan from goal), memory with embeddings.
What Semantic Kernel has that Atmosphere lacks: Auto-planner, Dapr process orchestration, native embedding memory.
Key concept: Agent identity as git-native files β personality, rules, memory, tool preferences versioned in a repo. Portable across frameworks.
Files in a gitagent repo:
```
.agent/
  identity.yaml    # personality, communication style
  rules.yaml       # behavioral constraints
  memory/          # persistent memory files
  tools/           # tool preferences and configurations
.agentignore       # files agent shouldn't access
```
What OpenClaw has that Atmosphere lacks: Portable agent identity, git-native agent definitions, agent-as-repo concept.
7 registered skills with SHA-256 integrity hashes (array format):
```json
{
  "version": "1.0.0",
  "skills": [
    {
      "id": "dentist-agent",
      "name": "Dental Emergency Assistant",
      "description": "Emergency dental assistant with triage, first aid, and multi-channel delivery",
      "category": "healthcare",
      "tags": ["medical", "dental", "triage"],
      "path": "skills/dentist-agent/SKILL.md",
      "atmosphere_sample": "spring-boot-dentist-agent",
      "sha256": "fa78c726e6fd461dd0bf5cea88d65a1e..."
    },
    ...
  ]
}
```

Each skill links to its corresponding sample application in the main repo.
Categories: healthcare, business, education, general, tools, rag, evaluation, mcp, agent.
Most complete example (dentist-agent/SKILL.md):
```markdown
# Dental Office Assistant
You are Dr. Smith's dental office assistant...

## Skills
- Appointment scheduling
- Patient history review
- Insurance verification

## Tools
- searchPatientRecords
- bookAppointment
- checkInsuranceCoverage

## Channels
- web
- slack

## Guardrails
- HIPAA compliance filter
- PII redaction
```

```shell
atmosphere install spring-boot-dentist-agent                       # Downloads + builds sample
atmosphere run spring-boot-dentist-agent --env GEMINI_API_KEY=...  # Run with env vars
atmosphere list                                                    # Lists all available samples
```

The CLI reads samples.json (bundled or fetched from GitHub), downloads tarballs, and runs Maven builds.
| Feature | Atmosphere Skills | Claude Code Skills |
|---|---|---|
| Skill file format | SKILL.md ✅ | SKILL.md ✅ |
| Section parsing | ## Tools/Skills/Channels ✅ | ## Tools + custom sections ✅ |
| Argument substitution | ❌ Missing | $ARGUMENTS, $0, $1 ✅ |
| Path-scoping | ❌ Missing | Scoped to directories ✅ |
| Context fork | ❌ Missing | context:fork ✅ |
| Integrity verification | SHA-256 ✅ | Not applicable |
| Three-tier resolution | Classpath → Disk → GitHub ✅ | 4 scopes: project/user/global/system ✅ |
| Registry | registry.json with hashes ✅ | No central registry |
| YAML frontmatter | ❌ Parser doesn't support | ❌ Not mentioned |
What competitors have (code-level):
- OpenAI Agents SDK: `RealtimeAgent` + `VoicePipeline` – full STT→Agent→TTS pipeline with WebSocket transport, automatic interruption detection, semantic VAD, SIP/telephony integration
- Vercel AI SDK: `useVoice()` hook with streaming audio I/O
- Google ADK: Native Gemini Live audio streaming
What Atmosphere has (verified from source): Content.Audio(byte[], mimeType) for sending audio blobs in multi-modal requests (lines 638–650 in OpenAiCompatibleClient). No STT/TTS pipeline, no realtime WebSocket audio streaming, no voice agent abstraction, no VAD.
Atmosphere's advantage: WebTransport/HTTP3 infrastructure is perfect for sub-100ms audio but no agent-level abstraction leverages it.
Recommendation: VoicePipeline SPI with pluggable STT/TTS providers. @AiEndpoint(voice = true) annotation. Leverage existing WebTransport for low-latency audio.
What competitors have (code-level):
- LangGraph: `StateGraph(Generic[StateT, ContextT, InputT, OutputT])` with typed state, reducers for concurrent writes, interrupt at channel boundaries, subgraph nesting, cycle support. Nodes are pure functions `State -> Partial<State>`.
- Pydantic AI: Internal graph in `_agent_graph.py` with `ModelRequestNode`, `CallToolsNode`, `UserPromptNode`
- CrewAI: Decorator-based: `@start()`, `@listen()`, `@router()` with `AND_CONDITION`/`OR_CONDITION`
- Mastra: `.then()`, `.branch()`, `.parallel()` graph API with suspend/resume
What Atmosphere has (verified from source): @Coordinator + AgentFleet with parallel(), pipeline(), route() – imperative orchestration in the @Prompt method. CoordinationJournal for audit. No declarative graph definition, no cycle support, no typed state passing, no visualization.
Implementation detail: DefaultAgentFleet.parallel() uses newVirtualThreadPerTaskExecutor() + CompletableFuture.allOf().get(timeout). This is efficient but not a graph engine – it's imperative fan-out.
Recommendation: WorkflowGraph builder API with typed state nodes, conditional edges, cycle support, and Mermaid export. Integrate with existing CoordinationJournal for audit trail.
Why upgraded: The convergence of Pydantic AI's AgentSpec, CrewAI's YAML definitions, and the OpenClaw/gitagent movement makes this table-stakes by mid-2026.
What competitors have:
- Pydantic AI: `AgentSpec` – full declarative agent in code/YAML
- CrewAI: YAML-based agent + crew + task definitions
- OpenClaw/gitagent: Git-native agent identity files
What Atmosphere has: @Agent annotation (code-only). Skill files define system prompts but tool wiring, guardrails, routing require Java code.
Recommendation: atmosphere-agent.yaml spec format. AgentSpecLoader that reads YAML and wires up @Agent equivalent at runtime.
What competitors have:
- OpenAI Agents SDK: `Agent.handoff_to()` – an agent can transfer control to another agent with context. The loop detects handoff in the response and switches agents automatically.
- Claude Code: Subagent dispatch with tool restrictions and model override
What Atmosphere has: StreamingSession.handoff(agentName, message) – agent-initiated delegation exists. Implementation in AiStreamingSession (lines 482–526): copies conversation history via memory.copyTo(), looks up the target handler at /atmosphere/agent/{name}, dispatches via handler.onStateChange(). A CAS guard prevents nested handoffs. AiEvent.Handoff is emitted to the client. AiCapability.HANDOFF is declared.
What's still missing vs. OpenAI: OpenAI's handoff is loop-aware – the Runner detects handoff in the response and automatically switches agents. Atmosphere's handoff is imperative (agent code calls session.handoff() directly), not detected from LLM output.
Recommendation: Add LLM-output-driven handoff detection in OpenAiCompatibleClient.doStreamWithToolLoop() – when the LLM emits a handoff tool call, auto-invoke session.handoff(). This closes the gap with OpenAI's model.
What competitors have:
- Smolagents: 5 sandbox backends (E2B, Blaxel, Modal, Docker, Pyodide+Deno). LLM writes Python as actions – 30% fewer steps.
- AutoGen: Built-in Python code executor with Docker isolation
- OpenAI Agents SDK: `CodeInterpreterTool`, `ShellTool` built-in tools
What Atmosphere has: Nothing. Tool execution is always in-JVM @AiTool methods.
Recommendation: CodeExecutionTool SPI with DockerCodeExecutor, JShellCodeExecutor (JDK-native). Security: timeout, memory limits, network isolation.
What competitors have:
- Pydantic AI: `capabilities=[Thinking(), WebSearch(), MCP()]` – composable units bundling tools + hooks + instructions + model settings
- OpenAI Agents SDK: Built-in: `WebSearchTool`, `FileSearchTool`, `ComputerUseTool`
What Atmosphere has: Tools (@AiTool), interceptors (AiInterceptor), guardrails (AiGuardrail), context providers (ContextProvider) – separate SPIs, not composable units.
Recommendation: AiCapabilityPack interface: tools(), interceptors(), guardrails(), instructions(), modelSettings(). Built-in packs: WebSearchCapability, ThinkingCapability.
What competitors have:
- OpenAI Agents SDK: `Session` ABC with `OpenAIConversationsSession` (memory-backed) and `OpenAIResponsesCompactionSession` (compaction-aware)
- LangGraph: `MemorySaver` + `PostgresSaver` with thread-level and cross-thread memory
- Mastra: Built-in storage layer for agent state
What Atmosphere has (verified from source):
- `DurableSession` (record): stores `token`, `resourceId`, `rooms`, `broadcasters`, `metadata` – transport-level state
- `SessionStore` SPI: `save()`, `restore()`, `remove()`, `touch()`, `removeExpired()` – operates on `DurableSession`
- `PersistentConversationMemory` + `ConversationPersistence` SPI – conversation history IS persistable via `SqliteConversationPersistence` (in the `durable-sessions-sqlite` module) and Redis. Loaded via `ServiceLoader` in `CoordinatorProcessor` (lines 460–466).
- `CheckpointStore`: Workflow state persistence with fork/branch semantics
The remaining gap: While conversation persistence exists, there's no unified "persistent AI session" that bundles conversation history + working memory (LongTermMemory facts) + checkpoint state + transport state into a single restorable unit. The pieces exist but are configured independently.
What competitors have:
- OpenAI Agents SDK: `OpenAIResponsesCompactionSession` – drops old items to save the context window. Automatic compaction.
- Claude Code: `PreCompact`/`PostCompact` hooks – memory compaction with user-definable behavior
What Atmosphere has: OpenAiCompatibleClient sends the full message history every round. No compaction. The Responses API path (lines 202–237) relies on OpenAI's server-side state, but this only works with OpenAI, not other providers.
Why it matters: Long conversations exhaust context windows. Without compaction, agents hit token limits and fail.
Recommendation: MessageCompactor SPI with strategies: SlidingWindow(n), SummarizeThenTruncate, TokenBudget(maxTokens). Plugged into AbstractAgentRuntime.assembleMessages().
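A sketch of the simplest of these strategies. MessageCompactor and SlidingWindow(n) are the recommendation's proposed names, not existing Atmosphere types, and messages are modeled as plain strings with the system prompt assumed at index 0:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a SlidingWindow(n) compaction strategy for the proposed MessageCompactor SPI.
public class SlidingWindowCompactorSketch {
    // Keeps the system prompt plus the n most recent messages.
    public static List<String> compact(List<String> messages, int n) {
        if (messages.isEmpty()) return messages;
        List<String> out = new ArrayList<>();
        out.add(messages.get(0));                            // assume index 0 = system prompt
        int from = Math.max(1, messages.size() - n);
        out.addAll(messages.subList(from, messages.size())); // last n non-system messages
        return out;
    }
}
```

SummarizeThenTruncate and TokenBudget would follow the same shape but replace the dropped prefix with an LLM-generated summary or count tokens instead of messages.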
What competitors have:
- Claude Code: 18+ hook types with the `HookEvent` + `HookMatcher` pattern. Hooks can modify behavior (deny tool calls, modify prompts, etc.)
- OpenAI Agents SDK: `RunHooks[TContext]` with typed context: `agent_start`, `agent_end`, `tool_start`, `tool_end`, `handoff`
- CrewAI: Full event bus with pub/sub: `FlowStartedEvent`, `MethodExecutionFinishedEvent`, `MethodExecutionPausedEvent`
What Atmosphere has: `AgentLifecycleListener` with 5 events: `onStart`, `onToolCall`, `onToolResult`, `onCompletion`, `onError`. Listeners are observe-only – they cannot modify behavior. `AiInterceptor` has `preProcess`/`postProcess`, but it operates on the request/response, not on individual lifecycle events.
Why it matters: Hooks that can modify behavior (deny a tool call, inject additional context, pause execution) are essential for production agent systems.
Recommendation: an `AgentHook` system with matchers and meaningful return values – e.g. `@PreToolUse(tool = "deleteFile")` returning `HookResult.DENY` to block tool execution.
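The essential difference from observe-only listeners can be sketched in a few lines. All names here (`AgentHook`, `HookResult`, `HookDispatcher`) are hypothetical, following the recommendation:

```java
import java.util.List;

// Hypothetical hook API: a matcher picks which tool calls a hook sees,
// and the return value can veto execution (unlike Atmosphere's
// observe-only AgentLifecycleListener).
enum HookResult { CONTINUE, DENY }

record ToolCall(String toolName, String argsJson) {}

interface AgentHook {
    boolean matches(ToolCall call);
    HookResult preToolUse(ToolCall call);
}

class HookDispatcher {
    private final List<AgentHook> hooks;
    HookDispatcher(List<AgentHook> hooks) { this.hooks = hooks; }

    // A tool call proceeds only if no matching hook denies it. A production
    // version would likely fail closed, as ToolExecutionHelper already does.
    boolean allowed(ToolCall call) {
        return hooks.stream()
                .filter(h -> h.matches(call))
                .noneMatch(h -> h.preToolUse(call) == HookResult.DENY);
    }
}
```

An annotation like `@PreToolUse(tool = "deleteFile")` would be sugar for registering a hook whose `matches()` tests the tool name.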
What competitors have:
- Claude Code: 10,000+ tool definitions indexed. Definitions withheld from context window. Loaded on demand via semantic search.
What Atmosphere has: Tools registered at startup via `@AiTool` annotations per agent – `AgentProcessor` builds per-agent tool registries from each agent class's `@AiTool` methods, not a global registry. However, all of an agent's tools are injected into every LLM request for that agent; for agents with many tools, this bloats the context window.
Why it matters: As tool catalogs grow, injecting all tool definitions into every request becomes prohibitively expensive. Dynamic tool discovery is essential for tool-rich agents.
Recommendation: a `ToolIndex` with semantic search over tool descriptions, plus `@AiEndpoint(maxToolsPerRequest = 10)` with automatic selection. Compatible with MCP tool discovery.
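The selection step can be sketched as follows. `ToolDef` and `ToolIndex` are hypothetical names; real semantic search would embed descriptions (e.g. via an embedding SPI), and plain keyword overlap stands in here only to show the shape of top-k selection:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical tool definition: what would normally go into the request.
record ToolDef(String name, String description) {}

// Score each tool against the user query; inject only the top k definitions
// instead of the whole catalog.
class ToolIndex {
    private final List<ToolDef> tools;
    ToolIndex(List<ToolDef> tools) { this.tools = tools; }

    List<ToolDef> select(String query, int k) {
        Set<String> words = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\W+")));
        return tools.stream()
                .sorted(Comparator.comparingLong((ToolDef t) -> overlap(words, t)).reversed())
                .limit(k)
                .toList();
    }

    private static long overlap(Set<String> queryWords, ToolDef t) {
        return Arrays.stream(t.description().toLowerCase().split("\\W+"))
                .distinct().filter(queryWords::contains).count();
    }
}
```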
What competitors have:
- Claude Code Skills: `$ARGUMENTS` (full argument string), `$0`, `$1`, etc. (positional arguments). Skills can receive parameters.
What Atmosphere has: `SkillFileParser` treats the entire file as a static system prompt. No variable substitution, no argument passing.
Recommendation: Add `{{variable}}` substitution to `SkillFileParser.systemPrompt()`. Support `$ARGUMENTS` for Claude Code compatibility.
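The `$ARGUMENTS`/positional part of that recommendation is small enough to sketch directly. `SkillArguments` is a hypothetical helper, not part of the current `SkillFileParser`:

```java
// Sketch of argument substitution mirroring the Claude Code conventions
// named above: $ARGUMENTS (full string) plus positional $0, $1, ...
class SkillArguments {
    static String substitute(String template, String argString) {
        String[] args = argString.isBlank() ? new String[0] : argString.trim().split("\\s+");
        String out = template.replace("$ARGUMENTS", argString);
        // Replace higher indices first so "$1" does not clobber "$10".
        for (int i = args.length - 1; i >= 0; i--) {
            out = out.replace("$" + i, args[i]);
        }
        return out;
    }
}
```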
What competitors have:
- Pydantic AI: Full evals framework with systematic testing, monitoring, Logfire integration
- LangGraph + LangSmith: Production-grade evaluation with A/B testing, regression detection
- Mastra: Built-in eval framework with custom metrics
What Atmosphere has: LlmJudge (LLM-as-evaluator), ResultEvaluator SPI, SanityCheckEvaluator, contract tests. Good building blocks, not a cohesive eval framework.
Recommendation: AtmosphereEval harness with test cases, metrics, @EvalTest JUnit 5 annotation, dashboard integration.
What competitors have:
- OpenAI Agents SDK: `ComputerUseTool` built-in
- Anthropic Claude: Native computer use with a screenshot → action loop
What Atmosphere has: Content.Image for vision input but no computer use abstraction.
Recommendation: ComputerUseTool SPI with Playwright/Selenium backends.
What competitors have:
- Vercel AI SDK: LLM streams React components, tool calls render as interactive UI widgets
What Atmosphere has: AG-UI module with AgUiEvent, AgUiEventMapper, AgUiStreamingSession. The protocol bridge exists but no server-side component rendering.
Recommendation: UiComponent sealed interface: Card, DataTable, ConfirmationDialog. session.emitComponent() API.
What competitors have: Every framework allows a simple `agent.run_sync("prompt")` → string response.
What Atmosphere has: Agents are exposed primarily via WebSocket. `AgentRuntime.generate()` exists for synchronous use, but there are no auto-generated REST endpoints.
Recommendation: Auto-generate POST /atmosphere/agent/{name}/run endpoints.
What competitors have:
- LangChain/LangChain4j: Rich templates with variables, few-shot examples, output parsers
- Spring AI: `PromptTemplate` with variable substitution
What Atmosphere has: Static system prompts or skill files. PostPromptHook for modification. No variable substitution.
Recommendation: PromptTemplate with {{variable}} substitution and few-shot injection.
What competitors have:
- LangGraph: Cross-thread semantic memory with vector similarity search
- CrewAI: Scoped memory with semantic search (`MemoryScope` + `MemorySlice`)
What Atmosphere has: `LongTermMemory` (string facts), `SemanticRecallInterceptor` (uses the `ContextProvider` SPI), `EmbeddingRuntime` SPI – all exist but aren't wired together.
Recommendation: Wire EmbeddingRuntime into LongTermMemory for semantic fact storage/recall. VectorLongTermMemory implementation.
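The core of such a `VectorLongTermMemory` is cosine-similarity recall over embedded facts. A minimal sketch, in which the `embed` function is a stand-in for whatever the `EmbeddingRuntime` SPI would supply:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Store each fact alongside its embedding; recall the most similar fact.
class VectorMemory {
    private final List<String> facts = new ArrayList<>();
    private final List<double[]> vectors = new ArrayList<>();
    private final Function<String, double[]> embed;  // EmbeddingRuntime stand-in

    VectorMemory(Function<String, double[]> embed) { this.embed = embed; }

    void remember(String fact) {
        facts.add(fact);
        vectors.add(embed.apply(fact));
    }

    String recall(String query) {
        double[] q = embed.apply(query);
        int best = 0;
        for (int i = 1; i < vectors.size(); i++) {
            if (cosine(q, vectors.get(i)) > cosine(q, vectors.get(best))) best = i;
        }
        return facts.get(best);
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```

A production version would add a similarity threshold and top-k recall rather than always returning the single best fact.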
What competitors have:
- Claude Code: Hierarchical
CLAUDE.mdfiles. Deeper files override shallower. Auto-memory. Path-scoped rules.
What Atmosphere has: Flat per-user LongTermMemory. Single system prompt from skill file.
Recommendation: MemoryHierarchy that merges multiple skill files or memory sources based on context path.
What competitors have:
- LangSmith: Full trace visualization
- Pydantic Logfire: OpenTelemetry-based with agent-specific views
- CrewAI: Event bus tracing with OpenTelemetry
What Atmosphere has: TracingCapturingSession, MicrometerAiMetrics (OpenTelemetry), CoordinationJournal. Data is captured but no visualization UI.
Recommendation: Add trace visualization to admin dashboard. Export to LangSmith/Logfire format.
What competitors have: OpenAI SDK traces designed for fine-tuning. LangSmith dataset collection.
What Atmosphere has: Nothing.
Recommendation: TraceExporter SPI writing JSONL format compatible with OpenAI fine-tuning API.
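The output format is the easy half: one JSON object per line in the `{"messages": [...]}` shape the OpenAI fine-tuning API accepts for chat models. A sketch with hand-rolled JSON (`JsonlTraceExporter` is a hypothetical name; a real exporter would use a JSON library):

```java
import java.util.List;

class JsonlTraceExporter {
    // Each turn is a {role, content} pair captured from an agent trace.
    static String toJsonlLine(List<String[]> turns) {
        StringBuilder sb = new StringBuilder("{\"messages\":[");
        for (int i = 0; i < turns.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"role\":\"").append(escape(turns.get(i)[0]))
              .append("\",\"content\":\"").append(escape(turns.get(i)[1]))
              .append("\"}");
        }
        return sb.append("]}").toString();
    }

    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }
}
```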
What competitors have:
- OpenClaw/gitagent: Agent identity as git-native files
What Atmosphere has: @Agent(skillFile = "...") + LongTermMemory.
Recommendation: AgentIdentity record serializable to YAML/JSON for import/export.
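A sketch of what that record might bundle. Field names and the hand-rolled YAML rendering are illustrative; a real implementation would use a YAML library:

```java
import java.util.List;

// Hypothetical AgentIdentity: the portable subset of an agent
// (name, model, system prompt, remembered facts) for import/export.
record AgentIdentity(String name, String model, String systemPrompt, List<String> facts) {
    String toYaml() {
        StringBuilder sb = new StringBuilder()
                .append("name: ").append(name).append('\n')
                .append("model: ").append(model).append('\n')
                .append("systemPrompt: ").append(systemPrompt).append('\n')
                .append("facts:\n");
        facts.forEach(f -> sb.append("  - ").append(f).append('\n'));
        return sb.toString();
    }
}
```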
What competitors have:
- Claude Code: 6 explicit permission modes that change execution behavior globally
What Atmosphere has: Per-tool @RequiresApproval + ToolApprovalPolicy (4 modes). No session-level permission mode.
Recommendation: PermissionMode enum: DEFAULT, ACCEPT_EDITS, PLAN_ONLY, BYPASS, DENY_ALL. Set per-session.
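A sketch of how such a session-level mode could gate tool calls before any per-tool policy runs. The mode names follow the recommendation; the decision logic is illustrative, not a spec:

```java
// One mode per session; consulted before @RequiresApproval / ToolApprovalPolicy.
enum PermissionMode {
    DEFAULT, ACCEPT_EDITS, PLAN_ONLY, BYPASS, DENY_ALL;

    enum Decision { ASK, ALLOW, DENY }

    Decision decide(boolean isEditTool) {
        return switch (this) {
            case BYPASS -> Decision.ALLOW;               // skip all approval gates
            case DENY_ALL, PLAN_ONLY -> Decision.DENY;   // plan/reason, never act
            case ACCEPT_EDITS -> isEditTool ? Decision.ALLOW : Decision.ASK;
            case DEFAULT -> Decision.ASK;                // fall through to per-tool policy
        };
    }
}
```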
These are areas where Atmosphere is genuinely ahead of every competing framework:
7 agent backends (Built-in, Spring AI, LangChain4j, Google ADK, Koog, Semantic Kernel, Embabel) swappable via Maven dependency. AbstractAgentRuntime<C> template method pattern with capabilities() contract testing. No other framework does this.
MCP + A2A + AG-UI + WebSocket + SSE + gRPC + WebTransport/HTTP3 + long-polling in one framework. `AgentProcessor` auto-registers agents at all configured protocols (line 140–145). A single `@Agent` class is simultaneously accessible via all protocols.
WebTransport/HTTP3, WebSocket, SSE, gRPC, long-poll with transparent fallback. Connection upgrade negotiation is automatic. Nobody else has this depth.
FanOutStrategy sealed interface with 3 records: AllResponses (parallel streaming), FirstComplete (race + cancel losers), FastestStreamingTexts(threshold) (observe N chunks, pick winner). Unique to Atmosphere.
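An illustrative reconstruction of that pattern (not the actual Atmosphere source): a sealed interface whose records carry the strategy-specific parameters, so dispatch can handle every case without a default branch:

```java
// Reconstructed shape: three strategies as records of one sealed interface.
sealed interface FanOut {
    record AllResponses() implements FanOut {}
    record FirstComplete() implements FanOut {}
    record FastestStreamingTexts(int threshold) implements FanOut {}

    // Illustrative dispatch over the closed set of strategies.
    static String describe(FanOut strategy) {
        if (strategy instanceof FastestStreamingTexts t)
            return "observe " + t.threshold() + " chunks per backend, pick the winner";
        if (strategy instanceof FirstComplete)
            return "race backends, cancel the losers";
        return "stream every backend's response in parallel";
    }
}
```

The benefit of sealing is that adding a fourth strategy forces every exhaustive switch over `FanOut` to be revisited at compile time.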
DefaultModelRouter with FAILOVER, ROUND_ROBIN, CONTENT_BASED strategies. Health tracking with configurable maxConsecutiveFailures + cooldownPeriod. Auto-recovery after cooldown expires.
`ToolExecutionHelper.executeWithApproval()` parks virtual threads cheaply via `CompletableFuture.get(timeout)`. Fail-closed default (line 176–182). Structured denial responses. More sophisticated than any competing framework's approval system.
ContentSafetyFilter with pluggable SafetyChecker, sentence-boundary buffering, redaction mode. StreamingTextBudgetManager with per-user/org token budgets. No competing framework has built-in token budget management.
25+ REST endpoints. AgentController.listAgents() returns real-time agent metadata. CoordinatorController.getFleet() returns fleet health. CoordinationJournal queryable audit log.
Slack, Telegram, Discord, WhatsApp, Messenger – defined in the `ChannelType` enum with per-platform message length limits (e.g., Slack 40K chars, Discord 2K). Wired via `@Agent(channels = {...})` or a skill file `## Channels` section. No other agent framework has built-in messaging channel support.
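The per-platform limit enforcement described above can be sketched as an enum carrying its cap. The Slack and Discord values come from the text; the Telegram value and the truncation behavior are illustrative, not taken from `ChannelType`:

```java
// Each channel carries a hard message cap; outgoing text is made to fit.
enum Channel {
    SLACK(40_000), DISCORD(2_000), TELEGRAM(4_096);

    final int maxChars;
    Channel(int maxChars) { this.maxChars = maxChars; }

    // Truncate with a trailing ellipsis when the message exceeds the cap.
    String fit(String message) {
        return message.length() <= maxChars
                ? message
                : message.substring(0, maxChars - 1) + "…";
    }
}
```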
`AbstractAgentRuntimeContractTest.expectedCapabilities()` – capability matrix enforced in code. If a runtime claims `TOOL_CALLING`, the contract test proves it. No drift possible between documentation and implementation.
| # | Gap | Severity | Effort | Leading Competitors |
|---|---|---|---|---|
| 1 | Voice / Realtime Agents | CRITICAL | Large | OpenAI SDK, Vercel AI, Google ADK |
| 2 | Graph-Based Workflows | CRITICAL | Large | LangGraph, Pydantic AI, CrewAI, Mastra |
| 3 | Declarative Agent Spec | CRITICAL | Medium | Pydantic AI, CrewAI, OpenClaw |
| 4 | LLM-Output-Driven Handoff | MEDIUM | Small | OpenAI SDK |
| 5 | Code Execution Sandbox | HIGH | Medium | Smolagents, AutoGen, OpenAI SDK |
| 6 | Composable Capabilities | HIGH | Medium | Pydantic AI, OpenAI SDK |
| 7 | Unified Persistent AI Session | MEDIUM | Medium | LangGraph, Mastra |
| 8 | Session Compaction | HIGH | Small | OpenAI SDK, Claude Code |
| 9 | Lifecycle Hooks System | HIGH | Medium | Claude Code (18+ hooks), OpenAI SDK, CrewAI |
| 10 | Dynamic Tool Discovery | HIGH | Medium | Claude Code (10K+ tools) |
| 11 | Skill Argument Substitution | MEDIUM | Small | Claude Code Skills |
| 12 | Evaluation Framework | MEDIUM | Medium | Pydantic AI, LangSmith, Mastra |
| 13 | Computer Use | MEDIUM | Medium | OpenAI SDK, Anthropic |
| 14 | Generative UI | MEDIUM | Medium | Vercel AI SDK |
| 15 | Agent-as-REST-API | MEDIUM | Small | All frameworks |
| 16 | Prompt Templates | MEDIUM | Small | LangChain, Spring AI |
| 17 | Semantic Vector Memory | MEDIUM | Small | LangGraph, CrewAI |
| 18 | Hierarchical Memory | MEDIUM | Medium | Claude Code |
| 19 | Trace Visualization | LOW | Medium | LangSmith, Logfire, CrewAI |
| 20 | Fine-Tuning Pipeline | LOW | Small | OpenAI SDK, LangSmith |
| 21 | Personal Agent Identity | LOW | Small | OpenClaw, NagaAgent |
| 22 | Permission Mode System | LOW | Small | Claude Code |
- Graph-Based Workflows → `WorkflowGraph` builder API
- Declarative Agent Spec → `atmosphere-agent.yaml` format
- Agent Handoff → agent-initiated delegation within fleets
- Persistent AI Sessions → unified session with conversation + state + memory
- Session Compaction → `MessageCompactor` SPI
- Lifecycle Hooks → `AgentHook` with matchers and behavior modification
- Dynamic Tool Discovery → `ToolIndex` with semantic search
- Composable Capabilities → `AiCapabilityPack` interface
- Voice / Realtime Agents → `VoicePipeline` SPI (large effort, high impact)
- Code Execution Sandbox → `CodeExecutionTool` SPI
- Evaluation Framework → `AtmosphereEval` harness
- Skill argument substitution → `$ARGUMENTS` support
- Generative UI → `UiComponent` sealed interface
- Semantic Vector Memory → wire `EmbeddingRuntime` into `LongTermMemory`
- Agent-as-REST-API → auto-generated REST endpoints

16–22. Remaining LOW gaps as opportunity allows
| Component | File | Key Lines |
|---|---|---|
| Agent Loop | `OpenAiCompatibleClient.java` | 191–364 (`doStreamWithToolLoop`) |
| Tool Approval | `ToolExecutionHelper.java` | 144–209 (`executeWithApproval`) |
| Skill Parser | `SkillFileParser.java` | 69–178 (`parse`, `listItems`) |
| Skill Resolution | `PromptLoader.java` | 108–254 (three-tier with SHA-256) |
| Agent Registration | `AgentProcessor.java` | 65–161 (12-step pipeline) |
| Runtime Base | `AbstractAgentRuntime.java` | 127–369 (execute, retry, assemble) |
| Built-in Runtime | `BuiltInAgentRuntime.java` | 77–250 (execute, capabilities) |
| Fleet Orchestration | `DefaultAgentFleet.java` | 137–250 (parallel, pipeline, route) |
| Model Router | `DefaultModelRouter.java` | 67–141 (circuit breaker routing) |
| Fan-Out | `FanOutStrategy.java` | 3 sealed records |
| Durable Sessions | `DurableSessionInterceptor.java` | 96–146 (token-based restore) |
| Checkpoints | `SqliteCheckpointStore.java` | 131–240 (save, load, fork, list) |
| Memory | `LongTermMemory.java` | 4 methods (string facts) |
| Semantic Recall | `SemanticRecallInterceptor.java` | 57–88 (preProcess with ContextProvider) |
| Approval Registry | `ApprovalRegistry.java` | 51–150 (register, resolve, await) |
| Admin | `AgentController.java` | 58–115 (listAgents) |
| CLI | `cli/atmosphere` | Shell script, cmd_list/cmd_run/cmd_install |