Comprehensive analysis of 26 competitor repositories for Decision AI product positioning
Executive Summary
This document provides a structured overview of all competitors analyzed during our research phase. Each competitor is categorized by market segment, with detailed profiles including value propositions, target audiences, key features, and user journey diagrams.
Key Finding: The market is fragmented across multiple niches. No single competitor addresses our full vision of unified context across platforms + modular AI sessions (deployable Claude instances) + trust-focused data science. This represents our blue ocean opportunity.
CURRENT STATE vs FUTURE VISION
CRITICAL: This section clearly distinguishes what EXISTS today versus what is PLANNED for the future.
What EXISTS Today (January 2025)
| Component | Status | Description |
|---|---|---|
| Discord Bot | IMPLEMENTED | Primary user interface |
| Workflow Executor | IMPLEMENTED | Claude API with workflow_tools |
| Fly.io Deployment | IMPLEMENTED | Dynamic machine creation via fly_app_tools |
| ACP Protocol | IMPLEMENTED | Inter-session communication via SSE |
| Session Templates | IMPLEMENTED | Supabase database records |
| Builder Claude | IMPLEMENTED | Containerization service with meta-skills |
What is PLANNED (Future Vision)
| Component | Status | Description |
|---|---|---|
| Decision Packs | PLANNED | GitHub repos as deployable units with pack.yaml manifests |
| Pack Registry | PLANNED | Searchable index of available packs |
| Pack Marketplace | PLANNED | Web UI for discovery and deployment |
| Voice Sessions | PLANNED | Hands-free Discord voice interaction |
| Memory Layer | PLANNED | Persistent cross-session memory |
Builder as Claude Factory: Our Key Differentiator
What makes Decision AI unique: Builder Claude doesn't just containerize code—it constructs entire intelligent environments.
┌─────────────────────────────────────────────────────────────────────────────┐
│ BUILDER AS CLAUDE FACTORY (CURRENT) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ INPUT: User's code repository │
│ (any framework: Marimo, Streamlit, FastAPI, etc.) │
│ │
│ BUILDER CLAUDE CONSTRUCTS: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 1. Docker image with user's code │ │
│ │ 2. .claude/ directory with: │ │
│ │ ├── CLAUDE.md (execution rules + purpose) │ │
│ │ └── skills/ (generated from repo analysis) │ │
│ │ 3. ACP server for communication │ │
│ │ 4. GitHub session repo (source of truth) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ OUTPUT: Complete Claude Code environment deployed on Fly.io │
│ │
│ KEY INSIGHT: Each build = complete Claude Code environment │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
User Intent Priority (Decision Flow)
Does repo have .claude/?
├── YES → Inherit/merge (user customizations WIN)
│ - Preserve existing skills, hooks, CLAUDE.md
│ - Merge base execution rules
│ - Add missing infrastructure skills
│
└── NO → Did user specify skill preferences?
├── YES → Follow their guidance exactly
│
└── NO → Generate from scratch using meta-skills:
a. Analyze repo (dependencies, code patterns, purpose)
b. Detect framework (Marimo, Streamlit, FastAPI, etc.)
c. Generate domain-specific skills
d. Create CLAUDE.md with execution rules
Git as Source of Truth
Each session gets its own GitHub repository. This replaces container-as-artifact with git as the unit of reproducibility.
┌─────────────────────────────────────────────────────────────────────────────┐
│ GIT AS SOURCE OF TRUTH (CURRENT) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Session Repo Pattern: │
│ • New GitHub repo: github.com/org/session-mmm-{hex} │
│ • All changes tracked: git add -A && git commit after work │
│ • Versioning via tags: git tag "template/my-analysis-v1" │
│ • Full history: Browsable on GitHub, diffable │
│ │
│ Benefits: │
│ • Transparency: Readable source code, not opaque binary images │
│ • Reproducibility: git clone --branch tag = exact state │
│ • Shareability: Link to GitHub repo = shareable, forkable │
│ • Auditable: Every change logged with timestamps, diffs │
│ │
│ Build Result Format: │
│ { │
│ "status": "complete", │
│ "app_name": "template-my-thing", │
│ "image_ref": "registry.fly.io/template-my-thing:v1", │
│ "git_repo": "github.com/org/template-my-thing", │
│ "git_ref": "snapshot/v1" │
│ } │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
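A consumer of the build result would want to validate the fields shown above before acting on them. The sketch below uses exactly the field set from the example; the validation logic itself is an assumption, not the orchestrator's actual code.

```python
import json

# Field set taken from the Build Result Format example above
REQUIRED_FIELDS = {"status", "app_name", "image_ref", "git_repo", "git_ref"}

def parse_build_result(payload: str) -> dict:
    """Validate a Builder build-result message (illustrative sketch)."""
    result = json.loads(payload)
    missing = REQUIRED_FIELDS - result.keys()
    if missing:
        raise ValueError(f"build result missing fields: {sorted(missing)}")
    if result["status"] != "complete":
        raise ValueError(f"build not complete: {result['status']}")
    return result
```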
Summary of differentiators versus the market:

| Differentiator | Market Gap | Our Answer |
|---|---|---|
| Builder as Claude Factory | No one generates complete Claude environments from repo analysis | Meta-skills construct .claude/ dynamically |
| Git as Source of Truth | Competitors use opaque container images | Session repos track all changes via git |
| Unified Context | No one offers context continuity across Discord/Slack/Teams/CLI | Interface Primitives + Shared Memory (PLANNED) |
| Domain-Specific Evals | No one has insight recovery benchmarks for analytics | MMM Insight Recovery Experiments |
| Trust-First Data Science | No one combines Bayesian causal + LLM + governance | Trust Differentiators |
| ACP Protocol | No standard for inter-Claude communication | Our implemented protocol |
What We Should Adopt
| Pattern | From | Why |
|---|---|---|
| ToolCollection | CrewAI | Best-in-class tool management |
| Thread-as-Boundary | Dust.tt | Essential for chat context |
| Statistical Evals | Braintrust | Right approach to AI testing |
| 4-Tier Memory | ChatMemory | Complete hierarchy |
| Manifest Format | Awesome Skills | Proven skill structure |
| Streaming Progress | Replit | Great deploy UX |
| Bayesian Foundation | PyMC-Marketing | Trust through uncertainty |
Architectural Philosophy: Embodied vs Puppeteer
Repo2Run Pattern (Puppeteer)
🧠 (External LLM) ─────► 📦 (Dumb container)
- LLM remote-controls container
- Container has no intelligence
- Intelligence only during build time
- After build: container is static code
Our Approach (Embodied)
┌──────────────────────┐
│ 🧠 (Claude INSIDE) │
│ 📦 (Container/body) │
└──────────────────────┘
- Claude inhabits the container
- Container is Claude's body
- Intelligence at runtime
- Interactive collaboration with user
Trade-offs
| Aspect | Repo2Run | Decision AI |
|---|---|---|
| External rollback | Excellent | Requires orchestrator |
| Deterministic outputs | Yes | No (but adaptive) |
| Runtime adaptation | No | Yes |
| Domain expertise | None | Skills loaded in session |
| User collaboration | None | Interactive |
| Multi-repo composition | Hard | Flexible merging |
| Framework support | Python-only | Framework-agnostic |
Complete Competitor List
| # | Competitor | Category | What They Do (One-Liner) |
|---|---|---|---|
| 1 | CrewAI | Framework | Multi-agent orchestration with role-based collaboration and tool collections |
| 2 | LangGraph | Framework | Stateful graph-based workflows for LLM applications |
| 3 | Swarm | Framework | Lightweight multi-agent handoffs (educational, by OpenAI) |
| 4 | Claude-Flow | Framework | Enterprise multi-agent swarms with neural learning and MCP |
| 5 | AutoGen | Framework | Microsoft's multi-agent conversation framework |
| 6 | Pydantic-AI | Framework | Type-safe Python agents with structured outputs |
| 7 | VoltAgent | Platform | TypeScript full-stack agent framework with VoltOps observability |
| 8 | LLMStack | Platform | No-code visual builder for AI agents and workflows |
| 9 | BotSharp | Platform | .NET/C# agent framework with plugin architecture |
| 10 | Composio | SDK | 500+ app integrations for AI agents |
| 11 | Langfuse | Observability | LLM tracing, prompt management, and evaluation |
| 12 | Braintrust | Observability | Statistical AI evaluation with regression detection |
| 13 | AgentOps | Observability | Agent session replay and cost tracking |
| 14 | ChatMemory | Memory | 4-tier hierarchical memory for AI assistants |
| 15 | Glean | Knowledge | Enterprise permission-aware knowledge search |
| 16 | Dust.tt | Chat | Thread-aware Slack AI assistants |
| 17 | Clawdbot | Chat | 8-platform personal AI assistant (desktop) |
| 18 | KIRA | Chat | Privacy-first desktop AI coworker |
| 19 | Runbear | Chat | Tiered Slack/Teams bot platform |
| 20 | Awesome Claude Skills | Skills | Open-source skill manifest patterns |
| 21 | PyMC-Marketing | MMM | Bayesian causal marketing mix modeling |
| 22 | Meta Robyn | MMM | Automated MMM with Pareto optimization |
| 23 | Replit Templates | Templates | Full project templates with instant deployment |
| 24 | Railway Templates | Templates | One-click deployable app templates |
| 25 | Render Blueprints | Templates | Infrastructure-as-code deployment templates |
| 26 | Vercel Templates | Templates | Frontend/fullstack starter templates |
Master Comparison Table
| Competitor | Category | Primary Language | Deploy Model | Key Differentiator | Pricing |
|---|---|---|---|---|---|
| CrewAI | Framework | Python | Library | Role-based multi-agent + ToolCollection | OSS |
| LangGraph | Framework | Python | Library | Stateful graphs + checkpointing | OSS + Cloud |
| Swarm | Framework | Python | Library | Minimal primitives (educational) | OSS |
| Claude-Flow | Framework | TypeScript | Enterprise | 54+ agents + neural learning | OSS |
| AutoGen | Framework | Python | Library | Conversational multi-agent | OSS |
| Pydantic-AI | Framework | Python | Library | Type safety + structured outputs | OSS |
| VoltAgent | Platform | TypeScript | Hybrid | Full-stack + VoltOps console | OSS + Cloud |
| LLMStack | Platform | Python | Self-hosted | No-code visual builder | OSS + Cloud |
| BotSharp | Platform | C# | Enterprise | .NET ecosystem + plugins | OSS |
| Composio | SDK | TypeScript | Multi-framework | 500+ integrations | Freemium |
| Langfuse | Observability | TypeScript | Self-hosted | Tracing + prompt management | OSS + Cloud |
| Braintrust | Observability | Python | Cloud | Statistical evals + regression | Freemium |
| AgentOps | Observability | Python | Cloud | Session replay + cost tracking | Freemium |
| ChatMemory | Memory | Python | Library | 4-tier hierarchy + pgvector | OSS |
| Glean | Knowledge | - | Enterprise | Permission-aware search | Enterprise |
| Dust.tt | Chat | - | Cloud | Thread-aware Slack AI | Tiered |
| KIRA | Chat | Python | Desktop | Privacy-first, local-only | OSS |
| Clawdbot | Chat | TypeScript | Desktop | 8-platform personal AI | OSS |
| Runbear | Chat | - | Cloud | Tiered bot platform | Tiered |
| Awesome Skills | Skills | - | - | Manifest format pattern | OSS |
| PyMC-Marketing | MMM | Python | Library | Bayesian causal inference | OSS |
| Robyn | MMM | R | Library | Automated Pareto optimization | OSS |
| LightweightMMM | MMM | Python | Library | Google's Bayesian MMM | OSS |
| Nielsen | MMM | - | Service | Industry standard | Enterprise |
| Replit Agent | Deploy | - | Cloud | Zero-friction deploy | Freemium |
| Hex AI | Artifacts | - | Cloud | Professional notebooks | Tiered |
| v0.dev | Artifacts | - | Cloud | AI-generated UI preview | Freemium |
Document generated for Decision AI competitive analysis - January 2025. 26 competitors analyzed across 9 categories. Updated to reflect actual current state + Builder as Claude Factory architecture.
This document reflects the actual state of Decision Orchestrator as of January 2025. The pack system described is a future vision based on roadmap documents in the codebase.
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARTIFACT ANTI-PATTERNS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ PROBLEM: Wall of Text SOLUTION: Structured Artifact │
│ ───────────────────── ──────────────────────── │
│ │
│ "Here's your analysis: ┌──────────────────────────┐ │
│ The ROI for TV is 2.1x │ [Analysis Artifact] │ │
│ which is lower than digital │ │ │
│ at 3.2x but social is only │ ROI Summary Table │ │
│ 1.8x so you should..." │ Key Insight: Digital > TV│ │
│ │ [See Full Report] │ │
│ Can't scan, extract, or act └──────────────────────────┘ │
│ Scannable, actionable │
│ │
│ PROBLEM: No Next Steps SOLUTION: Action Buttons │
│ ────────────────────── ──────────────────────── │
│ │
│ "Here's the code." ┌──────────────────────────┐ │
│ │ [Code Artifact] │ │
│ User: "Now what?" │ │ │
│ │ [Run] [Copy] [Test] │ │
│ └──────────────────────────┘ │
│ │
│ PROBLEM: Lost Artifacts SOLUTION: Artifact Gallery │
│ ─────────────────────── ──────────────────────── │
│ │
│ User: "Where's that chart Session artifacts persisted │
│ you made earlier?" and browsable in sidebar │
│ │
│ Scroll, scroll, scroll... One-click to find any output │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Great artifacts are not just outputs—they're starting points for the next action. The difference between "here's some text" and "here's a structured artifact with clear next steps" is the difference between a chatbot and a productive AI assistant.
These principles apply to BOTH current and future systems:
| Principle | Why It Matters |
|---|---|
| Show Your Work | Explain what was detected and why; builds trust |
| Smart Defaults | Minimize decisions, but allow overrides |
| Real Progress | Show actual build status, not fake animations |
| Graceful Pauses | Interrupt for secrets without losing progress |
| Actionable Errors | Don't just say "failed"; say what to do |
| Verify Success | Prove it works (health check, response time) |
| Clear Next Steps | Always show what comes after deployment |
This document reflects the actual state of Decision Orchestrator as of January 2025. Builder Claude with meta-skills is IMPLEMENTED. Voice sessions and pack-based deployment are PLANNED.
How Decision AI routes requests through workflow classification, not model tiers
CRITICAL: This is NOT a Three-Tier System
Many AI systems use a three-tier model routing approach (Haiku → Sonnet → Opus based on complexity). Decision AI uses a fundamentally different approach: workflow-based routing.
| Aspect | Traditional Three-Tier | Decision AI Workflows |
|---|---|---|
| Routing basis | Request complexity | Message intent + channel scope |
| Decision point | Token cost optimization | Which workflow(s) to execute |
| Classification | Simple → Medium → Complex | Ambient detection + active triggers |
| Execution | Single model call | Workflow executor with tools |
The Workflow Classification System (IMPLEMENTED)
Decision AI uses LLM-based classification to match messages to configured workflows:
┌─────────────────────────────────────────────────────────────────────────────┐
│ WORKFLOW CLASSIFICATION FLOW (ACTUAL) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. Discord message arrives │
│ └── Message context: author, channel, content, attachments │
│ │
│ 2. Get applicable workflows for this scope │
│ └── Query: discord_workflow_scope (server_id, channel_id) │
│ └── Filter: is_enabled = true │
│ │
│ 3. Classify message against workflows │
│ └── OpenAI gpt-4.1 (fast model for classification) │
│ └── Structured output: WorkflowClassificationResult │
│ └── Returns: list of workflow_ids that match │
│ │
│ 4. Execute matching workflows │
│ └── Claude Agent Service with workflow instructions + tools │
│ └── MCP server provides tool access based on tool_slugs │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
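Step 3 of this flow can be sketched as follows. The `llm` callable stands in for the structured-output gpt-4.1 classification call, and the prompt shape and JSON result format are assumptions for illustration, not the production implementation.

```python
import json

def classify_message(content: str, workflows: list, llm) -> list:
    """Step 3 sketch: ask a fast model which configured workflows match.

    `llm` is a stand-in for the gpt-4.1 structured-output call; it is
    assumed to return a JSON list of workflow_ids as a string.
    """
    catalog = [{"id": w["id"], "name": w["name"]} for w in workflows]
    prompt = (
        "Given these workflows, return a JSON list of workflow_ids that "
        "match the user message.\n"
        f"Workflows: {json.dumps(catalog)}\n"
        f"Message: {content}"
    )
    matched = json.loads(llm(prompt))
    valid_ids = {w["id"] for w in workflows}
    # Drop any ids the model hallucinated
    return [wid for wid in matched if wid in valid_ids]
```

In production the structured-output API would constrain the response schema directly; the post-hoc filter above is a belt-and-braces guard.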
Workflow Database Schema
```sql
-- discord_workflow: Defines what workflows do
CREATE TABLE discord_workflow (
  id UUID PRIMARY KEY,
  name TEXT NOT NULL,                -- "Build Session", "Voice Assistant"
  instructions TEXT NOT NULL,        -- System prompt for Claude
  tool_slugs JSONB DEFAULT '[]',     -- ["CUSTOM_FLY_LAUNCH_SESSION", ...]
  config JSONB DEFAULT '{}',         -- Workflow-specific config
  output_schema JSONB,               -- Optional structured output
  interaction_mode TEXT DEFAULT 'autonomous',  -- 'autonomous' | 'interactive' | 'hybrid'
  trigger_type TEXT DEFAULT 'ambient',         -- 'ambient' | 'active'
  is_enabled BOOLEAN DEFAULT true
);

-- discord_workflow_scope: Where workflows apply
CREATE TABLE discord_workflow_scope (
  id UUID PRIMARY KEY,
  workflow_id UUID REFERENCES discord_workflow(id),
  server_id BIGINT,   -- NULL = all servers
  channel_id BIGINT,  -- NULL = all channels in server
  is_enabled BOOLEAN DEFAULT true
);
```
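The NULL-wildcard semantics of `discord_workflow_scope` (NULL server_id matches all servers; NULL channel_id matches all channels in the server) can be expressed as a small predicate. This mirrors the query's WHERE clause in Python for illustration; the real lookup is a database query.

```python
def workflow_in_scope(scope: dict, server_id: int, channel_id: int) -> bool:
    """Does a scope row apply to this message's server/channel?

    None mirrors SQL NULL: a None server_id matches every server,
    a None channel_id matches every channel in the server.
    """
    if not scope.get("is_enabled", True):
        return False
    if scope["server_id"] is not None and scope["server_id"] != server_id:
        return False
    if scope["channel_id"] is not None and scope["channel_id"] != channel_id:
        return False
    return True
```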
Trigger Types: Ambient vs Active
Decision AI distinguishes between two trigger modes: ambient workflows are matched via LLM classification against every in-scope message, while active workflows fire only on an explicit trigger, such as a slash command (/build, /voice) or joining a voice channel.

Interaction Modes

Workflows can also operate in different interaction modes:
Autonomous Mode
User: "Build github.com/user/repo as a streamlit app"
│
▼
Workflow executes without interruption:
1. Launch builder session
2. Send build instructions
3. Wait for completion
4. Report results
│
▼
User: "✅ Session deployed at https://mmm-abc123.fly.dev"
Interactive Mode
User: "Help me set up an analysis environment"
│
▼
Workflow pauses for decisions:
"Which template would you like?
1. mmm-studio (interactive notebook)
2. mmm-deepagent (autonomous analysis)
3. decision-pack-compiler (custom builds)"
│
▼
User: "mmm-studio"
│
▼
Workflow continues with user choice
Hybrid Mode
User: "Build this repo with authentication"
│
▼
Workflow executes autonomously but pauses on:
- Missing required secrets (ANTHROPIC_API_KEY)
- Ambiguous framework detection
- Critical errors requiring user decision
│
▼
Human-in-the-loop approval when needed
Example: Different behaviors in different channels
#general channel:
└── Generic assistant workflow (ambient, basic tools)
#mmm-analysis channel:
└── MMM workflow (ambient, full session tools)
#voice channel:
└── Voice workflow (active on join, voice tools)
#builds channel:
└── Builder workflow (active, builder tools)
Adding New Workflows
To add a new workflow:
1. Define the workflow in the database:

```sql
INSERT INTO discord_workflow (name, instructions, tool_slugs, trigger_type)
VALUES (
  'My New Workflow',
  'You are an assistant that helps with...',
  '["TOOL_A", "TOOL_B"]',
  'ambient'
);
```

2. Scope it to channels:

```sql
INSERT INTO discord_workflow_scope (workflow_id, server_id, channel_id)
VALUES ('workflow-uuid', 123456789, 987654321);
```

3. Ensure the tools exist in the MCP server:
   - Add a tool handler in workflow_tools/
   - Register it in create_mcp_server_for_workflows()
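Step 3 might look like the registry sketch below. This is a minimal illustration of the slug-to-handler mapping, assuming a decorator-based registry; the actual `create_mcp_server_for_workflows()` wiring and handler signatures are not shown in this document.

```python
# Hypothetical registry: maps tool_slugs to handler callables.
TOOL_HANDLERS = {}

def workflow_tool(slug: str):
    """Decorator registering a handler under its tool slug (sketch)."""
    def register(fn):
        TOOL_HANDLERS[slug] = fn
        return fn
    return register

@workflow_tool("TOOL_A")
def handle_tool_a(args: dict) -> dict:
    # Placeholder handler: echo the arguments back
    return {"ok": True, "echo": args}

def tools_for_workflow(tool_slugs: list) -> dict:
    """Resolve a workflow's tool_slugs to handlers, skipping unknown slugs."""
    return {slug: TOOL_HANDLERS[slug] for slug in tool_slugs if slug in TOOL_HANDLERS}
```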
Common Pitfalls
┌─────────────────────────────────────────────────────────────────────────────┐
│ WORKFLOW ROUTING PITFALLS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ PROBLEM: Multiple workflows match SOLUTION: Priority or exclusion │
│ ────────────────────────────── ───────────────────────────────│
│ │
│ "Build this repo" matches: Add workflow priority field │
│ - Generic assistant or use interaction_mode to │
│ - Builder workflow determine which takes precedence│
│ │
│ PROBLEM: Classification too slow SOLUTION: Active triggers │
│ ───────────────────────────── ────────────────────────────── │
│ │
│ Every message → LLM classification Use trigger_type='active' │
│ adds latency for explicit commands │
│ (/build, /voice, etc.) │
│ │
│ PROBLEM: Wrong scope SOLUTION: Scope validation │
│ ────────────────────── ──────────────────────── │
│ │
│ Workflow runs in wrong channel Always verify scope before │
│ (e.g., voice in text channel) execution │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
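The "multiple workflows match" fix could be sketched as below. Note the `priority` field is the proposed solution from the pitfalls box, not yet part of the schema, so both the field and the tie-break order are assumptions.

```python
def pick_workflow(matches: list):
    """Resolve multiple matching workflows: prefer an explicit priority
    field (lower = higher priority), then break ties by how much user
    interaction the mode implies. Illustrative only."""
    if not matches:
        return None
    mode_rank = {"interactive": 0, "hybrid": 1, "autonomous": 2}
    return min(
        matches,
        key=lambda w: (
            w.get("priority", 100),
            mode_rank.get(w.get("interaction_mode"), 3),
        ),
    )
```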
Workflow-based routing gives Decision AI the flexibility to behave differently across contexts while maintaining a unified architecture. The key insight is that routing isn't about model capability—it's about matching user intent to the right set of tools and instructions.
```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class VoiceSession:
    """Voice session with thread support."""
    session_id: str
    thread_id: str                      # Discord thread for artifacts
    curated_context: str = "You are a helpful voice assistant."
    conversation_log: list = field(default_factory=list)
    voice_client: Any = None            # Discord voice connection
    last_processed_index: int = 0       # Supervisor progress marker
    current_mode: str = "discovery"     # "discovery" or "delivery"
    text_only_mode: bool = False        # Skip STT, users type in thread
    # Embed state for UI
    queue_embed_id: int | None = None
    status_embed_id: int | None = None
    queued_indices: list = field(default_factory=list)
    processed_indices: list = field(default_factory=list)
```
Supervisor Loop Pattern
The supervisor runs as a background coroutine, polling every 5 seconds:
```python
class SupervisorLoop:
    """Continuous polling loop that curates context for the Fast Agent."""

    async def _poll_cycle(self) -> None:
        """Execute one poll cycle - analyze state and update context."""
        # Get gap messages (new since last processed)
        conv_log = self.session.conversation_log
        last_idx = self.session.last_processed_index
        gap_messages = conv_log[last_idx:]
        if not gap_messages:
            return  # Nothing new

        # Fetch thread context for additional info
        thread_context = await self._fetch_thread_context()

        # Build prompt for Claude
        user_message = self._build_poll_prompt(
            conv_log, gap_messages, last_idx, thread_context
        )

        # Run Claude Agent with full tool access
        await self._run_supervisor_agent(user_message)

        # Update progress marker
        self.session.last_processed_index = len(conv_log)
```
Supervisor Tools
```python
supervisor_tools = [
    # Thread communication
    "CUSTOM_VOICE_SEND_TO_THREAD",   # Post messages/artifacts to thread
    "CUSTOM_VOICE_MESSAGE_EDIT",     # Edit previous messages
    "CUSTOM_VOICE_MESSAGE_DELETE",   # Delete messages
    # Session management
    "CUSTOM_VOICE_SESSION_WRITE",    # Update curated_context
    "CUSTOM_VOICE_SESSION_STOP",     # End the voice session
    "CUSTOM_VOICE_CREATE_ARTIFACT",  # Create rich artifacts
    # Discord read access
    "CUSTOM_DISCORD_READ_THREAD",    # Read thread history
    "CUSTOM_DISCORD_READ_CHANNEL",   # Read channel history
    "CUSTOM_DISCORD_PARSE_LINK",     # Extract content from Discord links
]
```
Curated Context Format
The supervisor writes structured context that the Fast Agent can quickly parse:
```markdown
## CURATED SUMMARY

## Research Question
[Original user query]

## Summary
[High-level findings answering the user's question]

## Detailed Findings

### [Component/Area 1]
- Finding with reference ([file.ext:line](link))
- Connection to other components
- Implementation details

### [Component/Area 2]
...

## Code References
- `path/to/file.py:123` - Description of what's there
- `another/file.ts:45-67` - Description of the code block

## Architecture Insights
[Patterns, conventions, and design decisions discovered]

## Historical Context (from conversation history)
[Relevant insights from conversation history]

## Open Questions
[Any areas that need further investigation]

## RECENT CONTEXT (since last poll)
[Key points from gap_messages Fast Agent should know]

## VOICE GUIDELINES
[Any specific instructions for this turn]
```
Text-Only Mode
Voice sessions can operate in text-only mode (no STT/TTS):
┌─────────────────────────────────────────────────────────────────────────────┐
│ TEXT-ONLY VOICE SESSION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ When text_only_mode = True: │
│ │
│ • Users type in the Discord thread │
│ • Fast Agent still responds quickly (text instead of voice) │
│ • Supervisor still analyzes and curates │
│ • Same 3-Claude architecture, just no audio │
│ │
│ Use case: Users who prefer typing, noisy environments, accessibility │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
ACP Communication Pattern
Claude sessions communicate via the Agent Communication Protocol:
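As a concrete illustration, an ACP message carried over SSE might look like the sketch below. The envelope fields and event name are assumptions for this document, since the protocol's actual wire format is not specified here; only the SSE framing itself (`event:`/`data:` lines terminated by a blank line) is standard.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class ACPMessage:
    """Hypothetical ACP envelope; field names are illustrative."""
    source_session: str
    target_session: str
    kind: str        # e.g. "instruction", "status", "result"
    payload: dict

def to_sse_event(msg: ACPMessage) -> str:
    """Serialize the envelope as a Server-Sent Events frame."""
    return f"event: acp\ndata: {json.dumps(asdict(msg))}\n\n"
```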
In Decision AI, sessions provide natural thread boundaries:
┌─────────────────────────────────────────────────────────────────────────────┐
│ DECISION AI SESSION MEMORY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ORCHESTRATOR (Discord) │
│ └── User interacts via Discord messages │
│ └── Each Discord thread can map to one or more sessions │
│ │
│ SESSION (Fly.io) │
│ └── Each session = isolated context │
│ └── Session's .claude/CLAUDE.md = persistent instructions │
│ └── Session's skills/ = domain knowledge │
│ └── Git repo = full state history │
│ │
│ CONTEXT FLOW: │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Orchestrator │ ACP │ Session A │ │
│ │ Context │ ──────► │ Context │ │
│ │ │ │ │ │
│ │ • User prefs │ │ • CLAUDE.md │ │
│ │ • Thread hist │ │ • Skills │ │
│ │ • Task state │ │ • Git history │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
│ KEY INSIGHT: Session = Thread with its own persistent brain │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Thread Types Summary
| Thread Type | Purpose | Claude Involvement | Persistence |
|---|---|---|---|
| Workflow Thread | Group workflow execution messages | Orchestrator only | Message duration |
| Builder Thread | Track build progress | Builder Claude via ACP | Until cleanup |
| Session Thread | Interactive session communication | Session Claude via ACP | Session lifetime |
| Voice Thread | Artifacts + detailed output | Supervisor posts here | Session lifetime |
Context Window Management
Fast Agent Context Window (Optimized for Speed)
Fast Agent sees:
┌─────────────────────────────────────────────────────────────────────────────┐
│ [System Prompt: ~500 tokens] │
│ [Curated Context: ~2000 tokens - compressed by Supervisor] │
│ [Recent Conversation: last 5-10 messages - ~1000 tokens] │
│ [User's Current Message] │
└─────────────────────────────────────────────────────────────────────────────┘
Total: ~4000 tokens → Fast response possible
Supervisor Context Window (Optimized for Depth)
Supervisor sees:
┌─────────────────────────────────────────────────────────────────────────────┐
│ [System Prompt: ~1500 tokens] │
│ [Thread History: full Discord thread - potentially large] │
│ [Full Conversation Log: all voice exchanges] │
│ [Previous Curated Context] │
│ [Gap Messages: new since last poll] │
│ [Tool Call Results: from research/actions] │
└─────────────────────────────────────────────────────────────────────────────┘
Total: Can be large, but Opus handles it. Polling frequency limits growth.
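The Fast Agent's small-window assembly can be sketched as a budgeted merge: fixed system prompt and curated context first, then as many recent messages as fit. The 4-characters-per-token estimate and function shape are illustrative assumptions; the real implementation would use an actual tokenizer.

```python
def build_fast_agent_context(system_prompt, curated_context, messages,
                             budget_tokens=4000):
    """Assemble the Fast Agent window under a token budget (sketch)."""
    est = lambda text: len(text) // 4 + 1   # crude ~4 chars/token estimate
    used = est(system_prompt) + est(curated_context)
    recent = []
    for msg in reversed(messages):          # walk newest-first
        cost = est(msg)
        if used + cost > budget_tokens:
            break                           # window full
        recent.append(msg)
        used += cost
    # Restore chronological order for the kept tail of the conversation
    return [system_prompt, curated_context] + list(reversed(recent))
```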
Common Patterns
Pattern 1: Research → Delivery
User: "How does the authentication system work?"
Supervisor:
1. Detects research question
2. Uses tools to read codebase
3. Writes curated_context with findings
4. Posts detailed analysis to thread
Fast Agent:
1. Reads curated_context
2. Delivers verbal summary
3. Points user to thread for details
Pattern 2: Action → Confirmation
User: "Start an MMM session for me"
Supervisor:
1. Detects action request
2. Calls CUSTOM_FLY_LAUNCH_SESSION
3. Updates curated_context with result
4. Posts session URL to thread
Fast Agent:
1. Confirms action taken
2. Provides session URL verbally
Pattern 3: Multi-Turn Discovery
User: "I want to analyze my marketing data"
Fast Agent:
"What kind of analysis? Budget optimization,
channel attribution, or trend forecasting?"
User: "Channel attribution"
Supervisor:
1. Notes the preference
2. Updates curated_context with mode
Fast Agent:
"Great! Do you have your data ready, or
should I help you format it first?"
Design Principles
| Principle | Implementation |
|---|---|
| Separation of concerns | Fast Agent speaks, Supervisor thinks |
| Latency optimization | Fast Agent has minimal context |
| Thread as artifact store | Detailed info goes to thread, not voice |
| Resilience | Session persists even if voice disconnects |
| Transparency | Thread shows what Supervisor is doing |
| Session isolation | Each session = separate context, no bleed |
The key insight of Decision AI's conversation design is that different contexts require different conversation strategies. Voice needs speed (Fast Agent), complex research needs depth (Supervisor), and persistent artifacts need a home (threads). The 3-Claude architecture separates these concerns while keeping them coordinated.
How AI agents persist memory and context across sessions
CRITICAL: This is PLANNED - NOT YET BUILT
IMPORTANT: The sophisticated memory layer described in this document is a FUTURE VISION. The current Decision AI implementation uses session-level persistence (via git repos and .claude/ directories) but does NOT yet have cross-session user memory, organizational memory, or the retrieval mechanisms described here.
What EXISTS Today
| Memory Type | Status | How It Works |
|---|---|---|
| Session Memory | IMPLEMENTED | Git repo tracks all session changes |
| Session Context | IMPLEMENTED | .claude/CLAUDE.md + skills/ |
| Template Reuse | IMPLEMENTED | Save session as template, relaunch later |
| Voice Curated Context | IMPLEMENTED | Supervisor Loop updates curated_context |
| User Memory | NOT YET BUILT | Planned for future |
| Org Memory | NOT YET BUILT | Planned for future |
| Cross-Session | NOT YET BUILT | Planned for future |
The Core Problem
AI agents are stateless by default. Every request starts fresh with zero memory of past interactions. This creates a frustrating experience where users repeat themselves and context is lost.
The solution: A layered memory architecture that stores, retrieves, and injects relevant context at the right moments.
Key insight: Each layer inherits from the layer above. A single request would have access to all five layers, merged into one coherent context.
Current Implementation: Session Memory Only
Decision AI currently implements session-level memory via git repos and .claude/ directories:
┌─────────────────────────────────────────────────────────────────────────────┐
│ CURRENT SESSION MEMORY (IMPLEMENTED) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ SESSION (Fly.io machine) │
│ ├── /workspace/app/ # User's code │
│ ├── /workspace/.claude/ # Session's brain │
│ │ ├── CLAUDE.md # Execution rules + purpose │
│ │ └── skills/ # Domain-specific knowledge │
│ └── .git/ # Full history │
│ │
│ GIT REPO (github.com/org/session-mmm-{hex}) │
│ └── All changes tracked with commits │
│ └── Tags for snapshots: "template/my-analysis-v1" │
│ └── Full browsable history │
│ │
│ PERSISTENCE PATTERNS: │
│ • Session running: Full context in Claude's window + .claude/ │
│ • Session stopped: Git repo preserves state │
│ • Template saved: Can relaunch from saved state │
│ │
│ VOICE SESSION MEMORY: │
│ • Supervisor curates context into curated_context │
│ • Gap messages tracked with last_processed_index │
│ • Thread history provides additional context │
│ │
│ LIMITATIONS: │
│ • No cross-session user memory ("remember my preferences") │
│ • No "remember this for next time" │
│ • Each new session starts fresh (unless launched from template) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Proposed pgvector Schema (FUTURE)
Based on the MEMORY_LAYER_ARCHITECTURE.md design document in the codebase:
```sql
-- Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;

-- Memory entries with scope and permissions
CREATE TABLE memories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    org_id UUID NOT NULL REFERENCES orgs(id),
    project_id UUID REFERENCES projects(id),   -- NULL = org-wide
    session_id UUID REFERENCES sessions(id),   -- NULL = persistent

    -- Content
    content TEXT NOT NULL,
    embedding vector(1536) NOT NULL,

    -- Metadata
    memory_type TEXT NOT NULL,   -- 'fact', 'decision', 'context', 'preference'
    source TEXT,                 -- 'user', 'claude', 'system'
    tags TEXT[],

    -- Permissions
    visibility TEXT DEFAULT 'project',   -- 'session', 'project', 'org'

    -- Timestamps
    created_at TIMESTAMPTZ DEFAULT NOW(),
    accessed_at TIMESTAMPTZ DEFAULT NOW(),
    expires_at TIMESTAMPTZ               -- NULL = permanent
);

-- Optimized index for vector search within scope
CREATE INDEX idx_memories_org_embedding
    ON memories USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
```
Permission Model (FUTURE)
┌─────────────────────────────────────────────────────────────┐
│ ORG SCOPE │
│ visibility = 'org' │
│ Accessible to all projects/sessions in org │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ PROJECT SCOPE │ │
│ │ visibility = 'project' │ │
│ │ Accessible to all sessions in project │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ SESSION SCOPE │ │ │
│ │ │ visibility = 'session' │ │ │
│ │ │ Only accessible to current session │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
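The nested scopes above reduce to a simple access predicate. The sketch below mirrors the diagram's rules in Python for illustration (the planned implementation would enforce this in the database query); it assumes org membership has already been verified.

```python
def can_access(memory: dict, session_id, project_id) -> bool:
    """Can the caller's session/project see this memory? (sketch)

    org-scoped: visible anywhere in the org.
    project-scoped: visible within the project, or everywhere
        when the memory's project_id is None (org-wide).
    session-scoped: visible only to the owning session.
    """
    visibility = memory["visibility"]
    if visibility == "org":
        return True
    if visibility == "project":
        return memory["project_id"] in (None, project_id)
    if visibility == "session":
        return memory["session_id"] == session_id
    return False  # unknown visibility: deny by default
```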
Scoped Retrieval (FUTURE)
The retrieval function respects permission boundaries:
CREATE OR REPLACE FUNCTION search_memories(
    p_org_id UUID,
    p_project_id UUID,
    p_session_id UUID,
    p_query_embedding vector(1536),
    p_limit INT DEFAULT 10,
    p_memory_types TEXT[] DEFAULT NULL,
    p_min_similarity FLOAT DEFAULT 0.7
)
RETURNS TABLE (
    id UUID,
    content TEXT,
    memory_type TEXT,
    similarity FLOAT,
    scope TEXT
) AS $$
BEGIN
    RETURN QUERY
    SELECT m.id,
           m.content,
           m.memory_type,
           1 - (m.embedding <=> p_query_embedding) AS similarity,
           CASE
               WHEN m.session_id = p_session_id THEN 'session'
               WHEN m.project_id = p_project_id THEN 'project'
               ELSE 'org'
           END AS scope
    FROM memories m
    WHERE m.org_id = p_org_id
      AND (
          (m.visibility = 'session' AND m.session_id = p_session_id)
          OR (m.visibility = 'project' AND (m.project_id = p_project_id OR m.project_id IS NULL))
          OR m.visibility = 'org'
      )
      AND (m.expires_at IS NULL OR m.expires_at > NOW())
      AND (p_memory_types IS NULL OR m.memory_type = ANY(p_memory_types))
      AND (1 - (m.embedding <=> p_query_embedding)) >= p_min_similarity
    ORDER BY
        CASE
            WHEN m.session_id = p_session_id THEN 1
            WHEN m.project_id = p_project_id THEN 2
            ELSE 3
        END,
        m.embedding <=> p_query_embedding
    LIMIT p_limit;
END;
$$ LANGUAGE plpgsql;
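For clients that merge results from several sources, the function's `ORDER BY` can be mirrored in application code: session-scoped hits first, then project, then org, with ties broken by ascending cosine distance. A minimal sketch (the `scope_tier` and `rank_memories` helpers are illustrative, assuming each row carries its raw cosine `distance` to the query embedding):

```python
def scope_tier(row: dict, project_id: str, session_id: str) -> int:
    """Mirror of the SQL CASE: 1 = session scope, 2 = project, 3 = org."""
    if row.get("session_id") == session_id:
        return 1
    if row.get("project_id") == project_id:
        return 2
    return 3

def rank_memories(rows: list, project_id: str, session_id: str, limit: int = 10) -> list:
    """Sort by (scope tier, cosine distance), matching search_memories' ORDER BY."""
    ordered = sorted(
        rows,
        key=lambda r: (scope_tier(r, project_id, session_id), r["distance"]),
    )
    return ordered[:limit]
```

Note that a session-scoped memory outranks a closer project-scoped one; the tier always dominates distance, just as in the SQL.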
Memory Types (FUTURE)
| Type | Description | Default Visibility | TTL |
|------------|------------------------|---------|-----------|
| fact | Learned information | project | permanent |
| decision | Architectural choices | project | permanent |
| context | Conversation context | session | 24h |
| preference | User preferences | project | permanent |
| policy | Org-wide rules | org | permanent |
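These defaults lend themselves to a small lookup at write time. A possible sketch, where `MEMORY_DEFAULTS` and `default_expiry` are hypothetical names rather than anything in the current codebase:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# (default_visibility, ttl) per memory type; a ttl of None means permanent.
MEMORY_DEFAULTS: dict = {
    "fact":       ("project", None),
    "decision":   ("project", None),
    "context":    ("session", timedelta(hours=24)),
    "preference": ("project", None),
    "policy":     ("org",     None),
}

def default_expiry(memory_type: str, now: Optional[datetime] = None) -> Optional[datetime]:
    """Compute expires_at for a new memory; None means the row never expires."""
    _, ttl = MEMORY_DEFAULTS[memory_type]
    if ttl is None:
        return None
    return (now or datetime.now(timezone.utc)) + ttl
```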
Retrieval Patterns (FUTURE)
1. Session Context Retrieval
# Get relevant context for the current conversation
memories = await store.search(
    query="user preferences for code style",
    org_id=org_id,
    project_id=project_id,
    session_id=session_id,
    types=['preference', 'context'],
    limit=5
)
- **Layer appropriately** - Store at the narrowest scope that makes sense
- **Extract selectively** - Not everything is worth remembering
- **Retrieve relevantly** - Only inject what helps the current task
- **Decay gracefully** - Old memories may be wrong
- **Respect privacy** - Users control their own data
- **Resolve conflicts** - Clear rules when memories contradict
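The conflict principle can be made concrete with a tiny resolver: a narrower scope beats a wider one, and among equal scopes the most recent memory wins. This is one possible policy, sketched with a hypothetical `resolve_conflict` helper (assuming `created_at` is a numeric timestamp):

```python
SCOPE_RANK = {"session": 0, "project": 1, "org": 2}

def resolve_conflict(candidates: list) -> dict:
    """
    Pick a single winner among contradictory memories:
    narrower scope beats wider scope, and among memories of equal
    scope the most recently created one wins.
    """
    return min(
        candidates,
        key=lambda m: (SCOPE_RANK[m["scope"]], -m["created_at"]),
    )
```

So a stale org-wide policy never overrides a fresh session-level correction, and within a scope the newest statement is taken as authoritative.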
Common Anti-Patterns
| Problem | Symptom | Solution |
|-----------------|--------------------------------|------------------------------|
| Memory Hoarding | Slow retrieval, noisy context | Selective extraction |
| Stale Memories | Wrong preferences applied | Decay policies, validation |
| Scope Leakage | User A sees User B's data | Enforce scope at query time |
| Over-Retrieval | 100 memories for simple question | Tiered injection |
| No User Control | "What does it know about me?" | Memory dashboard |
Memory transforms AI from a stateless tool into an intelligent partner. The goal isn't perfect recall—it's the right memory at the right time.
Currently, Decision AI implements session-level memory. Cross-session memory is planned for future development based on the pgvector schema design already in the codebase.
When to use automated checks vs. human review:
Automated Only           Human Required
──────────────           ──────────────
• Format correctness     • CLAUDE.md quality
• Build success          • Skill appropriateness
• Deployment health      • User experience
• Safety filters         • Edge case decisions
• Tool compliance        • Insight relevance
Key Principles
| Principle | Why |
|--------------------------|----------------------------------|
| Statistical significance | Don't alert on noise |
| Multi-dimensional | Multiple scorers, not single metric |
| Regression focus | Catch degradation early |
| Dataset versioning | Content-addressed test sets |
| Fast feedback | Quick evals block PRs |
| Insight recovery | Voice quality is measurable |
Evaluation is about confidence that the system works. In Decision AI, we measure builder reliability, session compliance, and voice insight recovery to ensure quality across all components.
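To make "don't alert on noise" concrete, a regression gate can require a statistically significant drop in pass rate before flagging. A sketch using a two-proportion z-test under a normal approximation; the `regression_detected` helper is illustrative, not the actual eval harness:

```python
from math import sqrt

def regression_detected(base_pass: int, base_n: int,
                        new_pass: int, new_n: int,
                        z_threshold: float = 1.96) -> bool:
    """
    Flag a regression only when the drop in pass rate is statistically
    significant (two-proportion z-test), so that ordinary run-to-run
    noise does not trigger alerts.
    """
    p1, p2 = base_pass / base_n, new_pass / new_n
    if p2 >= p1:
        return False  # improvements are never flagged
    pooled = (base_pass + new_pass) / (base_n + new_n)
    se = sqrt(pooled * (1 - pooled) * (1 / base_n + 1 / new_n))
    if se == 0:
        return p2 < p1
    return (p1 - p2) / se > z_threshold
```

With the default 1.96 threshold (roughly p < 0.05, one-sided), a 2-point dip on a 100-case suite passes quietly while a 12-point collapse on a larger suite blocks the PR.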
This document provides a realistic roadmap based on the ACTUAL current state of Decision AI and the FUTURE vision from thoughts/ documents in the codebase.
KEY INSIGHT: The original roadmap described building from scratch. However, significant infrastructure ALREADY EXISTS, including Builder Claude with meta-skills, workflow routing, and voice session architecture. This updated roadmap acknowledges what's built and focuses on what's needed next.
The system uses a configuration spectrum from hardcoded to dynamic:
HARDCODED ←─────────────────────────────────────────────────────→ DYNAMIC

Template Apps          Supabase Templates       Git-based Templates
(in code)              (DB records)             (external repos)

TEMPLATE_APPS = {      session_templates        Clone, analyze,
  "mmm-studio"...      table with fly_app       use .claude/ from
}                      reference                user's repo
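One way to read the spectrum is as a fallback chain at resolution time: in-code templates first, then database records, then a git repo cloned on demand. A hedged sketch; the `resolve_template` helper and its injected `db_lookup`/`git_clone` callables are hypothetical, and only the `TEMPLATE_APPS` name comes from the diagram above:

```python
from typing import Callable, Optional

# Hardcoded end of the spectrum (contents here are illustrative).
TEMPLATE_APPS = {"mmm-studio": {"fly_app": "mmm-studio-template"}}

def resolve_template(
    name: str,
    db_lookup: Callable[[str], Optional[dict]],
    git_clone: Callable[[str], dict],
) -> dict:
    """
    Walk the spectrum left to right: in-code templates first, then
    Supabase session_templates records, then a git-based template
    cloned and analyzed on demand.
    """
    if name in TEMPLATE_APPS:
        return {"source": "hardcoded", **TEMPLATE_APPS[name]}
    record = db_lookup(name)
    if record is not None:
        return {"source": "database", **record}
    return {"source": "git", **git_clone(name)}
```

Injecting the database and git steps as callables keeps the chain testable without a live Supabase instance or network access.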
Git as Source of Truth
- Every session creates a GitHub repo
- All changes tracked with commits
- Templates can be saved as tags/branches
- Full browsable history for audit
Builder IS the Session
Builder deploys a copy of itself → New Session (same image, different app name)
No separate "base session image"—Builder Claude uses meta-skills to construct the right environment for any repo.
Next Actions
Immediate (This Week)
- ✅ Builder Claude already operational
- Review error handling in current builds
- Add build caching to avoid redundant deploys
- Document Discord commands for users
Short-term (Next 2 Weeks)
- Improve template listing UX in Discord
- Add git_repo_url field to templates
- Benchmark insight recovery in voice sessions
Medium-term (Weeks 3-6)
- Implement voice enhancement features
- Design pack.yaml schema
- Create 3-5 reference packs (MMM, data exploration, API dev)
Summary
| What | Status | Timeline |
|---------------------|------------|-------------|
| Core Infrastructure | ✅ DONE | Complete |
| Builder Claude | ✅ DONE | Complete |
| Git Session Repos | ✅ DONE | Complete |
| Voice Architecture | ✅ DONE | Complete |
| Workflow Routing | ✅ DONE | Complete |
| Stabilization | 🔨 NEXT | Weeks 1-3 |
| Enhanced Templates | 📋 PLANNED | Weeks 4-6 |
| Voice Enhancement | 📋 PLANNED | Weeks 7-10 |
| Pack System | 📋 PLANNED | Weeks 11-14 |
| Memory Layer | 📋 PLANNED | Weeks 15+ |
Key insight: The system is MORE complete than expected. Voice architecture, workflow routing, and Builder Claude with meta-skills are all already implemented. The remaining work focuses on stabilization, enhanced templates, voice enhancement, the pack system, and the memory layer.
This document reflects the actual state of Decision AI as of January 2026. Timeline estimates are based on building incrementally on existing infrastructure. Voice architecture, workflow routing, and Builder Claude are IMPLEMENTED, not planned.
This directory contains a comprehensive analysis of Decision AI's architecture, comparing it to competitors and documenting the actual implemented system.