@clsandoval
Created January 16, 2026 09:00
LangMem Integration Analysis for Decision AI

LangMem Integration Strategy

How Decision AI can leverage LangMem for intelligent memory management across voice sessions, Decision Packs, and multi-agent orchestration


Executive Summary

Why LangMem?

LangMem provides a production-ready foundation for the memory layer Decision AI needs but hasn't yet built. Instead of implementing memory extraction, storage, and retrieval from scratch, LangMem offers:

  1. Memory Manager - Extracts semantic, episodic, and procedural memories from conversations
  2. Store Manager - Automatically persists memories to any BaseStore (including AsyncPostgresStore/Supabase)
  3. Memory Tools - Allows agents to consciously manage their own memories
  4. Summarization - Handles long context through intelligent summarization
  5. Prompt Optimizer - Refines system prompts based on conversation feedback
  6. Namespace Scoping - Hierarchical memory organization by user/pack/session

The 70% Fit Verdict

LangMem covers roughly 70% of Decision AI's memory needs out of the box. Of the 11 needs listed, 6 are fully covered, 4 partially, and 1 not at all:

| Need | LangMem Coverage | Gap |
|------|------------------|-----|
| Memory extraction from conversations | Full coverage | - |
| Semantic search over memories | Full coverage | - |
| User preference tracking | Full coverage | - |
| Thread/session summarization | Full coverage | - |
| Namespace-based scoping | Full coverage | - |
| Custom extraction schemas | Full coverage | - |
| Voice session context | Partial | Need streaming-aware extraction |
| Multi-agent memory bridge | Partial | Need custom bridging layer |
| Decision outcome tracking | Not covered | Need custom implementation |
| Pack-level shared memory | Partial | Need permission model |
| Hot-path voice latency | Partial | Need optimization layer |

Recommendation: Adopt LangMem as the foundation, build ~30% custom on top.
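
The ~70% figure is consistent with a simple weighted count of the table above. The sketch below assumes a weighting that the analysis does not state (full = 1, partial = 0.5, not covered = 0), so treat it as one plausible reading of the number rather than its official derivation:

```python
# Coverage levels are copied from the table above. The 1 / 0.5 / 0 weights are
# an assumption made for this illustration only.
WEIGHTS = {"full": 1.0, "partial": 0.5, "none": 0.0}

coverage = {
    "Memory extraction from conversations": "full",
    "Semantic search over memories": "full",
    "User preference tracking": "full",
    "Thread/session summarization": "full",
    "Namespace-based scoping": "full",
    "Custom extraction schemas": "full",
    "Voice session context": "partial",
    "Multi-agent memory bridge": "partial",
    "Decision outcome tracking": "none",
    "Pack-level shared memory": "partial",
    "Hot-path voice latency": "partial",
}


def weighted_coverage(levels: dict[str, str]) -> float:
    """Average the per-need weights to get a single coverage score."""
    return sum(WEIGHTS[level] for level in levels.values()) / len(levels)


score = weighted_coverage(coverage)
print(f"{score:.0%}")  # 73%
```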


What LangMem Provides

1. Memory Manager (create_memory_manager)

The core extraction engine that processes conversations and generates structured memories:

from langmem import create_memory_manager
from pydantic import BaseModel

class DecisionMemory(BaseModel):
    """Custom schema for Decision AI memories."""
    content: str
    decision_type: str  # 'preference', 'fact', 'outcome', 'pattern'
    confidence: float
    source_context: str | None = None

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[DecisionMemory],
    instructions="""Extract user preferences, decisions made,
    and patterns from conversations. Note confidence levels.""",
    enable_inserts=True,
    enable_updates=True,
    enable_deletes=True,
)

# Extract memories from conversation
memories = manager.invoke({
    "messages": conversation,
    "existing": previous_memories  # For update/consolidation
})

Key features:

  • Custom Pydantic schemas for structured extraction
  • Automatic deduplication and consolidation
  • Update/delete existing memories when information changes
  • Works with any chat model LangChain can initialize (the model is passed as a "provider:model" string)

2. Store Manager (create_memory_store_manager)

Handles persistence automatically with LangGraph's BaseStore:

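import os

from langmem import create_memory_store_manager
from langgraph.store.postgres import AsyncPostgresStore

async def main():
    # Use Supabase as the backing store. from_conn_string() is an async
    # context manager, so the store is only usable inside this block.
    async with AsyncPostgresStore.from_conn_string(
        os.environ["SUPABASE_DB_URL"],
        index={
            "dims": 1536,
            "embed": "openai:text-embedding-3-small",
        },
    ) as store:
        await store.setup()  # First run only: creates the tables and vector index

        memory_manager = create_memory_store_manager(
            "anthropic:claude-3-5-sonnet-latest",
            # "{user_id}" is filled in from the run configuration at call time
            namespace=("decision_ai", "{user_id}", "memories"),
            store=store,
            query_limit=10,
        )
        # Use memory_manager here, while the connection is still open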

3. Memory Tools

Allow agents to consciously manage their own memories:

from langmem import create_manage_memory_tool, create_search_memory_tool

tools = [
    create_manage_memory_tool(
        namespace=("decision_ai", "{user_id}", "memories")
    ),
    create_search_memory_tool(
        namespace=("decision_ai", "{user_id}", "memories")
    ),
]

# Agent can now call:
# - manage_memory(action="create", content="User prefers dark mode")
# - search_memory(query="user preferences")

4. Summarization (SummarizationNode)

Manages long context through intelligent summarization:

from langmem.short_term import SummarizationNode

# `model` and `summarization_model` are LangChain chat models defined elsewhere

summarization_node = SummarizationNode(
    token_counter=model.get_num_tokens_from_messages,
    model=summarization_model,
    max_tokens=4096,
    max_tokens_before_summary=8192,
    max_summary_tokens=1024,
)

5. Prompt Optimizer (create_prompt_optimizer)

Refines system prompts based on conversation outcomes:

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3}
)

# Improve prompt based on user feedback
optimized = optimizer.invoke({
    "trajectories": [(conversation, {"user_score": 0.3})],
    "prompt": current_system_prompt
})

6. Reflection Executor

Background memory processing with debouncing:

from langmem import ReflectionExecutor

executor = ReflectionExecutor(memory_manager)

# Defer processing until conversation settles
executor.submit(
    {"messages": conversation},
    after_seconds=300  # Wait 5 minutes for activity to settle
)
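
The debouncing idea can be shown without LangMem: a newer submission for the same conversation cancels the pending one, so extraction runs once on the latest transcript. `DebouncedExecutor` below is a hypothetical asyncio stand-in written for this illustration and is not how `ReflectionExecutor` is implemented:

```python
import asyncio
from typing import Any, Awaitable, Callable


class DebouncedExecutor:
    """Hypothetical stand-in that demonstrates debouncing only."""

    def __init__(self, handler: Callable[[Any], Awaitable[None]]):
        self._handler = handler
        self._pending: dict[str, asyncio.Task] = {}

    def submit(self, thread_id: str, payload: Any, after_seconds: float) -> None:
        # A newer submission for the same thread replaces the pending one.
        previous = self._pending.pop(thread_id, None)
        if previous is not None:
            previous.cancel()
        self._pending[thread_id] = asyncio.create_task(
            self._run_later(thread_id, payload, after_seconds)
        )

    async def _run_later(self, thread_id: str, payload: Any, delay: float) -> None:
        await asyncio.sleep(delay)  # Cancelled here if a newer submit arrives
        self._pending.pop(thread_id, None)
        await self._handler(payload)

    async def drain(self) -> None:
        """Wait for every pending run to finish (used by the demo below)."""
        while self._pending:
            await asyncio.gather(*list(self._pending.values()), return_exceptions=True)


async def demo() -> list[int]:
    processed: list[int] = []

    async def extract(payload: dict) -> None:
        # Record how many messages the surviving submission carried.
        processed.append(len(payload["messages"]))

    executor = DebouncedExecutor(extract)
    executor.submit("thread-1", {"messages": ["hi"]}, after_seconds=0.05)
    executor.submit("thread-1", {"messages": ["hi", "one more thing"]}, after_seconds=0.05)
    await executor.drain()
    return processed


result = asyncio.run(demo())
print(result)  # [2]
```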

What We Need to Build Custom

1. Voice Context Manager

LangMem processes complete conversations, but voice sessions need streaming-aware memory extraction:

class VoiceContextManager:
    """Custom layer for voice session memory management."""

    def __init__(self, memory_manager: MemoryStoreManager):
        self.memory_manager = memory_manager
        self.pending_utterances: list = []
        self.last_extraction_time: datetime | None = None

    async def on_utterance(self, utterance: dict):
        """Handle real-time voice utterances."""
        self.pending_utterances.append(utterance)

        # Don't extract on every utterance - batch them
        if self._should_extract():
            await self._extract_memories()

    def _should_extract(self) -> bool:
        """Determine if we should trigger extraction."""
        # Extract after silence, after N utterances, or after time threshold
        return (
            len(self.pending_utterances) >= 10 or
            self._silence_detected() or
            self._time_threshold_exceeded()
        )

    async def _extract_memories(self):
        """Extract memories from buffered utterances."""
        messages = self._format_as_messages(self.pending_utterances)

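        # Awaiting here pauses this coroutine until extraction finishes, so run
        # on_utterance as a background task to keep it off the voice response path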
        await self.memory_manager.ainvoke({"messages": messages})

        self.pending_utterances = []
        self.last_extraction_time = datetime.now()

Why custom: Voice has unique requirements:

  • Can't wait for conversation "end" - it's continuous
  • Utterances arrive in real-time
  • Must not add latency to Fast Agent responses
  • Need to coordinate with Supervisor Loop timing

2. Multi-Agent Memory Bridge

Decision AI's 3-Claude voice architecture needs memory bridging:

class MultiAgentMemoryBridge:
    """Bridge memories between Fast Agent, Supervisor, and Session."""

    def __init__(
        self,
        store: BaseStore,
        user_id: str,
        session_id: str
    ):
        self.store = store
        self.user_id = user_id
        self.session_id = session_id

        # Different namespaces for different agents
        self.namespaces = {
            "user": ("decision_ai", user_id, "user_memories"),
            "session": ("decision_ai", user_id, session_id, "session_memories"),
            "supervisor": ("decision_ai", user_id, session_id, "supervisor_context"),
        }

    async def get_fast_agent_context(self) -> str:
        """Get minimal context for Fast Agent (latency-critical)."""
        # Only retrieve high-priority user memories
        user_memories = await self.store.asearch(
            self.namespaces["user"],
            query="user preferences and important facts",
            limit=5,
            filter={"priority": "high"}
        )
        return self._format_for_fast_agent(user_memories)

    async def get_supervisor_context(self) -> list:
        """Get full context for Supervisor (depth-critical)."""
        # Retrieve all relevant memories across namespaces
        results = await asyncio.gather(
            self.store.asearch(self.namespaces["user"], limit=20),
            self.store.asearch(self.namespaces["session"], limit=20),
        )
        return self._merge_results(results)

    async def promote_session_memory(self, memory_id: str):
        """Promote a session memory to user-level (persists across sessions)."""
        session_mem = await self.store.aget(
            self.namespaces["session"],
            memory_id
        )
        if session_mem:
            await self.store.aput(
                self.namespaces["user"],
                key=f"promoted_{memory_id}",
                value=session_mem.value
            )

Why custom: LangMem doesn't know about our multi-agent architecture:

  • Different agents need different memory views
  • Memory promotion (session to user) is Decision AI specific
  • Latency requirements differ by agent role

3. Decision Outcome Tracker

Track decisions and their outcomes for learning:

class DecisionOutcomeTracker:
    """Track decisions, outcomes, and learn patterns."""

    def __init__(self, memory_manager: MemoryStoreManager):
        self.memory_manager = memory_manager
        self.pending_decisions: dict[str, PendingDecision] = {}

    async def record_decision(
        self,
        decision_id: str,
        context: str,
        options_presented: list[str],
        option_chosen: str,
        rationale: str | None = None
    ):
        """Record a decision that was made."""
        decision = PendingDecision(
            id=decision_id,
            context=context,
            options=options_presented,
            chosen=option_chosen,
            rationale=rationale,
            timestamp=datetime.now()
        )
        self.pending_decisions[decision_id] = decision

        # Store as memory
        await self.memory_manager.aput(
            key=f"decision_{decision_id}",
            value={
                "kind": "Decision",
                "content": {
                    "context": context,
                    "chosen": option_chosen,
                    "rationale": rationale,
                    "outcome": "pending"
                }
            }
        )

    async def record_outcome(
        self,
        decision_id: str,
        outcome: str,
        satisfaction: float,  # 0-1
        learnings: str | None = None
    ):
        """Record the outcome of a previous decision."""
        if decision_id in self.pending_decisions:
            decision = self.pending_decisions.pop(decision_id)

            # Update the decision memory with outcome
            existing = await self.memory_manager.aget(f"decision_{decision_id}")
            if existing:
                existing.value["content"]["outcome"] = outcome
                existing.value["content"]["satisfaction"] = satisfaction
                existing.value["content"]["learnings"] = learnings

                await self.memory_manager.aput(
                    key=f"decision_{decision_id}",
                    value=existing.value
                )

            # Extract patterns if low satisfaction
            if satisfaction < 0.5:
                await self._extract_failure_pattern(decision, outcome)

    async def _extract_failure_pattern(self, decision, outcome):
        """Extract patterns from failed decisions for future avoidance."""
        # Use LLM to extract learnings
        pattern = await self._analyze_failure(decision, outcome)

        await self.memory_manager.aput(
            key=f"pattern_{uuid4()}",
            value={
                "kind": "Pattern",
                "content": {
                    "type": "failure_pattern",
                    "context": decision.context,
                    "pattern": pattern,
                    "avoid_in_future": True
                }
            }
        )

Why custom: LangMem extracts memories but doesn't track decision outcomes:

  • We need to link decisions to results
  • Learning from failures is Decision AI core functionality
  • Pattern extraction from outcomes is domain-specific

4. Streaming-Aware Summarization

Adapt summarization for voice streaming:

class StreamingSummarizer:
    """Summarization that works with voice streaming."""

    def __init__(self, base_summarizer: SummarizationNode):
        self.base_summarizer = base_summarizer
        self.running_summary: RunningSummary | None = None
        self.buffer: list = []

    async def on_message(self, message: dict) -> str | None:
        """Process incoming message, return summary if triggered."""
        self.buffer.append(message)

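        # Check if we need to summarize
        if self._should_summarize():
            # SummarizationNode takes and returns graph-state dicts: the running
            # summary travels under "context", and the condensed messages come
            # back under "summarized_messages" (the default output key).
            state = await self.base_summarizer.ainvoke({
                "messages": self.buffer,
                "context": {"running_summary": self.running_summary},
            })

            # The buffer is kept whole: the running summary records which
            # messages it already covers, so they are not summarized twice.
            self.running_summary = state.get("context", {}).get(
                "running_summary", self.running_summary
            )
            return self.running_summary.summary if self.running_summary else None

        return None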

    def get_context_for_fast_agent(self) -> str:
        """Get current summary + recent buffer for Fast Agent."""
        summary_text = self.running_summary.summary if self.running_summary else ""
        recent = self._format_recent_messages(self.buffer[-5:])

        return f"""## Summary
{summary_text}

## Recent Context
{recent}"""

Why custom: SummarizationNode triggers on token thresholds over a graph's message state, not on voice events:

  • Voice sessions are continuous - no clear end
  • Need incremental summarization
  • Must coordinate with Fast Agent context updates
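
`_should_summarize` is left undefined in the class above. A minimal sketch of a token-budget trigger follows. The default of 8192 mirrors `max_tokens_before_summary` from the SummarizationNode example; the four-characters-per-token estimate and both function names are assumptions made for this illustration:

```python
def approx_tokens(text: str) -> int:
    """Rough estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def should_summarize(buffer: list[dict], max_tokens_before_summary: int = 8192) -> bool:
    """Trigger summarization once the buffered messages exceed the token budget."""
    total = sum(approx_tokens(message.get("content", "")) for message in buffer)
    return total >= max_tokens_before_summary


buffer = [
    {"role": "user", "content": "x" * 40},       # ~10 tokens
    {"role": "assistant", "content": "y" * 40},  # ~10 tokens
    {"role": "user", "content": "z" * 40},       # ~10 tokens
]
over_budget = should_summarize(buffer, max_tokens_before_summary=25)
under_budget = should_summarize(buffer, max_tokens_before_summary=100)
```

A real implementation would more likely reuse the model's own token counter, as the SummarizationNode example does; the character heuristic just keeps this sketch dependency-free.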

5. Pack Memory Loader

Load Decision Pack memories into session context:

class PackMemoryLoader:
    """Load and manage Decision Pack shared memories."""

    def __init__(self, store: BaseStore):
        self.store = store

    async def load_pack_context(
        self,
        pack_id: str,
        user_id: str,
        query: str | None = None
    ) -> PackContext:
        """Load pack-level and user-level memories for a session."""

        # Pack-level memories (shared across all users of this pack)
        pack_memories = await self.store.asearch(
            ("packs", pack_id, "shared_memories"),
            query=query,
            limit=10
        )

        # User's memories within this pack
        user_pack_memories = await self.store.asearch(
            ("packs", pack_id, "users", user_id, "memories"),
            query=query,
            limit=10
        )

        # Pack's learned patterns (from all users, anonymized)
        pack_patterns = await self.store.asearch(
            ("packs", pack_id, "patterns"),
            query=query,
            limit=5
        )

        return PackContext(
            pack_memories=pack_memories,
            user_memories=user_pack_memories,
            patterns=pack_patterns
        )

    async def contribute_to_pack(
        self,
        pack_id: str,
        memory: dict,
        anonymize: bool = True
    ):
        """Contribute a memory back to the pack's shared knowledge."""
        if anonymize:
            memory = self._anonymize_memory(memory)

        await self.store.aput(
            ("packs", pack_id, "shared_memories"),
            key=str(uuid4()),
            value=memory
        )

Why custom: Decision Packs have unique memory requirements:

  • Pack-level shared memories (templates, patterns)
  • User-specific memories within pack context
  • Memory contribution/learning from usage
  • Permission model for shared vs private
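
`_anonymize_memory` is called above but not shown. One possible approach is to strip identifying keys at every nesting level before a memory enters the shared namespace. The key names in `IDENTIFYING_KEYS` are assumptions, since the real memory schema decides what counts as identifying:

```python
from typing import Any

# Assumed field names for illustration only.
IDENTIFYING_KEYS = frozenset({"user_id", "session_id", "email", "name"})


def anonymize_memory(memory: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the memory with identifying keys removed recursively."""

    def scrub(value: Any) -> Any:
        if isinstance(value, dict):
            return {k: scrub(v) for k, v in value.items() if k not in IDENTIFYING_KEYS}
        if isinstance(value, list):
            return [scrub(v) for v in value]
        return value

    return scrub(memory)


memory = {
    "kind": "Pattern",
    "email": "someone@example.com",
    "content": {"context": "budget review", "user_id": "u-1"},
    "tags": [{"name": "internal", "label": "finance"}],
}
shared = anonymize_memory(memory)
```

Key-based scrubbing does not catch personal details embedded in free-text fields such as `context`, so those would need a separate redaction pass.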

Integration Architecture Diagram

+-------------------------------------------------------------------------------------+
|                        LANGMEM INTEGRATION ARCHITECTURE                              |
+-------------------------------------------------------------------------------------+
|                                                                                      |
|                              LANGMEM CORE                                            |
|   +-----------------------------------------------------------------------+         |
|   |                                                                       |         |
|   |  +--------------+  +--------------+  +--------------+  +--------------+         |
|   |  |   Memory     |  |    Store     |  |   Memory     |  |    Prompt    |         |
|   |  |   Manager    |  |   Manager    |  |    Tools     |  |   Optimizer  |         |
|   |  +------+-------+  +------+-------+  +------+-------+  +------+-------+         |
|   |         |                 |                 |                 |                 |
|   |         +-----------------+-----------------+-----------------+                 |
|   |                           |                 |                                    |
|   +---------------------------+-----------------+------------------------------------+
|                               |                 |                                    |
|   ==========================================================================================
|                                                                                      |
|                         CUSTOM DECISION AI LAYER                                     |
|   +-----------------------------------------------------------------------+         |
|   |                                                                       |         |
|   |  +-----------------+  +-----------------+  +-----------------+        |         |
|   |  | Voice Context   |  |  Multi-Agent    |  |    Decision     |        |         |
|   |  | Manager         |  |  Memory Bridge  |  | Outcome Tracker |        |         |
|   |  |                 |  |                 |  |                 |        |         |
|   |  | * Streaming     |  | * Fast Agent    |  | * Record        |        |         |
|   |  |   extraction    |  |   context       |  |   decisions     |        |         |
|   |  | * Utterance     |  | * Supervisor    |  | * Track         |        |         |
|   |  |   batching      |  |   context       |  |   outcomes      |        |         |
|   |  | * Latency       |  | * Memory        |  | * Extract       |        |         |
|   |  |   optimization  |  |   promotion     |  |   patterns      |        |         |
|   |  +-----------------+  +-----------------+  +-----------------+        |         |
|   |                                                                       |         |
|   |  +-----------------+  +-----------------+                             |         |
|   |  |   Streaming     |  |  Pack Memory    |                             |         |
|   |  |   Summarizer    |  |  Loader         |                             |         |
|   |  |                 |  |                 |                             |         |
|   |  | * Incremental   |  | * Pack-level    |                             |         |
|   |  |   summaries     |  |   memories      |                             |         |
|   |  | * Voice-aware   |  | * User-pack     |                             |         |
|   |  |   thresholds    |  |   memories      |                             |         |
|   |  +-----------------+  | * Contribute    |                             |         |
|   |                       |   back          |                             |         |
|   |                       +-----------------+                             |         |
|   +-----------------------------------------------------------------------+         |
|                                                                                      |
|   ==========================================================================================
|                                                                                      |
|                         ASYNCPOSTGRESSTORE (SUPABASE)                                |
|   +-----------------------------------------------------------------------+         |
|   |                                                                       |         |
|   |   +-------------------------------------------------------------+    |         |
|   |   |                      NAMESPACE HIERARCHY                     |    |         |
|   |   |                                                              |    |         |
|   |   |   decision_ai/                                               |    |         |
|   |   |   +-- {user_id}/                                             |    |         |
|   |   |   |   +-- memories/          # User-level persistent         |    |         |
|   |   |   |   +-- preferences/       # User preferences              |    |         |
|   |   |   |   +-- patterns/          # Learned user patterns         |    |         |
|   |   |   |   +-- {session_id}/      # Session-specific              |    |         |
|   |   |   |       +-- session_memories/                              |    |         |
|   |   |   |       +-- supervisor_context/                            |    |         |
|   |   |   |                                                          |    |         |
|   |   |   +-- packs/                                                 |    |         |
|   |   |       +-- {pack_id}/                                         |    |         |
|   |   |           +-- shared_memories/  # Pack-level knowledge       |    |         |
|   |   |           +-- patterns/         # Pack-learned patterns      |    |         |
|   |   |           +-- users/            # Per-user within pack       |    |         |
|   |   |               +-- {user_id}/                                 |    |         |
|   |   |                   +-- memories/                              |    |         |
|   |   |                                                              |    |         |
|   |   +-------------------------------------------------------------+    |         |
|   |                                                                       |         |
|   |   pgvector extension for semantic search                              |         |
|   |   Automatic embedding on insert                                       |         |
|   |                                                                       |         |
|   +-----------------------------------------------------------------------+         |
|                                                                                      |
+-------------------------------------------------------------------------------------+

Key LangMem Concepts for Decision AI

Memory Types

LangMem supports three memory types that map to Decision AI needs:

| Memory Type | Human Analogy | Decision AI Use Case |
|-------------|---------------|----------------------|
| Semantic | Facts & Knowledge | User preferences, domain knowledge, pack patterns |
| Episodic | Past Experiences | Previous decisions, conversation history, outcomes |
| Procedural | How to do things | System prompts, agent behaviors, learned procedures |

Semantic Memory in Decision AI

class UserPreference(BaseModel):
    """Semantic memory: User preferences."""
    category: str  # 'communication', 'analysis', 'data_format'
    preference: str
    strength: float  # How strongly they prefer this
    learned_from: str  # Where we learned this

class DomainKnowledge(BaseModel):
    """Semantic memory: Domain knowledge from packs."""
    domain: str  # 'mmm', 'financial_planning', etc.
    fact: str
    confidence: float
    source: str

Episodic Memory in Decision AI

class DecisionEpisode(BaseModel):
    """Episodic memory: A decision that was made."""
    context: str
    options: list[str]
    chosen: str
    outcome: str | None
    satisfaction: float | None
    timestamp: datetime

class ConversationSummary(BaseModel):
    """Episodic memory: Summary of a conversation."""
    session_id: str
    main_topics: list[str]
    key_decisions: list[str]
    action_items: list[str]
    user_sentiment: str

Procedural Memory in Decision AI

class AgentBehavior(BaseModel):
    """Procedural memory: How agents should behave."""
    trigger: str  # When this applies
    behavior: str  # What to do
    reasoning: str  # Why
    learned_from: list[str]  # Feedback that shaped this

Namespace-Based Scoping

LangMem namespaces provide the scoping structure for Decision AI's permission model (the access checks themselves remain custom work):

# User-level memories (persist across sessions)
USER_NAMESPACE = ("decision_ai", "{user_id}", "memories")

# Session-level memories (ephemeral, within single session)
SESSION_NAMESPACE = ("decision_ai", "{user_id}", "{session_id}", "session_memories")

# Pack-level shared (all users of this pack)
PACK_SHARED_NAMESPACE = ("packs", "{pack_id}", "shared_memories")

# User within pack (user's experience with this specific pack)
PACK_USER_NAMESPACE = ("packs", "{pack_id}", "users", "{user_id}", "memories")

# Supervisor working context (Fast Agent reads this)
SUPERVISOR_NAMESPACE = ("decision_ai", "{user_id}", "{session_id}", "supervisor_context")
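
The `{user_id}`-style placeholders above are templates. LangMem fills them from the run configuration at call time; the stand-alone helper below mimics that substitution so the resulting tuples can be inspected. `resolve_namespace` is a hypothetical name and not part of LangMem:

```python
def resolve_namespace(template: tuple[str, ...], values: dict[str, str]) -> tuple[str, ...]:
    """Fill "{name}" placeholders in a namespace template."""
    try:
        return tuple(part.format(**values) for part in template)
    except KeyError as exc:
        # Fail loudly so a memory is never written to a half-resolved namespace.
        raise ValueError(f"missing namespace value: {exc.args[0]}") from None


SESSION_NAMESPACE = ("decision_ai", "{user_id}", "{session_id}", "session_memories")

resolved = resolve_namespace(
    SESSION_NAMESPACE, {"user_id": "u-123", "session_id": "s-456"}
)
print(resolved)  # ('decision_ai', 'u-123', 's-456', 'session_memories')
```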

Extraction Schemas for Memories

Define what memories look like for Decision AI:

from pydantic import BaseModel, Field
from typing import Literal

class DecisionAIMemory(BaseModel):
    """Base schema for all Decision AI memories."""
    content: str
    memory_type: Literal["preference", "fact", "decision", "pattern", "episode"]
    confidence: float = Field(ge=0, le=1, default=0.8)
    source: str  # 'user_stated', 'inferred', 'pack_default'

class UserPreferenceMemory(DecisionAIMemory):
    """User preference memory."""
    memory_type: Literal["preference"] = "preference"
    category: str  # 'response_style', 'data_format', 'analysis_depth'
    strength: float = Field(ge=0, le=1)  # How strongly they prefer

class DecisionOutcomeMemory(DecisionAIMemory):
    """Memory of a decision and its outcome."""
    memory_type: Literal["decision"] = "decision"
    context: str
    options_presented: list[str]
    option_chosen: str
    outcome: str | None = None
    satisfaction: float | None = None

class PatternMemory(DecisionAIMemory):
    """Learned pattern memory."""
    memory_type: Literal["pattern"] = "pattern"
    pattern_type: Literal["success", "failure", "preference"]
    conditions: list[str]  # When this pattern applies
    recommendation: str  # What to do/avoid

Hot-Path vs Cold-Path Processing

LangMem distinguishes between:

| Path | Timing | Decision AI Use |
|------|--------|-----------------|
| Hot-Path | During conversation | Fast Agent memory tools |
| Cold-Path | Background, after activity | Supervisor memory extraction |

# HOT PATH: Fast Agent uses memory tools during conversation
@entrypoint(store=store)
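async def fast_agent(inputs: dict):
    # An @entrypoint function takes a single input argument, so the message
    # and curated context arrive together in one dict
    message = inputs["message"]
    curated_context = inputs["curated_context"]
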
    # Search memories as part of response generation
    memories = await search_memory_tool.ainvoke({
        "query": message,
        "limit": 3
    })

    response = await llm.ainvoke(
        [{"role": "system", "content": curated_context}] +
        [{"role": "user", "content": f"Memories: {memories}\n\n{message}"}]
    )
    return response

# COLD PATH: Supervisor extracts memories in background
async def supervisor_background_extraction(conversation_log: list):
    # Process after conversation settles; submit() schedules the work and
    # returns a future right away, so it is not awaited
    reflection_executor.submit(
        {"messages": conversation_log},
        after_seconds=300  # 5 minute delay
    )

Voice Session Memory Flow

How memories flow through the 3-Claude voice architecture:

+-------------------------------------------------------------------------------------+
|                    VOICE SESSION MEMORY FLOW                                         |
+-------------------------------------------------------------------------------------+
|                                                                                      |
|  USER SPEAKS                                                                         |
|      |                                                                               |
|      v                                                                               |
|  +-----------------+                                                                 |
|  |  Voice Context  |  Buffer utterances, batch for extraction                        |
|  |  Manager        |  Don't block voice response                                     |
|  +--------+--------+                                                                 |
|           |                                                                          |
|           | writes to                                                                |
|           v                                                                          |
|  +-----------------+         +-----------------+                                     |
|  | conversation_log | <----- |  FAST AGENT     |  Responds immediately              |
|  +--------+--------+         |  (Haiku)        |  Reads curated_context             |
|           |                  +--------+--------+  NO memory writes                  |
|           |                           |                                              |
|           | polls every 5s            | reads                                        |
|           |                           |                                              |
|           v                           v                                              |
|  +-----------------+         +-----------------+                                     |
|  |   SUPERVISOR    | ------> | curated_context |                                     |
|  |   (Opus)        | writes  +-----------------+                                     |
|  +--------+--------+                                                                 |
|           |                                                                          |
|           | extracts memories via                                                    |
|           |                                                                          |
|           v                                                                          |
|  +-------------------------------------------------------------------------+        |
|  |                          LANGMEM STACK                                   |        |
|  |                                                                          |        |
|  |   +--------------+    +--------------+    +--------------+              |        |
|  |   |   Memory     | -> |    Store     | -> |  Supabase    |              |        |
|  |   |   Manager    |    |   Manager    |    |  (pgvector)  |              |        |
|  |   +--------------+    +--------------+    +--------------+              |        |
|  |                                                                          |        |
|  |   Extraction -------> Persistence -------> Semantic Search              |        |
|  |                                                                          |        |
|  +-------------------------------------------------------------------------+        |
|                                                                                      |
|  ====================================================================================|
|                                                                                      |
|  MEMORY PROMOTION FLOW:                                                              |
|                                                                                      |
|  Session Memory --> [User confirms value] --> User Memory (persists)                |
|                                                                                      |
|  Example:                                                                            |
|  "User mentioned they prefer morning meetings" (session)                             |
|       |                                                                              |
|       v                                                                              |
|  [User schedules morning meeting + positive feedback]                                |
|       |                                                                              |
|       v                                                                              |
|  "User prefers morning meetings" (promoted to user-level)                            |
|                                                                                      |
+-------------------------------------------------------------------------------------+
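
The promotion flow above can be sketched as follows. This is a minimal, stdlib-only illustration rather than LangMem code: `DictStore` is a hypothetical in-memory stand-in for a `BaseStore`, `promote_memory` and the `confirmed` flag are assumed names, and the namespace tuples follow the session and user namespaces used in the implementation phases.

```python
from dataclasses import dataclass, field


@dataclass
class DictStore:
    """Hypothetical in-memory stand-in for a namespaced BaseStore."""

    data: dict = field(default_factory=dict)

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self.data.setdefault(namespace, {})[key] = value

    def get(self, namespace: tuple, key: str):
        return self.data.get(namespace, {}).get(key)


def promote_memory(store: DictStore, user_id: str, session_id: str,
                   key: str, confirmed: bool) -> bool:
    """Copy a session memory to the user namespace once the user confirms it."""
    session_ns = ("decision_ai", user_id, session_id, "session")
    user_ns = ("decision_ai", user_id, "memories")
    memory = store.get(session_ns, key)
    if memory is None or not confirmed:
        return False  # nothing to promote, or no confirming signal yet
    store.put(user_ns, key, {**memory, "promoted_from": session_id})
    return True


store = DictStore()
store.put(
    ("decision_ai", "u1", "s1", "session"),
    "meetings",
    {"content": "User mentioned they prefer morning meetings"},
)
promoted = promote_memory(store, "u1", "s1", "meetings", confirmed=True)
```

In a real integration the `confirmed` signal would come from the Supervisor (for example, the user acting on the preference and giving positive feedback), and the store calls would go through the async `BaseStore` API instead.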

Memory Retrieval During Voice

async def prepare_fast_agent_context(
    session: VoiceSession,
    memory_bridge: MultiAgentMemoryBridge
) -> str:
    """Prepare context for Fast Agent - must be FAST."""

    # Get pre-curated context from supervisor
    curated = session.curated_context

    # Optionally add high-priority memories (cached, not queried)
    priority_memories = await memory_bridge.get_cached_priority_memories()

    # Keep total context small for latency
    return f"""## Instructions
You are a helpful voice assistant. Respond conversationally.

## Curated Context
{curated}

## Key User Preferences
{priority_memories}

## Guidelines
- Be concise (voice)
- Reference thread for details
- Don't repeat what supervisor already covered
"""
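
One way to keep the Fast Agent context small is to cap the preference block at a fixed character budget. The sketch below assumes the cached memories arrive as a list of strings already sorted by priority; `trim_to_budget` and the budget value are hypothetical and not part of LangMem.

```python
def trim_to_budget(memories: list[str], max_chars: int) -> str:
    """Join memories as bullets, in order, stopping before the budget is exceeded."""
    lines: list[str] = []
    used = 0
    for memory in memories:
        line = f"- {memory}"
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)


priority_memories = [
    "Prefers morning meetings",
    "Uses BigQuery for data",
    "Typically analyzes Q4 holiday periods",
]
context = trim_to_budget(priority_memories, max_chars=55)
```

Here the first two preferences fit within 55 characters and the third is dropped, so lower-priority memories are the first to go when the budget is tight.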

Decision Pack Memory

How Decision Packs access and contribute to memory:

+-------------------------------------------------------------------------------------+
|                    DECISION PACK MEMORY ARCHITECTURE                                 |
+-------------------------------------------------------------------------------------+
|                                                                                      |
|                              PACK: "MMM Analyst"                                     |
|  +-------------------------------------------------------------------------+        |
|  |                                                                          |        |
|  |   PACK-LEVEL MEMORIES (shared across all users)                          |        |
|  |   namespace: ("packs", "mmm-analyst", "shared_memories")                 |        |
|  |                                                                          |        |
|  |   +-----------------------------------------------------------+         |        |
|  |   | * "MMM models work best with 2+ years of data"            |         |        |
|  |   | * "Always check for data seasonality before modeling"     |         |        |
|  |   | * "Users often confuse ROAS with contribution"            |         |        |
|  |   | * "PyMC-Marketing requires specific data format"          |         |        |
|  |   +-----------------------------------------------------------+         |        |
|  |                                                                          |        |
|  +-------------------------------------------------------------------------+        |
|                              |                                                       |
|                              | loads at session start                                |
|                              v                                                       |
|  +-------------------------------------------------------------------------+        |
|  |                                                                          |        |
|  |   USER-PACK MEMORIES (user's experience with this pack)                  |        |
|  |   namespace: ("packs", "mmm-analyst", "users", "{user_id}", "memories")  |        |
|  |                                                                          |        |
|  |   +-----------------------------------------------------------+         |        |
|  |   | * "User prefers Bayesian over frequentist explanations"   |         |        |
|  |   | * "User's data has weekly seasonality"                    |         |        |
|  |   | * "User typically analyzes Q4 holiday periods"            |         |        |
|  |   | * "User's company uses BigQuery for data"                 |         |        |
|  |   +-----------------------------------------------------------+         |        |
|  |                                                                          |        |
|  +-------------------------------------------------------------------------+        |
|                              |                                                       |
|                              | session extracts & contributes                        |
|                              v                                                       |
|  +-------------------------------------------------------------------------+        |
|  |                                                                          |        |
|  |   PACK PATTERNS (learned from all users, anonymized)                     |        |
|  |   namespace: ("packs", "mmm-analyst", "patterns")                        |        |
|  |                                                                          |        |
|  |   +-----------------------------------------------------------+         |        |
|  |   | * "When users ask about attribution, they usually want    |         |        |
|  |   |    channel contribution breakdown"                        |         |        |
|  |   | * "Data formatting errors are the #1 session failure"     |         |        |
|  |   | * "Users who start with small datasets often scale up"    |         |        |
|  |   +-----------------------------------------------------------+         |        |
|  |                                                                          |        |
|  +-------------------------------------------------------------------------+        |
|                                                                                      |
|  ====================================================================================|
|                                                                                      |
|   CONTRIBUTION FLOW:                                                                 |
|                                                                                      |
|   Session discovers useful pattern                                                   |
|       |                                                                              |
|       v                                                                              |
|   Supervisor extracts as memory                                                      |
|       |                                                                              |
|       v                                                                              |
|   [Is pattern generalizable?]                                                        |
|       |                                                                              |
|       +-- YES --> Anonymize + contribute to pack patterns                            |
|       |                                                                              |
|       +-- NO ---> Store in user-pack namespace only                                  |
|                                                                                      |
+-------------------------------------------------------------------------------------+
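
The contribution flow above can be sketched as a small routing function. This is an illustration under stated assumptions: `Pattern`, `anonymize`, and `route_pattern` are hypothetical names, the anonymizer is deliberately crude (it only scrubs the user id and email addresses), and the 0.8 confidence threshold mirrors the one used in the pack loading code.

```python
import re
from dataclasses import dataclass


@dataclass
class Pattern:
    text: str
    is_generalizable: bool
    confidence: float


def anonymize(text: str, user_id: str) -> str:
    """Rough scrub: replace the user id and any email address with placeholders."""
    text = text.replace(user_id, "[user]")
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[email]", text)


def route_pattern(pattern: Pattern, pack_id: str, user_id: str,
                  threshold: float = 0.8) -> tuple:
    """Return (namespace, text) describing where a learned pattern is stored."""
    if pattern.is_generalizable and pattern.confidence > threshold:
        # YES branch: anonymize and contribute to shared pack patterns
        return ("packs", pack_id, "patterns"), anonymize(pattern.text, user_id)
    # NO branch: keep it in the user-pack namespace only
    return ("packs", pack_id, "users", user_id, "memories"), pattern.text


shared_ns, shared_text = route_pattern(
    Pattern("u42 (ann@example.com) hit data formatting errors", True, 0.9),
    pack_id="mmm-analyst",
    user_id="u42",
)
private_ns, private_text = route_pattern(
    Pattern("User's data has weekly seasonality", False, 0.95),
    pack_id="mmm-analyst",
    user_id="u42",
)
```

A production anonymizer would need far more than two string substitutions (names, company identifiers, dataset details), so treat this step as the part most in need of review before anything is written to a shared namespace.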

Pack Memory Loading Code

from langgraph.store.base import BaseStore
from langmem import create_memory_store_manager


class PackSession:
    """A session using a Decision Pack with memory support."""

    def __init__(
        self,
        pack_id: str,
        user_id: str,
        memory_store: BaseStore
    ):
        self.pack_id = pack_id
        self.user_id = user_id
        self.store = memory_store
        self.pack_loader = PackMemoryLoader(memory_store)

        # Memory managers for different scopes
        self.pack_memory_manager = create_memory_store_manager(
            "anthropic:claude-3-5-sonnet-latest",
            namespace=("packs", pack_id, "users", user_id, "memories"),
            store=memory_store,
        )

    async def initialize_session_context(self) -> str:
        """Load all relevant memories for session start."""
        pack_context = await self.pack_loader.load_pack_context(
            self.pack_id,
            self.user_id,
            query=None  # Load general context
        )

        return self._format_pack_context(pack_context)

    async def process_conversation(self, messages: list):
        """Extract memories from conversation."""
        # Extract to user-pack namespace
        await self.pack_memory_manager.ainvoke({"messages": messages})

        # Check for contribution opportunities
        await self._check_contribution_opportunities(messages)

    async def _check_contribution_opportunities(self, messages: list):
        """Check if any learnings should be contributed to pack."""
        # Use LLM to identify generalizable patterns
        patterns = await self._extract_generalizable_patterns(messages)

        for pattern in patterns:
            if pattern.is_generalizable and pattern.confidence > 0.8:
                await self.pack_loader.contribute_to_pack(
                    self.pack_id,
                    memory=pattern.as_dict(),
                    anonymize=True
                )

Implementation Phases

Phase 1: Foundation (Weeks 1-4)

Goal: Replace current session-only memory with LangMem-backed persistent memory.

| Task | Details | Dependencies |
| --- | --- | --- |
| Set up AsyncPostgresStore | Connect LangMem to Supabase | Supabase Pro (pgvector) |
| Implement basic user memories | User preferences across sessions | Store setup |
| Integrate with Supervisor | Supervisor extracts memories in background | User memories |
| Memory search in Fast Agent | Fast Agent can search user memories | User memories |

Success criteria:

  • User says "remember I prefer dark mode" -> persists across sessions
  • Supervisor automatically extracts preferences from conversation
  • Fast Agent can retrieve relevant memories
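# Phase 1 implementation sketch
import os

from langmem import create_memory_store_manager, create_search_memory_tool
from langgraph.store.postgres import AsyncPostgresStore

# Setup: AsyncPostgresStore is opened with from_conn_string(), an async
# context manager, so it is created inside the app's startup coroutine.
async def run_app():
    async with AsyncPostgresStore.from_conn_string(
        os.environ["SUPABASE_DB_URL"],
        index={"dims": 1536, "embed": "openai:text-embedding-3-small"},
    ) as store:
        await store.setup()  # one-time table and index creation

        user_memory_manager = create_memory_store_manager(
            "anthropic:claude-3-5-sonnet-latest",
            namespace=("decision_ai", "{user_id}", "memories"),
            store=store,
        )
        # ... hand user_memory_manager to the Supervisor and run the app ...

# In Supervisor loop (a method on the Supervisor class)
async def supervisor_poll_cycle(self, session: VoiceSession):
    # ... existing supervisor logic (produces gap_messages) ...

    # NEW: Extract memories from gap messages
    if gap_messages:
        await self.user_memory_manager.ainvoke(
            {"messages": self._format_gap_messages(gap_messages)},
            # Fills the "{user_id}" placeholder in the namespace
            config={"configurable": {"user_id": session.user_id}},
        )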
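
The `"{user_id}"` placeholder in the namespace is filled from the `configurable` values passed at invocation time, which is what keeps each user's memories separate. The function below only illustrates that idea with plain string formatting; `resolve_namespace` is a hypothetical helper, not LangMem's internal implementation.

```python
def resolve_namespace(template: tuple, configurable: dict) -> tuple:
    """Fill "{name}" placeholders in a namespace template from config values."""
    return tuple(part.format(**configurable) for part in template)


template = ("decision_ai", "{user_id}", "memories")
alice_ns = resolve_namespace(template, {"user_id": "alice"})
bob_ns = resolve_namespace(template, {"user_id": "bob"})
```

Because the two resolved tuples differ, a search scoped to one user's namespace never sees the other user's memories. In this sketch a missing `user_id` raises `KeyError` instead of silently writing to a shared namespace.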

Phase 2: Voice Optimization (Weeks 5-8)

Goal: Optimize memory for voice session latency requirements.

| Task | Details | Dependencies |
| --- | --- | --- |
| Voice Context Manager | Streaming-aware memory extraction | Phase 1 |
| Multi-Agent Memory Bridge | Different views for different agents | Phase 1 |
| Memory caching layer | Pre-fetch high-priority memories | Voice Context Manager |
| Streaming summarization | Incremental summaries for voice | Phase 1 |

Success criteria:

  • Fast Agent response time unchanged (<1s)
  • Supervisor uses full memory context
  • Session memories promote to user memories when valuable
# Phase 2 implementation sketch
import asyncio
from typing import Optional

from langgraph.store.base import BaseStore
from langmem import create_memory_store_manager

class OptimizedVoiceMemory:
    """Phase 2: Optimized memory for voice sessions."""

    CACHE_REFRESH_SECONDS = 30  # assumed interval; tune against real latency data

    def __init__(self, store: BaseStore, user_id: str, session_id: str):
        self.bridge = MultiAgentMemoryBridge(store, user_id, session_id)
        self.voice_manager = VoiceContextManager(
            create_memory_store_manager(
                "anthropic:claude-3-5-sonnet-latest",
                namespace=("decision_ai", user_id, session_id, "session"),
                store=store,
            )
        )

        # Pre-cached memories for fast access
        self._priority_cache: list = []
        self._cache_refresh_task: Optional[asyncio.Task] = None

    async def start_session(self):
        """Initialize memory for voice session."""
        # Pre-fetch priority memories
        self._priority_cache = await self.bridge.get_fast_agent_context()

        # Start background cache refresh
        self._cache_refresh_task = asyncio.create_task(
            self._refresh_cache_loop()
        )

    async def _refresh_cache_loop(self):
        """Periodically re-fetch priority memories in the background."""
        while True:
            await asyncio.sleep(self.CACHE_REFRESH_SECONDS)
            self._priority_cache = await self.bridge.get_fast_agent_context()

    def get_fast_agent_context(self) -> list:
        """Get context for Fast Agent - synchronous, from cache."""
        return self._priority_cache  # No await - instant
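
The caching pattern above (warm the cache once, refresh it in the background, read it synchronously) can be exercised in isolation. The following is a self-contained sketch using only `asyncio`; `PriorityCache`, its `fetch` callable, and the refresh interval are hypothetical stand-ins for the bridge and its memory query, and the `stop` method shows the task cancellation a session teardown would need.

```python
import asyncio
from typing import Optional


class PriorityCache:
    """Keeps a snapshot of priority memories refreshed by a background task."""

    def __init__(self, fetch, interval_s: float):
        self._fetch = fetch  # async callable returning a list of memories
        self._interval_s = interval_s
        self._snapshot: list = []
        self._task: Optional[asyncio.Task] = None

    async def start(self) -> None:
        self._snapshot = await self._fetch()  # warm the cache before first use
        self._task = asyncio.create_task(self._refresh_loop())

    async def _refresh_loop(self) -> None:
        while True:
            await asyncio.sleep(self._interval_s)
            self._snapshot = await self._fetch()

    def get(self) -> list:
        return self._snapshot  # synchronous read, no I/O on the hot path

    async def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass  # expected when the refresh loop is cancelled


async def demo() -> tuple:
    calls = 0

    async def fetch() -> list:
        nonlocal calls
        calls += 1
        return [f"snapshot {calls}"]

    cache = PriorityCache(fetch, interval_s=0.01)
    await cache.start()
    first = cache.get()
    await asyncio.sleep(0.05)  # let the background task refresh at least once
    later = cache.get()
    await cache.stop()
    return first, later
```

The trade-off is staleness: the Fast Agent may read memories that are up to one refresh interval old, which is usually acceptable for preferences but worth revisiting for anything session-critical.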

Phase 3: Decision Packs & Outcomes (Weeks 9-12)

Goal: Full Decision Pack memory and outcome tracking.

| Task | Details | Dependencies |
| --- | --- | --- |
| Pack Memory Loader | Load pack-level memories | Phase 1 |
| Decision Outcome Tracker | Track decisions and outcomes | Phase 1 |
| Pattern extraction | Extract patterns from outcomes | Outcome Tracker |
| Pack contribution flow | Anonymize and contribute learnings | Pack Loader |

Success criteria:

  • Packs have shared memories that improve over time
  • Decisions are tracked with outcomes
  • Patterns emerge from successful/failed decisions
  • Users contribute anonymized learnings back to packs
# Phase 3 implementation sketch

class DecisionPackWithMemory:
    """Phase 3: Full Decision Pack memory integration."""

    def __init__(
        self,
        pack_manifest: PackManifest,
        user_id: str,
        store: BaseStore
    ):
        self.manifest = pack_manifest
        self.user_id = user_id
        self.store = store

        # Memory components
        self.pack_loader = PackMemoryLoader(store)
        self.outcome_tracker = DecisionOutcomeTracker(
            create_memory_store_manager(
                "anthropic:claude-3-5-sonnet-latest",
                namespace=("packs", pack_manifest.id, "users", user_id, "decisions"),
                store=store,
            )
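        )

        # Decisions made during the current session; filled during run_session
        self.session_decisions: list = []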

    async def run_session(self, initial_query: str):
        """Run a pack session with full memory support."""

        # Load pack context
        pack_context = await self.pack_loader.load_pack_context(
            self.manifest.id,
            self.user_id,
            query=initial_query
        )

        # ... session execution ...

        # After session: process outcomes
        for decision in self.session_decisions:
            await self.outcome_tracker.record_decision(
                decision_id=decision.id,
                context=decision.context,
                options_presented=decision.options,
                option_chosen=decision.chosen
            )

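    # Later: record outcomes (a method, so callers can invoke it after the session)
    async def on_user_feedback(self, decision_id: str, feedback: dict):
        """Record the outcome once the user reports back on a decision."""
        await self.outcome_tracker.record_outcome(
            decision_id=decision_id,
            outcome=feedback["outcome"],
            satisfaction=feedback["satisfaction"],
        )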
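
To make the decision-to-outcome link concrete, here is a minimal in-memory sketch. It is not the `DecisionOutcomeTracker` used above: `InMemoryOutcomeTracker`, `DecisionRecord`, the `"success"` outcome label, and `success_rate` are all hypothetical, and a real tracker would persist records through the store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecisionRecord:
    decision_id: str
    context: str
    options_presented: list
    option_chosen: str
    outcome: Optional[str] = None       # filled in later by record_outcome
    satisfaction: Optional[int] = None  # filled in later by record_outcome


class InMemoryOutcomeTracker:
    """Links each decision to its eventual outcome by decision_id."""

    def __init__(self) -> None:
        self._records: dict = {}

    def record_decision(self, decision_id: str, context: str,
                        options_presented: list, option_chosen: str) -> None:
        self._records[decision_id] = DecisionRecord(
            decision_id, context, options_presented, option_chosen
        )

    def record_outcome(self, decision_id: str, outcome: str,
                       satisfaction: int) -> None:
        record = self._records[decision_id]  # KeyError if never recorded
        record.outcome = outcome
        record.satisfaction = satisfaction

    def success_rate(self, option: str) -> Optional[float]:
        """Share of resolved decisions for `option` whose outcome is "success"."""
        resolved = [
            r for r in self._records.values()
            if r.option_chosen == option and r.outcome is not None
        ]
        if not resolved:
            return None  # no evidence yet for this option
        return sum(r.outcome == "success" for r in resolved) / len(resolved)


tracker = InMemoryOutcomeTracker()
tracker.record_decision("d1", "seasonality check", ["weekly", "monthly"], "weekly")
tracker.record_decision("d2", "seasonality check", ["weekly", "monthly"], "weekly")
tracker.record_outcome("d1", outcome="success", satisfaction=5)
tracker.record_outcome("d2", outcome="failure", satisfaction=2)
rate = tracker.success_rate("weekly")
```

Aggregates like `success_rate` are the kind of signal the pattern extraction task could build on, since they only become meaningful once decisions and outcomes share an identifier.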

Summary

LangMem Provides (70%)

  1. Memory Manager - Extract structured memories from conversations
  2. Store Manager - Persist to Supabase via AsyncPostgresStore
  3. Memory Tools - Agent-controlled memory management
  4. Summarization - Long context management
  5. Prompt Optimizer - Learn from feedback
  6. Namespace Scoping - Hierarchical memory organization

We Build Custom (30%)

  1. Voice Context Manager - Streaming-aware extraction
  2. Multi-Agent Memory Bridge - Different views for 3-Claude architecture
  3. Decision Outcome Tracker - Link decisions to results
  4. Streaming Summarizer - Incremental voice summaries
  5. Pack Memory Loader - Decision Pack shared memory

Adoption Path

| Phase | Timeline | Outcome |
| --- | --- | --- |
| Phase 1 | Weeks 1-4 | Basic persistent user memory |
| Phase 2 | Weeks 5-8 | Optimized for voice latency |
| Phase 3 | Weeks 9-12 | Full pack memory + outcomes |

Bottom line: LangMem gives us a production-ready foundation, and we build the Decision AI-specific layer on top. Estimated total effort: about 12 weeks to a full memory system, versus an estimated 6+ months building from scratch.


LangMem transforms Decision AI from "stateless sessions" to "learning agents that remember". The 70/30 split lets us leverage battle-tested components while building the custom intelligence layer that makes Decision AI unique.
