@ianphil
Created March 28, 2026 14:41
Claude Subconscious — Reverse-Engineered Product Specification

Claude Subconscious — Product Specification

Reverse-engineered from source: letta-ai/claude-subconscious v2.0.2
Generated: 2026-03-28


1. Problem Statement

Claude Code is an AI coding assistant that operates in ephemeral sessions. Every session starts from zero — no memory of past conversations, learned user preferences, project context, or unfinished work. Users must re-explain their codebase, repeat preferences, and re-establish context every time.

There is no built-in mechanism for Claude Code to accumulate institutional knowledge across sessions, detect behavioral patterns, or proactively surface relevant context before a user asks for it.

2. Product Summary

Claude Subconscious is a persistent background agent that gives Claude Code a long-term memory. It observes session transcripts, reads the codebase, accumulates knowledge over time, and injects contextual guidance back into Claude Code before each prompt — without ever blocking the user's workflow.

It is not a memory database or a logging service. It is a second agent running underneath Claude Code with its own tools, reasoning, and personality — one that builds rapport, develops opinions, and participates in an ongoing dialogue across sessions.

3. Actors

Actor | Description
User | Developer using Claude Code. Sees whispered guidance inline. Can address Subconscious directly.
Claude Code | The primary AI coding assistant. Receives injected context from Subconscious via stdout. Can address Subconscious in responses.
Subconscious Agent | A Letta-hosted agent with persistent memory, tool access, and its own system prompt. Observes asynchronously, responds on next sync cycle.
Letta Platform | Cloud or self-hosted server that hosts the agent, stores memory blocks, manages conversations, and routes messages.

4. Core Capabilities

4.1 Persistent Cross-Session Memory

The agent maintains structured memory across eight domains:

Memory Block | What It Captures
core_directives | Agent behavioral guidelines and processing logic
guidance | Active message to whisper to the next Claude Code session
user_preferences | Coding style, tool preferences, communication patterns
project_context | Architecture decisions, codebase knowledge, known gotchas
session_patterns | Recurring behaviors, time-based patterns, common struggles
pending_items | Unfinished work, explicit TODOs, follow-up items
self_improvement | Guidelines for evolving its own memory architecture
tool_guidelines | How to use available tools effectively

Memory persists indefinitely on the Letta platform. A single agent brain is shared across all projects by default.
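As a rough illustration of the table above, the eight blocks can be modelled as labelled text values. This is a minimal sketch: the label names come from the table, while the `MemoryBlock` shape (`label`, `value`) and the `indexBlocks` helper are assumptions made for the example, not the plugin's actual types.

```typescript
// Labels taken from the memory block table; everything else here is hypothetical.
type BlockLabel =
  | "core_directives"
  | "guidance"
  | "user_preferences"
  | "project_context"
  | "session_patterns"
  | "pending_items"
  | "self_improvement"
  | "tool_guidelines";

// Assumed shape: one free-text value per labelled block.
interface MemoryBlock {
  label: BlockLabel;
  value: string;
}

const BLOCK_LABELS: readonly BlockLabel[] = [
  "core_directives",
  "guidance",
  "user_preferences",
  "project_context",
  "session_patterns",
  "pending_items",
  "self_improvement",
  "tool_guidelines",
];

// Index blocks by label so a caller can look one up directly.
// If a label appears twice, the later entry wins.
function indexBlocks(blocks: MemoryBlock[]): Map<BlockLabel, string> {
  const byLabel = new Map<BlockLabel, string>();
  for (const block of blocks) {
    byLabel.set(block.label, block.value);
  }
  return byLabel;
}
```

Because the agent is shared across projects by default, a structure like this would hold knowledge from every project the user works in, not one project at a time.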

4.2 Session Transcript Observation

After each Claude Code response, the full transcript is sent to the Subconscious agent asynchronously. The transcript includes:

  • User messages (verbatim)
  • Assistant responses (including thinking blocks)
  • Tool uses and their results (summarized/truncated for readability)
  • Session summaries

The agent processes this material actively — extracting preferences from user corrections, noting stuck patterns, tracking architectural decisions, and identifying unfinished work.

4.3 Contextual Guidance Injection

Before each user prompt is processed, the system injects the agent's accumulated context into Claude Code's prompt via stdout. Two modes control what Claude sees (a third LETTA_MODE value, off, is listed in section 9.4):

Mode | Injected Content
whisper (default) | Only messages from Subconscious: lightweight, speaks when it has something to say
full | Memory blocks (first prompt) + diffs of changed blocks (subsequent) + messages

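Content is wrapped in XML tags (<letta_message>, <letta_memory_blocks>, <letta_memory_update>) and injected via stdout. The injection path writes nothing into the project: no CLAUDE.md modifications and no other file-based side effects on the codebase. The plugin does keep its own state files (see sections 5.1 and 7.3).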
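A minimal sketch of how whisper-mode output could be assembled. The `from` and `timestamp` attributes match the sample message in section 5.2; the `AgentMessage` type, the function names, and the attribute escaping are assumptions for illustration, and the real plugin may format things differently.

```typescript
// Hypothetical message shape for this example.
interface AgentMessage {
  from: string;
  timestamp: string;
  text: string;
}

// Escape characters that would break a double-quoted XML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Wrap one message in a <letta_message> element. The body is passed through
// unchanged, as in the sample in section 5.2.
function renderMessage(msg: AgentMessage): string {
  const open = `<letta_message from="${escapeAttr(msg.from)}" timestamp="${escapeAttr(msg.timestamp)}">`;
  return `${open}\n${msg.text}\n</letta_message>`;
}

// Whisper mode: only messages, and nothing at all when there are none.
function renderWhisper(messages: AgentMessage[]): string {
  return messages.map(renderMessage).join("\n");
}
```

A hook script would print the result of `renderWhisper` to stdout; an empty string means the agent had nothing to say for this prompt.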

4.4 Mid-Workflow Updates

During long tool-use workflows, the system checks for new messages or memory changes before each tool execution. If the agent has updated its guidance or memory while Claude Code was working, the updates are injected as additionalContext mid-stream — addressing "workflow drift" in long sessions.
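The change check described above can be sketched as a comparison between the last snapshot and the current block values. This is an illustrative sketch only: the `Snapshot` type, the function names, and the inner layout of the `<letta_memory_update>` element are assumptions, and blocks that disappear are not reported.

```typescript
// Hypothetical snapshot: block label mapped to block text.
type Snapshot = Record<string, string>;

interface BlockChange {
  label: string;
  before: string | undefined; // undefined when the block is new
  after: string;
}

// Report blocks that are new or whose text differs from the snapshot.
function changedBlocks(previous: Snapshot, current: Snapshot): BlockChange[] {
  const changes: BlockChange[] = [];
  for (const label of Object.keys(current)) {
    const before: string | undefined = previous[label];
    const after = current[label];
    if (before !== after) {
      changes.push({ label, before, after });
    }
  }
  return changes;
}

// Build the text to inject, or null on the fast path when nothing changed.
// The "[label]" layout inside the element is an assumption.
function renderMemoryUpdate(changes: BlockChange[]): string | null {
  if (changes.length === 0) {
    return null;
  }
  const body = changes.map((c) => `[${c.label}]\n${c.after}`).join("\n\n");
  return `<letta_memory_update>\n${body}\n</letta_memory_update>`;
}
```

Returning null when nothing changed keeps the common case cheap, which matters because this check runs before every tool execution.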

4.5 Codebase-Aware Tool Access

The Subconscious agent has real tool access via the Letta Code SDK, configurable in three tiers:

Mode | Available Tools | Use Case
read-only (default) | Read, Grep, Glob, web_search, fetch_webpage | Safe background research and file exploration
full | All tools including Bash, Edit, Write, Task | Full autonomy: agent can make changes and spawn sub-agents
off | None | Listen-only: processes transcripts without client-side tools

This means the agent can read your files, search your codebase, and browse the web while processing transcripts — not just passively ingest text.
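The three tiers can be read as a small permission policy. The sketch below uses the tool names from the table; the function names are hypothetical, and falling back to read-only for an unrecognized value is an assumption rather than documented behavior.

```typescript
type SdkToolsMode = "read-only" | "full" | "off";

// Tools listed for the read-only tier in the table above.
const READ_ONLY_TOOLS: readonly string[] = [
  "Read",
  "Grep",
  "Glob",
  "web_search",
  "fetch_webpage",
];

// Parse a raw setting such as the LETTA_SDK_TOOLS value.
// Assumption: anything unrecognized falls back to the documented default.
function parseSdkToolsMode(raw: string | undefined): SdkToolsMode {
  if (raw === "read-only" || raw === "full" || raw === "off") {
    return raw;
  }
  return "read-only";
}

// Decide whether a tool may be used under a given tier.
function isToolAllowed(mode: SdkToolsMode, tool: string): boolean {
  switch (mode) {
    case "off":
      return false; // listen-only: no client-side tools
    case "read-only":
      return READ_ONLY_TOOLS.indexOf(tool) !== -1;
    case "full":
      return true; // all tools, including Bash, Edit, Write, Task
  }
}
```

For example, `isToolAllowed(parseSdkToolsMode(undefined), "Bash")` is false, because the default tier excludes tools that modify files or run commands.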

4.6 Two-Way Dialogue

The system supports bidirectional communication:

  • Claude Code → Subconscious: Claude Code can address the agent directly in responses. The agent sees everything in the transcript.
  • Subconscious → Claude Code: The agent's messages are injected before the next prompt. The user sees them too.
  • User → Subconscious: Users can address the agent through Claude Code. The agent responds on the next sync cycle.

This is designed as an ongoing dialogue, not one-way surveillance.

4.7 Conversation Isolation

Each Claude Code session gets its own Letta conversation thread. This provides:

  • Session-scoped context: Messages within a session stay in that conversation
  • Shared memory: All conversations feed into the same agent brain (memory blocks are global)
  • Parallel sessions: Multiple Claude Code sessions can run simultaneously, each with its own conversation, all updating the same agent
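The session-to-conversation relationship above amounts to a get-or-create lookup keyed by session ID. A minimal in-memory sketch follows; `ConversationRegistry` and its `create` callback are hypothetical names, and the real plugin persists this mapping in state files rather than in memory.

```typescript
// Maps each Claude Code session to exactly one conversation.
class ConversationRegistry {
  private readonly bySession = new Map<string, string>();
  private readonly create: (sessionId: string) => string;

  // `create` stands in for whatever call creates a conversation on the server.
  constructor(create: (sessionId: string) => string) {
    this.create = create;
  }

  // Reuse the existing conversation for a session, or create one on first use.
  conversationFor(sessionId: string): string {
    const existing = this.bySession.get(sessionId);
    if (existing !== undefined) {
      return existing;
    }
    const created = this.create(sessionId);
    this.bySession.set(sessionId, created);
    return created;
  }
}
```

Two parallel sessions get two different conversation IDs from such a registry, while both conversations still belong to the same agent and therefore share its memory blocks.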

4.8 Zero-Configuration Setup

The system requires only one credential (LETTA_API_KEY) and handles everything else:

  1. Auto-imports a bundled default agent if none configured
  2. Auto-detects available models on the Letta server
  3. Auto-selects the best available model if the configured one isn't present
  4. Creates conversations automatically per session
  5. Manages all state files without user intervention
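Steps 2 and 3 can be sketched as a fallback over the server's model list. The source does not say how "best" is decided, so the ordered `preference` list here is purely an assumption, as is the function name.

```typescript
// Pick a model: the configured one if the server offers it, otherwise the
// first entry of a preference list that is available, otherwise the first
// model the server reports. Returns undefined when no models are available.
function selectModel(
  configured: string | undefined,
  available: string[],
  preference: string[],
): string | undefined {
  if (configured !== undefined && available.indexOf(configured) !== -1) {
    return configured;
  }
  for (const candidate of preference) {
    if (available.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return available.length > 0 ? available[0] : undefined;
}
```

With `available` set to the result of the model listing call, a missing or misspelled configured model degrades to a working choice instead of failing the session.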

5. Observable Behavior

5.1 Session Lifecycle

Session Start
  → Agent notified with project path, session ID, timestamp
  → Legacy CLAUDE.md content cleaned up
  → Conversation created (or existing one reused)
  → TTY banner displayed: agent name, model, mode, URL

Before Each Prompt
  → Memory blocks fetched (diffs computed against last snapshot)
  → New agent messages retrieved
  → Content injected via stdout as XML
  → State snapshot updated

Before Each Tool Use
  → Quick check for new messages or memory changes
  → If updates found, inject as additionalContext
  → Silent no-op if nothing changed (fast path)

After Each Response
  → Full transcript extracted from JSONL session file
  → Formatted as XML and written to temp payload file
  → Background worker spawned (detached, non-blocking)
  → Worker sends payload to agent via Letta Code SDK
  → Agent processes transcript with tool access
  → Agent updates memory blocks as needed
  → Worker updates state file on success

5.2 What the User Sees

On session start (TTY output — not captured by Claude):

   👁️  Subconscious connecting...
   Agent: Subconscious (agent-xxxx)
   Model: anthropic/claude-sonnet-4-5
   Mode: whisper | SDK Tools: read-only
   🔗 https://app.letta.com/agents/agent-xxxx?conversation=conv-xxx

Before prompts (injected into Claude's context):

<letta_message from="Subconscious" timestamp="2026-01-26T20:37:14+00:00">
You've asked about error handling in async contexts three times this week.
Consider reviewing error handling architecture holistically.
</letta_message>

5.3 What the User Does NOT See

  • No changes to project files (no CLAUDE.md modification; the plugin writes only its own state and temporary payload files)
  • No blocking delays (transcript processing is fully async)
  • No configuration files to maintain (auto-managed)
  • No popups or console windows on Windows (silent launcher)

6. User Flows

6.1 First-Time Setup

  1. User installs plugin via marketplace or clones repo
  2. User sets LETTA_API_KEY environment variable
  3. User starts a Claude Code session
  4. Plugin auto-imports the bundled Subconscious agent
  5. Agent ID saved to ~/.letta/claude-subconscious/config.json
  6. First few sessions: agent observes but has minimal context to whisper
  7. Over time: agent accumulates preferences, project knowledge, patterns
  8. Eventually: agent proactively surfaces relevant context before each prompt

6.2 Multi-Project Usage

  1. User works in Project A → conversations stored in project-a/.letta/claude/
  2. User switches to Project B → conversations stored in project-b/.letta/claude/
  3. Same agent brain serves both → memory blocks shared across projects
  4. Agent develops cross-project awareness over time

6.3 Custom Agent

  1. User creates custom agent on Letta platform (or via ADE)
  2. User sets LETTA_AGENT_ID in environment or .envrc
  3. Plugin uses that agent instead of the default
  4. Agent's own memory architecture and system prompt apply

6.4 Self-Hosted Deployment

  1. User runs their own Letta server
  2. User sets LETTA_BASE_URL to their server address
  3. Plugin auto-detects available models on that server
  4. All API calls route to the self-hosted instance

7. Constraints

7.1 Architectural Constraints

  • Hook-based integration: Bound to Claude Code's plugin hook lifecycle (SessionStart, UserPromptSubmit, PreToolUse, Stop). Cannot intercept arbitrary events.
  • Stdout-only injection: All context delivery to Claude Code happens via stdout XML. No direct memory manipulation of Claude Code's context window.
  • Async transcript delivery: Transcripts are sent after Claude responds, not during. The agent always observes one response behind.
  • Hook timeouts: SessionStart and PreToolUse have 5s limits. UserPromptSubmit has 10s. Stop has 120s. Operations must complete within these windows (or be delegated to background workers).
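The timeout limits above suggest a simple budget check before starting work inside a hook. In this sketch the limits come from the list above, while the function names and the 500 ms safety margin are assumptions chosen for illustration.

```typescript
type HookName = "SessionStart" | "UserPromptSubmit" | "PreToolUse" | "Stop";

// Limits from section 7.1, in milliseconds.
const HOOK_TIMEOUT_MS: Record<HookName, number> = {
  SessionStart: 5000,
  PreToolUse: 5000,
  UserPromptSubmit: 10000,
  Stop: 120000,
};

// Time left for a hook that started at startedAtMs, minus a safety margin.
// Never negative.
function remainingBudgetMs(
  hook: HookName,
  startedAtMs: number,
  nowMs: number,
  marginMs: number = 500,
): number {
  const elapsed = nowMs - startedAtMs;
  return Math.max(0, HOOK_TIMEOUT_MS[hook] - marginMs - elapsed);
}

// True when work of the estimated duration should go to a background worker
// instead of running inside the hook.
function shouldDelegate(
  hook: HookName,
  estimatedMs: number,
  startedAtMs: number,
  nowMs: number,
): boolean {
  return estimatedMs > remainingBudgetMs(hook, startedAtMs, nowMs);
}
```

Passing the clock values in as arguments keeps the functions deterministic; a hook script would supply `Date.now()` for both.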

7.2 Platform Dependencies

  • Letta Platform required: Agent hosting, memory storage, conversation management, and model inference all depend on the Letta server (cloud or self-hosted).
  • Node.js ≥ 18: Runtime requirement for all hook scripts.
  • Letta Code SDK: The @letta-ai/letta-code-sdk package is required for transcript delivery with tool access.

7.3 Operational Constraints

  • Cold start: Agent starts with minimal context. Takes several sessions to accumulate useful knowledge.
  • One agent per config: A single agent ID is stored globally. Per-project agents require explicit LETTA_AGENT_ID overrides.
  • State locality: Conversation mappings are stored in the project directory (.letta/claude/). Moving projects or changing working directories loses conversation continuity.
  • No real-time streaming: Agent guidance arrives at prompt boundaries and tool-use boundaries — not mid-generation.

8. Non-Goals

  • Not a replacement for Claude Code: Subconscious observes and advises — it does not take over coding tasks (unless SDK tools mode is set to "full").
  • Not a code generation tool: The agent's purpose is context and memory, not producing code artifacts.
  • Not a CLAUDE.md manager: Deliberately avoids writing to disk. Earlier versions synced to CLAUDE.md; this was removed in favor of stdout injection.
  • Not a conversation log: The agent processes and forgets transcripts — it extracts signal into memory blocks, not verbatim storage.
  • Not a real-time pair programmer: Communication is asynchronous and batched at hook boundaries, not interactive.
  • Not model-locked: Supports any model available on the Letta server (OpenAI, Anthropic, Google, ZAI). Auto-selects if configured model unavailable.

9. Integration Surface

9.1 Claude Code Plugin System

  • Registration: hooks/hooks.json defines four lifecycle hooks
  • Execution: Each hook invokes a TypeScript script via tsx (wrapped by silent-npx.cjs for cross-platform support)
  • I/O contract: Hooks receive JSON on stdin, produce output on stdout (XML for context injection, JSON for PreToolUse)

9.2 Letta Server API

Six REST endpoints consumed:

Endpoint | Method | Purpose
/conversations/ | POST | Create conversation for session
/conversations/{id}/messages | GET | Fetch agent messages
/agents/{id} | GET | Fetch agent + memory blocks
/agents/{id} | PATCH | Update tags, model config
/agents/import | POST | Import agent from .af file
/models/ | GET | List available models
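The endpoint table can be expressed as request builders that produce a method and URL without sending anything. This is a sketch under assumptions: the paths are used exactly as listed (any API version prefix the server expects is not covered by the source), and `RequestSpec`, `joinUrl`, and the builder names are hypothetical.

```typescript
interface RequestSpec {
  method: "GET" | "POST" | "PATCH";
  url: string;
}

// Join a base URL and a path with exactly one slash between them.
function joinUrl(baseUrl: string, path: string): string {
  return baseUrl.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
}

function createConversation(base: string): RequestSpec {
  return { method: "POST", url: joinUrl(base, "/conversations/") };
}

function listMessages(base: string, conversationId: string): RequestSpec {
  const id = encodeURIComponent(conversationId);
  return { method: "GET", url: joinUrl(base, `/conversations/${id}/messages`) };
}

function getAgent(base: string, agentId: string): RequestSpec {
  return { method: "GET", url: joinUrl(base, `/agents/${encodeURIComponent(agentId)}`) };
}

function updateAgent(base: string, agentId: string): RequestSpec {
  return { method: "PATCH", url: joinUrl(base, `/agents/${encodeURIComponent(agentId)}`) };
}

function importAgent(base: string): RequestSpec {
  return { method: "POST", url: joinUrl(base, "/agents/import") };
}

function listModels(base: string): RequestSpec {
  return { method: "GET", url: joinUrl(base, "/models/") };
}
```

A caller would pass the LETTA_BASE_URL value (or the cloud default) as `base` and hand the resulting spec to its HTTP client together with the API key.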

9.3 Letta Code SDK

  • resumeSession() — Resume conversation with tool restrictions
  • session.send() / session.stream() — Send transcript, stream response
  • Tool permission model: allowed/disallowed tool lists per session

9.4 Environment Variables

Variable | Required | Purpose
LETTA_API_KEY | Yes | Authentication with Letta platform
LETTA_MODE | No | Output mode: whisper, full, off
LETTA_AGENT_ID | No | Override agent selection
LETTA_BASE_URL | No | Self-hosted server URL
LETTA_MODEL | No | Override model selection
LETTA_CONTEXT_WINDOW | No | Override context window size
LETTA_HOME | No | Base directory for state files
LETTA_SDK_TOOLS | No | Tool access level: read-only, full, off
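A sketch of how these variables might be read into a typed configuration object. The variable names and the documented defaults (whisper, read-only) come from this specification; the `SubconsciousConfig` shape, the helper names, and the handling of invalid values (fall back to the default, or leave unset) are assumptions.

```typescript
type Env = Record<string, string | undefined>;

interface SubconsciousConfig {
  apiKey: string;
  mode: "whisper" | "full" | "off";
  sdkTools: "read-only" | "full" | "off";
  agentId: string | undefined;
  baseUrl: string | undefined;
  model: string | undefined;
  contextWindow: number | undefined;
  home: string | undefined;
}

// Return raw if it is one of the allowed values, otherwise the fallback.
function oneOf<T extends string>(
  raw: string | undefined,
  allowed: readonly T[],
  fallback: T,
): T {
  for (const option of allowed) {
    if (option === raw) {
      return option;
    }
  }
  return fallback;
}

// Build a config from an environment-like record; only the API key is required.
function loadConfig(env: Env): SubconsciousConfig {
  const apiKey = env["LETTA_API_KEY"];
  if (!apiKey) {
    throw new Error("LETTA_API_KEY is required");
  }
  const rawWindow = env["LETTA_CONTEXT_WINDOW"];
  const parsedWindow = rawWindow === undefined ? NaN : parseInt(rawWindow, 10);
  return {
    apiKey,
    mode: oneOf(env["LETTA_MODE"], ["whisper", "full", "off"] as const, "whisper"),
    sdkTools: oneOf(env["LETTA_SDK_TOOLS"], ["read-only", "full", "off"] as const, "read-only"),
    agentId: env["LETTA_AGENT_ID"],
    baseUrl: env["LETTA_BASE_URL"],
    model: env["LETTA_MODEL"],
    contextWindow: isNaN(parsedWindow) ? undefined : parsedWindow,
    home: env["LETTA_HOME"],
  };
}
```

Taking the environment as a plain record rather than reading a global makes the function easy to exercise with different settings; a hook script would pass its process environment.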

10. Platform Compatibility

Platform | Support | Notes
macOS | Full | Primary development target
Linux | Full | Requires tmpfs workaround if /tmp is on a separate filesystem
Windows | Full | Custom SilentLauncher.exe eliminates console window flashes via PseudoConsole (ConPTY)

This specification describes the observable product behavior of Claude Subconscious as derived from its source code. It is implementation-agnostic where possible and focuses on what the software does, not how it does it internally.
