Reverse-engineered from source: letta-ai/claude-subconscious v2.0.2
Generated: 2026-03-28
Claude Code is an AI coding assistant that operates in ephemeral sessions. Every session starts from zero — no memory of past conversations, learned user preferences, project context, or unfinished work. Users must re-explain their codebase, repeat preferences, and re-establish context every time.
There is no built-in mechanism for Claude Code to accumulate institutional knowledge across sessions, detect behavioral patterns, or proactively surface relevant context before a user asks for it.
Claude Subconscious is a persistent background agent that gives Claude Code a long-term memory. It observes session transcripts, reads the codebase, accumulates knowledge over time, and injects contextual guidance back into Claude Code before each prompt — without ever blocking the user's workflow.
It is not a memory database or a logging service. It is a second agent running underneath Claude Code with its own tools, reasoning, and personality — one that builds rapport, develops opinions, and participates in an ongoing dialogue across sessions.
| Actor | Description |
|---|---|
| User | Developer using Claude Code. Sees whispered guidance inline. Can address Subconscious directly. |
| Claude Code | The primary AI coding assistant. Receives injected context from Subconscious via stdout. Can address Subconscious in responses. |
| Subconscious Agent | A Letta-hosted agent with persistent memory, tool access, and its own system prompt. Observes asynchronously, responds on next sync cycle. |
| Letta Platform | Cloud or self-hosted server that hosts the agent, stores memory blocks, manages conversations, and routes messages. |
The agent maintains structured memory across eight domains:
| Memory Block | What It Captures |
|---|---|
| core_directives | Agent behavioral guidelines and processing logic |
| guidance | Active message to whisper to the next Claude Code session |
| user_preferences | Coding style, tool preferences, communication patterns |
| project_context | Architecture decisions, codebase knowledge, known gotchas |
| session_patterns | Recurring behaviors, time-based patterns, common struggles |
| pending_items | Unfinished work, explicit TODOs, follow-up items |
| self_improvement | Guidelines for evolving its own memory architecture |
| tool_guidelines | How to use available tools effectively |
Memory persists indefinitely on the Letta platform. A single agent brain is shared across all projects by default.
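The block layout above can be sketched as a simple record shape. This is an illustrative model only — the field names (`label`, `value`, `updatedAt`) and the `Map`-based "brain" are assumptions, not the Letta API:

```typescript
// Hypothetical sketch of the agent's memory layout. Block labels come
// from the table above; the value shapes are illustrative only.
type MemoryBlockLabel =
  | "core_directives"
  | "guidance"
  | "user_preferences"
  | "project_context"
  | "session_patterns"
  | "pending_items"
  | "self_improvement"
  | "tool_guidelines";

interface MemoryBlock {
  label: MemoryBlockLabel;
  value: string;      // free-form text the agent edits over time
  updatedAt?: string; // ISO timestamp of the last edit
}

// One shared "brain": a single set of blocks used across all projects.
const brain = new Map<MemoryBlockLabel, MemoryBlock>();
brain.set("guidance", {
  label: "guidance",
  value: "Remind the user about the unfinished retry-logic refactor.",
});
```

Because the blocks are global to the agent rather than scoped per project, anything learned in one codebase is available everywhere by default.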
After each Claude Code response, the full transcript is sent to the Subconscious agent asynchronously. The transcript includes:
- User messages (verbatim)
- Assistant responses (including thinking blocks)
- Tool uses and their results (summarized/truncated for readability)
- Session summaries
The agent processes this material actively — extracting preferences from user corrections, noting stuck patterns, tracking architectural decisions, and identifying unfinished work.
Before each user prompt is processed, the system injects the agent's accumulated context into Claude Code's prompt via stdout. Two modes control what Claude sees:
| Mode | Injected Content |
|---|---|
| whisper (default) | Only messages from Subconscious — lightweight, speaks when it has something to say |
| full | Memory blocks (first prompt) + diffs of changed blocks (subsequent) + messages |
Content is wrapped in XML tags (<letta_message>, <letta_memory_blocks>, <letta_memory_update>) and injected via stdout. Nothing is written to disk — no CLAUDE.md modifications, no file-based side effects.
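A minimal sketch of the whisper delivery path, assuming the tag and attribute names shown in the example output later in this document (the helper names here are hypothetical):

```typescript
// Escape the XML special characters so the message body cannot break
// out of the wrapping tag.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap an agent message in the <letta_message> tag described above.
function wrapWhisper(from: string, body: string, timestamp: string): string {
  return [
    `<letta_message from="${escapeXml(from)}" timestamp="${timestamp}">`,
    escapeXml(body),
    `</letta_message>`,
  ].join("\n");
}

// Printing to stdout is the entire delivery mechanism: nothing is
// written to disk.
process.stdout.write(
  wrapWhisper("Subconscious", "Check the failing CI job first.", new Date().toISOString()) + "\n",
);
```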
During long tool-use workflows, the system checks for new messages or memory changes before each tool execution. If the agent has updated its guidance or memory while Claude Code was working, the updates are injected as additionalContext mid-stream — addressing "workflow drift" in long sessions.
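The fast-path check can be sketched as follows. The result shape is an assumption based on the `additionalContext` mechanism named above, not a documented hook schema:

```typescript
// Sketch of the PreToolUse fast path: return extra context only when
// the agent produced something new while Claude Code was working.
interface PreToolUseResult {
  additionalContext?: string; // injected mid-stream when present
}

function checkForUpdates(newMessages: string[]): PreToolUseResult {
  if (newMessages.length === 0) return {}; // silent no-op (fast path)
  return { additionalContext: newMessages.join("\n") };
}
```

The empty-object fast path matters because this check runs before every single tool execution and must stay well inside the hook's time budget.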
The Subconscious agent has real tool access via the Letta Code SDK, configurable in three tiers:
| Mode | Available Tools | Use Case |
|---|---|---|
| read-only (default) | Read, Grep, Glob, web_search, fetch_webpage | Safe background research and file exploration |
| full | All tools including Bash, Edit, Write, Task | Full autonomy — agent can make changes and spawn sub-agents |
| off | None | Listen-only — processes transcripts without client-side tools |
This means the agent can read your files, search your codebase, and browse the web while processing transcripts — not just passively ingest text.
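The three tiers map naturally onto allowed-tool lists. The tool names below come from the table above; the shape of the permission model itself is an assumption:

```typescript
// Illustrative mapping from SDK tool tier to the tools the agent may use.
type SdkToolsMode = "read-only" | "full" | "off";

function allowedTools(mode: SdkToolsMode): string[] {
  switch (mode) {
    case "read-only":
      // Safe background research: no mutation, no shell.
      return ["Read", "Grep", "Glob", "web_search", "fetch_webpage"];
    case "full":
      // Full autonomy, including edits and sub-agents.
      return ["Read", "Grep", "Glob", "web_search", "fetch_webpage",
              "Bash", "Edit", "Write", "Task"];
    case "off":
      // Listen-only: transcripts are processed with no client-side tools.
      return [];
  }
}
```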
The system supports bidirectional communication:
- Claude Code → Subconscious: Claude Code can address the agent directly in responses. The agent sees everything in the transcript.
- Subconscious → Claude Code: The agent's messages are injected before the next prompt. The user sees them too.
- User → Subconscious: Users can address the agent through Claude Code. The agent responds on the next sync cycle.
This is designed as an ongoing dialogue, not one-way surveillance.
Each Claude Code session gets its own Letta conversation thread. This provides:
- Session-scoped context: Messages within a session stay in that conversation
- Shared memory: All conversations feed into the same agent brain (memory blocks are global)
- Parallel sessions: Multiple Claude Code sessions can run simultaneously, each with their own conversation, all updating the same agent
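The session-to-conversation mapping can be sketched as a small state record. The field names and reuse logic here are illustrative, not the plugin's actual state format:

```typescript
// Hypothetical session state: many conversations, one agent brain.
interface SessionState {
  agentId: string;                        // the single shared agent
  conversations: Record<string, string>;  // sessionId -> conversationId
}

// Reuse an existing conversation for this session, else create one.
function conversationFor(
  state: SessionState,
  sessionId: string,
  create: () => string,
): string {
  return (state.conversations[sessionId] ??= create());
}
```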
The system requires only one credential (LETTA_API_KEY) and handles everything else:
- Auto-imports a bundled default agent if none configured
- Auto-detects available models on the Letta server
- Auto-selects the best available model if the configured one isn't present
- Creates conversations automatically per session
- Manages all state files without user intervention
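The model auto-selection step can be sketched as a preference-ordered fallback. The preference list below is purely illustrative — the actual ordering is not specified here:

```typescript
// Pick the configured model if the server lists it; otherwise fall
// back through an illustrative preference order, then to whatever
// the server has.
function selectModel(available: string[], configured?: string): string | undefined {
  if (configured && available.includes(configured)) return configured;
  const preferred = ["anthropic/claude-sonnet-4-5"]; // assumed ordering
  return preferred.find((m) => available.includes(m)) ?? available[0];
}
```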
Session Start
→ Agent notified with project path, session ID, timestamp
→ Legacy CLAUDE.md content cleaned up
→ Conversation created (or existing one reused)
→ TTY banner displayed: agent name, model, mode, URL
Before Each Prompt
→ Memory blocks fetched (diffs computed against last snapshot)
→ New agent messages retrieved
→ Content injected via stdout as XML
→ State snapshot updated
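The "diffs computed against last snapshot" step above reduces to comparing block values and reporting only what changed. A minimal sketch, assuming a snapshot is a flat label-to-value map:

```typescript
type Snapshot = Record<string, string>; // block label -> block value

// Return the labels of blocks whose value changed (or is new) since
// the last snapshot; unchanged blocks are never re-injected.
function changedBlocks(prev: Snapshot, next: Snapshot): string[] {
  return Object.keys(next).filter((label) => prev[label] !== next[label]);
}
```

In `full` mode these diffs are what Claude Code sees on every prompt after the first, keeping the injected payload small.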
Before Each Tool Use
→ Quick check for new messages or memory changes
→ If updates found, inject as additionalContext
→ Silent no-op if nothing changed (fast path)
After Each Response
→ Full transcript extracted from JSONL session file
→ Formatted as XML and written to temp payload file
→ Background worker spawned (detached, non-blocking)
→ Worker sends payload to agent via Letta Code SDK
→ Agent processes transcript with tool access
→ Agent updates memory blocks as needed
→ Worker updates state file on success
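The detached, non-blocking worker spawn above can be sketched with Node's `child_process`. The worker script name and payload layout are hypothetical:

```typescript
import { spawn } from "node:child_process";
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write the XML transcript payload to a temp file, then hand it to a
// background worker that outlives the hook process.
function dispatchTranscript(xmlPayload: string): string {
  const dir = mkdtempSync(join(tmpdir(), "subconscious-"));
  const payloadPath = join(dir, "payload.xml");
  writeFileSync(payloadPath, xmlPayload);

  // Detached + unref'd so the Stop hook returns immediately;
  // "worker.js" is a hypothetical worker entry point.
  const worker = spawn(process.execPath, ["worker.js", payloadPath], {
    detached: true,
    stdio: "ignore",
  });
  worker.unref();
  return payloadPath;
}
```

Because the hook only writes a file and spawns, the user's next prompt is never blocked on transcript delivery.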
On session start (TTY output — not captured by Claude):
👁️ Subconscious connecting...
Agent: Subconscious (agent-xxxx)
Model: anthropic/claude-sonnet-4-5
Mode: whisper | SDK Tools: read-only
🔗 https://app.letta.com/agents/agent-xxx?conversation=conv-xxx
Before prompts (injected into Claude's context):
<letta_message from="Subconscious" timestamp="2026-01-26T20:37:14+00:00">
You've asked about error handling in async contexts three times this week.
Consider reviewing error handling architecture holistically.
</letta_message>
- No files written to disk (no CLAUDE.md modification)
- No blocking delays (transcript processing is fully async)
- No configuration files to maintain (auto-managed)
- No popups or console windows on Windows (silent launcher)
- User installs plugin via marketplace or clones repo
- User sets the `LETTA_API_KEY` environment variable
- User starts a Claude Code session
- Plugin auto-imports the bundled Subconscious agent
- Agent ID saved to `~/.letta/claude-subconscious/config.json`
- First few sessions: agent observes but has minimal context to whisper
- Over time: agent accumulates preferences, project knowledge, patterns
- Eventually: agent proactively surfaces relevant context before each prompt
- User works in Project A → conversations stored in `project-a/.letta/claude/`
- User switches to Project B → conversations stored in `project-b/.letta/claude/`
- Same agent brain serves both → memory blocks shared across projects
- Agent develops cross-project awareness over time
- User creates custom agent on Letta platform (or via ADE)
- User sets `LETTA_AGENT_ID` in the environment or `.envrc`
- Plugin uses that agent instead of the default
- Agent's own memory architecture and system prompt apply
- User runs their own Letta server
- User sets `LETTA_BASE_URL` to their server address
- Plugin auto-detects available models on that server
- All API calls route to the self-hosted instance
- Hook-based integration: Bound to Claude Code's plugin hook lifecycle (SessionStart, UserPromptSubmit, PreToolUse, Stop). Cannot intercept arbitrary events.
- Stdout-only injection: All context delivery to Claude Code happens via stdout XML. No direct memory manipulation of Claude Code's context window.
- Async transcript delivery: Transcripts are sent after Claude responds, not during. The agent always observes one response behind.
- Hook timeouts: SessionStart and PreToolUse have 5s limits. UserPromptSubmit has 10s. Stop has 120s. Operations must complete within these windows (or be delegated to background workers).
- Letta Platform required: Agent hosting, memory storage, conversation management, and model inference all depend on the Letta server (cloud or self-hosted).
- Node.js ≥ 18: Runtime requirement for all hook scripts.
- Letta Code SDK: The `@letta-ai/letta-code-sdk` package is required for transcript delivery with tool access.
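The hook time budgets listed above can be respected by racing the real work against a deadline, so a slow Letta round-trip degrades to a no-op instead of stalling Claude Code. A minimal sketch (the fallback strategy is an assumption):

```typescript
// Race the hook's work against its time budget; resolve with a safe
// fallback value if the deadline fires first.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  const deadline = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), ms),
  );
  return Promise.race([work, deadline]);
}
```

Anything that cannot fit in the budget (like full transcript processing) is delegated to a background worker instead.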
- Cold start: Agent starts with minimal context. Takes several sessions to accumulate useful knowledge.
- One agent per config: A single agent ID is stored globally. Per-project agents require explicit `LETTA_AGENT_ID` overrides.
- State locality: Conversation mappings are stored in the project directory (`.letta/claude/`). Moving projects or changing working directories loses conversation continuity.
- No real-time streaming: Agent guidance arrives at prompt boundaries and tool-use boundaries — not mid-generation.
- Not a replacement for Claude Code: Subconscious observes and advises — it does not take over coding tasks (unless SDK tools mode is set to "full").
- Not a code generation tool: The agent's purpose is context and memory, not producing code artifacts.
- Not a CLAUDE.md manager: Deliberately avoids writing to disk. Earlier versions synced to CLAUDE.md; this was removed in favor of stdout injection.
- Not a conversation log: The agent processes and forgets transcripts — it extracts signal into memory blocks, not verbatim storage.
- Not a real-time pair programmer: Communication is asynchronous and batched at hook boundaries, not interactive.
- Not model-locked: Supports any model available on the Letta server (OpenAI, Anthropic, Google, ZAI). Auto-selects if configured model unavailable.
- Registration: `hooks/hooks.json` defines four lifecycle hooks
- Execution: Each hook invokes a TypeScript script via `tsx` (wrapped by `silent-npx.cjs` for cross-platform support)
- I/O contract: Hooks receive JSON on stdin and produce output on stdout (XML for context injection, JSON for PreToolUse)
Six REST endpoints consumed:
| Endpoint | Method | Purpose |
|---|---|---|
| `/conversations/` | POST | Create conversation for session |
| `/conversations/{id}/messages` | GET | Fetch agent messages |
| `/agents/{id}` | GET | Fetch agent + memory blocks |
| `/agents/{id}` | PATCH | Update tags, model config |
| `/agents/import` | POST | Import agent from .af file |
| `/models/` | GET | List available models |
Letta Code SDK surface consumed:
- `resumeSession()` — Resume a conversation with tool restrictions
- `session.send()` / `session.stream()` — Send transcript, stream response
- Tool permission model: allowed/disallowed tool lists per session
| Variable | Required | Purpose |
|---|---|---|
| `LETTA_API_KEY` | Yes | Authentication with Letta platform |
| `LETTA_MODE` | No | Output mode: whisper, full, off |
| `LETTA_AGENT_ID` | No | Override agent selection |
| `LETTA_BASE_URL` | No | Self-hosted server URL |
| `LETTA_MODEL` | No | Override model selection |
| `LETTA_CONTEXT_WINDOW` | No | Override context window size |
| `LETTA_HOME` | No | Base directory for state files |
| `LETTA_SDK_TOOLS` | No | Tool access level: read-only, full, off |
| Platform | Support | Notes |
|---|---|---|
| macOS | Full | Primary development target |
| Linux | Full | Requires tmpfs workaround if /tmp on separate filesystem |
| Windows | Full | Custom SilentLauncher.exe eliminates console window flashes via PseudoConsole (ConPTY) |
This specification describes the observable product behavior of Claude Subconscious as derived from its source code. It is implementation-agnostic where possible and focuses on what the software does, not how it does it internally.