@jimmytherobot-ai
Last active April 25, 2026 13:43
AI Agent Memory System Architecture — OpenClaw + Obsidian + CouchDB

Architecture: AI Agent Memory System

How we built persistent memory for an AI assistant using OpenClaw, Obsidian, and self-hosted infrastructure.


The Problem

AI assistants wake up blank every session. No memory of yesterday's conversation, last week's decisions, or the context that makes them useful. Every interaction starts from zero.

We built a three-layer memory system that gives our agent persistent, searchable memory across sessions, with storage and sync kept on self-hosted infrastructure rather than third-party cloud services.


System Overview

┌─────────────────────────────────────────────────────┐
│                    HUMAN (Laptop)                     │
│                                                       │
│   Obsidian Vault ◄──── LiveSync (CouchDB) ────►     │
│   - Browse all memory                                 │
│   - Edit/correct agent's notes                        │
│   - Research library                                  │
│   - Shared workspace                                  │
└───────────────────────┬───────────────────────────────┘
                        │ Tailscale (encrypted P2P)
                        │
┌───────────────────────▼───────────────────────────────┐
│                  MAC STUDIO (Always-on)                │
│                                                        │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐ │
│  │   OpenClaw   │  │   Obsidian   │  │   CouchDB    │ │
│  │   Gateway    │  │    Vault     │  │  (Docker)    │ │
│  │             │  │              │  │              │ │
│  │  Agent ◄────┼──┤  Memory/     │  │  LiveSync    │ │
│  │  Runtime    │  │  Research/   │  │  E2E Encrypted│ │
│  │             │  │  MEMORY.md   │  │              │ │
│  └──────┬──────┘  └──────────────┘  └──────────────┘ │
│         │                                              │
│  ┌──────▼──────────────────────────────────────────┐  │
│  │              THREE-LAYER MEMORY                  │  │
│  │                                                  │  │
│  │  Layer 1: Knowledge Graph  (life/areas/)         │  │
│  │  Layer 2: Daily Notes      (memory/YYYY-MM-DD)   │  │
│  │  Layer 3: Tacit Knowledge  (MEMORY.md)           │  │
│  └──────────────────────────────────────────────────┘  │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │              INGESTION PIPELINES                  │  │
│  │                                                   │  │
│  │  Telegram Export ──► Daily conversation logs       │  │
│  │  Fireflies ──► Meeting transcripts                │  │
│  │  X/Twitter ──► Research notes (via bird CLI)      │  │
│  │  Web links ──► Summarized & filed to Research/    │  │
│  │  Email ──► Inbox monitoring & alerts              │  │
│  │  Calendar ──► Upcoming event awareness            │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────┘

Three-Layer Memory

Layer 1: Knowledge Graph (life/areas/)

Entity-based storage for people, companies, and projects.

life/areas/
├── people/
│   ├── alex-chen/
│   │   ├── summary.md      ← Living summary (rewritten weekly)
│   │   └── items.json      ← Atomic timestamped facts
│   └── sarah-park/
├── companies/
│   └── acme-protocol/
└── projects/
    └── gov-token/

How it works:

  • Every new fact gets appended to items.json with a timestamp
  • Facts are never deleted — superseded facts get marked with "status": "superseded"
  • summary.md is auto-regenerated weekly from active facts
  • Agent loads summary.md first (fast context), drills into items.json when needed

Example items.json entry:

{
  "id": "fact-20260220-001",
  "ts": "2026-02-20T14:00:00Z",
  "type": "role",
  "content": "Growth Lead at Acme Protocol",
  "source": "direct",
  "status": "active"
}
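The append-and-supersede rule can be sketched in Python. This is a minimal illustration under stated assumptions, not the system's actual code: it assumes items.json holds a JSON array of entries shaped like the example, that at most one fact per `type` is active at a time, and that IDs follow the `fact-YYYYMMDD-NNN` shape seen above. The function names and the `superseded_by` field are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def load_items(path: Path) -> list:
    """Return the fact list, or an empty list if the file does not exist yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def add_fact(path: Path, fact_type: str, content: str, source: str = "direct",
             now: Optional[datetime] = None) -> dict:
    """Append a new active fact and mark older active facts of the same type superseded."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    items = load_items(path)
    day = now.strftime("%Y%m%d")
    # Sequence number within the day, following the "fact-YYYYMMDD-NNN" shape.
    seq = sum(1 for it in items if it["id"].startswith(f"fact-{day}-")) + 1
    new = {
        "id": f"fact-{day}-{seq:03d}",
        "ts": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "type": fact_type,
        "content": content,
        "source": source,
        "status": "active",
    }
    for it in items:
        # Assumption: one active fact per type. Old entries are kept, never removed.
        if it["type"] == fact_type and it["status"] == "active":
            it["status"] = "superseded"
            it["superseded_by"] = new["id"]  # hypothetical back-reference field
    items.append(new)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(items, indent=2), encoding="utf-8")
    return new


def active_facts(path: Path) -> list:
    """Facts that a weekly summary.md regeneration would draw from."""
    return [it for it in load_items(path) if it["status"] == "active"]
```

Because superseded entries stay in the file, the full history of an entity remains available in items.json, while `active_facts` gives the short list a summary would be built from.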

Layer 2: Daily Notes (memory/YYYY-MM-DD.md)

Raw event logs. What happened, when, with whom. Written continuously during sessions.

# 2026-02-20

## Morning
- Reviewed GPU financing news briefing (a major GPU financing deal)
- Fixed Telegram export auth (session expired, re-authed)
- Set up Obsidian Git sync between Studio and laptop

## Afternoon
- SEO audit of acme.xyz homepage (double-slash bug in canonical URLs)
- Analyzed a DeFi protocol integration doc
- Drafted outreach to a crypto podcast
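
One way the continuous writing could work is a small helper that appends a bullet to the day's file. The sketch below is an assumption about the mechanics, not the actual implementation: the function name is hypothetical, and the rule that entries before 12:00 go under "Morning" and the rest under "Afternoon" is inferred from the example headings.

```python
from datetime import datetime
from pathlib import Path


def append_daily_note(memory_dir: Path, text: str, now: datetime) -> Path:
    """Append one bullet to memory/YYYY-MM-DD.md, creating the file and section if needed."""
    path = memory_dir / f"{now:%Y-%m-%d}.md"
    # Assumption: the section is chosen by hour of day.
    section = "## Morning" if now.hour < 12 else "## Afternoon"
    memory_dir.mkdir(parents=True, exist_ok=True)
    existing = path.read_text(encoding="utf-8") if path.exists() else f"# {now:%Y-%m-%d}\n"
    lines = existing.rstrip("\n").split("\n")
    if section not in lines:
        lines += ["", section]
    # Find the end of the section: the next "## " heading, or the end of the file.
    start = lines.index(section) + 1
    end = next((i for i in range(start, len(lines)) if lines[i].startswith("## ")), len(lines))
    # Step back over blank lines so the bullet joins the existing list.
    while end > start and lines[end - 1] == "":
        end -= 1
    lines.insert(end, f"- {text}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

The helper only ever adds lines, which matches the role of this layer as a raw log: organizing and distilling happen later, in the other two layers.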

Layer 3: Tacit Knowledge (MEMORY.md)

The distilled essence. Not facts about the world — patterns about how the human operates. Preferences, lessons learned, behavioral rules, things that only matter in context.

## Behavioral Rules
- ALWAYS confirm before sending emails
- Respect off-hours boundaries
- Article generation uses sub-agents, never block main session
- Markdown tables don't render on Telegram — use table-snap skill

## Lessons Learned
- Writing style matters a lot — Professional tone, no platitudes
- Technical deep-dives are less interesting than investor angles
- Always use sub-agents for complex tasks — faster iteration

The hierarchy: Layer 1 answers "what do I know about X?" Layer 2 answers "what happened recently?" Layer 3 answers "how should I behave?"
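
As a sketch of how the Layer 3 file might be consumed, the snippet below groups the bullets of a MEMORY.md-style document under their `## ` headings so that, for example, only the behavioral rules are pulled into a prompt. The function name is hypothetical, and the parsing assumes the simple heading-plus-bullets layout shown in the example.

```python
def parse_memory_md(text: str) -> dict:
    """Group "- " bullets under the "## " heading they appear beneath."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


sample = """## Behavioral Rules
- ALWAYS confirm before sending emails
- Respect off-hours boundaries

## Lessons Learned
- Always use sub-agents for complex tasks
"""
rules = parse_memory_md(sample)["Behavioral Rules"]
```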


Sync Architecture

Obsidian LiveSync (CouchDB)

Both devices sync through a self-hosted CouchDB instance running in Docker on the Mac Studio.

  • End-to-end encrypted — data is encrypted before leaving the device
  • Real-time — changes appear within seconds
  • Self-hosted — no data on third-party servers
  • Conflict resolution — automatic merge for simple conflicts
Laptop (Obsidian) ◄──── CouchDB (Docker) ────► Studio (Obsidian)
                         localhost:5984
                    Accessible via Tailscale
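
A quick way to confirm the sync endpoint is reachable from either device is CouchDB's `/_up` health endpoint. The sketch below is illustrative: the function name and the injectable `opener` parameter are assumptions made for testability, and from the laptop the base URL would be the Studio's Tailscale address rather than localhost.

```python
import json
import urllib.request


def couchdb_is_up(base_url: str = "http://localhost:5984",
                  opener=urllib.request.urlopen) -> bool:
    """Return True if CouchDB's /_up endpoint answers with status "ok"."""
    try:
        with opener(f"{base_url}/_up", timeout=5) as resp:
            return json.load(resp).get("status") == "ok"
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; bad JSON raises ValueError.
        return False
```

If the server is configured to require a valid user for every request, this unauthenticated check will report False even when CouchDB is running, so treat a negative result as "not confirmed" rather than "down".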

Why not Obsidian Sync / iCloud / GitHub?

| Method | Problem |
|---|---|
| Obsidian Sync | $4/mo, data on Obsidian's servers |
| iCloud | Sync conflicts, no encryption control, slow |
| GitHub | Private data in git history forever, 5-min delay |
| Syncthing | Good but no Obsidian-native conflict resolution |
| LiveSync + CouchDB | Free, self-hosted, encrypted, real-time, conflict-aware |

Research Pipeline

Every link shared by the human is automatically:

  1. Logged to a Google Sheet (link collection with date, source, summary, tags)
  2. Summarized using web fetch + AI analysis
  3. Filed as a note in Research/ with subfolders by source type:
Research/
├── X/              ← Tweets (45+ notes)
├── Articles/       ← Blog posts, news
├── GitHub/         ← Repositories
├── Tools/          ← Product sites
├── Video/          ← YouTube
└── Docs/           ← Google Docs, Notion pages

Each note follows a standard template:

---
date: 2026-02-20
url: https://example.com
source: Author Name
tags: relevant, tags, here
---

# Title

## Summary
2-3 paragraph overview

## Key Takeaways
- Bullet points

## Relevance
How this connects to current work
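
The filing step can be sketched as two small pieces: choosing a subfolder and rendering the template. This is an illustration only. The host-to-folder mapping, the slug-based file name, and all function names are assumptions. The source does not say how a link is classified, and a folder like Tools/ cannot be inferred from the host alone, so unknown hosts fall back to Articles/ here.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Subfolder names come from the Research/ tree; this mapping is an assumption.
FOLDER_BY_HOST = {
    "x.com": "X", "twitter.com": "X",
    "github.com": "GitHub",
    "youtube.com": "Video", "youtu.be": "Video",
    "docs.google.com": "Docs", "notion.so": "Docs",
}


def folder_for(url: str) -> str:
    """Pick a Research/ subfolder from the URL's host, defaulting to Articles."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return FOLDER_BY_HOST.get(host, "Articles")


def slugify(title: str) -> str:
    """Lowercase the title and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def note_path(research_dir: Path, title: str, url: str) -> Path:
    return research_dir / folder_for(url) / f"{slugify(title)}.md"


def render_note(title, url, source, tags, summary, takeaways, relevance, date) -> str:
    """Fill the standard template; `date` is an ISO string such as "2026-02-20"."""
    bullets = "\n".join(f"- {t}" for t in takeaways)
    return (
        "---\n"
        f"date: {date}\n"
        f"url: {url}\n"
        f"source: {source}\n"
        f"tags: {', '.join(tags)}\n"
        "---\n\n"
        f"# {title}\n\n"
        f"## Summary\n{summary}\n\n"
        f"## Key Takeaways\n{bullets}\n\n"
        f"## Relevance\n{relevance}\n"
    )
```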

Ingestion Sources

| Source | Method | Frequency | Output |
|---|---|---|---|
| Telegram | Telethon API export | Daily (cron) | Encrypted daily logs |
| Meetings | Fireflies webhook | Real-time | Transcripts in vault |
| X/Twitter | bird CLI | On-demand | Research notes |
| Web links | web_fetch + AI | Every shared link | Research notes |
| Email | gog CLI (Gmail) | Heartbeat checks | Alerts for urgent items |
| Calendar | gog CLI | Every 2 hours (cron) | Upcoming event awareness |

Session Lifecycle

Session Start
    │
    ├── Read SOUL.md (personality)
    ├── Read USER.md (human context)
    ├── Read NOW.md (active work)
    ├── Read memory/YYYY-MM-DD.md (today + yesterday)
    └── Read MEMORY.md (tacit knowledge)
    │
    ▼
Active Session
    │
    ├── Write to memory/YYYY-MM-DD.md (continuous)
    ├── Update knowledge graph (new facts)
    ├── File research notes (shared links)
    └── Log feedback (SkillRL loop)
    │
    ▼
Session End / Context Limit
    │
    ├── Update NOW.md (lifeboat for next session)
    └── Commit memory files to vault
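
The session-start reads can be sketched as an ordered file list that is concatenated into the initial context. This is a minimal illustration with assumptions: it places SOUL.md, USER.md, NOW.md, and MEMORY.md under a single root next to memory/, which the diagram does not state, and the function names and the comment-style separator are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path


def session_start_files(root: Path, today: date) -> list:
    """Files read at session start, in the order shown in the lifecycle diagram."""
    yesterday = today - timedelta(days=1)
    return [
        root / "SOUL.md",                                   # personality
        root / "USER.md",                                   # human context
        root / "NOW.md",                                    # active work
        root / "memory" / f"{today.isoformat()}.md",        # today's daily note
        root / "memory" / f"{yesterday.isoformat()}.md",    # yesterday's daily note
        root / "MEMORY.md",                                 # tacit knowledge
    ]


def load_session_context(root: Path, today: date) -> str:
    """Concatenate the files that exist, each preceded by a marker naming its path."""
    parts = []
    for path in session_start_files(root, today):
        if path.is_file():  # a missing daily note is normal early in the day
            rel = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8").strip()
            parts.append(f"<!-- {rel} -->\n{body}")
    return "\n\n".join(parts)
```

Skipping missing files instead of failing keeps a fresh day, or a fresh install, from blocking the session.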

SkillRL Feedback Loop

Inspired by the SkillRL paper — structured learning from both failures and successes.

Interaction → Structured Log → Distillation → Rule Injection → Better Agent
     ↑                                                              │
     └──────────────────────────────────────────────────────────────┘

Every correction, failure, or insight gets logged as structured JSONL:

{
  "ts": "2026-02-20T14:30:00Z",
  "type": "correction",
  "category": "communication",
  "what": "Sent birthday message on wrong date",
  "fix": "Check MEMORY.md dates before sending greetings",
  "lesson": "Don't trust stale cron reminders — verify against source data",
  "severity": "low"
}

A distillation cron job runs every 2 days, extracts recurring patterns from the log, and injects the resulting rules into the agent's configuration, so that corrections carry over into later sessions.
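
The logging and distillation steps can be sketched as follows. The logging half mirrors the JSONL format above. The distillation half is a deliberately simple stand-in, a frequency threshold per category, and is not a description of how the real distillation works. Function names and the `min_count` parameter are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def log_feedback(path: Path, entry: dict) -> None:
    """Append one entry to the log as a single JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_feedback(path: Path) -> list:
    """Load all entries, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def distill(entries: list, min_count: int = 2) -> list:
    """Return candidate rules: lessons from categories seen at least min_count times."""
    by_category = defaultdict(list)
    for e in entries:
        by_category[e["category"]].append(e)
    rules = []
    for category, group in sorted(by_category.items()):
        if len(group) < min_count:
            continue  # a single incident is not yet treated as a pattern
        # dict.fromkeys removes duplicate lessons while keeping first-seen order.
        for lesson in dict.fromkeys(e["lesson"] for e in group):
            rules.append(f"[{category}] {lesson}")
    return rules
```

The returned strings are only candidates. Since the vault is shared, a natural place to review them is MEMORY.md, where the human can edit or reject a rule before it shapes behavior.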


Infrastructure

| Component | Tech | Purpose |
|---|---|---|
| Agent Runtime | OpenClaw + Claude | AI assistant framework |
| Memory Storage | Markdown files (plain text) | Portable, version-controlled |
| Knowledge Graph | JSON + Markdown | Entity-based fact storage |
| Vault Sync | CouchDB + LiveSync | Real-time encrypted sync |
| Network | Tailscale | Encrypted P2P between devices |
| Compute | Mac Studio (M-series, always-on) | Runs everything |
| Containers | OrbStack (Docker) | CouchDB, other services |
| Scheduling | OpenClaw Cron | Automated pipelines |

Total cost: $0/month (excluding compute hardware and API usage)


Key Design Decisions

  1. Plain text over databases. Markdown files are portable, human-readable, and work with any tool. No vendor lock-in.

  2. Three layers, not one. Different memory types serve different retrieval patterns. Quick entity lookups ≠ "what happened yesterday" ≠ "how should I behave."

  3. Human can see everything. The Obsidian vault is shared — the human can browse, edit, and correct any memory. This builds trust and catches errors.

  4. Self-hosted everything. Personal data stays on personal hardware. No cloud services for storage or sync.

  5. Write-first, organize-later. Capture everything in daily notes. Distill into knowledge graph and MEMORY.md periodically. Don't let organization overhead prevent capture.

  6. Never delete, only supersede. Facts change. Old facts aren't wrong — they're historical. Mark them superseded, don't erase them.


Built with OpenClaw, Obsidian, CouchDB, and Tailscale. No proprietary services required.
