NanoClaw — Personal Claude Assistant (second brain for a diplomat)

A self-hosted, compounding-memory AI assistant running on a Raspberry Pi.


What Is This?

NanoClaw is a personal AI assistant built on Anthropic's Claude that runs entirely on a Raspberry Pi. It connects to messaging channels (WhatsApp, Telegram, Slack, Discord), processes voice and images, schedules recurring tasks, and — unlike a standard chatbot — accumulates knowledge over time through a structured memory system.


The Core Problem It Solves

Standard LLM assistants are stateless. They forget everything between sessions. The typical fix is RAG over a document store, but RAG retrieves chunks of raw text, not synthesised knowledge.

This system does three things instead:

  1. Extract — pull discrete facts, insights, and style preferences from raw documents (speeches, articles, conversations) into a graph database. Each entry is a self-contained, retrievable statement.
  2. Synthesise — compile those facts into human-readable wiki pages organised by entity, concept, and timeline.
  3. Recall — on every agent invocation, run a semantic query against the graph using the user's message as input. Relevant entries are injected as context before the agent responds.

The result is an agent that gets smarter over time, surfaces what it knows automatically, and can cite specific stored facts when it explains its reasoning.
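As a minimal sketch of the Recall step, the flow might look like this in TypeScript. The `mnemon recall --json` invocation and the reminder format are illustrative assumptions, not the project's actual interfaces:

```typescript
// Sketch of the auto-recall step (step 3 above). The `mnemon recall`
// CLI flags and the reminder format are illustrative assumptions.
import { execFileSync } from "node:child_process";

interface RecalledFact {
  content: string;
  importance: number; // higher = more worth surfacing
}

// Query the knowledge graph semantically with the user's message.
function recallFacts(userMessage: string): RecalledFact[] {
  const out = execFileSync("mnemon", ["recall", "--json", userMessage], {
    encoding: "utf8",
  });
  return JSON.parse(out);
}

// Format recalled facts as context injected before the agent responds.
function buildMemoryReminder(facts: RecalledFact[]): string {
  if (facts.length === 0) return "";
  const lines = facts
    .sort((a, b) => b.importance - a.importance)
    .map((f) => `- ${f.content}`);
  return `Relevant stored knowledge:\n${lines.join("\n")}`;
}
```

Because the reminder is built on every invocation, the agent sees its relevant memory without ever deciding to look something up.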


Architecture

Raw sources          →    mnemon graph         →    wiki pages
(transcripts/            (structured facts,         (narrative syntheses,
 articles,                graph nodes,               human-readable,
 web clips)               semantic retrieval)        cross-referenced)
         ↑                      ↑                          ↑
   ingest pipeline        auto-recall hook          synthesise operation

Layer 1 — Raw Sources

Archival files, never modified after storage:

  • Speech transcripts in markdown
  • Articles saved from URL ingest or mobile web clipping (via Obsidian Web Clipper)

Layer 2 — mnemon Knowledge Graph

A SQLite-backed graph database where each entry has: content, category, importance score, tags, timestamp, and graph edges to related entries. Queried semantically via local vector embeddings (Ollama + nomic-embed-text).

Two stores:

  • Global — shared knowledge across all groups; read-only for non-main agents
  • Local — per-group memory, writable only by that group's agent
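Putting the fields listed above together, a mnemon entry might look like the following shape. The field names and category values are inferred from the description, not the actual schema:

```typescript
// Illustrative shape of a mnemon knowledge-graph entry, inferred from
// the fields described above (content, category, importance, tags,
// timestamp, edges). Not the actual schema.
interface MnemonEntry {
  id: number;
  content: string;            // self-contained, retrievable statement
  category: "fact" | "insight" | "style";
  importance: number;         // e.g. 0.0 – 1.0
  tags: string[];
  timestamp: string;          // ISO 8601
  edges: number[];            // ids of related entries in the graph
  embedding?: number[];       // nomic-embed-text vector, for semantic recall
}

const example: MnemonEntry = {
  id: 42,
  content: "Prefers concise briefings with a one-line summary first.",
  category: "style",
  importance: 0.8,
  tags: ["briefings", "preferences"],
  timestamp: "2025-01-15T08:30:00Z",
  edges: [7, 19],
};
```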

Layer 3 — Wiki Pages

Synthesised markdown files compiled from mnemon facts. Not raw extracts — full narrative pages with cross-references, organised into entities/, concepts/, and timelines/ subdirectories. Browsable in Obsidian on macOS and iOS.


Technology Stack

| Component | Purpose |
| --- | --- |
| NanoClaw (Node.js + TypeScript) | Orchestrator: message loop, container management, channel routing |
| Claude Agent SDK | Runs agent logic inside isolated Docker containers per group |
| Baileys | WhatsApp Web protocol (no business API needed) |
| mnemon | Custom CLI knowledge graph tool (SQLite + graph traversal) |
| Ollama + nomic-embed-text | Local vector embeddings for semantic recall — runs on Pi, no cloud calls |
| whisper.cpp | Local voice transcription — converts voice notes to text on-device |
| sharp | Image resize and processing before multimodal Claude calls |
| OneCLI | Credential proxy — containers never see raw API keys |
| SQLite | Message store, group registry, task scheduler |
| systemd | Process management (nanoclaw service + article watcher) |

Key Capabilities

Messaging channels: WhatsApp, Gmail (read + send), and Web are active. Telegram, Slack, and Discord are available as installable skill branches.

Multimodal: Voice notes transcribed locally via whisper.cpp before the agent sees them. Images resized and passed as multimodal content to Claude.

Memory: Every agent invocation triggers a semantic recall against the knowledge graph. Relevant facts surface automatically as a system reminder — the agent never has to decide to "look something up."

Task scheduler: Cron, interval, and one-time tasks. Supports a bash pre-check script to avoid waking the agent unnecessarily — keeps API usage minimal.
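The three schedule kinds plus the optional pre-check could be modelled roughly like this. Field names are assumptions for illustration:

```typescript
// Sketch of a scheduled-task record covering the three schedule kinds
// described above. Field names are illustrative, not NanoClaw's schema.
type Schedule =
  | { kind: "cron"; expr: string }         // e.g. "0 7 * * 1-5"
  | { kind: "interval"; everyMs: number }  // repeat every N milliseconds
  | { kind: "once"; at: string };          // ISO timestamp, fire once

interface ScheduledTask {
  id: string;
  groupFolder: string;  // which group's agent the task wakes
  prompt: string;       // what the agent is asked to do
  schedule: Schedule;
  preCheck?: string;    // bash script path; agent wakes only if it signals wakeAgent: true
}

const morningBrief: ScheduledTask = {
  id: "morning-brief",
  groupFolder: "whatsapp_main",
  prompt: "Summarise overnight messages and new articles.",
  schedule: { kind: "cron", expr: "0 7 * * *" },
};
```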

Multi-group isolation: Each registered group gets an isolated Docker container, filesystem, local mnemon store, and Claude session. Groups cannot read each other's memory or messages.
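Assembling such a per-group container launch might look like the sketch below. The Docker flags are standard, but the image name, host paths, and mount layout are assumptions, not NanoClaw's actual configuration:

```typescript
// Sketch of building a per-group `docker run` invocation. Standard
// Docker flags; image name and paths are illustrative assumptions.
function buildContainerArgs(groupFolder: string): string[] {
  return [
    "run",
    "--rm",                              // container removed on exit
    "--name", `nanoclaw-${groupFolder}`,
    // Each group's container sees only its own files and session state:
    "-v", `/home/pi/nanoclaw/groups/${groupFolder}:/workspace/group`,
    "-v", `/home/pi/nanoclaw/data/sessions/${groupFolder}:/workspace/session`,
    "nanoclaw-agent:latest",
  ];
}
```

The orchestrator would pass these arguments to something like `child_process.spawn("docker", args)`, and stop the container after the idle timeout.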

Subagent teams: Agents can spawn specialised child agents for parallel work (research, web browsing, data extraction) via Claude's experimental agent teams feature.

Web interface: Multi-conversation portal (port 3080, configurable) for chatting outside WhatsApp. Full Markdown rendering, multiple simultaneous group conversations, and persistent history.

Obsidian integration: Wiki pages sync to an Obsidian vault on macOS/iOS via iCloud + rsync bridge. Web clips from the Obsidian mobile app flow in the other direction — clip → iCloud → Pi → ingest pipeline.


Security Design

  • Containers never see raw API keys. An HTTP proxy (OneCLI) intercepts container HTTPS traffic and injects credentials at request time.
  • Sender allowlist — optional per-chat control over who can trigger the agent. Two modes: in trigger mode, messages from non-allowed senders are stored for context but cannot wake the agent; in drop mode, their messages are not stored at all.
  • Mount allowlist — stored outside the project root so containers cannot read it. Controls which host directories can be mounted into containers; blocks sensitive path patterns (.ssh, .aws, *.pem, etc.).
  • Per-group IPC namespacing — each group can only send messages to its own JID. Source identity is verified by directory path, not message content. Main group has elevated privileges.
  • Group folder validation — folder names are strictly validated (alphanumeric, hyphens, underscores only; no path traversal).
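The folder-name rule in the last bullet is small enough to sketch directly. This is one plausible implementation of "alphanumeric, hyphens, underscores only":

```typescript
// Sketch of the group-folder validation rule described above:
// alphanumeric characters, hyphens, and underscores only, so a name
// like "../../etc" can never traverse out of the groups/ directory.
function isValidGroupFolder(name: string): boolean {
  return /^[A-Za-z0-9_-]+$/.test(name);
}
```

Validating the name before it is ever joined into a path is what makes the directory-path identity check in the IPC layer trustworthy.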

Interesting Design Decisions

Why Docker containers per group, not one process? Isolation. Each group gets a clean environment, its own filesystem, and its own Claude session. A runaway agent in one group can't affect others. Container lifetime is tied to conversation activity — they shut down after idle timeout.

Why iCloud + rsync for Obsidian sync, not git? iOS git clients (obsidian-git, isomorphic-git) have unreliable auth and clone failures in practice. iCloud is native to iOS, zero-config, and free. rsync is directional and battle-tested. A Mac Mini acts as the bridge (always on, same LAN as the Pi).

Why mnemon + wiki pages, not just RAG? RAG retrieves text chunks; mnemon stores synthesised facts as discrete nodes. The wiki layer adds human-readable narratives that can be reviewed and corrected. The two-tier design (mnemon for recall, wiki for synthesis) separates retrieval from presentation. The wiki pattern is inspired by Andrej Karpathy's LLM Wiki concept — extracting structured knowledge from raw sources rather than indexing them whole.

Why local embeddings? The knowledge base contains personal and policy-sensitive content. Running nomic-embed-text locally on the Pi means no document content leaves the network. The 274MB model runs fast enough on the Pi 5 for this use case.
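Local recall then reduces to two pieces: an embedding call against the local Ollama server and a similarity ranking. The endpoint and response shape below follow Ollama's `/api/embeddings` API; the rest is an illustrative sketch:

```typescript
// Sketch of local semantic recall: embed text via the Ollama HTTP API
// (default port 11434), then rank stored entries by cosine similarity.
// Endpoint and response shape follow Ollama's /api/embeddings;
// everything else is illustrative.
async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  const { embedding } = await res.json();
  return embedding;
}

// Similarity between two embedding vectors, in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

No text or vectors ever leave the Pi: both the embedding call and the similarity ranking are local.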

Why whisper.cpp locally? Same reason — voice notes contain private conversations. Running whisper.cpp locally keeps audio on-device. The base model is fast enough on the Pi 5 for practical use.

Why a task pre-check script? Each agent invocation uses API credits. For tasks like "check if there are new PRs" or "did anything change?", a bash script can answer the question without waking the LLM. The agent only runs when the script signals wakeAgent: true.
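The pre-check contract could be as simple as the sketch below: run the script and scan its stdout for the `wakeAgent: true` signal. The exact signalling convention is inferred from the description above:

```typescript
// Sketch of the pre-check contract: run the task's bash script and wake
// the agent only if stdout contains "wakeAgent: true". The signalling
// convention is inferred from the description, not the actual protocol.
import { execFileSync } from "node:child_process";

function shouldWakeAgent(stdout: string): boolean {
  return /^wakeAgent:\s*true\s*$/m.test(stdout);
}

function runPreCheck(scriptPath: string): boolean {
  try {
    const out = execFileSync("bash", [scriptPath], { encoding: "utf8" });
    return shouldWakeAgent(out);
  } catch {
    // A failing pre-check should not wake the agent.
    return false;
  }
}
```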


Project Structure (overview)

src/                    orchestrator source (TypeScript)
  channels/             WhatsApp, Telegram, Slack, Discord, Gmail adapters
  container-runner.ts   Docker container lifecycle management
  task-scheduler.ts     cron/interval/once scheduler
  ipc.ts                inter-process messaging (JSON file drops)
container/
  agent-runner/         agent entrypoint running inside containers
  skills/               container-side skill markdown files
  mnemon                compiled mnemon binary
groups/
  global/               shared knowledge (CLAUDE.md, wiki, transcripts, articles)
  {channel}_{group}/    per-group files (CLAUDE.md, attachments, conversations)
data/
  sessions/{group}/     per-group Claude sessions, local mnemon, IPC streams
  ipc/{group}/          message and task drop directories
scripts/
  watch-articles.sh     inotifywait watcher → IPC ingest task on new article
docs/
  obsidian-setup/       Mac Mini rsync scripts and launchd plist

Status

Actively running on a Raspberry Pi 5 (aarch64) as a personal assistant. The system is in daily use — processing messages, running scheduled briefings, ingesting articles, and building up the knowledge graph continuously.

NanoClaw is open source: github.com/qwibitai/nanoclaw
