Status: PLANNED
Created: 2026-02-16
Molty dispatches tasks and delivers notifications but is blind to what actually happens during runs and discussions. Claude generates rich artifacts (JSONL ACKs with step notes, context summaries, output files) but nobody reads or aggregates them. Molty cannot answer "what happened today?" or "what's the status of project X?" because there's no memory layer between raw events and conversational knowledge.
Add a post-completion observer to simple_runner.js that triggers after every RUN_COMPLETE and DISCUSS_COMPLETE. The observer collects the raw bundle (task description, JSONL ACK notes, output file, context summary) and sends it to Gemini Flash for extraction. The AI distills what happened, what decisions were made, what failed, and what to remember — then appends it to Molty's memory (both as a daily markdown digest and indexed into clawdbot's semantic memory via memory_search). A new /checkin command in reply_handler.js lets Swap ask Molty for a conversational summary of recent activity.
| Aspect | Current State | After Implementation |
|---|---|---|
| What Molty knows about runs | Just "RUN_COMPLETE" status | Full summary: what changed, decisions, blockers, outcomes |
| What Molty knows about discussions | Nothing after DISCUSS_COMPLETE | Key conclusions, decisions, action items |
| "What happened today?" | Manual — Swap reads Telegram history | /checkin gives conversational daily summary |
| "Status of project X?" | Check GitHub/dashboard manually | Molty recalls recent activity per project |
| Cross-run context | Each run starts fresh | Molty can reference prior work in conversations |
| Memory storage | 4 hand-written diary entries | Auto-growing daily digests + searchable memory |
- Claude completes a `/run` task and writes a RUN_COMPLETE ACK to JSONL
- simple_runner.js detects RUN_COMPLETE, collects the raw bundle (task.json, JSONL notes, output.md, context.md)
- simple_runner.js calls Gemini Flash with the bundle + extraction prompt
- AI returns structured memory: project, outcome, decisions, blockers, what to remember
- Memory gets appended to `memory/events/YYYY-MM-DD.jsonl` and written to `memory/diary/YYYY-MM-DD.md`
- Memory also gets indexed into clawdbot's semantic memory store for future conversations
- Swap types `/checkin` in any Telegram thread
- Molty reads today's diary + recent events, synthesizes a conversational status update
Event Capture (simple_runner.js):
- After RUN_COMPLETE: collect task description, all JSONL step notes, output file content, context summary
- After DISCUSS_COMPLETE: collect discussion topic, all turn files, final output summary
- Bundle all collected data into a single prompt payload
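
The bundling step could look roughly like the sketch below. It is not the actual simple_runner.js code: the ACK field name `note`, the function names, and the 4-characters-per-token estimate used to approximate the 8K-token cap are all assumptions made for illustration. The 20-note limit and the 8K-token cap come from the risk table in this plan.

```javascript
// Sketch only. Assumes each ACK line is a JSON object with a `note` string (hypothetical field name).
const MAX_STEP_NOTES = 20;      // from the plan: keep the last 20 step notes
const MAX_BUNDLE_CHARS = 32000; // assumption: ~4 chars per token approximates the 8K-token cap

function parseStepNotes(jsonlText) {
  const notes = [];
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    try {
      const ack = JSON.parse(line);
      if (ack && typeof ack.note === "string" && ack.note.length > 0) notes.push(ack.note);
    } catch {
      // Skip malformed lines rather than failing the whole bundle.
    }
  }
  return notes.slice(-MAX_STEP_NOTES);
}

function buildBundle({ kind, task, ackJsonl, output, context }) {
  const notes = parseStepNotes(ackJsonl).map((n) => `- ${n}`).join("\n");
  // The output file goes last because it is usually the largest part,
  // so truncation removes output text before it touches the task or notes.
  const text = [
    `EVENT: ${kind}`,
    `TASK:\n${task}`,
    `STEP NOTES:\n${notes}`,
    `CONTEXT SUMMARY:\n${context}`,
    `OUTPUT:\n${output}`,
  ].join("\n\n");
  return text.length > MAX_BUNDLE_CHARS ? text.slice(0, MAX_BUNDLE_CHARS) : text;
}
```

A DISCUSS_COMPLETE bundle could reuse the same function by passing the discussion topic as `task` and the concatenated turn files as `output`.
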
AI Extraction (Gemini Flash):
- Send bundle to Gemini Flash with extraction prompt
- Prompt asks for: project name, one-line outcome, key decisions made, blockers hit, files/features changed, what Molty should remember
- Parse structured response (JSON format)
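
Since extraction quality is listed as a risk, the model's reply should probably be validated before anything is stored. The sketch below shows one way to do that. The field names (`project`, `outcome`, `decisions`, `blockers`, `changed`, `remember`) are hypothetical, chosen to mirror the items the prompt asks for, and the function is not part of any existing module.

```javascript
// Sketch only. Field names are assumptions derived from the prompt requirements above.
const REQUIRED_STRINGS = ["project", "outcome", "remember"];
const REQUIRED_LISTS = ["decisions", "blockers", "changed"];

function parseExtraction(rawText) {
  // Models sometimes wrap JSON in extra prose or a code fence,
  // so take everything from the first "{" to the last "}".
  const start = rawText.indexOf("{");
  const end = rawText.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { ok: false, error: "no JSON object found" };
  }
  let data;
  try {
    data = JSON.parse(rawText.slice(start, end + 1));
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
  for (const key of REQUIRED_STRINGS) {
    if (typeof data[key] !== "string" || data[key].trim() === "") {
      return { ok: false, error: `missing or empty string field: ${key}` };
    }
  }
  for (const key of REQUIRED_LISTS) {
    if (!Array.isArray(data[key]) || !data[key].every((v) => typeof v === "string")) {
      return { ok: false, error: `field must be an array of strings: ${key}` };
    }
  }
  return { ok: true, value: data };
}
```

Returning a result object instead of throwing keeps the caller's error path simple: a failed extraction can be logged and skipped without a try/catch around every call.
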
Memory Storage:
- Append extracted event to `memory/events/YYYY-MM-DD.jsonl` (one event per line)
- Append human-readable entry to `memory/diary/YYYY-MM-DD.md` (daily diary, one section per event)
- Index into clawdbot semantic memory using `memory_search`-compatible format
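
A minimal sketch of the two file writes follows. The diary section layout, the event field names, and the use of the UTC date for the file name are assumptions. Indexing into clawdbot's semantic memory is left out because its interface is not described in this plan.

```javascript
// Sketch only. Event shape and diary layout are hypothetical.
const fs = require("node:fs");
const path = require("node:path");

// Assumption: files are keyed by UTC day. A local-time key may be preferable in practice.
function dayKey(date) {
  return date.toISOString().slice(0, 10);
}

function renderDiaryEntry(event) {
  const list = (items) =>
    items && items.length ? items.map((i) => `- ${i}`).join("\n") : "- none";
  return [
    `## ${event.timestamp} ${event.project} (${event.runId})`,
    `Outcome: ${event.outcome}`,
    `Decisions:\n${list(event.decisions)}`,
    `Blockers:\n${list(event.blockers)}`,
    "",
  ].join("\n\n");
}

function storeEvent(memoryDir, event) {
  const day = dayKey(new Date(event.timestamp));
  const eventsFile = path.join(memoryDir, "events", `${day}.jsonl`);
  const diaryFile = path.join(memoryDir, "diary", `${day}.md`);
  fs.mkdirSync(path.dirname(eventsFile), { recursive: true });
  fs.mkdirSync(path.dirname(diaryFile), { recursive: true });
  // One JSON object per line keeps the events file appendable and easy to scan.
  fs.appendFileSync(eventsFile, JSON.stringify(event) + "\n");
  // One markdown section per event, appended to the day's diary.
  fs.appendFileSync(diaryFile, renderDiaryEntry(event));
  return { eventsFile, diaryFile };
}
```

Synchronous appends are used here for brevity. Because extraction already runs off the ACK delivery path, a few small synchronous writes per completed run are unlikely to matter, but that is a judgment call rather than a measured result.
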
Checkin Command (reply_handler.js):
- `/checkin`: reads today's diary, summarizes via Gemini Flash, sends conversational response to thread
- `/checkin <project>`: filters events by project name, gives project-specific status
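
The command parsing and project filter might be sketched as below. The function names are hypothetical and reply_handler.js may already have its own command dispatch, so treat this as an illustration of the matching rules only. The optional `@botname` suffix is handled because Telegram appends it to commands sent in group chats. Case-insensitive project matching is an assumption.

```javascript
// Sketch only. Function names and the event shape are hypothetical.

// Returns { project } for a /checkin command, or null for any other text.
function parseCheckin(text) {
  const match = /^\/checkin(?:@\w+)?(?:\s+(.+))?$/.exec(text.trim());
  if (!match) return null;
  return { project: match[1] ? match[1].trim() : null };
}

// Parses one day's events file, skipping blank or malformed lines.
function loadEvents(jsonlText) {
  const events = [];
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    try {
      events.push(JSON.parse(line));
    } catch {
      // Ignore a corrupt line instead of failing the whole checkin.
    }
  }
  return events;
}

// With no project given, all events are returned unchanged.
function filterEvents(events, project) {
  if (!project) return events;
  const wanted = project.toLowerCase();
  return events.filter(
    (e) => typeof e.project === "string" && e.project.toLowerCase() === wanted
  );
}
```

For example, `parseCheckin("/checkin moltbot-web")` yields a project filter of `moltbot-web`, while `parseCheckin("/run echo hello")` yields `null` so the existing handlers can take over.
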
- Weekly digest aggregation (summarize the week every Sunday)
- Memory pruning — archive events older than 30 days to monthly summaries
- Dashboard widget showing recent memory entries
- Mid-run observation (watching RUNNING ACKs in real-time)
- Tmux scrollback capture and analysis
- Cross-project dependency tracking
- Proactive notifications ("you haven't touched project X in 5 days")
| Risk | Impact | Mitigation |
|---|---|---|
| Gemini Flash API failures block completion notifications | High — delays Telegram delivery | Fire-and-forget: memory extraction runs async, doesn't block ACK delivery |
| Extraction quality inconsistent | Med — garbage memories | Structured prompt with strict JSON schema; validate output before storing |
| Memory files grow unbounded | Low — disk space | Daily files stay small (5-15 events/day); weekly digest + monthly archival |
| Clawdbot semantic memory index gets stale | Med — Molty forgets | Re-index diary files on gateway restart; memory entries include timestamps |
| Bundle too large for Gemini Flash context | Low — extraction fails | Cap bundle at 8K tokens; truncate JSONL to last 20 step notes |
| simple_runner.js restart during extraction | Low — lost memory | Write raw bundle to pending/ dir first; process on next startup if missed |
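
Two of the mitigations above, fire-and-forget extraction and the `pending/` directory, can be combined in one small wrapper. The sketch below is an assumption about how that might look, not existing code: `extract` and `store` are hypothetical callbacks standing in for the Gemini Flash call and the memory writes, and the `runId`-based file name is illustrative.

```javascript
// Sketch only. `extract`, `store`, and the pending file naming are hypothetical.
const fs = require("node:fs");
const path = require("node:path");

function extractInBackground({ pendingDir, runId, bundle, extract, store, log = console.error }) {
  // Persist the raw bundle first, so a restart mid-extraction does not lose it.
  fs.mkdirSync(pendingDir, { recursive: true });
  const pendingFile = path.join(pendingDir, `${runId}.txt`);
  fs.writeFileSync(pendingFile, bundle);

  // The caller does not await this promise, so ACK delivery is never blocked.
  // Errors are logged here and never rethrown, so the promise cannot reject.
  return Promise.resolve()
    .then(() => extract(bundle))
    .then((event) => store(event))
    .then(() => fs.unlinkSync(pendingFile))
    .catch((err) => log(`memory extraction failed for ${runId}: ${err.message}`));
}

// On startup, any files still present were never processed and can be retried.
function listPending(pendingDir) {
  if (!fs.existsSync(pendingDir)) return [];
  return fs.readdirSync(pendingDir).map((name) => path.join(pendingDir, name));
}
```

In the completion handler, the call would sit after the ACK is sent and would not be awaited. A failed extraction leaves its bundle in `pending/`, which doubles as a retry queue for the next startup.
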
- RUN_COMPLETE triggers memory extraction with correct bundle
- DISCUSS_COMPLETE triggers memory extraction with correct bundle
- Bundle includes: task description, step notes, output file, context summary
- Extraction is fire-and-forget (does not block ACK delivery to Telegram)
- Gemini Flash call returns structured JSON with required fields
- Extraction prompt produces consistent, useful summaries
- Failed extractions are logged but don't crash simple_runner
- Bundle size is capped to fit within Gemini Flash context window
- Events appended to `memory/events/YYYY-MM-DD.jsonl`
- Human-readable diary entry written to `memory/diary/YYYY-MM-DD.md`
- Memory indexed into clawdbot semantic memory store
- Memory entries include: timestamp, project, runId, outcome, decisions, blockers
- `/checkin` returns today's activity summary in the requesting thread
- `/checkin <project>` filters to project-specific activity
- Response is conversational and under 3000 chars (Telegram-friendly)
- Empty days return "Nothing completed today" gracefully
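
The last two criteria, the 3000-character limit and the empty-day reply, can be enforced after the summary comes back rather than trusted to the model. A small sketch follows. The function name and the word-boundary heuristic are assumptions.

```javascript
// Sketch only. The function name and truncation heuristic are hypothetical.
const MAX_REPLY_CHARS = 3000; // from the acceptance criteria above

function finalizeCheckinReply(eventCount, summaryText) {
  // Skip the summarization result entirely when nothing happened.
  if (eventCount === 0) return "Nothing completed today.";
  const text = summaryText.trim();
  if (text.length <= MAX_REPLY_CHARS) return text;

  const ellipsis = "...";
  const cut = text.slice(0, MAX_REPLY_CHARS - ellipsis.length);
  const lastSpace = cut.lastIndexOf(" ");
  // Prefer a word boundary, but only if it does not discard more than ~20% of the budget.
  const body = lastSpace > MAX_REPLY_CHARS * 0.8 ? cut.slice(0, lastSpace) : cut;
  return body + ellipsis;
}
```

Checking `eventCount` before calling Gemini Flash at all would also avoid an API call on empty days.
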
- E2E-1: Run a `/run echo hello` task → verify memory event appears in today's JSONL + diary
- E2E-2: Complete a `/discuss` session → verify discuss memory event appears in today's JSONL + diary
- E2E-3: Run `/checkin` after a completed run → Molty responds with accurate summary of the run
- E2E-4: Run `/checkin moltbot-web` after a run on that project → response is filtered to that project only
- E2E-5: Run `/checkin` with no activity today → Molty responds gracefully ("nothing completed today")
- E2E-6: Trigger RUN_COMPLETE with Gemini Flash API down → ACK delivery still works, error logged
- E2E-7: Complete 3 runs in one day → diary has 3 sections, `/checkin` summarizes all 3
- E2E-8: Ask Molty in conversation "what did we do on project X recently?" → semantic memory search returns relevant events
- Send `/run echo test` and wait for completion: memory file created
- Check `memory/events/` has today's JSONL file with at least one entry
- Check `memory/diary/` has today's markdown file with readable content
- Send `/checkin` and get a response (not an error)
- RUN_COMPLETE memory entry has correct project name and outcome
- DISCUSS_COMPLETE memory entry captures discussion conclusions
- `/checkin` response mentions the correct tasks completed today
- `/checkin <project>` only mentions events for that project
- Memory entry includes decisions and blockers from the run
- simple_runner heartbeat still running after memory extraction
- ACK delivery timing unchanged (no noticeable delay from async extraction)
- reply_handler still handles /run and /discuss commands normally
- Gateway logs show no errors from memory indexing
- Existing Molty conversations work — no regression from new memory tools
Each step is a separate /run task:
- Add memory extraction to simple_runner.js — After RUN_COMPLETE/DISCUSS_COMPLETE, collect bundle, call Gemini Flash, write to events JSONL + diary markdown
- Add /checkin command to reply_handler.js — Parse command, read diary, call Gemini Flash for summary, send response to thread
- Index memories into clawdbot semantic store — After writing diary entry, also write to clawdbot-compatible memory location for semantic search
- Test with real runs — Execute test tasks, verify full pipeline end-to-end