Karpathy's LLM Wiki nailed something. LLMs are perfect for the bookkeeping that kills knowledge bases. Updating cross-references, flagging contradictions, keeping pages consistent - that's what they're good at. Humans curate, LLMs maintain. The wiki compounds because maintenance cost drops to nearly zero.
I've been running a version of this for months. One thing kept breaking.
LLM Wiki has three layers: raw sources (immutable), the wiki (LLM-maintained), and a schema (configuration). The schema tells the LLM how to structure the wiki.
But the schema doesn't know who you are. Everything gets the same treatment.
After a few months of use, I noticed things going wrong. Every ingested source created wiki pages whether I needed them or not. Important stuff - a decision I made, a pattern I spotted - sat next to routine summaries with equal weight. And I started being selective about what I ingested, which defeats the whole point.
The root issue: extraction quality on day 180 is the same as day 1. The system never learns what matters to you.
Add one layer between raw sources and the wiki. A prompt that tells the LLM who you are and what matters to you, so it can score information by relevance before creating wiki entries.
The filter is just a prompt. It says things like: "I'm a startup founder. Decisions about product and strategy are high priority. Market signals are moderate. Routine team updates are low." The LLM uses this to decide what's worth a wiki page and what isn't.
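Concretely, here's a minimal sketch of what that scoring step might look like in Python. The prompt wording, the `call_llm` stub, and the 0.6 cutoff are all my own illustrative choices, not part of LLM Wiki - swap in whichever LLM client you actually use.

```python
# Minimal sketch of the identity filter: score a source against the filter
# prompt before deciding whether it earns a wiki page.

FILTER_PROMPT = """\
I'm a startup founder. Decisions about product and strategy are high priority.
Market signals are moderate. Routine team updates are low priority.
Rate the following source from 0.0 (ignore) to 1.0 (definitely page-worthy).
Reply with a single number."""

PAGE_THRESHOLD = 0.6  # arbitrary cutoff; tune it against your own wiki


def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM client you use (hosted API or local model)."""
    raise NotImplementedError


def score_source(source_text: str) -> float:
    """Ask the LLM to score a raw source against the identity filter."""
    reply = call_llm(f"{FILTER_PROMPT}\n\n---\n{source_text}")
    try:
        return max(0.0, min(1.0, float(reply.strip())))
    except ValueError:
        return 0.0  # unparseable reply: treat as low salience, keep in raw log


def should_create_page(source_text: str) -> bool:
    return score_source(source_text) >= PAGE_THRESHOLD
```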
A founder's filter prompt emphasizes decisions and product insights. An investor's filter prompt emphasizes market signals and deal flow. Same meeting transcript goes in. Different wiki pages come out.
A founder hearing "Series A closed at $10M" barely registers it - a routine milestone. An investor flags it immediately - a high-salience market signal. Without the identity layer, both get the same summary page. With it, the wiki reflects what actually matters to each person.
This single question - "who is this wiki for?" - eliminates about 75% of the relevance uncertainty. Hardly any RAG system asks it; retrieval pipelines treat every user the same by default.
A static filter prompt is better than nothing, but it's still static. The other half of the idea: the filter should learn from what proved useful.
Periodically, run an introspection pass. Look at what the filter extracted over the past few weeks. Which wiki pages led to actual decisions? Which were noise you never looked at again? What did you miss that mattered? Rewrite the filter prompt based on what you observed. Apply the updated prompt to future ingestion only.
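A rough sketch of that pass, reusing the `call_llm` stub from the scoring sketch above. The page records and the "useful" flag are hypothetical - how you track usefulness (reopened pages, links from decision docs, a manual tag) is up to you.

```python
def introspect(filter_prompt: str, pages: list[dict]) -> str:
    """Draft a revised filter prompt from a few weeks of usage evidence.

    Each page dict is assumed to carry a 'title' and a 'useful' flag set
    however you like: cited in a decision, reopened later, manually tagged.
    """
    useful = [p["title"] for p in pages if p["useful"]]
    noise = [p["title"] for p in pages if not p["useful"]]
    prompt = (
        "Here is my current identity filter:\n"
        f"{filter_prompt}\n\n"
        f"Pages that led to real decisions: {useful}\n"
        f"Pages I never looked at again: {noise}\n\n"
        "Also note what kinds of things this filter probably missed.\n"
        "Rewrite the filter so future extraction favors the first kind "
        "and skips the second. Keep it under 150 words."
    )
    # call_llm is the same stub as in the scoring sketch.
    return call_llm(prompt)  # a draft only - a human approves it before adoption
```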
This is Bayesian inference, structurally. The identity is your prior. Raw inputs are observations. Signal utility is evidence. The evolved filter is your posterior. The human approval step on filter updates prevents the system from hallucinating itself into a corner.
A static filter accumulates regret linearly - O(T) - the noise never stops piling up. An evolving filter can achieve sublinear O(√T) regret: its average error per ingested source shrinks over time. The longer you use it, the sharper it gets.
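To pin down what those two framings mean, here's a rough formalization in my own notation. The regret bound is the standard online-learning result and only holds under assumptions the analogy glosses over (a sensible loss, an appropriately tuned update rule).

```latex
% Bayesian reading: each introspection pass updates a posterior over "what matters to me".
p(\text{filter} \mid \text{usefulness evidence})
  \;\propto\; p(\text{usefulness evidence} \mid \text{filter}) \cdot p(\text{filter})

% Regret reading: with loss \ell(f, x_t) for applying filter f to source x_t,
% cumulative regret against the best fixed filter f^* in hindsight is
R_T \;=\; \sum_{t=1}^{T} \Big[\, \ell(f_t, x_t) - \ell(f^{*}, x_t) \,\Big],
\qquad \text{static: } R_T = \Theta(T), \quad \text{adaptive: } R_T = O(\sqrt{T})
```

O(√T) doesn't mean the filter stops making mistakes; it means the average mistake rate R_T/T trends toward zero.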
LLM Wiki keeps everything. Every ingested source gets pages. Pages get cross-referenced. The wiki only grows.
But a good knowledge system also forgets. Your brain does this deliberately - sharp-wave ripples gate which experiences get consolidated into long-term memory. Most of what you experience in a day doesn't make the cut. That's not a bug. That's what makes recall work.
The filter is the forgetting mechanism. Low-scoring sources don't get wiki pages. They stay in raw input - nothing is deleted - but they don't clutter the wiki. The wiki becomes a curated view of what matters, not a complete record of everything you've ever read.
This separation matters. Raw sources are your complete event log - append-only, never deleted. The wiki is your working memory - intentionally filtered. You can always go back to raw sources and re-extract with a different filter. But your day-to-day retrieval surface stays clean.
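A sketch of that split: an append-only raw log next to a filtered wiki, with re-extraction as a deliberate replay of the log under a new filter. The paths are mine, `should_create_page` is the scoring sketch from earlier, and `summarize` stands in for whatever page-writing step you use.

```python
import json
from pathlib import Path

RAW_LOG = Path("raw/sources.jsonl")   # append-only event log, never deleted
WIKI_DIR = Path("wiki")               # curated view: only pages that cleared the filter


def ingest(source_id: str, text: str) -> None:
    """Every source lands in the raw log; only high-salience ones get a wiki page."""
    with RAW_LOG.open("a") as f:
        f.write(json.dumps({"id": source_id, "text": text}) + "\n")
    if should_create_page(text):                      # scoring sketch from earlier
        (WIKI_DIR / f"{source_id}.md").write_text(summarize(text))


def re_extract(new_should_create_page) -> None:
    """Deliberately replay the raw log under a different filter; the log itself is untouched."""
    for line in RAW_LOG.read_text().splitlines():
        rec = json.loads(line)
        if new_should_create_page(rec["text"]):
            (WIKI_DIR / f"{rec['id']}.md").write_text(summarize(rec["text"]))
```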
Counterintuitively, showing the LLM less produces better reasoning. Research shows 30%+ accuracy drops as context grows from 300 to 113K tokens. A filtered wiki that's 25x smaller than the raw corpus loses about 15% of information but gains about 43% in attention efficiency. Less is more.
If you're running Karpathy's LLM Wiki and want to try this:
- Before ingestion, write a filter prompt: who is this wiki for, and what types of information matter most to them?
- On each ingestion, have the LLM score the source against the filter prompt. Low scores don't get wiki pages. They stay in raw sources.
- Every few weeks, run an introspection pass. Review which pages proved useful. Rewrite the filter prompt.
- Don't let the LLM update the filter autonomously. Keep a human in the loop.
One invariant: filter changes apply only to future data. Never retroactively re-filter. Historical extractions stay valid for the period in which they were made.
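One way to encode that invariant is to version the filter prompt with an effective-from timestamp and stamp every ingestion with the version in force at that moment - a sketch, with names of my own choosing:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FilterVersion:
    version: int
    prompt: str
    effective_from: datetime  # governs ingestions after this moment only


FILTER_HISTORY: list[FilterVersion] = []  # append-only, like the raw log


def adopt_filter(prompt: str) -> FilterVersion:
    """Adopt a human-approved filter revision; it applies to future ingestion only."""
    fv = FilterVersion(
        version=len(FILTER_HISTORY) + 1,
        prompt=prompt,
        effective_from=datetime.now(timezone.utc),
    )
    FILTER_HISTORY.append(fv)
    return fv


def active_filter(at: datetime) -> FilterVersion:
    """The filter that was in force at a given ingestion time (assumes one exists)."""
    return max(
        (fv for fv in FILTER_HISTORY if fv.effective_from <= at),
        key=lambda fv: fv.effective_from,
    )
```

Re-extraction (sketched earlier) stays possible, but it's an explicit act, never a side effect of adopting a new filter.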
LLM Wiki solved maintenance. The missing pieces are identity and forgetting.
A knowledge base that doesn't know who you are treats everything as equally important. One that starts with "I'm a founder" and evolves from there converges on what actually matters. Same inputs, better extraction, and it gets sharper over time instead of noisier.