An extension of Andrej Karpathy’s LLM Wiki pattern that makes lint work at scale — and addresses a deeper problem the base pattern doesn’t fully solve.
In the base LLM Wiki pattern, sources are ingested one at a time. Each new source creates or updates wiki pages. This works well, but it introduces an ordering bias: early sources disproportionately shape the wiki.
The first source to introduce a concept creates its page — setting the terminology, the framing, the scope. Later sources get reconciled into that frame rather than challenging it. The model tends to update and extend existing pages rather than restructure them. Over time, the wiki’s ontology reflects the order things arrived, not the relative importance or representativeness of sources.
This is an epistemic problem, not a structural one. It doesn’t show up as a broken link or a missing page. It shows up as concept pages that have a particular slant, blind spots in how topics are connected, and a wiki that quietly overfits to whichever sources came first.
Karpathy’s pattern includes a lint operation — a periodic health check for orphaned pages, broken links, contradictions, and stale claims. Lint addresses the structural layer of the bias problem: it can surface contradictions between pages, flag missing concepts, and catch claims that newer sources have superseded.
What lint can’t do is detect framing bias directly. It won’t tell you that concepts/attention-mechanism.md is written from the perspective of source #3 rather than the field at large. That requires something different — periodic full reprocessing, or explicit prompts asking Claude to steelman alternative framings of key pages.
Randomising the lint order is a partial bridge between these two layers. By ensuring lint doesn’t always encounter sources in the same sequence, it’s more likely to surface contradictions between early and late sources that order-biased passes would miss. It won’t fix framing, but it makes the structural repair more honest.
Even setting aside bias, "lint the wiki" breaks down on a mature wiki for a second reason: context limits. On a wiki built from 50+ sources, a single lint pass can exhaust the context window mid-run, and the model stops being thorough without telling you what it missed.
Three additions to the base pattern:
1. Batched sessions. Lint runs 5 sources per session, protecting the context window and making each pass reliable and completable.
2. Randomised order. The lint order is shuffled once by the LLM — not alphabetically, not by ingestion order — and stored in lint-status.md. The order is fixed across sessions so progress is resumable and the randomisation is stable. This counteracts topic clustering and makes cross-cutting connections more likely to surface.
3. Persistent scratchpad. Findings accumulate in lint-scratch.md across sessions. Issues are annotated FIXED rather than deleted, creating a changelog. The model reads the scratchpad at the start of each session, so it’s aware of patterns found in previous batches — including issues that recur across multiple sources.
The interface stays simple: "lint the wiki" or "continue lint". All complexity lives in CLAUDE.md and the two meta files.
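The batching and fixed random order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in the workflow above the LLM does the shuffling itself, whereas this sketch swaps in a seeded `random.Random` shuffle to make the "shuffle once, then keep the order fixed" idea concrete. The one-line-per-source checklist format and the function names `make_lint_order`, `render_status`, and `next_batch` are hypothetical, not part of the base pattern.

```python
import random

BATCH_SIZE = 5  # sources per lint session, as described above


def make_lint_order(sources, seed=42):
    """Return a shuffled copy of sources; the same seed always gives the same order."""
    order = sorted(sources)  # normalise first so ingestion order cannot leak in
    random.Random(seed).shuffle(order)
    return order


def render_status(order, done=()):
    """Render one checklist line per source (assumed format), [x] for finished ones."""
    done = set(done)
    return "\n".join(f"[{'x' if s in done else ' '}] {s}" for s in order)


def next_batch(status_text, batch_size=BATCH_SIZE):
    """Pick the first unchecked sources, in stored order, from the status text."""
    unchecked = [
        line[4:] for line in status_text.splitlines() if line.startswith("[ ] ")
    ]
    return unchecked[:batch_size]
```

Because the order is written out once and only the markers change afterwards, a "continue lint" session can resume by reading the file and taking the next five unchecked entries.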
Lint — even well-designed lint — cannot fully repair framing bias baked in at ingest time. If your wiki was built from 100+ sources without linting, the structural layer can be repaired with this approach. The conceptual framing layer requires additional work:
- Periodic full reprocessing: delete `wiki/` and re-ingest all of `raw/` from scratch, ideally in a different order
- Explicit challenge prompts: ask Claude to argue against the current framing of high-traffic concept pages
- Targeted rewrite sessions: for pages you suspect are over-indexed on early sources, ask Claude to rewrite them using only late-ingested sources, then reconcile
wiki/
lint-status.md ← randomised source order + per-source status
lint-scratch.md ← running findings log across all batches
CLAUDE.md ← add the lint workflow block from Bias-aware-stateful-lint-workflow.md
1. Add the lint workflow block from `CLAUDE-lint-section.md` to your existing `CLAUDE.md`.
2. Ask Claude to create `wiki/lint-status.md`:
   > “Create lint-status.md. List all sources from the index in a randomised order (not alphabetical, not ingestion order). Label this seed 42. Mark all sources `[ ]`.”
3. Create an empty `wiki/lint-scratch.md` using the template.
4. Say `"lint the wiki"` to begin.
| Marker | Meaning |
|---|---|
| `[ ]` | Unchecked |
| `[~]` | Partial: hub page with many inbound links, needs revisiting once all contributing sources are linted |
| `[x]` | Done |
Hub pages are marked [~] when first touched and only promoted to [x] once all sources that link to them have been linted. This prevents falsely marking a heavily-linked concept page as clean after a single pass.
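The promotion rule for hub pages can be written as a small function. This is a minimal sketch, assuming you can list which sources contribute to a page and which sources have already been linted; the name `hub_marker` and both arguments are hypothetical.

```python
def hub_marker(contributing_sources, linted_sources):
    """Return the status marker for a hub page.

    contributing_sources: sources that link to or feed the page (assumed known).
    linted_sources: sources already linted in this pass.
    """
    contributing = set(contributing_sources)
    linted = contributing & set(linted_sources)
    if not linted:
        return "[ ]"  # no contributing source has been linted yet
    if linted == contributing:
        return "[x]"  # every contributing source has been linted
    return "[~]"  # touched, but some contributing sources remain
```

For a page fed by three sources, the marker stays at `[~]` after the first or second of them is linted and only becomes `[x]` after the third.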
The scratchpad gives you a natural indicator of wiki health. As the wiki matures, the rate of new findings per batch should drop. When a full pass surfaces only format issues and no contradictions or missing pages, the structural layer has stabilised. Framing bias won’t show up in this signal — that requires the separate interventions above.
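One way to read this signal is to count structural findings per batch. The sketch below assumes each scratchpad finding can be reduced to a `(batch_number, category)` pair and that categories use labels such as `"contradiction"`, `"missing-page"`, and `"format"`; those labels and both function names are hypothetical, since the scratchpad template is not specified here.

```python
STRUCTURAL = {"contradiction", "missing-page"}  # assumed category labels


def structural_findings_per_batch(findings):
    """Count structural findings per batch from (batch_number, category) pairs."""
    counts = {}
    for batch, category in findings:
        counts.setdefault(batch, 0)  # keep batches that only had format issues
        if category in STRUCTURAL:
            counts[batch] += 1
    return [counts[b] for b in sorted(counts)]


def has_stabilised(findings):
    """True when a pass has at least one batch and none had structural findings."""
    per_batch = structural_findings_per_batch(findings)
    return bool(per_batch) and all(n == 0 for n in per_batch)
```

Passing in the findings from one full pass gives a rough yes/no on whether the structural layer has settled. It says nothing about framing bias.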
Built on Andrej Karpathy’s LLM Wiki gist. This is an extension, not a replacement — the base pattern, folder structure, and ingest/query workflows are his.