Paul Smith paulsmith

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copied and pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

Terminals should generate the 256-color palette from the user's base16 theme.

If you've spent much time in the terminal, you've probably set a custom base16 theme. They work well. You define a handful of colors in one place and all your programs use them.

The drawback is that 16 colors is limiting. Complex and color-heavy programs struggle with such a small palette.
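As a rough illustration of what "generating the 256-color palette" could mean, here is a minimal sketch in C. It is one possible scheme, not necessarily the one the author has in mind. It assumes the standard xterm layout (indices 16 to 231 form a 6x6x6 color cube, 232 to 255 a grayscale ramp), treats the theme's eight base colors as the corners of that cube, and blends between them with plain linear interpolation in RGB. The names `Rgb`, `mix`, and `generate_palette` are hypothetical, and a perceptual color space might well give nicer results.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;

/* Linear interpolation between two channel values, t in [0, 1]. */
static double lerp_channel(double a, double b, double t) {
    return a + (b - a) * t;
}

/* Clamp to [0, 255] and round to the nearest integer. */
static uint8_t to_byte(double v) {
    if (v < 0.0) v = 0.0;
    if (v > 255.0) v = 255.0;
    return (uint8_t)(v + 0.5);
}

/* Blend two colors channel by channel in plain RGB (an assumption). */
static Rgb mix(Rgb a, Rgb b, double t) {
    Rgb out = {
        to_byte(lerp_channel(a.r, b.r, t)),
        to_byte(lerp_channel(a.g, b.g, t)),
        to_byte(lerp_channel(a.b, b.b, t)),
    };
    return out;
}

/*
 * base[0..15] is the user's 16-color theme in ANSI order:
 * 0 black, 1 red, 2 green, 3 yellow, 4 blue, 5 magenta, 6 cyan, 7 white.
 * out[0..255] receives the generated palette.
 */
void generate_palette(const Rgb base[16], Rgb out[256]) {
    assert(base != 0 && out != 0);

    /* The first 16 entries are the theme itself. */
    for (int i = 0; i < 16; i++)
        out[i] = base[i];

    /* 6x6x6 cube: trilinear blend between the eight theme corners. */
    for (int r = 0; r < 6; r++) {
        for (int g = 0; g < 6; g++) {
            for (int b = 0; b < 6; b++) {
                double tr = r / 5.0, tg = g / 5.0, tb = b / 5.0;
                Rgb c00 = mix(base[0], base[1], tr); /* black -> red      */
                Rgb c10 = mix(base[2], base[3], tr); /* green -> yellow   */
                Rgb c01 = mix(base[4], base[5], tr); /* blue  -> magenta  */
                Rgb c11 = mix(base[6], base[7], tr); /* cyan  -> white    */
                Rgb c0 = mix(c00, c10, tg);          /* blue axis at 0    */
                Rgb c1 = mix(c01, c11, tg);          /* blue axis at 1    */
                out[16 + 36 * r + 6 * g + b] = mix(c0, c1, tb);
            }
        }
    }

    /* 24-step gray ramp between the theme's black and white,
       excluding both endpoints as the stock xterm ramp does. */
    for (int i = 0; i < 24; i++)
        out[232 + i] = mix(base[0], base[7], (i + 1) / 25.0);
}
```

With this scheme, index 196 (pure red in the stock palette) comes out as the theme's own red, and index 231 as the theme's white, so programs that use the extended palette stay in step with the 16 colors the user picked.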

This document explains how I would implement an SSA-based compiler if I were writing one today.

This document is intentionally opinionated. It just tells you how I would do it. This document is intended for anyone who has read about SSA and understands the concept, but is confused about how exactly to put it into practice. If you're that person, then I'm here to show you a way to do it that works well for me. If you're looking for a review of other ways to do it, I recommend this post.

My approach works well when implementing the compiler in any language that easily permits cyclic mutable data structures. I know from experience that it'll work great in C++, C#, or Java. The memory management of this approach is simple (and I'll explain it), so you won't have to stress about use-after-free bugs.

I like my approach because it leads to an ergonomic API by minimizing the number of special cases you have to worry about. Most of the compiler is analyses and transformations ov

A traditional table-based DFA implementation looks like this:

uint8_t table[NUM_STATES][256];

uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}
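To make the table-driven loop concrete, here is a small self-contained sketch that fills in a table and runs it. The automaton itself is an assumption chosen for illustration: a two-state DFA that tracks whether the input contains an even or odd number of `'a'` bytes. The names `EVEN`, `ODD`, `build_table`, and `run_str` are hypothetical; `table` and `run` follow the snippet above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative automaton: parity of the number of 'a' bytes seen. */
enum { EVEN = 0, ODD = 1, NUM_STATES = 2 };

static uint8_t table[NUM_STATES][256];

/* Every byte keeps the current state, except 'a', which flips it. */
static void build_table(void) {
    for (int state = 0; state < NUM_STATES; state++) {
        for (int byte = 0; byte < 256; byte++)
            table[state][byte] = (uint8_t)state;
    }
    table[EVEN]['a'] = ODD;
    table[ODD]['a'] = EVEN;
}

/* One table lookup per input byte, exactly as in the loop above. */
static uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    assert(state < NUM_STATES);
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}

/* Convenience wrapper: run over a NUL-terminated string from EVEN. */
static uint8_t run_str(const char *text) {
    const uint8_t *start = (const uint8_t *)text;
    return run(start, start + strlen(text), EVEN);
}
```

After calling `build_table()`, `run_str("banana")` ends in `ODD` (three `'a'` bytes) and `run_str("abba")` ends in `EVEN`. Because the final state is passed in and returned, a long input can also be processed in chunks by feeding each chunk's result into the next call to `run`.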

Intel 4004, first microprocessor: http://www.computerhistory.org/collections/catalog/102658187
Intel 8008: http://www.computerhistory.org/collections/catalog/102657982
Intel 8080: http://www.computerhistory.org/collections/catalog/102658123
Z80: http://www.computerhistory.org/collections/catalog/102658073
Federico Faggin, SGT inventor, chip designer for 4004, Z80: http://www.computerhistory.org/collections/catalog/102658025
Bill Mensch, chip designer on 6800/6502/65C02/65816: http://www.computerhistory.org/collections/catalog/102739969
Motorola 68000: http://www.computerhistory.org/collections/catalog/102658109
3dfx, Voodoo, the seminal GPU: http://www.computerhistory.org/collections/catalog/102746834
LSI Logic, EDA/fabless innovator: http://www.computerhistory.org/collections/catalog/102746194
VLSI Technology, EDA/fabless innovator: http://www.computerhistory.org/collections/catalog/102746456

	import argparse
	import random
	import sys

	from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache
	import torch

	parser = argparse.ArgumentParser()
	parser.add_argument("question", type=str)
	parser.add_argument(

	# Clone llama.cpp
	git clone https://github.com/ggerganov/llama.cpp.git
	cd llama.cpp

	# Build it
	make clean
	LLAMA_METAL=1 make

	# Download model
	export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin

	#include <stdio.h>
	#include <stdlib.h>
	#include <stdint.h>
	#ifdef _MSC_VER
	#include <intrin.h> /* for rdtscp and clflush */
	#pragma optimize("gt",on)
	#else
	#include <x86intrin.h> /* for rdtscp and clflush */
	#endif

	# many settings from https://raw.githubusercontent.com/mathiasbynens/dotfiles/master/.macos
	# many settings from https://raw.githubusercontent.com/thoughtbot/laptop/master/mac
	# my previous install notes at https://gist.github.com/llimllib/ee591266e05bd880629a4e7511a61bb3

	fancy_echo() {
		local fmt="$1"; shift

		# shellcheck disable=SC2059
		printf "\n$fmt\n" "$@"
	}

	#include <stdlib.h>
	#include <string.h>
	#include <stdarg.h>
	#include <stdio.h>
	#include <stdint.h>

	#define WIN32_LEAN_AND_MEAN
	#include <windows.h>

	void error(const char *str) {