LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

@HtutLynn
HtutLynn / add_to_zshrc.sh
Last active May 22, 2026 04:44 — forked from karpathy/add_to_zshrc.sh
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the current staged changed diff
# 2) sends them to an LLM to write the git commit message
# 3) allows you to easily accept, edit, regenerate, cancel
# But - just read and edit the code however you like
# the `llm` CLI util is awesome, can get it here: https://llm.datasette.io/en/stable/
# Unalias gcm if it exists (to prevent conflicts)
@karpathy
karpathy / add_to_zshrc.sh
Created August 25, 2024 20:43
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the current staged changed diff
# 2) sends them to an LLM to write the git commit message
# 3) allows you to easily accept, edit, regenerate, cancel
# But - just read and edit the code however you like
# the `llm` CLI util is awesome, can get it here: https://llm.datasette.io/en/stable/
gcm() {
@devinschumacher
devinschumacher / how-to-make-invisible-characters.md
Last active April 28, 2026 07:55
The "invisible" / "blank" / "empty space" characters
tags
zero width space
zero width non-joiner
zero width joiner
soft hyphen

Invisible "blank" Characters: How to write invisible characters (zero width space, zero width non-joiner, zero width joiner & soft hyphen)

I think only 2 of these are "copy-able":

  1. Zero-width space:
  2. Zero-width non-joiner: ‌‌‌
@Donavan
Donavan / segmentation-101_part-1.md
Last active October 6, 2025 08:16
Segmentation 101, Part 1: Why your strategy matters

Segmentation 101, part 1: Why your strategy matters

I recently did some more exploring with a local LLM tool that imports your documents into a vector store. Given the promising initial results with a handful of docs, I wanted to see how it handled more and different data. I decided to copy over the text files containing Expanse trivia and answers that I use as a regression suite to test my own "Q&A over documents" process. I wanted to see what types of questions it could answer from that content...

The Problem With Generic Segmentation

The strategy employed by this tool uses double newlines as its segmentation boundary. That works well for many types of content, but for this content it was a terrible choice, because the text in the files is formatted as numbered questions followed by their answers, like this:

1. Long winded question with establishing context
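
To make the failure mode concrete, here is a minimal sketch comparing the two strategies. The sample text is hypothetical and assumes each answer sits one blank line below its numbered question; the function names are also illustrative and are not taken from the tool discussed here.

```python
import re

# Hypothetical sample in the format described above: numbered questions,
# each followed by a blank line and then its answer.
text = (
    "1. Question one with establishing context?\n"
    "\n"
    "Answer one.\n"
    "\n"
    "2. Question two?\n"
    "\n"
    "Answer two.\n"
)


def split_on_blank_lines(text):
    """Generic strategy: every double newline is a segment boundary."""
    return [seg.strip() for seg in text.split("\n\n") if seg.strip()]


def split_on_numbered_questions(text):
    """Format-aware strategy: start a new segment at each line beginning 'N. '."""
    parts = re.split(r"(?m)^(?=\d+\.\s)", text)
    return [part.strip() for part in parts if part.strip()]


generic = split_on_blank_lines(text)
aware = split_on_numbered_questions(text)

for label, segments in (("blank lines", generic), ("numbered", aware)):
    print(label, len(segments))
    for seg in segments:
        print("   ", repr(seg))
```

With the generic split, each question and its answer land in separate segments, so a segment retrieved for a question may not contain the answer. Splitting at the numbered lines keeps each pair together.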

 

@taruma
taruma / feidlambda_v0_4.scala
Last active March 25, 2026 12:09
Official GIST feidlambda (feid) v0.4
/*
feidlambda v0.4.0 - LOGIC / UTILITIES FUNCTIONS BY FIAKO ENGINEERING
OFFICIAL GIST (feidlambda v0.4.x):
https://gist.github.com/taruma/b4df638ecb7af48ab63691951481d6b2
REPOSITORY:
https://github.com/fiakoenjiniring/feidlambda
CONTRIBUTOR: @taruma, @iingLK
TESTED: Microsoft Excel 365 v2304
*/
@taybenlor
taybenlor / write.rs
Created September 12, 2022 10:30
This program writes whatever you type into STDIN to the file specified in arg1.
use std::io::Read;
use std::{env, fs, io};
fn main() {
let filename = env::args().nth(1).expect("no filename provided");
let stdin = io::stdin();
let mut buf = String::new();
stdin
.lock()

Datasette tutorial written by GPT3

Prompt: a step by step tutorial for getting started with Datasette

This is a guide for getting started with Datasette. Datasette is a tool for creating and publishing data-driven websites. It is designed to make it easy to publish structured data, such as the results of a database query, in a way that is highly visible and interactive.

Datasette can be used to create websites that allow users to explore and visualize data, or to build applications that expose data via APIs. It can also be used as a static site generator, creating a completely static HTML website that can be deployed anywhere.

This guide will cover the basics of how to install and use Datasette. It will also show you how to create a simple data-driven website using Datasette.
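
As a hedged starting point for the steps this guide promises: Datasette serves SQLite database files, so a small database is needed first. The sketch below builds one with Python's standard `sqlite3` module. The table name `notes`, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def build_database(path):
    """Create a small SQLite database and return how many rows it holds."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes ("
            "id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
        )
        conn.executemany(
            "INSERT INTO notes (title, body) VALUES (?, ?)",
            [
                ("First note", "Hello from SQLite."),
                ("Second note", "Datasette can browse this table."),
            ],
        )
    count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    conn.close()
    return count


# Example: build_database("notes.db") writes a file in the current directory.
```

After installing Datasette with `pip install datasette`, running `datasette notes.db` should start a local web server for browsing that file.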

@OlivierLDff
OlivierLDff / Readme.md
Last active October 17, 2025 16:43
🚀 Git Bash Emojis (Windows)

Open Git Bash with administrator privileges.

cd "C:/Program Files/Git/usr/share/mintty"
mkdir -p emojis
cd emojis
curl https://raw.githubusercontent.com/wiki/mintty/mintty/getemojis > getemojis
./getemojis -d