Skip to content

Instantly share code, notes, and snippets.

View vanto's full-sized avatar

Tammo van Lessen vanto

View GitHub Profile

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, it is designed to be copy pasted to your own LLM Agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, or etc.). Its goal is to communicate the high level idea, but your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active February 25, 2026 11:22
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@veekaybee
veekaybee / normcore-llm.md
Last active June 2, 2026 14:12
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models

@Intyre
Intyre / Wahoo_Elemnt.md
Last active May 29, 2026 09:08
Wahoo Elemnt - Tips, tricks and custom images
@nknapp
nknapp / Dockerfile
Created October 30, 2016 20:15
Traefik setup as reverse-proxy with docker and letsencrypt
FROM traefik:camembert
ADD traefik.toml .
EXPOSE 80
EXPOSE 8080
EXPOSE 443
@panicsteve
panicsteve / gist:1641705
Created January 19, 2012 18:26
Form letter template for acquired startups
Dear soon-to-be-former user,
We've got some fantastic news! Well, it's great news for us anyway. You, on
the other hand, are fucked.
We've just been acquired by:
[ ] Facebook
[ ] Google
[ ] Twitter