Jian jlia0

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copied and pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and it generates an answer from them. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
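
To make the "no accumulation" point concrete, here is a minimal sketch of the query-time retrieval loop described above. It is an illustration under stated assumptions, not how any particular product works: the function names (`tokenize`, `retrieve`, `answer`) are hypothetical, word overlap stands in for embedding search, and the LLM call is left as a comment.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query.

    Assumption: word overlap is a toy stand-in for embedding similarity.
    """
    q = Counter(tokenize(query))
    scored = [(sum((q & Counter(tokenize(c))).values()), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def answer(query: str, chunks: list[str]) -> dict:
    """Handle one question. Nothing is saved between calls."""
    context = retrieve(query, chunks)
    # A real system would send `context` and `query` to an LLM here.
    return {"query": query, "context": context}


chunks = [
    "The wiki stores summaries of each source.",
    "RAG retrieves chunks at query time.",
    "Bananas are yellow.",
]
first = answer("What does RAG do at query time?", chunks)
second = answer("What does RAG do at query time?", chunks)
```

Both calls repeat the same retrieval work and return the same context. Whatever synthesis the LLM did for the first answer is gone by the second, which is the gap this pattern sets out to close.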
