Kazuki Inamura kzinmr

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

name	pdf-ocr-feedback
description	High-accuracy OCR pipeline using Maj@K consensus voting, structured self-evaluation, and adaptive compute budgets to achieve ≥95% transcription accuracy.
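
As a rough illustration of the consensus-voting idea named in the description, here is a minimal sketch of majority voting over K candidate transcriptions. It is not the pipeline's actual implementation: the function name `majority_vote`, the whitespace normalization, and the sample strings are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(candidates: list[str]) -> tuple[str, float]:
    """Return the most common candidate and the fraction of votes it received.

    Hypothetical helper: assumes each candidate is one full transcription
    of the same page produced by an independent model call.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # Collapse runs of whitespace so trivially different outputs share a vote.
    normalized = [" ".join(c.split()) for c in candidates]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


# Four made-up samples of the same equation; one drops the caret.
samples = ["E = mc^2", "E = mc^2", "E = mc2", "E =  mc^2"]
text, agreement = majority_vote(samples)
```

One possible use of the agreement ratio, again as an assumption rather than something the description specifies, is to request more samples only when agreement falls below a chosen threshold.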

When to Use

Use when transcribing PDF pages via a vision model and you need high accuracy, especially for:

Equations or mathematical notation
Tables with complex structure (3+ columns, merged cells)

This is an OPML version of the HN Popularity Contest results for 2025, for importing into RSS feed readers.

Plug: if you want to find content related to your interests from thousands of obscure blogs and noisy sources like HN Newest, check out Scour. It's a free, personalized content feed I work on where you define your interests in your own words and it ranks content based on how closely related it is to those topics.

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case of RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

[macOS] Install CMake

Instructions on how to install the CMake tool on macOS.

Uninstall

The first step should be to uninstall any previous CMake installation. This step can be skipped if no CMake version was previously installed.

To uninstall any previous CMake installations use the following commands:

Huggingface

Provides a variety of NLP packages; three of them are especially useful for training language models.

package	note
transformers	Provides Transformer-based (masked) language model algorithms and pretrained models
tokenizers	Provides functionality for training and using tokenizers that work with transformers. Distributed as a package separate from transformers
nlp	Provides datasets and evaluation metrics

	# Fetch some text content in two different categories
	from wikipediaapi import Wikipedia
	wiki = Wikipedia('RAGBot/0.0', 'en')
	docs = [{"text": x,
	         "category": "person"}
	        for x in wiki.page('Hayao_Miyazaki').text.split('\n\n')]
	docs += [{"text": x,
	          "category": "film"}
	         for x in wiki.page('Spirited_Away').text.split('\n\n')]

	from fastapi import Request, HTTPException
	from pydantic import BaseModel, HttpUrl
	from modal import Secret, App, web_endpoint, Image
	from typing import Optional, List
	from example import proposal
	import os

	app = App(name="circleback", image=Image.debian_slim().pip_install("openai", "pydantic", "fastapi"))

	class Attendee(BaseModel):

	import base64
	import tempfile
	from typing import Optional

	from pydantic import BaseModel

	from modal import Image, Secret, Stub, build, enter, gpu, web_endpoint

	whisper_image = (
	Image.micromamba()