LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copied and pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
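To make the "rediscovering from scratch" point concrete, here is a minimal sketch of query-time retrieval. It assumes a toy word-overlap scorer in place of a real embedding search; the `retrieve` function and the sample chunks are hypothetical and do not come from any of the systems named above.

```python
def retrieve(chunks, query, k=2):
    """Return the k chunks that share the most words with the query.

    Hypothetical toy scorer: real RAG systems typically rank by
    embedding similarity rather than raw word overlap.
    """
    query_words = set(query.lower().split())

    def overlap(chunk):
        return len(query_words & set(chunk.lower().split()))

    # Nothing is cached or stored: every call re-scores every chunk.
    return sorted(chunks, key=overlap, reverse=True)[:k]


chunks = [
    "Cats sleep most of the day",
    "Dogs need daily walks",
    "Cats groom themselves often",
]
top = retrieve(chunks, "how do cats sleep", k=1)
print(top)
```

Each call starts from the raw chunks again, so any synthesis the model produces for one question is gone by the next one.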

@OmerFarukOruc
OmerFarukOruc / claude.md
Last active April 28, 2026 16:51
AI Agent Workflow Orchestration Guidelines

AI Coding Agent Guidelines (claude.md)

These rules define how an AI coding agent should plan, execute, verify, communicate, and recover when working in a real codebase. Optimize for correctness, minimalism, and developer experience.


Operating Principles (Non-Negotiable)

  • Correctness over cleverness: Prefer boring, readable solutions that are easy to maintain.
  • Smallest change that works: Minimize blast radius; don't refactor adjacent code unless it meaningfully reduces risk or complexity.
@nemorize
nemorize / korean-subway-station-list.json5
Last active August 12, 2025 07:23
List of subway stations in South Korea (station name, district, line, latitude/longitude)
[
// =======================================================
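// This data is released under the CC0-1.0 license.
// https://creativecommons.org/publicdomain/zero/1.0/deed
// You may freely copy, modify, distribute, and perform it for any purpose,
// including commercial purposes, without asking the copyright holder's permission.
// =======================================================
// All information was collected and organized by hand.
// Some entries may contain typos, and latitude/longitude coordinates may be inaccurate.
// Please report errors via Gist comments.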
@shreyasms17
shreyasms17 / execute_autoflatten_complex.py
Last active September 2, 2022 10:22
AutoFlatten Complex JSON
from pyspark.sql.functions import col, explode_outer
from pyspark.sql.types import *
from copy import deepcopy
from autoflatten import AutoFlatten
from collections import Counter
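# Read the ORC files; each row is expected to carry a JSON string in its `json` column
s3_path = 's3://mybucket/orders/'
df = spark.read.orc(s3_path)
# Parse the JSON strings so Spark can infer the nested schema
json_df = spark.read.json(df.rdd.map(lambda row: row.json))
json_schema = json_df.schema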
@shreyasms17
shreyasms17 / autoflatten.py
Last active December 12, 2022 20:40
AutoFlatten
import json
class AutoFlatten:
    def __init__(self, json_schema):
        self.fields_in_json = self.get_fields_in_json(json_schema)
        self.all_fields = {}
        self.cols_to_explode = set()
        self.structure = {}
        self.order = []
        self.bottom_to_top = {}
@mrchristine
mrchristine / spark_schema_save_n_load.py
Created May 28, 2019 21:12
Read / Write Spark Schema to JSON
##### READ SPARK DATAFRAME
df = spark.read.option("header", "true").option("inferSchema", "true").csv(fname)
# store the schema from the CSV w/ the header in the first file, and infer the types for the columns
df_schema = df.schema
##### SAVE JSON SCHEMA INTO S3 / BLOB STORAGE
# save the schema so that the streaming job in the next step can load it
dbutils.fs.rm("/home/mwc/airline_schema.json", True)
with open("/dbfs/home/mwc/airline_schema.json", "w") as f:
@kevin-smets
kevin-smets / 1_kubernetes_on_macOS.md
Last active June 23, 2025 01:06
Local Kubernetes setup on macOS with minikube on VirtualBox and local Docker registry

Requirements

Minikube requires that VT-x/AMD-v virtualization is enabled in BIOS. To check that this is enabled on OSX / macOS run:

sysctl -a | grep machdep.cpu.features | grep VMX

If there's output, you're good!

Prerequisites

@Pusnow
Pusnow / 한글과유니코드.md
Last active November 10, 2025 01:39
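Hangul and Unicode

Hangul and Unicode

This document summarizes how Unicode handles Hangul.

Unicode

  • Unicode is an industry standard designed so that all of the world's characters can be represented and handled consistently on computers (Wikipedia).
  • It simply assigns a number to each character.
  • It is updated continually; at the time of writing, Unicode Version 9.0.0 is the latest.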

UTF

  • UTF standardizes how Unicode is actually written to files and other storage.
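The difference between a character's number and its stored bytes can be seen with a short sketch. It assumes the syllable 한 as the sample character; the helper name `encode_forms` is hypothetical and is not part of the original notes.

```python
def encode_forms(ch):
    """Show one character's code point and its bytes under several UTFs."""
    return {
        "code_point": f"U+{ord(ch):04X}",            # the number Unicode assigns
        "utf-8": ch.encode("utf-8").hex(),           # variable-width, 1 to 4 bytes
        "utf-16-be": ch.encode("utf-16-be").hex(),   # 2 or 4 bytes, big-endian
        "utf-32-be": ch.encode("utf-32-be").hex(),   # always 4 bytes, big-endian
    }


forms = encode_forms("한")
for name, value in forms.items():
    print(f"{name}: {value}")
```

The code point is the same in every case; only the byte sequence written to disk changes with the chosen UTF.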
@qgp9
qgp9 / tmon.sh
Last active August 8, 2016 15:13
monitoring with tmux
#!/bin/bash
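# Command to run in the monitor window (defaults to htop)
com=${1:-"htop"}
tname=mon
twin=mon
tnw=$tname:$twin
# Create the session only if it does not already exist
tmux has-session -t "$tname"
if [ $? != 0 ]
then
    tmux new-session -s "$tname" -n "$twin" -d ssh -t a1 "$com"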