Goals: add links that are reasonable, clear explanations of how this stuff works. No hype and, where possible, no vendor content. Practical first-hand accounts of running models in production are eagerly sought.
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Transformers as Support Vector Machines
- Survey of LLMs
- Deep Learning Systems
- Fundamental ML Reading List
- What are embeddings
- Concepts from Operating Systems that Found Their Way into LLMs
- Talking about Large Language Models
- Language Modeling is Compression
- Vector Search - Long-Term Memory in AI
- Eight things to know about large language models
- The Bitter Lesson
- The Hardware Lottery
- The Scaling Hypothesis
- Tokenization
- LLM Course
- Seq2Seq
- Attention Is All You Need
- BERT
- GPT-1
- Scaling Laws for Neural Language Models
- T5
- GPT-2: Language Models are Unsupervised Multitask Learners
- InstructGPT: Training Language Models to Follow Instructions
- GPT-3: Language Models are Few-Shot Learners
- Transformers from Scratch
- Transformer Math
- Five Years of GPT Progress
- Lost in the Middle: How Language Models Use Long Contexts
- Self-attention and transformer networks
- Attention
- Understanding and Coding the Attention Mechanism (a minimal NumPy attention sketch appears after this list)
- Attention Mechanisms
- Keys, Queries, and Values
- What is ChatGPT doing and why does it work
- My own notes from a few months back.
- Karpathy's The State of GPT (YouTube)
- OpenAI Cookbook
- Catching up on the weird world of LLMs
- How open are open architectures?
- Building an LLM from Scratch
- Large Language Models in 2023 (talk and slides)
- Timeline of Transformer Models
- Large Language Model Evolutionary Tree
- Why host your own LLM?
- How to train your own LLMs
- Hugging Face Resources on Training Your Own
- Training Compute-Optimal Large Language Models
- OPT-175B Logbook
- RLHF
- Instruction-tuning for LLMs: Survey
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- RLHF and DPO Compared
- The Complete Guide to LLM Fine-tuning
- LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language - Really great overview of SOTA fine-tuning techniques
- On the Structural Pruning of Large Language Models
- Quantization
- PEFT
- How is llama.cpp Possible?
- How to Beat GPT-4 with a 13B Model
- Efficient LLM Inference on CPUs
- Tiny Language Models Come of Age
- Efficiency LLM Spectrum
- TinyML at MIT
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody Talks About when Building Products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know
- Against LLM Maximalism
- A Guide to Inference and Performance
- (InThe)WildChat: 570K ChatGPT Interaction Logs In The Wild
- The State of Production LLMs in 2023
- Machine Learning Engineering for successful training of large language models and multi-modal models
- Fine-tuning RedPajama on Slack Data
- LLM Inference Performance Engineering: Best Practices
- How to Make LLMs go Fast
- Transformer Inference Arithmetic (a back-of-the-envelope KV-cache calculation appears after this list)
- Which serving technology to use for LLMs?
- Speeding up the K-V cache
- Large Transformer Model Inference Optimization
- On Prompt Engineering
- Prompt Engineering Versus Blind Prompting
- Building RAG-Based Applications for Production
- Full Fine-Tuning, PEFT, or RAG?
- Prompt Engineering Guide
- The Best GPUs for Deep Learning 2023
- Making Deep Learning Go Brr from First Principles
- Everything about Distributed Training and Efficient Finetuning
- Training LLMs at Scale with AMD MI250 GPUs
- GPU Programming
- Evaluating ChatGPT
- ChatGPT: Jack of All Trades, Master of None
- What's Going on with the Open LLM Leaderboard
- Challenges in Evaluating AI Systems
- LLM Evaluation Papers
- Evaluating LLMs is a Minefield
- Generative Interfaces Beyond Chat (YouTube)
- Why Chatbots are not the Future
- The Future of Search is Boutique
- As a Large Language Model, I
- Natural Language is an Unnatural Interface
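A couple of minimal code sketches, for readers who want to connect the links above to something concrete. These are illustrative sketches under stated assumptions, not excerpts from any of the linked resources.

First, scaled dot-product attention in plain NumPy, as background for the attention links (Keys, Queries, and Values; Understanding and Coding the Attention Mechanism). Single head, no masking, no learned projections; the shapes and the function name are my own choices.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    # Similarity of each query with every key, scaled to keep the logits stable.
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_len, seq_len)
    # Softmax over keys turns scores into attention weights that sum to 1 per row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors.
    return weights @ V

# Toy usage: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)       # (4, 8)
```

Second, a back-of-the-envelope KV-cache size calculation in the spirit of Transformer Inference Arithmetic and Numbers every LLM Developer should know. The model shapes are assumptions (roughly Llama-2-7B-like: 32 layers, 32 attention heads, head dimension 128, fp16 cache).

```python
def kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128,
                   seq_len=4096, batch=1, bytes_per_elem=2):
    # Factor of 2: both a key and a value vector are cached per layer, head, and token.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

print(kv_cache_bytes() / 2**30, "GiB")  # 2.0 GiB for one 4096-token sequence in fp16
```

At roughly half a mebibyte of cache per token under these assumptions, it is easy to see why long contexts and large batches push serving systems toward grouped-query attention, quantized caches, and paged KV memory.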
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.