When an LLM processes a prompt, it computes Key and Value vectors for every token at each attention layer; together these form the KV cache. If many requests share the same system prompt, recomputing that prompt's KV cache from scratch for each request is wasteful. Radix Cache stores these computed prefixes in a radix tree and reuses them across any requests whose prompts begin with the same tokens, which is one of the main reasons SGLang achieves high throughput.
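
To make the prefix-reuse idea concrete, here is a minimal sketch of a radix tree keyed by token IDs. It is a toy illustration, not SGLang's implementation: the names `ToyRadixCache`, `Node`, `insert`, and `match_prefix` are hypothetical, the token IDs are made up, and the tree only records which prefixes have been seen rather than holding real KV tensors or handling eviction.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Token IDs on the edge leading into this node (empty for the root).
    tokens: tuple = ()
    # Children keyed by the first token ID of their edge.
    children: dict = field(default_factory=dict)


def _common_len(a, b):
    """Length of the shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class ToyRadixCache:
    """Toy prefix index (hypothetical): tracks which token prefixes are 'cached'."""

    def __init__(self):
        self.root = Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens are already covered by the tree."""
        node, matched = self.root, 0
        tokens = tuple(tokens)
        while matched < len(tokens):
            child = node.children.get(tokens[matched])
            if child is None:
                break
            n = _common_len(child.tokens, tokens[matched:])
            matched += n
            if n < len(child.tokens):
                break  # diverged partway along this edge
            node = child
        return matched

    def insert(self, tokens):
        """Record a token sequence, splitting an edge where it diverges."""
        node, i = self.root, 0
        tokens = tuple(tokens)
        while i < len(tokens):
            child = node.children.get(tokens[i])
            if child is None:
                node.children[tokens[i]] = Node(tokens[i:])
                return
            n = _common_len(child.tokens, tokens[i:])
            if n < len(child.tokens):
                # Split the edge: the shared part stays above, the rest moves below.
                upper = Node(child.tokens[:n], {child.tokens[n]: child})
                child.tokens = child.tokens[n:]
                node.children[tokens[i]] = upper
                child = upper
            node, i = child, i + n


# Usage with made-up token IDs: two requests share a 4-token system prompt.
cache = ToyRadixCache()
system_prompt = [1, 2, 3, 4]
cache.insert(system_prompt + [10, 11])               # first request
print(cache.match_prefix(system_prompt + [20, 21]))  # prints 4
```

In this sketch the second request finds that its first four tokens are already in the tree, so a real server would only need to compute KV entries for the remaining two. Inserting the second request then splits the shared edge so that the system prompt becomes its own node with one branch per request.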