A drop-in Dockerfile that builds the TurboQuant llama-server and swaps it into llama-swap. One image, same config format: just add `-ctk turbo3 -ctv turbo3`.

| Metric | Stock llama.cpp | TurboQuant (turbo3) |
|---|---|---|
| Prefill | 48.9 tok/s | 4,176 tok/s |
| Decode | 44.5 tok/s | 18.2 tok/s |
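
As a rough sketch of the flag change described at the top, a llama-swap model entry might look like the following. The model name, binary path, and model path are hypothetical placeholders, not taken from this image. Only the `-ctk turbo3 -ctv turbo3` flags come from the description above, and the entry assumes llama-swap's usual `models:` / `cmd:` layout with its `${PORT}` macro.

```yaml
# Hypothetical llama-swap entry: name and paths are placeholders.
models:
  "my-model-turbo3":
    cmd: |
      /app/llama-server
      --port ${PORT}
      -m /models/my-model.gguf
      -ctk turbo3 -ctv turbo3
```

`-ctk` and `-ctv` are llama-server's short forms for the K and V cache types, so the rest of an existing entry should be able to stay as it is.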