
@ck3d
Last active January 29, 2025 08:59
llama-bench

llama.cpp version: https://github.com/ggerganov/llama.cpp/commit/925e5584a058afb612f9c20bc472c130f5d0f891

LLM: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/blob/main/llama-2-7b-chat.Q4_K_M.gguf

llama-bench -m ../models/llama-2-7b-chat.Q4_K_M.gguf
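
In the tables below, `pp 512` is the prompt-processing test over 512 tokens and `tg 128` is the token-generation test over 128 tokens; these are llama-bench's default tests, and `t/s` is tokens per second. The following is a minimal sketch of how such runs are typically invoked, assuming the flag names (`-t`, `-p`, `-n`, `-r`) match the llama-bench build at the pinned commit; verify against `llama-bench --help` for your version.

```sh
# Hypothetical model path; substitute your own.
MODEL=../models/llama-2-7b-chat.Q4_K_M.gguf

# CPU run with 8 threads, default tests (pp 512 and tg 128).
llama-bench -m "$MODEL" -t 8

# Explicit test sizes and more repetitions for tighter ± figures.
llama-bench -m "$MODEL" -p 512 -n 128 -r 10
```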

Intel i7-6700K

| model                  |     size | params | backend | threads | test   |         t/s |
| ---------------------- | -------: | -----: | ------- | ------: | ------ | ----------: |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | BLAS    |       4 | pp 512 | 7.58 ± 0.08 |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | BLAS    |       4 | tg 128 | 6.27 ± 0.01 |

AMD Ryzen 7 7735HS

| model                  |     size | params | backend | threads | test   |          t/s |
| ---------------------- | -------: | -----: | ------- | ------: | ------ | -----------: |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | BLAS    |       8 | pp 512 | 27.12 ± 0.39 |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | BLAS    |       8 | tg 128 | 11.31 ± 0.01 |

Apple M1 Pro

| model                  |     size | params | backend | ngl | test   |           t/s |
| ---------------------- | -------: | -----: | ------- | --: | ------ | ------------: |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | Metal   |  99 | pp 512 | 229.66 ± 7.05 |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | Metal   |  99 | tg 128 |  28.99 ± 0.19 |
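
The `ngl` column is the number of model layers offloaded to the GPU (`-ngl`); 99 here simply means "all layers". A sketch of reproducing the Metal run, assuming the same flag name as in the pinned commit:

```sh
# Apple Silicon / Metal run: offload all layers to the GPU.
# Exact behaviour depends on the backend the binary was built with.
llama-bench -m ../models/llama-2-7b-chat.Q4_K_M.gguf -ngl 99
```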
