@Phate334
Last active May 13, 2025 03:10
A simple script to benchmark GGUF base and draft models with llama.cpp, including a speculative decoding run
#!/bin/bash
set -e
set -u
if [ $# -lt 1 ]; then
    echo "Usage: $0 BASE_MODEL [DRAFT_MODEL] [--skip-base] [--skip-draft] [--skip-sd]" >&2
    exit 1
fi
BASE_MODEL="$1"
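DRAFT_MODEL="${2:-}"
# A flag in the second position means no draft model was given
case "$DRAFT_MODEL" in
    --*) DRAFT_MODEL="" ;;
esac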
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
LLAMA_BENCH="$SCRIPT_DIR/llama.cpp/build/bin/llama-bench"
LLAMA_SPECULATIVE_SAMPLE="$SCRIPT_DIR/llama.cpp/build/bin/llama-speculative-simple"
SKIP_BASE=false
SKIP_DRAFT=false
SKIP_SD=false
# If no DRAFT_MODEL is given, automatically skip the SD and DRAFT runs
if [ -z "$DRAFT_MODEL" ]; then
    SKIP_SD=true
    SKIP_DRAFT=true
fi
for arg in "$@"; do
    case "$arg" in
        --skip-base)
            SKIP_BASE=true
            ;;
        --skip-draft)
            SKIP_DRAFT=true
            ;;
        --skip-sd)
            SKIP_SD=true
            ;;
    esac
done
if [ "$SKIP_SD" = false ]; then
    echo "Running speculative decoding..."
    # The prompt is Traditional Chinese for "How do you evaluate a bottle of whisky?"
    "$LLAMA_SPECULATIVE_SAMPLE" -fa -sm none -ctk q8_0 -ctv q8_0 \
        -m "$BASE_MODEL" -ngl 99 -c 4096 \
        -md "$DRAFT_MODEL" -ngld 99 -cd 4096 \
        -p "如何評價一瓶威士忌" --no-perf
fi
if [ "$SKIP_BASE" = false ]; then
    echo "Running base model..."
    "$LLAMA_BENCH" -m "$BASE_MODEL" -ngl 99 -sm none -fa 1 -p 2048 -n 512
fi
if [ "$SKIP_DRAFT" = false ]; then
    echo "Running draft model..."
    "$LLAMA_BENCH" -m "$DRAFT_MODEL" -ngl 99 -sm none -fa 1 -p 2048 -n 512
fi