@stas00
Created October 12, 2024 02:08
Easy, scalable inference benchmarking with an aiohttp client (via vLLM)
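# Get the vLLM benchmark scripts and the ShareGPT dataset.
git clone https://github.com/vllm-project/vllm
cd vllm/benchmarks
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
mkdir -p results

# benchmark_serving.py is only the client: it expects a server to already be
# serving the model on --port (for example, one started with
# `vllm serve meta-llama/Meta-Llama-3-8B-Instruct --port 9999`).
# --request-rate inf sends all --num-prompts requests at once.
python benchmark_serving.py \
    --backend vllm \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --dataset-name sharegpt \
    --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
    --port 9999 \
    --save-result \
    --result-dir results \
    --result-filename test.json \
    --num-prompts 50 \
    --request-rate inf \
    --seed 42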
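
Once the run finishes, `--save-result` leaves a JSON file at `results/test.json`. The sketch below is one possible way to skim it, not part of the vLLM tooling: `summarize_results` is a hypothetical helper, and it assumes only that the file holds a top-level JSON object. It makes no assumption about metric names, which can differ between vLLM versions, and prints whatever scalar numeric fields it finds.

```python
import json
from pathlib import Path


def summarize_results(path):
    """Return the scalar numeric fields of a benchmark result file, sorted by name.

    Hypothetical helper: assumes the file contains a top-level JSON object.
    """
    data = json.loads(Path(path).read_text())
    summary = {}
    for key, value in data.items():
        # Keep ints and floats only; skip booleans (a subclass of int),
        # strings, and non-scalar fields such as per-request lists.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            summary[key] = value
    return dict(sorted(summary.items()))


if __name__ == "__main__":
    # Matches --result-dir results and --result-filename test.json above.
    result_file = Path("results") / "test.json"
    if result_file.exists():
        for name, value in summarize_results(result_file).items():
            print(f"{name:>28}: {value:.4g}")
    else:
        print(f"{result_file} not found; run the benchmark first")
```

Run it from `vllm/benchmarks` so the relative `results/` path resolves.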