@svpino
Created February 24, 2025 20:14
Running DeepSeek on HPC
# Loading and serving the model:
vllm serve /root/commonData/DeepSeek-R1 \
    --host 0.0.0.0 \
    --port 8000 \
    --enable-reasoning \
    --reasoning-parser deepseek_r1 \
    --tensor-parallel-size 8 \
    --load-format auto \
    --trust-remote-code \
    --served-model-name deepseek-ai/DeepSeek-R1
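# On a SLURM-managed HPC cluster, the serve command above would typically be
# wrapped in a batch script. The directives below are a sketch only: the job
# name, GPU count, memory, and time limit are assumptions that must be matched
# to your site's partitions and node layout.

#!/bin/bash
#SBATCH --job-name=deepseek-r1
#SBATCH --gres=gpu:8
#SBATCH --cpus-per-task=32
#SBATCH --time=04:00:00

# Launch the vLLM OpenAI-compatible server on the allocated node.
# The model path and flags mirror the command shown above.
vllm serve /root/commonData/DeepSeek-R1 \
    --host 0.0.0.0 \
    --port 8000 \
    --enable-reasoning \
    --reasoning-parser deepseek_r1 \
    --tensor-parallel-size 8 \
    --load-format auto \
    --trust-remote-code \
    --served-model-name deepseek-ai/DeepSeek-R1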
# Running inference:
curl "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-ai/DeepSeek-R1",
"messages": [{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Write a short description of a hypothetical gray car"
}
]}'
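# With the reasoning parser enabled, vLLM puts the model's chain of thought in
# a separate reasoning_content field next to the final answer. One way to pull
# both fields out of the response is jq (assuming jq is installed; the sample
# response below is fabricated for illustration, a real one has more fields):

```shell
# Fabricated minimal chat-completion response with reasoning enabled.
response='{"choices":[{"message":{"reasoning_content":"The user wants a short car description.","content":"A sleek gray sedan with a quiet, understated presence."}}]}'

# Extract the chain of thought and the final answer separately.
echo "$response" | jq -r '.choices[0].message.reasoning_content'
echo "$response" | jq -r '.choices[0].message.content'
```

# In practice you would pipe the curl output above straight into jq instead of
# using a saved sample.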