Skip to content

Instantly share code, notes, and snippets.

@ugovaretto
Last active August 27, 2025 12:37
Show Gist options
  • Save ugovaretto/03f8996a5bcccdb421ca52cb90ba424a to your computer and use it in GitHub Desktop.
Save ugovaretto/03f8996a5bcccdb421ca52cb90ba424a to your computer and use it in GitHub Desktop.
Run vLLM on gfx1151
#!/bin/env bash
# Launch VLLM on a gfx1151 AMD Strix Halo 395+
# As of August 27, 2025
podman run -it --rm --ipc=host --network=host --privileged \
--cap-add=CAP_SYS_ADMIN \
--device=/dev/kfd --device=/dev/dri --device=/dev/mem \
--group-add render \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
rocm/vllm-dev:main bash -c "pip install --upgrade transformers && vllm serve Qwen/Qwen3-4B-Instruct-2507"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment