@anna-hope
Created September 27, 2025 19:32
llama.cpp AMD Vulkan Docker/Podman compose
services:
  llama-cpp:
    image: ghcr.io/ggml-org/llama.cpp:server-vulkan
    command: "-m /models/Qwen3-8B-Q8_0.gguf --host 0.0.0.0 --port 8000 --ctx-size 16000 --context-shift"
    devices:
      - "/dev/kfd:/dev/kfd"
      - "/dev/dri:/dev/dri"
    ports:
      - "8000:8000"
    restart: "unless-stopped"
    security_opt:
      - label=type:container_runtime_t
    volumes:
      - /path/to/models:/models
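Bring the service up with `docker compose up -d` (or `podman-compose up -d`) after replacing /path/to/models with the directory that actually holds the GGUF file. The container then listens on port 8000 and serves the llama.cpp HTTP API, which in recent server builds includes an OpenAI-compatible chat endpoint. Below is a minimal sketch of querying it with Python's standard library; the host, port, and prompt are illustrative, and the model field is informational since this compose file loads a single model.

# Minimal sketch: query the llama.cpp server started by the compose file above.
# Assumes the container is running, reachable at http://localhost:8000, and that
# the server build exposes the OpenAI-compatible /v1/chat/completions endpoint.
import json
import urllib.request

payload = {
    "model": "Qwen3-8B-Q8_0",  # informational; the server loads a single model
    "messages": [
        {"role": "user", "content": "Say hello in one sentence."}
    ],
    "max_tokens": 64,
}

request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    body = json.load(response)
    print(body["choices"][0]["message"]["content"])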