@fucksophie
Last active June 24, 2025 14:36
Run llama.cpp with a single .zip file and nix-shell

This script uses steam-run and nixGL together to launch a llama.cpp server.

tutorial:

  1. Get the latest llama.cpp release. As of 6/24/25 it is b5749. Replace <build> in step 2 with that tag.
  2. wget https://github.com/ggml-org/llama.cpp/releases/download/<build>/llama-<build>-bin-ubuntu-vulkan-x64.zip -O l.zip && unzip l.zip
  3. wget https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GGUF/resolve/main/airoboros-mistral2.2-7b.Q4_K_S.gguf (or use a different model; use the /resolve/ path, since /blob/ only returns the web page). Steps 2 and 3 are shown combined in the snippet after this list.
  4. Save the script below as run-llama-vulkan.sh and chmod +x run-llama-vulkan.sh
  5. If you are using Vulkan on an Intel GPU, just run ./run-llama-vulkan.sh; if you aren't, follow further.
  6. For an Nvidia GPU, change nixVulkanIntel to nixVulkanNvidia everywhere it appears in the script,
    or, if you want CUDA / don't want Vulkan, change it to nixGLDefault (whose wrapper command is nixGL).
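
For convenience, here are steps 2 and 3 as one copy-pasteable snippet. This is just a sketch: the BUILD tag and the -O filenames match what this gist uses elsewhere, so adjust them if you pick a newer release or a different model.

BUILD=b5749   # replace with the latest llama.cpp release tag
wget "https://github.com/ggml-org/llama.cpp/releases/download/${BUILD}/llama-${BUILD}-bin-ubuntu-vulkan-x64.zip" -O l.zip
unzip l.zip
# /resolve/ (not /blob/) serves the raw .gguf file from Hugging Face
wget "https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GGUF/resolve/main/airoboros-mistral2.2-7b.Q4_K_S.gguf" \
  -O airoboros-mistral2.2-7b.Q4_K_S.gguf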
#!/usr/bin/env bash
# Enters a nix-shell providing the nixGL Vulkan wrapper, steam-run and vulkan-tools,
# then launches llama-server inside it.
NIXPKGS_ALLOW_UNFREE=1 nix-shell -E '
  let
    nixglSrc = builtins.fetchTarball {
      url = "https://github.com/nix-community/nixGL/archive/main.tar.gz";
    };
    nixglPkgs = import nixglSrc {};
    pkgs = import <nixpkgs> {};
  in pkgs.mkShell {
    buildInputs = [
      nixglPkgs.nixVulkanIntel
      pkgs.steam-run
      pkgs.vulkan-tools
    ];
  }
' --run '
  nixVulkanIntel steam-run ./llama-server \
    --model ./airoboros-mistral2.2-7b.Q4_K_S.gguf \
    --ctx-size 2048 \
    --n-gpu-layers 100 \
    --port 8080 \
    --host 0.0.0.0 \
    --threads $(nproc) \
    --mlock
'
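
Once the script is running, llama-server listens on port 8080 on all interfaces (--host 0.0.0.0). The sketch below, run from another terminal, assumes the stock llama.cpp server HTTP endpoints /health and /completion; the prompt and n_predict values are only examples.

# quick health check of the running server
curl http://localhost:8080/health

# minimal completion request against the built-in HTTP API
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, my name is", "n_predict": 32}'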