#!/bin/sh
# Build llama.cpp on Ubuntu 24.04 with AMD GPU support
sudo apt -y install git wget hipcc libhipblas-dev librocblas-dev cmake build-essential
# ensure you have the necessary permissions by adding yourself to the video and render groups
sudo usermod -aG video,render $USER
# reboot to apply the group changes
# run rocminfo to check everything is working thus far
rocminfo
# if it printed information about your GPU, that means it's working
# if you see an error message, fix the problem before continuing
# download a model
wget --continue https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/resolve/main/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf?download=true -O dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
# build llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout b3267
HIPCXX=clang++-17 cmake -H. -Bbuild -DGGML_HIPBLAS=ON -DCMAKE_BUILD_TYPE=Release
make -j16 -C build
# run llama.cpp
build/bin/llama-cli -ngl 32 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -m ../dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
Thank you, this worked fantastically. I am curious why -DAMDGPU_TARGETS=gfx{...} is not used, compared to the official guide. What is the behavior? Is it choosing whatever it detects?
@MarioIshac, the official guide is out of date. The -DAMDGPU_TARGETS flag only affects the hip::device target provided by find_package(hip). This is the mechanism you would use to choose the target if you were building with CXX=hipcc. However, llama-cpp switched to using CMake's built-in support for the HIP language, with HIPCXX=clang++ and enable_language(hip). Target selection for that mechanism is controlled by the -DCMAKE_HIP_ARCHITECTURES flag.
If AMDGPU_TARGETS is not specified, hipcc will detect your GPU and build for that target. Likewise, if CMAKE_HIP_ARCHITECTURES is not specified, CMake will detect your GPU and build for that target. So in the official guide you linked, the build uses HIPCXX and leaves CMAKE_HIP_ARCHITECTURES unset, meaning it will autodetect the architecture. Since hipcc is not used, the fact that it sets AMDGPU_TARGETS is irrelevant.
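To make the two mechanisms concrete, here is a sketch of pinning the architecture explicitly under each build style. The gfx1030 target is just an example; substitute your own GPU's architecture, or leave the flag off entirely to autodetect as described above.

```shell
# Old mechanism: hipcc as the C++ compiler; target chosen via AMDGPU_TARGETS,
# which is consumed by the hip::device target from find_package(hip)
CXX=hipcc cmake -H. -Bbuild -DGGML_HIPBLAS=ON \
    -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release

# Current mechanism: CMake's native HIP language support; target chosen via
# CMAKE_HIP_ARCHITECTURES (unset = autodetect, as in the gist above)
HIPCXX=clang++-17 cmake -H. -Bbuild -DGGML_HIPBLAS=ON \
    -DCMAKE_HIP_ARCHITECTURES=gfx1030 -DCMAKE_BUILD_TYPE=Release
```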
For end users, it's better not to specify the GPU architecture anyway. Just let it autodetect; that way I don't need to train users on how to determine their GPU architecture in order to specify it manually. Only power users who are building binaries to distribute to other people really need to learn that.
If you wish to run llama-cpp in a Docker container, ensure the GPU devices are passed through. The $(getent group render | cut -d: -f3) is there to add the render group by number, because the group name will not exist within the container at launch. The official docs for this can be found at https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#accessing-gpus-in-containers