
@cgmb
Last active December 21, 2024 09:41
How to build llama.cpp on Ubuntu Mantic
#!/bin/sh
# Build llama.cpp on Ubuntu 23.10
# Tested with `docker run -it --device=/dev/dri --device=/dev/kfd --security-opt seccomp=unconfined --volume $HOME:/mnt/home debian:sid`
apt -y update
apt -y upgrade
apt -y install git hipcc libhipblas-dev librocblas-dev cmake build-essential
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp/
git checkout b2110
CC=clang-15 CXX=clang++-15 cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release
make -j16 -C build
build/bin/main -ngl 32 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -m ~/Downloads/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
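
If nothing ends up offloaded to the GPU, a quick sanity check is to confirm the container can actually see the device. This is only a suggestion and assumes the rocminfo package is available from the same repositories:

# Optional: verify the GPU is visible from inside the container
apt -y install rocminfo
rocminfo | grep -i gfx    # should print your GPU's architecture, e.g. gfx1030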
@userbox020

userbox020 commented Mar 4, 2024

sup bro, I tried to run the repo inside a docker container on Ubuntu 22.04 but it only detects the CPU

here's my Dockerfile

# Using Debian Bullseye for better stability
FROM debian:bullseye

# Build argument for Clang version to make it flexible
ARG CLANG_VERSION=11

# Set non-interactive frontend to avoid prompts during build
ENV DEBIAN_FRONTEND=noninteractive

# Update system and install essential packages
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    git \
    cmake \
    build-essential \
    "clang-$CLANG_VERSION" \
    libomp-dev # OpenMP library, often used with Clang

# Clone the specific repo and checkout to the specified commit
RUN git clone https://github.com/ggerganov/llama.cpp.git /llama.cpp && \
    cd /llama.cpp && \
    git checkout b2110

# Build the project
RUN cd /llama.cpp && \
    CC="clang-$CLANG_VERSION" CXX="clang++-$CLANG_VERSION" cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release && \
    make -j$(nproc) -C build

# Set the working directory to the build directory
WORKDIR /llama.cpp/build

# Command to keep the container running (replace this with your desired command)
CMD ["tail", "-f", "/dev/null"]

and here's my docker compose file

version: '3.8'
services:
  llama-builder:
    build:
      context: . # Assumes Dockerfile is in the same directory
      args:
        CLANG_VERSION: "11" # Set the Clang version here, adjust as necessary
    devices:
      - "/dev/dri:/dev/dri" # For GPU access, might not be necessary for all use cases
      - "/dev/kfd:/dev/kfd" # For AMD ROCm access, adjust if not using ROCm
    security_opt:
      - seccomp=unconfined
    volumes:
      - "$HOME:/mnt/home" # Mount home directory to access necessary files
      - "/media/500GB_HDD/Models/:/Models/"   #Models files
      

this is how I start the docker container: sudo docker compose up --build

and then I try to run a model like the following

./bin/main -m /Models/openhermes-2.5-neural-chat-v3-3-slerp.Q8_0.gguf -p "Hi you how are you" -ngl 90 --no-mmap --numa

and it's not offloading anything to the GPU

can you give me a hand bro? I would appreciate it

@cgmb
Author

cgmb commented Mar 4, 2024

@userbox020, you need to apt-get install -y hipcc libhipblas-dev librocblas-dev and use a newer OS and compiler. I suggest FROM ubuntu:mantic and CLANG_VERSION=15. The necessary packages are not available for Debian Bullseye or clang-11. Those are too old.
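
For reference, here's a rough sketch of those changes applied to your Dockerfile (untested; the rest of the file can stay the same):

FROM ubuntu:mantic

ARG CLANG_VERSION=15

ENV DEBIAN_FRONTEND=noninteractive

# hipcc and the ROCm math libraries are required for the hipBLAS backend
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    git \
    cmake \
    build-essential \
    "clang-$CLANG_VERSION" \
    hipcc \
    libhipblas-dev \
    librocblas-dev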

@cgmb
Author

cgmb commented Dec 15, 2024

These instructions are now out of date for Debian 13, which has updated its ROCm stack to use clang-17. It should still work fine if you replace -15 with -17 for clang in the build command.
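
For example, the configure-and-build step on Debian 13 would look something like this (only the compiler names change):

CC=clang-17 CXX=clang++-17 cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release
make -j16 -C build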

I also have Ubuntu 24.04 instructions for a newer version of llama.cpp. For the moment, the Debian 13 instructions are the same as the Ubuntu 24.04 ones; however, Debian 13 has not been released yet, so it's possible the instructions may change again (to use clang-18).
