llama.cpp on Framework 13 with AMD Ryzen 7040 series on Arch Linux

Install deps

> sudo pacman -Sy extra/rocminfo radeontop cmake extra/rocm-hip-sdk extra/rocblas extra/hipblas

Identify integrated GPU

> rocminfo | grep Name
  Name:                    AMD Ryzen 5 7640U w/ Radeon 760M Graphics
  Marketing Name:          AMD Ryzen 5 7640U w/ Radeon 760M Graphics
  Vendor Name:             CPU
  Name:                    gfx1103
  Marketing Name:          AMD Radeon 760M
  Vendor Name:             AMD
      Name:                    amdgcn-amd-amdhsa--gfx1103
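If rocminfo shows no GPU agent at all, the usual cause on Arch is device permissions; a common fix (assuming the default udev rules) is to add your user to the render and video groups and log in again:

> sudo usermod -aG render,video $USER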

Build and install steps from https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#hip

Build

> HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1103 -DCMAKE_BUILD_TYPE=Release \
    && cmake --build build --config Release -- -j 16
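To sanity-check the build before fetching a model, list the binaries and print the build info (a quick check; --version in recent llama.cpp prints the version and how the binary was built):

> ls build/bin/llama-*

> ./build/bin/llama-cli --version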

The library for the GPU arch gfx1103 isn't shipped in the ROCm packages on Arch Linux yet (ROCm/ROCm#2631).

For now, work around this by overriding the GPU version at runtime; gfx1103 and gfx1100 are both RDNA3, so the gfx1100 (11.0.0) kernels are compatible:

> HSA_OVERRIDE_GFX_VERSION=11.0.0 ./build/bin/llama-server
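To avoid prefixing every command, the override can also be exported once per shell session:

> export HSA_OVERRIDE_GFX_VERSION=11.0.0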

Install the Hugging Face CLI and fetch a model

> sudo pacman -Sy extra/python-huggingface-hub

> huggingface-cli download TheBloke/deepseek-coder-33B-base-GGUF deepseek-coder-33b-base.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
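With the model downloaded, a minimal serving setup (a sketch, not tuned: -ngl controls how many layers are offloaded to the GPU, and port 8080 is arbitrary) looks like:

> HSA_OVERRIDE_GFX_VERSION=11.0.0 ./build/bin/llama-server \
    -m deepseek-coder-33b-base.Q4_K_M.gguf -ngl 99 --port 8080

Note that a 33B Q4_K_M file is roughly 20 GB, so the iGPU's shared memory needs to be able to hold whatever is offloaded. The server can then be smoke-tested against its /completion endpoint:

> curl http://localhost:8080/completion -H "Content-Type: application/json" \
    -d '{"prompt": "def fib(n):", "n_predict": 64}'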