Compile ollama on Ubuntu 22.04:
# Install and activate oneapi
sudo apt install intel-basekit
source /opt/intel/oneapi/setvars.sh
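Before moving on, it can help to confirm the oneAPI environment is actually active and that SYCL can see your GPU. A minimal check, assuming the default /opt/intel/oneapi prefix from intel-basekit (sycl-ls ships with the toolkit):

```shell
# Check that setvars.sh has been sourced in this shell
if [ -n "$ONEAPI_ROOT" ]; then
    echo "oneAPI active at $ONEAPI_ROOT"
else
    echo "oneAPI not active; run: source /opt/intel/oneapi/setvars.sh"
fi
# sycl-ls lists the SYCL backends/devices; the Level Zero entries
# should include your discrete GPU
command -v sycl-ls >/dev/null 2>&1 && sycl-ls || echo "sycl-ls not found on PATH"
```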
# You may need to install other build dependencies ...
# sudo apt install apt-utils
# Install go lang
sudo add-apt-repository ppa:longsleep/golang-backports
sudo apt update
sudo apt install -y golang-1.23-go
export PATH=/usr/lib/go-1.23/bin:$PATH
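A quick sanity check that the backports toolchain is the one on PATH (the /usr/lib/go-1.23 path assumes the PPA package above):

```shell
# Prepend the backports Go toolchain and confirm which version resolves
export PATH=/usr/lib/go-1.23/bin:$PATH
if command -v go >/dev/null 2>&1; then
    go version
else
    echo "go not found on PATH"
fi
```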
# Clone ollama
git clone --depth 1 --branch v0.3.13 https://github.com/ollama/ollama.git
# Compile
cd ollama
CGO_ENABLED="1" OLLAMA_SKIP_CPU_GENERATE="1" OLLAMA_INTEL_GPU="1" go generate ./...
go build
The ollama binary will appear in the repository root directory.
When you start the server, you need to set the OLLAMA_INTEL_GPU
environment variable. For example:
export OLLAMA_INTEL_GPU=1
export OLLAMA_NUM_GPU=999
export ZES_ENABLE_SYSMAN=1
export SYCL_CACHE_PERSISTENT=1
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
ollama serve
If successful, you should see the discrete GPU(s) listed when the server starts:
time=2024-11-27T08:33:32.243Z level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
time=2024-11-27T08:33:32.315Z level=INFO source=types.go:123 msg="inference compute" id=0 library=oneapi variant="" compute="" driver=0.0 name="Intel(R) Data Center GPU Max 1100" total="48.0 GiB" available="45.6 GiB"
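With the server running, a quick smoke test from another shell confirms it is reachable; the /api/tags endpoint returns the models available locally (the default port 11434 is an assumption, adjust if you changed OLLAMA_HOST):

```shell
# Query the running server's model list; prints JSON on success
curl -s http://localhost:11434/api/tags || echo "server not reachable on :11434"
```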
NOTE: At the moment, only discrete GPUs are supported, not integrated GPUs.
I also gave it a try on Arch, but there it only builds the CPU runners, so inference runs on the CPU only.