I used Bazzite Linux because it seems to have the best support for the AMD Ryzen AI Max+ 395 right now, and it installs and uses Podman by default. The following instructions should work on any Linux distribution if:
- You have (very) recent AMD kernel drivers installed
- You have Podman or Docker installed (the instructions below should also work with Docker if you just swap the tool name)
- You have gone into the BIOS and bumped up the amount of RAM given over to the GPU by default (I used 64 GB, but you do you)
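Before starting anything, it may be worth checking that the GPU device nodes the container will need are actually present. This is a minimal pre-flight sketch; `/dev/kfd` and `/dev/dri` are the same paths passed through to the container below:

```shell
# Pre-flight check: the ollama:rocm container needs /dev/kfd (ROCm compute)
# and /dev/dri (DRM render nodes) passed through via --device.
for dev in /dev/kfd /dev/dri; do
  if [ -e "$dev" ]; then
    echo "found $dev"
  else
    echo "missing $dev (is the amdgpu driver loaded?)"
  fi
done
```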
This will serve on port 11434, and for example purposes I also have it fetch a model (llama3) and run it.
Based on this issue comment, I enabled GPU access inside the Podman containers:
# let containers use devices (like the AMD GPU)
sudo setsebool -P container_use_devices=true
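You can verify the boolean took effect with `getsebool` (guarded here, since a non-SELinux system won't have the tool):

```shell
# Check the SELinux boolean set above; on systems without SELinux
# tooling this prints a note instead of failing.
if command -v getsebool >/dev/null 2>&1; then
  getsebool container_use_devices
else
  echo "getsebool not available (SELinux tooling not installed?)"
fi
```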
Then download and run the container as a daemon:
# download and run the container as a daemon; with --network=host it answers on
# port 11434 on the host (a -p mapping would be discarded in host networking mode)
podman run -d --name ollama --network=host -v ollama:/root/.ollama --device /dev/kfd --device /dev/dri docker.io/ollama/ollama:rocm
# To pull a new model
podman exec -it ollama ollama pull llama3
# Or to pull a model and run it in an interactive shell
podman exec -it ollama ollama run llama3
# Test the web api
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt":" Why is the color of the sea blue ?"
}'
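By default `/api/generate` streams its answer as newline-delimited JSON, one object per line, each carrying a fragment of the text in a `"response"` field (pass `"stream": false` to get a single object instead). A minimal sketch of pulling a fragment out of one such line with `sed`, using a mocked line so it runs without the server:

```shell
# One line of Ollama's streaming output (mocked here for illustration);
# the real stream is one JSON object shaped like this per line.
line='{"model":"llama3","created_at":"2024-01-01T00:00:00Z","response":"Blue","done":false}'
# Extract the "response" fragment from the line.
echo "$line" | sed -n 's/.*"response":"\([^"]*\)".*/\1/p'
```

For anything beyond a quick look, a real JSON tool (e.g. `jq`, if installed) is a better idea than `sed`.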
Here's a quick demo of basic shell access to the model:
~/development/ai$ podman exec -it ollama ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 12 KB
pulling 8ab4849b038c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 254 B
pulling 577073ffcc6c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 110 B
pulling 3f8eb4da87fa... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 485 B
verifying sha256 digest
writing manifest
success
>>> make a joke about cats
Why did the cat join a band?
Because it wanted to be the purr-cussionist!
You probably want to use the (very nice) web interface. In that case, run the following container:
podman run -d --network=host --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Then you can access the web interface at http://localhost:8080/
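Once both containers are running, a quick (hypothetical) health check is to probe each service with curl; "down" here just means the port didn't answer within the timeout:

```shell
# Probe both services; prints up/down per URL instead of failing hard.
for url in http://localhost:11434/api/tags http://localhost:8080/; do
  if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
    echo "up: $url"
  else
    echo "down: $url"
  fi
done
```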