I used Bazzite Linux because it seems to have the best support for the AMD Ryzen AI Max+ 395 right now, and it installs and uses Podman by default. The following instructions should work on any Linux distribution if:
- You have (very) recent AMD kernel drivers installed
- You have Podman or Docker installed (the instructions below should also work with Docker if you just swap the tool name)
- You have gone into the BIOS and bumped up the amount of RAM given over to the GPU by default (I used 64 GB, but you do you)
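Before starting anything, it may be worth checking that the GPU device nodes the container will need are actually present. This is a minimal pre-flight sketch; `/dev/kfd` and `/dev/dri` are the same paths passed through to the container below:

```shell
# Pre-flight check: the ollama:rocm container needs /dev/kfd (ROCm compute)
# and /dev/dri (DRM render nodes) passed through via --device.
for dev in /dev/kfd /dev/dri; do
  if [ -e "$dev" ]; then
    echo "found $dev"
  else
    echo "missing $dev (is the amdgpu driver loaded?)"
  fi
done
```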
This will serve on port 11434, and for example purposes I also have it fetch a model (llama3) and run it.
Based on this issue comment, I enabled GPU access inside the Podman containers:
# let containers use devices (like the AMD GPU)
sudo setsebool -P container_use_devices=true
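You can verify the boolean took effect with `getsebool` (guarded here, since a non-SELinux system won't have the tool):

```shell
# Check the SELinux boolean set above; on systems without SELinux
# tooling this prints a note instead of failing.
if command -v getsebool >/dev/null 2>&1; then
  getsebool container_use_devices
else
  echo "getsebool not available (SELinux tooling not installed?)"
fi
```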
Then download and run the container as a daemon:
# download and run the container as a daemon; with --network=host it answers on
# port 11434 on the host (a -p mapping would be discarded in host networking mode)
podman run -d --name ollama --network=host -v ollama:/root/.ollama --device /dev/kfd --device /dev/dri docker.io/ollama/ollama:rocm
# To pull a new model
podman exec -it ollama ollama pull llama3
# Or to pull a model and run it in an interactive shell
podman exec -it ollama ollama run llama3
# Test the web api
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt":" Why is the color of the sea blue ?"
}'
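By default `/api/generate` streams its answer as newline-delimited JSON, one object per line, each carrying a fragment of the text in a `"response"` field (pass `"stream": false` to get a single object instead). A minimal sketch of pulling a fragment out of one such line with `sed`, using a mocked line so it runs without the server:

```shell
# One line of Ollama's streaming output (mocked here for illustration);
# the real stream is one JSON object shaped like this per line.
line='{"model":"llama3","created_at":"2024-01-01T00:00:00Z","response":"Blue","done":false}'
# Extract the "response" fragment from the line.
echo "$line" | sed -n 's/.*"response":"\([^"]*\)".*/\1/p'
```

For anything beyond a quick look, a real JSON tool (e.g. `jq`, if installed) is a better idea than `sed`.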
Here's a quick demo of basic shell access to the model:
~/development/ai$ podman exec -it ollama ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 12 KB
pulling 8ab4849b038c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 254 B
pulling 577073ffcc6c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 110 B
pulling 3f8eb4da87fa... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████▏ 485 B
verifying sha256 digest
writing manifest
success
>>> make a joke about cats
Why did the cat join a band?
Because it wanted to be the purr-cussionist!
You probably want to use the (very nice) web interface. In that case, run the following container:
podman run -d --network=host --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Then you can access the web interface at http://localhost:8080/
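Once both containers are running, a quick (hypothetical) health check is to probe each service with curl; "down" here just means the port didn't answer within the timeout:

```shell
# Probe both services; prints up/down per URL instead of failing hard.
for url in http://localhost:11434/api/tags http://localhost:8080/; do
  if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
    echo "up: $url"
  else
    echo "down: $url"
  fi
done
```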