Here's how I set up a multi-part model (gpt-oss-120b) as a service via ramalama:
# Pull the model (this will take a while)
ramalama pull hf://ggml-org/gpt-oss-120b-GGUF
mkdir -p ~/.config/containers/systemd
cd ~/.config/containers/systemd
# Generate the quadlet
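A minimal sketch of that step and the follow-up, assuming ramalama's `--generate=quadlet` option for `serve` and a service name of `gpt-oss` (the `--name` value and resulting unit name are my assumptions; check `ramalama serve --help` on your version for the exact flag syntax):

```shell
# Write a quadlet .container file into the current directory
# (~/.config/containers/systemd), where podman's systemd generator finds it
ramalama serve --name gpt-oss --generate=quadlet hf://ggml-org/gpt-oss-120b-GGUF

# Let systemd pick up the generated unit and start it
systemctl --user daemon-reload
systemctl --user start gpt-oss.service
```

On daemon-reload, podman's quadlet generator translates the `.container` file into a transient systemd service unit, so the model server starts like any other user service and can be enabled to come up at login.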