@vadimstasiev
Created February 7, 2023 00:04
Nvidia Quadro P400 Passthrough on Proxmox - Reddit Post

Source: https://www.reddit.com/r/jellyfin/comments/cig9kh/nvidia_quadro_p400_passthrough_on_proxmox/

Nvidia Quadro P400 Passthrough on Proxmox

Ok, finally got it working. I'll provide a writeup of my steps here. Note that these are for Proxmox with an Ubuntu 18.10 container only. If you want to use a different setup, you will have to figure it out yourself.

On the host (Proxmox):

  • Install the Proxmox Linux Headers. The version should match your kernel
    apt install pve-headers-$(uname -r)
  • Download the Nvidia driver, make it executable and run it:
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/430.34/NVIDIA-Linux-x86_64-430.34.run && chmod +x NVIDIA-Linux-x86_64-430.34.run && ./NVIDIA-Linux-x86_64-430.34.run
    You want to use dkms but not update the x config file
  • Load the Nvidia kernel modules at boot time. To do this, edit the file /etc/modules-load.d/modules.conf and add the lines nvidia and nvidia-uvm. The file should look something like this:

    # /etc/modules: kernel modules to load at boot time.
    #
    # This file contains the names of kernel modules that should be loaded
    # at boot time, one per line. Lines beginning with "#" are ignored.
    nvidia
    nvidia-uvm
  • Create a script (I used vim /root/nvidia-dev-node-setup) and fill it with the following bash script:

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    NVDEVS=`lspci | grep -i NVIDIA`
    N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
    NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i
    done
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
     # Find out the major device number used by the nvidia-uvm driver
     D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
     mknod -m 666 /dev/nvidia-uvm c $D 0
else
    exit 1
fi

/usr/bin/nvidia-modprobe -u -c 0

/usr/bin/nvidia-persistenced --persistence-mode
  • Make the script executable with chmod +x /root/nvidia-dev-node-setup, then edit your crontab with crontab -e and add the following line at the end: @reboot /root/nvidia-dev-node-setup
  • Reboot your Proxmox host. When rebooted, the command ls -lah /dev/nvidia* should show these devices:

root@pve:~# ls -lah /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Jul 28 14:34 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jul 28 14:34 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Jul 28 14:34 /dev/nvidia-modeset
crw-rw-rw- 1 root root 237,   0 Jul 28 14:34 /dev/nvidia-uvm
crw-rw-rw- 1 root root 237,   1 Jul 28 14:34 /dev/nvidia-uvm-tools
  • When executing ls -lah /dev/dri/* you should see something like this:

root@pve:~# ls -lah /dev/dri/*
crw-rw---- 1 root video 226,   0 Jul 28 14:34 /dev/dri/card0
crw-rw---- 1 root video 226,   1 Jul 28 14:34 /dev/dri/card1
crw-rw---- 1 root video 226, 128 Jul 28 14:34 /dev/dri/renderD128
  • Note the device major numbers, which appear in the fifth column of both listings (e.g. 195, 237, 226). You will need them for the container config later.
  • Last but not least the Quadro should be recognized by nvidia-smi:

root@pve:~# nvidia-smi
Sun Jul 28 14:51:13 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.34       Driver Version: 430.34       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:07:00.0 Off |                  N/A |
| 34%   35C    P8    N/A /  N/A |      1MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
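
The major numbers mentioned above can also be pulled out of a listing mechanically. The following is a minimal sketch, not part of the original guide: list_majors is a hypothetical helper name, and the sample input is copied from the listings above. On a real host you would pipe in the output of ls -l /dev/nvidia* /dev/dri/* instead.

```shell
# list_majors: read `ls -l` style output on stdin and print the unique
# major numbers of all character devices, one per line.
# (Hypothetical helper for illustration; not part of the original guide.)
list_majors() {
    # Character-device lines start with "c"; field 5 is the major number
    # followed by a comma, e.g. "195,".
    awk '$1 ~ /^c/ { sub(/,$/, "", $5); print $5 }' | sort -nu
}

# Sample input taken from the listings in this guide.
list_majors <<'EOF'
crw-rw-rw- 1 root root 195,   0 Jul 28 14:34 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jul 28 14:34 /dev/nvidiactl
crw-rw-rw- 1 root root 237,   0 Jul 28 14:34 /dev/nvidia-uvm
crw-rw---- 1 root video 226,   0 Jul 28 14:34 /dev/dri/card0
crw-rw---- 1 root video 226, 128 Jul 28 14:34 /dev/dri/renderD128
EOF
```

For the sample input this prints 195, 226 and 237, each on its own line.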

In the container

  • Set up a privileged (dunno if unprivileged works too) Ubuntu 18.10 LXC container on Proxmox
  • Set up a sudo user: https://linuxize.com/post/how-to-create-a-sudo-user-on-ubuntu/
  • Login as the new user and install jellyfin (taken from the official documentation)
    sudo apt install -y apt-transport-https software-properties-common && sudo add-apt-repository universe && wget -O - https://repo.jellyfin.org/ubuntu/jellyfin_team.gpg.key | sudo apt-key add - && echo "deb [arch=$( dpkg --print-architecture )] https://repo.jellyfin.org/ubuntu $( lsb_release -c -s ) main" | sudo tee /etc/apt/sources.list.d/jellyfin.list && sudo apt update && sudo apt install -y jellyfin && sudo systemctl enable jellyfin && sudo reboot
  • After rebooting download and install the same Nvidia driver as on the host, but without the kernel modules (since we share the kernel with the host and it already has the modules)
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/430.34/NVIDIA-Linux-x86_64-430.34.run && chmod +x NVIDIA-Linux-x86_64-430.34.run && sudo ./NVIDIA-Linux-x86_64-430.34.run --no-kernel-module

Again on the host

  • When this is done, shut down the container and edit its conf file. For me it was the container with the id 115:
    vim /etc/pve/nodes/pve/lxc/115.conf
  • Add the following lines to the conf file, replacing the numbers in the first three lines with the major numbers you saw above when running the ls commands. Use one lxc.cgroup.devices.allow line per number and adjust the number of lines accordingly:

lxc.cgroup.devices.allow: c 226:* rwm
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 237:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
  • Reboot the container
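
The conf lines above follow a fixed pattern, so they can be generated from the major numbers and device paths. This is a rough sketch under stated assumptions: gen_lxc_conf and the LXC_CGROUP_KEY variable are hypothetical names invented here, and the default key matches the lxc.cgroup.devices.allow form used in this guide (a later comment in this thread uses lxc.cgroup2.devices.allow instead).

```shell
# gen_lxc_conf MAJOR... -- DEVICE...
# Prints one allow line per major number and one bind-mount line per device.
# (Hypothetical helper; LXC_CGROUP_KEY is an illustrative override.)
gen_lxc_conf() {
    local key="${LXC_CGROUP_KEY:-lxc.cgroup.devices.allow}"
    local dev
    # Everything before "--" is treated as a device major number.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        printf '%s: c %s:* rwm\n' "$key" "$1"
        shift
    done
    # Drop the "--" separator if present.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    # The mount target is the device path without its leading slash.
    for dev in "$@"; do
        printf 'lxc.mount.entry: %s %s none bind,optional,create=file\n' "$dev" "${dev#/}"
    done
}

# Example using values from this guide.
gen_lxc_conf 226 195 237 -- /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm
```

The output is meant to be reviewed and pasted into the container's conf file by hand rather than appended blindly.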

Back in the container

  • Run nvidia-smi. The graphics card should be recognized as shown above
  • Run ls -lah /dev/nvidia* and ls -lah /dev/dri*. You should see the same nodes as on the host.
  • Test ffmpeg with a transcoding command I picked up from the Jellyfin logs (dunno if it is versatile enough to work with all test files on any machine). You should not see any error messages:
    /usr/lib/jellyfin-ffmpeg/ffmpeg -c:v h264_cuvid -resize 426x238 -i file:"/path/to/input.mkv" -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -force_key_frames "expr:gte(t,n_forced*5)" -copyts -avoid_negative_ts disabled -start_at_zero -pix_fmt yuv420p -preset default -b:v 64000 -maxrate 64000 -bufsize 128000 -profile:v high -vsync -1 -map_metadata -1 -map_chapters -1 -threads 0 -codec:a:0 libmp3lame -ac 2 -ab 128000 -af "volume=2" -y /path/to/output.mkv
  • Go to the Jellyfin web interface, Hamburger Menu/Admin/Dashboard/Transcoding. Choose Nvidia NVENC. Check all the boxes that appear (formats and the hardware acceleration boxes). Save the settings
  • Play something in Jellyfin. nvidia-smi should show a running process (on the host and/or in the container)
  • Congratulations. You have passed through your graphics card to Jellyfin in an LXC container on Proxmox.
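
Since the container must run the same driver version as the host, it can help to compare the two explicitly. The sketch below is an illustration only: driver_version and check_match are hypothetical helper names, and the parsing assumes the "Driver Version: X" banner format shown in the nvidia-smi output above.

```shell
# driver_version: extract the driver version from nvidia-smi banner text
# read on stdin (assumes the "Driver Version: X" format shown in this guide).
driver_version() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p' | head -n 1
}

# check_match HOST_VERSION CONTAINER_VERSION
# Succeeds only if both versions are non-empty and identical.
check_match() {
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo "driver versions match: $1"
    else
        echo "driver version mismatch: host='$1' container='$2'" >&2
        return 1
    fi
}

# Example with the banner line from this guide.
echo '| NVIDIA-SMI 430.34       Driver Version: 430.34       CUDA Version: 10.1     |' | driver_version
```

On the Proxmox host, one possible use (assuming container ID 115 as in this guide) would be: check_match "$(nvidia-smi | driver_version)" "$(pct exec 115 -- nvidia-smi | driver_version)".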

Talking points

  • I really struggled with Reddit's markdown when writing this. So if someone wants to structure/format this guide in a better way: be my guest
  • I ended up passing through everything graphics/Nvidia related to the container. Most likely that is not necessary, but when I tested removing device nodes, everything stopped working. So I left it as is. Everyone is invited to optimize this.
  • The device cgroup number might change when rebooting the host. If this becomes a problem, a script might become necessary to update the LXC conf file before starting the container. Or is there a way to fix these values?
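
A script of the kind mentioned in the last point could look roughly like the sketch below. It is not from the original post and rests on assumptions that are flagged in the comments: uvm_major and update_uvm_allow are hypothetical names, the majors 195 (Nvidia) and 226 (DRI) are treated as stable, and any other allow line in the conf file is assumed to be the nvidia-uvm one.

```shell
# uvm_major [FILE]
# Print the major number registered for nvidia-uvm in a /proc/devices-style
# file (defaults to /proc/devices).
uvm_major() {
    awk '$2 == "nvidia-uvm" { print $1; exit }' "${1:-/proc/devices}"
}

# update_uvm_allow CONF MAJOR
# Rewrite the devices.allow line for nvidia-uvm in an LXC conf file.
# ASSUMPTION: lines for 195 and 226 are left alone; every other
# "devices.allow: c N:* rwm" line is taken to be the nvidia-uvm entry.
update_uvm_allow() {
    local conf="$1" major="$2" tmp
    tmp="$(mktemp)" || return 1
    sed -E '/devices\.allow: c (195|226):/!s/^(lxc\.cgroup2?\.devices\.allow: c )[0-9]+(:\* rwm)$/\1'"$major"'\2/' \
        "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the original file keeps its ownership and mode.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

It would be run on the host before the container starts, for example as update_uvm_allow /etc/pve/nodes/pve/lxc/115.conf "$(uvm_major)". Test it on a copy of the conf file first.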

[Initial post]

Hi guys,

I bought a Quadro P400 for my home server to do some transcoding for 4K videos. I spent the last weekend trying to get transcoding working in my Jellyfin LXC container on Proxmox. I ended up passing through /dev/dri/* (which only seems to be for VAAPI) as well as /dev/nvidia0, /dev/nvidiactl and /dev/nvidia-uvm. Transcoding still didn't work though.

Does someone know how to set up a Quadro/Nvidia passthrough on Proxmox with Jellyfin?

Thanks!

permalink by FriedrichNietzsche84 (↑ 16/ ↓ 0)


vadimstasiev commented Feb 7, 2023

"For lxc the host has to have the Nvidia driver. Then you apply this patch to the host once you have the Nvidia driver installed."
https://github.com/keylase/nvidia-patch

Don't forget to hold the Nvidia packages so they don't get updated. Run this on the host and in the containers:
sudo apt-mark hold $(apt list | grep -i nvidia | awk -F "/" '{print $1}')


vadimstasiev commented Feb 7, 2023

source: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

Docker setup Nvidia container:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

apt update -y

apt install docker-ce nvidia-docker2

systemctl restart docker

Test it works:

docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

@vadimstasiev

error

nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown

Fix

source: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#step-3-rootless-containers-setup

Allow Nvidia container to run in rootless container:
sudo sed -i 's/^#no-cgroups = false/no-cgroups = true/;' /etc/nvidia-container-runtime/config.toml
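
Note that the sed expression above only changes the file if the line appears exactly as #no-cgroups = false at the start of a line; otherwise it does nothing and reports no error. A small wrapper can make that visible. This is a sketch only, and enable_no_cgroups is a hypothetical helper name rather than anything from the NVIDIA tooling.

```shell
# enable_no_cgroups CONFIG_FILE
# Apply the same sed edit as above, then verify that the setting is active.
# (Hypothetical wrapper for illustration.)
enable_no_cgroups() {
    local cfg="$1"
    sed -i 's/^#no-cgroups = false/no-cgroups = true/' "$cfg" || return 1
    if grep -q '^no-cgroups = true' "$cfg"; then
        echo "no-cgroups enabled in $cfg"
    else
        echo "no-cgroups line not found in the expected form in $cfg" >&2
        return 1
    fi
}
```

It would be called as root with the path from the command above, for example enable_no_cgroups /etc/nvidia-container-runtime/config.toml, followed by a Docker restart.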


vadimstasiev commented Feb 9, 2023

Final working container config. Note that I am passing through an Intel iGPU along with the Quadro P400:

arch: amd64
cores: 3
cpulimit: 1
features: nesting=1
hostname: docker-tdar
memory: 8000
net0: name=eth0,bridge=vmbr0,gw=10.10.10.1,hwaddr=56:33:0A:E0:71:16,ip=10.10.10.86/24,type=veth
onboot: 0
ostype: debian
rootfs: GREEN250G:2303/vm-2303-disk-0.raw,size=60G
swap: 0
unprivileged: 1
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/renderD128"
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/card0"
lxc.idmap: u 0 100000 44
lxc.idmap: g 0 100000 44
lxc.idmap: u 44 44 1
lxc.idmap: g 44 44 1
lxc.idmap: u 45 100045 60
lxc.idmap: g 45 100045 60
lxc.idmap: u 105 103 1
lxc.idmap: g 105 103 1
lxc.idmap: u 106 100106 894
lxc.idmap: g 106 100106 894
lxc.idmap: u 1000 1000 1
lxc.idmap: g 1000 1000 1
lxc.idmap: u 1001 101001 64535
lxc.idmap: g 1001 101001 64535
lxc.cgroup.devices.allow: c 226:* rwm
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 237:* rwm
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/nvidia*"
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/card1"
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
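
The lxc.idmap lines in the config above follow simple arithmetic: each passed-through ID maps one-to-one to a host ID, and the gaps in between are shifted by 100000. The sketch below reproduces that pattern. gen_idmap is a hypothetical helper name, and the 100000 offset and 65536 range size are read off the config above rather than taken from any documentation.

```shell
# gen_idmap KIND CT_ID:HOST_ID...
# KIND is "u" or "g". Pairs must be given in ascending order of container ID.
# (Hypothetical helper; offset 100000 and range 65536 mirror the config above.)
gen_idmap() {
    local kind="$1"
    shift
    local next=0 pair ct host
    for pair in "$@"; do
        ct="${pair%%:*}"
        host="${pair##*:}"
        # Map the gap before this ID into the shifted range.
        if [ "$ct" -gt "$next" ]; then
            echo "lxc.idmap: $kind $next $((100000 + next)) $((ct - next))"
        fi
        # Map the passed-through ID itself.
        echo "lxc.idmap: $kind $ct $host 1"
        next=$((ct + 1))
    done
    # Map the remainder of the range.
    if [ "$next" -lt 65536 ]; then
        echo "lxc.idmap: $kind $next $((100000 + next)) $((65536 - next))"
    fi
}

# Reproduces the "u" lines of the config above.
gen_idmap u 44:44 105:103 1000:1000
```

Running the same call with g instead of u yields the matching group lines.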


ibwaheemi commented Apr 2, 2026

Your GitHub gist was INVALUABLE, thanks for the write-up. Even Claude was completely stuck.
Hardware: Dell PowerEdge 730
Setup: Proxmox 9.1 with a Debian 12 LXC

Nvidia P400 GPU Passthrough to Proxmox LXC for Jellyfin Docker

### 1. Proxmox Host: Install Nvidia Driver

apt install pve-headers-$(uname -r)
wget https://international.download.nvidia.com/XFree86/Linux-x86_64/580.142/NVIDIA-Linux-x86_64-580.142.run
chmod +x NVIDIA-Linux-x86_64-580.142.run
./NVIDIA-Linux-x86_64-580.142.run

Accept dkms, decline x config update.
Verify:

nvidia-smi


### 2. <CT-ID>.conf additions

lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 235:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
lxc.autodev: 1

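Note: verify the major numbers for your system with ls -la /dev/nvidia* /dev/dri/ since they can differ. On my system 195 is the Nvidia device major, 235 is the nvidia-uvm major (the setup script earlier in this gist reads that one from /proc/devices rather than hard-coding it), and 226 is DRI.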

### 3. LXC Customisation
Install matching Nvidia driver (no kernel modules):

# On the host: push the installer into the container first
pct push <CT-ID> NVIDIA-Linux-x86_64-580.142.run /root/NVIDIA-Linux-x86_64-580.142.run

# Inside LXC
pct enter <CT-ID>
nohup /root/NVIDIA-Linux-x86_64-580.142.run --no-kernel-module --ui=none --no-questions 2>&1 | tee /root/nvidia-install.log &
# Wait for completion then remove installer
rm /root/NVIDIA-Linux-x86_64-580.142.run

Add non-free repos to /etc/apt/sources.list:

deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
deb http://security.debian.org bookworm-security main contrib non-free non-free-firmware

Configure nvidia container runtime:

# Install nvidia container toolkit first
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list |
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' |
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt update && apt install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker

# Critical: allow running without cgroups in LXC
sed -i 's/^#no-cgroups = false/no-cgroups = true/' /etc/nvidia-container-runtime/config.toml
systemctl restart docker

Create ldconfig entrypoint script:

cat > /docker/ldconfig-init.sh << 'EOF'
#!/bin/bash
ldconfig
exec /init
EOF
chmod +x /docker/ldconfig-init.sh

Hold nvidia packages to prevent version mismatch on updates:

apt-mark hold libnvcuvid1 libnvidia-encode1

### 4. compose.yaml

services:
  jellyfin:
    image: lscr.io/linuxserver/jellyfin:latest
    container_name: jellyfin
    privileged: true
    entrypoint: /ldconfig-init.sh
    environment:
      - PUID=0
      - PGID=0
      - TZ=Europe/London
    volumes:
      - /docker/ldconfig-init.sh:/ldconfig-init.sh:ro
      - ./library:/config
      - /mnt/jellyfin/jellyfin:/data/tvshows
      - /mnt/jellyfin/movies/:/data/movies
      - /mnt/jellyfin/sports/:/data/sports
      - /mnt/basketball/:/data/basketball
      - /mnt/jellyfin/transcodes:/mnt/transcodes:rw
      # Nvidia libraries: must match host driver version
      - /usr/lib/x86_64-linux-gnu/libcuda.so.580.142:/usr/lib/x86_64-linux-gnu/libcuda.so.580.142:ro
      - /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro
      - /usr/lib/x86_64-linux-gnu/libcuda.so:/usr/lib/x86_64-linux-gnu/libcuda.so:ro
      - /usr/lib/x86_64-linux-gnu/libnvcuvid.so.580.142:/usr/lib/x86_64-linux-gnu/libnvcuvid.so.580.142:ro
      - /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1:/usr/lib/x86_64-linux-gnu/libnvcuvid.so.1:ro
      - /usr/lib/x86_64-linux-gnu/libnvcuvid.so:/usr/lib/x86_64-linux-gnu/libnvcuvid.so:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.580.142:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.580.142:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-encode.so:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.580.142:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.580.142:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1:ro
      - /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so:ro
    devices:
      - /dev/dri:/dev/dri
      - /dev/nvidia0:/dev/nvidia0
      - /dev/nvidiactl:/dev/nvidiactl
      - /dev/nvidia-uvm:/dev/nvidia-uvm
      - /dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools
    ports:
      - 8096:8096
      - 8920:8920
      - 7359:7359/udp
      - 1900:1900/udp
    restart: unless-stopped


### 5. Jellyfin Dashboard Settings

Hardware acceleration: Nvidia NVENC
Enable hardware decoding: H264, HEVC, MPEG2, VC1, VP8, VP9, HEVC 10bit, VP9 10bit
Enable hardware encoding: ✅
Allow encoding in HEVC format: ✅
Enable tone mapping: ✅ (BT.2390)
Throttle transcodes: ✅, throttle after 30 seconds
Delete segments: ✅
Transcode path: /mnt/transcodes    (optional: I transcode on my NAS)


### 6. Things to watch out for

Driver version must match exactly between host and LXC — the .run installer in the LXC must be the same version as the host
When the Nvidia driver on the host is updated, you must reinstall the .run in the LXC with the new version and update the version numbers in all compose.yaml volume mount paths
The LXC disk needs to be at least 20GB — the nvidia driver install is ~380MB plus docker images
no-cgroups = true in the container runtime config is essential for LXC — without it the nvidia runtime fails
The ldconfig entrypoint is necessary because Docker doesn't run ldconfig on startup, so the mounted libraries won't be found without it
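
Because the driver version appears in several compose.yaml mount paths, regenerating those lines after a driver update is less error-prone than editing them by hand. The following is a sketch under assumptions: gen_lib_mounts is a hypothetical helper name, and the library names and directory are simply the ones used in the compose.yaml above.

```shell
# gen_lib_mounts VERSION [LIBDIR]
# Print read-only volume entries for the Nvidia libraries listed in the
# compose.yaml above, for the given driver version.
# (Hypothetical helper; library list and default directory mirror that file.)
gen_lib_mounts() {
    local version="$1" libdir="${2:-/usr/lib/x86_64-linux-gnu}"
    local lib suffix
    for lib in libcuda libnvcuvid libnvidia-encode libnvidia-ptxjitcompiler; do
        # Each library is mounted under its versioned name and both symlink names.
        for suffix in ".so.$version" ".so.1" ".so"; do
            echo "- $libdir/$lib$suffix:$libdir/$lib$suffix:ro"
        done
    done
}

# Example for the driver version used in this comment.
gen_lib_mounts 580.142
```

The printed entries still need to be indented to match the volumes list before they are pasted into compose.yaml.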
