@egg82
Last active November 15, 2024 23:18
NVidia Proxmox + LXC

Proxmox

Find the proper driver at the NVidia website.

Note: Make sure to select "Linux 64-bit" as your OS

Hit the "Search" button.

Hit the "Download" button.

Right-click the download button and "Copy link address".

SSH into your Proxmox instance.

Create the file /etc/modprobe.d/nvidia-installer-disable-nouveau.conf with the following contents:

# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0

Reboot the machine:

reboot
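
After the reboot, you can confirm nouveau is no longer loaded; this should print nothing:

lsmod | grep nouveau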

Run the following:

apt install build-essential pve-headers-$(uname -r)
wget <link you copied>
chmod +x ./NVIDIA-Linux-x86_64-<VERSION>.run
./NVIDIA-Linux-x86_64-<VERSION>.run
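
If the installer completes without errors, the driver should already be usable on the host; a quick sanity check:

nvidia-smi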

Edit /etc/modules-load.d/modules.conf and add the following to the end of the file:

nvidia
nvidia_uvm

Run the following:

update-initramfs -u

Create the file /etc/udev/rules.d/70-nvidia.rules and add the following:

# /etc/udev/rules.d/70-nvidia.rules
# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
# Create the CUDA node when the nvidia_uvm module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
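
The reboot below will pick these rules up; if you'd rather apply them without rebooting, reloading udev should also work:

udevadm control --reload-rules && udevadm trigger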

Reboot the machine.

For each container

SSH into the Proxmox host.

Run the following:

modprobe nvidia-uvm
ls /dev/nvidia* -l

Note these numbers; you'll need them in the next step. The ones you want are the character-device major numbers: the first number of the pair printed before the date.
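
For example, the output might look something like this (illustrative only; the major numbers 195 and 234 here match one setup reported in the comments below, but yours may differ):

crw-rw-rw- 1 root root 195,   0 Jan  1 00:00 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jan  1 00:00 /dev/nvidiactl
crw-rw-rw- 1 root root 234,   0 Jan  1 00:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 234,   1 Jan  1 00:00 /dev/nvidia-uvm-tools

The 195 and 234 (and a third number, often 237, for /dev/nvidia-caps if present) are the major numbers you need.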

Edit /etc/pve/lxc/<container ID>.conf and add the following, one allow line per distinct major number (on Proxmox 7 and later, use lxc.cgroup2.devices.allow instead; see the comments below). A filled-in example follows the block:

lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-caps/nvidia-cap1 dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-caps/nvidia-cap2 dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file
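
For reference, here is a filled-in example based on a working config reported in the comments below, using the cgroup2 key for newer Proxmox and the major numbers 195, 234, and 237 from that setup:

lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 234:* rwm
lxc.cgroup2.devices.allow: c 237:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file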

Container/LXC

SSH into your container.

Run the following:

dpkg --add-architecture i386
apt update
apt install libc6:i386

wget <link you copied for the Proxmox step>
chmod +x ./NVIDIA-Linux-x86_64-<VERSION>.run
./NVIDIA-Linux-x86_64-<VERSION>.run --no-kernel-module

Reboot the container.
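
After the reboot, the container should see the GPU through the bind-mounted device nodes; a quick check:

nvidia-smi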

CUDA

SSH back into your container.

Run the following:

apt install nvidia-cuda-toolkit nvidia-cuda-dev
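
You can confirm the toolkit installed correctly by checking the CUDA compiler version:

nvcc --version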

Note: Plex DOES NOT USE THE GPU until you install CUDA

Plex will pick up the fact that you have a GPU in the install process and will enable the hardware transcoding checkbox, but it will NOT use the GPU until CUDA is installed.

Python/cuDNN

SSH into your container.

Run the following:

apt install python3 python3-dev python3-pip python3-pycuda

Check your CUDA version:

nvidia-smi
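
The supported CUDA version is shown in the banner at the top of the output; to pull out just that line:

nvidia-smi | grep -o 'CUDA Version: [0-9.]*'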

Download the correct cuDNN library from the NVidia website (requires creating an account, but it's free).

Upload it to your container.

Run the following:

tar -xvf cudnn-<tab>
mkdir -p /usr/local/cuda/lib64/
mkdir -p /usr/local/cuda/include/
cp cuda/lib64/* /usr/local/cuda/lib64/
cp cuda/include/* /usr/local/cuda/include/
export CUDA_ROOT=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$LD_LIBRARY_PATH
export CPATH=$CUDA_ROOT/include:$CPATH
export LIBRARY_PATH=$CUDA_ROOT/lib64:$LIBRARY_PATH
echo "export CUDA_ROOT=/usr/local/cuda" >> .bashrc
echo "export LD_LIBRARY_PATH=\$CUDA_ROOT/lib64:\$LD_LIBRARY_PATH" >> .bashrc
echo "export CPATH=\$CUDA_ROOT/include:\$CPATH" >> .bashrc
echo "export LIBRARY_PATH=\$CUDA_ROOT/lib64:\$LIBRARY_PATH" >> .bashrc

Done!

@fisherwei

pve 7.x should use cgroup2

TLDR: lxc.cgroup.devices.allow MUST be changed to lxc.cgroup2.devices.allow

https://forum.proxmox.com/threads/pve-7-0-lxc-intel-quick-sync-passtrough-not-working-anymore.92025/

@BlummNikkiS

Thank you!

@bobo-jamson

I think there is a minor error. Based on the pattern of mapping /dev/nvidia-* to the same dev/nvidia-*, in these lines:

lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file

the source /dev/nvidia-uvm-tools should likely be /dev/nvidia-caps/nvidia-cap1 and /dev/nvidia-caps/nvidia-cap2, respectively.

Thanks for this really helpful guide by the way.

@doughnet

It should be mentioned that every time a kernel update is completed and a reboot is done, the numbers from the command "ls /dev/nvidia* -l" can change, leaving the container with the passthrough unable to see the NVIDIA device.

@doughnet

This is also a good link to have for a patcher and a direct link for the Linux drivers. In addition, the script in the repo allows for bypassing NVIDIA restrictions.

https://github.com/keylase/nvidia-patch

@GarckaMan

I upgraded to Proxmox 8 and it comes with the 6.2 kernel.
With that, I can't install the CUDA drivers anymore. Any solution to that?

@ryc111

ryc111 commented Aug 19, 2023

for pve8 and debian lxc:

I tried this on both the host and the LXC to replace the apt install nvidia-cuda-toolkit nvidia-cuda-dev step, for installing CUDA:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Debian&target_version=11&target_type=deb_network

Run the following on both the host and the LXC to install CUDA and the related driver:

wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
add-apt-repository contrib
apt-get update
apt-get -y install cuda

And it works now!

@Jugrnot

Jugrnot commented Sep 19, 2023

I hate to necro this, but I'm having some issues. I followed the original writeup to a T and had no errors at any point in the process. I edited the LXC conf, added what was necessary, and rebooted. From inside my Plex container, for example, nvidia-smi shows the cards are present, and the Plex install recognizes that an NVIDIA card exists, yet in the Plex transcode options I can't select anything other than "auto" for hardware. Using "auto", nothing ever touches the GPUs. Both the host and container have the exact same driver and CUDA versions installed. Did I install the wrong version or something?


@swahpy

swahpy commented Nov 2, 2023

This is also a good link to have for a patcher and a direct link for the Linux drivers. In addition, the script in the repo allows for bypassing NVIDIA restrictions.

Hi, did you just simply replace apt install nvidia-cuda-toolkit nvidia-cuda-dev with the following commands?

wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
add-apt-repository contrib
apt-get update
apt-get -y install cuda

But when I ran the above commands, it reported that the driver version was not compatible and kept prompting me to run the NVIDIA uninstaller to remove the driver. I could not even run nvidia-smi successfully.
So could you share all the steps you did? Thank you very much.

@chyld

chyld commented Feb 9, 2024

Here are my changes to /etc/pve/lxc/100.conf

lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 234:* rwm
lxc.cgroup2.devices.allow: c 237:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file

It works flawlessly!

@jsapede

jsapede commented May 28, 2024

I upgraded to Proxmox 8 and it comes with the 6.2 kernel. With that, I can't install the CUDA drivers anymore. Any solution to that?

I just reinstalled Proxmox 8.2 with the 6.8.4-3-pve kernel and followed the instructions for the 470 drivers on the host, and it works.

However, I don't know how to update in the future.

@amangupta20

Would it be possible to add the configuration to multiple containers and let them all have access to the GPU?

@MathisHureau

Hi everyone,
I tried all the advice I found on this page, but I'm stuck at the CUDA step.

I have a GTX 1650, and the driver is already installed on the node and the LXC. I installed this driver.

nvidia-smi tells me the supported CUDA version is 12.4.

When I tried to install CUDA with the steps recommended for my supported CUDA version (from the official NVIDIA docs), it tells me that an incompatible version of nvidia-installer is present.

I don't really know what to try now; can someone give me tips on this? I don't have CUDA on the node; could that be what broke the whole thing?

Thanks for any kind of help, and sorry for my English; I'm still practicing.

@kacoo1

kacoo1 commented Oct 16, 2024

(quotes @MathisHureau's comment above in full)

I'm having the same issue as above, but Plex does use hardware transcoding without it.
