
New tested approach: build the Nvidia driver for the rt kernel from the driver installed on the non-rt OS (tested on July 19, 2024 on Zorin 17, equivalent to Ubuntu 23.10)

Important

Before starting the process below, make sure your non-rt kernel and the installed rt kernel have the same or a close version. In my most recent test, it was 6.5.0 for generic vs 6.6.40 for rt.
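To compare the two versions quickly, a small helper can reduce a kernel release string to its major.minor series. This is only a sketch: kernel_series is a hypothetical helper name, and the listing of /lib/modules assumes your kernels are installed in the usual location.

```shell
# Hypothetical helper: reduce a kernel release string to "major.minor".
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Series of the currently running (non-rt) kernel.
kernel_series "$(uname -r)"

# Installed kernels; pass the rt entry to kernel_series and compare it
# with the line printed above.
ls /lib/modules 2>/dev/null || true
```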

Step. 0: First, install the rt kernel, but don't boot into it yet. Install the Nvidia driver in your non-rt OS first. This was probably done when you installed your OS, but check that you have the nvidia-kernel-source package. If not, install it:

sudo apt install nvidia-kernel-source-XXX

Step. 1: Now cd into the nvidia-kernel-source directory:

cd "$(dpkg -L nvidia-kernel-source-XXX | grep -m 1 "nvidia-drm" | xargs dirname)"
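To see what that pipeline does: grep -m 1 keeps the first path containing "nvidia-drm", and dirname strips its last component. Assuming the first match is the nvidia-drm directory entry itself, the result is the top of the driver source tree. A minimal sketch on a made-up file list (the paths are placeholders, and first_parent_dir is a hypothetical name):

```shell
# Hypothetical helper: the same grep | xargs dirname stage as in Step 1,
# but reading the path list from stdin instead of from dpkg -L.
first_parent_dir() {
    grep -m 1 "nvidia-drm" | xargs dirname
}

# Placeholder paths standing in for `dpkg -L nvidia-kernel-source-XXX` output.
printf '%s\n' \
    /usr/src/nvidia-XXX \
    /usr/src/nvidia-XXX/nvidia-drm \
    /usr/src/nvidia-XXX/nvidia-drm/nvidia-drm.Kbuild \
    | first_parent_dir
```

With this sample list the helper prints /usr/src/nvidia-XXX, which is the directory the cd command is meant to land in.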

Step. 2: Build the Nvidia driver with IGNORE_PREEMPT_RT_PRESENCE=1. Note that the commands below use $(uname -r), which is only correct once you have booted into the realtime OS. If you are following my steps, you are still in the non-rt OS at this point, so replace every $(uname -r) with the name of your installed rt kernel.

# Build Nvidia driver with IGNORE_PREEMPT_RT_PRESENCE=1
sudo env NV_VERBOSE=1 \
    make -j$(nproc) NV_EXCLUDE_BUILD_MODULES='' \
    KERNEL_UNAME=$(uname -r) \
    IGNORE_XEN_PRESENCE=1 \
    IGNORE_CC_MISMATCH=1 \
    IGNORE_PREEMPT_RT_PRESENCE=1 \
    SYSSRC=/lib/modules/$(uname -r)/build \
    LD=/usr/bin/ld.bfd \
    modules
 
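# Create the target directory first in case it does not exist yet
sudo mkdir -p /lib/modules/$(uname -r)/updates/dkms/
sudo mv *.ko /lib/modules/$(uname -r)/updates/dkms/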

The last line moves the .ko files from the driver source directory into the rt kernel's module directory.
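Since Step 2 is run from the non-rt OS, it can help to look up the rt kernel name once instead of typing it into every $(uname -r) by hand. The sketch below only searches /lib/modules and prints the values to substitute; nothing is built or moved. list_rt_kernels and RT_KERNEL are hypothetical names, and the name patterns (-rtN, realtime) are an assumption about how your rt kernel is named.

```shell
# Hypothetical helper: list installed kernel releases that look like rt builds.
list_rt_kernels() {
    modules_dir="${1:-/lib/modules}"
    for d in "$modules_dir"/*; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        case "$name" in
            *-rt[0-9]*|*realtime*) printf '%s\n' "$name" ;;
        esac
    done
}

# Take the first match and print the values to use in the make command.
RT_KERNEL=$(list_rt_kernels | head -n 1)
if [ -n "$RT_KERNEL" ]; then
    echo "KERNEL_UNAME=$RT_KERNEL"
    echo "SYSSRC=/lib/modules/$RT_KERNEL/build"
else
    echo "no rt kernel found under /lib/modules"
fi
```

If more than one rt kernel is installed, check the full output of list_rt_kernels and pick the right one yourself.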

Step. 3: Now reboot with your realtime kernel.

Step. 4: After you get into the rt system, run:

sudo depmod -a
sudo update-initramfs -k $(uname -r) -u

Step. 5: Reboot again and run nvidia-smi to check that the Nvidia driver loaded successfully.
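Besides nvidia-smi, it is worth confirming that the running kernel really is the rt one. A minimal sketch, assuming the rt kernel reports PREEMPT_RT (or PREEMPT RT on older patches) in uname -v; is_rt_kernel is a hypothetical helper:

```shell
# Hypothetical helper: succeed if the kernel version string mentions PREEMPT_RT.
# Uses `uname -v` of the running kernel when no argument is given.
is_rt_kernel() {
    version="${1:-$(uname -v)}"
    case "$version" in
        *PREEMPT_RT*|*"PREEMPT RT"*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_rt_kernel; then
    echo "running an rt kernel: $(uname -r)"
else
    echo "not an rt kernel: $(uname -r)"
fi

# List loaded nvidia modules, if any (prints nothing when none are loaded).
lsmod 2>/dev/null | grep '^nvidia' || true
```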

Note

If you need to rebuild the Nvidia driver from within the RT system, repeat steps 1-5 and skip step 3.

How to install Nvidia driver alongside real-time linux kernel

To install the Nvidia driver on a PC running a real-time kernel, follow these tutorials:

https://github.com/ApolloAuto/apollo/blob/master/docs/howto/how_to_install_apollo_kernel.md

https://github.com/cacao-org/cacao/wiki/OS-install-RTC

The rt kernel I tested at the time of writing was 5.15.21-rt30 (Feb 2022).

Install real-time linux kernel

Follow this community-contributed tutorial to install the latest stable RT_PREEMPT version: https://docs.ros.org/en/foxy/Tutorials/Building-Realtime-rt_preempt-kernel-for-ROS-2.html

Open Software & Updates. In the Ubuntu Software tab, tick the ‘Source code’ box.

Install dependencies:

sudo apt-get update
sudo apt-get build-dep linux
sudo apt-get install build-essential bc curl ca-certificates gnupg2 lsb-release libncurses-dev flex bison openssl libssl-dev dkms libelf-dev libudev-dev libpci-dev libiberty-dev autoconf fakeroot
sudo apt-get install zstd
cd ~
mkdir rt_kernel
cd rt_kernel

wget https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/linux-5.15.21.tar.gz
wget https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/linux-5.15.21.tar.sign
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patch-5.15.21-rt30.patch.gz
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patch-5.15.21-rt30.patch.sign

gunzip linux-5.15.21.tar.gz
# tar -xzf linux-5.15.21.tar.gz

gunzip patch-5.15.21-rt30.patch.gz

Verify the integrity of the kernel files:

gpg2 --verify linux-*.tar.sign
gpg2 --verify patch-*.patch.sign
tar xf linux-*.tar
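If gpg2 --verify stops because the public key is missing, the signer's key has to be imported first. The sketch below pulls the key ID out of gpg's own output rather than hard-coding one. signing_key_id is a hypothetical helper, the awk pattern assumes gpg prints a line of the form "using ... key <ID>", and the keyserver in the comments is only an example.

```shell
# Hypothetical helper: read `gpg2 --verify` output on stdin and print the
# ID (or fingerprint) of the key that made the signature.
signing_key_id() {
    awk '/using .* key/ { print $NF; exit }'
}

# Usage (left commented out because it needs network access):
#   key=$(gpg2 --verify linux-*.tar.sign 2>&1 | signing_key_id)
#   gpg2 --keyserver hkps://keyserver.ubuntu.com --recv-keys "$key"
#   gpg2 --verify linux-*.tar.sign
```

Importing a key this way only proves that the file matches that key, so compare the fingerprint against a source you trust before relying on the result.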

cd linux-*/
patch -p1 < ../patch-*.patch

cp /boot/config-$(uname -r) .config

yes '' | make oldconfig
make menuconfig

Apply these modifications:

  • General setup / Preemption Model
    • Fully Preemptible Kernel (Real-Time)
  • General setup / Timers subsystem
    • High Resolution Timer Support
  • General setup / Timers subsystem / Timer tick handling
    • Full dynticks system (tickless)
  • Processor type and features / Timer frequency
    • 1000 Hz
  • Power management and ACPI options / CPU Frequency scaling
    • CPU Frequency scaling (CPU_FREQ [=y]) -> Default CPUFreq governor -> (X) performance
  • Cryptographic API / Certificates for signature checking (at the very bottom of the list) / Provide system-wide ring of trusted keys / Additional X.509 keys for default system keyring
    • Remove “debian/canonical-certs.pem” from the prompt and press OK.

Save and exit.
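Before starting the build, it can be worth checking that the menu choices actually landed in .config. A minimal sketch: check_rt_config is a hypothetical helper, and the CONFIG_ symbol names are my mapping of the menu entries above, so treat them as assumptions and adjust them if your kernel version names them differently.

```shell
# Hypothetical helper: report which of the expected options are not set to "y"
# in the given config file. Returns non-zero if any is missing.
check_rt_config() {
    cfg="${1:-.config}"
    missing=0
    for opt in \
        CONFIG_PREEMPT_RT=y \
        CONFIG_HIGH_RES_TIMERS=y \
        CONFIG_NO_HZ_FULL=y \
        CONFIG_HZ_1000=y \
        CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
    do
        if ! grep -Fqx "$opt" "$cfg" 2>/dev/null; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Run the check when executed from the kernel source directory.
if [ -f .config ]; then
    check_rt_config .config || echo "some options above still need to be set"
fi
```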

make -j$(nproc) deb-pkg

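Then install the generated packages from the parent directory with sudo dpkg -i ../linux-*.deb. Or build and install directly: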

make -j $(nproc)
sudo make bzImage
sudo make INSTALL_MOD_STRIP=1 modules_install -j $(nproc)
sudo make install

According to the post by cacao:

the INSTALL_MOD_STRIP is important - it shrinks the initial ramdisk size by ~90%. In some cases, when not used, the initial ramdisk would not load and the patched kernel would not boot.

If you see these errors:

  • CONFIG_X86_X32 enabled but no binutils support
    • set CONFIG_X86_X32=n
  • sed: can't read modules.order: No such file or directory
    • set CONFIG_SYSTEM_TRUSTED_KEYS=""
    • set CONFIG_SYSTEM_REVOCATION_KEYS=""
  • Missing file: arch/x86/boot/bzImage
    • run sudo make bzImage before modules_install
  • /bin/sh: 1: zstd: not found
    • install Zstandard with sudo apt install zstd
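The three config changes in the list can also be applied without reopening menuconfig. The sketch below edits the config file directly and assumes GNU sed (for -i); fix_build_config is a hypothetical helper name. The kernel tree's own scripts/config tool, with its --disable and --set-str options, is an alternative way to do the same thing.

```shell
# Hypothetical helper: empty the two key options and switch off X86_X32
# in the given kernel config file.
fix_build_config() {
    cfg="${1:-.config}"
    sed -i \
        -e 's|^CONFIG_SYSTEM_TRUSTED_KEYS=.*|CONFIG_SYSTEM_TRUSTED_KEYS=""|' \
        -e 's|^CONFIG_SYSTEM_REVOCATION_KEYS=.*|CONFIG_SYSTEM_REVOCATION_KEYS=""|' \
        -e 's|^CONFIG_X86_X32=y|# CONFIG_X86_X32 is not set|' \
        "$cfg"
}

# Usage, from the kernel source directory:
#   fix_build_config .config
```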

Make sure realtime kernel is set to default

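Edit the /etc/default/grub file: change GRUB_DEFAULT=0 to GRUB_DEFAULT=saved and add GRUB_SAVEDEFAULT=true, so that GRUB remembers the last entry you booted. Then run sudo update-grub and select the realtime kernel once from the GRUB menu.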
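The same edit can be scripted. A minimal sketch, assuming GNU sed; set_grub_saved_default is a hypothetical helper, and it is tried here on a scratch file so that /etc/default/grub itself is not touched:

```shell
# Hypothetical helper: switch a grub defaults file to "saved" mode.
set_grub_saved_default() {
    grub_file="$1"
    sed -i \
        -e 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT=saved/' \
        -e 's/^GRUB_SAVEDEFAULT=.*/GRUB_SAVEDEFAULT=true/' \
        "$grub_file"
    # Append the option if the file did not have it at all.
    grep -q '^GRUB_SAVEDEFAULT=' "$grub_file" \
        || echo 'GRUB_SAVEDEFAULT=true' >> "$grub_file"
}

# Dry run on a scratch file with example content.
scratch=$(mktemp)
printf 'GRUB_DEFAULT=0\nGRUB_TIMEOUT=5\n' > "$scratch"
set_grub_saved_default "$scratch"
cat "$scratch"
rm -f "$scratch"
```

To apply it for real, run the helper as root on /etc/default/grub (after making a backup) and then run sudo update-grub.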

Allow a user to set real-time permissions for its processes

According to https://frankaemika.github.io/docs/installation_linux.html#setting-up-the-real-time-kernel:

sudo addgroup realtime
sudo usermod -a -G realtime $(whoami)

Afterwards, add the following limits to the realtime group in /etc/security/limits.conf:

@realtime soft rtprio 99
@realtime soft priority 99
@realtime soft memlock 102400
@realtime hard rtprio 99
@realtime hard priority 99
@realtime hard memlock 102400
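To double-check the entries after editing, the limits for one group can be listed straight from the file. A small sketch; show_group_limits is a hypothetical helper and the default path is the file named above:

```shell
# Hypothetical helper: print "type item value" for every entry of a group.
show_group_limits() {
    group="$1"
    file="${2:-/etc/security/limits.conf}"
    awk -v g="@$group" '$1 == g { print $2, $3, $4 }' "$file"
}

# Show the realtime entries if the file is readable on this machine.
[ -r /etc/security/limits.conf ] && show_group_limits realtime || true

# After logging out and back in (group changes need a new session):
#   id -nG        # should include "realtime"
#   ulimit -r     # bash: maximum real-time scheduling priority
#   ulimit -l     # maximum locked memory
```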

Install CUDA and nvidia driver

Install CUDA

Run the following to install CUDA 11.3:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-3-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

The corresponding driver version is 465, but 465 failed for me, so install 470 instead:

sudo apt-get install nvidia-driver-470

Install Nvidia driver for realtime kernel

Run the following to build Nvidia driver 470:

sudo apt-get install nvidia-kernel-source-470
# Change to the Nvidia driver source directory (under /usr/src/)
cd "$(dpkg -L nvidia-kernel-source-470 | grep -m 1 "nvidia-drm" | xargs dirname)"

# Build Nvidia driver with IGNORE_PREEMPT_RT_PRESENCE=1
sudo env NV_VERBOSE=1 \
    make -j$(nproc) NV_EXCLUDE_BUILD_MODULES='' \
    KERNEL_UNAME=$(uname -r) \
    IGNORE_XEN_PRESENCE=1 \
    IGNORE_CC_MISMATCH=1 \
    IGNORE_PREEMPT_RT_PRESENCE=1 \
    SYSSRC=/lib/modules/$(uname -r)/build \
    LD=/usr/bin/ld.bfd \
    modules

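# Create the target directory first in case it does not exist yet
sudo mkdir -p /lib/modules/$(uname -r)/updates/dkms/
sudo mv *.ko /lib/modules/$(uname -r)/updates/dkms/
sudo depmod -a
sudo update-initramfs -k $(uname -r) -u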

Reboot the system:

sudo reboot

Run nvidia-smi to check the driver.

Option 2

A guide on how to install the Nvidia driver manually: https://www.if-not-true-then-false.com/2021/debian-ubuntu-linux-mint-nvidia-guide/

Install dependencies

sudo apt-get update -y
sudo apt-get install -y libglvnd-dev
# 1. Download NVIDIA driver as a .run file

# 2. Stop X-Server
sudo service lightdm stop

# 3. Blacklist Nouveau driver
sudo nano /etc/modprobe.d/blacklist-nouveau.conf

# Insert into file:
#  blacklist nouveau
#  options nouveau modeset=0

# 4. Update kernel initramfs
sudo update-initramfs -u
sudo reboot  # I'm not sure if needed

# 5. Install driver!
chmod +x NVIDIA-Linux-*.run
sudo IGNORE_PREEMPT_RT_PRESENCE=1 bash NVIDIA-Linux-*.run  # Insert downloaded .run file

# 6. Reboot
sudo reboot

https://gist.github.com/pantor/9786c41c03a97bca7a52aa0a72fa9387

If nvidia-installer created modprobe files to disable the Nouveau driver and you later want to re-enable Nouveau, delete these files:

/usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf
/etc/modprobe.d/nvidia-installer-disable-nouveau.conf