- Install proprietary nvidia driver
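- Install nvidia-container-toolkit (via apt-get). The toolkit itself does not contain CUDA; the CUDA libraries ship inside the container image, so the host only needs the driver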
- Install tensorflow docker container
Instructions from https://linuxconfig.org/how-to-install-nvidia-driver-on-debian-10-buster-linux
Preparations:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install nvidia-detect
Check recommended driver:
nvidia-detect
Possible output (depending on your graphics card model):
Detected NVIDIA GPUs:
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
Checking card: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
Your card is supported by the default drivers and legacy driver series 390.
It is recommended to install the
nvidia-driver
package.
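To use the recommendation in a script, the package name can be pulled out of the nvidia-detect output. This is only a minimal sketch: it assumes the output layout shown above (package name on its own line right after "It is recommended to install the"), and the helper name recommended_package is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the package name that follows the
# "It is recommended to install the" line in nvidia-detect output.
# Assumes the output layout shown above; reads the report from stdin.
recommended_package() {
    awk '
        # first non-empty line after the marker holds the package name
        found && NF { print $1; exit }
        /It is recommended to install the/ { found = 1 }
    '
}

# Usage on a machine with nvidia-detect installed:
#   nvidia-detect | recommended_package
```

If the layout of the report differs on your system, the function prints nothing, which is a safe signal to read the output by hand instead.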
Download the recommended NVIDIA driver installer (a .run file) from https://www.nvidia.com/en-us/drivers/unix/
There seem to be two options:
- The latest compatible driver, which at time of writing was version 430.50 (filename NVIDIA-Linux-x86_64-430.50.run)
- The legacy driver, version 390.129 (filename: NVIDIA-Linux-x86_64-390.129.run)
The legacy driver (i.e. version 390.129) is not compatible with CUDA 10, so the newer version (i.e. version 430.50) is needed. Save the respective file at an accessible location (will need to navigate to the folder using the CLI later).
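The compatibility reasoning above boils down to comparing a driver version against the minimum that the CUDA release requires. The sketch below assumes a minimum of 418.39 (the value NVIDIA's release notes list for CUDA 10.1 on Linux; check the notes for the CUDA version you actually plan to use) and relies on GNU sort's -V option. The function name version_ge is illustrative.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B.
# Relies on GNU sort's -V (version sort), which is available on Debian.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum driver for CUDA 10.1 on Linux; verify against NVIDIA's
# release notes for the CUDA version you intend to run.
MIN_DRIVER="418.39"

for candidate in 430.50 390.129; do
    if version_ge "$candidate" "$MIN_DRIVER"; then
        echo "$candidate: new enough"
    else
        echo "$candidate: too old"
    fi
done
```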
Install linux-headers (may already be installed anyway):
sudo apt-get install linux-headers-$(uname -r) build-essential
Disable the default nouveau driver:
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nvidia-nouveau.conf
Some more preparations (perhaps unnecessary?):
sudo apt-get install module-assistant
sudo m-a prepare
sudo update-initramfs -u
Source: https://unix.stackexchange.com/a/424603
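Once the machine has been rebooted, it may be worth confirming that the blacklist took effect before starting the NVIDIA installer. A minimal sketch, assuming the standard /proc/modules file (the same data lsmod reads); the helper name module_loaded is made up for illustration.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if kernel module NAME is listed in FILE.
# FILE defaults to /proc/modules, the same data lsmod reads.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded nouveau; then
    echo "nouveau is still loaded; the blacklist has not taken effect yet"
else
    echo "nouveau is not loaded"
fi
```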
Reboot to multi-user runlevel. This will disable the GUI after reboot:
sudo systemctl set-default multi-user.target
sudo systemctl reboot
Here is some more background information on the previous step:
For systemd, the concept of runlevels is replaced by the term “targets”. There is a “mapping” between the init runlevels and systemd targets:
- multi-user.target: analogous to runlevel 3, Text mode
- graphical.target: analogous to runlevel 5, GUI mode with X server
Source: https://www.systutorials.com/239880/change-systemd-boot-target-linux/
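The mapping described above can be written down as a small lookup. This is an illustrative sketch that covers only the two runlevels mentioned here; the function name runlevel_to_target is hypothetical.

```shell
#!/bin/sh
# runlevel_to_target N: print the systemd target that corresponds to
# SysV runlevel N (only the two runlevels discussed here are covered).
runlevel_to_target() {
    case "$1" in
        3) echo "multi-user.target" ;;
        5) echo "graphical.target" ;;
        *) echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

runlevel_to_target 3    # prints multi-user.target
runlevel_to_target 5    # prints graphical.target

# On a systemd machine, the currently configured default can be inspected with:
#   systemctl get-default
```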
Log in as the root user. cd to the directory containing the installation file (NVIDIA-Linux-x86_64-430.50.run, the version chosen above for CUDA 10 compatibility) and install the nvidia driver by running:
bash NVIDIA-Linux-x86_64-430.50.run
During the installation you may be asked the following set of questions:
Register kernel module sources with DKMS?
--> Yes
The CC version check failed: The kernel was built with gcc version 8.2.0 (Debian 8.2.0-14), but the current compiler version is cc (Debian 8.3.0-2) 8.3.0. This may lead to subtle problems; if you are not certain whether the mismatched compiler will be compatible with your kernel, you may wish to abort installation, set the CC environment variable to the name of the compiler used to compile your kernel, and restart installation.
--> Ignore CC version check
Install NVIDIA's 32-bit compatibility libraries?
--> Yes
An incomplete installation of libglvnd was found. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries.
--> Install and overwrite existing files
Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.
--> Yes
Reboot the system, back into GUI mode:
systemctl set-default graphical.target
systemctl reboot
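After the machine is back in GUI mode, the driver installation can be verified with nvidia-smi, which is installed together with the driver. A minimal sketch; the wrapper name check_driver is made up for illustration, while --query-gpu and --format are regular nvidia-smi options.

```shell
#!/bin/sh
# Report the GPU name and installed driver version, or explain why
# they cannot be read.
check_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; the driver does not appear to be installed"
        return 1
    fi
    # One CSV line per GPU: name, driver version
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

check_driver || echo "driver check failed"
```

The reported driver version should match the version of the .run file that was installed.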
Instructions from: https://github.com/NVIDIA/nvidia-docker
Add the package repositories:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
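The distribution variable is built by sourcing /etc/os-release and concatenating two of its fields, and it selects which repository list is downloaded. The sketch below isolates that step so the value can be inspected before it is used in the URL; the function name distribution_id is illustrative.

```shell
#!/bin/sh
# distribution_id [FILE]: print ID and VERSION_ID from an os-release file,
# concatenated the same way as the $distribution variable above.
# The subshell keeps the sourced variables out of the calling shell.
distribution_id() {
    ( . "${1:-/etc/os-release}" && echo "${ID}${VERSION_ID}" )
}

if [ -r /etc/os-release ]; then
    distribution_id    # on Debian 10 this prints: debian10
fi
```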
Install the toolkit:
sudo apt-get update
sudo apt-get install nvidia-container-toolkit nvidia-docker2
sudo systemctl restart docker
Reboot the system.
Run the NVIDIA CUDA container as a test (the image is pulled on first use):
docker run --runtime=nvidia --rm nvidia/cuda:10.1-devel nvidia-smi
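Docker 19.03 and later also accept the --gpus option, which works with nvidia-container-toolkit alone, whereas --runtime=nvidia depends on the runtime registered by nvidia-docker2. The sketch below picks the option from a Docker version string; the function name gpu_flag is hypothetical and the comparison relies on GNU sort -V.

```shell
#!/bin/sh
# gpu_flag VERSION: print the docker run option for GPU access.
# Docker 19.03 and later understand --gpus; older releases need the
# nvidia runtime that the nvidia-docker2 package registers.
gpu_flag() {
    oldest=$(printf '%s\n%s\n' "$1" "19.03" | sort -V | head -n 1)
    if [ "$oldest" = "19.03" ]; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia"
    fi
}

gpu_flag 19.03.13    # prints --gpus all
gpu_flag 18.09.1     # prints --runtime=nvidia

# Usage with the installed Docker (not executed here):
#   docker_version=$(docker version --format '{{.Server.Version}}')
#   docker run $(gpu_flag "$docker_version") --rm nvidia/cuda:10.1-devel nvidia-smi
```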
Instructions from: https://www.tensorflow.org/install/docker
Pull docker image:
docker pull tensorflow/tensorflow:2.3.1-gpu
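Test the docker image. The --gpus all option passes the GPUs into the container and requires Docker 19.03 or later; on older versions use --runtime=nvidia as in the CUDA test above:
docker run --gpus all -it --rm tensorflow/tensorflow:2.3.1-gpu bash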
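Instead of opening an interactive shell, a one-off command can confirm that TensorFlow inside the container actually sees the GPU. This is a sketch under assumptions: the wrapper name check_tf_gpu is made up, and --gpus all presumes Docker 19.03 or later (use --runtime=nvidia otherwise).

```shell
#!/bin/sh
# Image tag taken from the pull command above.
IMAGE="tensorflow/tensorflow:2.3.1-gpu"

# Python one-liner that lists the GPUs TensorFlow can see.
PY_CHECK="import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

check_tf_gpu() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found" >&2
        return 1
    fi
    # --gpus all needs Docker 19.03+; on older versions use --runtime=nvidia.
    docker run --gpus all --rm "$IMAGE" python -c "$PY_CHECK"
}

# Usage: check_tf_gpu
```

An empty list in the output means the container started but TensorFlow found no GPU, which usually points at the driver or the container toolkit rather than at the image.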
Sources: