@jeremy-rutman
Last active May 26, 2025 20:36
tl;dr
use the package manager install (.deb or .rpm files, not .run file)
remove stuff using
sudo apt-get --purge remove 'cuda*'
sudo apt-get --purge -y remove 'nvidia*'
sudo apt-get --purge -y remove 'libnvidia*'
then use
jeremy@jeremy-Blade:~$ dpkg -l | grep -i nvidia
to make sure everything is uninstalled
uninstall steps to avoid 'existing runfile installation already found, it is strongly recommended to remove it'
sudo apt-get purge nvidia-current
sudo apt-get remove --purge 'nvidia-*'
this may also help
sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl
but in my case the cuda 10.1 and 10.2 dirs have no bin dir, and
/usr/bin/nvidia-uninstall doesn't exist
on running the install I get
jeremy@jeremy-Blade:~/Downloads$ sudo sh cuda_10.2.89_440.33.01_linux.run.1
Installation failed. See log at /var/log/cuda-installer.log for details.
jeremy@jeremy-Blade:~/Downloads$ more /var/log/cuda-installer.log
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Driver
[INFO]: 440.33.01
[INFO]: Executing NVIDIA-Linux-x86_64-440.33.01.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd 2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 440.33.01 failed, quitting
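The installer logs are long, so it can help to pull out only the error lines. A minimal sketch, assuming error lines are marked with "[ERROR]" or "ERROR:" as in the excerpts quoted in these notes; the function name installer_errors is made up for illustration.

```shell
# Print only the error lines from an installer log read on stdin.
# Matches "[ERROR]: ..." and "ERROR: ..." style lines.
installer_errors() {
    grep -E '\[ERROR\]|ERROR:' || true   # "|| true": no match is not a failure
}

# Example (log path as quoted in these notes):
#   installer_errors < /var/log/cuda-installer.log
```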
dpkg -l | grep -i nvidia revealed
jeremy@jeremy-Blade:~$ dpkg -l | grep -i nvidia
rc cuda-nvtx-10-1 10.1.243-1 amd64 NVIDIA Tools Extension
rc cuda-nvtx-10-2 10.2.89-1 amd64 NVIDIA Tools Extension
rc libnvidia-compute-415:amd64 415.27-0ubuntu0~gpu18.04.2 amd64 NVIDIA libcompute package
rc libnvidia-compute-418:amd64 418.87.01-0ubuntu1 amd64 NVIDIA libcompute package
rc libnvidia-compute-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA libcompute package
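In dpkg -l output the "rc" prefix means the package was removed but its config files remain, which is why these entries still show up after a plain remove. A small sketch for listing them so they can be purged; rc_packages is a hypothetical helper name, and xargs -r assumes GNU xargs (as on Ubuntu).

```shell
# List packages that dpkg reports in state "rc" (removed, config files remain).
# Reads `dpkg -l` output on stdin and prints one package name per line.
rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Example: review the list first, then purge it.
#   dpkg -l | rc_packages
#   dpkg -l | rc_packages | xargs -r sudo dpkg --purge
```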
sudo apt-get remove --auto-remove nvidia-cuda-toolkit
didn't work - something else was using dpkg - so restarted and tried again
sudo apt-get purge nvidia-cuda-toolkit
or
sudo apt-get purge --auto-remove nvidia-cuda-toolkit
Additionally, delete the /opt/cuda and ~/NVIDIA_GPU_Computing_SDK folders if they are present, and remove the export PATH=$PATH:/opt/cuda/bin and export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/lib:/opt/cuda/lib64 lines from the ~/.bash_profile file.
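Editing the profile file by hand works; the same cleanup could also be scripted. A sketch, assuming every line that mentions /opt/cuda should go (check the file first if that is too broad); strip_opt_cuda is a name invented here.

```shell
# Remove every line that mentions /opt/cuda from a shell startup file,
# keeping a backup copy next to it with a .bak suffix.
strip_opt_cuda() {
    sed -i.bak '/\/opt\/cuda/d' "$1"
}

# Example:
#   strip_opt_cuda ~/.bash_profile
```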
this one removed a lot
sudo apt-get --purge remove 'cuda*'
sudo apt-get --purge -y remove 'nvidia*'
sudo reboot
check /var/log/cuda-installer.log
if that shows problems with the driver then check
/var/log/nvidia-installer.log
which shows
ERROR: You appear to be running an X server; please exit X before installing. For further details, please see the section INSTALLING THE NVIDIA DRIVER in the README available on the Linux driver download page at www.nvidia.com.
so drop out to a shell (alt-f4 on my ubuntu 18.04) and try
sudo service lightdm stop
unable to load nvidia.drm, also a nouveau complaint.
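The display manager is not always lightdm (stock Ubuntu 18.04 ships gdm3), so stopping lightdm may do nothing. A sketch for looking up which one is configured, assuming the Debian/Ubuntu file /etc/X11/default-display-manager exists and that the service name matches the binary name; display_manager is a hypothetical helper.

```shell
# Print the display manager's name, taken from the Debian/Ubuntu file that
# records the default display manager (for example /usr/sbin/gdm3).
# An alternative file can be passed as $1.
display_manager() {
    basename "$(cat "${1:-/etc/X11/default-display-manager}")"
}

# Example (assumes the service is named after the binary):
#   sudo service "$(display_manager)" stop
```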
it turns out what I want is the .deb or package file and not the runfile ("It is recommended to use the distribution-specific packages, where possible.")
sudo nano /etc/profile
export PATH=/usr/local/cuda-10.2/bin:/usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
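The ${VAR:+:${VAR}} form in those exports appends ":old value" only when the variable is already non-empty, which avoids a trailing colon (an empty entry is treated as the current directory). A small sketch of the same idiom as a function; prepend_dir is a name made up for illustration.

```shell
# Prepend a directory to a colon-separated list without leaving a stray
# trailing colon when the list is empty or unset.
prepend_dir() {
    # $1 = directory to add, $2 = current value of the list (may be empty)
    printf '%s\n' "$1${2:+:$2}"
}

# Example:
#   export PATH="$(prepend_dir /usr/local/cuda-10.2/bin "$PATH")"
```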
/usr/bin/nvidia-persistenced --verbose
check /var/crash, I found an error where the module could not be built
ake[1]-***-no-rule-to-make-target-`modules'-stop-571
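One possible cause of a "no rule to make target modules" failure (an assumption, not confirmed by these notes) is that the headers for the running kernel are not installed, so the driver module has nothing to build against. A sketch for checking, assuming Ubuntu's linux-headers-<release> package naming; headers_package is a hypothetical helper.

```shell
# Print the name of the Ubuntu kernel-headers package for a kernel release.
# With no argument, use the running kernel (uname -r).
headers_package() {
    printf 'linux-headers-%s\n' "${1:-$(uname -r)}"
}

# Example: install the headers only if dpkg does not know the package.
#   dpkg -s "$(headers_package)" >/dev/null 2>&1 || sudo apt-get install "$(headers_package)"
```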
I tried
jeremy@jeremy-Blade:~$ ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/lib/x86_64-linux-gnu/libnvidia-ml.so
but nvidia-smi still gives me
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system
and nvidia-settings gives me
ERROR: Unable to load info from any available system
INFO
jeremy@jeremy-Blade:~$ lspci |grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GP106M [GeForce GTX 1060 Mobile] (rev a1)
jeremy@jeremy-Blade:~$ uname -a
Linux jeremy-Blade 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
jeremy@jeremy-Blade:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
sudo ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/lib/x86_64-linux-gnu/libnvidia-ml.so
and
sudo ln -s /usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/local/cuda/lib64/libnvidia-ml.so
didn't help
finally solved by installing purely through the package manager, no downloads
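The files under lib/stubs are link-time placeholders meant for building without a driver, so a symlink to the stub is unlikely to satisfy nvidia-smi; the working library normally comes with the driver packages as libnvidia-ml.so.1. A sketch for checking whether the linker cache knows about it; has_driver_nvml is a name invented here.

```shell
# Report whether the dynamic linker cache lists libnvidia-ml.so.1.
# Reads `ldconfig -p` output on stdin; exit status 0 means it was found.
has_driver_nvml() {
    grep -q 'libnvidia-ml\.so\.1'
}

# Example:
#   ldconfig -p | has_driver_nvml && echo "driver NVML present" || echo "driver NVML missing"
```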
CUDNN
Check where your CUDA installation is. For an installation from the repository it is /usr/lib/... and /usr/include. Otherwise, it will be /usr/local/cuda/ or /usr/local/cuda-<version>. You can check it with which nvcc or ldconfig -p | grep cuda
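The which nvcc check can be turned into the install prefix directly. A sketch, assuming nvcc sits in <prefix>/bin/nvcc; cuda_prefix is a hypothetical helper name.

```shell
# Given the path of an nvcc binary, print the installation prefix two
# levels up (<prefix>/bin/nvcc -> <prefix>).
cuda_prefix() {
    dirname "$(dirname "$1")"
}

# Example (readlink -f resolves symlinks such as /usr/local/cuda):
#   cuda_prefix "$(readlink -f "$(which nvcc)")"
```

A result of /usr points at the repository layout; anything under /usr/local points at the other one.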
Copy the files:
Repository installation:
$ cd folder/extracted/contents
$ sudo cp -P include/cudnn.h /usr/include
$ sudo cp -P lib64/libcudnn* /usr/lib/x86_64-linux-gnu/
$ sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcudnn*
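After copying, the installed cuDNN version can be read back from the header. A sketch, assuming cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (true for cuDNN 7; newer releases keep them in cudnn_version.h instead); cudnn_version is a name made up for illustration.

```shell
# Print the cuDNN version (major.minor.patch) found in a header file.
# $1 = path to the header that carries the CUDNN_* version defines.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

# Example:
#   cudnn_version /usr/include/cudnn.h
```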
@jakpiase

Thank you for sharing these instructions, they have helped me a lot with an issue related to nvidia!

@jeremy-rutman
Author

Glad it was of help to you