Installing any version of CUDA on Ubuntu and installing GPU versions of both torch and tensorflow

This guide contains steps for installing CUDA. It also shows how to install TensorFlow and PyTorch and use GPUs with them. Our goal is to get output like that in the image below,

where nvidia-smi, nvcc --version, torch.cuda.is_available(), and tf.test.gpu_device_name() all work. Let's go then. I will specifically install CUDA 10.1, with which both TensorFlow and PyTorch work fine.

1. Prerequisite instructions

1.1.a Verify You Have a CUDA-Capable GPU

      lspci | grep -i nvidia

This lists the NVIDIA GPUs in your system. Look your GPU up on https://developer.nvidia.com/cuda-gpus. If it is listed there, your GPU is CUDA-capable.
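The same check can be scripted. The sketch below is only an illustration: the helper names `find_nvidia_devices` and `run_lspci` are made up for this example, and it assumes `lspci` (from the pciutils package) is on the PATH.

```python
import subprocess
from typing import List


def find_nvidia_devices(lspci_output: str) -> List[str]:
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "nvidia" in line.lower()
    ]


def run_lspci() -> str:
    """Run lspci and return its output, or "" if lspci cannot be started."""
    try:
        result = subprocess.run(
            ["lspci"], capture_output=True, text=True, check=False
        )
    except OSError:
        # lspci is missing or not executable on this machine.
        return ""
    return result.stdout


if __name__ == "__main__":
    devices = find_nvidia_devices(run_lspci())
    if devices:
        print("\n".join(devices))
    else:
        print("No NVIDIA device reported by lspci.")
```

An empty result only means lspci did not report an NVIDIA device; you still need to look the model up on the NVIDIA page to confirm CUDA support.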

1.1.b There are many Linux distributions, and CUDA is supported on only a few of them. To determine which distribution and release number you're running, type the following at the command line:

     uname -m && cat /etc/*release
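If you prefer to read these values from a script, here is a minimal sketch. It assumes `/etc/os-release` (one of the files matched by `/etc/*release` on Ubuntu) uses the usual `KEY=value` layout; `parse_os_release` is a name invented for this example.

```python
import platform
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines (os-release style) into a dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


if __name__ == "__main__":
    print("Architecture:", platform.machine())  # same value as `uname -m`
    try:
        with open("/etc/os-release", encoding="utf-8") as handle:
            release = parse_os_release(handle.read())
    except OSError:
        release = {}
    print("Distribution:", release.get("ID", "unknown"))
    print("Release:", release.get("VERSION_ID", "unknown"))
```

On the Ubuntu 18.04 machine this guide targets, you would expect the architecture to be x86_64 and the release to be 18.04, which matches the `ubuntu1804/x86_64` repository used later.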

1.1.c Verify that the system has gcc installed. It usually comes preinstalled with the operating system; check with gcc --version. If it's not installed, use:

  sudo apt-get update
  sudo apt install build-essential
  sudo apt-get install manpages-dev

1.1.d Install the correct kernel headers

     sudo apt-get install linux-headers-$(uname -r)
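Steps 1.1.c and 1.1.d can be double-checked with a short script. This is a sketch under one assumption: that the headers package unpacks into `/usr/src/linux-headers-<release>`, which is the usual Ubuntu layout. The function names are hypothetical.

```python
import os
import platform
import shutil


def gcc_installed() -> bool:
    """True if a gcc executable is found on PATH."""
    return shutil.which("gcc") is not None


def headers_dir(kernel_release: str) -> str:
    """Directory where Ubuntu's linux-headers-<release> package is assumed to live."""
    return "/usr/src/linux-headers-" + kernel_release


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    print("gcc found:", gcc_installed())
    print("kernel headers found:", os.path.isdir(headers_dir(release)))
```

If either line prints False, rerun the corresponding apt command above before moving on.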

2. Installing the CUDA toolkit

You can find documentation for any CUDA version at https://developer.nvidia.com/cuda-toolkit-archive. I am specifically doing this for 10.1.

   wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.105-1_amd64.deb

  1. sudo dpkg -i cuda-repo-ubuntu1804_10.1.105-1_amd64.deb
  2. sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
  3. sudo apt-get update

The next line is tricky and confuses most people. It installs the most recent CUDA version, so don't use it unless you want the latest version of CUDA:
sudo apt-get install cuda (don't use this; use the one below)

 4. sudo apt-get install cuda-10-1
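The versioned package name follows the pattern `cuda-<major>-<minor>`, which is how 10.1 becomes `cuda-10-1`. A small sketch of that mapping follows; `cuda_apt_package` is a made-up helper, and whether a given version is actually available depends on the repository you added.

```python
import re


def cuda_apt_package(version: str) -> str:
    """Map a CUDA version such as "10.1" to the apt package name "cuda-10-1"."""
    match = re.fullmatch(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError("expected a version like '10.1', got %r" % version)
    major, minor = match.groups()
    return "cuda-%s-%s" % (major, minor)


if __name__ == "__main__":
    # Print the install command for the version used in this guide.
    print("sudo apt-get install", cuda_apt_package("10.1"))
```

For a different CUDA version, change the version string here and the version number in the paths used in the following steps.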

If you run nvidia-smi or nvcc --version now, they may not work, because the CUDA directories are yet to be added to your PATH. Add these lines to ~/.bashrc:

  export PATH="/usr/local/cuda-10.1/bin:$PATH"
  export LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"

After that, run:

  source ~/.bashrc
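To confirm that the two exports took effect in the current shell, you can inspect the environment from Python. This is a minimal sketch; `dir_on_path` is a name invented here, and the directories are the CUDA 10.1 ones from the exports above.

```python
import os


def dir_on_path(directory: str, path_value: str) -> bool:
    """True if directory is one of the entries of a colon-separated path string."""
    wanted = directory.rstrip("/")
    entries = [entry.rstrip("/") for entry in path_value.split(":") if entry]
    return wanted in entries


if __name__ == "__main__":
    checks = {
        "PATH": "/usr/local/cuda-10.1/bin",
        "LD_LIBRARY_PATH": "/usr/local/cuda-10.1/lib64",
    }
    for variable, directory in checks.items():
        found = dir_on_path(directory, os.environ.get(variable, ""))
        print("%s contains %s: %s" % (variable, directory, found))
```

Run it from the same terminal in which you sourced the file, since a new environment variable is only visible to processes started after it was set.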

If nvidia-smi or nvcc --version still don't work, try rebooting the system. They should work after that.
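Once nvcc runs, it is worth confirming that it reports the version you meant to install. The sketch below assumes the output of `nvcc --version` contains a line of the form `Cuda compilation tools, release 10.1, ...`; the function names are hypothetical.

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Extract the release number (e.g. "10.1") from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Run nvcc and return its release number, or None if nvcc is unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # nvcc is not on PATH (or cannot be executed).
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found; check PATH or reboot.")
    else:
        print("CUDA toolkit release:", release)
```

Note that nvidia-smi may show a different CUDA version in its header, because it reports what the driver supports rather than which toolkit is installed.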

We need to install PyTorch and TensorFlow now. We can use pip or a conda environment. If you don't want to install cuDNN manually, it's better to use Anaconda. It installs cuDNN automatically and saves a lot of hassle.

Install Anaconda now. I specifically followed this link: https://www.digitalocean.com/community/tutorials/how-to-install-anaconda-on-ubuntu-18-04-quickstart

Now create an environment with conda and activate it:

  conda create --name my_env python=3
  conda activate my_env
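A quick way to confirm that the environment is active is to look at the `CONDA_DEFAULT_ENV` variable, which `conda activate` sets. A minimal sketch, with `active_conda_env` as an illustrative helper name:

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the active conda environment name, or None if none is active."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    name = active_conda_env()
    if name is None:
        print("No conda environment is active.")
    else:
        print("Active conda environment:", name)
```

With the commands above, this should print my_env; if it prints base or nothing, the packages in the next step would be installed into the wrong place.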

Inside this environment, run:

  conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
  conda install tensorflow-gpu

This installs PyTorch and TensorFlow. To check that they are working, open python in a terminal and import tensorflow (as tf) and torch. Then run:

  torch.cuda.is_available()

If it returns True, torch is able to use the GPU. Now check TensorFlow. If the command below shows a GPU name, TensorFlow is working with the GPU as well.

  tf.test.gpu_device_name()
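Both checks can be combined into one script that you run inside the conda environment. This is a sketch rather than a definitive test: `gpu_report` is a made-up name, and a framework that is not installed is reported as None instead of raising an error.

```python
from typing import Dict, Optional


def gpu_report() -> Dict[str, Optional[bool]]:
    """Report GPU visibility per framework; None means it is not installed."""
    report: Dict[str, Optional[bool]] = {"torch": None, "tensorflow": None}

    try:
        import torch
        report["torch"] = bool(torch.cuda.is_available())
    except ImportError:
        pass  # PyTorch is not installed in this environment.

    try:
        import tensorflow as tf
        # gpu_device_name() returns an empty string when no GPU is found.
        report["tensorflow"] = tf.test.gpu_device_name() != ""
    except ImportError:
        pass  # TensorFlow is not installed in this environment.

    return report


if __name__ == "__main__":
    for framework, status in gpu_report().items():
        if status is None:
            print(framework + ": not installed")
        else:
            print(framework + ": GPU available = " + str(status))
```

If both lines report True, the setup matches the goal stated at the top of this guide.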

Shivampanwar/Install CUDA pytorch tensorflow.md

Shivampanwar commented Oct 21, 2020