@manisnesan
Last active November 17, 2020 03:04
Test if tensorflow is detecting GPU

Environment: CUDA 10.2, TensorFlow 2.2.0

$ nvidia-smi 
Mon Nov 16 11:59:35 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   54C    P0    40W / 180W |      0MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

Check if tensorflow can detect the GPU

>>> import tensorflow as tf;print(tf.__version__)
2.2.0

>>> tf.test.gpu_device_name()
''

>>> from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())
2020-11-16 11:51:17.480346: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-11-16 11:51:17.498691: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3999980000 Hz
2020-11-16 11:51:17.499204: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b39f4a2b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-16 11:51:17.499232: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5465609901031465818
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 15084457781891957405
physical_device_desc: "device: XLA_CPU device"
]
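
The device list above shows only CPU entries. A minimal sketch of a follow-up check, assuming a TensorFlow 2.1+ install: it reports whether the installed binary was built with CUDA at all, alongside the GPUs it can see. The helper name gpu_report is hypothetical; tf.test.is_built_with_cuda and tf.config.list_physical_devices are the TensorFlow calls it relies on.

```python
def gpu_report():
    """Summarise what TensorFlow can see, or return None if TensorFlow is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    return {
        "version": tf.__version__,
        # False means a CPU-only binary: no CUDA install will make it see a GPU.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(gpu_report())
```

If built_with_cuda is False, the package itself is the likely problem; if it is True and the GPU list is empty, the CUDA libraries are the next thing to look at.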

Run the python program

$ python en_rcnn.py 
Num GPUs Available:  0
en_rcnn.py:33: DeprecationWarning: Call to deprecated `syn0` (Attribute will be removed in 4.0.0, use self.vectors instead).
  MAX_TOKENS = word2vec.wv.syn0.shape[0]
en_rcnn.py:47: DeprecationWarning: Call to deprecated `syn0` (Attribute will be removed in 4.0.0, use self.vectors instead).
  embedding_dim = word2vec.wv.syn0.shape[1]
en_rcnn.py:48: DeprecationWarning: Call to deprecated `syn0` (Attribute will be removed in 4.0.0, use self.vectors instead).
  embeddings = np.zeros((MAX_TOKENS + 2, word2vec.wv.syn0.shape[1]), dtype = "float32")
en_rcnn.py:49: DeprecationWarning: Call to deprecated `syn0` (Attribute will be removed in 4.0.0, use self.vectors instead).
  embeddings[:MAX_TOKENS] = word2vec.wv.syn0
2020-11-16 11:45:06.600915: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-11-16 11:45:06.615834: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3999980000 Hz
2020-11-16 11:45:06.616030: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b31a39e570 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-16 11:45:06.616045: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-11-16 11:45:06.616104: I tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Epoch 1/10

manisnesan commented Nov 16, 2020

For TensorFlow releases 1.15 and older, the CPU and GPU packages are separate (tensorflow and tensorflow-gpu). Newer releases ship a single package.

Source: https://www.tensorflow.org/install/gpu

Even so, I tried 'pip install tensorflow-gpu', which installed TensorFlow 2.3.1. The GPU is now found, but it is not registered because libcudart.so.10.1 and libcudnn.so.7 cannot be loaded:

(sbr) msivanes@deepshadow ~ $ python
Python 3.7.7 (default, May  7 2020, 21:25:33) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))
2020-11-16 15:01:32.074963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-16 15:01:32.450275: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 15:01:32.450592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 298.32GiB/s
2020-11-16 15:01:32.450676: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64/
2020-11-16 15:01:32.451762: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-16 15:01:32.452816: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-16 15:01:32.452979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-16 15:01:32.454127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-16 15:01:32.454770: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-16 15:01:32.454841: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64/
2020-11-16 15:01:32.454850: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
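
The two warnings that matter are easy to lose in the log above. As a rough sketch, the missing libraries can be pulled out with a regular expression; the function name missing_libraries is hypothetical, and the sample text is an abbreviated copy of three lines from the log, not the full output.

```python
import re

# Matches the dso_loader warning format seen in the log above.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the libraries TensorFlow failed to open, in order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Abbreviated copies of log lines (timestamps and paths trimmed).
sample = """\
W dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file
I dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
W dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file
"""

print(missing_libraries(sample))  # ['libcudart.so.10.1', 'libcudnn.so.7']
```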


manisnesan commented Nov 16, 2020

After speaking with Subin: CUDA 10.2 is not supported by this TensorFlow build, which looks for libcudart.so.10.1. So the next step is to downgrade CUDA to 10.1.
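
To make the version mismatch concrete, here is a small lookup sketch. The pairs are an assumption, recalled from the "tested build configurations" table in the TensorFlow install docs for Linux GPU builds, and should be checked there before relying on them. The names TESTED_CUDA and expected_toolkit are hypothetical.

```python
# ASSUMPTION: pairs recalled from TensorFlow's tested build configurations
# (Linux, GPU). Verify against the install guide for your exact release.
TESTED_CUDA = {
    "2.1": ("10.1", "7.6"),
    "2.2": ("10.1", "7.6"),
    "2.3": ("10.1", "7.6"),
    "2.4": ("11.0", "8.0"),
}


def expected_toolkit(tf_version):
    """Return (cuda, cudnn) for a TensorFlow version string, or None if unknown."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_CUDA.get(major_minor)


print(expected_toolkit("2.3.1"))  # ('10.1', '7.6')
```

Under that assumption, both versions tried here (2.2.0 and 2.3.1) expect CUDA 10.1 with cuDNN 7.6, which matches the library names in the warnings.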

@manisnesan

$ sudo dnf downgrade cuda

Finally, cuDNN has to be installed from https://developer.nvidia.com/cudnn, which requires signing up for the NVIDIA Developer Program.
In my case, with CUDA 10.1, 'libcudnn.so.7' was missing, so I downloaded the rpm libcudnn7-7.6.5.32-1.cuda10.1.x86_64.rpm and installed it:

$ sudo rpm -ivh libcudnn7-7.6.5.32-1.cuda10.1.x86_64.rpm
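
Before starting TensorFlow again, it may be quicker to ask the dynamic loader directly whether the two libraries from the earlier warnings now resolve. A minimal sketch using only the standard library; the helper name can_load is hypothetical, and the library names are the ones from the warnings.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# The two libraries TensorFlow 2.3.1 reported as missing.
for lib in ("libcudart.so.10.1", "libcudnn.so.7"):
    print(lib, "OK" if can_load(lib) else "MISSING")
```

The loader honours LD_LIBRARY_PATH as it was set when Python started, so run this from the same shell and environment that TensorFlow will use.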


manisnesan commented Nov 17, 2020

Finally, I can confirm that TensorFlow detects the GPU: the call below returns '/device:GPU:0'.

>>> import tensorflow as tf;tf.test.gpu_device_name()
2020-11-16 21:55:26.955554: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-11-16 21:55:27.803889: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-16 21:55:27.819847: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3999980000 Hz
2020-11-16 21:55:27.820089: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55beab993ab0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-16 21:55:27.820105: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-11-16 21:55:27.821558: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-16 21:55:28.234244: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.235303: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bead6bb920 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-16 21:55:28.235349: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
2020-11-16 21:55:28.235760: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.236622: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 298.32GiB/s
2020-11-16 21:55:28.236686: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-11-16 21:55:28.240399: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-11-16 21:55:28.243517: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-11-16 21:55:28.244056: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-11-16 21:55:28.247681: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-11-16 21:55:28.249876: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-11-16 21:55:28.258614: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-11-16 21:55:28.258847: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.259788: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.260554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-11-16 21:55:28.260632: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-11-16 21:55:28.586010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-16 21:55:28.586039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-11-16 21:55:28.586046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
2020-11-16 21:55:28.586192: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.586510: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-16 21:55:28.586769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/device:GPU:0 with 7417 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
'/device:GPU:0'
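
Detection and actual use are separate things, so a short smoke test can help. The sketch below, assuming TensorFlow 2.x in eager mode, runs a small matrix multiplication pinned to the first GPU and returns the device the result lives on. The function name gpu_smoke_test is hypothetical.

```python
def gpu_smoke_test(size=256):
    """Run a small matmul on GPU:0 and return the result's device string.

    Returns None if TensorFlow is not installed or no GPU is visible.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if not tf.config.list_physical_devices("GPU"):
        return None
    with tf.device("/GPU:0"):
        a = tf.random.uniform((size, size))
        product = tf.matmul(a, a)
    return product.device


if __name__ == "__main__":
    print(gpu_smoke_test())
```

A device string containing GPU:0 suggests the op really ran on the card; None means either TensorFlow or the GPU was not available.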
