The pre-built TensorFlow library packaged with tf-node-gpu is built for GPUs with a compute capability of 6.0 or higher. If you have an older GPU with compute capability below 6.0, TensorFlow will ignore it and print the warning below:
tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Ignoring visible gpu device (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0) with Cuda compute capability 5.0. The minimum required Cuda capability is 6.0.
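To check a log for this situation, the comparison behind the warning can be sketched in a few lines of shell. The helper names `cc_from_warning` and `cc_below_minimum` are made up for illustration and are not part of TensorFlow; the 6.0 default is simply the minimum quoted in the warning above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of TensorFlow.

# Read log text on stdin and print the capability reported after
# "compute capability: " (5.0 for the warning above).
cc_from_warning() {
  sed -n 's/.*compute capability: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed (exit 0) when capability $1 is below the minimum $2 (default 6.0).
cc_below_minimum() {
  awk -v cc="$1" -v min="${2:-6.0}" 'BEGIN { exit !(cc + 0 < min + 0) }'
}
```

Piping the warning line into `cc_from_warning` and passing the result to `cc_below_minimum` tells you whether a rebuild is needed at all.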
related issues:
https://github.com/tensorflow/tensorflow/issues/38971
Follow the instructions below (work in progress) to build TensorFlow for your GPU's compute capability.
Versions used:
tensorflow_gpu-1.15.0, Python 3.7, GCC 7.3.1, Bazel 0.26.1, cuDNN 7.4, CUDA 10.0
The CUDA compute capability can be set in the Dockerfile. It is set to 5.0 for the GPU I'm using (GeForce 940MX).
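As a rough sketch of what that setting usually amounts to: TensorFlow's `./configure` script reads its answers from environment variables, so the capability can be pinned before the build starts. The values mirror the versions listed in these notes. Whether the Dockerfile here sets exactly these variables is an assumption, and in a Dockerfile each `export` would be written as an `ENV` line instead.

```shell
# Assumed configure inputs for a CUDA build of TensorFlow 1.15;
# adjust the capability to match your own GPU.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=10.0
export TF_CUDNN_VERSION=7.4
export TF_CUDA_COMPUTE_CAPABILITIES=5.0   # GeForce 940MX

# With these set, ./configure in the TensorFlow source tree should not
# need to ask the matching interactive questions.
echo "Building for compute capability ${TF_CUDA_COMPUTE_CAPABILITIES}"
```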
- (Optional) Increase the swap file to 16 GB [3,4].
- Run the command below:
docker build -t mytensorflow .
- Or go into the container and run the build there:
docker run --gpus all -it -w /tensorflow -v $PWD:/mnt mytensorflow bash
bazel build --config=opt --config=cuda //tensorflow:libtensorflow.so
- Go for a surf, watch a movie, or clean the house, since the build above will take a long time, maybe 4+ hours [10].
- Expect to update the Dockerfile, rinse, and repeat. At least we are using Docker these days.
- Expect to abandon the attempt above and just go buy a GPU with a compute capability of 6.0 :D
- Copy the built files out:
docker run --gpus all -it -w /tensorflow -v $PWD:/mnt mytensorflow bash
cp bazel-bin/tensorflow/libtensorflow.so.1.15.0 /mnt
cp bazel-bin/tensorflow/libtensorflow_framework.so.1.15.0 /mnt
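For the optional swap step at the top of this list, a minimal sketch along the lines of [4] could look like the following. The function names and the `/swapfile` path are assumptions. `grow_swapfile` needs root and is only defined here, not called.

```shell
#!/usr/bin/env bash
# Print total swap in whole GiB from a /proc/meminfo-style file.
swap_total_gib() {
  awk '/^SwapTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Recreate the swap file at the given size (default 16G).
# Assumes the swap file lives at /swapfile and that the filesystem
# supports fallocate-backed swap files.
grow_swapfile() {
  local size="${1:-16G}" path="${2:-/swapfile}"
  sudo swapoff "$path" 2>/dev/null || true   # ignore "not in use" errors
  sudo fallocate -l "$size" "$path"
  sudo chmod 600 "$path"
  sudo mkswap "$path"
  sudo swapon "$path"
}
```

Running `swap_total_gib` before and after shows whether the change took effect.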
[1] https://www.tensorflow.org/install/source#tested_build_configurations
[2] https://stackoverflow.com/questions/9727688/how-to-get-the-cuda-version
[3] tensorflow/tensorflow#25965
[4] https://askubuntu.com/questions/1075505/how-do-i-increase-swapfile-in-ubuntu-18-04
[5] https://gist.github.com/yochze/3898e1405bb3a024acfb9bb9eef132c3
[6] tensorflow/tensorflow#21531
[7] https://launchpad.net/~jonathonf/+archive/ubuntu/python-3.6
[9] tensorflow/tensorflow#25865
[10] https://stackoverflow.com/questions/54541969/tensorflow-compile-runs-forever
[11] https://gist.github.com/Brainiarc7/6d6c3f23ea057775b72c52817759b25c
- Started with the Dockerfile from [5].
- ppa:jonathonf/python-3.6 is no longer available [7], so I swapped Python 3.6 for 3.7 and changed the TensorFlow version.
- Got multiple errors [6] due to mismatched versions of GCC/CUDA/TensorFlow.
- Finally decided to go with the official build instructions [1,8], using the version guideline below:
| Version | Python version | Compiler | Build tools | cuDNN | CUDA |
|---|---|---|---|---|---|
| tensorflow_gpu-1.13.1 | 2.7, 3.3-3.7 | GCC 4.8 | Bazel 0.19.2 | 7.4 | 10.0 |
I used GCC 7.x instead of GCC 4.8, only because I'm lazy.
- At first I did not build with docker build, since we want GPU access during compilation. Instead I followed [1] by first going into the Docker container via docker run --gpus all ... and then running bazel build ...
- (It turns out you can just use docker build, and there is no need to use docker run to build; see the last bullet point below.)
- The final Dockerfile and instructions are a blend of [1,5,9,11].
- Attempted the build within docker build (not using 'docker run') and switched TF to v1.15.0. Finally got TF to build successfully.
A tip for those who must go down this path: go with the tested configurations/versions listed in the "tested_build_configurations" section linked below.
https://www.tensorflow.org/install/source#tested_build_configurations
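One way to keep yourself honest about this is to write the known-good combinations down next to the build. The sketch below records only the two combinations that appear in these notes (the 1.13.1 guideline row and the versions used for the 1.15.0 build); `tested_config` is a hypothetical helper name, and any other version should be looked up in the table linked above.

```shell
#!/usr/bin/env bash
# Hypothetical lookup of the toolchain combinations mentioned in these notes.
tested_config() {
  case "$1" in
    1.13.1) echo "gcc=4.8 bazel=0.19.2 cudnn=7.4 cuda=10.0" ;;
    1.15.0) echo "gcc=7.3.1 bazel=0.26.1 cudnn=7.4 cuda=10.0" ;;
    *)
      echo "unknown TensorFlow version: $1 (check the tested configurations table)" >&2
      return 1
      ;;
  esac
}
```

Calling `tested_config 1.15.0` before editing the Dockerfile gives a quick reminder of which GCC, Bazel, cuDNN, and CUDA versions to pin.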