OS: CentOS 6.8 (no root access)
GCC: locally installed 5.2.0 (cluster default is 4.4.7)
Bazel: 0.4.0-2016-11-06 (@fa407e5)
TensorFlow: v0.11.0rc2
CUDA: 8.0
cuDNN: 5.1.5
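Before anything else, here is a minimal sketch of how my shell was set up so the local GCC 5.2.0 (not the cluster's 4.4.7) gets picked up. /opt/gcc/5.2.0 is my local copy; if your site ships GCC as a module, `module load` may do the same job:

```bash
# Illustrative only: put the local GCC 5.2.0 ahead of the cluster default
# and make its libstdc++ visible at run time.
export PATH=/opt/gcc/5.2.0/bin:$PATH
export LD_LIBRARY_PATH=/opt/gcc/5.2.0/lib64:$LD_LIBRARY_PATH
```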
You should be able to modify the script (buildtf.sh) below to do these steps automatically, but I list out the details here as well.
Follow this Tutorial, or download your preferred version of JDK 8 and set the proper environment variables as described in the tutorial.
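In the no-root case, "set the proper environment variables" boils down to something like this (the JDK path is just an example of wherever you unpacked it):

```bash
# Illustrative: point the shell (and bazel) at a locally unpacked JDK 8.
export JAVA_HOME=$HOME/jdk1.8.0_112   # example path -- substitute your unpack location
export PATH=$JAVA_HOME/bin:$PATH
java -version                         # should report a 1.8.x JDK
```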
Great Tutorial that got me to the error below!
Note: After changing the linker line to point at your local or module GCC, you may get errors about finding ld or other executables that live in /usr/bin. Here is the workaround I used (it isn't pretty and you might not need it, but just in case):
- Copy your compiler directory (/opt/gcc/5.2.0) to a local directory that you have permission to modify.
- Then run:
cp `which ld` /opt/gcc/5.2.0/bin/ld
(repeat for any command listed in the crosstool that doesn't already reside in your GCC bin directory; a scripted version is sketched below)
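Here is roughly what that repetition looks like scripted; the tool list only contains the two I actually needed (ld, and later as), so extend it with whatever the crosstool complains about:

```bash
# Illustrative workaround: copy system binutils that the crosstool expects
# into the local GCC bin directory, but only if they are not already there.
GCC_BIN=/opt/gcc/5.2.0/bin        # your local GCC copy
for tool in ld as; do             # extend this list as errors come up
  if [ ! -e "$GCC_BIN/$tool" ]; then
    cp "$(which $tool)" "$GCC_BIN/$tool"
  fi
done
```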
Note 2: I downloaded a newer release of Bazel and TensorFlow (as noted above), and the latest versions of the crosstool require fewer changes than described in the tutorial.
- Modify /tensorflow/third_party/gpus/crosstool/CROSSTOOL.tpl as described in the tutorial above.
- Modify /tensorflow/third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl as described in the tutorial above. I did not change the first line, #!/usr/bin/env python (but the tutorial does!).
Again, these steps led to the error below, which took me forever to get past:
GLIBCXX_3.4.18 not found error
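For context, this error usually means that a binary built during the bazel run (protoc, in my case) is picking up the old system libstdc++ from GCC 4.4.7 at run time instead of the one from GCC 5.2.0. A quick way to check which library provides which symbol versions (the second path is my local install, adjust as needed):

```bash
# The CentOS 6 system libstdc++ (GCC 4.4.7) tops out around GLIBCXX_3.4.13;
# the GCC 5.2.0 copy should list GLIBCXX_3.4.18 and newer.
strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX
strings /opt/gcc/5.2.0/lib64/libstdc++.so.6 | grep GLIBCXX
```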
As described in gbkedar's comment from Jul 12, you have to find this file:
$INSTALL_PATH/tensorflow/bazel-tensorflow/external/protobuf/protobuf.bzl
However, until the compile fails, this file is harder to find, because it is the failed build that creates the bazel-tensorflow symlink in the /tensorflow directory. (The buildtf.sh script modifies the file after the first failure and then re-runs the compile.) I was running into issues re-attempting the compile and had to run ./configure almost every time, so I had to find this file before my first failed compile attempt. After running ./configure from the /tensorflow directory, the file should be located somewhere similar to this:
~/.cache/bazel/_bazel_YOURUSERNAME/YOURHASH (e.g. f81f1107f96c7515450fc43e0dbb6ed5)/external/protobuf/protobuf.bzl
If you have several hashes, check the files that were modified at the time corresponding to your ./configure
run.
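A quick way to track the file down, assuming bazel's default output base under ~/.cache/bazel:

```bash
# List candidate copies of protobuf.bzl, newest first, so you can pick the
# hash directory whose timestamp matches your last ./configure run.
ls -lt ~/.cache/bazel/_bazel_$USER/*/external/protobuf/protobuf.bzl
# or, more generally:
find ~/.cache/bazel/_bazel_$USER -name protobuf.bzl 2>/dev/null
```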
As described in the error link above, search for ctx.action and add env=ctx.configuration.default_shell_env at the bottom of the call, like so (this passes your shell environment, including LD_LIBRARY_PATH, through to the protoc actions so they can find the newer libstdc++):
if args:
  ctx.action(
      inputs=inputs,
      outputs=ctx.outputs.outs,
      arguments=args + import_flags + [s.path for s in srcs],
      executable=ctx.executable.protoc,
      mnemonic="ProtoCompile",
      env=ctx.configuration.default_shell_env,
  )
You will then likely hit the error trying to exec 'as': execvp: No such file or directory. Since I am a self-confessed Linux noob, you have to use the few tricks you know as much as possible (I didn't follow gbkedar's 2nd comment):
cp `which as` /opt/gcc/5.2.0/bin/as
After this change, TensorFlow finally compiled successfully for me!
Going back to our tutorial, I ran this command:
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
and received the bdist_wheel not found error. I solved this by using pip to install a new version of wheel locally:
pip install --target=/home/thpaul/python27-packages wheel
and then added that directory to my $PYTHONPATH variable:
export PYTHONPATH=/home/thpaul/python27-packages/:$PYTHONPATH
Re-running the command builds the proper .whl file, which you can then install via pip (for example, as shown below).
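For completeness, the final step looks something like this; the exact wheel filename depends on the TensorFlow version and your Python/platform, and `--user` is just one way to install without root (a `--target` directory on PYTHONPATH, as above, works too):

```bash
# Rebuild the pip package and install the resulting wheel without root.
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install --user /tmp/tensorflow_pkg/tensorflow-*.whl
```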
Hope this helps anyone trying to compile TensorFlow from source!
@mrdivine Sorry for taking so long to get back! I asked the maintainers of my cluster to install CUDA 8.0; they were willing since we already had 7.5 installed. That said, I am pretty sure this should work with older versions of CUDA. You just have to provide the location of your CUDA libraries any time the TensorFlow install asks for it (see the sketch below), and be sure to install cuDNN locally following the link at the top of the script.
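For what it's worth, the ./configure script also reads its answers from environment variables, so you can point it at non-standard CUDA/cuDNN locations ahead of time. The variable names below should match what the v0.11-era configure script reads, but double-check them against your checkout, and treat the paths as placeholders:

```bash
# Illustrative, not a definitive recipe: pre-answer the CUDA-related
# configure prompts via environment variables.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda-8.0   # wherever the cluster installed CUDA
export TF_CUDNN_VERSION=5
export CUDNN_INSTALL_PATH=$HOME/cudnn          # your local cuDNN unpack location
./configure
```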