OS: CentOS 6.8 (no root access)
GCC: locally installed 5.2.0 (Cluster default is 4.4.7)
Bazel: 0.4.0-2016-11-06 (@fa407e5)
TensorFlow: v0.11.0rc2
CUDA: 8.0
#!/usr/bin/env python
# Benchmark transferring data, part of troubleshooting https://github.com/tensorflow/tensorflow/issues/6116
#
# Take a independent workers communicating with b parameter shards
# Each worker tries to add to variables stored on parameter server as fast as
# possible.
#
# macbook
# ps=1: 1.6 GB/s
# ps=2: 2.6 GB/s
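The GB/s figures in the header above are total bytes moved divided by wall-clock time. The following is a minimal sketch of that calculation only, not the TensorFlow benchmark from the gist: it assumes worker threads in a single process that copy a payload into lock-protected `bytearray` shards standing in for parameter-server variables, and the copy replaces the "add" step. The names `run_benchmark`, `num_workers`, `num_shards`, `payload_bytes`, and `iters` are hypothetical.

```python
import threading
import time


def run_benchmark(num_workers=2, num_shards=2, payload_bytes=1 << 20, iters=20):
    """Return (total_bytes, elapsed_seconds, gb_per_sec) for an in-process copy test.

    Assumption: threads and bytearrays stand in for workers and parameter
    shards; no TensorFlow or network transfer is involved.
    """
    shards = [bytearray(payload_bytes) for _ in range(num_shards)]
    locks = [threading.Lock() for _ in range(num_shards)]
    payload = bytes(payload_bytes)

    def worker():
        for _ in range(iters):
            for shard, lock in zip(shards, locks):
                with lock:
                    # Stand-in for one update sent to a parameter shard.
                    shard[:] = payload

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    # Every worker writes the full payload to every shard on every iteration.
    total_bytes = num_workers * num_shards * payload_bytes * iters
    return total_bytes, elapsed, total_bytes / elapsed / 1e9


if __name__ == "__main__":
    total, elapsed, rate = run_benchmark()
    print("moved %.1f MB in %.3f s: %.2f GB/s" % (total / 1e6, elapsed, rate))
```

The rate printed by this sketch reflects memory-copy speed within one process, so it is not comparable to the macbook numbers in the header, which come from the actual distributed run.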
"""Example of barrier implementation using TensorFlow shared variables.
All workers synchronize on barrier, copy global parameters to local versions
and increment global parameter variable asynchronously. Should see something
like this:
bash> killall python
bash> python simple_barrier.py --num_workers=4
worker 0, local_param 4 global_param 5
worker 2, local_param 4 global_param 7