I forked the original convnet-benchmarks repository to my account.
The tests were run with a Theano branch that adds code to print extra information in DEBUG mode: Theano/Theano#6166
The tests were run for the 4 available networks (alexnet, googlenet, vgg, overfeat), each time with the following Theano flags:
THEANO_FLAGS=cmodule.debug=True,device=cuda,floatX=float32
# Each run changes the algo for all 3 computation types at once (fwd, bwd_filter, bwd_data).
# Algos tested: time_once, time_on_shape_change, guess_once, guess_on_shape_change.
THEANO_FLAGS=${THEANO_FLAGS},dnn.conv.algo_fwd=$algo
THEANO_FLAGS=${THEANO_FLAGS},dnn.conv.algo_bwd_filter=$algo
THEANO_FLAGS=${THEANO_FLAGS},dnn.conv.algo_bwd_data=$algo
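For reference, the whole sweep can be scripted as a nested loop like the sketch below. The network-selection argument passed to benchmark_imagenet.py is hypothetical (the repo's actual per-network invocation may differ); the flag handling and the log file naming match the runs above.
for net in alexnet googlenet vgg overfeat; do
  for algo in time_once time_on_shape_change guess_once guess_on_shape_change; do
    flags=cmodule.debug=True,device=cuda,floatX=float32
    flags=$flags,dnn.conv.algo_fwd=$algo,dnn.conv.algo_bwd_filter=$algo,dnn.conv.algo_bwd_data=$algo
    # Hypothetical invocation: substitute the repo's real benchmark command for each network.
    THEANO_FLAGS=$flags python benchmark_imagenet.py "$net" > "benchmark-$net-$algo.log" 2>&1
  done
done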
The results are in my benchmarks-float32 branch:
- Commit: https://github.com/notoraptor/convnet-benchmarks/commit/f89f4b8f165d86f3ce6d01a5eda52391a1caca7e
- Folder: https://github.com/notoraptor/convnet-benchmarks/tree/benchmarks-float32/theano (see the benchmark-*.log files)
For a quick summary of the tests:
$ tail benchmark*.log -n 1
==> benchmark-alexnet-guess_once.log <==
2017-07-19 13:25:26.763059: Forward-Backward across 100 steps, 0.125 +/- 0.005 sec / batch
==> benchmark-alexnet-guess_on_shape_change.log <==
2017-07-19 13:26:07.941936: Forward-Backward across 100 steps, 0.125 +/- 0.005 sec / batch
==> benchmark-alexnet-time_once.log <==
2017-07-19 13:23:56.579496: Forward-Backward across 100 steps, 0.112 +/- 0.003 sec / batch
==> benchmark-alexnet-time_on_shape_change.log <==
2017-07-19 13:24:42.175716: Forward-Backward across 100 steps, 0.117 +/- 0.004 sec / batch
==> benchmark-googlenet-guess_once.log <==
2017-07-19 13:34:47.829857: Forward-Backward across 100 steps, 0.577 +/- 0.000 sec / batch
==> benchmark-googlenet-guess_on_shape_change.log <==
2017-07-19 13:36:43.970389: Forward-Backward across 100 steps, 0.578 +/- 0.000 sec / batch
==> benchmark-googlenet-time_once.log <==
2017-07-19 13:28:55.239014: Forward-Backward across 100 steps, 0.556 +/- 0.001 sec / batch
==> benchmark-googlenet-time_on_shape_change.log <==
2017-07-19 13:32:51.139293: Forward-Backward across 100 steps, 0.555 +/- 0.001 sec / batch
==> benchmark-overfeat-guess_once.log <==
2017-07-19 13:48:48.165231: Forward-Backward across 100 steps, 0.391 +/- 0.006 sec / batch
==> benchmark-overfeat-guess_on_shape_change.log <==
2017-07-19 13:49:59.805688: Forward-Backward across 100 steps, 0.392 +/- 0.000 sec / batch
==> benchmark-overfeat-time_once.log <==
2017-07-19 13:45:23.355088: Forward-Backward across 100 steps, 0.238 +/- 0.006 sec / batch
==> benchmark-overfeat-time_on_shape_change.log <==
2017-07-19 13:46:57.597981: Forward-Backward across 100 steps, 0.238 +/- 0.007 sec / batch
==> benchmark-vgg-guess_once.log <==
2017-07-19 13:42:10.026075: Forward-Backward across 100 steps, 0.649 +/- 0.000 sec / batch
==> benchmark-vgg-guess_on_shape_change.log <==
2017-07-19 13:43:52.521482: Forward-Backward across 100 steps, 0.652 +/- 0.001 sec / batch
==> benchmark-vgg-time_once.log <==
2017-07-19 13:38:33.516701: Forward-Backward across 100 steps, 0.527 +/- 0.001 sec / batch
==> benchmark-vgg-time_on_shape_change.log <==
2017-07-19 13:40:23.093246: Forward-Backward across 100 steps, 0.527 +/- 0.000 sec / batch
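To line the 16 results up side by side, those last lines can be condensed a bit further. A minimal sketch, assuming the file naming and the "X +/- Y sec / batch" line format shown above:
for f in benchmark-*.log; do
  name=${f#benchmark-}; name=${name%.log}
  # Print "network-algo: mean +/- stddev" extracted from the last log line.
  printf '%s: %s\n' "$name" "$(tail -n 1 "$f" | grep -oE '[0-9.]+ \+/- [0-9.]+')"
done
Reading the numbers off this way, the time_* algos are consistently at least as fast as the guess_* ones on every network, with the largest gap on overfeat (0.238 vs ~0.39 sec / batch).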
Note: initially I ran the tests without setting floatX, i.e. with the default floatX=float64. In that configuration I systematically got an out-of-memory error with the vgg network: https://github.com/notoraptor/convnet-benchmarks/blob/benchmarks/theano/benchmark-vgg-time_once.log
Traceback (most recent call last):
File "benchmark_imagenet.py", line 75, in <module>
main()
File "benchmark_imagenet.py", line 72, in main
time_theano_run(full_func, [images, labels], 'Forward-Backward')
File "benchmark_imagenet.py", line 39, in time_theano_run
_ = func(*fargs)
File "/u/boccoset/mila/dev/git/theano/theano/compile/function_module.py", line 897, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/u/boccoset/mila/dev/git/theano/theano/gof/link.py", line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/u/boccoset/mila/dev/git/theano/theano/compile/function_module.py", line 883, in __call__
self.fn() if output_subset is None else\
File "pygpu/gpuarray.pyx", line 676, in pygpu.gpuarray.pygpu_empty (pygpu/gpuarray.c:9773)
File "pygpu/gpuarray.pyx", line 290, in pygpu.gpuarray.array_empty (pygpu/gpuarray.c:5622)
pygpu.gpuarray.GpuArrayException: cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory
Apply node that caused the error: GpuDnnPoolGrad{mode='max'}(GpuContiguous.0, GpuContiguous.0, GpuContiguous.0, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 0})
Toposort index: 353
Inputs types: [GpuArrayType<None>(float64, 4D), GpuArrayType<None>(float64, 4D), GpuArrayType<None>(float64, 4D), TensorType(int64, vector), TensorType(int64, vector), TensorType(int64, vector)]
Inputs shapes: [(64, 64, 224, 224), (64, 64, 112, 112), (64, 64, 112, 112), (2,), (2,), (2,)]
Inputs strides: [(25690112, 401408, 1792, 8), (6422528, 100352, 896, 8), (6422528, 100352, 896, 8), (8,), (8,), (8,)]
Inputs values: ['not shown', 'not shown', 'not shown', array([2, 2]), array([2, 2]), array([0, 0])]
Outputs clients: [[GpuElemwise{Composite{((i0 * i1) + (i2 * i1 * sgn(i3)))}}[(0, 1)]<gpuarray>(GpuArrayConstant{[[[[ 0.5]]]]}, GpuDnnPoolGrad{mode='max'}.0, GpuArrayConstant{[[[[ 0.5]]]]}, GpuElemwise{Add}[(0, 0)]<gpuarray>.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
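For scale, a quick back-of-the-envelope check (my own estimate from the input shapes and 8-byte strides reported above, not something Theano prints):
# First input of the failing GpuDnnPoolGrad node: shape (64, 64, 224, 224) in float64.
python -c 'print(64*64*224*224*8 / 2.**30)'   # ≈ 1.53 GiB for this single tensor
# With floatX=float32 (4 bytes per element) the same tensor needs ≈ 0.77 GiB,
# which is consistent with the float32 runs fitting in GPU memory while float64 did not.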