In this commit https://github.com/Theano/Theano/pull/5190/commits/772fbcb00ba03b8ef9c6c198f04bf519a91589b1 I 'fixed' a bug by renaming the property 'inplace_running_mean' to 'inplace_running_xxx'.
The problem seems to be in the optimization:
-
There is a GpuDnnBatchNorm Op that takes a tensor ('x') as its first input, normalizes it, and returns the result as its output. The Op has a few inplace parameters named inplace_running_mean, inplace_running_var and inplace_output; setting inplace_output=True makes the Op write its result directly into the buffer of the original input tensor 'x'.
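To make the inplace_output semantics concrete, here is a minimal NumPy sketch (not the real GpuDnnBatchNorm implementation; the function name and flag are stand-ins) showing that with inplace_output=True the output simply aliases the input buffer, so the original values of 'x' are destroyed:

```python
import numpy as np

def batch_norm(x, inplace_output=False, eps=1e-5):
    # Hypothetical stand-in for GpuDnnBatchNorm: normalize along the
    # batch axis. With inplace_output=True the result overwrites x's
    # own buffer instead of a fresh one.
    out = x if inplace_output else np.empty_like(x)
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    np.divide(x - mean, np.sqrt(var + eps), out=out)
    return out

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = batch_norm(x, inplace_output=True)

assert y is x  # the output aliases the input: the original 'x' is gone
```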
-
The gradient Op GpuDnnBatchNormGrad takes as input the original tensor ('x') and some outputs of the GpuDnnBatchNorm Op, so the grad Op can only run after the normalization Op.
-
Consequently, if the normalization Op and the grad Op both read the same input 'x', the normalization Op must not run with inplace_output=True: it would overwrite 'x' before the grad Op gets to read it.
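The scheduling conflict can be demonstrated with a short NumPy sketch (again with hypothetical stand-in names, not the real Theano Ops): the normalization runs first and destroys 'x' in place, so when the grad step later reads 'x' it sees the normalized values instead of the original input:

```python
import numpy as np

def normalize_inplace(x, eps=1e-5):
    # Stand-in for GpuDnnBatchNorm with inplace_output=True:
    # overwrites x with its normalized values.
    mean, var = x.mean(axis=0), x.var(axis=0)
    x -= mean
    x /= np.sqrt(var + eps)
    return x

x = np.array([[1.0, 2.0], [3.0, 4.0]])
x_original = x.copy()

out = normalize_inplace(x)   # normalization Op runs first...
grad_input = x               # ...then the grad Op reads 'x'

# The grad Op no longer sees the values it needs:
assert not np.allclose(grad_input, x_original)
```

This is exactly why the inplace optimization must check that no other node in the graph (here, the grad Op) still consumes 'x' before it enables inplace_output=True.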