Skip to content

Instantly share code, notes, and snippets.

@zomux
Last active August 29, 2015 14:21
Show Gist options
  • Save zomux/0d8603e390b3cf2eae85 to your computer and use it in GitHub Desktop.
Save zomux/0d8603e390b3cf2eae85 to your computer and use it in GitHub Desktop.
Function profiling
==================
Message: trainers.py:282
Time in 546 calls to Function.__call__: 7.641145e+01s
Time in Function.fn.__call__: 7.633449e+01s (99.899%)
Time in thunks: 7.620820e+01s (99.734%)
Total compile time: 9.094489e-01s
Number of Apply nodes: 167
Theano Optimizer time: 3.391752e-01s
Theano validate time: 2.257347e-03s
Theano Linker time (includes C, CUDA code generation/compiling): 3.889339e-01s
Import time 1.662374e-01s
Time in all call to theano.grad() 2.867198e-02s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
77.6% 77.6% 59.165s 1.55e-03s C 38220 76 theano.tensor.elemwise.Elemwise
11.0% 88.6% 8.348s 1.91e-03s C 4368 8 theano.tensor.blas.Dot22
10.0% 98.6% 7.618s 4.65e-03s C 1638 3 theano.tensor.blas.Gemm
0.5% 99.1% 0.358s 4.38e-05s C 8190 15 theano.tensor.elemwise.Sum
0.4% 99.5% 0.342s 2.61e-05s Py 13104 12 theano.ifelse.IfElse
0.2% 99.8% 0.187s 5.72e-05s C 3276 6 theano.tensor.elemwise.Any
0.1% 99.9% 0.092s 1.20e-05s C 7644 14 theano.tensor.elemwise.DimShuffle
0.0% 99.9% 0.026s 4.81e-05s C 546 1 theano.tensor.basic.MaxAndArgmax
0.0% 99.9% 0.022s 4.10e-05s Py 546 1 theano.tensor.subtensor.AdvancedIncSubtensor
0.0% 100.0% 0.017s 3.08e-05s Py 546 1 theano.tensor.basic.ARange
0.0% 100.0% 0.012s 2.28e-05s Py 546 1 theano.tensor.subtensor.AdvancedSubtensor
0.0% 100.0% 0.008s 1.38e-05s C 546 1 theano.tensor.nnet.nnet.SoftmaxWithBias
0.0% 100.0% 0.003s 8.31e-07s C 3276 6 theano.tensor.subtensor.Subtensor
0.0% 100.0% 0.002s 5.70e-07s C 4368 8 theano.compile.ops.Shape_i
0.0% 100.0% 0.002s 4.52e-06s C 546 7 theano.tensor.basic.Alloc
0.0% 100.0% 0.002s 1.09e-06s C 1638 3 theano.tensor.basic.AllocEmpty
0.0% 100.0% 0.001s 2.41e-06s C 546 1 theano.tensor.nnet.nnet.SoftmaxGrad
0.0% 100.0% 0.001s 6.72e-07s C 1638 3 theano.tensor.opt.MakeVector
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
15.6% 15.6% 11.917s 3.64e-03s C 3276 6 Elemwise{Composite{sqr(Abs(i0))}}
14.7% 30.3% 11.168s 6.82e-03s C 1638 3 Elemwise{Composite{(i0 + (i1 * i2))}}
12.8% 43.1% 9.758s 2.98e-03s C 3276 6 Elemwise{isnan,no_inplace}
11.7% 54.8% 8.923s 2.72e-03s C 3276 6 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}
11.0% 65.8% 8.348s 1.91e-03s C 4368 8 Dot22
10.0% 75.8% 7.618s 4.65e-03s C 1638 3 Gemm{no_inplace}
8.4% 84.1% 6.371s 2.92e-03s C 2184 4 Elemwise{mul}
5.9% 90.0% 4.476s 4.10e-03s C 1092 2 Elemwise{gt,no_inplace}
4.9% 94.8% 3.697s 7.52e-04s C 4914 9 Elemwise{add,no_inplace}
3.7% 98.5% 2.813s 2.58e-03s C 1092 2 Elemwise{Composite{Abs((i0 * i1))}}
0.4% 99.0% 0.342s 2.61e-05s Py 13104 12 if{}
0.4% 99.4% 0.342s 5.69e-05s C 6006 11 Sum{acc_dtype=float64}
0.2% 99.7% 0.187s 5.72e-05s C 3276 6 Any
0.1% 99.8% 0.060s 2.20e-05s C 2730 5 DimShuffle{1,0}
0.0% 99.8% 0.026s 4.81e-05s C 546 1 MaxAndArgmax
0.0% 99.8% 0.023s 8.45e-06s C 2730 5 DimShuffle{x}
0.0% 99.8% 0.022s 4.10e-05s Py 546 1 AdvancedIncSubtensor{inplace=False, set_instead_of_inc=False}
0.0% 99.9% 0.017s 3.08e-05s Py 546 1 ARange
0.0% 99.9% 0.016s 9.77e-06s C 1638 3 Sum{axis=[0], acc_dtype=float64}
0.0% 99.9% 0.012s 2.28e-05s Py 546 1 AdvancedSubtensor
... (remaining 25 Ops account for 0.09%(0.07s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
5.7% 5.7% 4.349s 7.97e-03s 546 140 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense2)
5.7% 11.4% 4.344s 7.96e-03s 546 99 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
5.6% 17.0% 4.265s 7.81e-03s 546 108 Dot22(x.T, Elemwise{mul}.0)
5.4% 22.4% 4.134s 7.57e-03s 546 154 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense1)
5.4% 27.8% 4.129s 7.56e-03s 546 114 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
5.3% 33.1% 4.030s 7.38e-03s 546 146 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
5.3% 38.4% 4.011s 7.35e-03s 546 148 Gemm{no_inplace}(AllocEmpty{dtype='float32'}.0, if{}.0, x.T, Elemwise{mul}.0, TensorConstant{0.0})
4.8% 43.2% 3.662s 6.71e-03s 546 160 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense2_vel, DimShuffle{x,x}.0, if{}.0)
4.8% 48.0% 3.629s 6.65e-03s 546 158 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
4.7% 52.7% 3.583s 6.56e-03s 546 132 Gemm{no_inplace}(AllocEmpty{dtype='float32'}.0, if{}.0, DimShuffle{1,0}.0, Elemwise{mul}.0, TensorConstant{0.0})
4.6% 57.2% 3.486s 6.38e-03s 546 166 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense1_vel, DimShuffle{x,x}.0, if{}.0)
4.5% 61.8% 3.440s 6.30e-03s 546 88 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
4.2% 66.0% 3.233s 5.92e-03s 546 94 Dot22(DimShuffle{1,0}.0, Elemwise{mul}.0)
3.5% 69.5% 2.685s 4.92e-03s 546 124 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense3)
3.4% 72.9% 2.562s 4.69e-03s 546 44 Elemwise{mul,no_inplace}(Elemwise{add,no_inplace}.0, Elemwise{gt,no_inplace}.0)
3.1% 76.0% 2.393s 4.38e-03s 546 39 Elemwise{gt,no_inplace}(Elemwise{add,no_inplace}.0, TensorConstant{(1, 1) of 0})
2.7% 78.8% 2.096s 3.84e-03s 546 130 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
2.7% 81.5% 2.083s 3.81e-03s 546 59 Elemwise{gt,no_inplace}(Elemwise{add,no_inplace}.0, TensorConstant{(1, 1) of 0})
2.3% 83.8% 1.763s 3.23e-03s 546 150 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense3_vel, DimShuffle{x,x}.0, if{}.0)
2.2% 86.0% 1.687s 3.09e-03s 546 60 Elemwise{mul,no_inplace}(Elemwise{add,no_inplace}.0, Elemwise{gt,no_inplace}.0)
... (remaining 147 Apply instances account for 13.97%(10.65s) of the runtime)
Function profiling
==================
Message: Sum of all(2) printed profiles at exit excluding Scan op profile.
Time in 2046 calls to Function.__call__: 7.708845e+01s
Time in Function.fn.__call__: 7.698374e+01s (99.864%)
Time in thunks: 7.684891e+01s (99.689%)
Total compile time: 2.267539e+00s
Number of Apply nodes: 21
Theano Optimizer time: 5.171282e-01s
Theano validate time: 2.811909e-03s
Theano Linker time (includes C, CUDA code generation/compiling): 4.703619e-01s
Import time 2.221687e-01s
Time in all call to theano.grad() 2.867198e-02s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
77.1% 77.1% 59.237s 1.22e-03s C 48720 83 theano.tensor.elemwise.Elemwise
11.5% 88.6% 8.870s 1.00e-03s C 8868 11 theano.tensor.blas.Dot22
9.9% 98.5% 7.618s 4.65e-03s C 1638 3 theano.tensor.blas.Gemm
0.5% 99.0% 0.360s 3.22e-05s C 11190 17 theano.tensor.elemwise.Sum
0.4% 99.5% 0.342s 2.61e-05s Py 13104 12 theano.ifelse.IfElse
0.2% 99.7% 0.187s 5.72e-05s C 3276 6 theano.tensor.elemwise.Any
0.1% 99.8% 0.098s 8.09e-06s C 12144 17 theano.tensor.elemwise.DimShuffle
0.1% 99.9% 0.039s 1.89e-05s C 2046 2 theano.tensor.basic.MaxAndArgmax
0.0% 99.9% 0.027s 1.30e-05s Py 2046 2 theano.tensor.basic.ARange
0.0% 99.9% 0.023s 1.12e-05s Py 2046 2 theano.tensor.subtensor.AdvancedSubtensor
0.0% 100.0% 0.022s 4.10e-05s Py 546 1 theano.tensor.subtensor.AdvancedIncSubtensor
0.0% 100.0% 0.013s 6.47e-06s C 2046 2 theano.tensor.nnet.nnet.SoftmaxWithBias
0.0% 100.0% 0.003s 4.01e-07s C 7368 10 theano.compile.ops.Shape_i
0.0% 100.0% 0.003s 8.31e-07s C 3276 6 theano.tensor.subtensor.Subtensor
0.0% 100.0% 0.002s 4.52e-06s C 546 7 theano.tensor.basic.Alloc
0.0% 100.0% 0.002s 1.09e-06s C 1638 3 theano.tensor.basic.AllocEmpty
0.0% 100.0% 0.001s 2.41e-06s C 546 1 theano.tensor.nnet.nnet.SoftmaxGrad
0.0% 100.0% 0.001s 6.72e-07s C 1638 3 theano.tensor.opt.MakeVector
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
15.5% 15.5% 11.917s 3.64e-03s C 3276 6 Elemwise{Composite{sqr(Abs(i0))}}
14.5% 30.0% 11.168s 6.82e-03s C 1638 3 Elemwise{Composite{(i0 + (i1 * i2))}}
12.7% 42.7% 9.758s 2.98e-03s C 3276 6 Elemwise{isnan,no_inplace}
11.6% 54.3% 8.923s 2.72e-03s C 3276 6 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}
11.5% 65.9% 8.870s 1.00e-03s C 8868 11 Dot22
9.9% 75.8% 7.618s 4.65e-03s C 1638 3 Gemm{no_inplace}
8.3% 84.1% 6.371s 2.92e-03s C 2184 4 Elemwise{mul}
5.8% 89.9% 4.476s 4.10e-03s C 1092 2 Elemwise{gt,no_inplace}
4.8% 94.7% 3.700s 5.77e-04s C 6414 10 Elemwise{add,no_inplace}
3.7% 98.4% 2.813s 2.58e-03s C 1092 2 Elemwise{Composite{Abs((i0 * i1))}}
0.4% 98.8% 0.343s 4.58e-05s C 7506 12 Sum{acc_dtype=float64}
0.4% 99.3% 0.342s 2.61e-05s Py 13104 12 if{}
0.2% 99.5% 0.187s 5.72e-05s C 3276 6 Any
0.1% 99.6% 0.061s 2.04e-05s C 3000 2 Elemwise{Composite{((i0 + i1) * GT((i0 + i1), i2))}}
0.1% 99.7% 0.060s 2.20e-05s C 2730 5 DimShuffle{1,0}
0.1% 99.7% 0.039s 1.89e-05s C 2046 2 MaxAndArgmax
0.0% 99.8% 0.027s 1.30e-05s Py 2046 2 ARange
0.0% 99.8% 0.023s 8.45e-06s C 2730 5 DimShuffle{x}
0.0% 99.8% 0.023s 1.12e-05s Py 2046 2 AdvancedSubtensor
0.0% 99.9% 0.022s 4.10e-05s Py 546 1 AdvancedIncSubtensor{inplace=False, set_instead_of_inc=False}
... (remaining 28 Ops account for 0.14%(0.11s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
5.7% 5.7% 4.349s 7.97e-03s 546 140 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense2)
5.7% 11.3% 4.344s 7.96e-03s 546 99 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
5.6% 16.9% 4.265s 7.81e-03s 546 108 Dot22(x.T, Elemwise{mul}.0)
5.4% 22.2% 4.134s 7.57e-03s 546 154 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense1)
5.4% 27.6% 4.129s 7.56e-03s 546 114 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
5.2% 32.9% 4.030s 7.38e-03s 546 146 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
5.2% 38.1% 4.011s 7.35e-03s 546 148 Gemm{no_inplace}(AllocEmpty{dtype='float32'}.0, if{}.0, x.T, Elemwise{mul}.0, TensorConstant{0.0})
4.8% 42.8% 3.662s 6.71e-03s 546 160 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense2_vel, DimShuffle{x,x}.0, if{}.0)
4.7% 47.6% 3.629s 6.65e-03s 546 158 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
4.7% 52.2% 3.583s 6.56e-03s 546 132 Gemm{no_inplace}(AllocEmpty{dtype='float32'}.0, if{}.0, DimShuffle{1,0}.0, Elemwise{mul}.0, TensorConstant{0.0})
4.5% 56.8% 3.486s 6.38e-03s 546 166 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense1_vel, DimShuffle{x,x}.0, if{}.0)
4.5% 61.2% 3.440s 6.30e-03s 546 88 Elemwise{Composite{sqr(Abs(i0))}}(Dot22.0)
4.2% 65.4% 3.233s 5.92e-03s 546 94 Dot22(DimShuffle{1,0}.0, Elemwise{mul}.0)
3.5% 68.9% 2.685s 4.92e-03s 546 124 Elemwise{Composite{(i0 + (i1 * i2))}}(Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0.0002}, W_dense3)
3.3% 72.3% 2.562s 4.69e-03s 546 44 Elemwise{mul,no_inplace}(Elemwise{add,no_inplace}.0, Elemwise{gt,no_inplace}.0)
3.1% 75.4% 2.393s 4.38e-03s 546 39 Elemwise{gt,no_inplace}(Elemwise{add,no_inplace}.0, TensorConstant{(1, 1) of 0})
2.7% 78.1% 2.096s 3.84e-03s 546 130 Elemwise{isnan,no_inplace}(Elemwise{Composite{(i0 + (i1 * i2))}}.0)
2.7% 80.8% 2.083s 3.81e-03s 546 59 Elemwise{gt,no_inplace}(Elemwise{add,no_inplace}.0, TensorConstant{(1, 1) of 0})
2.3% 83.1% 1.763s 3.23e-03s 546 150 Elemwise{Composite{((i0 * i1) - (i2 * i3))}}(TensorConstant{(1, 1) of 0.9}, W_dense3_vel, DimShuffle{x,x}.0, if{}.0)
2.2% 85.3% 1.687s 3.09e-03s 546 60 Elemwise{mul,no_inplace}(Elemwise{add,no_inplace}.0, Elemwise{gt,no_inplace}.0)
... (remaining 168 Apply instances account for 14.69%(11.29s) of the runtime)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment