@zomux
Last active August 29, 2015 14:21
Function profiling
==================
Message: None
Time in 19735 calls to Function.__call__: 8.007819e+01s
Time in Function.fn.__call__: 7.888933e+01s (98.515%)
Time in thunks: 7.739506e+01s (96.649%)
Total compile time: 9.325302e-01s
Theano Optimizer time: 6.993260e-01s
Theano validate time: 1.049185e-02s
Theano Linker time (includes C, CUDA code generation/compiling): 2.172978e-01s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
40.9% 40.9% 31.625s 2.29e-05s C 1381454 76 <class 'theano.tensor.elemwise.Elemwise'>
24.4% 65.2% 18.864s 1.19e-04s C 157880 8 <class 'theano.tensor.blas.Dot22'>
9.5% 74.8% 7.389s 1.56e-05s Py 473644 12 <class 'theano.ifelse.IfElse'>
8.9% 83.7% 6.919s 2.34e-05s C 296025 15 <class 'theano.tensor.elemwise.Sum'>
6.1% 89.8% 4.728s 3.99e-05s C 118410 6 <class 'theano.tensor.elemwise.Any'>
4.3% 94.1% 3.337s 5.64e-05s C 59205 3 <class 'theano.tensor.blas.Dot22Scalar'>
2.0% 96.1% 1.517s 5.49e-06s C 276290 14 <class 'theano.tensor.elemwise.DimShuffle'>
1.7% 97.8% 1.328s 6.73e-05s Py 19735 1 <class 'theano.tensor.basic.MaxAndArgmax'>
0.6% 98.4% 0.448s 2.27e-05s Py 19735 1 <class 'theano.tensor.subtensor.AdvancedSubtensor'>
0.5% 98.9% 0.410s 2.08e-05s Py 19735 1 <class 'theano.tensor.subtensor.AdvancedIncSubtensor'>
0.4% 99.3% 0.321s 1.63e-05s Py 19735 1 <class 'theano.tensor.basic.ARange'>
0.2% 99.5% 0.148s 7.49e-06s C 19735 1 <class 'theano.tensor.nnet.nnet.SoftmaxWithBias'>
0.1% 99.7% 0.098s 8.31e-07s C 118410 6 <class 'theano.tensor.subtensor.Subtensor'>
0.1% 99.8% 0.083s 2.09e-06s C 39470 8 <class 'theano.tensor.basic.Alloc'>
0.1% 99.9% 0.079s 8.03e-07s C 98675 8 <class 'theano.compile.ops.Shape_i'>
0.1% 99.9% 0.052s 8.77e-07s C 59205 3 <class 'theano.tensor.opt.MakeVector'>
0.1% 100.0% 0.049s 2.46e-06s C 19735 1 <class 'theano.tensor.nnet.nnet.SoftmaxGrad'>
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
24.4% 24.4% 18.864s 1.19e-04s C 157880 8 Dot22
19.4% 43.8% 15.034s 1.27e-04s C 118410 6 Elemwise{isnan,no_inplace}
9.5% 53.3% 7.389s 1.56e-05s Py 473644 12 if{}
8.4% 61.7% 6.470s 2.98e-05s C 217085 11 Sum
6.1% 67.8% 4.728s 3.99e-05s C 118410 6 Any
5.8% 73.6% 4.501s 3.80e-05s C 118410 6 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}
5.0% 78.7% 3.908s 2.20e-05s C 177615 9 Elemwise{add,no_inplace}
4.9% 83.6% 3.784s 6.39e-05s C 59205 3 Elemwise{Composite{[add(i0, mul(i1, i2))]}}
4.3% 87.9% 3.337s 5.64e-05s C 59205 3 Dot22Scalar
3.9% 91.8% 3.000s 2.53e-05s C 118410 6 Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.sca
1.7% 93.5% 1.328s 6.73e-05s Py 19735 1 MaxAndArgmax
1.4% 94.9% 1.119s 1.13e-05s C 98675 5 DimShuffle{1,0}
0.6% 95.5% 0.448s 2.27e-05s Py 19735 1 AdvancedSubtensor
0.6% 96.1% 0.428s 7.24e-06s C 59205 3 Sum{0}
0.5% 96.6% 0.410s 2.08e-05s Py 19735 1 AdvancedIncSubtensor{inplace=False, set_instead_of_inc=False}
0.4% 97.0% 0.321s 1.63e-05s Py 19735 1 ARange
0.4% 97.4% 0.283s 7.16e-06s C 39470 2 Elemwise{gt,no_inplace}
0.3% 97.7% 0.261s 3.30e-06s C 78940 4 Elemwise{mul,no_inplace}
0.2% 97.9% 0.177s 2.99e-06s C 59205 3 DimShuffle{x,0}
0.2% 98.1% 0.163s 1.65e-06s C 98675 5 DimShuffle{x}
... (remaining 24 Ops account for 1.86%(1.44s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
14.4% 14.4% 11.147s 5.65e-04s 19735 156 Elemwise{isnan,no_inplace}(Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
11.1% 25.6% 8.628s 4.37e-04s 19735 10 Dot22(x, W_dense1)
5.8% 31.3% 4.464s 2.26e-04s 19735 120 Sum(Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.bas
4.7% 36.0% 3.650s 1.85e-04s 19735 144 Elemwise{isnan,no_inplace}(Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
4.5% 40.5% 3.471s 1.76e-04s 19735 160 Any(Elemwise{isnan,no_inplace}.0)
4.3% 44.8% 3.291s 1.67e-04s 19735 164 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}(TensorConstant{(1, 1) of 0.9
3.7% 48.5% 2.890s 1.46e-04s 19735 106 Dot22(x.T, Elemwise{mul}.0)
3.6% 52.1% 2.751s 1.39e-04s 19735 49 Dot22(Elemwise{mul,no_inplace}.0, W_dense2)
3.5% 55.6% 2.747s 1.39e-04s 19735 152 Elemwise{Composite{[add(i0, mul(i1, i2))]}}(Dot22Scalar.0, TensorConstant{(1, 1)
3.4% 59.0% 2.609s 1.32e-04s 19735 22 Elemwise{add,no_inplace}(W_dense1, W_dense1_vel)
3.1% 62.1% 2.386s 1.21e-04s 19735 93 Dot22(Elemwise{mul}.0, W_dense2.T)
3.0% 65.1% 2.360s 5.98e-05s 39470 162 if{}(Any.0, Alloc.0, Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
2.8% 68.0% 2.199s 1.11e-04s 19735 112 Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.basic.s
2.8% 70.8% 2.186s 1.11e-04s 19735 146 Dot22Scalar(x.T, Elemwise{mul}.0, if{}.0)
1.9% 72.7% 1.482s 7.51e-05s 19735 104 Sum(Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.bas
1.7% 74.4% 1.328s 6.73e-05s 19735 68 MaxAndArgmax(Elemwise{add,no_inplace}.0, TensorConstant{(1,) of 1})
1.7% 76.1% 1.309s 6.63e-05s 19735 92 Dot22(DimShuffle{1,0}.0, Elemwise{mul}.0)
1.5% 77.6% 1.137s 5.76e-05s 19735 150 Any(Elemwise{isnan,no_inplace}.0)
1.4% 78.9% 1.047s 5.30e-05s 19735 158 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}(TensorConstant{(1, 1) of 0.9
1.3% 80.3% 1.041s 2.64e-05s 39470 154 if{}(Any.0, Alloc.0, Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
... (remaining 145 Apply instances account for 19.73%(15.27s) of the runtime)
Function profiling
==================
Message: Sum of all printed profiles at exit excluding Scan op profile.
Time in 24735 calls to Function.__call__: 8.251194e+01s
Time in Function.fn.__call__: 8.122847e+01s (98.445%)
Time in thunks: 7.970361e+01s (96.596%)
Total compile time: 1.653158e+00s
Theano Optimizer time: 8.151951e-01s
Theano validate time: 1.219654e-02s
Theano Linker time (includes C, CUDA code generation/compiling): 2.763069e-01s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
40.0% 40.0% 31.911s 2.25e-05s C 1416454 83 <class 'theano.tensor.elemwise.Elemwise'>
25.9% 65.9% 20.618s 1.19e-04s C 172880 11 <class 'theano.tensor.blas.Dot22'>
9.3% 75.2% 7.389s 1.56e-05s Py 473644 12 <class 'theano.ifelse.IfElse'>
8.7% 83.9% 6.926s 2.26e-05s C 306025 17 <class 'theano.tensor.elemwise.Sum'>
5.9% 89.8% 4.728s 3.99e-05s C 118410 6 <class 'theano.tensor.elemwise.Any'>
4.2% 94.0% 3.337s 5.64e-05s C 59205 3 <class 'theano.tensor.blas.Dot22Scalar'>
1.9% 95.9% 1.540s 5.29e-06s C 291290 17 <class 'theano.tensor.elemwise.DimShuffle'>
1.8% 97.8% 1.473s 5.95e-05s Py 24735 2 <class 'theano.tensor.basic.MaxAndArgmax'>
0.6% 98.4% 0.487s 1.97e-05s Py 24735 2 <class 'theano.tensor.subtensor.AdvancedSubtensor'>
0.5% 98.9% 0.410s 2.08e-05s Py 19735 1 <class 'theano.tensor.subtensor.AdvancedIncSubtensor'>
0.4% 99.3% 0.354s 1.43e-05s Py 24735 2 <class 'theano.tensor.basic.ARange'>
0.2% 99.5% 0.168s 6.79e-06s C 24735 2 <class 'theano.tensor.nnet.nnet.SoftmaxWithBias'>
0.1% 99.7% 0.098s 8.31e-07s C 118410 6 <class 'theano.tensor.subtensor.Subtensor'>
0.1% 99.8% 0.083s 2.09e-06s C 39470 8 <class 'theano.tensor.basic.Alloc'>
0.1% 99.9% 0.082s 7.51e-07s C 108675 10 <class 'theano.compile.ops.Shape_i'>
0.1% 99.9% 0.052s 8.77e-07s C 59205 3 <class 'theano.tensor.opt.MakeVector'>
0.1% 100.0% 0.049s 2.46e-06s C 19735 1 <class 'theano.tensor.nnet.nnet.SoftmaxGrad'>
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
25.9% 25.9% 20.618s 1.19e-04s C 172880 11 Dot22
18.9% 44.7% 15.034s 1.27e-04s C 118410 6 Elemwise{isnan,no_inplace}
9.3% 54.0% 7.389s 1.56e-05s Py 473644 12 if{}
8.1% 62.1% 6.475s 2.92e-05s C 222085 12 Sum
5.9% 68.1% 4.728s 3.99e-05s C 118410 6 Any
5.6% 73.7% 4.501s 3.80e-05s C 118410 6 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}
4.9% 78.6% 3.912s 2.14e-05s C 182615 10 Elemwise{add,no_inplace}
4.7% 83.4% 3.784s 6.39e-05s C 59205 3 Elemwise{Composite{[add(i0, mul(i1, i2))]}}
4.2% 87.5% 3.337s 5.64e-05s C 59205 3 Dot22Scalar
3.8% 91.3% 3.000s 2.53e-05s C 118410 6 Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.sca
1.8% 93.2% 1.473s 5.95e-05s Py 24735 2 MaxAndArgmax
1.4% 94.6% 1.119s 1.13e-05s C 98675 5 DimShuffle{1,0}
0.6% 95.2% 0.487s 1.97e-05s Py 24735 2 AdvancedSubtensor
0.5% 95.7% 0.428s 7.24e-06s C 59205 3 Sum{0}
0.5% 96.2% 0.410s 2.08e-05s Py 19735 1 AdvancedIncSubtensor{inplace=False, set_instead_of_inc=False}
0.4% 96.7% 0.354s 1.43e-05s Py 24735 2 ARange
0.4% 97.0% 0.283s 7.16e-06s C 39470 2 Elemwise{gt,no_inplace}
0.3% 97.4% 0.261s 3.30e-06s C 78940 4 Elemwise{mul,no_inplace}
0.3% 97.7% 0.257s 2.57e-05s C 10000 2 Elemwise{Composite{[Composite{[mul(i0, GT(i0, i1))]}(add(i0, i1), i2)]
0.3% 97.9% 0.200s 2.70e-06s C 74205 6 DimShuffle{x,0}
... (remaining 27 Ops account for 2.08%(1.65s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
14.0% 14.0% 11.147s 5.65e-04s 19735 156 Elemwise{isnan,no_inplace}(Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
10.8% 24.8% 8.628s 4.37e-04s 19735 10 Dot22(x, W_dense1)
5.6% 30.4% 4.464s 2.26e-04s 19735 120 Sum(Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.bas
4.6% 35.0% 3.650s 1.85e-04s 19735 144 Elemwise{isnan,no_inplace}(Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
4.4% 39.3% 3.471s 1.76e-04s 19735 160 Any(Elemwise{isnan,no_inplace}.0)
4.1% 43.5% 3.291s 1.67e-04s 19735 164 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}(TensorConstant{(1, 1) of 0.9
3.6% 47.1% 2.890s 1.46e-04s 19735 106 Dot22(x.T, Elemwise{mul}.0)
3.5% 50.6% 2.751s 1.39e-04s 19735 49 Dot22(Elemwise{mul,no_inplace}.0, W_dense2)
3.4% 54.0% 2.747s 1.39e-04s 19735 152 Elemwise{Composite{[add(i0, mul(i1, i2))]}}(Dot22Scalar.0, TensorConstant{(1, 1)
3.3% 57.3% 2.609s 1.32e-04s 19735 22 Elemwise{add,no_inplace}(W_dense1, W_dense1_vel)
3.0% 60.3% 2.386s 1.21e-04s 19735 93 Dot22(Elemwise{mul}.0, W_dense2.T)
3.0% 63.2% 2.360s 5.98e-05s 39470 162 if{}(Any.0, Alloc.0, Elemwise{Composite{[add(i0, mul(i1, i2))]}}.0)
2.8% 66.0% 2.199s 1.11e-04s 19735 112 Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.basic.s
2.7% 68.7% 2.186s 1.11e-04s 19735 146 Dot22Scalar(x.T, Elemwise{mul}.0, if{}.0)
1.9% 70.6% 1.482s 7.51e-05s 19735 104 Sum(Elemwise{Composite{[sqr(Abs{output_types_preference=<class 'theano.scalar.bas
1.7% 72.3% 1.328s 6.73e-05s 19735 68 MaxAndArgmax(Elemwise{add,no_inplace}.0, TensorConstant{(1,) of 1})
1.6% 73.9% 1.309s 6.63e-05s 19735 92 Dot22(DimShuffle{1,0}.0, Elemwise{mul}.0)
1.6% 75.5% 1.286s 2.57e-04s 5000 5 Dot22(x, W_dense1)
1.4% 76.9% 1.137s 5.76e-05s 19735 150 Any(Elemwise{isnan,no_inplace}.0)
1.3% 78.2% 1.047s 5.30e-05s 19735 158 Elemwise{Composite{[sub(mul(i0, i1), mul(i2, i3))]}}(TensorConstant{(1, 1) of 0.9
... (remaining 166 Apply instances account for 21.75%(17.34s) of the runtime)