Created July 27, 2016 03:46
args: bin/deepcl_unittests --gtest_filter=-DATA*:SLOW* | |
Note: Google Test filter = -DATA*:SLOW* | |
[==========] Running 158 tests from 29 test cases. | |
[----------] Global test environment set-up. | |
[----------] 7 tests from testClBlas | |
[ RUN ] testClBlas.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.basic (82 ms) | |
[ RUN ] testClBlas.transA | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
1 2 9 | |
3 7 5 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.transA (36 ms) | |
[ RUN ] testClBlas.transB | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
3 | |
-1 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.transB (37 ms) | |
[ RUN ] testClBlas.colMajor | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.colMajor (34 ms) | |
[ RUN ] testClBlas.colMajor2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.colMajor2 (34 ms) | |
[ RUN ] testClBlas.colMajorTransA | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.colMajorTransA (37 ms) | |
[ RUN ] testClBlas.colMajorTransB | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
clblas teardown | |
[ OK ] testClBlas.colMajorTransB (35 ms) | |
[----------] 7 tests from testClBlas (295 ms total) | |
[----------] 1 test from testDeepCL | |
[ RUN ] testDeepCL.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
expected number of output: 4 | |
clblas teardown | |
[ OK ] testDeepCL.basic (176 ms) | |
[----------] 1 test from testDeepCL (176 ms total) | |
[----------] 23 tests from testupdateweights | |
[ RUN ] testupdateweights.conv1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=2 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=2 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
batchSize: 4 | |
inputtotalsize=200 outputTotalSize=72 | |
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
weightsize=36 biassize=0 | |
statefultimer v0.7 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=2 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
Parameters overview: (skipping 2 layers with 0 params) | |
layer 1: params=36 100.0% | |
TOTAL : params=36 | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
idx=8 predicted losschange=0.000111445 actual=0.000112534 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
idx=13 predicted losschange=-0.000886715 actual=-0.000884056 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
idx=0 predicted losschange=0.000210491 actual=0.000212669 | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
idx=22 predicted losschange=-0.000164224 actual=-0.000163078 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 152ms | |
idx=22 predicted losschange=-0.000164224 actual=-0.000163078 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 152ms | |
forward layer selected kernel 1 | |
idx=35 predicted losschange=-0.000391028 actual=-0.000391006 | |
idx=26 predicted losschange=2.23142e-05 actual=2.57492e-05 | |
idx=27 predicted losschange=9.38328e-05 actual=9.44138e-05 | |
idx=27 predicted losschange=9.38328e-05 actual=9.44138e-05 | |
idx=10 predicted losschange=0.00186697 actual=0.00187111 | |
clblas teardown | |
[ OK ] testupdateweights.conv1 (566 ms) | |
[ RUN ] testupdateweights.conv1z | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=2 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=2 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
batchSize: 4 | |
inputtotalsize=72 outputTotalSize=72 | |
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} } | |
weightsize=36 biassize=0 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=2 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
Parameters overview: (skipping 2 layers with 0 params) | |
layer 1: params=36 100.0% | |
TOTAL : params=36 | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
idx=8 predicted losschange=0.00039831 actual=0.000397682 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
idx=13 predicted losschange=-0.000426502 actual=-0.000426292 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
idx=0 predicted losschange=0.000143287 actual=0.000144005 | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, padzeros must be disabled | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
idx=22 predicted losschange=-1.7916e-06 actual=0 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 148ms | |
idx=22 predicted losschange=-1.7916e-06 actual=0 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 148ms | |
forward layer selected kernel 1 | |
idx=35 predicted losschange=-2.82565e-05 actual=-2.76566e-05 | |
idx=26 predicted losschange=3.62191e-05 actual=3.71933e-05 | |
idx=27 predicted losschange=-0.000319862 actual=-0.000317574 | |
idx=27 predicted losschange=-0.000319862 actual=-0.000317574 | |
idx=10 predicted losschange=-0.000883857 actual=-0.000883102 | |
clblas teardown | |
[ OK ] testupdateweights.conv1z (554 ms) | |
[ RUN ] testupdateweights.numericallytest | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=1 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=1 filterSize=1 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=1 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=1 filterSize=1 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
loss 0.0367983 loss2 0.0367913 change: 7.01472e-06 | |
sumweightsdiff -0.000264842 | |
loss change 7.01472e-06 | |
estimatedLossChangeFromW 7.01413e-06 | |
[ OK ] testupdateweights.numericallytest (388 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=1 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=1 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
loss 1.23358 loss2 1.21612 change: 0.0174605 | |
sumweightsdiff -0.0132709 | |
loss change 0.0174605 | |
estimatedLossChangeFromW 0.0176118 | |
[ OK ] testupdateweights.numericallytest_imagesize3 (394 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize5 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=1 outputSize=5 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=1 outputSize=5 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
loss 4.12958 loss2 4.11952 change: 0.0100665 | |
sumweightsdiff -0.0101708 | |
loss change 0.0100665 | |
estimatedLossChangeFromW 0.0103444 | |
[ OK ] testupdateweights.numericallytest_imagesize5 (398 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize9 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=1 outputSize=9 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=1 outputSize=9 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=1 100.0% | |
TOTAL : params=1 | |
loss 13.4341 loss2 13.4339 change: 0.000207901 | |
sumweightsdiff 0.00153953 | |
loss change 0.000207901 | |
estimatedLossChangeFromW 0.000237015 | |
[ OK ] testupdateweights.numericallytest_imagesize9 (392 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize9_filtersize9 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=9 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=81 100.0% | |
TOTAL : params=81 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=9 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=81 100.0% | |
TOTAL : params=81 | |
loss 0.135896 loss2 0.0848782 change: 0.0510182 | |
sumweightsdiff -0.0322406 | |
loss change 0.0510182 | |
estimatedLossChangeFromW 0.0555841 | |
[ OK ] testupdateweights.numericallytest_imagesize9_filtersize9 (456 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize9_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=3 outputSize=7 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=9 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=3 outputSize=7 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
loss 7.70633 loss2 7.41581 change: 0.290529 | |
sumweightsdiff -0.0898813 | |
loss change 0.290529 | |
estimatedLossChangeFromW 0.316231 | |
[ OK ] testupdateweights.numericallytest_imagesize9_filtersize3 (444 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize3_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=3 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=3 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=3 outputSize=1 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
loss 0.0719101 loss2 0.0694461 change: 0.00246406 | |
sumweightsdiff -0.0110647 | |
loss change 0.00246406 | |
estimatedLossChangeFromW 0.00248372 | |
[ OK ] testupdateweights.numericallytest_imagesize3_filtersize3 (401 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
loss 1.20022 loss2 1.17241 change: 0.0278131 | |
sumweightsdiff -0.0203888 | |
loss change 0.0278131 | |
estimatedLossChangeFromW 0.0280929 | |
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3 (411 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_batchsize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=9 100.0% | |
TOTAL : params=9 | |
loss 4.97142 loss2 4.78768 change: 0.183744 | |
sumweightsdiff -0.056004 | |
loss change 0.183744 | |
estimatedLossChangeFromW 0.193264 | |
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_batchsize3 (409 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=27 100.0% | |
TOTAL : params=27 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=27 100.0% | |
TOTAL : params=27 | |
loss 1.08887 loss2 0.9575 change: 0.13137 | |
sumweightsdiff -0.00764531 | |
loss change 0.13137 | |
estimatedLossChangeFromW 0.134379 | |
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3 (440 ms) | |
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3_batchsize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=27 100.0% | |
TOTAL : params=27 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:ActivationLayer{ TANH } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=27 100.0% | |
TOTAL : params=27 | |
loss 4.76631 loss2 4.18154 change: 0.584769 | |
sumweightsdiff 0.029606 | |
loss change 0.584769 | |
estimatedLossChangeFromW 0.620442 | |
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3_batchsize3 (424 ms) | |
[ RUN ] testupdateweights.backprop_weights_2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2 (38 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=2 -D gInputSizeSquared=4 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=2 -D gOutputSizeSquared=4 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=2 -DgInputStripeOuterNumRows=2 -DgInputStripeInnerSize=4 -DgInputStripeOuterSize=4 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=2 -DgOutputStripeSize=4 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize2 (40 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
mismatch for i 0 | |
mismatch for i 1 | |
mismatch for i 2 | |
mismatch for i 3 | |
mismatch for i 4 | |
mismatch for i 5 | |
mismatch for i 6 | |
mismatch for i 7 | |
mismatch for i 8 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize3 (41 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize4_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=2 -D gOutputSizeSquared=4 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=8 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=32 -DgInputStripeMarginSize=8 -DgOutputStripeNumRows=2 -DgOutputStripeSize=4 | |
mismatch for i 0 | |
mismatch for i 8 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize4_filtersize3 (44 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize5_filtersize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
mismatch for i 0 | |
mismatch for i 4 | |
mismatch for i 8 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize5_filtersize3 (48 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=3 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=9 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize1 (46 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize16_filtersize1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=16 -D gInputSizeSquared=256 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=16 -D gOutputSizeSquared=256 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=16 -DgInputStripeOuterNumRows=16 -DgInputStripeInnerSize=256 -DgInputStripeOuterSize=256 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=16 -DgOutputStripeSize=256 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize16_filtersize1 (46 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1 | |
LayerDimensions{ inputPlanes=1 inputSize=17 numFilters=1 filterSize=1 outputSize=17 padZeros=0 biased=0 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=17 -D gInputSizeSquared=289 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=17 -D gOutputSizeSquared=289 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=17 -DgInputStripeOuterNumRows=17 -DgInputStripeInnerSize=289 -DgInputStripeOuterSize=289 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=17 -DgOutputStripeSize=289 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1 (46 ms) | |
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1_moredata | |
expectedresult: -958.715 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=17 -D gInputSizeSquared=289 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=17 -D gOutputSizeSquared=289 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=17 -DgInputStripeOuterNumRows=17 -DgInputStripeInnerSize=289 -DgInputStripeOuterSize=289 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=17 -DgOutputStripeSize=289 | |
mismatch for i 0 | |
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1_moredata (44 ms) | |
[ RUN ] testupdateweights.backprop_instance3_smaller2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
numweights: 36 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=96 -D gInputSizeSquared=9216 -D gNumFilters=1 -D gFilterSize=6 -D gHalfFilterSize=3 -D gFilterSizeSquared=36 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=91 -D gOutputSizeSquared=8281 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=8 -DgInputStripeMarginRows=5 -DgInputStripeInnerNumRows=12 -DgInputStripeOuterNumRows=22 -DgInputStripeInnerSize=1152 -DgInputStripeOuterSize=2112 -DgInputStripeMarginSize=480 -DgOutputStripeNumRows=12 -DgOutputStripeSize=1092 | |
138 0 0 0 0 0 | |
132 0 0 0 0 0 | |
138 0 0 0 0 0 | |
138 0 0 0 0 0 | |
138 0 0 0 0 0 | |
132 0 0 0 0 0 | |
138 0 0 0 0 0 | |
132 0 0 0 0 0 | |
138 0 0 0 0 0 | |
138 0 0 0 0 0 | |
138 0 0 0 0 0 | |
132 0 0 0 0 0 | |
...... | |
...... | |
...... | |
...... | |
...... | |
...... | |
0=0 0 0 0 0 0 0 0 | |
1=0 0 0 0 0 0 0 0 | |
2=0 0 0 0 0 0 0 0 | |
3=0 0 0 0 0 0 0 0 | |
4=0 0 0 0 0 0 0 0 | |
5=0 0 0 0 0 0 0 0 | |
6=0 0 0 0 0 0 0 0 | |
7=0 0 0 0 0 0 0 0 | |
8=0 0 0 0 0 0 0 0 | |
9=0 0 0 0 0 0 0 0 | |
10=0 0 0 0 0 0 0 0 | |
11=0 0 0 0 0 0 0 0 | |
0=0 0 0 0 0 0 0 0 | |
1=0 0 0 0 0 0 0 0 | |
2=0 0 0 0 0 0 0 0 | |
3=0 0 0 0 0 0 0 0 | |
4=0 0 0 0 0 0 0 0 | |
5=0 0 0 0 0 0 0 0 | |
6=0 0 0 0 0 0 0 0 | |
7=0 0 0 0 0 0 0 0 | |
8=0 0 0 0 0 0 0 0 | |
9=0 0 0 0 0 0 0 0 | |
10=0 0 0 0 0 0 0 0 | |
11=0 0 0 0 0 0 0 0 | |
12=0 0 0 0 0 0 0 0 | |
13=0 0 0 0 0 0 0 0 | |
14=0 0 0 0 0 0 0 0 | |
15=0 0 0 0 0 0 0 0 | |
16=0 0 0 0 0 0 0 0 | |
17=0 0 0 0 0 0 0 0 | |
18=0 0 0 0 0 0 0 0 | |
19=0 0 0 0 0 0 0 0 | |
[ OK ] testupdateweights.backprop_instance3_smaller2 (72 ms) | |
[----------] 23 tests from testupdateweights (6142 ms total) | |
[----------] 17 tests from testforward | |
[ RUN ] testforward.imagesize2_nopadzeros | |
expected number of output: 4 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testforward.imagesize2_nopadzeros (177 ms) | |
[ RUN ] testforward.imagesize2_padzeros | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
checking result[0]=0 expecting: 0 | |
checking result[1]=0 expecting: 0 | |
checking result[2]=0 expecting: 0 | |
checking result[3]=0.2 expecting: 0.2 | |
checking result[4]=-0.13 expecting: -0.13 | |
checking result[5]=-0.15 expecting: -0.15 | |
checking result[6]=0 expecting: 0 | |
checking result[7]=0 expecting: 0 | |
checking result[8]=0 expecting: 0 | |
checking result[9]=0 expecting: 0 | |
checking result[10]=0 expecting: 0 | |
checking result[11]=0 expecting: 0 | |
checking result[12]=-0.55 expecting: -0.55 | |
checking result[13]=0.02 expecting: 0.02 | |
checking result[14]=0.21 expecting: 0.21 | |
checking result[27]=-14.3 expecting: -14.3 | |
checking result[28]=-9.6 expecting: -9.6 | |
checking result[29]=11.9 expecting: 11.9 | |
checking result[35]=0.46 expecting: 0.46 | |
[ OK ] testforward.imagesize2_padzeros (70 ms) | |
[ RUN ] testforward.imagesize3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
test1 ok | |
[ OK ] testforward.imagesize3 (67 ms) | |
[ RUN ] testforward.test2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testforward.test2 (64 ms) | |
[ RUN ] testforward.test3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testforward.test3 (67 ms) | |
[ RUN ] testforward.compare_0_1_biased_nopad | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_0_1_biased_nopad (89 ms) | |
[ RUN ] testforward.compare_0_1_biased_pad | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_0_1_biased_pad (91 ms) | |
[ RUN ] testforward.compare_1_n_biased_nopad | |
instance: 2 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
instance: 3 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
instance: 4 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
instance: 6 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
instance: 7 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_1_n_biased_nopad (913 ms) | |
[ RUN ] testforward.compare_1_n_biased_pad | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
instance: 2 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
instance: 3 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
instance: 4 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
instance: 6 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
instance: 7 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_1_n_biased_pad (876 ms) | |
[ RUN ] testforward.compare_1_5_biased_nopad | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=19 outputSize=1 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=19 outputSize=1 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_1_5_biased_nopad (141 ms) | |
[ RUN ] testforward.compare_1_4_fcscenario | |
LayerDimensions{ inputPlanes=10 inputSize=24 numFilters=10 filterSize=24 outputSize=1 padZeros=0 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=10 inputSize=24 numFilters=10 filterSize=24 outputSize=1 padZeros=0 biased=1 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_1_4_fcscenario (121 ms) | |
[ RUN ] testforward.compare_break1_0_1 | |
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 1 | |
dump enabled=0 | |
batch 0 batchsize 1 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_break1_0_1 (66 ms) | |
[ RUN ] testforward.compare_break1_0_4 | |
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 1 | |
dump enabled=0 | |
batch 0 batchsize 1 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0} | |
clblas teardown | |
[ OK ] testforward.compare_break1_0_4 (70 ms) | |
[ RUN ] testforward.comparespecific_break2 | |
LayerDimensions{ inputPlanes=64 inputSize=19 numFilters=64 filterSize=19 outputSize=1 padZeros=0 biased=0 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
batch 0 batchsize 4 | |
dump enabled=0 | |
batch 0 batchsize 4 | |
dump enabled=0 | |
LayerDimensions{ inputPlanes=64 inputSize=19 numFilters=64 filterSize=19 outputSize=1 padZeros=0 biased=0 skip=0} | |
clblas teardown | |
[ OK ] testforward.comparespecific_break2 (175 ms) | |
[ RUN ] testforward.softmax | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
output[0]=0.0320586 | |
output[1]=0.0871443 | |
output[2]=0.643914 | |
output[3]=0.236883 | |
loss 0.44019 | |
loss 3.44019 | |
loss 2.44019 | |
loss 1.44019 | |
[ OK ] testforward.softmax (2 ms) | |
[ RUN ] testforward.softmax_byplane | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
output[0]=0.0320586 | |
output[1]=0.0871443 | |
output[2]=0.643914 | |
output[3]=0.236883 | |
loss 0.44019 | |
loss 3.44019 | |
loss 2.44019 | |
loss 1.44019 | |
[ OK ] testforward.softmax_byplane (1 ms) | |
[ RUN ] testforward.crash_from_jm | |
-D gNumInputPlanes=32 -D gInputPlanes=32 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=20 -D gFilterSize=28 -D gHalfFilterSize=14 -D gFilterSizeSquared=784 -D gNumOutputPlanes=20 -D gOutputPlanes=20 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
dump enabled=0 | |
[ OK ] testforward.crash_from_jm (160 ms) | |
[----------] 17 tests from testforward (3151 ms total) | |
[----------] 2 tests from testfilehelper | |
[ RUN ] testfilehelper.testfilehelper | |
[ OK ] testfilehelper.testfilehelper (4 ms) | |
[ RUN ] testfilehelper.testreadchunk | |
[ OK ] testfilehelper.testreadchunk (2 ms) | |
[----------] 2 tests from testfilehelper (6 ms total) | |
[----------] 12 tests from testsimpleconvolvenet | |
[ RUN ] testsimpleconvolvenet.imagesize1_planes2_filters2_unbiased_tanh | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 0.141046 | |
accuracy: 2/2 100% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 144ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 144ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 94ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 94ms | |
forward layer selected kernel 1 | |
loss, E, 0.0733091 | |
accuracy: 2/2 100% | |
loss, E, 0.0426809 | |
accuracy: 2/2 100% | |
loss, E, 0.0262453 | |
accuracy: 2/2 100% | |
loss, E, 0.0164245 | |
accuracy: 2/2 100% | |
loss, E, 0.0107573 | |
accuracy: 2/2 100% | |
accuracy: 2/2 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize1_planes2_filters2_unbiased_tanh (927 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize1_planes2_filters2_tanh | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 0.964924 | |
accuracy: 1/2 50% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 206ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 206ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 97ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 97ms | |
forward layer selected kernel 1 | |
loss, E, 0.00570459 | |
accuracy: 2/2 100% | |
loss, E, 1.34828e-05 | |
accuracy: 2/2 100% | |
loss, E, 3.61852e-08 | |
accuracy: 2/2 100% | |
accuracy: 2/2 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize1_planes2_filters2_tanh (1033 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_tanh | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 1.13283 | |
accuracy: 3/4 75% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 229ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 229ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss, E, 0.00996342 | |
accuracy: 4/4 100% | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 114ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 114ms | |
forward layer selected kernel 1 | |
loss, E, 4.70668e-05 | |
accuracy: 4/4 100% | |
loss, E, 4.09802e-07 | |
accuracy: 4/4 100% | |
accuracy: 4/4 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_tanh (1064 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize1_2planes_filtersize1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 0.751601 | |
accuracy: 2/2 100% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 227ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 227ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss, E, 0.195916 | |
accuracy: 2/2 100% | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 101ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 101ms | |
forward layer selected kernel 1 | |
loss, E, 0.0679117 | |
accuracy: 2/2 100% | |
loss, E, 0.023677 | |
accuracy: 2/2 100% | |
loss, E, 0.00825563 | |
accuracy: 2/2 100% | |
loss, E, 0.00287856 | |
accuracy: 2/2 100% | |
loss, E, 0.00100369 | |
accuracy: 2/2 100% | |
loss, E, 0.000349964 | |
accuracy: 2/2 100% | |
accuracy: 2/2 100% | |
accuracy: 2/2 | |
loss, E, 0.000150648 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize1_2planes_filtersize1 (1025 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_relu | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 1.48951 | |
accuracy: 2/4 50% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 213ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 213ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss, E, 1.12957 | |
accuracy: 2/4 50% | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 112ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 112ms | |
forward layer selected kernel 1 | |
loss, E, 0.070782 | |
accuracy: 4/4 100% | |
loss, E, 0.003026 | |
accuracy: 4/4 100% | |
loss, E, 0.00021158 | |
accuracy: 4/4 100% | |
loss, E, 1.96858e-05 | |
accuracy: 4/4 100% | |
loss, E, 2.03002e-06 | |
accuracy: 4/4 100% | |
loss, E, 2.15572e-07 | |
accuracy: 4/4 100% | |
loss, E, 2.3083e-08 | |
accuracy: 4/4 100% | |
loss, E, 2.48239e-09 | |
accuracy: 4/4 100% | |
loss, E, 4.14442e-10 | |
accuracy: 4/4 100% | |
accuracy: 4/4 | |
loss, E, 4.14442e-10 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_relu (1085 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_linear | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 0.50604 | |
accuracy: 4/4 100% | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 229ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 229ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss, E, 0.0565529 | |
accuracy: 4/4 100% | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 115ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 115ms | |
forward layer selected kernel 1 | |
loss, E, 0.00777245 | |
accuracy: 4/4 100% | |
loss, E, 0.00106831 | |
accuracy: 4/4 100% | |
loss, E, 0.000218376 | |
accuracy: 4/4 100% | |
accuracy: 4/4 | |
loss, E, 0.000218376 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_linear (1025 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize1_n2_2layers_unbiased | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 1ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
epoch 0 loss, E, 0.0559531 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
epoch 1 loss, E, 0.0254554 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 97ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
epoch 2 loss, E, 0.0172943 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 97ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 201ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 34ms | |
epoch 3 loss, E, 0.0138013 | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 201ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 34ms | |
calcGradWeights layer selected kernel 1 | |
epoch 4 loss, E, 0.0115848 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 2ms | |
epoch 5 loss, E, 0.00987036 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 93ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 33ms | |
epoch 6 loss, E, 0.00844797 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 93ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 1ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 2ms | |
forward kernel 7 time: 33ms | |
forward layer selected kernel 2 | |
epoch 7 loss, E, 0.00724182 | |
epoch 8 loss, E, 0.00621212 | |
epoch 9 loss, E, 0.00533106 | |
epoch 10 loss, E, 0.00457645 | |
epoch 11 loss, E, 0.00392979 | |
epoch 12 loss, E, 0.00337539 | |
epoch 13 loss, E, 0.00289992 | |
epoch 14 loss, E, 0.002492 | |
epoch 15 loss, E, 0.00214191 | |
epoch 16 loss, E, 0.00184138 | |
epoch 17 loss, E, 0.00158331 | |
epoch 18 loss, E, 0.00136164 | |
epoch 19 loss, E, 0.0011712 | |
epoch 20 loss, E, 0.00100754 | |
epoch 21 loss, E, 0.000866877 | |
epoch 22 loss, E, 0.000745946 | |
epoch 23 loss, E, 0.000641966 | |
epoch 24 loss, E, 0.000552543 | |
epoch 25 loss, E, 0.000475625 | |
epoch 26 loss, E, 0.000409454 | |
epoch 27 loss, E, 0.000352522 | |
epoch 28 loss, E, 0.000303531 | |
epoch 29 loss, E, 0.00026137 | |
epoch 30 loss, E, 0.000225082 | |
epoch 31 loss, E, 0.000193845 | |
epoch 32 loss, E, 0.000166954 | |
epoch 33 loss, E, 0.000143801 | |
epoch 34 loss, E, 0.000123866 | |
epoch 35 loss, E, 0.000106699 | |
epoch 36 loss, E, 9.19176e-05 | |
epoch 37 loss, E, 7.91864e-05 | |
epoch 38 loss, E, 6.82211e-05 | |
epoch 39 loss, E, 5.87767e-05 | |
layer 0:InputLayer{ outputPlanes=1 outputSize=1 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 4:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=4 40.0% | |
layer 3: params=6 60.0% | |
TOTAL : params=10 | |
loss, E, 5.87767e-05 | |
accuracy: 2/2 100% | |
accuracy: 2/2 | |
loss, E, 5.87767e-05 | |
loss, E, 5.87767e-05 | |
layer 0:InputLayer{ outputPlanes=1 outputSize=1 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 4:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=4 40.0% | |
layer 3: params=6 60.0% | |
TOTAL : params=10 | |
float weights1[] = {-0.303866f, -1.59823f}; | |
float weights3[] = {0.426358f, -0.719592f, -0.420361f, 0.719566f}; | |
float bias1[] = {-0.324465f, 0.60279f}; | |
float bias3[] = {0.506862f, -0.506837f}; | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize1_n2_2layers_unbiased (1727 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize1_n2_2layers_biased | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 1ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 1.19067 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 102ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 102ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 195ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 36ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 1ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 195ms | |
calcGradWeights layer selected kernel 2 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 36ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss, E, 0.0667568 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 93ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 33ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 93ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 33ms | |
forward layer selected kernel 1 | |
loss, E, 0.00923595 | |
loss, E, 0.00112611 | |
loss, E, 0.0001174 | |
loss, E, 1.15642e-05 | |
dump enabled=0 | |
loss, E, 1.78564e-06 | |
accuracy: 2/2 100% | |
accuracy: 2/2 | |
loss, E, 1.78564e-06 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize1_n2_2layers_biased (1593 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n3 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 2.45455 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 110ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=3 -D gFilterSize=4 -D gHalfFilterSize=2 -D gFilterSizeSquared=16 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=3 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=40 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=2 -D gHalfFilterSize=1 -D gFilterSizeSquared=4 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=4 -D gOutputSizeSquared=16 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=1 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=35 -DgInputStripeMarginSize=5 -DgOutputStripeNumRows=4 -DgOutputStripeSize=16 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 110ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 205ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 193ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 205ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 193ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 110ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 110ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 103ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 103ms | |
forward layer selected kernel 1 | |
loss, E, 0.000668798 | |
loss, E, 8.79736e-08 | |
loss, E, 4.64206e-11 | |
loss, E, 3.85469e-13 | |
loss, E, 1.32339e-13 | |
loss, E, 1.14131e-13 | |
loss, E, 8.9706e-14 | |
loss, E, 6.83897e-14 | |
loss, E, 6.83897e-14 | |
loss, E, 6.83897e-14 | |
accuracy: 3/3 100% | |
accuracy: 3/3 | |
loss, E, 6.83897e-14 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n3 (3364 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n6 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 3.64011 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 116ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=3 -D gFilterSize=4 -D gHalfFilterSize=2 -D gFilterSizeSquared=16 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=3 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=40 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=2 -D gHalfFilterSize=1 -D gFilterSizeSquared=4 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=4 -D gOutputSizeSquared=16 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=1 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=35 -DgInputStripeMarginSize=5 -DgOutputStripeNumRows=4 -DgOutputStripeSize=16 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 5ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 116ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 238ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 213ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 238ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 213ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 115ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 5ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 115ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 103ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 103ms | |
forward layer selected kernel 1 | |
loss, E, 4.13952e-10 | |
loss, E, 2.13163e-14 | |
loss, E, 1.77636e-14 | |
loss, E, 1.68754e-14 | |
loss, E, 8.88178e-15 | |
accuracy: 6/6 100% | |
accuracy: 6/6 | |
loss, E, 8.88178e-15 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n6 (2886 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n6 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 4.00796 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 122ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 122ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 227ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 253ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 227ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 253ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 158ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 158ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 109ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 109ms | |
forward layer selected kernel 1 | |
loss, E, 1.87774e-08 | |
loss, E, 5.06262e-14 | |
loss, E, 6.21725e-15 | |
accuracy: 6/6 100% | |
accuracy: 6/6 | |
loss, E, 6.21725e-15 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n6 (2494 ms) | |
[ RUN ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n18 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
loss, E, 7.78557 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 129ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 129ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 219ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 247ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 3ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 219ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 247ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 155ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 3ms | |
forward kernel 7 time: 155ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 111ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 111ms | |
forward layer selected kernel 1 | |
loss, E, 0.0959845 | |
loss, E, 0.0247032 | |
loss, E, 0.0102922 | |
loss, E, 0.00504672 | |
loss, E, 0.00281197 | |
loss, E, 0.00167871 | |
loss, E, 0.00107965 | |
loss, E, 0.000748767 | |
loss, E, 0.00055091 | |
loss, E, 0.000425275 | |
loss, E, 0.000341032 | |
loss, E, 0.000281538 | |
loss, E, 0.000237609 | |
loss, E, 0.000203943 | |
loss, E, 0.000177351 | |
loss, E, 0.000155821 | |
loss, E, 0.000138037 | |
loss, E, 0.000123113 | |
loss, E, 0.000110426 | |
loss, E, 9.95232e-05 | |
loss, E, 9.00714e-05 | |
loss, E, 8.18146e-05 | |
loss, E, 7.45556e-05 | |
loss, E, 6.81377e-05 | |
loss, E, 6.24353e-05 | |
loss, E, 5.73468e-05 | |
loss, E, 5.27886e-05 | |
loss, E, 4.86925e-05 | |
loss, E, 4.49983e-05 | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=3 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=3 numFilters=3 filterSize=3 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 4:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 1: params=30 26.3% | |
layer 3: params=84 73.7% | |
TOTAL : params=114 | |
loss, E, 4.16905e-05 | |
accuracy: 18/18 100% | |
accuracy: 18/18 | |
loss, E, 4.16905e-05 | |
clblas teardown | |
[ OK ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n18 (6496 ms) | |
[----------] 12 tests from testsimpleconvolvenet (24719 ms total) | |
[----------] 3 tests from testlogicaloperators | |
[ RUN ] testlogicaloperators.Convolve_1layer_biased_And | |
And | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
Loss L 3.55932 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 210ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 210ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
Loss L 0.914111 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 96ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 96ms | |
forward layer selected kernel 1 | |
Loss L 0.4786 | |
Loss L 0.32969 | |
accuracy: 4/4 | |
loss, E, 0.284304 | |
clblas teardown | |
[ OK ] testlogicaloperators.Convolve_1layer_biased_And (925 ms) | |
[ RUN ] testlogicaloperators.Convolve_1layerbiased_Or | |
Or, convolve | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
Loss L 4.72064 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 203ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 203ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
Loss L 0.631151 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 96ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 96ms | |
forward layer selected kernel 1 | |
Loss L 0.375778 | |
Loss L 0.293813 | |
accuracy: 4/4 100% | |
loss, E, 0.26886 | |
clblas teardown | |
[ OK ] testlogicaloperators.Convolve_1layerbiased_Or (927 ms) | |
[ RUN ] testlogicaloperators.Convolve_2layers_relu_Xor | |
Xor, convolve | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
hand-setting weights... | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
Loss L 0.152638 | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 101ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 101ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 202ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 43ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 202ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 43ms | |
calcGradWeights layer selected kernel 1 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
Loss L 0.00640068 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 95ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 34ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 95ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 34ms | |
forward layer selected kernel 1 | |
Loss L 0.00139435 | |
Loss L 0.000383307 | |
Loss L 0.000117079 | |
Loss L 4.63626e-05 | |
Loss L 1.8873e-05 | |
Loss L 7.15534e-06 | |
Loss L 2.83958e-06 | |
Loss L 1.12727e-06 | |
Loss L 4.44109e-07 | |
Loss L 1.72233e-07 | |
Loss L 6.82345e-08 | |
Loss L 2.76343e-08 | |
Loss L 1.04286e-08 | |
Loss L 4.13357e-09 | |
Loss L 1.67201e-09 | |
Loss L 6.29148e-10 | |
Loss L 2.4837e-10 | |
Loss L 1.00833e-10 | |
Loss L 3.80673e-11 | |
Loss L 1.5131e-11 | |
Loss L 5.84421e-12 | |
Loss L 2.16893e-12 | |
Loss L 9.52127e-13 | |
Loss L 3.58824e-13 | |
Loss L 1.56319e-13 | |
Loss L 9.9476e-14 | |
Loss L 9.9476e-14 | |
Loss L 9.9476e-14 | |
Loss L 9.9476e-14 | |
Loss L 9.23706e-14 | |
Loss L 9.23706e-14 | |
Loss L 9.41469e-14 | |
Loss L 8.70415e-14 | |
Loss L 9.41469e-14 | |
Loss L 8.52651e-14 | |
Loss L 8.52651e-14 | |
Loss L 8.52651e-14 | |
Loss L 8.52651e-14 | |
layer 0:InputLayer{ outputPlanes=2 outputSize=1 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} } | |
layer 4:ActivationLayer{ RELU } | |
layer 5:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=6 50.0% | |
layer 3: params=6 50.0% | |
TOTAL : params=12 | |
accuracy: 4/4 100% | |
loss, E, 8.52651e-14 | |
clblas teardown | |
[ OK ] testlogicaloperators.Convolve_2layers_relu_Xor (1969 ms) | |
[----------] 3 tests from testlogicaloperators (3821 ms total) | |
[----------] 12 tests from testbackward | |
[ RUN ] testbackward.squareloss | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 } | |
layer 2:SquareLossLayer{} | |
inputtotalsize=2400 outputTotalSize=2400 | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 } | |
layer 2:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=44 predicted losschange=-0.000912508 actual=-0.000976562 | |
idx=2245 predicted losschange=0.00785823 actual=0.00805664 | |
idx=648 predicted losschange=0.00965759 actual=0.00976562 | |
idx=586 predicted losschange=0.0136895 actual=0.0136719 | |
idx=730 predicted losschange=0.00117897 actual=0.00146484 | |
idx=611 predicted losschange=0.00152302 actual=0.00195312 | |
idx=1130 predicted losschange=0.0159167 actual=0.0161133 | |
idx=15 predicted losschange=0.0434798 actual=0.0439453 | |
idx=1923 predicted losschange=-0.00790002 actual=-0.0078125 | |
idx=670 predicted losschange=0.0335141 actual=0.0336914 | |
[ OK ] testbackward.squareloss (2 ms) | |
[ RUN ] testbackward.crossentropyloss | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 } | |
layer 2:Layer{} | |
inputtotalsize=300 outputTotalSize=300 | |
layer 0:InputLayer{ outputPlanes=3 outputSize=5 } | |
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 } | |
layer 2:Layer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=44 predicted losschange=0.000274935 actual=0.000274658 | |
idx=145 predicted losschange=-0.000885784 actual=-0.00088501 | |
idx=48 predicted losschange=-0.000859834 actual=-0.000854492 | |
idx=286 predicted losschange=0.00713042 actual=0.00717163 | |
idx=130 predicted losschange=-0.000264829 actual=-0.000244141 | |
idx=11 predicted losschange=-1.98163e-05 actual=0 | |
idx=230 predicted losschange=-0.000594819 actual=-0.000610352 | |
idx=15 predicted losschange=-0.0006499 actual=-0.000640869 | |
idx=123 predicted losschange=-0.000846121 actual=-0.000823975 | |
idx=70 predicted losschange=0.000790196 actual=0.000793457 | |
[ OK ] testbackward.crossentropyloss (1 ms) | |
[ RUN ] testbackward.softmaxloss | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
inputtotalsize=10 outputTotalSize=10 | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=4 predicted losschange=0.000113075 actual=0.00011301 | |
idx=5 predicted losschange=0.000145627 actual=0.000145674 | |
idx=8 predicted losschange=3.16699e-05 actual=3.19481e-05 | |
idx=6 predicted losschange=4.89271e-06 actual=5.24521e-06 | |
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05 | |
idx=1 predicted losschange=-8.26119e-05 actual=-8.27312e-05 | |
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05 | |
idx=5 predicted losschange=0.000145627 actual=0.000145674 | |
idx=3 predicted losschange=-5.50179e-05 actual=-5.50747e-05 | |
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05 | |
[ OK ] testbackward.softmaxloss (2 ms) | |
[ RUN ] testbackward.squareloss2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SquareLossLayer{} | |
batchSize: 32 | |
inputtotalsize=160 outputTotalSize=160 | |
layer SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=44 predicted losschange=0.000126406 actual=0.000125885 | |
idx=5 predicted losschange=0.00461891 actual=0.00464439 | |
idx=8 predicted losschange=0.000356787 actual=0.000356674 | |
idx=106 predicted losschange=0.00716324 actual=0.00719643 | |
idx=90 predicted losschange=0.000474759 actual=0.000480652 | |
idx=131 predicted losschange=0.000979017 actual=0.000984192 | |
idx=10 predicted losschange=0.000660134 actual=0.000663757 | |
idx=15 predicted losschange=0.00961313 actual=0.00965118 | |
idx=3 predicted losschange=0.00264732 actual=0.00267029 | |
idx=30 predicted losschange=0.00865312 actual=0.00868607 | |
[ OK ] testbackward.squareloss2 (1 ms) | |
[ RUN ] testbackward.crossentropy2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:Layer{} | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:Layer{} | |
batchSize: 2 | |
inputtotalsize=10 outputTotalSize=10 | |
layer Layer{} | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:Layer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=4 predicted losschange=0.00258649 actual=nan | |
idx=5 predicted losschange=0.0227095 actual=nan | |
idx=8 predicted losschange=-0.00202714 actual=nan | |
idx=6 predicted losschange=-0.000846508 actual=nan | |
idx=0 predicted losschange=-0.000424821 actual=nan | |
idx=1 predicted losschange=-0.00171216 actual=nan | |
idx=0 predicted losschange=-0.000424821 actual=nan | |
idx=5 predicted losschange=0.0227095 actual=nan | |
idx=3 predicted losschange=0.0123444 actual=nan | |
idx=0 predicted losschange=-0.000424821 actual=nan | |
[ OK ] testbackward.crossentropy2 (2 ms) | |
[ RUN ] testbackward.softmax2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
batchSize: 2 | |
inputtotalsize=10 outputTotalSize=10 | |
layer SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
layer 0:InputLayer{ outputPlanes=5 outputSize=1 } | |
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 } | |
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
Parameters overview: (skipping 3 layers with 0 params) | |
TOTAL : params=0 | |
idx=4 predicted losschange=0.00035729 actual=0.000357628 | |
idx=5 predicted losschange=0.0015055 actual=0.00151086 | |
idx=8 predicted losschange=-5.63632e-05 actual=-5.65052e-05 | |
idx=6 predicted losschange=-1.48864e-05 actual=-1.4782e-05 | |
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05 | |
idx=1 predicted losschange=-0.000287167 actual=-0.000287056 | |
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05 | |
idx=5 predicted losschange=0.0015055 actual=0.00151086 | |
idx=3 predicted losschange=-0.000152824 actual=-0.00014782 | |
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05 | |
[ OK ] testbackward.softmax2 (1 ms) | |
[ RUN ] testbackward.conv1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} } | |
layer 3:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} } | |
layer 3:SquareLossLayer{} | |
batchSize: 4 | |
inputtotalsize=128 outputTotalSize=32 | |
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} } | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 2: params=36 100.0% | |
TOTAL : params=36 | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
idx=44 predicted losschange=-0.000314065 actual=-0.000314236 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
idx=37 predicted losschange=0.00253314 actual=0.00254202 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
idx=40 predicted losschange=0.00496457 actual=0.00497627 | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
idx=106 predicted losschange=-0.000453683 actual=-0.000446796 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 117ms | |
idx=122 predicted losschange=0.000748635 actual=-0.000446796 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 117ms | |
forward layer selected kernel 1 | |
idx=99 predicted losschange=5.24616e-05 actual=5.38826e-05 | |
idx=10 predicted losschange=0.000438654 actual=0.000439644 | |
idx=47 predicted losschange=-0.0013164 actual=-0.00131559 | |
idx=67 predicted losschange=0.00172771 actual=0.0017333 | |
idx=126 predicted losschange=0.00328649 actual=0.00329351 | |
clblas teardown | |
[ OK ] testbackward.conv1 (608 ms) | |
[ RUN ] testbackward.fc1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 } | |
layer 3:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 } | |
layer 3:SquareLossLayer{} | |
batchSize: 4 | |
inputtotalsize=128 outputTotalSize=16 | |
layer FullyConnectedLayer{ numPlanes=4 imageSize=1 } | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
layer 0:InputLayer{ outputPlanes=2 outputSize=4 } | |
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 } | |
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 2: params=128 100.0% | |
TOTAL : params=128 | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
idx=44 predicted losschange=0.000349482 actual=0.000349522 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
idx=37 predicted losschange=0.00073425 actual=0.000735283 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
idx=40 predicted losschange=0.000336202 actual=0.0003438 | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
idx=106 predicted losschange=-0.00125048 actual=-0.00124693 | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
idx=122 predicted losschange=-0.000898851 actual=-0.000895023 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 110ms | |
idx=99 predicted losschange=0.000183326 actual=-0.000895023 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 110ms | |
forward layer selected kernel 1 | |
idx=10 predicted losschange=0.000889723 actual=0.000889778 | |
idx=47 predicted losschange=-0.000766629 actual=-0.0007658 | |
idx=67 predicted losschange=0.00080667 actual=0.000810146 | |
idx=126 predicted losschange=-0.00017344 actual=-0.000169754 | |
clblas teardown | |
[ OK ] testbackward.fc1 (620 ms) | |
[ RUN ] testbackward.act1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=1 outputSize=2 } | |
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:SquareLossLayer{} | |
layer 0:InputLayer{ outputPlanes=1 outputSize=2 } | |
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:SquareLossLayer{} | |
batchSize: 1 | |
inputtotalsize=4 outputTotalSize=4 | |
layer ActivationLayer{ RELU } | |
layer 0:InputLayer{ outputPlanes=1 outputSize=2 } | |
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
TOTAL : params=0 | |
idx=0 predicted losschange=-0.000880961 actual=-0.00088048 | |
idx=1 predicted losschange=-0.00151209 actual=-0.00151044 | |
idx=0 predicted losschange=-0.000880961 actual=-0.00088048 | |
idx=2 predicted losschange=-0.00245153 actual=-0.0024423 | |
idx=2 predicted losschange=-0.00245153 actual=-0.0024423 | |
idx=3 predicted losschange=-0.00214455 actual=-0.00212085 | |
idx=2 predicted losschange=-0.00245153 actual=-0.0024423 | |
idx=3 predicted losschange=-0.00214455 actual=-0.00212085 | |
idx=3 predicted losschange=-0.00214455 actual=-0.00212085 | |
idx=2 predicted losschange=-0.00245153 actual=-0.0024423 | |
[ OK ] testbackward.act1 (65 ms) | |
[ RUN ] testbackward.checknumerically | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
loss 0.0986296 loss2 0.0984814 change: 0.000148199 | |
sumweightsdiff 0.0038507 | |
loss change 0.000148199 | |
estimatedLossChangeFromW 0.000148279 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
loss 0.0984814 loss2 0.0983336 change: 0.000147872 | |
sumweightsdiff 0.00384641 | |
loss change 0.000147872 | |
estimatedLossChangeFromW 0.000147948 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 65ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 39ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 65ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 39ms | |
forward layer selected kernel 1 | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 67ms | |
calcGradWeights try kernel 3 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
loss 0.0983336 loss2 0.098186 change: 0.000147544 | |
sumweightsdiff 0.00384223 | |
loss change 0.000147544 | |
estimatedLossChangeFromW 0.000147628 | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 67ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 101ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 34ms | |
loss 0.098186 loss2 0.0980388 change: 0.000147216 | |
sumweightsdiff 0.00383794 | |
loss change 0.000147216 | |
estimatedLossChangeFromW 0.000147298 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 101ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 34ms | |
calcGradWeights layer selected kernel 1 | |
loss 0.0980388 loss2 0.0978919 change: 0.000146888 | |
sumweightsdiff 0.00383377 | |
loss change 0.000146888 | |
estimatedLossChangeFromW 0.000146978 | |
clblas teardown | |
[ OK ] testbackward.checknumerically (1510 ms) | |
[ RUN ] testbackward.checknumerically_imagesize5_filter3_relu | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
loss 630.466 loss2 608.021 change: 22.4443 | |
sumweightsdiff -0.035685 | |
loss change 22.4443 | |
estimatedLossChangeFromW 22.6629 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 99ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 44ms | |
loss 608.021 loss2 586.349 change: 21.672 | |
sumweightsdiff -0.0350289 | |
loss change 21.672 | |
estimatedLossChangeFromW 21.7974 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 99ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 44ms | |
forward layer selected kernel 1 | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 175ms | |
calcGradWeights try kernel 3 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
loss 586.349 loss2 565.324 change: 21.025 | |
sumweightsdiff -0.0345262 | |
loss change 21.025 | |
estimatedLossChangeFromW 21.2378 | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 175ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 135ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 40ms | |
loss 565.324 loss2 545.133 change: 20.1916 | |
sumweightsdiff -0.0338754 | |
loss change 20.1916 | |
estimatedLossChangeFromW 20.3956 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 135ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 40ms | |
calcGradWeights layer selected kernel 1 | |
loss 545.133 loss2 525.742 change: 19.3912 | |
sumweightsdiff -0.0332378 | |
loss change 19.3912 | |
estimatedLossChangeFromW 19.5872 | |
loss 525.742 loss2 507.119 change: 18.6229 | |
sumweightsdiff -0.0326132 | |
loss change 18.6229 | |
estimatedLossChangeFromW 18.8111 | |
loss 507.119 loss2 489.233 change: 17.8853 | |
sumweightsdiff -0.032001 | |
loss change 17.8853 | |
estimatedLossChangeFromW 18.066 | |
loss 489.233 loss2 472.056 change: 17.1772 | |
sumweightsdiff -0.0314012 | |
loss change 17.1772 | |
estimatedLossChangeFromW 17.3506 | |
loss 472.056 loss2 455.559 change: 16.4975 | |
sumweightsdiff -0.0308135 | |
loss change 16.4975 | |
estimatedLossChangeFromW 16.6639 | |
loss 455.559 loss2 439.714 change: 15.8447 | |
sumweightsdiff -0.0302379 | |
loss change 15.8447 | |
estimatedLossChangeFromW 16.0046 | |
loss 439.714 loss2 424.416 change: 15.2976 | |
sumweightsdiff -0.0296733 | |
loss change 15.2976 | |
estimatedLossChangeFromW 15.3717 | |
loss 424.416 loss2 409.545 change: 14.871 | |
sumweightsdiff -0.0299227 | |
loss change 14.871 | |
estimatedLossChangeFromW 15.0234 | |
loss 409.545 loss2 395.271 change: 14.274 | |
sumweightsdiff -0.0293575 | |
loss change 14.274 | |
estimatedLossChangeFromW 14.4202 | |
loss 395.271 loss2 381.57 change: 13.7013 | |
sumweightsdiff -0.0288033 | |
loss change 13.7013 | |
estimatedLossChangeFromW 13.8415 | |
loss 381.57 loss2 368.418 change: 13.1519 | |
sumweightsdiff -0.0282608 | |
loss change 13.1519 | |
estimatedLossChangeFromW 13.2864 | |
loss 368.418 loss2 355.794 change: 12.6248 | |
sumweightsdiff -0.0277294 | |
loss change 12.6248 | |
estimatedLossChangeFromW 12.7538 | |
loss 355.794 loss2 343.675 change: 12.119 | |
sumweightsdiff -0.027209 | |
loss change 12.119 | |
estimatedLossChangeFromW 12.2429 | |
loss 343.675 loss2 332.041 change: 11.634 | |
sumweightsdiff -0.0266991 | |
loss change 11.634 | |
estimatedLossChangeFromW 11.7526 | |
loss 332.041 loss2 320.872 change: 11.1684 | |
sumweightsdiff -0.0261997 | |
loss change 11.1684 | |
estimatedLossChangeFromW 11.2823 | |
loss 320.872 loss2 310.15 change: 10.7218 | |
sumweightsdiff -0.0257105 | |
loss change 10.7218 | |
estimatedLossChangeFromW 10.8312 | |
clblas teardown | |
[ OK ] testbackward.checknumerically_imagesize5_filter3_relu (1877 ms) | |
[ RUN ] testbackward.compare_1_n_kgsgo_32c5 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
-D BIASED -D gNumInputPlanes=32 -D gInputPlanes=32 -D gInputSize=19 -D gInputSizeSquared=361 -D gNumFilters=32 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=32 -D gOutputPlanes=32 -D gOutputSize=19 -D gOutputSizeSquared=361 -D gPadZeros=1 -D gMargin=2 -D gEven=0 -D gSkip=0 | |
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.07916 || -0.118675 || 0.0416933 || 0.100887 || -0.106013 | |
output[1]=-0.0574008 -0.0574008 SAME || 0.099984 || 0.0155394 || 0.00411644 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087 | |
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.04953 || -0.109186 || 0.104814 | |
output[3]=-0.0805896 -0.0805896 SAME || 0.0216207 || -0.128649 || -0.0159031 || 0.0534839 || 0.0301581 || 0.104269 || -0.0841106 | |
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.00649171 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597854 | |
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626 | |
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805987 || 0.0834764 || 0.0631893 | |
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445722 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755 | |
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822076 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944 | |
output[9]=-0.0725742 -0.0725742 SAME || 0.0486592 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909432 || -0.0152857 | |
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161164 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338 | |
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812487 || 0.0532493 || -0.0183774 || -0.0937923 || -0.069912 | |
output[12]=0.0389741 0.0389741 SAME || 0.0809882 || 0.0370538 || 0.0241565 || -0.0582968 || 0.0437625 || 0.139931 || -0.065007 | |
output[13]=0.0349705 0.0349705 SAME || -0.0251775 || -0.0759114 || 0.0945214 || 0.00389841 || -0.0377205 || 0.17624 || -0.114476 | |
output[14]=0.0366689 0.0366689 SAME || -0.0348694 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913825 || -0.0745193 | |
output[15]=0.0186965 0.0186965 SAME || 0.0281147 || 0.00937999 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854 | |
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555238 || -0.0203267 || -0.0795422 || -0.123847 | |
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399 | |
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101773 || -0.120741 || -0.00821514 || 0.00894922 || -0.117651 || 0.0631629 | |
output[19]=-0.0110406 -0.0110406 SAME || 0.189081 || 0.0665268 || 0.0622702 || 0.151629 || -0.0172241 || -0.0215623 || 0.0457666 | |
clblas teardown | |
instance 2 | |
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.07916 || -0.118675 || 0.0416933 || 0.100887 || -0.106013 | |
output[1]=-0.0574008 -0.0574008 SAME || 0.099984 || 0.0155394 || 0.00411644 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087 | |
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.04953 || -0.109186 || 0.104814 | |
output[3]=-0.0805896 -0.0805896 SAME || 0.0216207 || -0.128649 || -0.0159031 || 0.0534839 || 0.0301581 || 0.104269 || -0.0841106 | |
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.00649171 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597854 | |
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626 | |
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805987 || 0.0834764 || 0.0631893 | |
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445722 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755 | |
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822076 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944 | |
output[9]=-0.0725742 -0.0725742 SAME || 0.0486592 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909432 || -0.0152857 | |
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161164 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338 | |
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812487 || 0.0532493 || -0.0183774 || -0.0937923 || -0.069912 | |
output[12]=0.0389741 0.0389741 SAME || 0.0809882 || 0.0370538 || 0.0241565 || -0.0582968 || 0.0437625 || 0.139931 || -0.065007 | |
output[13]=0.0349705 0.0349705 SAME || -0.0251775 || -0.0759114 || 0.0945214 || 0.00389841 || -0.0377205 || 0.17624 || -0.114476 | |
output[14]=0.0366689 0.0366689 SAME || -0.0348694 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913825 || -0.0745193 | |
output[15]=0.0186965 0.0186965 SAME || 0.0281147 || 0.00937999 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854 | |
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555238 || -0.0203267 || -0.0795422 || -0.123847 | |
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399 | |
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101773 || -0.120741 || -0.00821514 || 0.00894922 || -0.117651 || 0.0631629 | |
output[19]=-0.0110406 -0.0110406 SAME || 0.189081 || 0.0665268 || 0.0622702 || 0.151629 || -0.0172241 || -0.0215623 || 0.0457666 | |
clblas teardown | |
instance 3 | |
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} | |
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.0791599 || -0.118675 || 0.0416933 || 0.100887 || -0.106013 | |
output[1]=-0.0574008 -0.0574008 SAME || 0.0999841 || 0.0155394 || 0.00411648 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087 | |
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.0495299 || -0.109186 || 0.104814 | |
output[3]=-0.0805896 -0.0805895 SAME || 0.0216206 || -0.128649 || -0.0159031 || 0.053484 || 0.0301581 || 0.104269 || -0.0841106 | |
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.0064917 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597853 | |
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626 | |
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805988 || 0.0834764 || 0.0631893 | |
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445723 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755 | |
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822077 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944 | |
output[9]=-0.0725742 -0.0725742 SAME || 0.0486591 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909431 || -0.0152858 | |
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161163 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338 | |
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812488 || 0.0532493 || -0.0183774 || -0.0937924 || -0.069912 | |
output[12]=0.0389741 0.0389741 SAME || 0.0809881 || 0.0370537 || 0.0241565 || -0.0582967 || 0.0437625 || 0.139931 || -0.0650069 | |
output[13]=0.0349705 0.0349705 SAME || -0.0251774 || -0.0759114 || 0.0945214 || 0.00389844 || -0.0377205 || 0.17624 || -0.114476 | |
output[14]=0.0366689 0.0366688 SAME || -0.0348695 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913827 || -0.0745193 | |
output[15]=0.0186965 0.0186966 SAME || 0.0281147 || 0.00938 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854 | |
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555237 || -0.0203267 || -0.0795422 || -0.123847 | |
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399 | |
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101772 || -0.120741 || -0.00821514 || 0.00894924 || -0.117651 || 0.0631629 | |
output[19]=-0.0110406 -0.0110407 SAME || 0.189081 || 0.0665268 || 0.0622703 || 0.151629 || -0.0172242 || -0.0215623 || 0.0457666 | |
clblas teardown | |
[ OK ] testbackward.compare_1_n_kgsgo_32c5 (964 ms) | |
[----------] 12 tests from testbackward (5653 ms total) | |
[----------] 5 tests from testsinglebatch | |
[ RUN ] testsinglebatch.imagesize5_filtersize3_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=5 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ LINEAR } | |
layer 3:FullyConnectedLayer{ numPlanes=5 imageSize=1 } | |
layer 4:ActivationLayer{ TANH } | |
layer 5:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=50 17.9% | |
layer 3: params=230 82.1% | |
TOTAL : params=280 | |
weightsTotalSize=280 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 160ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 239ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 239ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 136ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 160ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 316ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 402ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 136ms | |
forward layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 316ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 402ms | |
calcGradWeights layer selected kernel 1 | |
batch time 2352 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] testsinglebatch.imagesize5_filtersize3_batchsize2 (2617 ms) | |
[ RUN ] testsinglebatch.imagesize28 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=28 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=28 numFilters=10 filterSize=3 outputSize=26 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 4:ActivationLayer{ TANH } | |
layer 5:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=100 0.1% | |
layer 3: params=67610 99.9% | |
TOTAL : params=67710 | |
weightsTotalSize=67710 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 1ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 2ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 1ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 1ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 1ms | |
forward try kernel 2 | |
ForwardAuto: kernel 2: this instance cant be used: cannot use forward2, since outputimagesize * outputimagesize > maxworkgroupsize | |
... not valid | |
forward try kernel 3 | |
ForwardAuto: kernel 3: this instance cant be used: cannot use forward3, since outputimagesize * outputimagesize > maxworkgroupsize | |
... not valid | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 2ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 3ms | |
backward try kernel 2 | |
BackwardAuto: kernel 2: this instance cant be used: cannot use BackwardGpuCached, since inputSize * inputSize > maxworkgroupsize | |
... not valid | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 137ms | |
calcGradWeights try kernel 2 | |
BackpropWeightsAuto: kernel 2: this instance cant be used: cannot use BackpropWeightsScratch, since filterSize * filterSize > maxworkgroupsize | |
... not valid | |
calcGradWeights try kernel 3 | |
BackpropWeightsAuto: kernel 3: this instance cant be used: cannot use BackpropWeightsScratchLarge, since filterSize * filterSize > maxworkgroupsize | |
... not valid | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 316ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 231ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 2ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 1ms | |
forward kernel 2: cannot be used | |
forward kernel 3: cannot be used | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 231ms | |
forward layer selected kernel 4 | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 1ms | |
backward kernel 2: cannot be used | |
backward kernel 3 time: 137ms | |
backward layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 1ms | |
calcGradWeights kernel 2: cannot be used | |
calcGradWeights kernel 3: cannot be used | |
calcGradWeights kernel 4 time: 316ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=26 -D gOutputSizeSquared=676 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=28 -DgInputStripeOuterNumRows=32 -DgInputStripeInnerSize=784 -DgInputStripeOuterSize=896 -DgInputStripeMarginSize=56 -DgOutputStripeNumRows=26 -DgOutputStripeSize=676 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 1ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 122ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 327ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 2ms | |
forward kernel 2 time: 2ms | |
forward kernel 3 time: 3ms | |
forward kernel 4 time: 2ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 1ms | |
forward kernel 7 time: 122ms | |
forward layer selected kernel 5 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 1ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 327ms | |
calcGradWeights layer selected kernel 2 | |
batch time 3039 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] testsinglebatch.imagesize28 (3289 ms) | |
[ RUN ] testsinglebatch.imagesize28_filtersize5 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=28 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=28 numFilters=10 filterSize=5 outputSize=24 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 4:ActivationLayer{ TANH } | |
layer 5:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=260 0.4% | |
layer 3: params=57610 99.6% | |
TOTAL : params=57870 | |
weightsTotalSize=57870 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 1ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 2ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 1ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 1ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 2ms | |
forward try kernel 2 | |
ForwardAuto: kernel 2: this instance cant be used: cannot use forward2, since outputimagesize * outputimagesize > maxworkgroupsize | |
... not valid | |
forward try kernel 3 | |
ForwardAuto: kernel 3: this instance cant be used: cannot use forward3, since outputimagesize * outputimagesize > maxworkgroupsize | |
... not valid | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 2ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 4ms | |
backward try kernel 2 | |
BackwardAuto: kernel 2: this instance cant be used: cannot use BackwardGpuCached, since inputSize * inputSize > maxworkgroupsize | |
... not valid | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 140ms | |
calcGradWeights try kernel 2 | |
BackpropWeightsAuto: kernel 2: this instance cant be used: cannot use BackpropWeightsScratch, since filterSize * filterSize > maxworkgroupsize | |
... not valid | |
calcGradWeights try kernel 3 | |
BackpropWeightsAuto: kernel 3: this instance cant be used: cannot use BackpropWeightsScratchLarge, since filterSize * filterSize > maxworkgroupsize | |
... not valid | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 308ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 1ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 242ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 2ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 1ms | |
forward kernel 2: cannot be used | |
forward kernel 3: cannot be used | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 242ms | |
forward layer selected kernel 4 | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 1ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 1ms | |
backward kernel 2: cannot be used | |
backward kernel 3 time: 140ms | |
backward layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 1ms | |
calcGradWeights kernel 2: cannot be used | |
calcGradWeights kernel 3: cannot be used | |
calcGradWeights kernel 4 time: 308ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=10 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=24 -D gOutputSizeSquared=576 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=4 -DgInputStripeInnerNumRows=28 -DgInputStripeOuterNumRows=36 -DgInputStripeInnerSize=784 -DgInputStripeOuterSize=1008 -DgInputStripeMarginSize=112 -DgOutputStripeNumRows=24 -DgOutputStripeSize=576 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 1ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 4ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 127ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 300ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 2ms | |
forward kernel 2 time: 2ms | |
forward kernel 3 time: 4ms | |
forward kernel 4 time: 2ms | |
forward kernel 5 time: 1ms | |
forward kernel 6 time: 4ms | |
forward kernel 7 time: 127ms | |
forward layer selected kernel 5 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 2ms | |
calcGradWeights kernel 2 time: 1ms | |
calcGradWeights kernel 3 time: 1ms | |
calcGradWeights kernel 4 time: 300ms | |
calcGradWeights layer selected kernel 2 | |
batch time 2752 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] testsinglebatch.imagesize28_filtersize5 (3025 ms) | |
[ RUN ] testsinglebatch.imagesize5_filtersize3_batchsize2_softmax | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=5 filterSize=3 outputSize=5 padZeros=1 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=5 inputSize=5 numFilters=5 filterSize=3 outputSize=5 padZeros=1 biased=1 skip=0} } | |
layer 4:ActivationLayer{ RELU } | |
layer 5:FullyConnectedLayer{ numPlanes=5 imageSize=1 } | |
layer 6:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=50 5.5% | |
layer 3: params=230 25.3% | |
layer 5: params=630 69.2% | |
TOTAL : params=910 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
layer 1 offset: 0 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 1 | |
from w: 0 | |
actual: -3.14851 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
layer 3 | |
from w: 0 | |
actual: -3.14851 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
layer 5 | |
from w: 0 | |
actual: -3.14851 | |
layer 6 offset: 910 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 245ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 42ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
full thisloss: 3.14851 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 245ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 42ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 139ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
layer 1 offset: 0 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 139ms | |
forward layer selected kernel 1 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 162ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=4 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=13 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=65 -DgInputStripeMarginSize=20 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 397ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 162ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 315ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 397ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 408ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 43ms | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 315ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 408ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 43ms | |
calcGradWeights layer selected kernel 1 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 3 | |
from w: 0 | |
actual: 0 | |
layer 4 offset: 280 | |
layer 5 offset: 280 | |
layer 5 | |
from w: 0 | |
actual: 0 | |
layer 6 offset: 910 | |
full thisloss: 3.14851 | |
batch time 3481 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] testsinglebatch.imagesize5_filtersize3_batchsize2_softmax (3705 ms) | |
[ RUN ] testsinglebatch.imagesize4_filtersize3_batchsize2_pooling | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=12 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=12 numFilters=5 filterSize=3 outputSize=12 padZeros=1 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:PoolingLayer{ inputPlanes=5 inputSize=12 poolingSize=2 } | |
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=5 inputSize=6 numFilters=5 filterSize=3 outputSize=6 padZeros=1 biased=1 skip=0} } | |
layer 5:ActivationLayer{ RELU } | |
layer 6:PoolingLayer{ inputPlanes=5 inputSize=6 poolingSize=2 } | |
layer 7:FullyConnectedLayer{ numPlanes=5 imageSize=1 } | |
layer 8:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 } | |
Parameters overview: (skipping 6 layers with 0 params) | |
layer 1: params=50 9.8% | |
layer 4: params=230 45.1% | |
layer 7: params=230 45.1% | |
TOTAL : params=510 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
layer 1 offset: 0 | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
layer 1 | |
from w: 0 | |
actual: -3.55299 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
layer 4 | |
from w: 0 | |
actual: -3.55299 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
layer 7 | |
from w: 0 | |
actual: -3.55299 | |
layer 8 offset: 510 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 210ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 224ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
full thisloss: 3.55299 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 210ms | |
forward layer selected kernel 1 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 224ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 137ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
layer 1 offset: 0 | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 137ms | |
forward layer selected kernel 1 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 158ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 362ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=6 -D gInputSizeSquared=36 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=6 -D gOutputSizeSquared=36 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=6 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=36 -DgInputStripeOuterSize=60 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=6 -DgOutputStripeSize=36 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=12 -D gInputSizeSquared=144 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=12 -D gOutputSizeSquared=144 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=12 -DgInputStripeOuterNumRows=16 -DgInputStripeInnerSize=144 -DgInputStripeOuterSize=192 -DgInputStripeMarginSize=24 -DgOutputStripeNumRows=12 -DgOutputStripeSize=144 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 158ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 315ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 362ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 321ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 212ms | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 315ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 321ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 212ms | |
calcGradWeights layer selected kernel 1 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
layer 1 offset: 0 | |
layer 1 | |
from w: 0 | |
actual: 0 | |
layer 2 offset: 50 | |
layer 3 offset: 50 | |
layer 4 offset: 50 | |
layer 4 | |
from w: 0 | |
actual: 0 | |
layer 5 offset: 280 | |
layer 6 offset: 280 | |
layer 7 offset: 280 | |
layer 7 | |
from w: 0 | |
actual: 0 | |
layer 8 offset: 510 | |
full thisloss: 3.55299 | |
batch time 3659 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] testsinglebatch.imagesize4_filtersize3_batchsize2_pooling (4086 ms) | |
[----------] 5 tests from testsinglebatch (16722 ms total) | |
[----------] 1 test from EXCLUDED_testsinglebatch | |
[ RUN ] EXCLUDED_testsinglebatch.imagesize5_filtersize3_batchsize2_10filters | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
initializing clblas | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=10 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} } | |
layer 2:ActivationLayer{ RELU } | |
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 4:ActivationLayer{ TANH } | |
layer 5:SquareLossLayer{} | |
Parameters overview: (skipping 4 layers with 0 params) | |
layer 1: params=100 9.9% | |
layer 3: params=910 90.1% | |
TOTAL : params=1010 | |
weightsTotalSize=1010 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
backward try kernel 0 | |
... not plausibly optimal, skipping | |
backward try kernel 1 | |
... seems valid | |
BackwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 2 | |
... seems valid | |
ForwardAuto: kernel 2 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
forward try kernel 3 | |
... seems valid | |
ForwardAuto: kernel 3 0ms | |
backward try kernel 2 | |
... seems valid | |
BackwardAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
calcGradWeights try kernel 2 | |
... seems valid | |
BackpropWeightsAuto: kernel 2 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 4 | |
... seems valid | |
ForwardAuto: kernel 4 0ms | |
forward try kernel 5 | |
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical | |
... not valid | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward try kernel 5 | |
... seems valid | |
ForwardAuto: kernel 5 0ms | |
backward try kernel 3 | |
... seems valid | |
BackwardAuto: kernel 3 152ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=10 -D gInputPlanes=10 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
calcGradWeights try kernel 3 | |
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9 | |
... seems valid | |
BackpropWeightsAuto: kernel 3 0ms | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 242ms | |
forward try kernel 6 | |
... seems valid | |
ForwardAuto: kernel 6 0ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5: cannot be used | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 242ms | |
forward layer selected kernel 1 | |
forward try kernel 7 | |
... seems valid | |
ForwardAuto: kernel 7 135ms | |
backward kernel 0: cannot be used | |
backward kernel 1 time: 0ms | |
backward kernel 2 time: 0ms | |
backward kernel 3 time: 152ms | |
backward layer selected kernel 1 | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 326ms | |
calcGradWeights try kernel 4 | |
... seems valid | |
BackpropWeightsAuto: kernel 4 447ms | |
forward kernel 0: cannot be used | |
forward kernel 1 time: 0ms | |
forward kernel 2 time: 0ms | |
forward kernel 3 time: 0ms | |
forward kernel 4 time: 0ms | |
forward kernel 5 time: 0ms | |
forward kernel 6 time: 0ms | |
forward kernel 7 time: 135ms | |
forward layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 326ms | |
calcGradWeights layer selected kernel 1 | |
calcGradWeights kernel 0: cannot be used | |
calcGradWeights kernel 1 time: 0ms | |
calcGradWeights kernel 2 time: 0ms | |
calcGradWeights kernel 3 time: 0ms | |
calcGradWeights kernel 4 time: 447ms | |
calcGradWeights layer selected kernel 1 | |
batch time 2668 ms | |
dump enabled=0 | |
clblas teardown | |
[ OK ] EXCLUDED_testsinglebatch.imagesize5_filtersize3_batchsize2_10filters (2916 ms) | |
[----------] 1 test from EXCLUDED_testsinglebatch (2917 ms total) | |
[----------] 9 tests from testpoolingforward | |
[ RUN ] testpoolingforward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.basic (51 ms) | |
[ RUN ] testpoolingforward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.basic_2plane_batchsize2 (44 ms) | |
[ RUN ] testpoolingforward.fromwrappers | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.fromwrappers (42 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling2 (43 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling3 (48 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling2_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling2_pz (42 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_pz (49 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_small | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_small (40 ms) | |
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_small2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_small2 (39 ms) | |
[----------] 9 tests from testpoolingforward (398 ms total) | |
[----------] 2 tests from testpoolingbackward | |
[ RUN ] testpoolingbackward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingbackward.basic (4 ms) | |
[ RUN ] testpoolingbackward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testpoolingbackward.basic_2plane_batchsize2 (1 ms) | |
[----------] 2 tests from testpoolingbackward (5 ms total) | |
[----------] 7 tests from teststringhelper | |
[ RUN ] teststringhelper.split | |
[ OK ] teststringhelper.split (0 ms) | |
[ RUN ] teststringhelper.split2 | |
[ OK ] teststringhelper.split2 (0 ms) | |
[ RUN ] teststringhelper.split3 | |
[ OK ] teststringhelper.split3 (0 ms) | |
[ RUN ] teststringhelper.tolower | |
[ OK ] teststringhelper.tolower (0 ms) | |
[ RUN ] teststringhelper.replace | |
[ OK ] teststringhelper.replace (0 ms) | |
[ RUN ] teststringhelper.replaceglobal | |
[ OK ] teststringhelper.replaceglobal (0 ms) | |
[ RUN ] teststringhelper.strcpy_safe | |
[ OK ] teststringhelper.strcpy_safe (0 ms) | |
[----------] 7 tests from teststringhelper (0 ms total) | |
[----------] 1 test from testGtestGlobals | |
[ RUN ] testGtestGlobals.basic | |
There are 1 parameters: | |
argv[0]=bin/deepcl_unittests | |
[ OK ] testGtestGlobals.basic (0 ms) | |
[----------] 1 test from testGtestGlobals (0 ms total) | |
[----------] 1 test from testMemset | |
[ RUN ] testMemset.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testMemset.basic (38 ms) | |
[----------] 1 test from testMemset (38 ms total) | |
[----------] 2 tests from testCopyBuffer | |
[ RUN ] testCopyBuffer.floats | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 | |
[ OK ] testCopyBuffer.floats (110 ms) | |
[ RUN ] testCopyBuffer.ints | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 | |
[ OK ] testCopyBuffer.ints (125 ms) | |
[----------] 2 tests from testCopyBuffer (235 ms total) | |
[----------] 2 tests from testCopyBlock | |
[ RUN ] testCopyBlock.testPos | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
in[0]=3076 | |
in[1]=8 | |
in[2]=14 | |
res[0]=3 | |
res[1]=4 | |
res[2]=8206 | |
res[3]=8 | |
res[4]=14 | |
[ OK ] testCopyBlock.testPos (48 ms) | |
[ RUN ] testCopyBlock.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
2 3 4 | |
6 7 8 | |
0 0 0 0 | |
5 6 7 | |
9 10 11 | |
0 0 0 0 | |
[ OK ] testCopyBlock.basic (49 ms) | |
[----------] 2 tests from testCopyBlock (97 ms total) | |
[----------] 1 test from testCopyLocal | |
[ RUN ] testCopyLocal.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
0 0 0 0 | |
1 2 3 4 | |
5 6 7 8 | |
9 10 11 12 | |
0 0 0 0 | |
[ OK ] testCopyLocal.basic (40 ms) | |
[----------] 1 test from testCopyLocal (40 ms total) | |
[----------] 8 tests from testNetdefToNet | |
[ RUN ] testNetdefToNet.empty | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testNetdefToNet.empty (2 ms) | |
[ RUN ] testNetdefToNet.onefc | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testNetdefToNet.onefc (76 ms) | |
[ RUN ] testNetdefToNet.onefclinear | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testNetdefToNet.onefclinear (73 ms) | |
[ RUN ] testNetdefToNet.150n_10n | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testNetdefToNet.150n_10n (75 ms) | |
[ RUN ] testNetdefToNet.3xfclinear | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
nnString: [3] | |
repeatNum 3 | |
remainderString [150n] | |
inner [150n] | |
multiplied string: 150n-150n-150n | |
layer 0:InputLayer{ outputPlanes=1 outputSize=19 } | |
layer 1:FullyConnectedLayer{ numPlanes=150 imageSize=1 } | |
layer 2:FullyConnectedLayer{ numPlanes=150 imageSize=1 } | |
layer 3:FullyConnectedLayer{ numPlanes=150 imageSize=1 } | |
layer 4:SoftMaxLayer{ perPlane=0 numPlanes=150 imageSize=1 } | |
Parameters overview: (skipping 2 layers with 0 params) | |
layer 1: params=54300 54.5% | |
layer 2: params=22650 22.7% | |
layer 3: params=22650 22.7% | |
TOTAL : params=99600 | |
[ OK ] testNetdefToNet.3xfclinear (74 ms) | |
[ RUN ] testNetdefToNet.mp2_3x32c5z_10n | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
prefix: [mp2] | |
nnString: [3] | |
repeatNum 3 | |
remainderString [32c5z-10n ] | |
postfix [10n ] | |
inner [32c5z] | |
multiplied string: mp2-32c5z-32c5z-32c5z-10n | |
layer 0:InputLayer{ outputPlanes=1 outputSize=19 } | |
layer 1:PoolingLayer{ inputPlanes=1 inputSize=19 poolingSize=2 } | |
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} } | |
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} } | |
layer 5:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 6:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 } | |
Parameters overview: (skipping 3 layers with 0 params) | |
layer 2: params=832 1.1% | |
layer 3: params=25632 32.9% | |
layer 4: params=25632 32.9% | |
layer 5: params=25930 33.2% | |
TOTAL : params=78026 | |
[ OK ] testNetdefToNet.mp2_3x32c5z_10n (182 ms) | |
[ RUN ] testNetdefToNet.3x32c5zmp2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
nnString: [3] | |
repeatNum 3 | |
remainderString [(32c5z-mp2)-10n] | |
inner [32c5z-mp2] | |
newRemainder [-10n] | |
postfix [10n] | |
multiplied string: 32c5z-mp2-32c5z-mp2-32c5z-mp2-10n | |
layer 0:InputLayer{ outputPlanes=1 outputSize=128 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=128 numFilters=32 filterSize=5 outputSize=128 padZeros=1 biased=1 skip=0} } | |
layer 2:PoolingLayer{ inputPlanes=32 inputSize=128 poolingSize=2 } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=64 numFilters=32 filterSize=5 outputSize=64 padZeros=1 biased=1 skip=0} } | |
layer 4:PoolingLayer{ inputPlanes=32 inputSize=64 poolingSize=2 } | |
layer 5:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=32 numFilters=32 filterSize=5 outputSize=32 padZeros=1 biased=1 skip=0} } | |
layer 6:PoolingLayer{ inputPlanes=32 inputSize=32 poolingSize=2 } | |
layer 7:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 8:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 } | |
Parameters overview: (skipping 5 layers with 0 params) | |
layer 1: params=832 0.6% | |
layer 3: params=25632 19.1% | |
layer 5: params=25632 19.1% | |
layer 7: params=81930 61.1% | |
TOTAL : params=134026 | |
[ OK ] testNetdefToNet.3x32c5zmp2 (385 ms) | |
[ RUN ] testNetdefToNet.2x32c7_3x32c5z | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
nnString: [2] | |
repeatNum 2 | |
remainderString [32c7z-3*32c5z-10n] | |
postfix [3*32c5z-10n] | |
inner [32c7z] | |
nnString: [3] | |
repeatNum 3 | |
remainderString [32c5z-10n] | |
postfix [10n] | |
inner [32c5z] | |
multiplied string: 32c5z-32c5z-32c5z-10n | |
multiplied string: 32c7z-32c7z-32c5z-32c5z-32c5z-10n | |
layer 0:InputLayer{ outputPlanes=1 outputSize=19 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=19 numFilters=32 filterSize=7 outputSize=19 padZeros=1 biased=1 skip=0} } | |
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=7 outputSize=19 padZeros=1 biased=1 skip=0} } | |
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} } | |
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} } | |
layer 5:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} } | |
layer 6:FullyConnectedLayer{ numPlanes=10 imageSize=1 } | |
layer 7:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 } | |
Parameters overview: (skipping 2 layers with 0 params) | |
layer 1: params=1600 0.7% | |
layer 2: params=50208 20.6% | |
layer 3: params=25632 10.5% | |
layer 4: params=25632 10.5% | |
layer 5: params=25632 10.5% | |
layer 6: params=115530 47.3% | |
TOTAL : params=244234 | |
[ OK ] testNetdefToNet.2x32c7_3x32c5z (69 ms) | |
[----------] 8 tests from testNetdefToNet (936 ms total) | |
[----------] 10 tests from testactivationforward | |
[ RUN ] testactivationforward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.basic (1 ms) | |
[ RUN ] testactivationforward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.basic_2plane_batchsize2 (2 ms) | |
[ RUN ] testactivationforward.fromwrappers | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.fromwrappers (34 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation2 (34 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation3 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation3 (34 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation2_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation2_pz (35 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation3_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation3_pz (34 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation3_small (34 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation3_small2 (35 ms) | |
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small2_tanh | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testactivationforward.comparespecific_0_1_activation3_small2_tanh (70 ms) | |
[----------] 10 tests from testactivationforward (313 ms total) | |
[----------] 2 tests from testactivationbackward | |
[ RUN ] testactivationbackward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
gradInput=3 | |
gradInput=0 | |
gradInput=-2.7 | |
gradInput=2 | |
gradInput=-0 | |
gradInput=2.1 | |
gradInput=0 | |
gradInput=-1.1 | |
gradInput=0 | |
[ OK ] testactivationbackward.basic (2 ms) | |
[ RUN ] testactivationbackward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
gradInput=3 | |
gradInput=0 | |
gradInput=0 | |
gradInput=9 | |
[ OK ] testactivationbackward.basic_2plane_batchsize2 (1 ms) | |
[----------] 2 tests from testactivationbackward (3 ms total) | |
[----------] 1 test from testRandomSingleton | |
[ RUN ] testRandomSingleton.testMockRandom | |
0.569795 | |
0.59168 | |
0.620742 | |
0.807657 | |
0.0113285 | |
0.359743 | |
0.556429 | |
0.334354 | |
0.476656 | |
0.0844408 | |
[ OK ] testRandomSingleton.testMockRandom (0 ms) | |
[----------] 1 test from testRandomSingleton (0 ms total) | |
[----------] 10 tests from testdropoutforward | |
[ RUN ] testdropoutforward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.basic (1 ms) | |
[ RUN ] testdropoutforward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.basic_2plane_batchsize2 (1 ms) | |
[ RUN ] testdropoutforward.fromwrappers | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.fromwrappers (1 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout2 (35 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout3 (34 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout2_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout2_pz (47 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_pz | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_pz (36 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small (37 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small2 | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small2 (36 ms) | |
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small2_tanh | |
instance0: 0 | |
instance1: 1 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small2_tanh (34 ms) | |
[----------] 10 tests from testdropoutforward (262 ms total) | |
[----------] 3 tests from testdropoutbackward | |
[ RUN ] testdropoutbackward.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutbackward.basic (34 ms) | |
[ RUN ] testdropoutbackward.basic_2plane_batchsize2 | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutbackward.basic_2plane_batchsize2 (36 ms) | |
[ RUN ] testdropoutbackward.compare_args | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testdropoutbackward.compare_args (35 ms) | |
[----------] 3 tests from testdropoutbackward (105 ms total) | |
[----------] 1 test from testsgd | |
[ RUN ] testsgd.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
layer 0:InputLayer{ outputPlanes=1 outputSize=5 } | |
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} } | |
layer 2:SquareLossLayer{} | |
inputtotalsize=50 outputTotalSize=18 | |
forward try kernel 0 | |
... not plausibly optimal, skipping | |
forward try kernel 1 | |
... seems valid | |
ForwardAuto: kernel 1 0ms | |
calcGradWeights try kernel 0 | |
... not plausibly optimal, skipping | |
calcGradWeights try kernel 1 | |
... seems valid | |
BackpropWeightsAuto: kernel 1 0ms | |
[ OK ] testsgd.basic (281 ms) | |
[----------] 1 test from testsgd (282 ms total) | |
[----------] 9 tests from testCLMathWrapper | |
[ RUN ] testCLMathWrapper.assign | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=4 | |
a[1]=2.1 | |
a[2]=5 | |
a[3]=3 | |
a[4]=9.2 | |
[ OK ] testCLMathWrapper.assign (35 ms) | |
[ RUN ] testCLMathWrapper.assignScalar | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=3.4 | |
a[1]=3.4 | |
a[2]=3.4 | |
a[3]=3.4 | |
a[4]=3.4 | |
[ OK ] testCLMathWrapper.assignScalar (36 ms) | |
[ RUN ] testCLMathWrapper.addinplace | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=5 | |
a[1]=5.1 | |
a[2]=14 | |
a[3]=15.5 | |
a[4]=11.7 | |
[ OK ] testCLMathWrapper.addinplace (35 ms) | |
[ RUN ] testCLMathWrapper.multiplyinplace | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1.5 | |
a[1]=4.5 | |
a[2]=13.5 | |
a[3]=18.75 | |
a[4]=3.75 | |
[ OK ] testCLMathWrapper.multiplyinplace (35 ms) | |
[ RUN ] testCLMathWrapper.addscalar | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=2.5 | |
a[1]=4.5 | |
a[2]=10.5 | |
a[3]=14 | |
a[4]=4 | |
[ OK ] testCLMathWrapper.addscalar (34 ms) | |
[ RUN ] testCLMathWrapper.sqrt | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1 | |
a[1]=1.73205 | |
a[2]=3 | |
a[3]=3.53553 | |
a[4]=1.58114 | |
[ OK ] testCLMathWrapper.sqrt (35 ms) | |
[ RUN ] testCLMathWrapper.squared | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1 | |
a[1]=9 | |
a[2]=81 | |
a[3]=156.25 | |
a[4]=6.25 | |
[ OK ] testCLMathWrapper.squared (34 ms) | |
[ RUN ] testCLMathWrapper.inverse | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1 | |
a[1]=0.333333 | |
a[2]=0.111111 | |
a[3]=0.08 | |
a[4]=0.4 | |
[ OK ] testCLMathWrapper.inverse (34 ms) | |
[ RUN ] testCLMathWrapper.perelementmult | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=4 | |
a[1]=6.3 | |
a[2]=45 | |
a[3]=37.5 | |
a[4]=23 | |
[ OK ] testCLMathWrapper.perelementmult (34 ms) | |
[----------] 9 tests from testCLMathWrapper (313 ms total) | |
[----------] 1 test from testreducesegments | |
[ RUN ] testreducesegments.basic | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
[ OK ] testreducesegments.basic (34 ms) | |
[----------] 1 test from testreducesegments (34 ms total) | |
[----------] 4 tests from testGpuOp | |
[ RUN ] testGpuOp.addinplace | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=5 | |
a[1]=5.1 | |
a[2]=14 | |
a[3]=15.5 | |
a[4]=11.7 | |
[ OK ] testGpuOp.addinplace (34 ms) | |
[ RUN ] testGpuOp.addoutofplace | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1 | |
a[1]=3 | |
a[2]=9 | |
a[3]=12.5 | |
a[4]=2.5 | |
c[0]=5 | |
c[1]=5.1 | |
c[2]=14 | |
c[3]=15.5 | |
c[4]=11.7 | |
[ OK ] testGpuOp.addoutofplace (35 ms) | |
[ RUN ] testGpuOp.inverse | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=1 | |
a[1]=0.333333 | |
a[2]=0.111111 | |
a[3]=0.08 | |
a[4]=0.4 | |
[ OK ] testGpuOp.inverse (34 ms) | |
[ RUN ] testGpuOp.addscalarinplace | |
Using Intel , OpenCL platform: Intel Gen OCL Driver | |
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2 | |
a[0]=5.2 | |
a[1]=7.2 | |
a[2]=13.2 | |
a[3]=16.7 | |
a[4]=6.7 | |
[ OK ] testGpuOp.addscalarinplace (34 ms) | |
[----------] 4 tests from testGpuOp (137 ms total) | |
[----------] 1 test from testjpeghelper | |
[ RUN ] testjpeghelper.writeread | |
[ OK ] testjpeghelper.writeread (0 ms) | |
[----------] 1 test from testjpeghelper (0 ms total) | |
[----------] Global test environment tear-down | |
[==========] 158 tests from 29 test cases ran. (66801 ms total) | |
[ PASSED ] 158 tests. | |
YOU HAVE 2 DISABLED TESTS |