Trained by me using https://github.com/soumith/imagenet-multiGPU.torch; achieves 62.6% top-1 center-crop accuracy on the ImageNet validation set. Tested here: https://github.com/szagoruyko/imagenet-validation.torch
Download links:
- https://www.dropbox.com/s/mclw90yba6eml60/nin_bn_final.t7 (31 MB)
- https://www.dropbox.com/s/npmr5egvjbg7ovb/nin_nobn_final.t7 (31 MB), with Batch Normalization integrated into the convolutional layers
Load as:
net = torch.load'./nin_nobn_final.t7':unpack()
Input image size is 224.
Separate per-channel mean and std are saved with the network:
> print(net.transform)
{
  mean :
    {
      1 : 0.48462227599918
      2 : 0.45624044862054
      3 : 0.40588363755159
    }
  std :
    {
      1 : 0.22889466674951
      2 : 0.22446679341259
      3 : 0.22495548344775
    }
}
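A minimal sketch of how these statistics might be applied before a forward pass. It assumes the input is a 3xHxW RGB tensor with values in [0, 1]; the magnitude of the means suggests that range, but the gist does not state it. The `normalize` helper and the stand-in `transform` table are hypothetical and only mirror the printed `net.transform`.

```lua
require 'torch'

-- Hypothetical helper: per-channel (x - mean) / std on a 3xHxW tensor.
-- `transform` is expected to have the layout of net.transform printed above.
local function normalize(img, transform)
  local out = img:clone()  -- leave the caller's tensor untouched
  for c = 1, 3 do
    -- out[c] is a view of channel c, so the in-place ops modify `out`
    out[c]:add(-transform.mean[c]):div(transform.std[c])
  end
  return out
end

-- Stand-in for net.transform, using the values printed above.
local transform = {
  mean = { 0.48462227599918, 0.45624044862054, 0.40588363755159 },
  std  = { 0.22889466674951, 0.22446679341259, 0.22495548344775 },
}

-- Constant gray image as a placeholder for a real 224x224 center crop.
local img = torch.FloatTensor(3, 224, 224):fill(0.5)
local input = normalize(img, transform)
print(input[1][1][1])  -- (0.5 - mean[1]) / std[1]
```

With the loaded model, `net.transform` would take the place of the stand-in table, and the normalized tensor would then be passed to `net:forward`.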
Can be loaded without CUDA support.
The model was trained for 35 epochs, which took a bit more than a day on a Titan X with cuDNN v4, using these regimes:
local regimes = {
   -- start, end,    LR,   WD
   {  1,      9,   1e-1,  5e-4 },
   { 10,     19,   1e-2,  5e-4 },
   { 20,     25,   1e-3,  0 },
   { 26,     30,   1e-4,  0 },
}
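To make the table's meaning concrete, here is a small sketch of how a training loop could look up the learning rate and weight decay for a given epoch. The `lookupRegime` helper is a hypothetical name for illustration, not a function from the training repository.

```lua
-- Same table as above: { first epoch, last epoch, learning rate, weight decay }
local regimes = {
  {  1,  9, 1e-1, 5e-4 },
  { 10, 19, 1e-2, 5e-4 },
  { 20, 25, 1e-3, 0 },
  { 26, 30, 1e-4, 0 },
}

-- Hypothetical helper: return learning rate and weight decay for an epoch,
-- or nil when the epoch falls outside every listed range.
local function lookupRegime(epoch)
  for _, row in ipairs(regimes) do
    if epoch >= row[1] and epoch <= row[2] then
      return row[3], row[4]
    end
  end
  return nil
end

print(lookupRegime(12))  -- epoch 12 falls in the second row
print(lookupRegime(27))  -- epoch 27 falls in the last row
```

Epochs past the last listed range are not covered by the table, so the sketch returns nil for them rather than guessing a value.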
With Batch Normalization:
nn.Sequential {
(1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
(2): nn.SpatialBatchNormalization
(3): nn.ReLU
(4): nn.SpatialConvolution(96 -> 96, 1x1)
(5): nn.SpatialBatchNormalization
(6): nn.ReLU
(7): nn.SpatialConvolution(96 -> 96, 1x1)
(8): nn.SpatialBatchNormalization
(9): nn.ReLU
(10): nn.SpatialMaxPooling(3,3,2,2,1,1)
(11): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
(12): nn.SpatialBatchNormalization
(13): nn.ReLU
(14): nn.SpatialConvolution(256 -> 256, 1x1)
(15): nn.SpatialBatchNormalization
(16): nn.ReLU
(17): nn.SpatialConvolution(256 -> 256, 1x1)
(18): nn.SpatialBatchNormalization
(19): nn.ReLU
(20): nn.SpatialMaxPooling(3,3,2,2,1,1)
(21): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
(22): nn.SpatialBatchNormalization
(23): nn.ReLU
(24): nn.SpatialConvolution(384 -> 384, 1x1)
(25): nn.SpatialBatchNormalization
(26): nn.ReLU
(27): nn.SpatialConvolution(384 -> 384, 1x1)
(28): nn.SpatialBatchNormalization
(29): nn.ReLU
(30): nn.SpatialMaxPooling(3,3,2,2,1,1)
(31): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
(32): nn.SpatialBatchNormalization
(33): nn.ReLU
(34): nn.SpatialConvolution(1024 -> 1024, 1x1)
(35): nn.SpatialBatchNormalization
(36): nn.ReLU
(37): nn.SpatialConvolution(1024 -> 1024, 1x1)
(38): nn.SpatialBatchNormalization
(39): nn.ReLU
(40): nn.SpatialAveragePooling(7,7,1,1)
(41): nn.View(-1)
(42): nn.Linear(1024 -> 1000)
}
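The 224 input size can be checked against the layer parameters above. The sketch below traces the spatial size through the layers that change it, assuming floor rounding in the pooling layers (the printout does not show whether ceil mode is enabled); `outSize` is a hypothetical helper. The 1x1 convolutions keep the size and are skipped.

```lua
-- Output size of a convolution or pooling layer along one dimension,
-- assuming floor rounding.
local function outSize(inSize, kernel, stride, pad)
  return math.floor((inSize + 2 * pad - kernel) / stride) + 1
end

local size = 224
size = outSize(size, 11, 4, 5)  -- 11x11 convolution, stride 4, pad 5 -> 56
size = outSize(size, 3, 2, 1)   -- 3x3 max pooling, stride 2, pad 1   -> 28
size = outSize(size, 5, 1, 2)   -- 5x5 convolution, pad 2             -> 28
size = outSize(size, 3, 2, 1)   -- 3x3 max pooling, stride 2, pad 1   -> 14
size = outSize(size, 3, 1, 1)   -- 3x3 convolution, pad 1             -> 14
size = outSize(size, 3, 2, 1)   -- 3x3 max pooling, stride 2, pad 1   -> 7
size = outSize(size, 3, 1, 1)   -- 3x3 convolution, pad 1             -> 7
size = outSize(size, 7, 1, 0)   -- 7x7 average pooling                -> 1
print(size)                     -- 1
```

A 1x1 map with 1024 channels is what `nn.View(-1)` flattens into the 1024 inputs of the final `nn.Linear`.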
Without Batch Normalization:
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> output]
(1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
(2): nn.ReLU
(3): nn.SpatialConvolution(96 -> 96, 1x1)
(4): nn.ReLU
(5): nn.SpatialConvolution(96 -> 96, 1x1)
(6): nn.ReLU
(7): nn.SpatialMaxPooling(3,3,2,2,1,1)
(8): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
(9): nn.ReLU
(10): nn.SpatialConvolution(256 -> 256, 1x1)
(11): nn.ReLU
(12): nn.SpatialConvolution(256 -> 256, 1x1)
(13): nn.ReLU
(14): nn.SpatialMaxPooling(3,3,2,2,1,1)
(15): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
(16): nn.ReLU
(17): nn.SpatialConvolution(384 -> 384, 1x1)
(18): nn.ReLU
(19): nn.SpatialConvolution(384 -> 384, 1x1)
(20): nn.ReLU
(21): nn.SpatialMaxPooling(3,3,2,2,1,1)
(22): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
(23): nn.ReLU
(24): nn.SpatialConvolution(1024 -> 1024, 1x1)
(25): nn.ReLU
(26): nn.SpatialConvolution(1024 -> 1024, 1x1)
(27): nn.ReLU
(28): nn.SpatialAveragePooling(7,7,1,1)
(29): nn.View(-1)
(30): nn.Linear(1024 -> 1000)
}
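This variant has the Batch Normalization layers folded into the preceding convolutions. The gist does not show how that was done; the following is a sketch of the usual per-channel algebra, written for a single output channel with scalar stand-ins. The `foldBatchNorm` helper and all numbers are illustrative assumptions.

```lua
-- Fold BN parameters (gamma, beta, running mean/var, eps) into a
-- convolution's weight and bias for one output channel.
-- BN(conv(x)) = gamma * (w*x + b - mean) / sqrt(var + eps) + beta
local function foldBatchNorm(w, b, gamma, beta, mean, var, eps)
  local scale = gamma / math.sqrt(var + eps)
  return w * scale, (b - mean) * scale + beta
end

-- Made-up numbers; check that conv followed by BN equals the folded conv.
local w, b = 0.8, 0.1
local gamma, beta, mean, var, eps = 1.5, -0.2, 0.3, 4.0, 1e-5
local x = 2.0

local reference = gamma * ((w * x + b) - mean) / math.sqrt(var + eps) + beta
local wf, bf = foldBatchNorm(w, b, gamma, beta, mean, var, eps)
local folded = wf * x + bf
print(math.abs(reference - folded) < 1e-12)  -- true
```

In a real convolution the same scale would multiply every weight belonging to that output channel, which is why the folded network needs no separate normalization layers at inference time.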
Nice gist, @szagoruyko!