@szagoruyko
Created May 5, 2015 15:10
Using 1-th gpu
Loading ./data/ptb.train.txt, size of data = 929589
Loading ./data/ptb.valid.txt, size of data = 73760
Loading ./data/ptb.test.txt, size of data = 82430
Network parameters:
{
layers : 2
lr : 1
max_max_epoch : 13
max_grad_norm : 5
max_epoch : 4
init_weight : 0.1
decay : 2
vocab_size : 10000
seq_length : 20
batch_size : 20
dropout : 0
rnn_size : 200
}
Creating a RNN LSTM network.
Starting training.
epoch = 0.004, train perp. = 9935.947, wps = 5541, dw:norm() = 4.883, lr = 1.000, since beginning = 0 mins.
epoch = 0.104, train perp. = 7732.259, wps = 5237, dw:norm() = 2.990, lr = 1.000, since beginning = 0 mins.
epoch = 0.204, train perp. = 5706.938, wps = 5293, dw:norm() = 4.357, lr = 1.000, since beginning = 1 mins.
epoch = 0.304, train perp. = 4077.349, wps = 5386, dw:norm() = 4.563, lr = 1.000, since beginning = 1 mins.
epoch = 0.404, train perp. = 2859.994, wps = 5445, dw:norm() = 4.132, lr = 1.000, since beginning = 1 mins.
epoch = 0.504, train perp. = 1986.550, wps = 5480, dw:norm() = 4.853, lr = 1.000, since beginning = 1 mins.
epoch = 0.604, train perp. = 1352.858, wps = 5504, dw:norm() = 4.557, lr = 1.000, since beginning = 2 mins.
epoch = 0.703, train perp. = 918.462, wps = 5520, dw:norm() = 4.232, lr = 1.000, since beginning = 2 mins.
epoch = 0.803, train perp. = 619.978, wps = 5538, dw:norm() = 4.471, lr = 1.000, since beginning = 2 mins.
epoch = 0.903, train perp. = 413.348, wps = 5551, dw:norm() = 4.483, lr = 1.000, since beginning = 3 mins.
Validation set perplexity : 182.914
epoch = 1.003, train perp. = 275.773, wps = 5433, dw:norm() = 4.171, lr = 1.000, since beginning = 3 mins.
epoch = 1.103, train perp. = 233.116, wps = 5441, dw:norm() = 4.377, lr = 1.000, since beginning = 3 mins.
epoch = 1.203, train perp. = 210.494, wps = 5459, dw:norm() = 4.397, lr = 1.000, since beginning = 3 mins.
epoch = 1.303, train perp. = 193.608, wps = 5468, dw:norm() = 4.613, lr = 1.000, since beginning = 4 mins.
epoch = 1.402, train perp. = 180.954, wps = 5479, dw:norm() = 4.314, lr = 1.000, since beginning = 4 mins.
epoch = 1.502, train perp. = 170.699, wps = 5486, dw:norm() = 4.675, lr = 1.000, since beginning = 4 mins.
epoch = 1.602, train perp. = 162.019, wps = 5491, dw:norm() = 4.587, lr = 1.000, since beginning = 5 mins.
epoch = 1.702, train perp. = 155.125, wps = 5494, dw:norm() = 4.388, lr = 1.000, since beginning = 5 mins.
epoch = 1.802, train perp. = 149.027, wps = 5492, dw:norm() = 4.831, lr = 1.000, since beginning = 5 mins.
epoch = 1.902, train perp. = 143.042, wps = 5492, dw:norm() = 4.873, lr = 1.000, since beginning = 5 mins.
Validation set perplexity : 145.689
epoch = 2.002, train perp. = 138.108, wps = 5435, dw:norm() = 4.669, lr = 1.000, since beginning = 6 mins.
epoch = 2.102, train perp. = 133.313, wps = 5438, dw:norm() = 4.644, lr = 1.000, since beginning = 6 mins.
epoch = 2.201, train perp. = 129.637, wps = 5442, dw:norm() = 4.690, lr = 1.000, since beginning = 6 mins.
epoch = 2.301, train perp. = 125.792, wps = 5446, dw:norm() = 4.921, lr = 1.000, since beginning = 7 mins.
epoch = 2.401, train perp. = 122.429, wps = 5451, dw:norm() = 5.161, lr = 1.000, since beginning = 7 mins.
epoch = 2.501, train perp. = 119.234, wps = 5454, dw:norm() = 5.406, lr = 1.000, since beginning = 7 mins.
epoch = 2.601, train perp. = 116.342, wps = 5459, dw:norm() = 5.422, lr = 1.000, since beginning = 7 mins.
epoch = 2.701, train perp. = 113.744, wps = 5462, dw:norm() = 4.914, lr = 1.000, since beginning = 8 mins.
epoch = 2.801, train perp. = 111.380, wps = 5465, dw:norm() = 5.236, lr = 1.000, since beginning = 8 mins.
epoch = 2.901, train perp. = 108.860, wps = 5464, dw:norm() = 5.609, lr = 1.000, since beginning = 8 mins.
Validation set perplexity : 133.130
epoch = 3.000, train perp. = 106.643, wps = 5428, dw:norm() = 5.282, lr = 1.000, since beginning = 9 mins.
epoch = 3.100, train perp. = 104.406, wps = 5430, dw:norm() = 5.446, lr = 1.000, since beginning = 9 mins.
epoch = 3.200, train perp. = 102.490, wps = 5431, dw:norm() = 5.144, lr = 1.000, since beginning = 9 mins.
epoch = 3.300, train perp. = 100.519, wps = 5433, dw:norm() = 5.295, lr = 1.000, since beginning = 9 mins.
epoch = 3.400, train perp. = 98.692, wps = 5435, dw:norm() = 5.478, lr = 1.000, since beginning = 10 mins.
epoch = 3.500, train perp. = 96.904, wps = 5439, dw:norm() = 5.562, lr = 1.000, since beginning = 10 mins.
epoch = 3.600, train perp. = 95.230, wps = 5440, dw:norm() = 5.574, lr = 1.000, since beginning = 10 mins.
epoch = 3.700, train perp. = 93.712, wps = 5442, dw:norm() = 5.178, lr = 1.000, since beginning = 11 mins.
epoch = 3.799, train perp. = 92.325, wps = 5447, dw:norm() = 5.906, lr = 1.000, since beginning = 11 mins.
epoch = 3.899, train perp. = 90.923, wps = 5449, dw:norm() = 5.840, lr = 1.000, since beginning = 11 mins.
epoch = 3.999, train perp. = 89.584, wps = 5454, dw:norm() = 5.529, lr = 1.000, since beginning = 11 mins.
Validation set perplexity : 127.328
epoch = 4.099, train perp. = 88.279, wps = 5428, dw:norm() = 5.925, lr = 1.000, since beginning = 12 mins.
epoch = 4.199, train perp. = 87.022, wps = 5431, dw:norm() = 6.372, lr = 1.000, since beginning = 12 mins.
epoch = 4.299, train perp. = 85.795, wps = 5433, dw:norm() = 5.526, lr = 1.000, since beginning = 12 mins.
epoch = 4.399, train perp. = 84.654, wps = 5437, dw:norm() = 5.546, lr = 1.000, since beginning = 13 mins.
epoch = 4.498, train perp. = 83.559, wps = 5439, dw:norm() = 6.128, lr = 1.000, since beginning = 13 mins.
epoch = 4.598, train perp. = 82.538, wps = 5434, dw:norm() = 5.637, lr = 1.000, since beginning = 13 mins.
epoch = 4.698, train perp. = 81.571, wps = 5437, dw:norm() = 6.196, lr = 1.000, since beginning = 13 mins.
epoch = 4.798, train perp. = 80.591, wps = 5440, dw:norm() = 6.400, lr = 1.000, since beginning = 14 mins.
epoch = 4.898, train perp. = 79.679, wps = 5438, dw:norm() = 6.017, lr = 1.000, since beginning = 14 mins.
epoch = 4.998, train perp. = 78.830, wps = 5441, dw:norm() = 5.711, lr = 1.000, since beginning = 14 mins.
Validation set perplexity : 126.823
epoch = 5.098, train perp. = 77.727, wps = 5420, dw:norm() = 6.446, lr = 0.500, since beginning = 15 mins.
epoch = 5.198, train perp. = 76.395, wps = 5424, dw:norm() = 5.612, lr = 0.500, since beginning = 15 mins.
epoch = 5.297, train perp. = 74.835, wps = 5425, dw:norm() = 6.142, lr = 0.500, since beginning = 15 mins.
epoch = 5.397, train perp. = 73.236, wps = 5420, dw:norm() = 5.697, lr = 0.500, since beginning = 15 mins.
epoch = 5.497, train perp. = 71.511, wps = 5423, dw:norm() = 6.664, lr = 0.500, since beginning = 16 mins.
epoch = 5.597, train perp. = 69.709, wps = 5426, dw:norm() = 6.144, lr = 0.500, since beginning = 16 mins.
epoch = 5.697, train perp. = 67.828, wps = 5428, dw:norm() = 5.746, lr = 0.500, since beginning = 16 mins.
epoch = 5.797, train perp. = 65.865, wps = 5431, dw:norm() = 6.841, lr = 0.500, since beginning = 17 mins.
epoch = 5.897, train perp. = 63.817, wps = 5434, dw:norm() = 6.982, lr = 0.500, since beginning = 17 mins.
epoch = 5.997, train perp. = 61.716, wps = 5436, dw:norm() = 6.604, lr = 0.500, since beginning = 17 mins.
Validation set perplexity : 118.348
epoch = 6.096, train perp. = 60.570, wps = 5418, dw:norm() = 6.958, lr = 0.250, since beginning = 17 mins.
epoch = 6.196, train perp. = 59.494, wps = 5420, dw:norm() = 6.041, lr = 0.250, since beginning = 18 mins.
epoch = 6.296, train perp. = 58.424, wps = 5423, dw:norm() = 6.475, lr = 0.250, since beginning = 18 mins.
epoch = 6.396, train perp. = 57.363, wps = 5425, dw:norm() = 6.490, lr = 0.250, since beginning = 18 mins.
epoch = 6.496, train perp. = 56.288, wps = 5428, dw:norm() = 6.282, lr = 0.250, since beginning = 19 mins.
epoch = 6.596, train perp. = 55.201, wps = 5428, dw:norm() = 6.741, lr = 0.250, since beginning = 19 mins.
epoch = 6.696, train perp. = 54.109, wps = 5431, dw:norm() = 6.365, lr = 0.250, since beginning = 19 mins.
epoch = 6.796, train perp. = 52.997, wps = 5432, dw:norm() = 6.839, lr = 0.250, since beginning = 19 mins.
epoch = 6.895, train perp. = 51.859, wps = 5433, dw:norm() = 6.951, lr = 0.250, since beginning = 20 mins.
epoch = 6.995, train perp. = 50.713, wps = 5435, dw:norm() = 6.796, lr = 0.250, since beginning = 20 mins.
Validation set perplexity : 117.957
epoch = 7.095, train perp. = 50.108, wps = 5419, dw:norm() = 7.358, lr = 0.125, since beginning = 20 mins.
epoch = 7.195, train perp. = 49.562, wps = 5421, dw:norm() = 6.771, lr = 0.125, since beginning = 21 mins.
epoch = 7.295, train perp. = 49.002, wps = 5421, dw:norm() = 6.510, lr = 0.125, since beginning = 21 mins.
epoch = 7.395, train perp. = 48.450, wps = 5421, dw:norm() = 6.812, lr = 0.125, since beginning = 21 mins.
epoch = 7.495, train perp. = 47.900, wps = 5423, dw:norm() = 6.981, lr = 0.125, since beginning = 21 mins.
epoch = 7.594, train perp. = 47.336, wps = 5425, dw:norm() = 6.868, lr = 0.125, since beginning = 22 mins.
epoch = 7.694, train perp. = 46.778, wps = 5422, dw:norm() = 6.359, lr = 0.125, since beginning = 22 mins.
epoch = 7.794, train perp. = 46.206, wps = 5425, dw:norm() = 7.311, lr = 0.125, since beginning = 22 mins.
epoch = 7.894, train perp. = 45.625, wps = 5428, dw:norm() = 7.165, lr = 0.125, since beginning = 23 mins.
epoch = 7.994, train perp. = 45.032, wps = 5429, dw:norm() = 7.149, lr = 0.125, since beginning = 23 mins.
Validation set perplexity : 118.972
epoch = 8.094, train perp. = 44.728, wps = 5416, dw:norm() = 7.817, lr = 0.062, since beginning = 23 mins.
epoch = 8.194, train perp. = 44.462, wps = 5418, dw:norm() = 7.434, lr = 0.062, since beginning = 23 mins.
epoch = 8.294, train perp. = 44.187, wps = 5419, dw:norm() = 6.871, lr = 0.062, since beginning = 24 mins.
epoch = 8.393, train perp. = 43.917, wps = 5420, dw:norm() = 6.820, lr = 0.062, since beginning = 24 mins.
epoch = 8.493, train perp. = 43.657, wps = 5421, dw:norm() = 7.047, lr = 0.062, since beginning = 24 mins.
epoch = 8.593, train perp. = 43.383, wps = 5423, dw:norm() = 7.161, lr = 0.062, since beginning = 25 mins.
epoch = 8.693, train perp. = 43.103, wps = 5425, dw:norm() = 6.832, lr = 0.062, since beginning = 25 mins.
epoch = 8.793, train perp. = 42.809, wps = 5423, dw:norm() = 7.211, lr = 0.062, since beginning = 25 mins.
epoch = 8.893, train perp. = 42.515, wps = 5424, dw:norm() = 7.431, lr = 0.062, since beginning = 25 mins.
epoch = 8.993, train perp. = 42.220, wps = 5425, dw:norm() = 6.714, lr = 0.062, since beginning = 26 mins.
Validation set perplexity : 119.620
epoch = 9.093, train perp. = 42.067, wps = 5413, dw:norm() = 7.714, lr = 0.031, since beginning = 26 mins.
epoch = 9.192, train perp. = 41.931, wps = 5415, dw:norm() = 7.569, lr = 0.031, since beginning = 26 mins.
epoch = 9.292, train perp. = 41.789, wps = 5417, dw:norm() = 6.908, lr = 0.031, since beginning = 27 mins.
epoch = 9.392, train perp. = 41.648, wps = 5419, dw:norm() = 6.910, lr = 0.031, since beginning = 27 mins.
epoch = 9.492, train perp. = 41.519, wps = 5420, dw:norm() = 7.521, lr = 0.031, since beginning = 27 mins.
epoch = 9.592, train perp. = 41.382, wps = 5421, dw:norm() = 6.930, lr = 0.031, since beginning = 27 mins.
epoch = 9.692, train perp. = 41.238, wps = 5423, dw:norm() = 6.685, lr = 0.031, since beginning = 28 mins.
epoch = 9.792, train perp. = 41.085, wps = 5425, dw:norm() = 6.841, lr = 0.031, since beginning = 28 mins.
epoch = 9.892, train perp. = 40.930, wps = 5424, dw:norm() = 7.028, lr = 0.031, since beginning = 28 mins.
epoch = 9.991, train perp. = 40.772, wps = 5426, dw:norm() = 7.311, lr = 0.031, since beginning = 29 mins.
Validation set perplexity : 119.814
epoch = 10.091, train perp. = 40.693, wps = 5417, dw:norm() = 8.143, lr = 0.016, since beginning = 29 mins.
epoch = 10.191, train perp. = 40.615, wps = 5420, dw:norm() = 7.294, lr = 0.016, since beginning = 29 mins.
epoch = 10.291, train perp. = 40.536, wps = 5423, dw:norm() = 7.189, lr = 0.016, since beginning = 29 mins.
epoch = 10.391, train perp. = 40.459, wps = 5427, dw:norm() = 7.707, lr = 0.016, since beginning = 30 mins.
epoch = 10.491, train perp. = 40.387, wps = 5431, dw:norm() = 8.128, lr = 0.016, since beginning = 30 mins.
epoch = 10.591, train perp. = 40.315, wps = 5434, dw:norm() = 7.859, lr = 0.016, since beginning = 30 mins.
epoch = 10.690, train perp. = 40.237, wps = 5438, dw:norm() = 6.835, lr = 0.016, since beginning = 30 mins.
epoch = 10.790, train perp. = 40.157, wps = 5442, dw:norm() = 7.033, lr = 0.016, since beginning = 31 mins.
epoch = 10.890, train perp. = 40.070, wps = 5445, dw:norm() = 6.878, lr = 0.016, since beginning = 31 mins.
epoch = 10.990, train perp. = 39.981, wps = 5448, dw:norm() = 7.155, lr = 0.016, since beginning = 31 mins.
Validation set perplexity : 119.710
epoch = 11.090, train perp. = 39.933, wps = 5442, dw:norm() = 7.794, lr = 0.008, since beginning = 32 mins.
epoch = 11.190, train perp. = 39.889, wps = 5444, dw:norm() = 7.713, lr = 0.008, since beginning = 32 mins.
epoch = 11.290, train perp. = 39.845, wps = 5447, dw:norm() = 7.806, lr = 0.008, since beginning = 32 mins.
epoch = 11.390, train perp. = 39.800, wps = 5450, dw:norm() = 7.524, lr = 0.008, since beginning = 32 mins.
epoch = 11.489, train perp. = 39.758, wps = 5453, dw:norm() = 7.467, lr = 0.008, since beginning = 33 mins.
epoch = 11.589, train perp. = 39.716, wps = 5456, dw:norm() = 7.250, lr = 0.008, since beginning = 33 mins.
epoch = 11.689, train perp. = 39.670, wps = 5459, dw:norm() = 7.190, lr = 0.008, since beginning = 33 mins.
epoch = 11.789, train perp. = 39.627, wps = 5462, dw:norm() = 7.380, lr = 0.008, since beginning = 33 mins.
epoch = 11.889, train perp. = 39.577, wps = 5464, dw:norm() = 7.213, lr = 0.008, since beginning = 34 mins.
epoch = 11.989, train perp. = 39.527, wps = 5464, dw:norm() = 6.977, lr = 0.008, since beginning = 34 mins.
Validation set perplexity : 119.381
epoch = 12.089, train perp. = 39.496, wps = 5457, dw:norm() = 8.173, lr = 0.004, since beginning = 34 mins.
epoch = 12.189, train perp. = 39.472, wps = 5460, dw:norm() = 7.360, lr = 0.004, since beginning = 35 mins.
epoch = 12.288, train perp. = 39.448, wps = 5463, dw:norm() = 7.378, lr = 0.004, since beginning = 35 mins.
epoch = 12.388, train perp. = 39.423, wps = 5466, dw:norm() = 7.412, lr = 0.004, since beginning = 35 mins.
epoch = 12.488, train perp. = 39.399, wps = 5469, dw:norm() = 7.842, lr = 0.004, since beginning = 35 mins.
epoch = 12.588, train perp. = 39.377, wps = 5472, dw:norm() = 7.571, lr = 0.004, since beginning = 36 mins.
epoch = 12.688, train perp. = 39.350, wps = 5475, dw:norm() = 7.074, lr = 0.004, since beginning = 36 mins.
epoch = 12.788, train perp. = 39.327, wps = 5477, dw:norm() = 6.777, lr = 0.004, since beginning = 36 mins.
epoch = 12.888, train perp. = 39.300, wps = 5480, dw:norm() = 7.226, lr = 0.004, since beginning = 36 mins.
epoch = 12.988, train perp. = 39.273, wps = 5482, dw:norm() = 6.856, lr = 0.004, since beginning = 37 mins.
Validation set perplexity : 119.098
Test set perplexity : 114.470
Training is over.
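
Note on the learning-rate column: the logged values are consistent with holding lr at its initial value through the first epochs and then dividing it by `decay` once per completed epoch beyond `max_epoch`. The sketch below reproduces the logged lr values under that assumption. The function name `learning_rate` and the exact rule are inferred from the log, not taken from the training script.

```python
def learning_rate(epoch, lr=1.0, decay=2.0, max_epoch=4):
    """Learning rate at a (fractional) epoch, as inferred from the log above.

    Assumption: lr is divided by `decay` once for every completed epoch
    beyond `max_epoch`. Defaults mirror the "Network parameters" block.
    """
    completed = int(epoch)                   # whole epochs finished so far
    n_decays = max(0, completed - max_epoch)
    return lr / decay ** n_decays


if __name__ == "__main__":
    # Epoch values copied from the log lines above.
    for epoch in (4.998, 5.098, 6.096, 8.094, 12.988):
        print("epoch = %.3f, lr = %.3f" % (epoch, learning_rate(epoch)))
```

Under this rule the rate drops from 1.000 to 0.500 just after epoch 5 and reaches 1/256 (logged as 0.004) in the final epoch, matching the log.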