@szagoruyko
Created May 5, 2015 13:33
Using 1-th gpu
Loading ./data/ptb.train.txt, size of data = 929589
Loading ./data/ptb.valid.txt, size of data = 73760
Loading ./data/ptb.test.txt, size of data = 82430
Network parameters:
{
layers : 2
lr : 1
max_max_epoch : 13
max_grad_norm : 5
max_epoch : 4
init_weight : 0.1
decay : 2
vocab_size : 10000
seq_length : 20
batch_size : 20
dropout : 0
rnn_size : 200
}
Creating a RNN LSTM network.
Starting training.
epoch = 0.004, train perp. = 9917.064, wps = 4371, dw:norm() = 4.221, lr = 1.000, since beginning = 0 mins.
epoch = 0.104, train perp. = 7734.153, wps = 5802, dw:norm() = 3.223, lr = 1.000, since beginning = 0 mins.
epoch = 0.204, train perp. = 5736.531, wps = 5917, dw:norm() = 3.785, lr = 1.000, since beginning = 1 mins.
epoch = 0.304, train perp. = 4114.988, wps = 5958, dw:norm() = 4.435, lr = 1.000, since beginning = 1 mins.
epoch = 0.404, train perp. = 2893.520, wps = 5979, dw:norm() = 4.360, lr = 1.000, since beginning = 1 mins.
epoch = 0.504, train perp. = 2012.100, wps = 5992, dw:norm() = 4.767, lr = 1.000, since beginning = 1 mins.
epoch = 0.604, train perp. = 1371.406, wps = 6002, dw:norm() = 4.756, lr = 1.000, since beginning = 2 mins.
epoch = 0.703, train perp. = 931.551, wps = 6009, dw:norm() = 4.025, lr = 1.000, since beginning = 2 mins.
epoch = 0.803, train perp. = 628.879, wps = 6014, dw:norm() = 4.582, lr = 1.000, since beginning = 2 mins.
epoch = 0.903, train perp. = 419.396, wps = 6018, dw:norm() = 4.722, lr = 1.000, since beginning = 2 mins.
Validation set perplexity : 181.341
epoch = 1.003, train perp. = 279.842, wps = 5896, dw:norm() = 4.123, lr = 1.000, since beginning = 3 mins.
epoch = 1.103, train perp. = 236.158, wps = 5910, dw:norm() = 4.577, lr = 1.000, since beginning = 3 mins.
epoch = 1.203, train perp. = 212.386, wps = 5922, dw:norm() = 4.532, lr = 1.000, since beginning = 3 mins.
epoch = 1.303, train perp. = 194.677, wps = 5930, dw:norm() = 4.752, lr = 1.000, since beginning = 3 mins.
epoch = 1.402, train perp. = 181.680, wps = 5938, dw:norm() = 4.501, lr = 1.000, since beginning = 4 mins.
epoch = 1.502, train perp. = 171.193, wps = 5947, dw:norm() = 4.769, lr = 1.000, since beginning = 4 mins.
epoch = 1.602, train perp. = 162.405, wps = 5952, dw:norm() = 4.595, lr = 1.000, since beginning = 4 mins.
epoch = 1.702, train perp. = 155.512, wps = 5956, dw:norm() = 4.611, lr = 1.000, since beginning = 4 mins.
epoch = 1.802, train perp. = 149.490, wps = 5953, dw:norm() = 5.044, lr = 1.000, since beginning = 5 mins.
epoch = 1.902, train perp. = 143.469, wps = 5946, dw:norm() = 4.752, lr = 1.000, since beginning = 5 mins.
Validation set perplexity : 145.447
epoch = 2.002, train perp. = 138.542, wps = 5888, dw:norm() = 4.721, lr = 1.000, since beginning = 5 mins.
epoch = 2.102, train perp. = 133.725, wps = 5895, dw:norm() = 4.875, lr = 1.000, since beginning = 6 mins.
epoch = 2.201, train perp. = 129.933, wps = 5902, dw:norm() = 4.589, lr = 1.000, since beginning = 6 mins.
epoch = 2.301, train perp. = 126.118, wps = 5909, dw:norm() = 4.897, lr = 1.000, since beginning = 6 mins.
epoch = 2.401, train perp. = 122.758, wps = 5914, dw:norm() = 5.228, lr = 1.000, since beginning = 6 mins.
epoch = 2.501, train perp. = 119.639, wps = 5919, dw:norm() = 5.107, lr = 1.000, since beginning = 7 mins.
epoch = 2.601, train perp. = 116.710, wps = 5923, dw:norm() = 5.269, lr = 1.000, since beginning = 7 mins.
epoch = 2.701, train perp. = 114.168, wps = 5927, dw:norm() = 5.042, lr = 1.000, since beginning = 7 mins.
epoch = 2.801, train perp. = 111.823, wps = 5931, dw:norm() = 4.907, lr = 1.000, since beginning = 7 mins.
epoch = 2.901, train perp. = 109.392, wps = 5935, dw:norm() = 5.382, lr = 1.000, since beginning = 8 mins.
Validation set perplexity : 133.951
epoch = 3.000, train perp. = 107.199, wps = 5896, dw:norm() = 5.677, lr = 1.000, since beginning = 8 mins.
epoch = 3.100, train perp. = 104.977, wps = 5900, dw:norm() = 5.456, lr = 1.000, since beginning = 8 mins.
epoch = 3.200, train perp. = 103.157, wps = 5904, dw:norm() = 5.183, lr = 1.000, since beginning = 8 mins.
epoch = 3.300, train perp. = 101.241, wps = 5908, dw:norm() = 5.057, lr = 1.000, since beginning = 9 mins.
epoch = 3.400, train perp. = 99.492, wps = 5912, dw:norm() = 5.338, lr = 1.000, since beginning = 9 mins.
epoch = 3.500, train perp. = 97.796, wps = 5916, dw:norm() = 5.140, lr = 1.000, since beginning = 9 mins.
epoch = 3.600, train perp. = 96.223, wps = 5919, dw:norm() = 5.509, lr = 1.000, since beginning = 9 mins.
epoch = 3.700, train perp. = 94.739, wps = 5920, dw:norm() = 5.301, lr = 1.000, since beginning = 10 mins.
epoch = 3.799, train perp. = 93.312, wps = 5922, dw:norm() = 5.879, lr = 1.000, since beginning = 10 mins.
epoch = 3.899, train perp. = 91.883, wps = 5925, dw:norm() = 5.287, lr = 1.000, since beginning = 10 mins.
epoch = 3.999, train perp. = 90.557, wps = 5928, dw:norm() = 5.871, lr = 1.000, since beginning = 10 mins.
Validation set perplexity : 129.228
epoch = 4.099, train perp. = 89.258, wps = 5900, dw:norm() = 6.362, lr = 1.000, since beginning = 11 mins.
epoch = 4.199, train perp. = 88.041, wps = 5904, dw:norm() = 5.872, lr = 1.000, since beginning = 11 mins.
epoch = 4.299, train perp. = 86.847, wps = 5907, dw:norm() = 5.515, lr = 1.000, since beginning = 11 mins.
epoch = 4.399, train perp. = 85.725, wps = 5910, dw:norm() = 5.669, lr = 1.000, since beginning = 12 mins.
epoch = 4.498, train perp. = 84.619, wps = 5912, dw:norm() = 6.013, lr = 1.000, since beginning = 12 mins.
epoch = 4.598, train perp. = 83.630, wps = 5915, dw:norm() = 5.475, lr = 1.000, since beginning = 12 mins.
epoch = 4.698, train perp. = 82.630, wps = 5917, dw:norm() = 5.926, lr = 1.000, since beginning = 12 mins.
epoch = 4.798, train perp. = 81.651, wps = 5920, dw:norm() = 6.337, lr = 1.000, since beginning = 13 mins.
epoch = 4.898, train perp. = 80.736, wps = 5922, dw:norm() = 5.998, lr = 1.000, since beginning = 13 mins.
epoch = 4.998, train perp. = 79.884, wps = 5924, dw:norm() = 5.797, lr = 1.000, since beginning = 13 mins.
Validation set perplexity : 127.408
epoch = 5.098, train perp. = 78.803, wps = 5901, dw:norm() = 6.709, lr = 0.500, since beginning = 13 mins.
epoch = 5.198, train perp. = 77.477, wps = 5903, dw:norm() = 5.780, lr = 0.500, since beginning = 14 mins.
epoch = 5.297, train perp. = 75.903, wps = 5906, dw:norm() = 6.548, lr = 0.500, since beginning = 14 mins.
epoch = 5.397, train perp. = 74.283, wps = 5898, dw:norm() = 5.748, lr = 0.500, since beginning = 14 mins.
epoch = 5.497, train perp. = 72.535, wps = 5885, dw:norm() = 6.706, lr = 0.500, since beginning = 14 mins.
epoch = 5.597, train perp. = 70.703, wps = 5884, dw:norm() = 6.130, lr = 0.500, since beginning = 15 mins.
epoch = 5.697, train perp. = 68.802, wps = 5886, dw:norm() = 6.020, lr = 0.500, since beginning = 15 mins.
epoch = 5.797, train perp. = 66.852, wps = 5889, dw:norm() = 6.667, lr = 0.500, since beginning = 15 mins.
epoch = 5.897, train perp. = 64.812, wps = 5891, dw:norm() = 6.888, lr = 0.500, since beginning = 15 mins.
epoch = 5.997, train perp. = 62.675, wps = 5894, dw:norm() = 6.477, lr = 0.500, since beginning = 16 mins.
Validation set perplexity : 120.604
epoch = 6.096, train perp. = 61.519, wps = 5876, dw:norm() = 7.066, lr = 0.250, since beginning = 16 mins.
epoch = 6.196, train perp. = 60.450, wps = 5878, dw:norm() = 6.206, lr = 0.250, since beginning = 16 mins.
epoch = 6.296, train perp. = 59.331, wps = 5881, dw:norm() = 6.496, lr = 0.250, since beginning = 17 mins.
epoch = 6.396, train perp. = 58.248, wps = 5883, dw:norm() = 6.420, lr = 0.250, since beginning = 17 mins.
epoch = 6.496, train perp. = 57.159, wps = 5885, dw:norm() = 6.532, lr = 0.250, since beginning = 17 mins.
epoch = 6.596, train perp. = 56.044, wps = 5888, dw:norm() = 6.597, lr = 0.250, since beginning = 17 mins.
epoch = 6.696, train perp. = 54.947, wps = 5888, dw:norm() = 6.321, lr = 0.250, since beginning = 18 mins.
epoch = 6.796, train perp. = 53.831, wps = 5889, dw:norm() = 6.719, lr = 0.250, since beginning = 18 mins.
epoch = 6.895, train perp. = 52.683, wps = 5892, dw:norm() = 7.492, lr = 0.250, since beginning = 18 mins.
epoch = 6.995, train perp. = 51.514, wps = 5894, dw:norm() = 6.759, lr = 0.250, since beginning = 18 mins.
Validation set perplexity : 119.888
epoch = 7.095, train perp. = 50.914, wps = 5878, dw:norm() = 7.124, lr = 0.125, since beginning = 19 mins.
epoch = 7.195, train perp. = 50.357, wps = 5880, dw:norm() = 6.863, lr = 0.125, since beginning = 19 mins.
epoch = 7.295, train perp. = 49.787, wps = 5882, dw:norm() = 6.866, lr = 0.125, since beginning = 19 mins.
epoch = 7.395, train perp. = 49.225, wps = 5884, dw:norm() = 6.902, lr = 0.125, since beginning = 19 mins.
epoch = 7.495, train perp. = 48.669, wps = 5886, dw:norm() = 6.994, lr = 0.125, since beginning = 20 mins.
epoch = 7.594, train perp. = 48.101, wps = 5887, dw:norm() = 7.311, lr = 0.125, since beginning = 20 mins.
epoch = 7.694, train perp. = 47.532, wps = 5889, dw:norm() = 6.200, lr = 0.125, since beginning = 20 mins.
epoch = 7.794, train perp. = 46.964, wps = 5891, dw:norm() = 7.147, lr = 0.125, since beginning = 20 mins.
epoch = 7.894, train perp. = 46.371, wps = 5890, dw:norm() = 7.208, lr = 0.125, since beginning = 21 mins.
epoch = 7.994, train perp. = 45.763, wps = 5866, dw:norm() = 7.007, lr = 0.125, since beginning = 21 mins.
Validation set perplexity : 120.694
epoch = 8.094, train perp. = 45.463, wps = 5836, dw:norm() = 6.953, lr = 0.062, since beginning = 21 mins.
epoch = 8.194, train perp. = 45.191, wps = 5838, dw:norm() = 7.360, lr = 0.062, since beginning = 22 mins.
epoch = 8.294, train perp. = 44.910, wps = 5840, dw:norm() = 6.803, lr = 0.062, since beginning = 22 mins.
epoch = 8.393, train perp. = 44.628, wps = 5843, dw:norm() = 7.038, lr = 0.062, since beginning = 22 mins.
epoch = 8.493, train perp. = 44.356, wps = 5844, dw:norm() = 7.071, lr = 0.062, since beginning = 23 mins.
epoch = 8.593, train perp. = 44.078, wps = 5847, dw:norm() = 7.185, lr = 0.062, since beginning = 23 mins.
epoch = 8.693, train perp. = 43.787, wps = 5849, dw:norm() = 6.847, lr = 0.062, since beginning = 23 mins.
epoch = 8.793, train perp. = 43.503, wps = 5851, dw:norm() = 7.154, lr = 0.062, since beginning = 23 mins.
epoch = 8.893, train perp. = 43.206, wps = 5853, dw:norm() = 7.368, lr = 0.062, since beginning = 24 mins.
epoch = 8.993, train perp. = 42.904, wps = 5855, dw:norm() = 6.851, lr = 0.062, since beginning = 24 mins.
Validation set perplexity : 121.126
epoch = 9.093, train perp. = 42.749, wps = 5843, dw:norm() = 7.673, lr = 0.031, since beginning = 24 mins.
epoch = 9.192, train perp. = 42.609, wps = 5845, dw:norm() = 7.562, lr = 0.031, since beginning = 24 mins.
epoch = 9.292, train perp. = 42.467, wps = 5846, dw:norm() = 7.170, lr = 0.031, since beginning = 25 mins.
epoch = 9.392, train perp. = 42.326, wps = 5848, dw:norm() = 7.043, lr = 0.031, since beginning = 25 mins.
epoch = 9.492, train perp. = 42.182, wps = 5850, dw:norm() = 7.302, lr = 0.031, since beginning = 25 mins.
epoch = 9.592, train perp. = 42.040, wps = 5851, dw:norm() = 6.767, lr = 0.031, since beginning = 25 mins.
epoch = 9.692, train perp. = 41.890, wps = 5853, dw:norm() = 6.562, lr = 0.031, since beginning = 26 mins.
epoch = 9.792, train perp. = 41.741, wps = 5854, dw:norm() = 7.022, lr = 0.031, since beginning = 26 mins.
epoch = 9.892, train perp. = 41.587, wps = 5856, dw:norm() = 6.936, lr = 0.031, since beginning = 26 mins.
epoch = 9.991, train perp. = 41.427, wps = 5857, dw:norm() = 7.196, lr = 0.031, since beginning = 26 mins.
Validation set perplexity : 121.279
epoch = 10.091, train perp. = 41.344, wps = 5847, dw:norm() = 7.943, lr = 0.016, since beginning = 27 mins.
epoch = 10.191, train perp. = 41.268, wps = 5848, dw:norm() = 7.833, lr = 0.016, since beginning = 27 mins.
epoch = 10.291, train perp. = 41.191, wps = 5850, dw:norm() = 6.838, lr = 0.016, since beginning = 27 mins.
epoch = 10.391, train perp. = 41.117, wps = 5851, dw:norm() = 7.905, lr = 0.016, since beginning = 27 mins.
epoch = 10.491, train perp. = 41.040, wps = 5853, dw:norm() = 7.777, lr = 0.016, since beginning = 28 mins.
epoch = 10.591, train perp. = 40.963, wps = 5855, dw:norm() = 7.496, lr = 0.016, since beginning = 28 mins.
epoch = 10.690, train perp. = 40.883, wps = 5856, dw:norm() = 6.996, lr = 0.016, since beginning = 28 mins.
epoch = 10.790, train perp. = 40.800, wps = 5858, dw:norm() = 6.884, lr = 0.016, since beginning = 29 mins.
epoch = 10.890, train perp. = 40.712, wps = 5859, dw:norm() = 6.615, lr = 0.016, since beginning = 29 mins.
epoch = 10.990, train perp. = 40.622, wps = 5861, dw:norm() = 7.243, lr = 0.016, since beginning = 29 mins.
Validation set perplexity : 121.220
epoch = 11.090, train perp. = 40.575, wps = 5851, dw:norm() = 7.736, lr = 0.008, since beginning = 29 mins.
epoch = 11.190, train perp. = 40.533, wps = 5852, dw:norm() = 7.366, lr = 0.008, since beginning = 30 mins.
epoch = 11.290, train perp. = 40.489, wps = 5854, dw:norm() = 7.565, lr = 0.008, since beginning = 30 mins.
epoch = 11.390, train perp. = 40.446, wps = 5855, dw:norm() = 7.204, lr = 0.008, since beginning = 30 mins.
epoch = 11.489, train perp. = 40.404, wps = 5857, dw:norm() = 7.290, lr = 0.008, since beginning = 30 mins.
epoch = 11.589, train perp. = 40.362, wps = 5858, dw:norm() = 7.373, lr = 0.008, since beginning = 31 mins.
epoch = 11.689, train perp. = 40.317, wps = 5860, dw:norm() = 7.427, lr = 0.008, since beginning = 31 mins.
epoch = 11.789, train perp. = 40.269, wps = 5861, dw:norm() = 7.377, lr = 0.008, since beginning = 31 mins.
epoch = 11.889, train perp. = 40.218, wps = 5862, dw:norm() = 7.501, lr = 0.008, since beginning = 31 mins.
epoch = 11.989, train perp. = 40.167, wps = 5864, dw:norm() = 7.092, lr = 0.008, since beginning = 32 mins.
Validation set perplexity : 120.997
epoch = 12.089, train perp. = 40.137, wps = 5855, dw:norm() = 8.201, lr = 0.004, since beginning = 32 mins.
epoch = 12.189, train perp. = 40.113, wps = 5856, dw:norm() = 6.993, lr = 0.004, since beginning = 32 mins.
epoch = 12.288, train perp. = 40.090, wps = 5858, dw:norm() = 7.350, lr = 0.004, since beginning = 32 mins.
epoch = 12.388, train perp. = 40.065, wps = 5859, dw:norm() = 7.076, lr = 0.004, since beginning = 33 mins.
epoch = 12.488, train perp. = 40.042, wps = 5860, dw:norm() = 7.277, lr = 0.004, since beginning = 33 mins.
epoch = 12.588, train perp. = 40.020, wps = 5862, dw:norm() = 7.907, lr = 0.004, since beginning = 33 mins.
epoch = 12.688, train perp. = 39.995, wps = 5863, dw:norm() = 7.579, lr = 0.004, since beginning = 34 mins.
epoch = 12.788, train perp. = 39.967, wps = 5864, dw:norm() = 6.889, lr = 0.004, since beginning = 34 mins.
epoch = 12.888, train perp. = 39.940, wps = 5865, dw:norm() = 7.454, lr = 0.004, since beginning = 34 mins.
epoch = 12.988, train perp. = 39.912, wps = 5866, dw:norm() = 6.979, lr = 0.004, since beginning = 34 mins.
Validation set perplexity : 120.722
Test set perplexity : 116.110
Training is over.
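
The lr column in the log above is consistent with a simple step schedule: the rate stays at `lr` while the number of completed epochs is at most `max_epoch`, and is then divided by `decay` once per further epoch. The Python sketch below reproduces that column under this assumption. It is inferred from the printed values only, it is not the training script's own code, and the function name `lr_at_epoch` is hypothetical.

```python
def lr_at_epoch(epoch, lr=1.0, decay=2.0, max_epoch=4):
    """Learning rate in effect at a given (fractional) epoch.

    Assumption (inferred from the log, not from the training script):
    the rate is divided by `decay` at each epoch boundary once the
    number of completed epochs exceeds `max_epoch`.
    Defaults mirror the "Network parameters" block of the log.
    """
    completed = int(epoch)  # whole epochs finished so far
    return lr / decay ** max(0, completed - max_epoch)


# Print a few points in the same format as the log's epoch/lr fields.
for e in (4.998, 5.098, 8.094, 12.988):
    print(f"epoch = {e:.3f}, lr = {lr_at_epoch(e):.3f}")
```

With the logged parameters (`lr : 1`, `decay : 2`, `max_epoch : 4`), the rate first drops during epoch 5 and has been halved eight times by epoch 12, which matches the 0.004 printed in the final log lines.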