@mrdrozdov
Last active October 19, 2016 12:51
SPINN Chainer Logs
$ python -m spinn.models.fat_classifier \
--batch_size 32 \
--ckpt_path rnn_scratch_6 \
--connect_tracking_comp \
--data_type snli \
--embedding_data_path ../glove/glove.840B.300d.txt \
--embedding_keep_rate 0.9 \
--eval_data_path ../snli_1.0/snli_1.0_dev.jsonl \
--eval_seq_length 25 \
--experiment_name model0_scratch_6 \
--log_path ../logs_chainer \
--model_dim 300 \
--model_type SPINN \
--seq_length 25 \
--training_data_path ../snli_1.0/snli_1.0_train.jsonl \
--word_embedding_dim 300 \
--eval_interval_steps 1500
[1] Flag values:
{ '?': None,
'allow_gt_transitions_in_eval': False,
'batch_size': 32,
'ckpt_interval_steps': 5000,
'ckpt_on_best_dev_error': True,
'ckpt_path': 'rnn_scratch_6',
'classifier_type': 'MLP',
'clipping_max_value': 5.0,
'connect_tracking_comp': True,
'context_sensitive_shift': False,
'context_sensitive_use_relu': False,
'data_type': 'snli',
'embedding_data_path': '../glove/glove.840B.300d.txt',
'embedding_keep_rate': 0.9,
'eval_data_limit': -1,
'eval_data_path': '../snli_1.0/snli_1.0_dev.jsonl',
'eval_interval_steps': 1500,
'eval_output_paths': None,
'eval_seq_length': 25,
'expanded_eval_only_mode': False,
'experiment_name': 'model0_scratch_6',
'gpu': -1,
'help': None,
'helpshort': None,
'helpxml': None,
'init_range': 0.005,
'initialize_hyp_tracking_state': False,
'l2_lambda': 1e-05,
'learning_rate': 0.001,
'learning_rate_decay_per_10k_steps': 0.75,
'log_path': '../logs_chainer',
'lstm_composition': True,
'model_dim': 300,
'model_type': 'SPINN',
'num_sentence_pair_combination_layers': 2,
'predict_use_cell': False,
'resnet_unit_depth': 2,
'scheduled_sampling_exponent_base': 0.99,
'semantic_classifier_keep_rate': 0.5,
'sentence_pair_combination_layer_dim': 1024,
'seq_length': 25,
'skip_saved_unsavables': False,
'statistics_interval_steps': 100,
'tracking_lstm_hidden_dim': 4,
'training_data_path': '../snli_1.0/snli_1.0_train.jsonl',
'training_steps': 500000,
'transition_cost_scale': 1.0,
'use_difference_feature': True,
'use_gru': False,
'use_product_feature': True,
'use_tracking_lstm': True,
'word_embedding_dim': 300,
'write_predicted_label': False}
Loading ../snli_1.0/snli_1.0_train.jsonl
Loading ../snli_1.0/snli_1.0_dev.jsonl
[1] In open vocabulary mode. Using loaded embeddings without fine-tuning.
[1] Constructing vocabulary...
[1] Found 36675 word types.
[1] Loading vocabulary with 32589 words from ../glove/glove.840B.300d.txt
[1] Preprocessing training data.
[1] Preprocessing eval data: ../snli_1.0/snli_1.0_dev.jsonl
WARNING: May be discarding eval examples.
Skipping 18 examples.
[1] Building model.
Init: /x2h_hypothesis/reduce/treelstm/W_l/W:(750, 150)
Init: /x2h_hypothesis/reduce/treelstm/W_l/b:(750,)
Init: /x2h_hypothesis/reduce/treelstm/W_r/W:(750, 150)
Init: /x2h_hypothesis/reduce/treelstm/W_r/b:(750,)
Init: /h2y/0/W:(1024, 600)
Init: /h2y/0/b:(1024,)
Init: /h2y/1/W:(512, 1024)
Init: /h2y/1/b:(512,)
Init: /h2y/2/W:(3, 512)
Init: /h2y/2/b:(3,)
Init: /projection/W:(300, 300)
Init: /projection/b:(300,)
Init: /x2h_premise/reduce/treelstm/W_l/W:(750, 150)
Init: /x2h_premise/reduce/treelstm/W_l/b:(750,)
Init: /x2h_premise/reduce/treelstm/W_r/W:(750, 150)
Init: /x2h_premise/reduce/treelstm/W_r/b:(750,)
[1] Training.
Training [ ] 1% 2s / 290s
[1] Step: 0 Acc: 0.250000 0.000000 Cost: 1.877214 1.877214 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 100 Acc: 0.500000 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 200 Acc: 0.281250 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 300 Acc: 0.343750 0.000000 Cost: 1.092432 1.092432 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 400 Acc: 0.281250 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 500 Acc: 0.281250 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 600 Acc: 0.343750 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 700 Acc: 0.531250 0.000000 Cost: 1.094383 1.094383 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 800 Acc: 0.250000 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 900 Acc: 0.375000 0.000000 Cost: 1.116331 1.116331 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 1000 Acc: 0.375000 0.000000 Cost: 1.107622 1.107622 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 1100 Acc: 0.531250 0.000000 Cost: 1.097006 1.097006 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 1200 Acc: 0.281250 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 1300 Acc: 0.406250 0.000000 Cost: 1.086858 1.086858 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 1400 Acc: 0.375000 0.000000 Cost: 1.077801 1.077801 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 1500 Acc: 0.437500 0.000000 Cost: 1.106823 1.106823 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 214s / 214s
[1] Step: 1500 Eval acc: 0.355252 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.355252
Training [================================================================================] 100% 305s / 305s
[1] Step: 1600 Acc: 0.343750 0.000000 Cost: 1.085510 1.085510 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 1700 Acc: 0.375000 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 1800 Acc: 0.375000 0.000000 Cost: 1.081344 1.081344 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 1900 Acc: 0.187500 0.000000 Cost: 1.126509 1.126509 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2000 Acc: 0.375000 0.000000 Cost: 1.099937 1.099937 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2100 Acc: 0.218750 0.000000 Cost: 1.087157 1.087157 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2200 Acc: 0.218750 0.000000 Cost: 1.114965 1.114965 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 2300 Acc: 0.406250 0.000000 Cost: 1.075331 1.075331 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2400 Acc: 0.156250 0.000000 Cost: 1.111402 1.111402 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2500 Acc: 0.343750 0.000000 Cost: 1.111017 1.111017 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2600 Acc: 0.437500 0.000000 Cost: 1.078955 1.078955 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 2700 Acc: 0.375000 0.000000 Cost: 1.085429 1.085429 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 2800 Acc: 0.531250 0.000000 Cost: 1.122796 1.122796 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 2900 Acc: 0.375000 0.000000 Cost: 1.146852 1.146852 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 3000 Acc: 0.312500 0.000000 Cost: 1.078576 1.078576 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 214s / 214s
[1] Step: 3000 Eval acc: 0.379988 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.379988
Training [================================================================================] 100% 305s / 305s
[1] Step: 3100 Acc: 0.468750 0.000000 Cost: 1.073898 1.073898 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 3200 Acc: 0.218750 0.000000 Cost: 1.082206 1.082206 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 3300 Acc: 0.468750 0.000000 Cost: 1.063477 1.063477 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 3400 Acc: 0.343750 0.000000 Cost: 1.099557 1.099557 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 3500 Acc: 0.437500 0.000000 Cost: 1.099835 1.099835 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 3600 Acc: 0.593750 0.000000 Cost: 1.078814 1.078814 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 3700 Acc: 0.437500 0.000000 Cost: 1.119319 1.119319 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 3800 Acc: 0.312500 0.000000 Cost: 1.119129 1.119129 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 3900 Acc: 0.406250 0.000000 Cost: 1.069315 1.069315 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 4000 Acc: 0.312500 0.000000 Cost: 1.098226 1.098226 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 4100 Acc: 0.250000 0.000000 Cost: 1.096237 1.096237 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 4200 Acc: 0.312500 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 4300 Acc: 0.406250 0.000000 Cost: 1.106863 1.106863 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 4400 Acc: 0.218750 0.000000 Cost: 1.085950 1.085950 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 4500 Acc: 0.406250 0.000000 Cost: 1.098612 1.098612 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 4500 Eval acc: 0.372252 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
Training [================================================================================] 100% 307s / 307s
[1] Step: 4600 Acc: 0.500000 0.000000 Cost: 1.084996 1.084996 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 4700 Acc: 0.531250 0.000000 Cost: 1.102973 1.102973 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 4800 Acc: 0.531250 0.000000 Cost: 1.077647 1.077647 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 4900 Acc: 0.375000 0.000000 Cost: 1.076459 1.076459 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5000 Acc: 0.218750 0.000000 Cost: 1.111341 1.111341 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5100 Acc: 0.375000 0.000000 Cost: 1.095690 1.095690 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 5200 Acc: 0.281250 0.000000 Cost: 1.081936 1.081936 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5300 Acc: 0.281250 0.000000 Cost: 1.111412 1.111412 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5400 Acc: 0.437500 0.000000 Cost: 1.083025 1.083025 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5500 Acc: 0.375000 0.000000 Cost: 1.100099 1.100099 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 5600 Acc: 0.218750 0.000000 Cost: 1.141485 1.141485 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5700 Acc: 0.187500 0.000000 Cost: 1.095224 1.095224 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 5800 Acc: 0.312500 0.000000 Cost: 1.134330 1.134330 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 5900 Acc: 0.375000 0.000000 Cost: 1.098454 1.098454 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 6000 Acc: 0.250000 0.000000 Cost: 1.100114 1.100114 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 217s / 217s
[1] Step: 6000 Eval acc: 0.378054 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
Training [================================================================================] 100% 308s / 308s
[1] Step: 6100 Acc: 0.312500 0.000000 Cost: 1.093293 1.093293 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 6200 Acc: 0.187500 0.000000 Cost: 1.100203 1.100203 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 6300 Acc: 0.500000 0.000000 Cost: 1.123378 1.123378 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 6400 Acc: 0.343750 0.000000 Cost: 1.093215 1.093215 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 6500 Acc: 0.406250 0.000000 Cost: 1.065132 1.065132 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 6600 Acc: 0.250000 0.000000 Cost: 1.082592 1.082592 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 6700 Acc: 0.406250 0.000000 Cost: 1.082443 1.082443 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 6800 Acc: 0.406250 0.000000 Cost: 1.126872 1.126872 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 6900 Acc: 0.406250 0.000000 Cost: 1.076969 1.076969 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 7000 Acc: 0.437500 0.000000 Cost: 1.078647 1.078647 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 7100 Acc: 0.468750 0.000000 Cost: 1.097078 1.097078 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 7200 Acc: 0.312500 0.000000 Cost: 1.095784 1.095784 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 7300 Acc: 0.406250 0.000000 Cost: 1.082036 1.082036 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 7400 Acc: 0.437500 0.000000 Cost: 1.095886 1.095886 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 7500 Acc: 0.437500 0.000000 Cost: 1.089329 1.089329 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 7500 Eval acc: 0.389963 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.389963
Training [================================================================================] 100% 306s / 306s
[1] Step: 7600 Acc: 0.312500 0.000000 Cost: 1.137161 1.137161 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 7700 Acc: 0.312500 0.000000 Cost: 1.111806 1.111806 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 7800 Acc: 0.406250 0.000000 Cost: 1.106893 1.106893 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 7900 Acc: 0.468750 0.000000 Cost: 1.146132 1.146132 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 8000 Acc: 0.375000 0.000000 Cost: 1.098773 1.098773 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8100 Acc: 0.531250 0.000000 Cost: 1.071780 1.071780 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8200 Acc: 0.468750 0.000000 Cost: 1.088614 1.088614 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8300 Acc: 0.343750 0.000000 Cost: 1.091694 1.091694 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 8400 Acc: 0.312500 0.000000 Cost: 1.089461 1.089461 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8500 Acc: 0.406250 0.000000 Cost: 1.158216 1.158216 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8600 Acc: 0.281250 0.000000 Cost: 1.092488 1.092488 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8700 Acc: 0.281250 0.000000 Cost: 1.091191 1.091191 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 8800 Acc: 0.406250 0.000000 Cost: 1.115413 1.115413 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 8900 Acc: 0.281250 0.000000 Cost: 1.089024 1.089024 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 9000 Acc: 0.406250 0.000000 Cost: 1.067368 1.067368 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 9000 Eval acc: 0.398921 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.398921
Training [================================================================================] 100% 306s / 306s
[1] Step: 9100 Acc: 0.375000 0.000000 Cost: 1.100011 1.100011 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 9200 Acc: 0.468750 0.000000 Cost: 1.090692 1.090692 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 9300 Acc: 0.250000 0.000000 Cost: 1.133367 1.133367 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 9400 Acc: 0.343750 0.000000 Cost: 1.103071 1.103071 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 9500 Acc: 0.406250 0.000000 Cost: 1.098316 1.098316 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 9600 Acc: 0.375000 0.000000 Cost: 1.082073 1.082073 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 9700 Acc: 0.531250 0.000000 Cost: 1.090313 1.090313 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 9800 Acc: 0.343750 0.000000 Cost: 1.108983 1.108983 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 9900 Acc: 0.500000 0.000000 Cost: 1.087912 1.087912 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 10000 Acc: 0.375000 0.000000 Cost: 1.114786 1.114786 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 10100 Acc: 0.375000 0.000000 Cost: 1.116527 1.116527 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 10200 Acc: 0.500000 0.000000 Cost: 1.058278 1.058278 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 10300 Acc: 0.437500 0.000000 Cost: 1.059135 1.059135 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 10400 Acc: 0.281250 0.000000 Cost: 1.138080 1.138080 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 10500 Acc: 0.468750 0.000000 Cost: 1.091681 1.091681 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 214s / 214s
[1] Step: 10500 Eval acc: 0.408795 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.408795
Training [================================================================================] 100% 306s / 306s
[1] Step: 10600 Acc: 0.281250 0.000000 Cost: 1.115492 1.115492 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 10700 Acc: 0.468750 0.000000 Cost: 1.061459 1.061459 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 10800 Acc: 0.312500 0.000000 Cost: 1.120672 1.120672 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 10900 Acc: 0.406250 0.000000 Cost: 1.087863 1.087863 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11000 Acc: 0.406250 0.000000 Cost: 1.071164 1.071164 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 11100 Acc: 0.500000 0.000000 Cost: 1.055906 1.055906 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 11200 Acc: 0.406250 0.000000 Cost: 1.054652 1.054652 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 11300 Acc: 0.375000 0.000000 Cost: 1.050360 1.050360 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11400 Acc: 0.468750 0.000000 Cost: 1.125689 1.125689 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11500 Acc: 0.281250 0.000000 Cost: 1.134853 1.134853 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11600 Acc: 0.406250 0.000000 Cost: 1.078301 1.078301 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 11700 Acc: 0.468750 0.000000 Cost: 1.062546 1.062546 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11800 Acc: 0.500000 0.000000 Cost: 1.082999 1.082999 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 11900 Acc: 0.468750 0.000000 Cost: 1.060507 1.060507 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12000 Acc: 0.375000 0.000000 Cost: 1.092496 1.092496 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 215s / 215s
[1] Step: 12000 Eval acc: 0.401059 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
Training [================================================================================] 100% 306s / 306s
[1] Step: 12100 Acc: 0.312500 0.000000 Cost: 1.094483 1.094483 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12200 Acc: 0.343750 0.000000 Cost: 1.041826 1.041826 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 12300 Acc: 0.406250 0.000000 Cost: 1.068268 1.068268 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12400 Acc: 0.500000 0.000000 Cost: 1.064596 1.064596 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 12500 Acc: 0.281250 0.000000 Cost: 1.098655 1.098655 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12600 Acc: 0.468750 0.000000 Cost: 1.038236 1.038236 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 12700 Acc: 0.468750 0.000000 Cost: 1.054318 1.054318 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12800 Acc: 0.375000 0.000000 Cost: 1.083427 1.083427 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 12900 Acc: 0.437500 0.000000 Cost: 1.060724 1.060724 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 13000 Acc: 0.312500 0.000000 Cost: 1.084953 1.084953 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 13100 Acc: 0.406250 0.000000 Cost: 1.072052 1.072052 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 13200 Acc: 0.218750 0.000000 Cost: 1.081473 1.081473 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 13300 Acc: 0.312500 0.000000 Cost: 1.101915 1.101915 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 13400 Acc: 0.375000 0.000000 Cost: 1.060277 1.060277 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 13500 Acc: 0.375000 0.000000 Cost: 1.108615 1.108615 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 13500 Eval acc: 0.411543 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
Training [================================================================================] 100% 307s / 307s
[1] Step: 13600 Acc: 0.531250 0.000000 Cost: 1.050899 1.050899 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 13700 Acc: 0.437500 0.000000 Cost: 1.019719 1.019719 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 13800 Acc: 0.343750 0.000000 Cost: 1.073750 1.073750 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 13900 Acc: 0.406250 0.000000 Cost: 1.090007 1.090007 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14000 Acc: 0.437500 0.000000 Cost: 1.108856 1.108856 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14100 Acc: 0.406250 0.000000 Cost: 1.110420 1.110420 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 14200 Acc: 0.375000 0.000000 Cost: 1.070153 1.070153 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14300 Acc: 0.437500 0.000000 Cost: 1.076140 1.076140 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 14400 Acc: 0.343750 0.000000 Cost: 1.143887 1.143887 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14500 Acc: 0.437500 0.000000 Cost: 1.045461 1.045461 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 14600 Acc: 0.312500 0.000000 Cost: 1.074042 1.074042 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14700 Acc: 0.406250 0.000000 Cost: 1.069158 1.069158 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 14800 Acc: 0.468750 0.000000 Cost: 1.077461 1.077461 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 14900 Acc: 0.437500 0.000000 Cost: 1.067259 1.067259 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15000 Acc: 0.375000 0.000000 Cost: 1.094726 1.094726 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 15000 Eval acc: 0.415106 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.415106
Training [================================================================================] 100% 305s / 305s
[1] Step: 15100 Acc: 0.500000 0.000000 Cost: 1.030034 1.030034 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15200 Acc: 0.406250 0.000000 Cost: 1.082673 1.082673 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15300 Acc: 0.218750 0.000000 Cost: 1.163438 1.163438 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 15400 Acc: 0.468750 0.000000 Cost: 1.115485 1.115485 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 15500 Acc: 0.406250 0.000000 Cost: 1.103464 1.103464 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15600 Acc: 0.281250 0.000000 Cost: 1.064159 1.064159 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15700 Acc: 0.437500 0.000000 Cost: 1.115651 1.115651 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 15800 Acc: 0.500000 0.000000 Cost: 1.065050 1.065050 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 15900 Acc: 0.375000 0.000000 Cost: 1.072550 1.072550 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16000 Acc: 0.375000 0.000000 Cost: 1.090301 1.090301 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16100 Acc: 0.343750 0.000000 Cost: 1.105175 1.105175 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16200 Acc: 0.406250 0.000000 Cost: 1.021194 1.021194 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 16300 Acc: 0.312500 0.000000 Cost: 1.087172 1.087172 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16400 Acc: 0.437500 0.000000 Cost: 1.117678 1.117678 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16500 Acc: 0.468750 0.000000 Cost: 1.065603 1.065603 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 215s / 215s
[1] Step: 16500 Eval acc: 0.429662 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.429662
Training [================================================================================] 100% 306s / 306s
[1] Step: 16600 Acc: 0.343750 0.000000 Cost: 1.140348 1.140348 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16700 Acc: 0.281250 0.000000 Cost: 1.072178 1.072178 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 16800 Acc: 0.468750 0.000000 Cost: 1.117922 1.117922 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 16900 Acc: 0.406250 0.000000 Cost: 1.084913 1.084913 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17000 Acc: 0.406250 0.000000 Cost: 1.052146 1.052146 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17100 Acc: 0.468750 0.000000 Cost: 1.055959 1.055959 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17200 Acc: 0.468750 0.000000 Cost: 1.065861 1.065861 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17300 Acc: 0.468750 0.000000 Cost: 1.028856 1.028856 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 17400 Acc: 0.468750 0.000000 Cost: 1.130354 1.130354 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17500 Acc: 0.312500 0.000000 Cost: 1.116898 1.116898 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17600 Acc: 0.468750 0.000000 Cost: 1.073465 1.073465 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 17700 Acc: 0.500000 0.000000 Cost: 1.066458 1.066458 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 17800 Acc: 0.500000 0.000000 Cost: 1.065701 1.065701 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 17900 Acc: 0.406250 0.000000 Cost: 0.994078 0.994078 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18000 Acc: 0.437500 0.000000 Cost: 1.077106 1.077106 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 214s / 214s
[1] Step: 18000 Eval acc: 0.435566 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.435566
Training [================================================================================] 100% 306s / 306s
[1] Step: 18100 Acc: 0.531250 0.000000 Cost: 1.078030 1.078030 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 18200 Acc: 0.343750 0.000000 Cost: 1.056519 1.056519 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18300 Acc: 0.500000 0.000000 Cost: 1.041512 1.041512 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 18400 Acc: 0.531250 0.000000 Cost: 1.099308 1.099308 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18500 Acc: 0.468750 0.000000 Cost: 1.080408 1.080408 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18600 Acc: 0.406250 0.000000 Cost: 1.034689 1.034689 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18700 Acc: 0.500000 0.000000 Cost: 1.078049 1.078049 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 18800 Acc: 0.343750 0.000000 Cost: 1.034947 1.034947 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 18900 Acc: 0.531250 0.000000 Cost: 1.074095 1.074095 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 19000 Acc: 0.437500 0.000000 Cost: 1.094167 1.094167 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 19100 Acc: 0.437500 0.000000 Cost: 1.100367 1.100367 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 19200 Acc: 0.562500 0.000000 Cost: 1.015020 1.015020 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 19300 Acc: 0.625000 0.000000 Cost: 0.998726 0.998726 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 19400 Acc: 0.500000 0.000000 Cost: 1.068777 1.068777 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 19500 Acc: 0.437500 0.000000 Cost: 1.110436 1.110436 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 214s / 214s
[1] Step: 19500 Eval acc: 0.457655 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.457655
Training [================================================================================] 100% 305s / 305s
[1] Step: 19600 Acc: 0.437500 0.000000 Cost: 1.050200 1.050200 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 19700 Acc: 0.437500 0.000000 Cost: 1.083301 1.083301 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 19800 Acc: 0.375000 0.000000 Cost: 1.112352 1.112352 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 19900 Acc: 0.468750 0.000000 Cost: 1.037724 1.037724 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 20000 Acc: 0.468750 0.000000 Cost: 1.090298 1.090298 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 20100 Acc: 0.375000 0.000000 Cost: 1.140127 1.140127 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 20200 Acc: 0.468750 0.000000 Cost: 1.066080 1.066080 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 20300 Acc: 0.593750 0.000000 Cost: 0.987435 0.987435 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 20400 Acc: 0.343750 0.000000 Cost: 1.074129 1.074129 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 20500 Acc: 0.343750 0.000000 Cost: 1.044995 1.044995 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 20600 Acc: 0.343750 0.000000 Cost: 1.165304 1.165304 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 20700 Acc: 0.406250 0.000000 Cost: 1.019419 1.019419 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 20800 Acc: 0.343750 0.000000 Cost: 1.084186 1.084186 0.000000 l2-not-exposed
Training [================================================================================] 100% 90s / 90s
[1] Step: 20900 Acc: 0.406250 0.000000 Cost: 1.101259 1.101259 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 21000 Acc: 0.437500 0.000000 Cost: 1.038850 1.038850 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 216s / 216s
[1] Step: 21000 Eval acc: 0.483713 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.483713
Training [================================================================================] 100% 307s / 307s
[1] Step: 21100 Acc: 0.593750 0.000000 Cost: 0.869412 0.869412 0.000000 l2-not-exposed
Training [================================================================================] 100% 89s / 89s
[1] Step: 21200 Acc: 0.593750 0.000000 Cost: 0.965400 0.965400 0.000000 l2-not-exposed
Training [================================================================================] 100% 93s / 93s
[1] Step: 21300 Acc: 0.500000 0.000000 Cost: 1.011637 1.011637 0.000000 l2-not-exposed
Training [================================================================================] 100% 94s / 94s
[1] Step: 21400 Acc: 0.656250 0.000000 Cost: 1.006077 1.006077 0.000000 l2-not-exposed
Training [================================================================================] 100% 91s / 91s
[1] Step: 21500 Acc: 0.531250 0.000000 Cost: 0.986071 0.986071 0.000000 l2-not-exposed
Training [================================================================================] 100% 93s / 93s
[1] Step: 21600 Acc: 0.562500 0.000000 Cost: 0.811805 0.811805 0.000000 l2-not-exposed
Training [================================================================================] 100% 96s / 96s
[1] Step: 21700 Acc: 0.437500 0.000000 Cost: 1.051043 1.051043 0.000000 l2-not-exposed
Training [================================================================================] 100% 103s / 103s
[1] Step: 21800 Acc: 0.406250 0.000000 Cost: 1.096644 1.096644 0.000000 l2-not-exposed
Training [================================================================================] 100% 93s / 93s
[1] Step: 21900 Acc: 0.562500 0.000000 Cost: 1.032705 1.032705 0.000000 l2-not-exposed
Training [================================================================================] 100% 92s / 92s
[1] Step: 22000 Acc: 0.468750 0.000000 Cost: 1.000828 1.000828 0.000000 l2-not-exposed
Training [================================================================================] 100% 95s / 95s
[1] Step: 22100 Acc: 0.437500 0.000000 Cost: 1.107982 1.107982 0.000000 l2-not-exposed
Training [================================================================================] 100% 101s / 101s
[1] Step: 22200 Acc: 0.437500 0.000000 Cost: 1.017677 1.017677 0.000000 l2-not-exposed
Training [================================================================================] 100% 101s / 101s
[1] Step: 22300 Acc: 0.468750 0.000000 Cost: 1.033083 1.033083 0.000000 l2-not-exposed
Training [================================================================================] 100% 99s / 99s
[1] Step: 22400 Acc: 0.343750 0.000000 Cost: 1.157516 1.157516 0.000000 l2-not-exposed
Training [================================================================================] 100% 96s / 96s
[1] Step: 22500 Acc: 0.437500 0.000000 Cost: 1.085885 1.085885 0.000000 l2-not-exposed
Run Eval [================================================================================] 100% 243s / 243s
[1] Step: 22500 Eval acc: 0.510586 0.000000 ../snli_1.0/snli_1.0_dev.jsonl
[1] [TODO: NOT IMPLEMENTED] Checkpointing with new best dev accuracy of 0.510586