Last active June 18, 2018 07:20
dqn-training-0615214336
config = {
    'network': [
        ('input', {}),
        ('conv1', {'W_size': 8, 'stride': 4, 'in': 4, 'out': 16}),
        ('conv2', {'W_size': 4, 'stride': 2, 'in': 16, 'out': 32}),
        ('fc1', {'num_relus': 256}),
        ('output', {}),
    ],
    'input_size': [84, 84],  # height, width
    'num_actions': 4,
    'var_init_mean': 0.0,
    'var_init_stddev': 0.01,
    'minibatch_size': 32,
    'replay_memory_size': 10 ** 6,
    'agent_history_length': 4,
    'discount_factor': 0.95,
    'learning_rate': 0.00025,
    'rms_prop_decay': 0.95,
    'gradient_momentum': 0.0,
    'min_squared_gradient': 0.01,
    'final_exploration': 0.1,
    'final_exploration_frame': 10 ** 6,
    'replay_start_size': 5 * (10 ** 4),
    'validation_size': 500,
    'evaluation_exploration': 0.05,
}
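The layer parameters above determine the size of each feature map, which a short calculation makes concrete. The sketch below assumes unpadded ('VALID') convolutions with square kernels; the config does not state the padding mode, so the resulting shapes are illustrative. The helper names `conv_output_size` and `feature_map_shapes` are hypothetical and not part of the original code.

```python
def conv_output_size(size, kernel, stride):
    """Spatial size after a convolution, assuming no padding ('VALID')."""
    return (size - kernel) // stride + 1


def feature_map_shapes(input_size, conv_layers):
    """Return (height, width, channels) after each convolutional layer."""
    height, width = input_size
    shapes = []
    for layer in conv_layers:
        height = conv_output_size(height, layer['W_size'], layer['stride'])
        width = conv_output_size(width, layer['W_size'], layer['stride'])
        shapes.append((height, width, layer['out']))
    return shapes


# Values copied from the 'conv1' and 'conv2' entries of the config.
conv_layers = [
    {'W_size': 8, 'stride': 4, 'in': 4, 'out': 16},
    {'W_size': 4, 'stride': 2, 'in': 16, 'out': 32},
]

shapes = feature_map_shapes([84, 84], conv_layers)
flat_units = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

print(shapes)      # [(20, 20, 16), (9, 9, 32)]
print(flat_units)  # 2592 inputs to 'fc1' under the no-padding assumption
```

With 'SAME' padding the sizes would differ, so the numbers above hold only if the network was built without padding.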
2018-06-18 12:12:18,471 - __main__ - INFO - Load model: checkpoints/0615214336/breakout-v4-7500000
2018-06-18 12:12:42,881 - __main__ - INFO - episode: 1, reward: 11, ave. reward: 11,
2018-06-18 12:13:39,545 - __main__ - INFO - episode: 2, reward: 11, ave. reward: 11,
2018-06-18 12:14:16,055 - __main__ - INFO - episode: 3, reward: 11, ave. reward: 11,
2018-06-18 12:15:12,720 - __main__ - INFO - episode: 4, reward: 11, ave. reward: 11,
2018-06-18 12:16:03,299 - __main__ - INFO - episode: 5, reward: 11, ave. reward: 11,
2018-06-18 12:17:04,777 - __main__ - INFO - episode: 6, reward: 11, ave. reward: 11,
2018-06-18 12:17:55,533 - __main__ - INFO - episode: 7, reward: 9, ave. reward: 10.7143,
2018-06-18 12:18:24,553 - __main__ - INFO - episode: 8, reward: 11, ave. reward: 10.75,
2018-06-18 12:19:00,727 - __main__ - INFO - episode: 9, reward: 11, ave. reward: 10.7778,
2018-06-18 12:19:32,899 - __main__ - INFO - episode: 10, reward: 11, ave. reward: 10.8,
2018-06-18 12:19:59,240 - __main__ - INFO - Finished: the best reward 11, the ave. reward 10.8.
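The "ave. reward" column in the log above appears to be a running mean over the episodes played so far. The sketch below parses episode lines and recomputes that mean as a sanity check. The regular expression and the function names `parse_rewards` and `running_averages` are assumptions based on the log format shown here, not the author's evaluation code.

```python
import re

# Matches the per-episode part of a log line, e.g. "episode: 7, reward: 9".
EPISODE_RE = re.compile(r"episode: (\d+), reward: (-?\d+(?:\.\d+)?)")


def parse_rewards(lines):
    """Extract the per-episode reward from each matching log line."""
    rewards = []
    for line in lines:
        match = EPISODE_RE.search(line)
        if match:
            rewards.append(float(match.group(2)))
    return rewards


def running_averages(rewards):
    """Mean of all rewards seen so far, one value per episode."""
    averages, total = [], 0.0
    for count, reward in enumerate(rewards, start=1):
        total += reward
        averages.append(total / count)
    return averages


# Two lines copied from the log above.
sample = [
    "2018-06-18 12:17:04,777 - __main__ - INFO - episode: 6, reward: 11, ave. reward: 11,",
    "2018-06-18 12:17:55,533 - __main__ - INFO - episode: 7, reward: 9, ave. reward: 10.7143,",
]
print(parse_rewards(sample))  # [11.0, 9.0]

# The ten episode rewards reported in the log.
rewards = [11] * 6 + [9] + [11] * 3
averages = running_averages(rewards)
print(round(averages[6], 4), round(averages[-1], 4))  # 10.7143 10.8
```

The recomputed values match the logged averages for episodes 7 and 10, which is consistent with the running-mean reading.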
iter_1500000 the best reward 2, the ave. reward 0.2.
iter_2000000 the best reward 1, the ave. reward 0.1.
iter_4500000 the best reward 11, the ave. reward 10.7.
iter_5500000 the best reward 11, the ave. reward 10.2.
iter_7000000 the best reward 11, the ave. reward 10.
iter_7500000 the best reward 11, the ave. reward 10.8.
iter_10500000 the best reward 11, the ave. reward 10.7.
iter_12500000 the best reward 3, the ave. reward 2.1.
iter_13000000 the best reward 14, the ave. reward 8.2.
iter_13500000 the best reward 5, the ave. reward 4.2.
iter_14000000 the best reward 3, the ave. reward 3.
iter_14500000 the best reward 7, the ave. reward 3.4.
iter_15000000 the best reward 9, the ave. reward 7.1.
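To compare checkpoints, the summary lines above can be parsed into tuples and ranked. The following is a minimal sketch; the regular expression and the name `parse_summary` are assumptions derived from the line format shown, and only three of the lines are reproduced as sample input.

```python
import re

# Captures the iteration count, the best reward, and the average reward.
SUMMARY_RE = re.compile(
    r"iter_(\d+) the best reward (-?\d+(?:\.\d+)?), "
    r"the ave\. reward (-?\d+(?:\.\d+)?)"
)


def parse_summary(lines):
    """Return (iteration, best_reward, ave_reward) for each matching line."""
    rows = []
    for line in lines:
        match = SUMMARY_RE.search(line)
        if match:
            rows.append((
                int(match.group(1)),
                float(match.group(2)),
                float(match.group(3)),
            ))
    return rows


# Three lines copied from the summary above.
summary = [
    "iter_7000000 the best reward 11, the ave. reward 10.",
    "iter_7500000 the best reward 11, the ave. reward 10.8.",
    "iter_13000000 the best reward 14, the ave. reward 8.2.",
]

rows = parse_summary(summary)
best_by_average = max(rows, key=lambda row: row[2])
best_by_peak = max(rows, key=lambda row: row[1])

print(best_by_average)  # (7500000, 11.0, 10.8)
print(best_by_peak)     # (13000000, 14.0, 8.2)
```

Ranking by average reward and by peak reward picks different checkpoints here, so which one counts as "best" depends on the criterion chosen.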
train_op_loss