I used some aggressive measures to reach 19 episodes before solve, including:
1) A large batch_size for experience replay.
2) 30 epochs for every training batch.
3) Zero probability of taking a random action.
With this approach, the NN starts to "know" how to balance at around episodes 10-15.
Keras was used to create and train the NN.
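The three settings above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the 30 epochs and the zero random-action probability come from the notes, while the environment size (a CartPole-style task with 4 observations and 2 actions), the network architecture, `BATCH_SIZE = 256`, `GAMMA`, the memory capacity, and the helper names `build_model`, `choose_action`, and `replay` are all assumptions made for the example.

```python
import random
from collections import deque

import numpy as np

# Only EPOCHS and EPSILON are taken from the notes; everything else is assumed.
STATE_SIZE = 4      # observation size (assumed CartPole-style environment)
N_ACTIONS = 2       # number of discrete actions (assumed)
BATCH_SIZE = 256    # "large" replay batch; the exact value is not stated
EPOCHS = 30         # epochs for every training batch, as stated
EPSILON = 0.0       # probability of a random action, as stated
GAMMA = 0.95        # discount factor (assumed)

# Replay memory of (state, action, reward, next_state, done) tuples (capacity assumed).
memory = deque(maxlen=2000)


def build_model():
    """Small Keras MLP mapping a state to one Q-value per action (assumed architecture)."""
    from tensorflow import keras  # imported here so the helpers below do not require it
    model = keras.Sequential([
        keras.Input(shape=(STATE_SIZE,)),
        keras.layers.Dense(24, activation="relu"),
        keras.layers.Dense(24, activation="relu"),
        keras.layers.Dense(N_ACTIONS, activation="linear"),
    ])
    model.compile(loss="mse", optimizer="adam")
    return model


def choose_action(model, state, epsilon=EPSILON):
    """Epsilon-greedy choice; with epsilon = 0 the greedy action is always taken."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    q_values = model.predict(state.reshape(1, -1), verbose=0)
    return int(np.argmax(q_values[0]))


def replay(model, memory, batch_size=BATCH_SIZE, epochs=EPOCHS, gamma=GAMMA):
    """Sample one batch from memory and fit the network on it for several epochs."""
    batch = random.sample(list(memory), min(batch_size, len(memory)))
    states = np.array([t[0] for t in batch], dtype=np.float32)
    actions = np.array([t[1] for t in batch])
    rewards = np.array([t[2] for t in batch], dtype=np.float32)
    next_states = np.array([t[3] for t in batch], dtype=np.float32)
    dones = np.array([t[4] for t in batch], dtype=bool)

    # Start from the current predictions so only the taken action's target changes.
    targets = np.array(model.predict(states, verbose=0))
    next_q = model.predict(next_states, verbose=0).max(axis=1)
    # Q-learning target: r for terminal steps, r + gamma * max Q(s', a') otherwise.
    targets[np.arange(len(batch)), actions] = rewards + gamma * next_q * (~dones)
    model.fit(states, targets, epochs=epochs, verbose=0)
    return states, targets
```

In a training loop under these assumptions, each environment step would append a transition to `memory`, and `replay(model, memory)` would be called once enough transitions are stored. Fitting the same batch for 30 epochs with no exploration is what makes the approach aggressive: it extracts a lot from few episodes, at the risk of overfitting to early experience.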