"""
Implementation of DDPG - Deep Deterministic Policy Gradient
Algorithm and hyperparameter details can be found here: http://arxiv.org/pdf/1509.02971v2.pdf
Variance scaling paper: https://arxiv.org/pdf/1502.01852v1.pdf
Thanks to GitHub users yanpanlau, pemami4911, songrotek and JunhongXu for their DDPG examples
Batch normalisation on the actor accelerates learning but has poor long-term stability. Applying it to the critic
breaks learning, particularly on the state branch. Not sure why, but I think this issue is specific to this environment.
"""
import numpy as np