Alvaro avalcarce

Nokia Bell-Labs
Paris

avalcarce / README.md

Created March 8, 2017 09:27

Solving Acrobot-v1 with Double DQN and Prioritized Experience Replay (with proportional prioritization)

Synopsis

This is a Deep Reinforcement Learning solution to the Acrobot-v1 environment in OpenAI's Gym. The code uses TensorFlow to model the value function of a Reinforcement Learning agent. I've run it with TensorFlow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Double Deep Q Network (Double DQN) with Prioritized Experience Replay (PER), where the proportional prioritization variant has been implemented. Most hyperparameters have been chosen by hand based on several experiments. However, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.
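To illustrate what "Double" changes, here is a minimal sketch of the bootstrap target in its usual textbook form. It is not taken from this gist's code: the function names, the discount factor and the Q-values below are hypothetical, and plain Python lists stand in for the network outputs.

```python
def dqn_target(reward, next_q_target, gamma, done):
    """Standard DQN target: the target network both selects and evaluates
    the next action (max over its own estimates)."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double DQN target: the online network selects the next action and
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the three Acrobot-v1 actions in the next state.
next_q_online = [-10.0, -8.0, -9.0]   # online network prefers action 1
next_q_target = [-7.0, -9.5, -8.5]    # target network prefers action 0

y_dqn = dqn_target(-1.0, next_q_target, 0.99, False)
y_double = double_dqn_target(-1.0, next_q_online, next_q_target, 0.99, False)
```

In this made-up example the two targets differ because the networks disagree on the best action. Decoupling selection from evaluation in this way is what is generally credited with reducing the overestimation of plain DQN.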

The hyperparameters are:

avalcarce / README.md

Created March 8, 2017 11:53

Solving Acrobot-v1 with DQN and Prioritized Experience Replay (with proportional prioritization)

Synopsis

The algorithm is a Deep Q Network (DQN) with Prioritized Experience Replay (PER), where the proportional prioritization variant has been implemented. Most hyperparameters have been chosen by hand based on several experiments. However, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.
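As a rough sketch of how alpha and beta enter proportional prioritization, the snippet below follows the commonly used formulation: transition i is sampled with probability p_i^alpha / sum_k p_k^alpha, and its update is scaled by the importance sampling weight (N * P(i))^(-beta), normalized by the largest weight. The function names, the epsilon constant and the example values are assumptions for illustration and do not come from this gist. A plain list is used instead of the sum-tree that practical implementations typically rely on.

```python
import random


def priority_from_td_error(td_error, epsilon=1e-6):
    """Priority of a transition: |TD error| plus a small constant so that
    no transition ends up with zero sampling probability."""
    return abs(td_error) + epsilon


def sampling_probabilities(priorities, alpha):
    """P(i) = p_i^alpha / sum_k p_k^alpha; alpha = 0 gives uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probabilities, beta):
    """w_i = (N * P(i))^(-beta), divided by max(w) so weights only scale down."""
    n = len(probabilities)
    weights = [(n * p) ** (-beta) for p in probabilities]
    max_w = max(weights)
    return [w / max_w for w in weights]


def sample_batch(probabilities, batch_size):
    """Draw transition indices (with replacement) according to P(i)."""
    indices = range(len(probabilities))
    return random.choices(indices, weights=probabilities, k=batch_size)


# Hypothetical priorities for a three-transition buffer.
probs = sampling_probabilities([1.0, 1.0, 2.0], alpha=1.0)
weights = importance_weights(probs, beta=1.0)
batch = sample_batch(probs, batch_size=4)
```

With alpha = 1 the third transition is sampled twice as often as the others, and with beta = 1 its weight is halved to compensate. In the usual formulation, beta starts at beta0 and is annealed towards 1 over the course of training.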

The hyperparameters are:
