DDPG solution
Jupyter notebook file at
https://github.com/lirnli/OpenAI-gym-solutions/blob/master/Continuous_Deep_Deterministic_Policy_Gradient_Net/DDPG%20Class%20ver2.ipynb
## A simple Q-learning net with memory replay
import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt
import gym
# Get env parameters
GYM_NAME = 'CartPole-v0'
env = gym.make(GYM_NAME)
obs_shape = env.observation_space.shape
# Two Q nets are used. Each net uses itself to estimate Q(t,a) but uses the other net to estimate argmax Q(t+1,a).
# The Q nets are still trained with a one-step algorithm.
# Based on arXiv:1509.06461 [cs.LG]
# https://lirnli.wordpress.com/2017/08/17/debugging-reinforcement-neural-network-deep-q-net/
#
# Hyperparameter summary:
# reward decay rate = 0.999
# memory replay (with 10000 observations)
# AdamOptimizer, with learning rate decay
# tanh activation function
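The notebook cells that implement the scheme above are not shown here, so the following is only a minimal NumPy sketch of the two ideas named in the comments: a fixed-size replay memory and a one-step double-Q target. The names `ReplayMemory` and `double_q_target` are hypothetical and not taken from the notebook. The sketch assumes the formulation in arXiv:1509.06461, where one net selects the greedy next action and the other net evaluates it; the notebook may assign the two roles differently.

```python
import random
from collections import deque

import numpy as np

GAMMA = 0.999        # reward decay rate from the hyperparameter summary
MEMORY_SIZE = 10000  # replay capacity from the hyperparameter summary


class ReplayMemory:
    """Hypothetical FIFO buffer of (obs, action, reward, next_obs, done) tuples."""

    def __init__(self, capacity=MEMORY_SIZE):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def double_q_target(rewards, dones, q_next_select, q_next_eval, gamma=GAMMA):
    """One-step double-Q target (assumed roles: select with one net, evaluate with the other).

    q_next_select, q_next_eval: arrays of shape (batch, n_actions) holding
    each net's Q-value estimates for the next observation.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=bool)
    q_next_select = np.asarray(q_next_select, dtype=np.float64)
    q_next_eval = np.asarray(q_next_eval, dtype=np.float64)

    # Greedy action according to the selecting net.
    best_actions = np.argmax(q_next_select, axis=1)
    # Value of that action according to the evaluating net.
    bootstrap = q_next_eval[np.arange(len(best_actions)), best_actions]
    # Terminal transitions get no bootstrap term.
    return rewards + gamma * bootstrap * (~dones)
```

In a training loop, the array passed as `q_next_select` would come from one network and `q_next_eval` from the other, both evaluated on a batch drawn with `ReplayMemory.sample`; swapping the two arguments gives the target for the second network.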