@pythonlessons
Created January 15, 2020 07:56
05_CartPole-reinforcement-learning_PER_D3QN
class Memory(object):  # stored as (state, action, reward, next_state) in SumTree
    PER_e = 0.01  # Small constant added to priorities so that no experience has zero probability of being sampled
    PER_a = 0.6  # Trade-off between sampling only high-priority experiences and sampling uniformly at random
    PER_b = 0.4  # Importance-sampling exponent, increased from this initial value up to 1
    PER_b_increment_per_sampling = 0.001
    absolute_error_upper = 1.  # Upper bound used to clip the absolute error

    def __init__(self, capacity):
        # Build the SumTree
        self.tree = SumTree(capacity)
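
The rest of the Memory class is not shown here, so the following is only a minimal sketch of how hyperparameters like these are commonly used in prioritized experience replay. It assumes the usual formulation: priority p = min(|error| + PER_e, absolute_error_upper) ** PER_a, sampling probability P(i) = p_i / sum(p), and importance-sampling weight w_i = (N * P(i)) ** (-PER_b), normalized by the largest weight. The function names are hypothetical and are not taken from the gist, and plain NumPy arrays stand in for the SumTree.

```python
import numpy as np

# Values copied from the Memory class above.
PER_e = 0.01
PER_a = 0.6
PER_b = 0.4
PER_b_increment_per_sampling = 0.001
absolute_error_upper = 1.0


def priorities_from_errors(td_errors):
    """Hypothetical helper: turn TD errors into priorities.

    Adds PER_e so no priority is zero, clips at absolute_error_upper,
    then raises to the power PER_a.
    """
    abs_errors = np.abs(np.asarray(td_errors, dtype=np.float64)) + PER_e
    clipped = np.minimum(abs_errors, absolute_error_upper)
    return clipped ** PER_a


def sampling_probabilities(priorities):
    """Hypothetical helper: normalize priorities into a distribution."""
    priorities = np.asarray(priorities, dtype=np.float64)
    return priorities / priorities.sum()


def importance_weights(probabilities, beta):
    """Hypothetical helper: importance-sampling weights, scaled so max is 1."""
    probabilities = np.asarray(probabilities, dtype=np.float64)
    n = len(probabilities)
    weights = (n * probabilities) ** (-beta)
    return weights / weights.max()


def anneal_beta(beta):
    """Hypothetical helper: move beta one step toward 1, never past it."""
    return min(1.0, beta + PER_b_increment_per_sampling)


if __name__ == "__main__":
    errors = [0.0, 0.2, 5.0]
    p = priorities_from_errors(errors)
    probs = sampling_probabilities(p)
    w = importance_weights(probs, PER_b)
    print("priorities:", p)
    print("probabilities:", probs)
    print("IS weights:", w)
    print("next beta:", anneal_beta(PER_b))
```

With PER_a = 0 every experience would get the same priority (uniform sampling), while PER_a = 1 would sample in direct proportion to the clipped error. Raising PER_b toward 1 makes the weights correct more fully for the bias that non-uniform sampling introduces.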