Hi, first, thanks so much for your detailed write-ups and commented implementations. I have been working through them while developing my own RL environment outside of gym.
I have a few questions regarding the Double-DQN implementation here:
- The Double-DQN paper (https://arxiv.org/pdf/1511.06581.pdf) algorithm mentions updating \theta at each step t. It looks like the implementation here updates \theta every update_freq steps and updates \theta- immediately afterwards. Is there something I don't understand? I guess it ends up being a heuristic decision about when to perform these updates; I'm just wondering what your intuition is for the \theta, \theta- update cycle.
- Second is your nice TensorFlow hack to update the targetQ weights. Does it rely on the order of initialization? Might there be a more verbose but explicit way to do it, for example storing the targetQ ops by name in a dictionary? (A sketch of this idea follows the list of questions.)
- Last, is there a reason for not using a nonlinearity/activation in the network?
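
Regarding the second question, here is a minimal sketch of the "explicit" alternative: keying the target-update ops by variable name instead of relying on the order in which the trainable variables were created. It assumes TF 1.x and that the main and target networks are built under variable scopes named "main" and "target" (scope names chosen here for illustration; the gist itself does not use scopes).

```python
import tensorflow as tf

def make_target_update_ops(tau=0.001):
    # Collect variables per scope and key them by their name within the scope.
    main_vars = {v.name.split("/", 1)[1]: v
                 for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="main")}
    target_vars = {v.name.split("/", 1)[1]: v
                   for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="target")}
    update_ops = []
    for name, t_var in target_vars.items():
        m_var = main_vars[name]  # KeyError here means the two graphs do not match
        # Soft update: theta_target <- tau * theta_main + (1 - tau) * theta_target
        update_ops.append(tf.assign(t_var, tau * m_var + (1.0 - tau) * t_var))
    return tf.group(*update_ops)
```

This only makes the pairing explicit; whether it is preferable to slicing an ordered variable list is largely a matter of taste.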
I would like to ask a question: do we have to split the inputs in order to achieve dueling DQN?
Why can't I just feed all of the inputs into both the value layer and the advantage layer?
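
For reference, a minimal sketch of the variant this question describes, where the same flattened features feed both streams rather than being split in half. It assumes TF 1.x; `features`, `h_size`, and `n_actions` are placeholder names for illustration, not identifiers from the gist.

```python
import tensorflow as tf

def dueling_head(features, h_size, n_actions):
    # Value stream: a single scalar V(s) per state
    VW = tf.Variable(tf.random_normal([h_size, 1]))
    value = tf.matmul(features, VW)
    # Advantage stream: one value A(s, a) per action
    AW = tf.Variable(tf.random_normal([h_size, n_actions]))
    advantage = tf.matmul(features, AW)
    # Combine: Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))
    return value + (advantage - tf.reduce_mean(advantage, axis=1, keepdims=True))
```

Splitting simply dedicates half of the final features to each stream; feeding the full vector into both streams is also a valid dueling architecture, at the cost of somewhat more parameters.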
@mphielipp Replace that line with:
self.AW = tf.Variable(tf.random_normal([h_size // 2, env.actions]))
tf.random_normal expects integer dimensions in its shape argument, not a float.
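
A quick illustration of why the // matters in Python 3 (h_size = 512 here is just an example value, not taken from the gist):

```python
h_size = 512
h_size / 2    # 256.0 -- true division yields a float, which the shape argument rejects
h_size // 2   # 256   -- floor division yields an int, which the shape argument accepts
```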