This is a great demo. Can you also suggest how to store the model as an .h5 file (like in Keras) and re-use it?
@alphashuro Agreed. I'm struggling with the same problem: what does OH mean?
Hi, I'm learning RL with your articles, great work 👍
Here is a quick diff to use raw TF (as of 1.3) instead of slim:
- state_in_OH = slim.one_hot_encoding(self.state_in, s_size)
- output = slim.fully_connected(state_in_OH,
- a_size,
- biases_initializer=None,
- activation_fn=tf.nn.sigmoid,
- weights_initializer=ones)
+ state_in_OH = tf.one_hot(self.state_in, s_size)
+ output = tf.layers.dense(state_in_OH, a_size, tf.nn.sigmoid,
+ use_bias=False, kernel_initializer=ones)
@Riotpiaole OH = one hot [encoding]
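For anyone else wondering what one-hot encoding does here, a minimal sketch in plain Python (not the gist's code; `one_hot`, `matvec`, and the `weights` values are made up for illustration). It shows that multiplying a one-hot state vector by a weight matrix with no bias just picks out the row of weights for that state:

```python
def one_hot(index, size):
    """Return a list of `size` floats that is 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def matvec(vector, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    n_cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(n_cols)]


# Hypothetical weights: 3 states (rows) x 4 actions (columns).
weights = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

state = 1
encoded = one_hot(state, len(weights))  # 1.0 at position 1, 0.0 elsewhere
selected = matvec(encoded, weights)     # same values as weights[state]
```

So with a one-hot input and `use_bias=False`, the dense layer behaves like a lookup table holding one set of action weights per state.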
According to my experiments (TensorFlow 1.3), I suggest using AdamOptimizer instead of GradientDescentOptimizer, since GradientDescentOptimizer suffers from training stability issues.
@Riotpiaole I've re-implemented the tutorial code here; you may want to take a look at it.
Can anyone explain why we do not use softmax instead of sigmoid, and also why we don't use a bias? (I tried both and it wouldn't work.)
@lipixun Do you know the answer to my question? It would really help me, thanks.
@pooriaPoorsarvi As seen above, we already have the responsible_weight variable. We then take its negative log-likelihood so that minimizing the loss maximizes that weight (TF optimizers only minimize), so there is no need to consider the other classes.
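To make that concrete, a rough sketch in plain Python, assuming the loss has the form -log(responsible_weight) * reward (`policy_loss` is a name invented here, not something from the gist):

```python
import math


def policy_loss(responsible_weight, reward):
    """Negative log-likelihood of the chosen action, scaled by the reward.

    `responsible_weight` is assumed to be the network output for the
    action that was actually taken, so it must be positive.
    """
    if responsible_weight <= 0.0:
        raise ValueError("responsible_weight must be positive")
    return -math.log(responsible_weight) * reward


# With a positive reward, a larger weight for the chosen action gives a lower loss.
loss_low_weight = policy_loss(0.5, 1.0)
loss_high_weight = policy_loss(0.9, 1.0)

# With a negative reward the ordering flips, so minimizing pushes the weight down.
penalty_low_weight = policy_loss(0.5, -1.0)
penalty_high_weight = policy_loss(0.9, -1.0)
```

Only the weight of the action that was taken appears in the loss, which is why the other outputs do not need to be considered.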
Instead of using slim, you can use plain TF:
state_in_OH = tf.one_hot(self.state_in, s_size)
output = tf.layers.dense(state_in_OH, a_size, tf.nn.sigmoid, use_bias=False, kernel_initializer=tf.ones_initializer())
Thanks Arthur! This is a helpful tutorial for beginners like me. Here is a TensorFlow 2 implementation that may be helpful for someone.
Thanks for the implementation. I wonder how this implementation is a policy network? I don't see a policy gradient being used.
Also, I didn't see anything on one-hot encoding in the post. Is it perhaps in one of your other posts?