Skip to content

Instantly share code, notes, and snippets.

View simoninithomas's full-sized avatar

Thomas Simonini simoninithomas

View GitHub Profile
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
# Evaluation script fragment: restore a trained model checkpoint and play episodes.
# NOTE(review): this fragment is truncated — the body of the `for` loop is missing here.
with tf.Session() as sess:
    # Build the Doom environment and the one-hot action list it exposes
    game, possible_actions = create_environment()
    # Running total of episode scores — presumably accumulated in the missing loop body
    totalScore = 0
    # Load the model
    saver.restore(sess, "./models/model.ckpt")
    game.init()
    # Play a single evaluation episode (range(1)); loop body not visible in this fragment
    for i in range(1):
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
class DDDQNNet:
    """Dueling Double Deep Q-Network.

    NOTE(review): fragment — the class body continues beyond this view
    (the trailing comments refer to a tf.variable_scope block not shown here).
    """

    def __init__(self, state_size, action_size, learning_rate, name):
        # Shape of the input state (e.g. stacked frames) — TODO confirm against caller
        self.state_size = state_size
        # Number of discrete actions (width of the Q-value output layer)
        self.action_size = action_size
        # Optimizer learning rate
        self.learning_rate = learning_rate
        # Variable-scope name distinguishing the online DQNetwork from the TargetNetwork
        self.name = name

        # We use tf.variable_scope here to know which network we're using (DQN or target_net)
        # it will be useful when we will update our w- parameters (by copy the DQN parameters)
### DOUBLE DQN Logic
# Use DQNetwork to select the action to take at next_state (a') (action with the highest Q-value)
# Use TargetNetwork to calculate the Q-value of Q(s', a')

# Q-values of next_state from the ONLINE network — used only to SELECT a'
Qs_next_state = sess.run(DQNetwork.output, feed_dict={DQNetwork.inputs_: next_states_mb})

# Q-values of next_state from the TARGET network — used to EVALUATE Q(s', a')
q_target = sess.run(TargetNetwork.output, feed_dict={TargetNetwork.inputs_: next_states_mb})

# BUG FIX: the original re-assigned Qs_next_state from TargetNetwork.output here,
# overwriting the online-network estimate above. That made the DQNetwork call dead
# code and collapsed Double DQN back to vanilla DQN (action selection and evaluation
# both from the target net). The duplicate assignment is removed.
...

# Periodically copy the online network's weights into the target network
if tau > max_tau:
    # Update the parameters of our TargetNetwork with DQN weights
    update_target = update_target_graph()
    sess.run(update_target)
    tau = 0
# Instantiate the online DQNetwork (selects actions)
# BUG FIX: the constructor was called as `DQNNet`, but the class defined in this
# file is `DDDQNNet` — the original line raised NameError at import time.
DQNetwork = DDDQNNet(state_size, action_size, learning_rate, name="DQNetwork")

# Instantiate the target network (evaluates selected actions; weights copied
# from DQNetwork every max_tau steps)
TargetNetwork = DDDQNNet(state_size, action_size, learning_rate, name="TargetNetwork")
# This function helps us to copy one set of variables to another
# In our case we use it when we want to copy the parameters of DQN to Target_network
# Thanks to the very good implementation of Arthur Juliani https://github.com/awjuliani
def update_target_graph():
    """Collect the trainable variables of both networks by variable-scope name.

    NOTE(review): truncated fragment — the assign ops built from these two
    variable lists (and the return statement) are not visible in this view.
    """
    # Get the parameters of our DQNNetwork (scope "DQNetwork")
    from_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "DQNetwork")

    # Get the parameters of our Target_network (scope "TargetNetwork")
    to_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "TargetNetwork")
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.