Code for Keras plays catch blog post
python qlearn.py
- Generate figures
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
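The listing above is cut off after its imports, so the metrics themselves are not shown. As a rough illustration of the kind of function such a module might contain, here is a minimal sketch of precision at k and mean reciprocal rank. The function names and the 0/1 relevance-list input format are assumptions made for this example, not the original file's API.

```python
import numpy as np


def precision_at_k(relevance, k):
    """Fraction of the top-k results that are relevant.

    `relevance` is a ranked list where nonzero means "relevant" (assumed format).
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    top_k = np.asarray(relevance)[:k] != 0
    if top_k.size != k:
        raise ValueError("relevance list is shorter than k")
    return float(np.mean(top_k))


def mean_reciprocal_rank(relevance_lists):
    """Average over queries of 1 / rank of the first relevant result (0 if none)."""
    reciprocal_ranks = []
    for relevance in relevance_lists:
        hits = np.flatnonzero(np.asarray(relevance))
        reciprocal_ranks.append(1.0 / (hits[0] + 1) if hits.size else 0.0)
    return float(np.mean(reciprocal_ranks))
```

For three queries whose first relevant result sits at ranks 3, 2, and 1, `mean_reciprocal_rank` averages 1/3, 1/2, and 1.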
Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved human-level performance across many Atari games. DQN with prioritized experience replay achieves a new state-of-the-art, outperforming DQN with uniform replay on 42 out of 57 games.
Authors: Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
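The abstract does not spell out the sampling rule. The sketch below assumes proportional prioritization, where transition i is drawn with probability p_i^alpha / sum_k p_k^alpha and its update is scaled by an importance-sampling weight (N * P(i))^(-beta), normalized by the largest weight in the batch. The class name, the default values, and the plain-array storage (rather than a sum tree) are choices made for this illustration only.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (hypothetical helper class)."""

    def __init__(self, capacity, alpha=0.6, epsilon=1e-6, seed=0):
        self.capacity = capacity
        self.alpha = alpha          # 0 gives uniform sampling, 1 gives fully proportional
        self.epsilon = epsilon      # keeps every priority strictly positive
        self.transitions = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.next_index = 0
        self.rng = np.random.default_rng(seed)

    def add(self, transition):
        # New transitions get the current maximum priority so each is
        # replayed at least once before its TD error is known.
        n = len(self.transitions)
        max_priority = self.priorities[:n].max() if n else 1.0
        if n < self.capacity:
            self.transitions.append(transition)
        else:
            self.transitions[self.next_index] = transition
        self.priorities[self.next_index] = max_priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        n = len(self.transitions)
        scaled = self.priorities[:n] ** self.alpha
        probs = scaled / scaled.sum()
        indices = self.rng.choice(n, size=batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (n * probs[indices]) ** (-beta)
        weights /= weights.max()
        batch = [self.transitions[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, td_errors):
        # Priority is the magnitude of the TD error plus a small constant.
        self.priorities[indices] = np.abs(td_errors) + self.epsilon
```

After a learning step, the caller would pass the sampled `indices` and the new TD errors to `update_priorities`, so transitions with large errors are replayed more often.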
"""
This is a batched LSTM forward and backward pass
"""
import numpy as np
import code
class LSTM:
  @staticmethod
  def init(input_size, hidden_size, fancy_forget_bias_init = 3):
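The listing stops at the `init` signature, so its body is not shown here. The sketch below is not a reconstruction of it; it only illustrates what a positive forget-gate bias initialization can look like. The function name, the single combined weight matrix with biases in row 0, and the gate order (input, forget, output, candidate) are all assumptions for this example.

```python
import numpy as np


def init_lstm_weights(input_size, hidden_size, forget_bias=3.0, seed=0):
    """Hypothetical LSTM weight init with a positive forget-gate bias."""
    rng = np.random.default_rng(seed)
    fan_in = input_size + hidden_size
    # Row 0 holds the biases; the remaining rows hold input and recurrent weights.
    # Columns are split into four gate blocks of width hidden_size.
    weights = rng.standard_normal((fan_in + 1, 4 * hidden_size)) / np.sqrt(fan_in)
    weights[0, :] = 0.0
    # Assumed layout: the second block is the forget gate.
    weights[0, hidden_size:2 * hidden_size] = forget_bias
    return weights
```

With a bias of 3 the forget gate starts near sigmoid(3), roughly 0.95, so the cell state is mostly retained early in training.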
"""
Simple implementation of Identity Recurrent Neural Networks (IRNN)
Reference
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
http://arxiv.org/abs/1504.00941
"""
import numpy as np
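The core idea behind an IRNN is to use ReLU units with the recurrent weight matrix initialized to the identity and the biases to zero. A minimal NumPy sketch of that initialization and one recurrent step follows. The function names and the small input-weight scale are assumptions for illustration and are not taken from the truncated listing.

```python
import numpy as np


def init_irnn(input_size, hidden_size, input_scale=0.001, seed=0):
    """Identity recurrent weights, zero bias, small random input weights."""
    rng = np.random.default_rng(seed)
    w_input = rng.normal(0.0, input_scale, size=(input_size, hidden_size))
    w_recurrent = np.eye(hidden_size)   # the identity initialization
    bias = np.zeros(hidden_size)
    return w_input, w_recurrent, bias


def irnn_step(x, h_prev, w_input, w_recurrent, bias):
    """One recurrent step with a ReLU nonlinearity."""
    return np.maximum(0.0, x @ w_input + h_prev @ w_recurrent + bias)
```

With this initialization and no input, a non-negative hidden state is copied forward unchanged, which is what lets the network carry information over many time steps at the start of training.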
import theano
from pylearn2.models import mlp
from pylearn2.training_algorithms import sgd
from pylearn2.termination_criteria import EpochCounter
from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix
import numpy as np
from random import randint
class XOR(DenseDesignMatrix):