Code for Keras plays catch blog post
python qlearn.py
- Generate figures
Code for Keras plays catch blog post
python qlearn.py
"""Information Retrieval metrics | |
Useful Resources: | |
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt | |
http://www.nii.ac.jp/TechReports/05-014E.pdf | |
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf | |
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf | |
Learning to Rank for Information Retrieval (Tie-Yan Liu) | |
""" | |
import numpy as np |
Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved human-level performance across many Atari games. DQN with prioritized experience replay achieves a new state-of-the-art, outperforming DQN with uniform replay on 42 out of 57 games.
Authors: Tom Schaul [email protected], John Quan [email protected], Ioannis Antonoglou [email protected], David Silver [email protected]
""" | |
This is a batched LSTM forward and backward pass | |
""" | |
import numpy as np | |
import code | |
class LSTM: | |
@staticmethod | |
def init(input_size, hidden_size, fancy_forget_bias_init = 3): |
""" | |
Simple implementation of Identity Recurrent Neural Networks (IRNN) | |
Reference | |
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units | |
http://arxiv.org/abs/1504.00941 | |
""" | |
import numpy as np |
import theano | |
from pylearn2.models import mlp | |
from pylearn2.training_algorithms import sgd | |
from pylearn2.termination_criteria import EpochCounter | |
from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix | |
import numpy as np | |
from random import randint | |
class XOR(DenseDesignMatrix): |
""" | |
This is a batched LSTM forward and backward pass | |
""" | |
import numpy as np | |
import code | |
class LSTM: | |
@staticmethod | |
def init(input_size, hidden_size, fancy_forget_bias_init = 3): |