require 'date'
# Actually doesn't matter WHAT you choose as the epoch, it
# won't change the algorithm. Just don't change it after you
# have cached computed scores. Choose something before your first
# post to avoid annoying negative numbers. Choose something close
# to your first post to keep the numbers smaller. This is, I think,
# reddit's own epoch.
$our_epoch = Time.local(2005, 12, 8, 7, 46, 43).to_time
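The comment above says the choice of epoch does not change the algorithm. A minimal Python sketch of why, assuming the commonly cited reddit-style "hot" formula (log10 of the net votes plus seconds since the epoch divided by 45000): moving the epoch adds the same constant to every score, so the ordering stays the same. The function name `hot_score`, the sample posts, and the use of UTC (the Ruby line above uses local time) are illustrative assumptions, not taken from the original snippet.

```python
from datetime import datetime, timezone
from math import log10


def hot_score(ups, downs, posted_at, epoch):
    """Reddit-style hot score (assumed form; constants are illustrative)."""
    net = ups - downs
    order = log10(max(abs(net), 1))
    sign = (net > 0) - (net < 0)
    seconds = (posted_at - epoch).total_seconds()
    return sign * order + seconds / 45000


# Hypothetical posts: (name, upvotes, downvotes, time posted)
posts = [
    ("a", 10, 2, datetime(2015, 1, 1, tzinfo=timezone.utc)),
    ("b", 500, 20, datetime(2014, 12, 31, tzinfo=timezone.utc)),
    ("c", 3, 5, datetime(2015, 1, 2, tzinfo=timezone.utc)),
]


def ranking(epoch):
    """Return post names sorted from hottest to coldest for a given epoch."""
    ranked = sorted(
        posts,
        key=lambda p: hot_score(p[1], p[2], p[3], epoch),
        reverse=True,
    )
    return [name for name, _, _, _ in ranked]


# Two different epochs: the one from the snippet above (taken as UTC here)
# and the Unix epoch.
epoch_a = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)
epoch_b = datetime(1970, 1, 1, tzinfo=timezone.utc)

print(ranking(epoch_a) == ranking(epoch_b))
```

Cached scores are a different matter: a score computed under one epoch is not comparable with a score computed under another, which is why the comment warns against changing the epoch later.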
# From http://stackoverflow.com/a/11158224
# Solution A - If the script importing the module is in a package
from .. import mymodule
# Solution B - If the script importing the module is not in a package
import os, sys, inspect
current_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parent_dir = os.path.dirname(current_dir)
sys.path.insert(0, parent_dir)
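Solution B can also be wrapped in a small helper. The following is a sketch only, using `pathlib` instead of `inspect`; the function name `add_parent_to_sys_path` is hypothetical, and the duplicate check is an addition the original does not have.

```python
import sys
from pathlib import Path


def add_parent_to_sys_path(script_path):
    """Put the parent of the script's directory at the front of sys.path.

    Returns the directory that was added (or was already present).
    """
    # script_path -> its directory -> that directory's parent
    parent_dir = str(Path(script_path).resolve().parent.parent)
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    return parent_dir
```

A script would typically call `add_parent_to_sys_path(__file__)` before importing the module that lives one directory up.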
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3: cPickle was merged into pickle
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
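To illustrate what the `gamma` discount factor does, here is a minimal sketch of a discounted-return computation in plain Python. The function name `discount_rewards` and its list-based signature are assumptions made for this example; this is a generic version, not necessarily how the full script computes returns.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return the discounted sum of future rewards for each time step.

    discounted[t] = rewards[t] + gamma * rewards[t+1] + gamma**2 * rewards[t+2] + ...
    """
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the sum already computed for the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


# A reward of 1 at the last step is credited to earlier steps, shrunk by
# a factor of gamma per step.
print(discount_rewards([0, 0, 1], gamma=0.5))  # [0.25, 0.5, 1.0]
```

With `gamma = 0.99`, an action taken 100 steps before a reward still receives roughly 37% of the credit (0.99 ** 100 is about 0.366).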