- Deep Subspace Clustering Network, http://arxiv.org/abs/1709.02508v1
- Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization, http://arxiv.org/abs/1707.06468v2
- Probabilistic Models for Integration Error in the Assessment of Functional Cardiac Models, http://arxiv.org/abs/1606.06841v4
- Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning, http://arxiv.org/abs/1704.02882v2
- Parametric Simplex Method for Sparse Learning, http://arxiv.org/abs/1704.01079v1
- Group Sparse Additive Machine, http://arxiv.org/abs/1206.4673v1
- The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings, http://arxiv.org/abs/1703.00864v2
- Inferring Generative Model Structure with Static Analysis, http://arxiv.org/abs/1709.02477v1
- On Structured Prediction Theory with Calibrated Convex Surrogate Losses, http://arxiv.org/abs/1703.02403v2
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
import cPickle as pickle
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
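The preview stops at the hyperparameters, so the way `gamma` is applied is not visible here. As a rough illustration only, a discount factor is commonly used to turn per-step rewards into discounted returns via G_t = r_t + gamma * G_{t+1}. The helper name `discount_rewards` and the plain-list representation below are assumptions made for this sketch, not code taken from the script.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return G_t = r_t + gamma * G_{t+1} for every step.

    Hypothetical helper for illustration; not taken from the original script.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# A reward that arrives only at the last step is propagated back to earlier steps.
returns = discount_rewards([0.0, 0.0, 1.0], gamma=0.5)
```

With `gamma` close to 1, as in the hyperparameters above, early actions receive almost full credit for a late reward; smaller values concentrate credit on the steps nearest the reward.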
# Borrowed from NAF. Might be plugged into DeepSets.
class SigmoidFlow(nn.Module):
    def __init__(self, K):
        super().__init__()
        self.W = nn.Parameter(torch.FloatTensor(K, 1))
        self.b = nn.Parameter(torch.FloatTensor(1, K))
        self.alpha = nn.Linear(1, K)
    def forward(self, x):