Søren Kaae Sønderby skaae

--[[
LSTM cell. Modified from
https://github.com/oxford-cs-ml-2015/practical6/blob/master/LSTM.lua
--]]
local LSTM = {}
-- Creates one timestep of one LSTM
function LSTM.lstm(opt)
  local x = nn.Identity()()
def adam(loss, all_params, learning_rate=0.001, b1=0.9, b2=0.999, e=1e-8,
         gamma=1-1e-8):
    """
    ADAM update rules
    Default values are taken from [Kingma2014]
    References:
    [Kingma2014] Kingma, Diederik, and Jimmy Ba.
    "Adam: A Method for Stochastic Optimization."
    arXiv preprint arXiv:1412.6980 (2014).
@skaae
skaae / adam.py
Created February 26, 2015 14:53
def adam(loss, all_params, learning_rate=0.0002, beta1=0.1, beta2=0.001,
         epsilon=1e-8, gamma=1-1e-7):
    """
    ADAM update rules
    Default values are taken from [Kingma2014]
    References:
    [Kingma2014] Kingma, Diederik, and Jimmy Ba.
    "Adam: A Method for Stochastic Optimization."
    arXiv preprint arXiv:1412.6980 (2014).
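
The preview above is cut off before the function body, so the gist's actual update code is not shown here. For orientation only, the following is a minimal NumPy sketch of one Adam step in the commonly published form, where `b1` and `b2` are decay rates (0.9 and 0.999). Under that convention the `beta1=0.1, beta2=0.001` defaults above would correspond to one minus those values, but that reading is an assumption. The name `adam_step` is hypothetical, the `gamma` decay parameter is omitted, and this is not the gist's Theano implementation.

```python
import numpy as np


def adam_step(param, grad, m, v, t, learning_rate=0.001, b1=0.9, b2=0.999,
              e=1e-8):
    """One Adam update for a single parameter array (illustrative sketch).

    m and v are the running first and second moment estimates;
    t is the 1-based step counter used for bias correction.
    """
    m = b1 * m + (1 - b1) * grad            # running mean of gradients
    v = b2 * v + (1 - b2) * grad ** 2       # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)               # bias-corrected second moment
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + e)
    return param, m, v


# Toy usage: minimise f(x) = x**2, whose gradient is 2 * x.
x = np.array([5.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, learning_rate=0.05)
```

Because of the bias correction, the very first step moves each parameter by roughly `learning_rate` regardless of the gradient's magnitude.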
import numpy as np
import theano
import theano.tensor as T
from theano import ifelse
from .. import init
from .. import nonlinearities
from .base import Layer
from __future__ import print_function
import gzip
import itertools
import pickle
import os
import sys
PY2 = sys.version_info[0] == 2
Function profiling
==================
Message: experiment.py:196
Time in 1 calls to Function.__call__: 7.953641e+00s
Time in Function.fn.__call__: 7.953413e+00s (99.997%)
Time in thunks: 7.929524e+00s (99.697%)
Total compile time: 1.766550e+02s
Number of Apply nodes: 1214
Theano Optimizer time: 1.688680e+02s
Function profiling
==================
Message: experiment.py:196
Time in 12 calls to Function.__call__: 8.222084e+01s
Time in Function.fn.__call__: 8.221798e+01s (99.997%)
Time in thunks: 8.196062e+01s (99.684%)
Total compile time: 2.509097e+02s
Number of Apply nodes: 1214
Theano Optimizer time: 1.675486e+02s
Theano validate time: 1.319681e+00s
class BidirectionalLSTMLayer(Layer):
    '''
    A long short-term memory (LSTM) layer. Includes "peephole connections" and
    a forget gate. Based on the definition in [#graves2014generating]_, which
    is the current common definition. Gate names are taken from
    [#zaremba2014]_, figure 1.
    :references:
        .. [#graves2014generating] Alex Graves, "Generating Sequences With
            Recurrent Neural Networks".