GitHub Gists: Soumith Chintala (soumith)
==> processing options
options are:
table: 0x41910ed8
==> loading dataset
==> preprocessing data
==> Preprocessing total time: 0.28143811225891
==> construct ReLU model
==> define loss
model add to cuda
==> defining some tools
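The log above is the start-up output of a Torch training script. A minimal sketch, not the original code, of the stages it reports, assuming the nn, cunn and optim packages and using hypothetical option values and layer sizes:

require 'torch'
require 'nn'
require 'cunn'
require 'optim'

print('==> processing options')
local opt = { learningRate = 0.01, batchSize = 128 }   -- hypothetical options
print('options are:')
print(opt)                                             -- a table prints as "table: 0x..."

print('==> loading dataset')
-- trainData / testData would be loaded here

print('==> preprocessing data')
-- e.g. per-channel mean/std normalization, timed as in the log above

print('==> construct ReLU model')
local model = nn.Sequential()
model:add(nn.Linear(1024, 512)):add(nn.ReLU())
model:add(nn.Linear(512, 10))

print('==> define loss')
local criterion = nn.CrossEntropyCriterion()

print('model add to cuda')
model:cuda()
criterion:cuda()

print('==> defining some tools')
local confusion = optim.ConfusionMatrix(10)            -- per-class accuracy tracker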
@soumith
soumith / gist:d364081b2fa70f8cc307
Created February 25, 2015 21:50
overfeat-small full log
overfeat schedule
=================
LR=0.01,epochNumber=1,weightDecay=5e-4
LR=0.005,epochNumber=20,weightDecay=5e-4
LR=0.001,epochNumber=33,weightDecay=0
LR=0.0005,epochNumber=48,weightDecay=0
Run till epoch 76
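A minimal sketch, not the original training script, of how this stepwise schedule could be expressed as a Lua regime table with a per-epoch lookup:

-- startEpoch, learningRate, weightDecay (values taken from the schedule above)
local regimes = {
  {  1, 0.01,   5e-4 },
  { 20, 0.005,  5e-4 },
  { 33, 0.001,  0    },
  { 48, 0.0005, 0    },
}

local function paramsForEpoch(epoch)
  local current
  for _, r in ipairs(regimes) do
    if epoch >= r[1] then current = r end
  end
  return { learningRate = current[2], weightDecay = current[3] }
end

for epoch = 1, 76 do
  local p = paramsForEpoch(epoch)
  -- pass p.learningRate and p.weightDecay to the optimizer for this epoch
end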
Epoch logs
=============
@soumith
soumith / gist:f2605f9f0a632e02b4fb
Created March 25, 2015 00:00
lstmrnn-digitsbox
~/code/lstm$ th main.lua
Loading ./data/ptb.train.txt, size of data = 929589
Loading ./data/ptb.valid.txt, size of data = 73760
Loading ./data/ptb.test.txt, size of data = 82430
Using 1-th gpu
Network parameters:
{
layers : 2
lr : 1
max_max_epoch : 13
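A minimal sketch of how such a hyper-parameter table might be declared and printed in Lua; only the values visible in the truncated log above are reproduced:

local params = {
  layers        = 2,
  lr            = 1,
  max_max_epoch = 13,
  -- the remaining fields are cut off in the log above
}

print('Network parameters:')
for k, v in pairs(params) do
  print(string.format('  %s : %s', k, tostring(v)))
end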
@soumith
soumith / gist:e3f722173ea16c1ea0d9
Created June 22, 2015 21:34
CIFAR-10 eyescream
----------------------------------------------------------------------
-- CIFAR 8x8
opt.scale = 8
opt.geometry = {3, opt.scale, opt.scale}
local input_sz = opt.geometry[1] * opt.geometry[2] * opt.geometry[3]
local numhid = 600
-- Discriminator: flattens the 3x8x8 input and scores it with a small MLP.
model_D = nn.Sequential()
model_D:add(nn.Reshape(input_sz))
model_D:add(nn.Linear(input_sz, numhid))
model_D:add(nn.ReLU())
-- Generator: its two inputs are joined with JoinTable (3+1 feature planes),
-- then refined by upsampling convolutions with batch normalization and dropout.
model_G = nn.Sequential()
model_G:add(nn.JoinTable(2, 2))
model_G:add(cudnn.SpatialConvolutionUpsample(3+1, 64, 7, 7, 1, 1)):add(cudnn.ReLU(true))
model_G:add(nn.SpatialBatchNormalization(64, nil, nil, false))
model_G:add(cudnn.SpatialConvolutionUpsample(64, 368, 7, 7, 1, 4)):add(cudnn.ReLU(true))
model_G:add(nn.SpatialBatchNormalization(368, nil, nil, false))
model_G:add(nn.SpatialDropout(0.5))
model_G:add(cudnn.SpatialConvolutionUpsample(368, 128, 7, 7, 1, 4)):add(cudnn.ReLU(true))
model_G:add(nn.SpatialBatchNormalization(128, nil, nil, false))
model_G:add(nn.FeatureLPPooling(2,2,2,true))
@soumith
soumith / gist:4766a592cb3645035ef8
Created November 6, 2015 17:25
alexnet-owt-bn 28 epoch convergence
2015-03-28 22:57:04.475 Epoch: [1][10000/10000] Time 0.304 DataTime 0.005 Err 3.9184
2015-03-28 22:57:04.476 ==> Validation epoch # 1
2015-03-28 22:58:07.329 json_stats: {"learningRate":0.25,"batchSize":128,"train_loss":4.7269560290813,"manualSeed":1,"msra_mul":0,"decay":0.5,"backend":"cudnn","epochSize":10000,"nDonkeys":16,"train_time":3231.9889090061,"test_accuracy":14.646,"weightDecay":0.0005,"epoch":1,"nEpochs":30,"model":"alexnetowtbn","momentum":0.9,"test_loss":4.6024402600098,"GPU":1,"retrain":"","train_accuracy":13.362265625,"test_time":62.852720022202,"best_accuracy":0,"bestAccuracy":0,"nGPU":4}
--
2015-03-28 23:53:36.332 Epoch: [2][10000/10000] Time 0.286 DataTime 0.006 Err 3.7128
2015-03-28 23:53:36.332 ==> Validation epoch # 2
2015-03-28 23:54:37.711 json_stats: {"learningRate":0.25,"batchSize":128,"train_loss":3.6877586714745,"manualSeed":1,"msra_mul":0,"decay":0.5,"backend":"cudnn","epochSize":10000,"nDonkeys":16,"train_time":3214.1851511002,"test_accuracy":16.416,"weightDecay":0.0005,"ep
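A minimal sketch for pulling numbers out of json_stats lines like the ones above, assuming the lua-cjson module is installed and using a hypothetical log file name:

local cjson = require 'cjson'

for line in io.lines('train.log') do              -- hypothetical file name
  local json = line:match('json_stats: (.*)')
  if json then
    local stats = cjson.decode(json)
    print(stats.epoch, stats.train_loss, stats.test_accuracy)
  end
end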
--[[
This file implements Batch Normalization as described in the paper:
"Batch Normalization: Accelerating Deep Network Training
by Reducing Internal Covariate Shift"
by Sergey Ioffe, Christian Szegedy
This implementation is useful for inputs coming from convolution layers.
For Non-convolutional layers, see BatchNormalization.lua
The operation implemented is:
   y = ( x - mean(x) ) / sqrt( var(x) + eps ) * gamma + beta
where the mean and variance are computed per feature map over the
mini-batch, and gamma and beta are learnable per-feature-map parameters.
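A minimal usage sketch of the module this comment documents, nn.SpatialBatchNormalization from the torch nn package, applied to a 4D batch x channels x height x width input:

require 'nn'

local bn = nn.SpatialBatchNormalization(16)   -- 16 feature maps
local x  = torch.randn(8, 16, 32, 32)         -- mini-batch of 8 images
local y  = bn:forward(x)                      -- output has the same size as x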
-- Torch Android demo script
-- Script: main.lua
-- Copyright (C) 2013 Soumith Chintala
require 'torch'
require 'cunn'
require 'nnx'
require 'dok'
require 'image'
@soumith
soumith / multiple_learning_rates.lua
Created May 26, 2016 21:35 — forked from farrajota/multiple_learning_rates.lua
Example code for setting a different learning rate per layer. Note that :parameters() returns the weights and bias of a given layer as separate, consecutive tensors, so a network with N layers yields a table of N*2 tensors, where the i-th and (i+1)-th tensors belong to the same layer.
-- multiple learning rates per network. Optimizes two copies of a model network and checks if the optimization steps (2) and (3) produce the same weights/parameters.
require 'torch'
require 'nn'
require 'optim'
torch.setdefaulttensortype('torch.FloatTensor')
-- (1) Define a model for this example.
local model = nn.Sequential()
model:add(nn.Linear(10,20))
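One way to realize what the description above explains is to keep a separate optim.sgd state per tensor returned by :parameters(). A minimal sketch, not farrajota's exact code, with hypothetical layer sizes and learning rates:

require 'torch'
require 'nn'
require 'optim'

local model = nn.Sequential()
model:add(nn.Linear(10, 20))
model:add(nn.Linear(20, 2))

local params, gradParams = model:parameters()  -- 4 tensors: 2 layers x (weight, bias)
local lrs = {0.1, 0.1, 0.01, 0.01}             -- hypothetical per-tensor learning rates
local states = {}
for i = 1, #params do states[i] = {} end

local criterion = nn.MSECriterion()
local input, target = torch.randn(10), torch.randn(2)

-- one training step: compute the gradients once, then update each tensor
-- with its own learning rate
model:zeroGradParameters()
local output = model:forward(input)
local loss = criterion:forward(output, target)
model:backward(input, criterion:backward(output, target))

for i = 1, #params do
  local feval = function() return loss, gradParams[i] end   -- gradient already computed
  optim.sgd(feval, params[i], { learningRate = lrs[i] }, states[i])
end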
[
{
"oC":32,
"name":"conv",
"strideH":2,
"kW":3,
"iC":3,
"padW":0,
"strideW":2,
"padH":0,