- Feature Learning
- Learning Feature Representations with K-means by Adam Coates and Andrew Y. Ng (see the encoding sketch after this section)
- The devil is in the details: an evaluation of recent feature encoding methods by Chatfield et al.
- Emergence of Object-Selective Features in Unsupervised Feature Learning by Adam Coates, Andrej Karpathy and Andrew Y. Ng
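
A minimal sketch of the K-means feature-learning recipe from the Coates & Ng paper above, assuming scikit-learn is available. The function name and toy data are illustrative only, and the paper's normalization and whitening preprocessing is omitted for brevity:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_triangle_features(patches, n_centroids=16, seed=0):
    """Fit a K-means dictionary to flattened patches and encode each
    patch with the 'triangle' activation analyzed by Coates & Ng:
    f_k(x) = max(0, mean_distance(x) - distance(x, centroid_k))."""
    km = KMeans(n_clusters=n_centroids, n_init=10, random_state=seed)
    km.fit(patches)
    # (num_patches, n_centroids) matrix of Euclidean distances.
    dists = np.linalg.norm(
        patches[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    mu = dists.mean(axis=1, keepdims=True)
    # Centroids farther away than average contribute zero.
    return np.maximum(0.0, mu - dists)

# Toy usage: 500 random 6x6 grayscale patches, flattened.
patches = np.random.rand(500, 36)
print(kmeans_triangle_features(patches).shape)  # (500, 16)
```
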
- Deep Learning
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting by Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever and Ruslan Salakhutdinov (a dropout sketch follows this section)
- Understanding the difficulty of training deep feedforward neural networks by Xavier Glorot and Yoshua Bengio
- On the difficulty of training Recurrent Neural Networks by Razvan Pascanu, Tomas Mikolov and Yoshua Bengio
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift by Sergey Ioffe and Christian Szegedy
- Deep Learning in Neural Networks: An Overview by Jürgen Schmidhuber
- Qualitatively characterizing neural network optimization problems by Ian J. Goodfellow, Oriol Vinyals and Andrew M. Saxe
- On Recurrent and Deep Neural Networks, PhD thesis of Razvan Pascanu
- Scaling Learning Algorithms towards AI by Yann LeCun and Yoshua Bengio
- Efficient Backprop by LeCun, Bottou et al
- Towards Biologically Plausible Deep Learning by Yoshua Bengio, Dong-Hyun Lee, Jorg Bornschein, Zhouhan Lin
- Training Recurrent Neural Networks, PhD thesis of Ilya Sutskever
- A Probabilistic Theory of Deep Learning by Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk
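
Many of the ideas above reduce to a few lines of array code. As one example, a minimal sketch of dropout in the spirit of Srivastava et al.; the function and argument names are mine, and this is the common "inverted" variant rather than the paper's test-time weight scaling:

```python
import numpy as np

def dropout(activations, p_drop=0.5, train=True, rng=np.random):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale the survivors by 1/(1 - p_drop), so the
    network needs no change at test time (the paper instead keeps all
    units and scales the weights at test time)."""
    if not train or p_drop == 0.0:
        return activations
    keep = 1.0 - p_drop
    mask = (rng.rand(*activations.shape) < keep) / keep
    return activations * mask

h = np.ones((4, 3))            # a toy layer of activations
print(dropout(h, p_drop=0.5))  # roughly half zeros, the rest scaled to 2.0
```
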
- Scalable Machine Learning
- Bring the Noise: Embracing Randomness is the Key to Scaling Up Machine Learning Algorithms by Brian Dalessandro
- Large Scale Machine Learning with Stochastic Gradient Descent by Léon Bottou
- The Tradeoffs of Large Scale Learning by Léon Bottou and Olivier Bousquet
- Hash Kernels for Structured Data by Qinfeng Shi et al.
- Feature Hashing for Large Scale Multitask Learning by Weinberger et al. (see the hashing sketch after this list)
- Large-Scale Learning with Less RAM via Randomization by Daniel Golovin et al. at Google
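
The hashing trick from the Shi and Weinberger papers above is small enough to sketch inline. A minimal version, assuming token-count features; Python's built-in `hash()` is only a stand-in for a stable hash:

```python
import numpy as np

def hashed_features(tokens, n_bins=16):
    """Map raw tokens straight to a fixed-size vector: one hash picks
    the bucket, and a second, independent sign hash (as in Weinberger
    et al.) makes collisions cancel in expectation. Note that Python's
    hash() is salted per process; a real system should use a stable
    hash such as MurmurHash."""
    x = np.zeros(n_bins)
    for tok in tokens:
        idx = hash(tok) % n_bins
        sign = 1.0 if hash("sign:" + tok) % 2 == 0 else -1.0
        x[idx] += sign
    return x

print(hashed_features(["the", "cat", "sat", "the"]))
```
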
- Gradient based Training
- Practical Recommendations for Gradient-Based Training of Deep Architectures by Yoshua Bengio
- Stochastic Gradient Descent Tricks by Léon Bottou (a minimal SGD loop is sketched below)
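
A minimal SGD loop illustrating two of the tricks Bottou describes, per-epoch shuffling and a decaying step size; the least-squares problem, constants, and names are illustrative rather than taken from the paper:

```python
import numpy as np

def sgd_least_squares(X, y, lr0=0.1, epochs=20, seed=0):
    """Plain SGD on the squared loss 0.5*(w.x - y)^2, reshuffling the
    examples every epoch and decaying the step size on a 1/t-style
    schedule (the decay constant here is arbitrary)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):    # fresh order each epoch
            t += 1
            lr = lr0 / (1.0 + 0.01 * t)
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of the squared loss
            w -= lr * grad
    return w

# Recover a known linear model from noisy data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=200)
print(sgd_least_squares(X, y))  # close to [ 1.  -2.   0.5]
```
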
- Non Linear Units
- Rectified Linear Units Improve Restricted Boltzmann Machines by Vinod Nair and Geoffrey Hinton (see the ReLU sketch after this section)
- Mathematical Intuition for Performance of Rectified Linear Unit in Deep Neural Networks by Alexandre Dalyec
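
For quick reference alongside the Nair & Hinton paper, a sketch of the rectified linear unit and the subgradient used in backpropagation:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) elementwise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Subgradient used in backprop: 1 where x > 0, else 0. The kink
    at exactly 0 is assigned 0 by convention."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```
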
- Interesting blog posts
- Hacker's Guide to Neural Networks by Andrej Karpathy
- Breaking Linear Classifiers on ImageNet by Andrej Karpathy
- Classifying plankton with Deep Neural Networks by Sander Dieleman
- Deep stuff about deep learning?
- Understanding Convolution in Deep Learning
- A Brief Overview of Deep Learning by Ilya Sutskever
- Recurrent Neural Networks for Collaborative Filtering
- Interesting courses
- CS231n: Convolutional Neural Networks for Visual Recognition at Stanford by Andrej Karpathy
- CS224d: Deep Learning for Natural Language Processing at Stanford by Richard Socher
- STA 4273H (Winter 2015): Large Scale Machine Learning at Toronto by Russ Salakhutdinov
- AM 207 Monte Carlo Methods, Stochastic Optimization at Harvard by Verena Kaynig-Fittkau and Pavlos Protopapas
- Deep Learning for NLP (without Magic), a tutorial given at ACL 2012 and NAACL 2013 by Richard Socher, Chris Manning and Yoshua Bengio
- Video course on Deep Learning by Hugo Larochelle