- The paper explains how to correctly apply dropout to LSTMs and shows that it substantially reduces overfitting in tasks like language modelling, speech recognition, image caption generation and machine translation.
- [Link to the paper](https://arxiv.org/abs/1409.2329)
- Dropout is a regularisation method that temporarily removes (drops out) units from the network, along with all their incoming and outgoing connections (a sketch of the operator follows this list).
- Conventional dropout does not work well with RNNs as the recurrence amplifies the noise and hurts learning.
- The paper proposes applying dropout only to the non-recurrent connections (between layers at the same timestep), leaving the recurrent hidden-to-hidden connections intact (see the second sketch after this list).
- The dropout operator corrupts the information carried by some units (and not all), forcing them to perform their intermediate computations more robustly.
- The information is corrupted exactly L + 1 times, where L is the number of layers, and this count is independent of the number of timesteps the information traverses. For example, a prediction made after 100 timesteps by a 2-layer LSTM still passes through only 3 dropout applications.
- In language modelling, image caption generation, speech recognition and machine translation, this form of dropout enables training larger networks and improves test performance (e.g. lower word-level perplexity in language modelling, higher frame accuracy in speech recognition).
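
A minimal sketch of the standard (inverted) dropout operator referenced above, written in NumPy; the function name and the inverted-scaling convention are my own illustration, not taken from the paper:

```python
import numpy as np

def dropout(x, p=0.5, training=True):
    # Zero each unit independently with probability p; rescale survivors
    # by 1/(1-p) ("inverted" dropout) so no rescaling is needed at test time.
    if not training or p == 0.0:
        return x
    mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
    return x * mask
```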
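And a hedged PyTorch sketch of the non-recurrent dropout scheme described above (class name and structure are my own illustration of the idea, not the authors' implementation). Note how dropout touches each of the L layer inputs plus the final output, giving the L + 1 corruptions mentioned earlier, while the recurrent state flows through time undropped:

```python
import torch
import torch.nn as nn

class NonRecurrentDropoutLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers=2, p=0.5):
        super().__init__()
        sizes = [input_size] + [hidden_size] * num_layers
        self.cells = nn.ModuleList(
            nn.LSTMCell(sizes[l], sizes[l + 1]) for l in range(num_layers)
        )
        self.drop = nn.Dropout(p)

    def forward(self, x):                    # x: (seq_len, batch, input_size)
        states = [None] * len(self.cells)    # per-layer (h, c); None -> zeros inside LSTMCell
        outputs = []
        for x_t in x:                        # step through time
            h = x_t
            for l, cell in enumerate(self.cells):
                # Dropout on the vertical (layer-to-layer) input only;
                # the recurrent h_{t-1}, c_{t-1} pass through intact.
                states[l] = cell(self.drop(h), states[l])
                h = states[l][0]
            outputs.append(self.drop(h))     # (L+1)-th application, before the decoder
        return torch.stack(outputs)
```

For comparison, PyTorch's built-in `nn.LSTM(dropout=p)` applies the same idea internally, dropping the output of every layer except the last; the final dropout before the decoder is left to the caller.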