Notes for 'Recurrent Neural Network Regularization' paper

Recurrent Neural Network Regularization

Introduction

  • The paper explains how to apply dropout to LSTMs and how it could reduce overfitting in tasks like language modelling, speech recognition, image caption generation and machine translation.
  • Link to the paper
  • Dropout is a regularisation method that drops out (i.e. temporarily removes) units in a neural network, along with all their incoming and outgoing connections (a minimal sketch follows this list).
  • Conventional dropout does not work well with RNNs as the recurrence amplifies the noise and hurts learning.
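
A minimal sketch of the dropout operation itself (the function name, the inverted-dropout scaling, and the numpy usage are illustrative assumptions, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p_drop, training=True):
    """Zero out each unit of h with probability p_drop (inverted dropout).

    Surviving units are scaled by 1 / (1 - p_drop) so the expected
    activation matches test time, when no units are dropped.
    """
    if not training or p_drop == 0.0:
        return h
    mask = rng.random(h.shape) >= p_drop  # 1 keeps a unit, 0 temporarily removes it
    return h * mask / (1.0 - p_drop)
```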

Regularization

  • The paper proposes to apply dropout to only the non-recurrent connections.
  • The dropout operator corrupts the information carried by some units (not all of them), forcing them to perform their intermediate computations more robustly.
  • The information is corrupted L + 1 times, where L is the number of layers; this count is independent of the number of timesteps the information traverses (see the sketch after this list).
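
Below is a minimal sketch of how this looks for one timestep of a stack of LSTM layers (plain numpy, single-example shapes; the helper names and the inverted-dropout scaling are assumptions for illustration, not the paper's code). Dropout is applied each time an activation crosses a layer boundary, i.e. only on the non-recurrent connections; the recurrent h and c paths are never masked.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p):
    mask = rng.random(h.shape) >= p        # 1 keeps a unit, 0 drops it
    return h * mask / (1.0 - p)

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step; gates are computed from the concatenation [x, h_prev]."""
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = 1 / (1 + np.exp(-i)), 1 / (1 + np.exp(-f)), 1 / (1 + np.exp(-o))
    c = f * c_prev + i * np.tanh(g)
    return o * np.tanh(c), c

def stacked_lstm_step(x, states, params, p_drop):
    """Advance a stack of LSTM layers by one timestep.

    Dropout is applied only to the non-recurrent connections: the input each
    layer receives from the layer below (and from the embedding at the bottom,
    and the final output fed to the softmax). The recurrent h/c paths carry
    state across timesteps untouched.
    """
    inp = dropout(x, p_drop)                           # non-recurrent: embedding -> layer 0
    new_states = []
    for (h_prev, c_prev), (W, b) in zip(states, params):
        h, c = lstm_step(inp, h_prev, c_prev, W, b)    # recurrent path: no dropout
        new_states.append((h, c))
        inp = dropout(h, p_drop)                       # non-recurrent: layer l -> layer l+1 / softmax
    return inp, new_states

# Example usage: 2-layer stack, embedding size 10, hidden size 20, p = 0.5.
d, n, p = 10, 20, 0.5
params = [(rng.normal(size=(d + n, 4 * n)) * 0.1, np.zeros(4 * n)),
          (rng.normal(size=(n + n, 4 * n)) * 0.1, np.zeros(4 * n))]
states = [(np.zeros(n), np.zeros(n)) for _ in params]
out, states = stacked_lstm_step(rng.normal(size=d), states, params, p)
```

At test time the same step would be run with p_drop = 0, following the usual inverted-dropout convention of not masking units during evaluation.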

Observation

  • In the context of language modelling, image caption generation, speech recognition and machine translation, dropout makes it possible to train larger networks and improves test performance (e.g. lower perplexity, higher frame accuracy).