@kemingy
Created August 4, 2020 07:31

Training

  • learning rate
  • dropout
  • max tokens (per batch)
  • clip norm
  • tokenization method
  • update frequency (gradient accumulation: delayed update over several mini-batches)
  • optimizer
  • learning rate scheduler
  • warmup updates
  • warmup initial learning rate
  • minimum learning rate
  • label smoothing
  • reduced precision (fp16)
  • share all embeddings
  • criterion (cross entropy)
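
As a rough illustration of how the learning rate, warmup, and minimum learning rate items above fit together, here is a minimal sketch of a schedule with linear warmup followed by inverse square root decay. The note does not name a scheduler, so the decay shape is an assumption, and the function name `lr_at_update` and all default values are hypothetical. The minimum learning rate is treated as a floor here; some toolkits use it as a stopping threshold instead.

```python
import math


def lr_at_update(step, peak_lr=5e-4, warmup_updates=4000,
                 warmup_init_lr=1e-7, min_lr=1e-9):
    """Return the learning rate for a given update number.

    All defaults are illustrative values, not taken from the note.
    Assumes warmup_updates > 0.
    """
    if step < warmup_updates:
        # Linear warmup from warmup_init_lr up to peak_lr.
        slope = (peak_lr - warmup_init_lr) / warmup_updates
        lr = warmup_init_lr + slope * step
    else:
        # Inverse square root decay, equal to peak_lr at step == warmup_updates.
        lr = peak_lr * math.sqrt(warmup_updates / step)
    # Assumption: min_lr acts as a lower bound on the schedule.
    return max(lr, min_lr)


if __name__ == "__main__":
    for step in (0, 2000, 4000, 16000):
        print(step, lr_at_update(step))
```

With update frequency greater than 1, `step` would count optimizer updates, not individual mini-batches.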

Generation

  • beam search size
  • max length (a*x + b, where x is the source length)
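
The max length rule above can be sketched as a small helper. The name `max_output_length` and the default values of `a` and `b` are hypothetical; the note gives only the linear form.

```python
def max_output_length(source_length, a=1.5, b=10):
    """Upper bound on generated length as a linear function of source length.

    a and b are illustrative defaults, not values from the note.
    """
    # Truncate to an integer number of tokens.
    return int(a * source_length + b)


if __name__ == "__main__":
    print(max_output_length(20))
```

Generation would stop a hypothesis once it reaches this bound, whatever the beam size.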