rnn_init.py (by @ismarou, forked from kaniblu/rnn_init.py, created October 20, 2019)
PyTorch LSTM and GRU Orthogonal Initialization and Positive Bias
from torch.nn import init as I


def init_gru(cell, gain=1):
    cell.reset_parameters()
    # orthogonal initialization of recurrent weights: each hidden-to-hidden
    # matrix stacks one block of rows per gate, so initialize each
    # hidden_size-row block separately
    for _, hh, _, _ in cell.all_weights:
        for i in range(0, hh.size(0), cell.hidden_size):
            I.orthogonal_(hh.data[i:i + cell.hidden_size], gain=gain)


def init_lstm(cell, gain=1):
    init_gru(cell, gain)
    # positive forget gate bias (Jozefowicz et al., 2015); PyTorch orders
    # LSTM gates as (input, forget, cell, output), so the forget gate bias
    # is the second quarter of each bias vector
    for _, _, ih_b, hh_b in cell.all_weights:
        l = len(ih_b)
        ih_b.data[l // 4:l // 2].fill_(1.0)
        hh_b.data[l // 4:l // 2].fill_(1.0)
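
A minimal usage sketch (the layer sizes here are illustrative, not part of the gist): construct the recurrent modules, then apply the initializers.

import torch
from torch import nn

gru = nn.GRU(input_size=32, hidden_size=64, num_layers=2)
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=2)

init_gru(gru)    # orthogonal recurrent weights
init_lstm(lstm)  # orthogonal recurrent weights + forget gate bias of 1

# sanity check: the forget gate slice of the layer-0 bias should be all ones
h = lstm.hidden_size
assert torch.all(lstm.bias_ih_l0[h:2 * h] == 1.0)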