@williamFalcon
Created July 29, 2019 17:46
# Gradient accumulation in PyTorch: run 16 small forward/backward passes,
# then take a single optimizer step on the summed gradients.

# clear gradients left over from the last optimizer step
optimizer.zero_grad()

# 16 accumulated gradient steps
scaled_loss = 0
for accumulated_step_i in range(16):
    out = model(x)                 # forward pass on the current micro-batch
    loss = some_loss(out, y)
    loss.backward()                # gradients sum across the 16 backward calls
    scaled_loss += loss.item()

# update weights after 16 accumulated steps. effective batch = 16 * batch_size
optimizer.step()

# scaled_loss summed 16 per-batch losses; divide to recover the mean loss
actual_loss = scaled_loss / 16
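
For context, here is a minimal, self-contained sketch of the same technique inside a full training loop. The toy model, data, and the accumulate_every constant are illustrative assumptions, not part of the original gist. Note that this variant also divides each micro-batch loss by accumulate_every before backward(), so the accumulated gradient equals the gradient of the mean loss over one large batch; the snippet above sums raw gradients instead.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# toy model, optimizer, and data (illustrative assumptions)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
loader = DataLoader(TensorDataset(torch.randn(256, 10), torch.randn(256, 1)),
                    batch_size=4)

accumulate_every = 16  # micro-batches per optimizer step (assumed value)

optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    loss = loss_fn(model(x), y)
    # scale the loss so the summed gradient matches the mean-loss gradient
    (loss / accumulate_every).backward()
    if (i + 1) % accumulate_every == 0:
        optimizer.step()
        optimizer.zero_grad()  # reset before the next accumulation window

The key design choice is calling optimizer.zero_grad() only at window boundaries: between updates, each backward() call adds into the existing .grad buffers, which is what produces the larger effective batch.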