This example is based on main_amp.py from the Apex imagenet amp examples and can be used with the same example commands. It demonstrates batch replay (instead of batch skipping) with the dynamic gradient scaling used by Amp.
Batch replay requires a bit of user-side control flow, but is fairly straightforward.
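To make that control flow concrete, here is a minimal pure-Python sketch of the idea. It is not the code from main_amp_replay.py and it does not use the Apex API: `ToyScaler`, `train`, the backoff factor, and the fp16 limit are hypothetical stand-ins chosen for illustration, and the scale-growth step that a real dynamic scaler performs after a run of successful iterations is omitted. When the scaled gradient overflows, the loop reduces the scale and reruns the same batch rather than moving on to the next one.

```python
import math


class ToyScaler:
    """Hypothetical stand-in for a dynamic loss scaler (not the Apex API)."""

    def __init__(self, init_scale=2.0 ** 16, backoff=0.5, max_fp16=65504.0):
        self.scale = init_scale
        self.backoff = backoff
        self.max_fp16 = max_fp16  # largest finite fp16 value

    def scaled_grad(self, grad):
        # Emulate fp16 overflow: anything beyond the finite range becomes inf.
        g = grad * self.scale
        return g if abs(g) <= self.max_fp16 else math.inf

    def step_ok(self, scaled_grad):
        # True if the gradient is finite; otherwise shrink the scale.
        if math.isfinite(scaled_grad):
            return True
        self.scale *= self.backoff
        return False


def train(batches, lr=0.1, max_replays=32):
    """Fit w in y = w * x with squared error, replaying overflowed batches."""
    w = 0.0
    scaler = ToyScaler()
    replays = 0
    updates = 0
    for x, y in batches:
        for _ in range(max_replays):
            grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)**2
            sg = scaler.scaled_grad(grad)
            if scaler.step_ok(sg):
                w -= lr * sg / scaler.scale  # unscale before the update
                updates += 1
                break
            # Overflow: replay the SAME batch with the reduced scale
            # instead of skipping it.
            replays += 1
        else:
            raise RuntimeError("scale could not be reduced enough")
    return w, replays, updates, scaler.scale
```

With batch skipping, each overflowed iteration would consume a batch without updating the weights. In this sketch every batch eventually produces exactly one update, at the cost of recomputing the forward and backward pass for the replayed attempts.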
Search (Ctrl+F) for "added for batch replay" in main_amp_replay.py below to see what was changed. There should be only 5 instances, found entirely in this section.
Vimdiffing main_amp_replay.py and main_amp.py from the Apex example directory is also instructive. Again, there should be few differences.
See the "Batch replay" example in the Automatic Mixed Precision RFC for a preview of how I plan this will work