GradientAccumulation-for-continual-pretraining.ipynb
Arunprakash-A commented on Jul 17, 2024:
- Note that gradient accumulation is not supported for optimizers like GaLore; a minimal sketch of the gradient-accumulation setup is shown after this list
- Inspired by the doc: https://huggingface.co/docs/transformers/v4.18.0/en/performance
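
Since the notebook itself is not reproduced here, the sketch below is only an illustrative way to enable gradient accumulation for continual pretraining with the Hugging Face `Trainer`, not the gist's actual code. The model checkpoint, dataset, and hyperparameter values are placeholders.

```python
# Minimal sketch: gradient accumulation with the Hugging Face Trainer.
# Checkpoint, dataset, and hyperparameters are placeholders, not the gist's actual values.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"  # placeholder checkpoint to continue pretraining
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder corpus; replace with the domain text used for continual pretraining.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=raw.column_names,
)

args = TrainingArguments(
    output_dir="continual-pretrain-out",
    per_device_train_batch_size=2,   # small micro-batch that fits in memory
    gradient_accumulation_steps=8,   # effective batch size = 2 * 8 = 16
    num_train_epochs=1,
    learning_rate=5e-5,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

With `gradient_accumulation_steps=8`, gradients from eight micro-batches are summed before each optimizer step, which is why optimizers that update parameters layer by layer during the backward pass (such as the layer-wise GaLore variants mentioned above) cannot be combined with it.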