Overwrite HfArgumentParser config options with CLI arguments
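The gist's file isn't reproduced here, but one common pattern looks like the sketch below: flatten the JSON config into `--key value` pairs, then append the real CLI arguments, so argparse's last-one-wins behavior lets the command line override the file. The `TrainingConfig` dataclass and the `config.json` layout are assumptions for illustration, not the gist's exact code.

```python
# A hedged sketch: override values from a JSON config file with CLI
# arguments via HfArgumentParser.
import json
import sys
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class TrainingConfig:  # hypothetical config; substitute your own fields
    learning_rate: float = field(default=5e-5)
    num_epochs: int = field(default=3)


def main():
    parser = HfArgumentParser(TrainingConfig)
    # Usage: python train.py config.json --learning_rate 1e-4
    config_path, *cli_args = sys.argv[1:]
    with open(config_path) as f:
        config = json.load(f)
    # Flatten the config into "--key value" pairs, then append the real
    # CLI args. argparse keeps the last occurrence of each flag, so the
    # command line overrides the config file.
    config_args = [arg for key, value in config.items()
                   for arg in (f"--{key}", str(value))]
    (cfg,) = parser.parse_args_into_dataclasses(args=config_args + cli_args)
    print(cfg)


if __name__ == "__main__":
    main()
```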
Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10
Should you keep your NVIDIA driver?
The CUDA 12.1.1 toolkit will offer to install NVIDIA driver 530 for us. That driver comes from the New Feature branch, so it's likely newer than the default driver you would've installed via apt-get (apt prefers to give you 525, i.e. the Production branch).
If you're confident that you already have a new-enough NVIDIA driver for CUDA 12.1.1 and you'd like to keep it, feel free to skip this "uninstall driver" step.
But if you're not sure, or you know your driver is too old, let's uninstall it; the CUDA installer will install a new driver for us later. A sketch of the check-and-purge follows below.
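Something like the following should work. This is a hedged sketch, not the gist's exact commands: it prints the driver you currently have, then purges every apt-installed NVIDIA package.

```sh
# Print the currently loaded driver version (e.g. 525.105.17).
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Remove all apt-installed NVIDIA driver packages, then clean up the
# dependencies they pulled in. The CUDA installer provides a fresh
# driver 530 afterwards.
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get autoremove --purge
```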
PyTorch's gradient checkpointing is a technique for reducing the memory footprint when training deep neural networks, especially very deep architectures. It is particularly useful for training large models that would otherwise require more GPU memory than is available.
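Before walking through the mechanics, here is a minimal sketch of the usual entry point, `torch.utils.checkpoint.checkpoint`; the model, width, and depth are invented for illustration.

```python
# A minimal sketch of gradient checkpointing with a toy residual-free model.
import torch
from torch.utils.checkpoint import checkpoint


class Block(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, dim),
            torch.nn.ReLU(),
            torch.nn.Linear(dim, dim),
        )

    def forward(self, x):
        return self.net(x)


class Model(torch.nn.Module):
    def __init__(self, dim=512, depth=8):
        super().__init__()
        self.blocks = torch.nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            # Don't store this block's activations for backward;
            # recompute them when gradients are needed.
            x = checkpoint(block, x, use_reentrant=False)
        return x


model = Model()
x = torch.randn(4, 512, requires_grad=True)
model(x).sum().backward()  # activations are recomputed block by block
```

Each checkpointed block trades compute for memory: its activations are freed after the forward pass and recomputed during the backward pass.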
### How Gradient Checkpointing Works
1. **Standard Training Process**:
- During the forward pass, activations (outputs of layers) are computed and stored for each layer.
- During the backward pass, these stored activations are used to compute gradients.