Zhanwen Chen zhanwenchen

@vanbasten23
vanbasten23 / gist:94b33839fa37469f30a03487d1fac746
Created June 21, 2024 01:38
What is PyTorch's gradient checkpointing?
PyTorch's gradient checkpointing is a technique for reducing memory use when training deep neural networks. Instead of keeping every intermediate activation for the backward pass, it stores only some of them and recomputes the rest when they are needed. This is particularly useful for training large models that would otherwise require more GPU memory than is available.
### How Gradient Checkpointing Works
1. **Standard Training Process**:
- During the forward pass, activations (outputs of layers) are computed and stored for each layer.
- During the backward pass, these stored activations are used to compute gradients.
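The standard process above, and the checkpointed alternative, can be sketched in plain Python. This is a toy illustration under stated assumptions, not PyTorch's implementation: the "layers" are scalar `tanh` units, the loss is simply the final output, and the names `layer`, `layer_grads`, `backward_standard`, and `backward_checkpointed` are hypothetical helpers invented for this example.

```python
import math

def layer(w, x):
    """One toy 'layer': a scalar tanh unit."""
    return math.tanh(w * x)

def layer_grads(w, x, grad_out):
    """Given the layer input x and the upstream gradient, return (dL/dw, dL/dx)."""
    y = math.tanh(w * x)
    local = (1.0 - y * y) * grad_out
    return local * x, local * w

def backward_standard(weights, x0):
    # Forward pass: store the input of every layer.
    inputs = [x0]
    for w in weights:
        inputs.append(layer(w, inputs[-1]))
    # Backward pass: reuse the stored inputs, last layer first.
    grad = 1.0  # dL/d(output), taking L = output
    w_grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        w_grads[i], grad = layer_grads(weights[i], inputs[i], grad)
    return w_grads, len(weights)  # one stored input per layer

def backward_checkpointed(weights, x0, segment):
    n = len(weights)
    # Forward pass: keep only the input of the first layer in each segment.
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = layer(w, x)
    # Backward pass: recompute each segment's inputs from its checkpoint.
    grad = 1.0
    w_grads = [0.0] * n
    peak = 0
    for start in reversed(range(0, n, segment)):
        end = min(start + segment, n)
        inputs = [checkpoints[start]]
        for i in range(start, end - 1):
            inputs.append(layer(weights[i], inputs[-1]))
        # inputs[0] is itself a checkpoint, so do not count it twice.
        peak = max(peak, len(checkpoints) + len(inputs) - 1)
        for i in reversed(range(start, end)):
            w_grads[i], grad = layer_grads(weights[i], inputs[i - start], grad)
    return w_grads, peak

weights = [0.5 + 0.1 * i for i in range(16)]
grads_std, stored_std = backward_standard(weights, 0.3)
grads_ckpt, stored_ckpt = backward_checkpointed(weights, 0.3, segment=4)
```

Both functions produce the same gradients; the checkpointed version holds fewer values at once and pays for it with an extra partial forward pass per segment. In PyTorch itself, this trade-off is exposed through `torch.utils.checkpoint`.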
@ritwikraha
ritwikraha / Pretraining-LLM.md
Last active August 14, 2025 17:23
Pretraining of Large Language Models

Pretraining


A Map for Studying Pre-training in LLMs

  • Data Collection
    • General Text Data
    • Specialized Data
  • Data Preprocessing
    • Quality Filtering
    • Deduplication
@Birch-san
Birch-san / magma-readme.md
Created April 27, 2023 21:58
Build magma from source
@Birch-san
Birch-san / CUDA-12-1-1-pytorch.md
Last active February 2, 2025 20:31
Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10


Should you keep your NVIDIA driver?

The CUDA 12.1.1 toolkit is gonna offer to install Nvidia driver 530 for us. It's from the New Feature branch. It's likely to be newer than the default Nvidia driver you would've installed via apt-get (apt would prefer to give you 525, i.e. the Production Branch).

If you're confident that you already have a new enough Nvidia driver for CUDA 12.1.1, and you'd like to keep your driver: feel free to skip this "uninstall driver" step.

But if you're not sure, or you know your driver is too old: let's uninstall it. CUDA will install a new driver for us later.

@BramVanroy
BramVanroy / run.py
Last active July 13, 2024 22:20
Overwrite HfArgumentParser config options with CLI arguments
# See https://gist.github.com/BramVanroy/f78530673b1437ed0d6be7c61cdbdd7c
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments, HyperOptArguments))
try:
    # Assumes that the first .json file is the config file (if any)
    config_file = next(iter(arg for arg in sys.argv if arg.endswith(".json")))
except StopIteration:
    config_file = None
run_name_specified = False
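The gist's title describes overriding values from a JSON config file with command-line arguments. The preview stops before that step, so the following is only a standard-library sketch of the general idea, not the gist's actual implementation. The function name `merge_config_with_cli` and the `--key value` convention are assumptions made for illustration.

```python
import json

def merge_config_with_cli(config, argv):
    """Return a copy of config with `--key value` pairs from argv applied on top."""
    merged = dict(config)
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith("--") and i + 1 < len(argv):
            key = arg[2:]
            raw = argv[i + 1]
            try:
                # Parse numbers, booleans, etc. when the value is valid JSON.
                merged[key] = json.loads(raw)
            except json.JSONDecodeError:
                # Otherwise keep the raw string.
                merged[key] = raw
            i += 2
        else:
            i += 1
    return merged

config = {"learning_rate": 1e-4, "output_dir": "out"}
argv = ["run.py", "config.json", "--learning_rate", "3e-5", "--run_name", "trial"]
merged = merge_config_with_cli(config, argv)
```

Values given on the command line win over those in the file, and keys absent from the command line keep their configured values.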
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RepeatedKFold, GridSearchCV, cross_val_score
from sklearn.metrics import make_scorer, brier_score_loss
from sklearn.utils import resample
@guillermogotre
guillermogotre / unnormalize.py
Created March 20, 2022 15:44
PyTorch Torchvision UnNormalize (reverse Normalize)
import torchvision

class UnNormalize(torchvision.transforms.Normalize):
    def __init__(self, mean, std, *args, **kwargs):
        new_mean = [-m / s for m, s in zip(mean, std)]
        new_std = [1 / s for s in std]
        super().__init__(new_mean, new_std, *args, **kwargs)

# imagenet_norm = dict(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0.225])
# UnNormalize(**imagenet_norm)
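The choice of `-m/s` and `1/s` follows from inverting `y = (x - mean) / std`: applying the same formula again with those values gives `(y + mean/std) * std = x`. A small plain-Python check of that arithmetic, without torchvision; the `normalize` helper and the sample `pixel` values are hypothetical and only mirror the per-channel formula.

```python
def normalize(x, mean, std):
    """Per-channel (x - mean) / std, the arithmetic that Normalize applies."""
    return [(xi - m) / s for xi, m, s in zip(x, mean, std)]

mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Same derived parameters as in UnNormalize.__init__
inv_mean = [-m / s for m, s in zip(mean, std)]
inv_std = [1 / s for s in std]

pixel = [0.2, 0.5, 0.9]
restored = normalize(normalize(pixel, mean, std), inv_mean, inv_std)
```

Up to floating-point rounding, `restored` matches `pixel`, which is why the subclass can reuse `Normalize` unchanged.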

This is a collection of Ubuntu fixes for the Lenovo Legion 5i.

Tested on a Lenovo Legion 5i with the following specs:
AMD® Ryzen 7 4800h with radeon graphics × 16
NVIDIA Corporation / NVIDIA GeForce RTX 2060/PCIe/SSE2

1. GPU ISSUES for RTX 2060:

nvidia-driver-470 - HDMI may not work right after boot
nvidia-driver-495 - HDMI works right after boot, but the driver is unstable (random reboots)

@kinoc
kinoc / j6b_train_hf_ds.py
Last active September 17, 2024 18:53
So now you want to finetune that GPT-J-6B on a 3090/TITAN GPU ... okay, using HF and DeepSpeed too
# So now you want to finetune that GPT-J-6B on a 3090/TITAN GPU ... okay
# More exploratory coding. It uses the Huggingface model port, deepspeed and reads all text/md files from a target directory
# It is a fragment of a larger system with remote editing, but that's another story
# This is the raw, training tester. Items to look out for:
# - uses DeepSpeed and has a DS config
# - to save space uses SGD instead of ADAM
# - uses gradient checkpointing
# - freezes 25% of the layers to fit
# Assumes you can already run https://gist.github.com/kinoc/2d636a68876cd3de7b6e9c9452b61089
@peterhurford
peterhurford / install_xelatex_on_mac.txt
Last active August 5, 2025 16:13
How to install latex and xelatex on Mac so that Jupyter "Download as PDF" will work
brew install pandoc
brew tap homebrew/cask
brew install --cask basictex
eval "$(/usr/libexec/path_helper)"
# Update $PATH to include `/usr/local/texlive/2022basic/bin/universal-darwin`
sudo tlmgr update --self
sudo tlmgr install texliveonfly
sudo tlmgr install xelatex
sudo tlmgr install adjustbox
sudo tlmgr install tcolorbox