Overwrite HfArgumentParser config options with CLI arguments
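The gist's file isn't reproduced here, but one common pattern looks like the sketch below: flatten the JSON config into `--key value` pairs, then append the real CLI arguments, so argparse's last-one-wins behavior lets the command line override the file. The `TrainingConfig` dataclass and the `config.json` layout are assumptions for illustration, not the gist's exact code.

```python
# A hedged sketch: override values from a JSON config file with CLI
# arguments via HfArgumentParser.
import json
import sys
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class TrainingConfig:  # hypothetical config; substitute your own fields
    learning_rate: float = field(default=5e-5)
    num_epochs: int = field(default=3)


def main():
    parser = HfArgumentParser(TrainingConfig)
    # Usage: python train.py config.json --learning_rate 1e-4
    config_path, *cli_args = sys.argv[1:]
    with open(config_path) as f:
        config = json.load(f)
    # Flatten the config into "--key value" pairs, then append the real
    # CLI args. argparse keeps the last occurrence of each flag, so the
    # command line overrides the config file.
    config_args = [arg for key, value in config.items()
                   for arg in (f"--{key}", str(value))]
    (cfg,) = parser.parse_args_into_dataclasses(args=config_args + cli_args)
    print(cfg)


if __name__ == "__main__":
    main()
```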
Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10
Should you keep your NVIDIA driver?
The CUDA 12.1.1 toolkit will offer to install NVIDIA driver 530 for us. That driver comes from the New Feature branch, so it's likely newer than the default driver you would've installed via apt-get (apt prefers to give you 525, i.e. the Production branch).
If you're confident that you already have a new-enough NVIDIA driver for CUDA 12.1.1 and you'd like to keep it, feel free to skip this "uninstall driver" step.
But if you're not sure, or you know your driver is too old, let's uninstall it; the CUDA installer will install a new driver for us later. A sketch of the check-and-purge follows below.
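Something like the following should work. This is a hedged sketch, not the gist's exact commands: it prints the driver you currently have, then purges every apt-installed NVIDIA package.

```sh
# Print the currently loaded driver version (e.g. 525.105.17).
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Remove all apt-installed NVIDIA driver packages, then clean up the
# dependencies they pulled in. The CUDA installer provides a fresh
# driver 530 afterwards.
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get autoremove --purge
```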
PyTorch's gradient checkpointing is a technique for reducing the memory footprint when training deep neural networks, especially very deep architectures. It is particularly useful for training large models that would otherwise require more GPU memory than is available.
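Before walking through the mechanics, here is a minimal sketch of the usual entry point, `torch.utils.checkpoint.checkpoint`; the model, width, and depth are invented for illustration.

```python
# A minimal sketch of gradient checkpointing with a toy residual-free model.
import torch
from torch.utils.checkpoint import checkpoint


class Block(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, dim),
            torch.nn.ReLU(),
            torch.nn.Linear(dim, dim),
        )

    def forward(self, x):
        return self.net(x)


class Model(torch.nn.Module):
    def __init__(self, dim=512, depth=8):
        super().__init__()
        self.blocks = torch.nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            # Don't store this block's activations for backward;
            # recompute them when gradients are needed.
            x = checkpoint(block, x, use_reentrant=False)
        return x


model = Model()
x = torch.randn(4, 512, requires_grad=True)
model(x).sum().backward()  # activations are recomputed block by block
```

Each checkpointed block trades compute for memory: its activations are freed after the forward pass and recomputed during the backward pass.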
### How Gradient Checkpointing Works
1. **Standard Training Process**:
- During the forward pass, activations (outputs of layers) are computed and stored for each layer.
- During the backward pass, these stored activations are used to compute gradients.