Zhanwen Chen zhanwenchen

@ritwikraha
ritwikraha / Pretraining-LLM.md
Last active March 3, 2025 07:16
Pretraining of Large Language Models

Pretraining


A Map for Studying Pre-training in LLMs

  • Data Collection
    • General Text Data
    • Specialized Data
  • Data Preprocessing
    • Quality Filtering
    • Deduplication
@Birch-san
Birch-san / magma-readme.md
Created April 27, 2023 21:58
Build magma from source
@Birch-san
Birch-san / CUDA-12-1-1-pytorch.md
Last active February 2, 2025 20:31
Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10


Should you keep your NVIDIA driver?

The CUDA 12.1.1 toolkit is gonna offer to install Nvidia driver 530 for us. It's from the New Feature branch. It's likely to be newer than the default Nvidia driver you would've installed via apt-get (apt would prefer to give you 525, i.e. the Production Branch).

If you're confident that you already have a new enough Nvidia driver for CUDA 12.1.1, and you'd like to keep your driver: feel free to skip this "uninstall driver" step.

But if you're not sure, or you know your driver is too old: let's uninstall it. CUDA will install a new driver for us later.
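
One way to take the guesswork out of "is my driver new enough" is to compare version numbers in a few lines of Python. This is only a sketch: the helper names are made up for illustration, and the `530` threshold is an assumption taken from the driver the toolkit offers to install, not an official minimum. Check NVIDIA's release notes for the real requirement.

```python
import re
import subprocess


def parse_driver_version(text):
    """Turn a version string such as '530.30.02' into a tuple of ints: (530, 30, 2)."""
    parts = re.findall(r"\d+", text)
    if not parts:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in parts)


def driver_is_new_enough(installed, minimum):
    """Compare two driver version strings component by component."""
    return parse_driver_version(installed) >= parse_driver_version(minimum)


def installed_driver_version():
    """Ask nvidia-smi for the driver version (needs a working NVIDIA driver)."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # With several GPUs there is one line per GPU; the first one is enough here.
    return output.splitlines()[0].strip()


# Assumption: 530 is used only because that is the driver the toolkit offers.
MINIMUM_DRIVER = "530"
```

On a machine with a driver installed, `driver_is_new_enough(installed_driver_version(), MINIMUM_DRIVER)` tells you whether you could skip the uninstall step under that assumed threshold.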

@BramVanroy
BramVanroy / run.py
Last active July 13, 2024 22:20
Overwrite HfArgumentParser config options with CLI arguments
# See https://gist.github.com/BramVanroy/f78530673b1437ed0d6be7c61cdbdd7c
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments, HyperOptArguments))
try:
    # Assumes that the first .json file is the config file (if any)
    config_file = next(iter(arg for arg in sys.argv if arg.endswith(".json")))
except StopIteration:
    config_file = None
run_name_specified = False
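
The preview above stops before the override logic, so here is a rough standard-library sketch of the general idea: load defaults from a JSON config, then let CLI flags win. It does not use `HfArgumentParser` and is not the gist's code; `split_config_and_overrides` and `merge_config` are hypothetical helpers, and it only handles flat configs with `str`, `int`, `float` and `bool` values.

```python
import argparse


def split_config_and_overrides(argv):
    """Return (config_file, other_args); the first *.json argument is the config file."""
    # next(..., None) is a compact equivalent of the try/except StopIteration pattern.
    config_file = next((arg for arg in argv if arg.endswith(".json")), None)
    others = [arg for arg in argv if arg != config_file]
    return config_file, others


def _parse_bool(text):
    """argparse's type=bool treats any non-empty string as True, so parse by hand."""
    return text.lower() in ("1", "true", "yes")


def merge_config(config, cli_args):
    """Use config values as defaults and let matching --key value flags override them."""
    parser = argparse.ArgumentParser()
    for key, value in config.items():
        kind = _parse_bool if isinstance(value, bool) else type(value)
        parser.add_argument(f"--{key}", type=kind, default=value)
    return vars(parser.parse_args(cli_args))
```

In a script you would call `split_config_and_overrides(sys.argv[1:])`, read the returned file with `json.load`, and pass the resulting dict plus the remaining arguments to `merge_config`.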
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RepeatedKFold, GridSearchCV, cross_val_score
from sklearn.metrics import make_scorer, brier_score_loss
from sklearn.utils import resample

This is a collection of Ubuntu fixes for Lenovo Legion 5i

Tested on: Lenovo Legion 5i with below specs:
AMD® Ryzen 7 4800h with radeon graphics × 16
NVIDIA Corporation / NVIDIA GeForce RTX 2060/PCIe/SSE2

1. GPU ISSUES for RTX 2060:

nvidia-driver-470 - HDMI may not work from the start
nvidia-driver-495 - HDMI works from the start, but unstable (random reboots)

@kinoc
kinoc / j6b_train_hf_ds.py
Last active September 17, 2024 18:53
So now you want to finetune that GPT-J-6B on a 3090/TITAN GPU ... okay, using HF and DeepSpeed too
# So now you want to finetune that GPT-J-6B on a 3090/TITAN GPU ... okay
# More exploratory coding. It uses the Huggingface model port, deepspeed and reads all text/md files from a target directory
# It is a fragment of a larger system with remote editing, but that's another story
# This is the raw, training tester. Items to look out for:
# - uses DeepSpeed and has a DS config
# - to save space uses SGD instead of ADAM
# - uses gradient checkpointing
# - freezes 25% of the layers to fit
# Assumes you can already run https://gist.github.com/kinoc/2d636a68876cd3de7b6e9c9452b61089
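
As a rough illustration of the "freezes 25% of the layers" item, the sketch below only works out which transformer blocks would be frozen. It is not the script's code: `blocks_to_freeze` is a hypothetical helper, and freezing the first blocks (rather than some other subset) is an assumption, since the preview does not say which layers are chosen.

```python
def blocks_to_freeze(num_blocks, fraction):
    """Return the indices of the blocks to freeze, counting from the input side."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    count = int(num_blocks * fraction)  # round down to a whole number of blocks
    return list(range(count))


# GPT-J-6B has 28 transformer blocks, so 25% means 7 frozen blocks.
frozen = blocks_to_freeze(28, 0.25)

# In PyTorch, freezing a block means setting requires_grad = False on each of
# its parameters, so the optimizer keeps no gradients or state for them.
```

Frozen parameters need no gradients, which is where the memory saving comes from.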
@peterhurford
peterhurford / install_xelatex_on_mac.txt
Last active March 5, 2025 09:21
How to install latex and xelatex on Mac so that Jupyter "Download as PDF" will work
brew install pandoc
brew tap homebrew/cask
brew install --cask basictex
eval "$(/usr/libexec/path_helper)"
# Update $PATH to include `/usr/local/texlive/2022basic/bin/universal-darwin`
sudo tlmgr update --self
sudo tlmgr install texliveonfly
sudo tlmgr install xelatex
sudo tlmgr install adjustbox
sudo tlmgr install tcolorbox
@ktosiek
ktosiek / PA profile-set astro-a50-gen4.conf
Last active December 29, 2024 22:09
Astro A50 support on Linux - basic configuration for PulseAudio 13 (tested on Ubuntu's 13.99.1). Install the files and reboot, to make sure udev and PA reloaded :-)
; /usr/share/pulseaudio/alsa-mixer/profile-sets/astro-a50-gen4.conf
[General]
auto-profiles = yes
[Mapping analog-voice]
description = Voice
device-strings = hw:%f,0,0
channel-map = left,right
paths-output = steelseries-arctis-output-chat-common
@kmhofmann
kmhofmann / installing_nvidia_driver_cuda_cudnn_linux.md
Last active January 10, 2025 22:30
Installing the NVIDIA driver, CUDA and cuDNN on Linux

Installing the NVIDIA driver, CUDA and cuDNN on Linux (Ubuntu 20.04)

This is a companion piece to my instructions on building TensorFlow from source. In particular, the aim is to install the following pieces of software

on an Ubuntu Linux system, in particular Ubuntu 20.04.