Yassine Alouini (yassineAlouini)
⚙️ PyTorch Exploration...

@3outeille
3outeille / pipeline_parallel.py
Last active April 28, 2025 00:54
Self-contained example of how pipeline parallelism works (AFAB and 1F1B) in 200 LOC
#VERBOSE=0 torchrun --nproc_per_node 3 self_contained_pp_LOC.py
import os, random, numpy as np, torch, torch.nn as nn, torch.distributed as dist, torch.nn.functional as F
from torch.optim import AdamW
from torch.utils.data import DataLoader, DistributedSampler
from datasets import load_dataset
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
STEP, local_rank, world_size, verbose = 0, int(os.environ["LOCAL_RANK"]), int(os.environ["WORLD_SIZE"]), os.environ.get("VERBOSE", "0") == "1"
def set_all_seed(seed):
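The preview cuts off at set_all_seed; as a minimal sketch (the gist's actual body is not shown here), a typical implementation seeds every RNG the script uses, relying on the random/np/torch imports above:

def set_all_seed(seed):
    # seed Python, NumPy and PyTorch RNGs so every rank starts from the same state
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)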
import math
import os
from collections import defaultdict
from pathlib import Path
from huggingface_hub import CommitOperationAdd, preupload_lfs_files, create_commit
# fast transfers using a Rust library, `pip install hf-transfer`
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
@shawwn
shawwn / JAX_compliation_cache.md
Last active January 2, 2024 15:46
JAX persistent compilation cache

JAX released a persistent compilation cache for TPU VMs! When enabled, the cache writes compiled JAX computations to disk so they don’t have to be re-compiled the next time you start your JAX program. This can save startup time if any of y’all have long compilation times.

First upgrade to the latest jax release:

pip install -U "jax[tpu]>=0.2.18" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

Then use the following to enable the cache in your jax code:

from jax.experimental.compilation_cache import compilation_cache as cc
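The snippet is cut off after the import; a minimal sketch of enabling the cache (the cache directory is an arbitrary choice, not from the gist):

# point JAX at a directory where compiled computations are stored and reused across runs
cc.initialize_cache("/tmp/jax_cache")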
@meseta
meseta / quora_dump_parser.py
Last active June 16, 2021 21:53
Converts Quora dumps that you can request Quora to send you (multiple zip files) into parsed JSON and markdown
""" Converts quora dumps that you can request Quora to send you (multiple zip files) into parsed json and markdown
To use:
1. install dateparser, beautifulsoup4, markdownify
2. copy the zips you've received to ./data, make sure to keep only the ones containing answers (some will
contain comments, blog posts, and other metadata)
3. run this script
4. see ./output folder
License (MIT):
Copyright 2021 Yuan Gao
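A minimal sketch of the core conversion step the steps above describe, assuming the answers are stored as HTML inside the zips (the file layout and names here are assumptions, not taken from the script):

import zipfile
from bs4 import BeautifulSoup
from markdownify import markdownify

with zipfile.ZipFile("data/answers_1.zip") as zf:  # hypothetical zip name under ./data
    for name in zf.namelist():
        if name.endswith(".html"):
            html = zf.read(name).decode("utf-8")
            text = BeautifulSoup(html, "html.parser").get_text()  # plain-text version
            md = markdownify(html)                                # markdown version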
import torch
import cv2
import h5py
import numpy as np
from scipy.io import loadmat
import torch.utils.data as data
import torch.nn.functional as F
from torchvision.transforms import Compose
@kiyoon
kiyoon / ffmpeg_nvidia_conda_install.sh
Last active May 6, 2025 08:15
Install NVIDIA-accelerated ffmpeg in a conda environment.
git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git
cd nv-codec-headers
vi Makefile # change the first line to PREFIX = ${CONDA_PREFIX}
make install
cd ..
git clone https://git.ffmpeg.org/ffmpeg.git
cd ffmpeg
git checkout n4.2.2
conda install nasm
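The preview stops before the build itself; a hedged sketch of how the build typically continues inside the conda environment (the configure flags are assumptions, not copied from the gist):

# configure ffmpeg to install into the conda prefix with NVENC/NVDEC support
./configure --prefix=$CONDA_PREFIX --enable-nvenc --enable-nvdec --enable-cuvid
make -j"$(nproc)"
make install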
@Mlawrence95
Mlawrence95 / confusion_matrix.py
Last active March 26, 2024 10:25
Python: create a confusion matrix from two categorical columns of a Pandas dataframe
import pandas as pd
def confusion_matrix(df: pd.DataFrame, col1: str, col2: str):
"""
Given a dataframe with at least two categorical columns,
create a confusion matrix of the cross-counts of the two columns.
Use like:
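The function body is cut off in the preview; a minimal self-contained sketch of the same idea using pandas' built-in cross tabulation (not necessarily the gist's exact implementation):

import pandas as pd

def confusion_matrix(df: pd.DataFrame, col1: str, col2: str) -> pd.DataFrame:
    """Cross-tabulate two categorical columns, e.g. confusion_matrix(df, "truth", "pred")."""
    return pd.crosstab(df[col1], df[col2])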
@leandrobmarinho
leandrobmarinho / yolo.py
Last active December 11, 2021 09:37
An example in Python using YOLO with OpenCV.
import cv2
import numpy as np
scale = 0.00392
classes_file = "coco.names"
weights = "yolov2.weights"
config_file = "yolov2.cfg"
# read class names from text file
classes = None
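The preview stops right after the configuration variables; a hedged sketch of the usual next steps with OpenCV's DNN module (the call pattern is typical, not copied from the gist, and the input image name is hypothetical):

# load class names and the pretrained Darknet weights/config into OpenCV's DNN module
with open(classes_file) as f:
    classes = [line.strip() for line in f]
net = cv2.dnn.readNet(weights, config_file)
# feed the network a 416x416 blob, scaled by `scale` (~1/255)
image = cv2.imread("example.jpg")
blob = cv2.dnn.blobFromImage(image, scale, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)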
@zblz
zblz / mlflow_plugin_system_proposal.md
Last active August 17, 2020 09:13
Proposal for plugin system in MLflow

Proposal for a plugin system in MLflow

Motivation

MLflow has an internally pluggable architecture to enable using different backends for both the tracking store and the artifact store. This makes it easy to add new backends in the mlflow package, but does not allow for other packages to provide new handlers for new backends.
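As a hedged illustration of what such an external hook could look like (the entry-point group and class names below are hypothetical, not part of MLflow's API at the time of this proposal), a third-party package might register its store via setuptools entry points:

# setup.py of a hypothetical third-party package providing a new tracking backend
from setuptools import setup

setup(
    name="mlflow-mybackend",
    packages=["mlflow_mybackend"],
    entry_points={
        # group name and target are illustrative only
        "mlflow.tracking_store": [
            "mybackend=mlflow_mybackend.store:MyBackendStore",
        ],
    },
)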

This would be useful for several reasons:

@jeremyjordan
jeremyjordan / sgdr.py
Last active December 4, 2023 13:41
Keras Callback for implementing Stochastic Gradient Descent with Restarts
from keras.callbacks import Callback
import keras.backend as K
import numpy as np
class SGDRScheduler(Callback):
'''Cosine annealing learning rate scheduler with periodic restarts.
# Usage
```python
schedule = SGDRScheduler(min_lr=1e-5,
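The usage snippet is truncated mid-call; as a hedged sketch (the function and parameter names are assumptions, not the gist's exact signature), the cosine-annealed learning rate at the heart of such a callback is typically computed as:

import numpy as np

def cosine_annealed_lr(min_lr, max_lr, fraction_of_cycle):
    """fraction_of_cycle runs from 0 to 1 within each restart cycle."""
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + np.cos(np.pi * fraction_of_cycle))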