Simo Ryu cloneofsimo
@cloneofsimo
cloneofsimo / identitcal_training_dynamics.py
Created August 7, 2024 09:07
Demonstrate ABC invariance
# Suppose you have a neural network where
# x_l = a_l * W_l x_{l-1}, W_l_{i,j} ~ N(0, b_l^2), and the learning rate of W_l is c_l.
# If you are using Adam, you can rescale
# a_l <- a_l * A, b_l <- b_l / A, c_l <- c_l / A
# and the training dynamics will be exactly identical to before.
# This is known as ABC (ABCD) redundancy. For the more general case: https://arxiv.org/abs/2308.01814
# Let me show you what I mean:
import torch
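The rescaling described in the comments above can be checked on a toy problem. The following is a minimal sketch, not the gist's own code: it uses plain Python instead of torch, a single layer `y = a * (w . x)` with a squared-error loss, and a hand-written Adam with `eps = 0` (an assumption; a nonzero `eps` makes the match only approximate). The names `train` and `max_gap`, the inputs, and the target are all hypothetical.

```python
import math
import random


def train(a, b, c, steps=20, seed=0):
    """Train y = a * (w . x) with Adam (eps = 0); return the per-step losses."""
    rng = random.Random(seed)
    x = [1.0, -2.0, 0.5]   # fixed toy input (hypothetical)
    target = 3.0           # fixed toy target (hypothetical)
    # W_j ~ N(0, b^2): draw standard normals, then scale by b
    w = [b * rng.gauss(0.0, 1.0) for _ in x]
    m = [0.0] * len(x)     # Adam first-moment estimate
    v = [0.0] * len(x)     # Adam second-moment estimate
    beta1, beta2 = 0.9, 0.999
    losses = []
    for t in range(1, steps + 1):
        out = a * sum(wj * xj for wj, xj in zip(w, x))
        err = out - target
        losses.append(err * err)
        # d(loss)/d(w_j) = 2 * err * a * x_j, so it scales with the multiplier a
        grad = [2.0 * err * a * xj for xj in x]
        for j, g in enumerate(grad):
            m[j] = beta1 * m[j] + (1 - beta1) * g
            v[j] = beta2 * v[j] + (1 - beta2) * g * g
            m_hat = m[j] / (1 - beta1 ** t)
            v_hat = v[j] / (1 - beta2 ** t)
            # With eps = 0 the step m_hat / sqrt(v_hat) ignores the gradient scale
            w[j] -= c * m_hat / math.sqrt(v_hat)
    return losses


A = 3.0
base = train(a=1.0, b=0.5, c=0.01)
scaled = train(a=1.0 * A, b=0.5 / A, c=0.01 / A)
max_gap = max(abs(p - q) for p, q in zip(base, scaled))
print(max_gap)  # expected to be at floating-point rounding level
```

Both runs share the same seed, so the rescaled weights start at exactly `1/A` times the baseline weights. The gradient grows by `A`, Adam's normalized step cancels that factor, and the smaller learning rate keeps the weights at `1/A` of the baseline throughout.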
@cloneofsimo
cloneofsimo / pp.py
Created August 8, 2024 05:11
vae_preprocess
import os
import torch
import json
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from diffusers.models import AutoencoderKL
from streaming import MDSWriter
import logging
import time
@cloneofsimo
cloneofsimo / hr.py
Created August 8, 2024 08:54
bucket
import os
import torch
import json
from PIL import Image
from torch.utils.data import Dataset
from diffusers.models import AutoencoderKL
from streaming import MDSWriter
import logging
import time
@cloneofsimo
cloneofsimo / plots.md
Created August 16, 2024 05:23
GPT-generated-Plots
import pandas as pd

def plot_lr_final_loss_batchsize(file_path):
    # Load the data
    data = pd.read_csv(file_path)
    
    # Extract columns that match 'val_loss/val_loss_'
    val_loss_columns = [col for col in data.columns if col.startswith('val_loss/val_loss_')]
    
    # Sort val_loss_columns by K (numeric value after 'val_loss/val_loss_') in increasing order
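The sorting step named in the last comment above could look like the following. This is a minimal sketch under the assumption that everything after the `val_loss/val_loss_` prefix parses as a number; the helper name `sort_by_k` is hypothetical and the original function body is not shown in this preview.

```python
def sort_by_k(columns, prefix="val_loss/val_loss_"):
    """Return the columns that start with `prefix`, ordered by their numeric suffix K."""
    matching = [col for col in columns if col.startswith(prefix)]
    # A plain string sort would put "..._10" before "..._2"; sort on the number instead.
    return sorted(matching, key=lambda col: float(col[len(prefix):]))


example_columns = [
    "val_loss/val_loss_10",
    "train_loss",
    "val_loss/val_loss_2",
    "val_loss/val_loss_1",
]
ordered = sort_by_k(example_columns)
print(ordered)
```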
@cloneofsimo
cloneofsimo / prompt.md
Created August 19, 2024 04:55
Proooooompt

Summarization:

As a professional summarizer, create a concise and comprehensive summary of the provided text, be it an article, post, conversation, or passage, while adhering to these guidelines:

Craft a summary that is detailed, thorough, in-depth, and complex, while maintaining clarity and conciseness.

Incorporate main ideas and essential information, eliminating extraneous language and focusing on critical aspects.

Rely strictly on the provided text, without including external information.

Variant of AM-GM for Minimization

When dealing with functions of the form $f(x) = x^a + \frac{1}{x^b}$, a variant of the AM-GM inequality can be used to find the minimum. Specifically, if you have:

$$ f(x) = c_1 \cdot x^a + c_2 \cdot \frac{1}{x^b} $$

The minimum occurs at:
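Assuming $a, b, c_1, c_2 > 0$ and $x > 0$, setting $f'(x) = a c_1 x^{a-1} - b c_2 x^{-b-1} = 0$ gives the stationary point

$$ x^* = \left( \frac{b \, c_2}{a \, c_1} \right)^{\frac{1}{a+b}} $$

Since $f(x) \to \infty$ both as $x \to 0^+$ and as $x \to \infty$, this unique stationary point is the global minimum. The sketch below checks the closed form numerically for one choice of constants; the function names `f` and `amgm_argmin` are illustrative and not from the original note.

```python
def f(x, a, b, c1, c2):
    """f(x) = c1 * x^a + c2 / x^b for x > 0."""
    return c1 * x ** a + c2 / x ** b


def amgm_argmin(a, b, c1, c2):
    """Stationary point of f, assuming a, b, c1, c2 > 0."""
    return (b * c2 / (a * c1)) ** (1.0 / (a + b))


# Example constants (arbitrary): f(x) = x^2 + 2 / x
a, b, c1, c2 = 2.0, 1.0, 1.0, 2.0
x_star = amgm_argmin(a, b, c1, c2)
f_star = f(x_star, a, b, c1, c2)

# Nearby points should give larger values of f
left = f(x_star * 0.99, a, b, c1, c2)
right = f(x_star * 1.01, a, b, c1, c2)
print(x_star, f_star, left > f_star, right > f_star)
```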

@cloneofsimo
cloneofsimo / polynomial-sphere-map.md
Last active September 28, 2024 09:35
Does there exist a polynomial map of degree m sending S^n to itself?
@cloneofsimo
cloneofsimo / infinite_parameterized_fractal.py
Created October 3, 2024 18:03
Parameterized Fractal Triton
import torch
import triton
import triton.language as tl
from triton.language.extra import libdevice
@triton.jit
def fractal_kernel(
    zr_ptr, zi_ptr, cr_ptr, ci_ptr, output_ptr,
    alpha_ptr, beta_ptr, poly0_ptr, poly1_ptr, poly2_ptr, poly3_ptr, p_ptr, R, max_iter,
    H, W,
import torch
import time
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = False
@torch.no_grad()
def benchmark_gemm(m, k, n, dtype=torch.bfloat16, allow_bf16_reduce=True):
    torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = allow_bf16_reduce
@cloneofsimo
cloneofsimo / unit_activation_reinitializer.py
Created October 15, 2024 10:41
Unit-Scale Activation Initialization by Brute Force search
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
import numpy as np
import math
def compute_activation_std(model, dataset, device='cpu', batch_size=32, num_workers=0, layer_names=None):
    activations = {}