Sayak Paul (sayakpaul)
Learn, unlearn and relearn.
sayakpaul / grade_images_with_gemini.py
Last active February 15, 2025 11:13
Shows how to use Gemini 2.0 Flash to grade images on multiple aspects, such as accuracy to the prompt and emotional and thematic response.
from google import genai
from google.genai import types
import typing_extensions as typing
from PIL import Image
import requests
import io
import json
import os
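
The preview above stops at the imports; the sketch below shows one way to wire this up with the `google-genai` SDK. The model name, grading schema, image URL, and prompt are illustrative assumptions, not the gist's exact code.

class Grade(typing.TypedDict):
    accuracy_to_prompt: int
    emotional_and_thematic_response: int

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Fetch the image to grade (placeholder URL).
image = Image.open(io.BytesIO(requests.get("https://example.com/sample.png").content))

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["Grade this image on a 1-5 scale for each aspect.", image],
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=Grade,
    ),
)
grades = json.loads(response.text)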
sayakpaul / generate_labels_with_deepseek.py
Last active February 7, 2025 15:56
Generate labels with DeepSeek and `transformers`.
"""
Implementation of the label generation part in https://danielvanstrien.xyz/posts/2025/deepseek/distil-deepseek-modernbert.html
using `transformers` and DeepSeek.
"""
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import re
import contextlib
import math
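
Only the imports survive in this preview. A minimal sketch of the labeling loop, assuming a distilled R1 checkpoint and a binary label scheme (both assumptions; the linked post distills DeepSeek-generated labels into ModernBERT):

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_label(text):
    messages = [{"role": "user", "content": f"Label this dataset card as GOOD or BAD:\n\n{text}"}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        out = model.generate(inputs, max_new_tokens=512)
    completion = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
    # R1-style models think out loud in a <think>...</think> block; parse the final verdict.
    match = re.search(r"\b(GOOD|BAD)\b", completion.split("</think>")[-1])
    return match.group(1) if match else None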
sayakpaul / create_collage_videos.py
Created January 30, 2025 10:42
Create 2×2 collage videos from a set of input videos.
from moviepy.editor import VideoFileClip, clips_array
import glob
def create_video_collage(video_paths, output_path="collage.mp4"):
    """
    Combine four videos of the same resolution into a 2×2 collage.

    Args:
        video_paths (list[str]): List of paths to the four video files.
        output_path (str): Filename for the output collage video.
    """
    # Load the four clips and lay them out as a 2×2 grid.
    clips = [VideoFileClip(path) for path in video_paths]
    collage = clips_array([[clips[0], clips[1]], [clips[2], clips[3]]])
    collage.write_videofile(output_path)
sayakpaul / benchmark_flux_without_compile.py
Created January 24, 2025 10:15
Benchmarking Flux across different optimizations.
from diffusers import DiffusionPipeline
from diffusers import FluxTransformer2DModel, BitsAndBytesConfig
from transformers import T5EncoderModel, BitsAndBytesConfig as BnbConfig
from offloader import ModelOffloaderV2
import torch.utils.benchmark as benchmark
from pathlib import Path
import os
import sys
import torch
import json
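
The rest of the script is truncated here. A timing helper in the style this file suggests, built on `torch.utils.benchmark` (a sketch, not necessarily the gist's exact helper):

def benchmark_fn(f, *args, **kwargs):
    t = benchmark.Timer(
        stmt="f(*args, **kwargs)",
        globals={"f": f, "args": args, "kwargs": kwargs},
        num_threads=torch.get_num_threads(),
    )
    return f"{t.timeit(10).mean:.3f} s"

# Usage, once a pipeline is built: benchmark_fn(pipe, prompt, num_inference_steps=28)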
import torch
from diffusers.utils import export_to_video
from diffusers import LTXPipeline, LTXVideoTransformer3DModel, GGUFQuantizationConfig
ckpt_path = (
"https://huggingface.co/city96/LTX-Video-gguf/blob/main/ltx-video-2b-v0.9-Q3_K_S.gguf"
)
transformer = LTXVideoTransformer3DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
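
From here, the GGUF-quantized transformer plugs into the pipeline as in the standard `diffusers` GGUF workflow; the prompt, frame count, and fps below are placeholders:

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
video = pipe(prompt="A calm lake at sunrise", num_frames=65).frames[0]
export_to_video(video, "output.mp4", fps=24)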
sayakpaul / aot_compile_with_int8_quant.py
Last active February 2, 2025 17:54
Shows how to AoT-compile the Flux.1 Dev transformer with int8 quantization and run inference.
import torch
from diffusers import FluxTransformer2DModel
import torch.utils.benchmark as benchmark
from torchao.quantization import quantize_, int8_weight_only
from torchao.utils import unwrap_tensor_subclass
import torch._inductor
torch._inductor.config.mixed_mm_choice = "triton"
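
The preview stops at the Inductor flag. Below is a toy sketch of the quantize-then-AoT-compile pattern on a stand-in module; the real gist targets `FluxTransformer2DModel`, whose example inputs are more involved, and `torch._inductor.aot_compile` / `torch._export.aot_load` are private APIs that have moved across PyTorch releases:

# Stand-in for the Flux transformer.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 256), torch.nn.GELU(), torch.nn.Linear(256, 64)
).to("cuda", torch.bfloat16)

quantize_(model, int8_weight_only())   # int8 weight-only quantization via torchao
model = unwrap_tensor_subclass(model)  # flatten tensor subclasses so export works

example_inputs = (torch.randn(1, 64, device="cuda", dtype=torch.bfloat16),)
exported = torch.export.export(model, example_inputs)
so_path = torch._inductor.aot_compile(exported.module(), example_inputs)  # writes a .so
compiled = torch._export.aot_load(so_path, device="cuda")
print(compiled(*example_inputs).shape)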
sayakpaul / inference.md
Last active February 5, 2025 14:13
(Not so rigorously tested) example showing how to use `bitsandbytes`, `peft`, etc. to LoRA fine-tune Flux.1 Dev.

When loading LoRA params that were obtained on a quantized base model and merging them into the base model, it is recommended to first dequantize the base model, merge the LoRA params into it, and then quantize the model again. Merging directly into a 4-bit quantized model can introduce rounding errors. Below, we provide an end-to-end example:

  1. First, load the original model and merge the LoRA params into it:
from diffusers import FluxPipeline 
import torch 

ckpt_id = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(
    ckpt_id, torch_dtype=torch.bfloat16  # dtype assumed; load the unquantized base model
)
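
The merge itself then follows the standard `diffusers` LoRA API; the LoRA path and output directory are placeholders. After saving, the merged weights can be quantized again (e.g., with `bitsandbytes` 4-bit) and reloaded:

pipeline.load_lora_weights("path/to/lora.safetensors")  # placeholder path
pipeline.fuse_lora()            # fold the LoRA deltas into the base weights
pipeline.unload_lora_weights()  # drop the now-redundant LoRA modules
pipeline.transformer.save_pretrained("merged-transformer")  # placeholder output dir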
sayakpaul / low_rank_lora.py
Last active December 15, 2024 22:41
Make a high-rank LoRA low-rank.
"""
Usage:
python low_rank_lora.py --repo_id=glif/how2draw --filename="How2Draw-V2_000002800.safetensors" \
--new_rank=4 --new_lora_path="How2Draw-V2_000002800_rank_4.safetensors"
"""
import torch
from huggingface_hub import hf_hub_download
import safetensors.torch
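
The preview ends at the imports. The core of the technique is a truncated SVD of each LoRA update; a minimal sketch, assuming the weights come in `lora_A`/`lora_B` pairs (key names vary across LoRA formats):

def lower_rank(A, B, new_rank):
    """Best rank-`new_rank` approximation of the update B @ A (Eckart-Young)."""
    U, S, Vh = torch.linalg.svd(B.float() @ A.float(), full_matrices=False)
    sqrt_s = S[:new_rank].sqrt()
    B_new = U[:, :new_rank] * sqrt_s          # shape: (out_features, new_rank)
    A_new = sqrt_s[:, None] * Vh[:new_rank]   # shape: (new_rank, in_features)
    return A_new, B_new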
sayakpaul / pipeline_flux_with_cfg_batched.py
Last active September 20, 2024 18:41
Flux with CFG (batched) 💣
# Copyright 2024 Black Forest Labs and The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
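
The preview shows only the license header. The idea behind the batched variant is to push the unconditional and conditional branches through the transformer in one forward pass instead of two; a self-contained sketch, with `transformer_fn` standing in for the real Flux transformer call:

import torch

def batched_cfg(transformer_fn, latents, uncond_embeds, cond_embeds, guidance_scale):
    """Classifier-free guidance with a single batched forward pass."""
    latent_in = torch.cat([latents, latents], dim=0)
    embeds_in = torch.cat([uncond_embeds, cond_embeds], dim=0)
    noise_uncond, noise_cond = transformer_fn(latent_in, embeds_in).chunk(2, dim=0)
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)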
sayakpaul / README.md
Last active October 22, 2024 03:02
This code snippet shows how to split the Flux transformer across two 16GB GPUs and run inference with the full pipeline.
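
One way to express such a split is through `diffusers`' accelerate-backed device maps (a sketch; the gist's exact approach may differ):

import torch
from diffusers import FluxTransformer2DModel

transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    device_map="auto",                  # let accelerate place the layers
    max_memory={0: "16GB", 1: "16GB"},  # cap each GPU's share
    torch_dtype=torch.bfloat16,
)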