sayakpaul / inference.md
Last active February 5, 2025 14:13
(Not so rigorously tested) example showing how to use `bitsandbytes`, `peft`, etc. to LoRA fine-tune Flux.1 Dev.

When loading LoRA params that were obtained on a quantized base model and merging them into the base model, it is recommended to first dequantize the base model, merge the LoRA params into it, and then quantize the model again. Merging directly into a 4-bit quantized model can introduce rounding errors. Below, we provide an end-to-end example:
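Before the real pipeline, here is a toy illustration of why the merge order matters. It is a minimal sketch under loud assumptions: `quantize` and `dequantize` are hypothetical helpers implementing a simple symmetric 4-bit absmax scheme, not the NF4 format that `bitsandbytes` uses, and the weight and LoRA values are made up so the effect is easy to see.

```python
# Toy symmetric 4-bit absmax quantizer (integer codes in [-7, 7]).
# Hypothetical helpers for illustration only; NOT bitsandbytes' NF4 scheme.

def quantize(values):
    """Map floats to integer codes in [-7, 7] with one shared scale."""
    scale = max(abs(v) for v in values) / 7
    codes = [max(-7, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to floats."""
    return [c * scale for c in codes]

# Made-up base weights and a small LoRA update.
base = [0.70, -0.33, 0.13, 0.52]
lora_delta = [0.00, -0.03, 0.03, 0.04]

# The weights we would ideally like to represent after merging.
target = [w + d for w, d in zip(base, lora_delta)]

# Path A: merge in full precision, then quantize once.
codes_fp, scale_fp = quantize(target)

# Path B: merge into weights that were already rounded to the 4-bit grid,
# then quantize again.
base_codes, base_scale = quantize(base)
on_grid = dequantize(base_codes, base_scale)
merged_on_grid = [w + d for w, d in zip(on_grid, lora_delta)]
codes_q, scale_q = quantize(merged_on_grid)

def max_error(codes, scale):
    """Largest absolute deviation from the ideal merged weights."""
    return max(abs(t - v) for t, v in zip(target, dequantize(codes, scale)))

err_fp = max_error(codes_fp, scale_fp)
err_q = max_error(codes_q, scale_q)

print("full-precision merge codes:", codes_fp)
print("quantized-domain merge codes:", codes_q)
print("base codes (no LoRA):", base_codes)
print("max error, path A vs path B:", err_fp, err_q)
```

In this toy case the update applied on the 4-bit grid rounds back to the same codes as the base weights, so the LoRA contribution is lost, while merging in full precision first keeps it. How large the gap is with a real model depends on the quantization scheme and on the magnitude of the LoRA update. The end-to-end steps with the actual model follow.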

  1. First, load the original model and merge the LoRA params into it:
from diffusers import FluxPipeline
import torch

ckpt_id = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(