sayakpaul / inference.md

Last active February 5, 2025 14:13

(Not so rigorously tested) example showing how to use `bitsandbytes`, `peft`, etc. to LoRA fine-tune Flux.1 Dev.

When loading LoRA params that were obtained on a quantized base model and merging them into the base model, it is recommended to first dequantize the base model, merge the LoRA params into it, and then quantize the merged model again. Merging directly into a 4-bit quantized model can introduce rounding errors. Below, we provide an end-to-end example:
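Before the real pipeline, a toy sketch may help show where the rounding error comes from. It is an illustration under simplifying assumptions, not what `bitsandbytes` does internally: it uses a hypothetical uniform 4-bit grid with a fixed step (`STEP`) instead of NF4, plain Python floats instead of tensors, and random stand-ins for the base weights and the LoRA update. It compares merging the update into already-quantized weights with merging in full precision and quantizing once.

```python
import math
import random

# Assumption: a uniform signed 4-bit grid with a fixed step.
# bitsandbytes uses NF4/FP4 with per-block scales; this is only a toy stand-in.
STEP = 0.1
Q_MIN, Q_MAX = -8, 7


def quantize(x: float) -> float:
    """Round x to the nearest grid point and return the dequantized value."""
    q = max(Q_MIN, min(Q_MAX, round(x / STEP)))
    return q * STEP


def rms(errors: list[float]) -> float:
    """Root-mean-square of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


rng = random.Random(0)
# Hypothetical base weights and a small LoRA update (delta) per weight.
weights = [rng.uniform(-0.6, 0.6) for _ in range(10_000)]
deltas = [rng.gauss(0.0, 0.02) for _ in range(10_000)]
target = [w + d for w, d in zip(weights, deltas)]  # ideal merged weights

# Path A: the base is already quantized; add the update, then requantize.
merged_into_quantized = [quantize(quantize(w) + d) for w, d in zip(weights, deltas)]
# Path B: merge in full precision, then quantize once.
merged_then_quantized = [quantize(t) for t in target]

err_quantized_merge = rms([a - t for a, t in zip(merged_into_quantized, target)])
err_full_precision_merge = rms([b - t for b, t in zip(merged_then_quantized, target)])

print(f"merge into quantized base:       RMS error {err_quantized_merge:.4f}")
print(f"merge in full precision, then Q: RMS error {err_full_precision_merge:.4f}")
```

In this toy setting the update is smaller than the grid step, so adding it to an already-quantized weight mostly rounds back to the same grid point and the update is largely lost. Merging first and quantizing once keeps the error closer to a single quantization step.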

First, load the original model and merge the LoRA params into it:

from diffusers import FluxPipeline 
import torch 

ckpt_id = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(