UPDATE: A faster (20x) approach for running Stable Diffusion using MLIR/Vulkan/IREE is available on Windows:
```
conda create --name sd39 python=3.9 -y
conda activate sd39
pip install diffusers==0.3.0
pip install transformers
pip install onnxruntime
pip install onnx
```
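If you want to sanity-check the environment before moving on, a quick import test (a minimal sketch; the versions printed depend on what pip resolved) is:

```python
# Verify the core packages import cleanly and print their versions.
import diffusers
import onnx
import transformers

print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
print("onnx:", onnx.__version__)
```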
You can download the nightly onnxruntime-directml release from the link below. Run `python --version` to find out which `.whl` file to download:
- If you are on Python 3.7, download the file that ends with `-cp37-cp37m-win_amd64.whl`.
- If you are on Python 3.8, download the file that ends with `-cp38-cp38-win_amd64.whl`.
- and so on for other versions; the snippet below shows a quick way to print your interpreter's tag.
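If you are unsure which tag matches your interpreter, a quick standard-library check (a minimal sketch, not part of the original instructions) is:

```python
# Print the CPython wheel tag of the current interpreter,
# e.g. "cp39" for Python 3.9.
import sys

print(f"cp{sys.version_info.major}{sys.version_info.minor}")
```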
Then install the downloaded wheel (adjust the filename to match your download):

```
pip install ort_nightly_directml-1.13.0.dev20220908001-cp39-cp39-win_amd64.whl --force-reinstall
```
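After installing, you can confirm that the DirectML execution provider is available; `get_available_providers()` is part of the public onnxruntime API:

```python
# "DmlExecutionProvider" should appear in this list for the
# onnxruntime-directml nightly build.
import onnxruntime as ort

print(ort.get_available_providers())
```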
This approach is faster than downloading the ONNX model files.
- Download `diffusers/scripts/convert_stable_diffusion_checkpoint_to_onnx.py` to your working directory. You can use the command below to download the script.

```
wget https://raw.githubusercontent.com/huggingface/diffusers/main/scripts/convert_stable_diffusion_checkpoint_to_onnx.py
```
- Run `huggingface-cli.exe login` and provide your Hugging Face access token.
- Convert the model using the command below. The models are stored in the `stable_diffusion_onnx` folder.
```
python convert_stable_diffusion_checkpoint_to_onnx.py --model_path="CompVis/stable-diffusion-v1-4" --output_path="./stable_diffusion_onnx"
```
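To sanity-check the export, you can list the output folder. The exact subfolder names depend on the diffusers version, so treat this as a quick inspection rather than a guaranteed layout:

```python
# The conversion script writes one subfolder per model component
# (text encoder, UNet, VAE decoder, etc.).
import os

print(sorted(os.listdir("./stable_diffusion_onnx")))
```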
Here is example Python code for the Stable Diffusion pipeline using Hugging Face diffusers:
```python
from diffusers import StableDiffusionOnnxPipeline

# Load the converted ONNX model and run it on the DirectML execution provider.
pipe = StableDiffusionOnnxPipeline.from_pretrained("./stable_diffusion_onnx", provider="DmlExecutionProvider")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")
```
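The pipeline call also accepts the usual generation knobs. The parameters below follow the standard diffusers pipeline signature; the values are illustrative defaults, not tuned settings:

```python
# Fewer denoising steps run faster at some cost in quality;
# guidance_scale controls how strongly the image follows the prompt.
image = pipe(
    prompt,
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
```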
FYI, @harishanand95 is documenting how to use IREE (https://iree-org.github.io/iree/) through the Vulkan API to run Stable Diffusion text->image. We expect to release the instructions next week. In our tests, this alternative toolchain runs >10x faster than ONNX RT->DirectML for text->image, and Nod.ai is also working to support img->img soon. We think the performance difference is partly explained by MLIR and IREE being a compiler toolchain, whereas ORT is more of an interpreter.

If you're interested in learning more and supporting this new code path, please email me at claforte at my employer's domain, or send me a Discord friend invite at claforte (my number is #7115). BTW, I'm also trying to get authorization to reward the most helpful open-source developers with a few Navi2 and Navi3 GPUs (soon after they are officially released). :-)