
@harishanand95
Last active June 6, 2024 08:42
Stable Diffusion on AMD GPUs on Windows using DirectML

UPDATE: A faster (20x) approach for running Stable Diffusion using MLIR/Vulkan/IREE is available on Windows:

https://github.com/nod-ai/SHARK/blob/main/shark/examples/shark_inference/stable_diffusion/stable_diffusion_amd.md

Install 🤗 diffusers

conda create --name sd39 python=3.9 -y
conda activate sd39
pip install diffusers==0.3.0
pip install transformers
pip install onnxruntime
pip install onnx
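
To confirm the environment is set up, a quick version check can be run (a minimal sanity check, not part of the original guide):

import diffusers, transformers, onnxruntime, onnx

# Versions should match the pins above (diffusers 0.3.0 at the time of writing).
print(diffusers.__version__, transformers.__version__, onnxruntime.__version__, onnx.__version__)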

Install the latest onnxruntime-directml release

You can download the nightly onnxruntime-directml release from the link below

Run python --version to find out which .whl file to download.

  • If you are on Python 3.7, download the file that ends with -cp37-cp37m-win_amd64.whl.
  • If you are on Python 3.8, download the file that ends with -cp38-cp38-win_amd64.whl (the ABI "m" suffix was dropped from Python 3.8 onwards).
  • and so on for later versions; the example below uses the Python 3.9 wheel.
pip install ort_nightly_directml-1.13.0.dev20220908001-cp39-cp39-win_amd64.whl --force-reinstall
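
To verify that the DirectML build is the one actually installed, you can list the execution providers onnxruntime reports; "DmlExecutionProvider" is the provider name used later in this guide:

import onnxruntime as ort

# The DirectML wheel should expose "DmlExecutionProvider" in this list.
print(ort.get_available_providers())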

Convert Stable Diffusion model to ONNX format

This approach is faster than downloading the ONNX model files.

wget https://raw.githubusercontent.com/huggingface/diffusers/main/scripts/convert_stable_diffusion_checkpoint_to_onnx.py
  • Run huggingface-cli.exe login and provide your Hugging Face access token.
  • Convert the model using the command below. The model files are stored in the stable_diffusion_onnx folder.
python convert_stable_diffusion_checkpoint_to_onnx.py --model_path="CompVis/stable-diffusion-v1-4" --output_path="./stable_diffusion_onnx"
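
To confirm the export succeeded, the ONNX checker can be pointed at one of the generated files; a sketch, assuming the unet/model.onnx layout the conversion script writes:

import onnx

# Validate the exported UNet graph. Passing a path (rather than a loaded
# ModelProto) lets the checker handle models above the 2 GB protobuf limit.
onnx.checker.check_model("./stable_diffusion_onnx/unet/model.onnx")
print("UNet ONNX export passed the checker")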

Run Stable Diffusion on AMD GPUs

Here is an example Python script for the Stable Diffusion pipeline using the Hugging Face diffusers library.

from diffusers import StableDiffusionOnnxPipeline

# Load the converted model and run it on the DirectML execution provider (AMD GPU).
pipe = StableDiffusionOnnxPipeline.from_pretrained("./stable_diffusion_onnx", provider="DmlExecutionProvider")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")
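
Generation can be tuned through the standard pipeline call arguments, for example (num_inference_steps and guidance_scale are part of the diffusers pipeline API):

# Fewer steps run faster; higher guidance sticks closer to the prompt.
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]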
@averad

averad commented Oct 23, 2022

> I think for now there is no way to use img2img with AMD, I hope soon we can use it. Also if I'm wrong I want to know too :)

@lordzerg @Stable777 An Onnx Img2Img Pipeline has been added in Diffusers 0.6.0
huggingface/diffusers#552
https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/stable_diffusion
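
For later readers, img2img usage looks roughly like the text-to-image example above. A minimal sketch, assuming the diffusers 0.6.x ONNX API (the init_image argument was later renamed to image, and models converted with older scripts may lack the vae_encoder folder img2img needs):

from PIL import Image
from diffusers import OnnxStableDiffusionImg2ImgPipeline

pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained(
    "./stable_diffusion_onnx", provider="DmlExecutionProvider"
)

init = Image.open("input.png").convert("RGB").resize((512, 512))

# strength controls how far the result may drift from the init image
# (0.0 keeps it unchanged, 1.0 ignores it entirely).
result = pipe("a fantasy landscape", init_image=init, strength=0.75).images[0]
result.save("img2img_out.png")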

@averad

averad commented Oct 23, 2022

> Change the scheduler. There is an astype(np.int64) cast in scheduling_pndm.py (line 168) but not in the other schedulers; that's why switching to PNDMScheduler fixes this. Or modify the other schedulers yourself. Have a look here for a more verbose explanation: https://www.travelneil.com/stable-diffusion-updates.html#the-first-thing

If anyone is wondering how to switch to PNDMScheduler for a specific model that is not working (such as the trinart or waifu models): open the model_index.json file (located in the model folder you are trying to use) and edit the scheduler entry.
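
For reference, after that edit the scheduler entry in model_index.json would look something like this ([library, class] pairs are how diffusers records pipeline components; a sketch, so compare against your own file):

"scheduler": [
  "diffusers",
  "PNDMScheduler"
]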

@exaltedb

[screenshot of the conversion error]
I'm having a bit of an issue trying to convert the model. Every time I try to run the command under Python 3.10.8, it fails with an error pointing to line 24 of the .py file. Is there anything I could be doing wrong?

@SpandexWizard

I'm now trying to convert other models I've already downloaded, and the conversion script is yelling at me about invalid repo IDs, but I'm not trying to use a repo. Does anyone know how to point convert_stable_diffusion_checkpoint_to_onnx.py at a downloaded model?
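
One approach that should work: the script's --model_path argument is forwarded to diffusers' from_pretrained, which also accepts a local diffusers-format folder, so a downloaded model can be converted like this (paths here are hypothetical):

python convert_stable_diffusion_checkpoint_to_onnx.py --model_path="./path/to/local_model_folder" --output_path="./local_model_onnx"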

@harishanand95
Author

Unfortunately I don't have time to update the instructions; please follow @averad's instructions for diffusers>=0.6.0. Thanks! https://gist.github.com/averad/256c507baa3dcc9464203dc14610d674

@averad

averad commented Nov 3, 2022

> Unfortunately I don't have time to update the instructions; please follow @averad's instructions for diffusers>=0.6.0. Thanks! https://gist.github.com/averad/256c507baa3dcc9464203dc14610d674

Thank you @harishanand95 for all you and your team at AMD are doing!

@claforte

FYI, @harishanand95 is documenting how to use IREE (https://iree-org.github.io/iree/) through the Vulkan API to run Stable Diffusion text->image. We expect to release the instructions next week. In our tests, this alternative toolchain runs >10X faster than ONNX RT->DirectML for text->image, and Nod.ai is also working to support img->img soon... We think the performance difference is partly explained by MLIR and IREE being a compiler toolchain, compared to ORT, which is more of an interpreter. If you're interested in learning more and supporting this new code path, please email me at claforte at my employer's domain, or send me a Discord friend invite at claforte (my number is #7115). BTW, I'm also trying to get authorization to reward the most helpful open-source developers with a few Navi2 and Navi3 GPUs (soon after they're officially released). :-)

@nomanHasan

Thank you @claforte @harishanand95 for your efforts at making Stable Diffusion more accessible. I run an RX 580 (GFX803), which lost AMD ROCm support long ago. The internet is full of workarounds, but none of them work in my experience. Looking forward to your hard work getting us onto the open-source API method.

@cpietsch

The main issue here is the Windows route. If you use Linux you can even use the go-to Stable Diffusion UI: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs
Still, I would love to see Windows support through the Vulkan API.
If I understand it correctly, we need to convert the SD model to SPIR-V using iree-compiler?
There is an example using SHARK: https://github.com/nod-ai/SHARK/blob/b448770ec26d8b8b0cf332f752915ac39b02d935/shark/examples/shark_inference/stable_diff.py

@nomanHasan

@cpietsch It doesn't work very well on Linux either. The Linux-exclusive ROCm only properly supports AMD's workstation GPUs, and support for consumer GPUs is lagging. You have to follow weird workarounds to get it working on recent cards, and for slightly older cards like GFX803 it turns out to be impossible.

@cpietsch

Oh, sorry about that. It worked out of the box for my Radeon VII, and I thought it was the same for the rest.

@harishanand95
Author

Hello everyone. As Christian mentioned, we have added a new pipeline for AMD GPUs using MLIR/IREE. This approach significantly boosts the performance of running Stable Diffusion in Windows and avoids the current ONNX/DirectML approach.

Instructions: https://github.com/nod-ai/SHARK/blob/main/shark/examples/shark_inference/stable_diffusion/stable_diffusion_amd.md

Please reach out to us via the Discord link on the instructions page, or create a GitHub issue if something does not work for you.

Thanks!

@averad, could you please give it a try and update your instructions too? You can reach us on the Discord channel if you have any questions. Thanks!

@averad

averad commented Dec 1, 2022

@harishanand95 I will give it a try and update the instructions.

@averad

averad commented Dec 2, 2022

@harishanand95 I wasn't able to test the process, as IREE doesn't have support for RX 500 series cards (GCNv3).

I've suggested adding def VK_TTA_RGCNv3 : I32EnumAttrCase<"AMD_RGCNv3", 103, "rgcn3">; and am working on compiling IREE with my suggested changes for testing.

@cpietsch

cpietsch commented Dec 4, 2022

I am getting 3.85 it/s on my 6900 XT with SHARK (Vulkan); that is 13 seconds for 50 iterations.

@phreeware

Hi, the exe doesn't work for me following your little guide (using the MLIR driver on a 6900 XT); I'm getting errors:
[screenshot of error output]

I'll try the manual guide.

@cpietsch

cpietsch commented Dec 4, 2022

For me, the Advanced Installation worked.

@Dwakener

Dwakener commented Mar 2, 2023

What is the generation time?
