@AmosLewis
Last active October 14, 2022 16:54
cmake -GNinja -B ../iree-build/ -S . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_INSTALL_PREFIX=/home/chi/IREE/iree-build/install -DIREE_ENABLE_LLD=ON -DIREE_HAL_DRIVER_VULKAN=ON -DIREE_TARGET_BACKEND_CUDA=OFF -DIREE_TARGET_BACKEND_VULKAN_SPIRV=ON -DIREE_TARGET_BACKEND_OPENCL_SPIRV=ON -DIREE_ENABLE_ASSERTIONS=ON -DIREE_BUILD_PYTHON_BINDINGS=ON -DPython3_EXECUTABLE="$(which python)" -DIREE_ENABLE_RUNTIME_TRACING=ON -DIREE_BYTECODE_MODULE_FORCE_LLVM_SYSTEM_LINKER=ON -DIREE_BUILD_TRACY=ON
iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-vulkan-target-triple=rdna2-6900xt-linux --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 unet_stable_diff_maxf.mlir -o unet_stable_diff_maxf.vmfb
iree-benchmark-module --module_file=unet_stable_diff_maxf.vmfb --entry_function=forward --device=vulkan --function_input="2x64x64x4xf32" --function_input="2x320xf32" --function_input="2x77x768xf32"
@AmosLewis

Shark tuner

(tuner_venv) chi@alderlake:~/IREE$ python shark-tuner/minilm_example.py -model /home/chi/IREE/stable_diff_linalg.mlir -num_iters 100 -result_dir results -device vulkan -search_op conv
The input mlir type is linalg
Found AMD Radeon RX 5000 series device. Using rdna1-5700xt-linux
Searching for [2, 66, 66, 4, 3, 3, 320, 64, 64, 1, 1, 0]
Updated op %2 = linalg.conv_2d_nhwc_hwcf {compilation_info = #iree_codegen.compilation_info<lowering_config = <tile_sizes = [[0, 64, 64, 16], [0, 4, 8, 4], [0, 0, 0, 0, 1, 1, 4], [0, 1, 0, 0]]>, translation_info = <SPIRVVectorize>, workgroup_size = [4, 8, 16]>, dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%arg0, %arg1 : tensor<2x66x66x4xf32>, tensor<3x3x4x320xf32>) outs(%1 : tensor<2x64x64x320xf32>) -> tensor<2x64x64x320xf32>
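
As a sanity check on the numbers in the log above, here is a minimal Python sketch that recomputes the output shape of the `linalg.conv_2d_nhwc_hwcf` op from its operand types, and compares the tile sizes with the reported `workgroup_size`. The helper names are hypothetical and are not part of shark-tuner or IREE. Reading `workgroup_size` as the per-dimension ratio of the first two tile levels, listed innermost dimension first, is an assumption that happens to match this log, not a rule stated in the gist.

```python
def conv2d_nhwc_hwcf_output_shape(input_shape, filter_shape,
                                  strides=(1, 1), dilations=(1, 1)):
    """Output shape of an unpadded NHWC x HWCF 2-D convolution."""
    n, h, w, c = input_shape
    kh, kw, fc, f = filter_shape
    if c != fc:
        raise ValueError(f"channel mismatch: input has {c}, filter has {fc}")
    # Standard "valid" convolution size formula, per spatial dimension.
    oh = (h - dilations[0] * (kh - 1) - 1) // strides[0] + 1
    ow = (w - dilations[1] * (kw - 1) - 1) // strides[1] + 1
    return (n, oh, ow, f)


def threads_per_workgroup(workgroup_tile, thread_tile):
    """Ratio of workgroup tile to thread tile, innermost dimension first.

    Assumption: a 0 entry means the dimension is not tiled at that level,
    so it is skipped.
    """
    counts = [wg // th for wg, th in zip(workgroup_tile, thread_tile)
              if wg and th]
    return counts[::-1]


# Operand types and tile sizes copied from the "Updated op" line.
out_shape = conv2d_nhwc_hwcf_output_shape((2, 66, 66, 4), (3, 3, 4, 320))
wg_size = threads_per_workgroup([0, 64, 64, 16], [0, 4, 8, 4])

print(out_shape)  # (2, 64, 64, 320)
print(wg_size)    # [4, 8, 16]
```

Both results agree with the log: the output tensor is `2x64x64x320`, and the ratio of the first two tile levels reproduces `workgroup_size = [4, 8, 16]`.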
