Created March 17, 2025 01:17
# Download the issue1.sh test script with the following wget command:
# wget https://gist.githubusercontent.com/AmosLewis/00fdb4e9a96f29c188828e3ff4ea29ef/raw/8377ffd2f53e58e12a9adae7b92b3d5a7f35d98b/bisect-issue1.sh
(bisect.venv) ➜ bisect git:(main) ✗ python ./bisect_packages.py \
    --good-ref=00e88733e6b8c8cdb351d4516509f56daebdf604 \
    --bad-ref=4451b8ba42b1249eb3ed1b5031a46f51733984ed \
    --test-script=/sharedfile/attn/bisect/issue1.sh
Welcome to bisect_packages.py! | |
------------------------------------------------------------------ | |
--------- Configuration ------------------------------------------ | |
------------------------------------------------------------------ | |
Searching range : '00e88733e6b8c8cdb351d4516509f56daebdf604' - '4451b8ba42b1249eb3ed1b5031a46f51733984ed' | |
Using working directory : '/home/chi/.iree/bisect' | |
Using test script : '/sharedfile/attn/bisect/issue1.sh' | |
Current platform is 'Linux-6.8.0-52-generic-x86_64-with-glibc2.35', platform.system is 'Linux'. | |
Current Python version is '3.11.1 (main, Oct 7 2024, 06:16:08) [GCC 11.4.0]'. This script requires 3.11. | |
Found gh at '/usr/bin/gh'. | |
------------------------------------------------------------------ | |
------------------------------------------------------------------ | |
--------- Running git bisect ------------------------------------- | |
------------------------------------------------------------------ | |
Bisecting: 3 revisions left to test after this (roughly 2 steps) | |
[3bfd3628d03587d25fe7f5de126cbb672fd0d71f] No `extern "C"` main functions (#20223) | |
running '/home/chi/.iree/bisect/bisect_run_script.sh' | |
++ git rev-parse BISECT_HEAD | |
+ REF_HASH=3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
+ python /home/chi/src/iree/build_tools/pkgci/bisect/../setup_venv.py /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv --artifact-path=/home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f --fetch-git-ref=3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Finding workflow run for ref: 3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Using normalized ref: 3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Running command to list workflow runs: | |
gh api -H Accept: application/vnd.github+json -H X-GitHub-Api-Version: 2022-11-28 /repos/iree-org/iree/actions/workflows/pkgci.yml/runs?head_sha=3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Found workflow run: https://github.com/iree-org/iree/actions/runs/13818437894 | |
Package iree-base-compiler not found in cache. Fetching from linux_x86_64_release_packages... | |
Fetching artifacts for workflow run: 13818437894 | |
Found artifacts: | |
linux_x86_64_release_packages: /repos/iree-org/iree/actions/artifacts/2739977017/zip | |
Downloading artifact /repos/iree-org/iree/actions/artifacts/2739977017/zip | |
Extracting /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/linux_x86_64_release_packages.zip | |
Installing wheels: [(PosixPath('/home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f'), 'iree-base-compiler'), (PosixPath('/home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f'), 'iree-base-runtime')] | |
Creating venv at /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv | |
Running command: /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f --force-reinstall iree-base-compiler | |
Looking in links: /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Processing /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/iree_base_compiler-3.3.0.dev0+3bfd3628d03587d25fe7f5de126cbb672fd0d71f-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-compiler | |
Successfully installed iree-base-compiler-3.3.0.dev0+3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Running command: /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f --force-reinstall iree-base-runtime | |
Looking in links: /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
Processing /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/iree_base_runtime-3.3.0.dev0+3bfd3628d03587d25fe7f5de126cbb672fd0d71f-cp311-cp311-manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-runtime | |
Successfully installed iree-base-runtime-3.3.0.dev0+3bfd3628d03587d25fe7f5de126cbb672fd0d71f | |
venv setup complete at '/home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv'. Activate it with | |
source /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv/bin/activate | |
+ PATH=/home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv/bin:/usr/lib/git-core:/usr/lib/git-core:/sharedfile/bisect.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin | |
+ set +e | |
+ iree-compile /sharedfile/attn/128/fp8_attn.mlir --iree-hip-target=gfx942 -o=/sharedfile/attn/128/fp8_attn.vmfb --iree-hal-target-device=hip --iree-dispatch-creation-enable-aggressive-fusion=true --iree-global-opt-propagate-transposes=true --iree-opt-aggressively-propagate-transposes=true --iree-opt-data-tiling=false '--iree-preprocessing-pass-pipeline=builtin.module(util.func(iree-preprocessing-generalize-linalg-matmul-experimental))' --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hal-memoization=true --iree-opt-strip-assertions | |
+ ROCR_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 | |
+ iree-benchmark-module --hip_use_streams=true --module=/sharedfile/attn/128/fp8_attn.vmfb --parameters=model=/sharedfile/attn/fp8_attn.irpa --device=hip://4 --function=prefill_bs4 --input=4x128xi64=@/sharedfile/128/prefill/prefill_token_ids_4x128xi64.bin --input=4xi64=@/sharedfile/128/prefill/prefill_seq_lens_4xi64.bin --input=4x4xi64=@/sharedfile/128/prefill/prefill_seq_block_ids_4x4xi64.bin --input=261x2097152xf8E4M3FNUZ=@/sharedfile/128/prefill/prefill_cache_state_261x2097152xf8E4M3FNUZ.bin --benchmark_repetitions=3 | |
2025-03-16T18:15:20-07:00 | |
Running /home/chi/.iree/bisect/3bfd3628d03587d25fe7f5de126cbb672fd0d71f/.venv/lib/python3.11/site-packages/iree/_runtime_libs/iree-benchmark-module | |
Run on (96 X 3810.79 MHz CPU s) | |
CPU Caches: | |
L1 Data 32 KiB (x96) | |
L1 Instruction 32 KiB (x96) | |
L2 Unified 1024 KiB (x96) | |
L3 Unified 32768 KiB (x16) | |
Load Average: 3.52, 2.79, 2.94 | |
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. | |
-------------------------------------------------------------------------------------------------------
Benchmark                                           Time          CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------
BM_prefill_bs4/process_time/real_time            29.9 ms      30.7 ms           24 items_per_second=33.4553/s
BM_prefill_bs4/process_time/real_time            29.9 ms      30.8 ms           24 items_per_second=33.4601/s
BM_prefill_bs4/process_time/real_time            29.8 ms      30.7 ms           24 items_per_second=33.5245/s
BM_prefill_bs4/process_time/real_time_mean       29.9 ms      30.7 ms            3 items_per_second=33.48/s
BM_prefill_bs4/process_time/real_time_median     29.9 ms      30.7 ms            3 items_per_second=33.4601/s
BM_prefill_bs4/process_time/real_time_stddev    0.034 ms     0.062 ms            3 items_per_second=0.0386615/s
BM_prefill_bs4/process_time/real_time_cv          0.12 %       0.20 %            3 items_per_second=0.12%
(bisect.venv) ➜ bisect git:(main) ✗ python ./bisect_packages.py \
    --good-ref=00e88733e6b8c8cdb351d4516509f56daebdf604 \
    --bad-ref=3bfd3628d03587d25fe7f5de126cbb672fd0d71f \
    --test-script=/sharedfile/attn/bisect/issue1.sh
Welcome to bisect_packages.py! | |
------------------------------------------------------------------ | |
--------- Configuration ------------------------------------------ | |
------------------------------------------------------------------ | |
Searching range : '00e88733e6b8c8cdb351d4516509f56daebdf604' - '3bfd3628d03587d25fe7f5de126cbb672fd0d71f' | |
Using working directory : '/home/chi/.iree/bisect' | |
Using test script : '/sharedfile/attn/bisect/issue1.sh' | |
Current platform is 'Linux-6.8.0-52-generic-x86_64-with-glibc2.35', platform.system is 'Linux'. | |
Current Python version is '3.11.1 (main, Oct 7 2024, 06:16:08) [GCC 11.4.0]'. This script requires 3.11. | |
Found gh at '/usr/bin/gh'. | |
------------------------------------------------------------------ | |
------------------------------------------------------------------ | |
--------- Running git bisect ------------------------------------- | |
------------------------------------------------------------------ | |
Bisecting: 0 revisions left to test after this (roughly 1 step) | |
[db5b69aab08593e215e4c999854fa633bcedb346] [AMDGPU] Do not rewrite or approximate math functions on ROCm (#20222) | |
running '/home/chi/.iree/bisect/bisect_run_script.sh' | |
++ git rev-parse BISECT_HEAD | |
+ REF_HASH=db5b69aab08593e215e4c999854fa633bcedb346 | |
+ python /home/chi/src/iree/build_tools/pkgci/bisect/../setup_venv.py /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv --artifact-path=/home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346 --fetch-git-ref=db5b69aab08593e215e4c999854fa633bcedb346 | |
Finding workflow run for ref: db5b69aab08593e215e4c999854fa633bcedb346 | |
Using normalized ref: db5b69aab08593e215e4c999854fa633bcedb346 | |
Running command to list workflow runs: | |
gh api -H Accept: application/vnd.github+json -H X-GitHub-Api-Version: 2022-11-28 /repos/iree-org/iree/actions/workflows/pkgci.yml/runs?head_sha=db5b69aab08593e215e4c999854fa633bcedb346 | |
Found workflow run: https://github.com/iree-org/iree/actions/runs/13816514809 | |
Package iree-base-compiler not found in cache. Fetching from linux_x86_64_release_packages... | |
Fetching artifacts for workflow run: 13816514809 | |
Found artifacts: | |
linux_x86_64_release_packages: /repos/iree-org/iree/actions/artifacts/2739270803/zip | |
Downloading artifact /repos/iree-org/iree/actions/artifacts/2739270803/zip | |
Extracting /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/linux_x86_64_release_packages.zip | |
Installing wheels: [(PosixPath('/home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346'), 'iree-base-compiler'), (PosixPath('/home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346'), 'iree-base-runtime')] | |
Creating venv at /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv | |
Running command: /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346 --force-reinstall iree-base-compiler | |
Looking in links: /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346 | |
Processing /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/iree_base_compiler-3.3.0.dev0+db5b69aab08593e215e4c999854fa633bcedb346-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-compiler | |
Successfully installed iree-base-compiler-3.3.0.dev0+db5b69aab08593e215e4c999854fa633bcedb346 | |
Running command: /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346 --force-reinstall iree-base-runtime | |
Looking in links: /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346 | |
Processing /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/iree_base_runtime-3.3.0.dev0+db5b69aab08593e215e4c999854fa633bcedb346-cp311-cp311-manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-runtime | |
Successfully installed iree-base-runtime-3.3.0.dev0+db5b69aab08593e215e4c999854fa633bcedb346 | |
venv setup complete at '/home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv'. Activate it with | |
source /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv/bin/activate | |
+ PATH=/home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv/bin:/usr/lib/git-core:/usr/lib/git-core:/sharedfile/bisect.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin | |
+ set +e | |
+ iree-compile /sharedfile/attn/128/fp8_attn.mlir --iree-hip-target=gfx942 -o=/sharedfile/attn/128/fp8_attn.vmfb --iree-hal-target-device=hip --iree-dispatch-creation-enable-aggressive-fusion=true --iree-global-opt-propagate-transposes=true --iree-opt-aggressively-propagate-transposes=true --iree-opt-data-tiling=false '--iree-preprocessing-pass-pipeline=builtin.module(util.func(iree-preprocessing-generalize-linalg-matmul-experimental))' --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hal-memoization=true --iree-opt-strip-assertions | |
+ ROCR_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 | |
+ iree-benchmark-module --hip_use_streams=true --module=/sharedfile/attn/128/fp8_attn.vmfb --parameters=model=/sharedfile/attn/fp8_attn.irpa --device=hip://4 --function=prefill_bs4 --input=4x128xi64=@/sharedfile/128/prefill/prefill_token_ids_4x128xi64.bin --input=4xi64=@/sharedfile/128/prefill/prefill_seq_lens_4xi64.bin --input=4x4xi64=@/sharedfile/128/prefill/prefill_seq_block_ids_4x4xi64.bin --input=261x2097152xf8E4M3FNUZ=@/sharedfile/128/prefill/prefill_cache_state_261x2097152xf8E4M3FNUZ.bin --benchmark_repetitions=3 | |
2025-03-16T18:18:24-07:00 | |
Running /home/chi/.iree/bisect/db5b69aab08593e215e4c999854fa633bcedb346/.venv/lib/python3.11/site-packages/iree/_runtime_libs/iree-benchmark-module | |
Run on (96 X 3810.79 MHz CPU s) | |
CPU Caches: | |
L1 Data 32 KiB (x96) | |
L1 Instruction 32 KiB (x96) | |
L2 Unified 1024 KiB (x96) | |
L3 Unified 32768 KiB (x16) | |
Load Average: 4.70, 3.19, 3.04 | |
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. | |
-------------------------------------------------------------------------------------------------------
Benchmark                                           Time          CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------
BM_prefill_bs4/process_time/real_time            29.7 ms      30.4 ms           24 items_per_second=33.6578/s
BM_prefill_bs4/process_time/real_time            29.8 ms      30.4 ms           24 items_per_second=33.612/s
BM_prefill_bs4/process_time/real_time            29.8 ms      30.3 ms           24 items_per_second=33.5952/s
BM_prefill_bs4/process_time/real_time_mean       29.7 ms      30.4 ms            3 items_per_second=33.6217/s
BM_prefill_bs4/process_time/real_time_median     29.8 ms      30.4 ms            3 items_per_second=33.612/s
BM_prefill_bs4/process_time/real_time_stddev    0.029 ms     0.032 ms            3 items_per_second=0.0324114/s
BM_prefill_bs4/process_time/real_time_cv          0.10 %       0.11 %            3 items_per_second=0.10%
(bisect.venv) ➜ bisect git:(main) ✗ python ./bisect_packages.py \
    --good-ref=00e88733e6b8c8cdb351d4516509f56daebdf604 \
    --bad-ref=db5b69aab08593e215e4c999854fa633bcedb346 \
    --test-script=/sharedfile/attn/bisect/issue1.sh
Welcome to bisect_packages.py! | |
------------------------------------------------------------------ | |
--------- Configuration ------------------------------------------ | |
------------------------------------------------------------------ | |
Searching range : '00e88733e6b8c8cdb351d4516509f56daebdf604' - 'db5b69aab08593e215e4c999854fa633bcedb346' | |
Using working directory : '/home/chi/.iree/bisect' | |
Using test script : '/sharedfile/attn/bisect/issue1.sh' | |
Current platform is 'Linux-6.8.0-52-generic-x86_64-with-glibc2.35', platform.system is 'Linux'. | |
Current Python version is '3.11.1 (main, Oct 7 2024, 06:16:08) [GCC 11.4.0]'. This script requires 3.11. | |
Found gh at '/usr/bin/gh'. | |
------------------------------------------------------------------ | |
------------------------------------------------------------------ | |
--------- Running git bisect ------------------------------------- | |
------------------------------------------------------------------ | |
Bisecting: 0 revisions left to test after this (roughly 0 steps) | |
[9d693cb25223d0520007d97614412e5d7a9f3606] [Codegen][Tuner] improve verifier for the default attribute (#20173) | |
running '/home/chi/.iree/bisect/bisect_run_script.sh' | |
++ git rev-parse BISECT_HEAD | |
+ REF_HASH=9d693cb25223d0520007d97614412e5d7a9f3606 | |
+ python /home/chi/src/iree/build_tools/pkgci/bisect/../setup_venv.py /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv --artifact-path=/home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606 --fetch-git-ref=9d693cb25223d0520007d97614412e5d7a9f3606 | |
Finding workflow run for ref: 9d693cb25223d0520007d97614412e5d7a9f3606 | |
Using normalized ref: 9d693cb25223d0520007d97614412e5d7a9f3606 | |
Running command to list workflow runs: | |
gh api -H Accept: application/vnd.github+json -H X-GitHub-Api-Version: 2022-11-28 /repos/iree-org/iree/actions/workflows/pkgci.yml/runs?head_sha=9d693cb25223d0520007d97614412e5d7a9f3606 | |
Found workflow run: https://github.com/iree-org/iree/actions/runs/13815044240 | |
Package iree-base-compiler not found in cache. Fetching from linux_x86_64_release_packages... | |
Fetching artifacts for workflow run: 13815044240 | |
Found artifacts: | |
linux_x86_64_release_packages: /repos/iree-org/iree/actions/artifacts/2738691432/zip | |
Downloading artifact /repos/iree-org/iree/actions/artifacts/2738691432/zip | |
Extracting /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/linux_x86_64_release_packages.zip | |
Installing wheels: [(PosixPath('/home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606'), 'iree-base-compiler'), (PosixPath('/home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606'), 'iree-base-runtime')] | |
Creating venv at /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv | |
Running command: /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606 --force-reinstall iree-base-compiler | |
Looking in links: /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606 | |
Processing /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/iree_base_compiler-3.3.0.dev0+9d693cb25223d0520007d97614412e5d7a9f3606-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-compiler | |
Successfully installed iree-base-compiler-3.3.0.dev0+9d693cb25223d0520007d97614412e5d7a9f3606 | |
Running command: /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv/bin/python -m pip install --no-deps --no-index -f /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606 --force-reinstall iree-base-runtime | |
Looking in links: /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606 | |
Processing /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/iree_base_runtime-3.3.0.dev0+9d693cb25223d0520007d97614412e5d7a9f3606-cp311-cp311-manylinux_2_28_x86_64.whl | |
Installing collected packages: iree-base-runtime | |
Successfully installed iree-base-runtime-3.3.0.dev0+9d693cb25223d0520007d97614412e5d7a9f3606 | |
venv setup complete at '/home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv'. Activate it with | |
source /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv/bin/activate | |
+ PATH=/home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv/bin:/usr/lib/git-core:/usr/lib/git-core:/sharedfile/bisect.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin | |
+ set +e | |
+ iree-compile /sharedfile/attn/128/fp8_attn.mlir --iree-hip-target=gfx942 -o=/sharedfile/attn/128/fp8_attn.vmfb --iree-hal-target-device=hip --iree-dispatch-creation-enable-aggressive-fusion=true --iree-global-opt-propagate-transposes=true --iree-opt-aggressively-propagate-transposes=true --iree-opt-data-tiling=false '--iree-preprocessing-pass-pipeline=builtin.module(util.func(iree-preprocessing-generalize-linalg-matmul-experimental))' --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hal-memoization=true --iree-opt-strip-assertions | |
+ ROCR_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 | |
+ iree-benchmark-module --hip_use_streams=true --module=/sharedfile/attn/128/fp8_attn.vmfb --parameters=model=/sharedfile/attn/fp8_attn.irpa --device=hip://4 --function=prefill_bs4 --input=4x128xi64=@/sharedfile/128/prefill/prefill_token_ids_4x128xi64.bin --input=4xi64=@/sharedfile/128/prefill/prefill_seq_lens_4xi64.bin --input=4x4xi64=@/sharedfile/128/prefill/prefill_seq_block_ids_4x4xi64.bin --input=261x2097152xf8E4M3FNUZ=@/sharedfile/128/prefill/prefill_cache_state_261x2097152xf8E4M3FNUZ.bin --benchmark_repetitions=3 | |
2025-03-16T18:19:47-07:00 | |
Running /home/chi/.iree/bisect/9d693cb25223d0520007d97614412e5d7a9f3606/.venv/lib/python3.11/site-packages/iree/_runtime_libs/iree-benchmark-module | |
Run on (96 X 3810.79 MHz CPU s) | |
CPU Caches: | |
L1 Data 32 KiB (x96) | |
L1 Instruction 32 KiB (x96) | |
L2 Unified 1024 KiB (x96) | |
L3 Unified 32768 KiB (x16) | |
Load Average: 3.79, 3.15, 3.04 | |
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. | |
-------------------------------------------------------------------------------------------------------
Benchmark                                           Time          CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------
BM_prefill_bs4/process_time/real_time            27.5 ms      28.1 ms           26 items_per_second=36.4008/s
BM_prefill_bs4/process_time/real_time            27.5 ms      28.1 ms           26 items_per_second=36.3567/s
BM_prefill_bs4/process_time/real_time            27.5 ms      28.1 ms           26 items_per_second=36.3624/s
BM_prefill_bs4/process_time/real_time_mean       27.5 ms      28.1 ms            3 items_per_second=36.3733/s
BM_prefill_bs4/process_time/real_time_median     27.5 ms      28.1 ms            3 items_per_second=36.3624/s
BM_prefill_bs4/process_time/real_time_stddev    0.018 ms     0.003 ms            3 items_per_second=0.023992/s
BM_prefill_bs4/process_time/real_time_cv          0.07 %       0.01 %            3 items_per_second=0.07%
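
# The three runs above report median real_time values of 29.9 ms (3bfd3628),
# 29.8 ms (db5b69aa) and 27.5 ms (9d693cb2). The sketch below is not part of
# bisect_packages.py or issue1.sh. It is a hypothetical helper that only does the
# arithmetic on those medians, assuming 9d693cb2 (the fastest run) is taken as the
# baseline. The names MEDIANS_MS, slowdown_pct and summarize are made up for
# illustration.

```python
# Median real_time in ms per tested commit, copied from the benchmark tables in this log.
MEDIANS_MS = {
    "3bfd3628d035": 29.9,  # No `extern "C"` main functions (#20223)
    "db5b69aab085": 29.8,  # [AMDGPU] Do not rewrite or approximate math functions on ROCm (#20222)
    "9d693cb25223": 27.5,  # [Codegen][Tuner] improve verifier for the default attribute (#20173)
}


def slowdown_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Relative slowdown of candidate vs. baseline in percent (positive means slower)."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


def summarize(medians_ms: dict[str, float], baseline_ref: str) -> list[str]:
    """Format one line per commit with its median and slowdown against the baseline."""
    baseline = medians_ms[baseline_ref]
    return [
        f"{ref}: {ms:.1f} ms ({slowdown_pct(baseline, ms):+.1f}%)"
        for ref, ms in medians_ms.items()
    ]


# Assumption: 9d693cb25223 is used as the baseline because it has the lowest median.
for line in summarize(MEDIANS_MS, "9d693cb25223"):
    print(line)
```

# With these inputs, both db5b69aa and 3bfd3628 come out roughly 8 to 9 percent
# slower than 9d693cb2, far more than the 0.07 to 0.12 % run-to-run variation (cv)
# reported in the tables.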