@AmosLewis
Last active September 20, 2025 01:08
Docker image (ROCm 7.0.0): FROM rocm/7.0:rocm7.0_ubuntu_22.04_sgl-dev-v0.5.2-rocm7.0-mi35x-20250915
export HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export ROCR_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
rocprofv3 --output-format pftrace -r -- python3 -u harness_alt_mi355.py \
  --devices "0,1,2,3,4,5,6,7" --scenario "$TEST_SCENARIO" \
  --test_mode "$TEST_MODE" \
  --bs 2 \
  --user_conf_path user.conf \
  --count 8 \
  --tensor_path /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl \
  --logfile_outdir "Output${TEST_SCENARIO}${TEST_MODE}" \
  --debug "$DEBUG" \
  --verbose "$VERBOSE" \
  --user_conf_path "user.conf" \
  --shortfin_config "$SHORTFIN_CONFIG" 2>&1 | tee server.log
W20250919 21:42:48.188076 139723448015616 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.001704 sec
W20250919 21:42:48.188214 139723448015616 simple_timer.cpp:55] [rocprofv3] 'python3 -u harness_alt_mi355.py --devices 0,1,2,3,4,5,6,7 --scenario Offline --test_mode PerformanceOnly --bs 2 --user_conf_path user.conf --count 8 --tensor_path /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl --logfile_outdir OutputOfflinePerformanceOnly --debug False --verbose False --user_conf_path user.conf --shortfin_config shortfin_405b_config_fp4.json' :: 0.000000 sec
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:48.667214
INFO:root:####################################################################################################################################################################################
Running python3 harness_alt_mi355.py --devices 0,1,2,3,4,5,6,7 --scenario Offline --test_mode PerformanceOnly --bs 2 --user_conf_path user.conf --count 8 --tensor_path /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl --logfile_outdir OutputOfflinePerformanceOnly --debug False --verbose False --user_conf_path user.conf --shortfin_config shortfin_405b_config_fp4.json
##############################################################################################################################################################################################
WARNING:root:Override count with 8
INFO:Llama-405B-Dataset:Loading dataset...
INFO:Llama-405B-Dataset:Finished loading dataset.
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-2 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-3 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-4 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-5 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-6 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-7 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-8 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
Attempt to enable hip visiblity for agent-9 which is not visible to HSA (ROCR)
W20250919 21:42:50.824241 139888629246720 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.003257 sec
W20250919 21:42:50.824388 139888629246720 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=26) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.845466 139960833170176 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002975 sec
W20250919 21:42:50.845575 139960833170176 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.resource_tracker import main;main(5)' :: 0.000000 sec
W20250919 21:42:50.863588 140675557010176 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002736 sec
W20250919 21:42:50.863708 140675557010176 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=12) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.868236 140453796326144 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002719 sec
W20250919 21:42:50.868367 140453796326144 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=18) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.872237 140206473005824 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002562 sec
W20250919 21:42:50.872359 140206473005824 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=20) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.873099 140255432387328 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002312 sec
W20250919 21:42:50.873222 140255432387328 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=14) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.873718 140157087640320 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002396 sec
W20250919 21:42:50.873837 140157087640320 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=22) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.874011 140638869089024 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002491 sec
W20250919 21:42:50.874129 140638869089024 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=16) --multiprocessing-fork' :: 0.000000 sec
W20250919 21:42:50.874360 140467879783168 simple_timer.cpp:55] [rocprofv3] tool initialization :: 0.002387 sec
W20250919 21:42:50.874478 140467879783168 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=24) --multiprocessing-fork' :: 0.000000 sec
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.332266
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.336429
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.345785
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.354247
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.356987
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.356988
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.357025
INFO:shortfin_apps.llm.components.service_debug_dumper:[debug_service.py] Please find debug dumps for service.py in /root/.shortfin/debug/llm_service_invocation_dumps/2025-09-19T21:42:51.379456
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [0]
W20250919 21:42:52.627373 140675557010176 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.628025 140675557010176 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [0]
W20250919 21:42:52.702584 140453796326144 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.703280 140453796326144 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [1]
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:Nearest nodes = [1]
INFO:root:GPU: 0
INFO:root:Nearest nodes = [0]
W20250919 21:42:52.751642 140467879783168 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.752011 140467879783168 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
W20250919 21:42:52.754816 140157087640320 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.755526 140157087640320 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
W20250919 21:42:52.756374 140638869089024 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.756986 140638869089024 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [0]
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [1]
INFO:root:NUMA hardware info: {'numa_node_distance': [[10, 32], [32, 10]], 'node_cpu_info': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191], 1: [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]}}
INFO:root:GPU: 0
INFO:root:Nearest nodes = [1]
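
The `node_cpu_info` lists in the NUMA lines above are long but regular. As a reading aid, a minimal sketch that collapses each list into ranges is shown here. The helper name `to_ranges` is hypothetical, and the dictionary is rebuilt with `range()` to match the CPU ids printed in the log rather than parsed from it.

```python
def to_ranges(cpus):
    """Collapse a sorted list of CPU ids into a compact 'a-b,c-d' string."""
    ranges = []
    start = prev = cpus[0]
    for cpu in cpus[1:]:
        if cpu != prev + 1:          # gap found: close the current run
            ranges.append((start, prev))
            start = cpu
        prev = cpu
    ranges.append((start, prev))     # close the final run
    return ",".join(f"{a}-{b}" if a != b else str(a) for a, b in ranges)

# Same CPU ids as in the log, written compactly for illustration.
node_cpu_info = {
    0: list(range(0, 64)) + list(range(128, 192)),
    1: list(range(64, 128)) + list(range(192, 256)),
}

for node, cpus in node_cpu_info.items():
    print(f"node {node}: {to_ranges(cpus)} ({len(cpus)} CPUs)")
# node 0: 0-63,128-191 (128 CPUs)
# node 1: 64-127,192-255 (128 CPUs)
```
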
W20250919 21:42:52.766714 140206473005824 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.766890 140255432387328 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.767343 139888629246720 tool.cpp:2158] RCCL version 2.26.6 initialized (instance=0)
W20250919 21:42:52.767427 140206473005824 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
W20250919 21:42:52.767522 140255432387328 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
W20250919 21:42:52.768062 139888629246720 tool.cpp:2158] HIP (runtime) version 7.0.0 initialized (instance=0)
W20250919 21:42:53.295138 140675557010176 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
W20250919 21:42:53.475757 140453796326144 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
W20250919 21:42:53.612142 140638869089024 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
W20250919 21:42:53.642388 140157087640320 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
W20250919 21:42:53.642589 139888629246720 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
W20250919 21:42:53.644369 140467879783168 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
W20250919 21:42:53.648671 140255432387328 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
W20250919 21:42:53.650420 140206473005824 tool.cpp:2158] HSA version 8.18.0 initialized (instance=0)
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:shortfin_apps.llm.components.manager:Created local system with ['amdgpu:0:0@0'] devices
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
INFO:root:Allocating page table (shape=[2816, 8257536], dtype=float8_e4m3fn, size=21.7GiB) on DeviceAffinity(amdgpu:0:0@0[0x1])
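
The 21.7GiB figure in the page-table lines above can be checked from the logged shape and dtype. This is a small sketch, assuming one byte per element for `float8_e4m3fn` (an 8-bit format). The function name `page_table_gib` is made up for illustration and is not part of shortfin.

```python
# Shape and dtype are copied from the "Allocating page table" log lines.
shape = (2816, 8257536)
bytes_per_element = 1  # assumption: float8_e4m3fn occupies 8 bits

def page_table_gib(shape, bytes_per_element):
    """Return the allocation size in GiB for a 2-D table."""
    rows, cols = shape
    return rows * cols * bytes_per_element / 2**30

size_gib = page_table_gib(shape, bytes_per_element)
print(f"{size_gib:.1f}GiB")  # 21.7GiB
```

Eight worker processes each log this allocation, so under the same assumption the run reserves about 8 x 21.7 GiB of page table in total.
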
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Loading parameter fiber 'model' from: /shark-dev/weights/fp4/fp4_2025_07_10_fn.irpa
INFO:shortfin_apps.llm.components.manager:Starting system manager
INFO:root:Start Test!
INFO:micro_llama_process_samples:SampleResponder-2 Sending response
INFO:micro_llama_process_samples:SampleResponder-2 end time: 620706.600690797
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-1 Sending response
INFO:micro_llama_process_samples:SampleResponder-1 end time: 620757.741904219
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-2 Sending response
INFO:micro_llama_process_samples:SampleResponder-2 end time: 620763.240478815
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-1 Sending response
INFO:micro_llama_process_samples:SampleResponder-1 end time: 620773.000147627
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-1 Sending response
INFO:micro_llama_process_samples:SampleResponder-1 end time: 620776.558104303
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-2 Sending response
INFO:micro_llama_process_samples:SampleResponder-2 end time: 620781.137441625
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-2 Sending response
INFO:micro_llama_process_samples:SampleResponder-2 end time: 621028.074139599
ALLCLOSE: True
INFO:micro_llama_process_samples:SampleResponder-1 Sending response
INFO:micro_llama_process_samples:SampleResponder-1 end time: 621290.426896844
ALLCLOSE: True
================================================
MLPerf Results Summary
================================================
SUT name : PySUT
Scenario : Offline
Mode : PerformanceOnly
Samples per second: 0.0115753
Tokens per second: 9.28485
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
Early stopping satisfied: Yes
================================================
Additional Stats
================================================
Min latency (ns) : 107297600588
Max latency (ns) : 691125557261
Mean latency (ns) : 260294441263
50.00 percentile latency (ns) : 177254424397
90.00 percentile latency (ns) : 691125557261
95.00 percentile latency (ns) : 691125557261
97.00 percentile latency (ns) : 691125557261
99.00 percentile latency (ns) : 691125557261
99.90 percentile latency (ns) : 691125557261
================================================
Test Parameters Used
================================================
samples_per_query : 8
target_qps : 1
ttft_latency (ns): 6000000000
tpot_latency (ns): 175000000
max_async_queries : 1
min_duration (ms): 6000
max_duration (ms): 0
min_query_count : 1
max_query_count : 8
qsl_rng_seed : 1780908523862526354
sample_index_rng_seed : 14771362308971278857
schedule_rng_seed : 18209322760996052031
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : 0
performance_issue_unique : 0
performance_issue_same : 0
performance_issue_same_index : 0
performance_sample_count : 8313
WARNING: sample_concatenate_permutation was set to true.
Generated samples per query might be different as the one in the setting.
Check the generated_samples_per_query line in the detailed log for the real
samples_per_query value
No warnings encountered during test.
No errors encountered during test.
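
The latencies in the summary above are in nanoseconds, which makes them hard to read. The sketch below converts them to seconds and checks that the reported throughput is consistent with 8 samples divided by the max latency. That relationship is an observation from these numbers, not a statement of how LoadGen computes the figure, and the implied token count is likewise only an estimate derived from the reported rate.

```python
NS_PER_S = 1_000_000_000

# Values copied from the MLPerf Results Summary above.
samples = 8
min_latency_ns = 107297600588
mean_latency_ns = 260294441263
max_latency_ns = 691125557261
reported_samples_per_s = 0.0115753
reported_tokens_per_s = 9.28485

max_latency_s = max_latency_ns / NS_PER_S
samples_per_s = samples / max_latency_s

# Estimate only: total tokens implied by the reported token rate.
implied_tokens = reported_tokens_per_s * max_latency_s

print(f"min  latency: {min_latency_ns / NS_PER_S:.1f} s")
print(f"mean latency: {mean_latency_ns / NS_PER_S:.1f} s")
print(f"max  latency: {max_latency_s:.1f} s")
print(f"samples/s from max latency: {samples_per_s:.7f}")
print(f"implied total tokens: {implied_tokens:.0f}")
```
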
INFO:root:[Server] Received 8 total samples
W20250919 21:55:21.036105 140675557010176 tool.cpp:2784] [PPID=68][PID=188][TID=188][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 21:55:21.036254 140675557010176 tool.cpp:2807] [PPID=68][PID=188][TID=188][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 21:55:21.036261 140675557010176 tool.cpp:2822] [PPID=68][PID=188][TID=188][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[320.801] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:04:37.284665 140675557010176 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/188_results.pftrace
[900.052] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:05:01.467125 140675557010176 simple_timer.cpp:55] [rocprofv3] tool finalization :: 580.152169 sec
W20250919 22:05:01.469621 140675557010176 tool.cpp:2863] [PPID=68][PID=188][TID=188][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758319501 (unix time) try "date -d @1758319501" if you are using GNU date ***
PC: @ 0x7ff19f4fe117 (unknown)
*** SIGTERM (@0x44) received by PID 188 (TID 0x7ff1949aab00) from PID 68; stack trace: ***
@ 0x7ff19f506ee8 (unknown)
@ 0x7ff1a08a147e (unknown)
@ 0x7ff1a054f1b7 (unknown)
@ 0x7ff19f506ee8 (unknown)
@ 0x7ff1a054865c (unknown)
@ 0x7ff19f4af520 (unknown)
@ 0x7ff19f4fe117 (unknown)
@ 0x7ff19f509c78 (unknown)
@ 0x7ff19f967cb5 PyThread_acquire_lock_timed
@ 0x7ff19f9d6830 acquire_timed
@ 0x7ff19f9d832c lock_PyThread_acquire_lock
@ 0x7ff19f81cc22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7ff19f80fec8 PyObject_Vectorcall
@ 0x7ff19f7b25df _PyEval_EvalFrameDefault
@ 0x7ff19f90ae7f PyEval_EvalCode
@ 0x7ff19f9564dd run_mod
@ 0x7ff19f9584bf PyRun_StringFlags
@ 0x7ff19f95852d PyRun_SimpleStringFlags
@ 0x7ff19f97859c Py_RunMain
@ 0x7ff19f9790be Py_BytesMain
@ 0x7ff1a054f6c6 (unknown)
@ 0x7ff19f496d90 (unknown)
@ 0x7ff19f496e40 __libc_start_main
@ 0x5618f04d5095 _start
W20250919 22:05:01.781781 140675557010176 tool.cpp:2784] [PPID=68][PID=188][TID=188][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:05:32.002657 140255432387328 tool.cpp:2784] [PPID=68][PID=189][TID=189][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:05:32.002785 140255432387328 tool.cpp:2807] [PPID=68][PID=189][TID=189][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:05:32.002794 140255432387328 tool.cpp:2822] [PPID=68][PID=189][TID=189][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[931.513] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:05:33.189913 140255432387328 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/189_results.pftrace
[932.679] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:05:33.221643 140255432387328 simple_timer.cpp:55] [rocprofv3] tool finalization :: 1.212416 sec
W20250919 22:05:33.222618 140255432387328 tool.cpp:2863] [PPID=68][PID=189][TID=189][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758319533 (unix time) try "date -d @1758319533" if you are using GNU date ***
PC: @ 0x7f8fcdeee117 (unknown)
*** SIGTERM (@0x44) received by PID 189 (TID 0x7f8fc339ab00) from PID 68; stack trace: ***
@ 0x7f8fcdef6ee8 (unknown)
@ 0x7f8fcf29147e (unknown)
@ 0x7f8fcef3f1b7 (unknown)
@ 0x7f8fcdef6ee8 (unknown)
@ 0x7f8fcef3865c (unknown)
@ 0x7f8fcde9f520 (unknown)
@ 0x7f8fcdeee117 (unknown)
@ 0x7f8fcdef9c78 (unknown)
@ 0x7f8fce357cb5 PyThread_acquire_lock_timed
@ 0x7f8fce3c6830 acquire_timed
@ 0x7f8fce3c832c lock_PyThread_acquire_lock
@ 0x7f8fce20cc22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7f8fce1ffec8 PyObject_Vectorcall
@ 0x7f8fce1a25df _PyEval_EvalFrameDefault
@ 0x7f8fce2fae7f PyEval_EvalCode
@ 0x7f8fce3464dd run_mod
@ 0x7f8fce3484bf PyRun_StringFlags
@ 0x7f8fce34852d PyRun_SimpleStringFlags
@ 0x7f8fce36859c Py_RunMain
@ 0x7f8fce3690be Py_BytesMain
@ 0x7f8fcef3f6c6 (unknown)
@ 0x7f8fcde86d90 (unknown)
@ 0x7f8fcde86e40 __libc_start_main
@ 0x55af00fc2095 _start
W20250919 22:05:33.518288 140255432387328 tool.cpp:2784] [PPID=68][PID=189][TID=189][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:06:03.626457 140638869089024 tool.cpp:2784] [PPID=68][PID=190][TID=190][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:06:03.626580 140638869089024 tool.cpp:2807] [PPID=68][PID=190][TID=190][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:06:03.626589 140638869089024 tool.cpp:2822] [PPID=68][PID=190][TID=190][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[963.224] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:09:53.662023 140638869089024 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/190_results.pftrace
[216.603] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:10:17.467114 140638869089024 simple_timer.cpp:55] [rocprofv3] tool finalization :: 253.728989 sec
W20250919 22:10:17.468297 140638869089024 tool.cpp:2863] [PPID=68][PID=190][TID=190][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758319817 (unix time) try "date -d @1758319817" if you are using GNU date ***
PC: @ 0x7fe9148aa117 (unknown)
*** SIGTERM (@0x44) received by PID 190 (TID 0x7fe909d56b00) from PID 68; stack trace: ***
@ 0x7fe9148b2ee8 (unknown)
@ 0x7fe915c4d47e (unknown)
@ 0x7fe9158fb1b7 (unknown)
@ 0x7fe9148b2ee8 (unknown)
@ 0x7fe9158f465c (unknown)
@ 0x7fe91485b520 (unknown)
@ 0x7fe9148aa117 (unknown)
@ 0x7fe9148b5c78 (unknown)
@ 0x7fe914d13cb5 PyThread_acquire_lock_timed
@ 0x7fe914d82830 acquire_timed
@ 0x7fe914d8432c lock_PyThread_acquire_lock
@ 0x7fe914bc8c22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7fe914bbbec8 PyObject_Vectorcall
@ 0x7fe914b5e5df _PyEval_EvalFrameDefault
@ 0x7fe914cb6e7f PyEval_EvalCode
@ 0x7fe914d024dd run_mod
@ 0x7fe914d044bf PyRun_StringFlags
@ 0x7fe914d0452d PyRun_SimpleStringFlags
@ 0x7fe914d2459c Py_RunMain
@ 0x7fe914d250be Py_BytesMain
@ 0x7fe9158fb6c6 (unknown)
@ 0x7fe914842d90 (unknown)
@ 0x7fe914842e40 __libc_start_main
@ 0x55cd41ccd095 _start
W20250919 22:10:17.766443 140638869089024 tool.cpp:2784] [PPID=68][PID=190][TID=190][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:10:47.930798 140453796326144 tool.cpp:2784] [PPID=68][PID=191][TID=191][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:10:47.930923 140453796326144 tool.cpp:2807] [PPID=68][PID=191][TID=191][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:10:47.930930 140453796326144 tool.cpp:2822] [PPID=68][PID=191][TID=191][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[247.441] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:10:49.369810 140453796326144 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/191_results.pftrace
[248.891] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:10:49.433773 140453796326144 simple_timer.cpp:55] [rocprofv3] tool finalization :: 1.496643 sec
W20250919 22:10:49.434734 140453796326144 tool.cpp:2863] [PPID=68][PID=191][TID=191][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758319849 (unix time) try "date -d @1758319849" if you are using GNU date ***
PC: @ 0x7fbdfd586117 (unknown)
*** SIGTERM (@0x44) received by PID 191 (TID 0x7fbdf2a32b00) from PID 68; stack trace: ***
@ 0x7fbdfd58eee8 (unknown)
@ 0x7fbdfe92947e (unknown)
@ 0x7fbdfe5d71b7 (unknown)
@ 0x7fbdfd58eee8 (unknown)
@ 0x7fbdfe5d065c (unknown)
@ 0x7fbdfd537520 (unknown)
@ 0x7fbdfd586117 (unknown)
@ 0x7fbdfd591c78 (unknown)
@ 0x7fbdfd9efcb5 PyThread_acquire_lock_timed
@ 0x7fbdfda5e830 acquire_timed
@ 0x7fbdfda6032c lock_PyThread_acquire_lock
@ 0x7fbdfd8a4c22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7fbdfd897ec8 PyObject_Vectorcall
@ 0x7fbdfd83a5df _PyEval_EvalFrameDefault
@ 0x7fbdfd992e7f PyEval_EvalCode
@ 0x7fbdfd9de4dd run_mod
@ 0x7fbdfd9e04bf PyRun_StringFlags
@ 0x7fbdfd9e052d PyRun_SimpleStringFlags
@ 0x7fbdfda0059c Py_RunMain
@ 0x7fbdfda010be Py_BytesMain
@ 0x7fbdfe5d76c6 (unknown)
@ 0x7fbdfd51ed90 (unknown)
@ 0x7fbdfd51ee40 __libc_start_main
@ 0x563526fd1095 _start
W20250919 22:11:19.814640 140206473005824 tool.cpp:2784] [PPID=68][PID=192][TID=192][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:11:19.814776 140206473005824 tool.cpp:2807] [PPID=68][PID=192][TID=192][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:11:19.814782 140206473005824 tool.cpp:2822] [PPID=68][PID=192][TID=192][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[279.377] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:13:36.165588 140206473005824 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/192_results.pftrace
[439.129] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:13:59.829412 140206473005824 simple_timer.cpp:55] [rocprofv3] tool finalization :: 159.947259 sec
W20250919 22:13:59.830272 140206473005824 tool.cpp:2863] [PPID=68][PID=192][TID=192][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758320039 (unix time) try "date -d @1758320039" if you are using GNU date ***
PC: @ 0x7f8467ba0117 (unknown)
*** SIGTERM (@0x44) received by PID 192 (TID 0x7f845d04cb00) from PID 68; stack trace: ***
@ 0x7f8467ba8ee8 (unknown)
@ 0x7f8468f4347e (unknown)
@ 0x7f8468bf11b7 (unknown)
@ 0x7f8467ba8ee8 (unknown)
@ 0x7f8468bea65c (unknown)
@ 0x7f8467b51520 (unknown)
@ 0x7f8467ba0117 (unknown)
@ 0x7f8467babc78 (unknown)
@ 0x7f8468009cb5 PyThread_acquire_lock_timed
@ 0x7f8468078830 acquire_timed
@ 0x7f846807a32c lock_PyThread_acquire_lock
@ 0x7f8467ebec22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7f8467eb1ec8 PyObject_Vectorcall
@ 0x7f8467e545df _PyEval_EvalFrameDefault
@ 0x7f8467face7f PyEval_EvalCode
@ 0x7f8467ff84dd run_mod
@ 0x7f8467ffa4bf PyRun_StringFlags
@ 0x7f8467ffa52d PyRun_SimpleStringFlags
@ 0x7f846801a59c Py_RunMain
@ 0x7f846801b0be Py_BytesMain
@ 0x7f8468bf16c6 (unknown)
@ 0x7f8467b38d90 (unknown)
@ 0x7f8467b38e40 __libc_start_main
@ 0x56155e315095 _start
W20250919 22:14:00.109237 140206473005824 tool.cpp:2784] [PPID=68][PID=192][TID=192][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:14:30.220467 140157087640320 tool.cpp:2784] [PPID=68][PID=193][TID=193][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:14:30.220590 140157087640320 tool.cpp:2807] [PPID=68][PID=193][TID=193][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:14:30.220596 140157087640320 tool.cpp:2822] [PPID=68][PID=193][TID=193][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[469.732] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:14:31.607971 140157087640320 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/193_results.pftrace
[471.126] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:14:31.662768 140157087640320 simple_timer.cpp:55] [rocprofv3] tool finalization :: 1.436028 sec
W20250919 22:14:31.663492 140157087640320 tool.cpp:2863] [PPID=68][PID=193][TID=193][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758320071 (unix time) try "date -d @1758320071" if you are using GNU date ***
PC: @ 0x7f78e8212117 (unknown)
*** SIGTERM (@0x44) received by PID 193 (TID 0x7f78dd6beb00) from PID 68; stack trace: ***
@ 0x7f78e821aee8 (unknown)
@ 0x7f78e95b547e (unknown)
@ 0x7f78e92631b7 (unknown)
@ 0x7f78e821aee8 (unknown)
@ 0x7f78e925c65c (unknown)
@ 0x7f78e81c3520 (unknown)
@ 0x7f78e8212117 (unknown)
@ 0x7f78e821dc78 (unknown)
@ 0x7f78e867bcb5 PyThread_acquire_lock_timed
@ 0x7f78e86ea830 acquire_timed
@ 0x7f78e86ec32c lock_PyThread_acquire_lock
@ 0x7f78e8530c22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7f78e8523ec8 PyObject_Vectorcall
@ 0x7f78e84c65df _PyEval_EvalFrameDefault
@ 0x7f78e861ee7f PyEval_EvalCode
@ 0x7f78e866a4dd run_mod
@ 0x7f78e866c4bf PyRun_StringFlags
@ 0x7f78e866c52d PyRun_SimpleStringFlags
@ 0x7f78e868c59c Py_RunMain
@ 0x7f78e868d0be Py_BytesMain
@ 0x7f78e92636c6 (unknown)
@ 0x7f78e81aad90 (unknown)
@ 0x7f78e81aae40 __libc_start_main
@ 0x55bf1958d095 _start
W20250919 22:14:31.936895 140157087640320 tool.cpp:2784] [PPID=68][PID=193][TID=193][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:15:02.024932 140467879783168 tool.cpp:2784] [PPID=68][PID=194][TID=194][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:15:02.025062 140467879783168 tool.cpp:2807] [PPID=68][PID=194][TID=194][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:15:02.025069 140467879783168 tool.cpp:2822] [PPID=68][PID=194][TID=194][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[501.536] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:15:03.322910 140467879783168 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/194_results.pftrace
[502.833] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:15:03.374594 140467879783168 simple_timer.cpp:55] [rocprofv3] tool finalization :: 1.343232 sec
W20250919 22:15:03.375466 140467879783168 tool.cpp:2863] [PPID=68][PID=194][TID=194][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758320103 (unix time) try "date -d @1758320103" if you are using GNU date ***
PC: @ 0x7fc144c8e117 (unknown)
*** SIGTERM (@0x44) received by PID 194 (TID 0x7fc13a13ab00) from PID 68; stack trace: ***
@ 0x7fc144c96ee8 (unknown)
@ 0x7fc14603147e (unknown)
@ 0x7fc145cdf1b7 (unknown)
@ 0x7fc144c96ee8 (unknown)
@ 0x7fc145cd865c (unknown)
@ 0x7fc144c3f520 (unknown)
@ 0x7fc144c8e117 (unknown)
@ 0x7fc144c99c78 (unknown)
@ 0x7fc1450f7cb5 PyThread_acquire_lock_timed
@ 0x7fc145166830 acquire_timed
@ 0x7fc14516832c lock_PyThread_acquire_lock
@ 0x7fc144facc22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7fc144f9fec8 PyObject_Vectorcall
@ 0x7fc144f425df _PyEval_EvalFrameDefault
@ 0x7fc14509ae7f PyEval_EvalCode
@ 0x7fc1450e64dd run_mod
@ 0x7fc1450e84bf PyRun_StringFlags
@ 0x7fc1450e852d PyRun_SimpleStringFlags
@ 0x7fc14510859c Py_RunMain
@ 0x7fc1451090be Py_BytesMain
@ 0x7fc145cdf6c6 (unknown)
@ 0x7fc144c26d90 (unknown)
@ 0x7fc144c26e40 __libc_start_main
@ 0x56456c85b095 _start
W20250919 22:15:33.746776 139888629246720 tool.cpp:2784] [PPID=68][PID=195][TID=195][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
W20250919 22:15:33.746912 139888629246720 tool.cpp:2807] [PPID=68][PID=195][TID=195][rocprofv3_error_signal_handler] rocprofv3 will wait for 0 children to exit
W20250919 22:15:33.746918 139888629246720 tool.cpp:2822] [PPID=68][PID=195][TID=195][rocprofv3_error_signal_handler] rocprofv3 finalizing after signal 15...
[533.426] perfetto.cc:47304 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1048576 KB, total sessions:1, uid:0 session name: ""
E20250919 22:20:42.354218 139888629246720 output_stream.cpp:111] Opened result file: /mlperf/harness/smci350-zts-gtu-c8-25/195_results.pftrace
[881.121] perfetto.cc:48888 Tracing session 1 ended, total sessions:0
W20250919 22:21:22.144151 139888629246720 simple_timer.cpp:55] [rocprofv3] tool finalization :: 348.203828 sec
W20250919 22:21:22.145321 139888629246720 tool.cpp:2863] [PPID=68][PID=195][TID=195][rocprofv3_error_signal_handler] rocprofv3 found chained signal handler for 15... executing chained sigaction (SIGINFO)
*** Aborted at 1758320482 (unix time) try "date -d @1758320482" if you are using GNU date ***
PC: @ 0x7f3a66c32117 (unknown)
*** SIGTERM (@0x44) received by PID 195 (TID 0x7f3a5c0deb00) from PID 68; stack trace: ***
@ 0x7f3a66c3aee8 (unknown)
@ 0x7f3a67fd547e (unknown)
@ 0x7f3a67c831b7 (unknown)
@ 0x7f3a66c3aee8 (unknown)
@ 0x7f3a67c7c65c (unknown)
@ 0x7f3a66be3520 (unknown)
@ 0x7f3a66c32117 (unknown)
@ 0x7f3a66c3dc78 (unknown)
@ 0x7f3a6709bcb5 PyThread_acquire_lock_timed
@ 0x7f3a6710a830 acquire_timed
@ 0x7f3a6710c32c lock_PyThread_acquire_lock
@ 0x7f3a66f50c22 method_vectorcall_VARARGS_KEYWORDS
@ 0x7f3a66f43ec8 PyObject_Vectorcall
@ 0x7f3a66ee65df _PyEval_EvalFrameDefault
@ 0x7f3a6703ee7f PyEval_EvalCode
@ 0x7f3a6708a4dd run_mod
@ 0x7f3a6708c4bf PyRun_StringFlags
@ 0x7f3a6708c52d PyRun_SimpleStringFlags
@ 0x7f3a670ac59c Py_RunMain
@ 0x7f3a670ad0be Py_BytesMain
@ 0x7f3a67c836c6 (unknown)
@ 0x7f3a66bcad90 (unknown)
@ 0x7f3a66bcae40 __libc_start_main
@ 0x558925230095 _start
W20250919 22:21:22.396152 139888629246720 tool.cpp:2784] [PPID=68][PID=195][TID=195][rocprofv3_error_signal_handler] rocprofv3 caught signal 15...
INFO:root:Test Done!
INFO:root:Destroying SUT...
INFO:root:Destroying QSL...
INFO:root:Check OutputOfflinePerformanceOnly
W20250919 22:21:23.402851 139723448015616 simple_timer.cpp:55] [rocprofv3] 'python3 -u harness_alt_mi355.py --devices 0,1,2,3,4,5,6,7 --scenario Offline --test_mode PerformanceOnly --bs 2 --user_conf_path user.conf --count 8 --tensor_path /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl --logfile_outdir OutputOfflinePerformanceOnly --debug False --verbose False --user_conf_path user.conf --shortfin_config shortfin_405b_config_fp4.json' :: 2315.214637 sec
W20250919 22:21:23.405983 139723448015616 simple_timer.cpp:55] [rocprofv3] tool finalization :: 0.003035 sec
W20250919 22:21:23.432644 139960833170176 simple_timer.cpp:55] [rocprofv3] '/root/.pyenv/versions/3.11.13/bin/python3 -c from multiprocessing.resource_tracker import main;main(5)' :: 2312.587069 sec
W20250919 22:21:23.435180 139960833170176 simple_timer.cpp:55] [rocprofv3] tool finalization :: 0.002473 sec
