@briansp2020
Created October 25, 2023 21:25
MI100 new-ai-benchmark
root@rocm:/root/tmp# python benchmark.py
2023-10-22 19:31:47.927753: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-22 19:31:47.946907: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-22 19:31:48.774214: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:48.785167: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:48.785201: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
>> AI-Benchmark - 0.1.3.cm
INFO:ai_benchmark:>> AI-Benchmark - 0.1.3.cm
>> Let the AI Games begin
INFO:ai_benchmark:>> Let the AI Games begin
2023-10-22 19:31:49.861137: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861201: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861221: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861303: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861332: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861352: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.861367: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:31:49.992484: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992548: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992569: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992599: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992619: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:31:49.992664: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992691: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992708: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992730: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992746: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:49.992754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
* TF Version: 2.13.0
INFO:ai_benchmark:* TF Version: 2.13.0
* Platform: Linux-5.15.0-87-generic-x86_64-with-glibc2.31
INFO:ai_benchmark:* Platform: Linux-5.15.0-87-generic-x86_64-with-glibc2.31
* CPU: AMD Ryzen 9 7900X 12-Core Processor
INFO:ai_benchmark:* CPU: AMD Ryzen 9 7900X 12-Core Processor
* CPU RAM: 62 GB
INFO:ai_benchmark:* CPU RAM: 62 GB
* GPU/0:
INFO:ai_benchmark:* GPU/0:
* GPU RAM: 30.7 GB
INFO:ai_benchmark:* GPU RAM: 30.7 GB
* CUDA Version: N/A
INFO:ai_benchmark:* CUDA Version: N/A
* CUDA Build: N/A
INFO:ai_benchmark:* CUDA Build: N/A
The benchmark is running...
WARNING:ai_benchmark:The benchmark is running...
The tests might take up to 20 minutes
WARNING:ai_benchmark:The tests might take up to 20 minutes
Please don't interrupt the script
WARNING:ai_benchmark:Please don't interrupt the script
1/19. MobileNet-V2
INFO:ai_benchmark:
1/19. MobileNet-V2
2023-10-22 19:31:51.154618: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:51.154735: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:51.154772: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:51.154831: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:51.154867: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:31:51.154885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:31:51.379433: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
2023-10-22 19:31:51.528334: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:31:51.934273: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 306 ms
DEBUG:ai_benchmark:Inference Time: 306 ms
Inference Time: 20 ms
DEBUG:ai_benchmark:Inference Time: 20 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
Inference Time: 18 ms
DEBUG:ai_benchmark:Inference Time: 18 ms
1.1 - inference | batch=50, size=224x224: 18.5 ± 0.6 ms
INFO:ai_benchmark:1.1 - inference | batch=50, size=224x224: 18.5 ± 0.6 ms
2023-10-22 19:31:55.784616: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:31:56.248003: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 2920 ms
DEBUG:ai_benchmark:Training Time: 2920 ms
Training Time: 2552 ms
DEBUG:ai_benchmark:Training Time: 2552 ms
Training Time: 2510 ms
DEBUG:ai_benchmark:Training Time: 2510 ms
Training Time: 2517 ms
DEBUG:ai_benchmark:Training Time: 2517 ms
Training Time: 2482 ms
DEBUG:ai_benchmark:Training Time: 2482 ms
Training Time: 2513 ms
DEBUG:ai_benchmark:Training Time: 2513 ms
Training Time: 2491 ms
DEBUG:ai_benchmark:Training Time: 2491 ms
Training Time: 2507 ms
DEBUG:ai_benchmark:Training Time: 2507 ms
Training Time: 2475 ms
DEBUG:ai_benchmark:Training Time: 2475 ms
Training Time: 2475 ms
DEBUG:ai_benchmark:Training Time: 2475 ms
Training Time: 2510 ms
DEBUG:ai_benchmark:Training Time: 2510 ms
1.2 - training | batch=50, size=224x224: 2503 ± 22 ms
INFO:ai_benchmark:1.2 - training | batch=50, size=224x224: 2503 ± 22 ms
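Note on the summary lines: the "mean ± std" figures above appear to be computed over the per-run times with the first (warm-up) run dropped, using the population standard deviation. That is inferred from the numbers in this log, not taken from the AI-Benchmark source, so treat the sketch below as an assumption. The `summarize` helper and `TIMING_RE` pattern are hypothetical names used only for illustration.

```python
import re
import statistics

# Matches the per-run lines printed to stdout, e.g. "Inference Time: 19 ms".
# Anchoring at the start of the line skips the "DEBUG:ai_benchmark:" duplicates.
TIMING_RE = re.compile(r"^(Inference|Training) Time: (\d+) ms$")


def summarize(log_text, kind):
    """Return (mean, std) in ms for the runs of one test.

    Assumption: the first run is a warm-up and is excluded, and the spread
    is the population standard deviation. Both are guesses that happen to
    reproduce the summary lines in this log.
    """
    times = []
    for line in log_text.splitlines():
        m = TIMING_RE.match(line.strip())
        if m and m.group(1) == kind:
            times.append(int(m.group(2)))
    steady = times[1:]  # drop the assumed warm-up run
    return statistics.fmean(steady), statistics.pstdev(steady)


# Training runs of test 1.2 (MobileNet-V2), copied from the log above.
sample = "\n".join(
    f"Training Time: {t} ms"
    for t in [2920, 2552, 2510, 2517, 2482, 2513, 2491, 2507, 2475, 2475, 2510]
)
mean, std = summarize(sample, "Training")
print(f"{mean:.0f} ± {std:.0f} ms")  # prints "2503 ± 22 ms"
```

The same call with kind "Inference" on the 1.1 runs gives roughly 18.5 ± 0.6 ms, which also agrees with the log. The logged times are rounded to whole milliseconds, so small differences from the benchmark's own figures are possible.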
2/19. Inception-V3
INFO:ai_benchmark:
2/19. Inception-V3
2023-10-22 19:32:25.107096: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:25.107167: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:25.107188: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:25.107221: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:25.107241: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:25.107251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:32:25.453016: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:32:25.696096: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 282 ms
DEBUG:ai_benchmark:Inference Time: 282 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 26 ms
DEBUG:ai_benchmark:Inference Time: 26 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
2.1 - inference | batch=20, size=346x346: 26.8 ± 0.5 ms
INFO:ai_benchmark:2.1 - inference | batch=20, size=346x346: 26.8 ± 0.5 ms
2023-10-22 19:32:28.821332: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:32:29.478215: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 2479 ms
DEBUG:ai_benchmark:Training Time: 2479 ms
Training Time: 1911 ms
DEBUG:ai_benchmark:Training Time: 1911 ms
Training Time: 1921 ms
DEBUG:ai_benchmark:Training Time: 1921 ms
Training Time: 1917 ms
DEBUG:ai_benchmark:Training Time: 1917 ms
Training Time: 1928 ms
DEBUG:ai_benchmark:Training Time: 1928 ms
Training Time: 1914 ms
DEBUG:ai_benchmark:Training Time: 1914 ms
Training Time: 1924 ms
DEBUG:ai_benchmark:Training Time: 1924 ms
Training Time: 1931 ms
DEBUG:ai_benchmark:Training Time: 1931 ms
Training Time: 1924 ms
DEBUG:ai_benchmark:Training Time: 1924 ms
Training Time: 1917 ms
DEBUG:ai_benchmark:Training Time: 1917 ms
Training Time: 1925 ms
DEBUG:ai_benchmark:Training Time: 1925 ms
Training Time: 1928 ms
DEBUG:ai_benchmark:Training Time: 1928 ms
Training Time: 1929 ms
DEBUG:ai_benchmark:Training Time: 1929 ms
Training Time: 1942 ms
DEBUG:ai_benchmark:Training Time: 1942 ms
Training Time: 1930 ms
DEBUG:ai_benchmark:Training Time: 1930 ms
2.2 - training | batch=20, size=346x346: 1924 ± 8 ms
INFO:ai_benchmark:2.2 - training | batch=20, size=346x346: 1924 ± 8 ms
3/19. Inception-V4
INFO:ai_benchmark:
3/19. Inception-V4
2023-10-22 19:32:59.307898: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:59.308031: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:59.308061: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:59.308094: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:59.308115: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:32:59.308125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:32:59.865766: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:33:00.173586: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 280 ms
DEBUG:ai_benchmark:Inference Time: 280 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
Inference Time: 31 ms
DEBUG:ai_benchmark:Inference Time: 31 ms
Inference Time: 30 ms
DEBUG:ai_benchmark:Inference Time: 30 ms
3.1 - inference | batch=10, size=346x346: 31.0 ± 0.8 ms
INFO:ai_benchmark:3.1 - inference | batch=10, size=346x346: 31.0 ± 0.8 ms
2023-10-22 19:33:03.123242: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:33:04.116132: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 2387 ms
DEBUG:ai_benchmark:Training Time: 2387 ms
Training Time: 1486 ms
DEBUG:ai_benchmark:Training Time: 1486 ms
Training Time: 1490 ms
DEBUG:ai_benchmark:Training Time: 1490 ms
Training Time: 1497 ms
DEBUG:ai_benchmark:Training Time: 1497 ms
Training Time: 1493 ms
DEBUG:ai_benchmark:Training Time: 1493 ms
Training Time: 1503 ms
DEBUG:ai_benchmark:Training Time: 1503 ms
Training Time: 1506 ms
DEBUG:ai_benchmark:Training Time: 1506 ms
Training Time: 1504 ms
DEBUG:ai_benchmark:Training Time: 1504 ms
Training Time: 1501 ms
DEBUG:ai_benchmark:Training Time: 1501 ms
Training Time: 1504 ms
DEBUG:ai_benchmark:Training Time: 1504 ms
Training Time: 1501 ms
DEBUG:ai_benchmark:Training Time: 1501 ms
Training Time: 1506 ms
DEBUG:ai_benchmark:Training Time: 1506 ms
Training Time: 1497 ms
DEBUG:ai_benchmark:Training Time: 1497 ms
Training Time: 1504 ms
DEBUG:ai_benchmark:Training Time: 1504 ms
Training Time: 1494 ms
DEBUG:ai_benchmark:Training Time: 1494 ms
Training Time: 1500 ms
DEBUG:ai_benchmark:Training Time: 1500 ms
Training Time: 1506 ms
DEBUG:ai_benchmark:Training Time: 1506 ms
Training Time: 1510 ms
DEBUG:ai_benchmark:Training Time: 1510 ms
Training Time: 1499 ms
DEBUG:ai_benchmark:Training Time: 1499 ms
3.2 - training | batch=10, size=346x346: 1500 ± 6 ms
INFO:ai_benchmark:3.2 - training | batch=10, size=346x346: 1500 ± 6 ms
4/19. Inception-ResNet-V2
INFO:ai_benchmark:
4/19. Inception-ResNet-V2
2023-10-22 19:33:33.208562: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:33:33.208639: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:33:33.208659: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:33:33.208696: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:33:33.208716: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:33:33.208727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:33:34.309266: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:33:34.801617: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 455 ms
DEBUG:ai_benchmark:Inference Time: 455 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 39 ms
DEBUG:ai_benchmark:Inference Time: 39 ms
Inference Time: 39 ms
DEBUG:ai_benchmark:Inference Time: 39 ms
Inference Time: 39 ms
DEBUG:ai_benchmark:Inference Time: 39 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 39 ms
DEBUG:ai_benchmark:Inference Time: 39 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
Inference Time: 38 ms
DEBUG:ai_benchmark:Inference Time: 38 ms
4.1 - inference | batch=10, size=346x346: 38.2 ± 0.4 ms
INFO:ai_benchmark:4.1 - inference | batch=10, size=346x346: 38.2 ± 0.4 ms
2023-10-22 19:33:39.263445: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:33:41.150704: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 2805 ms
DEBUG:ai_benchmark:Training Time: 2805 ms
Training Time: 997 ms
DEBUG:ai_benchmark:Training Time: 997 ms
Training Time: 1003 ms
DEBUG:ai_benchmark:Training Time: 1003 ms
Training Time: 1001 ms
DEBUG:ai_benchmark:Training Time: 1001 ms
Training Time: 1009 ms
DEBUG:ai_benchmark:Training Time: 1009 ms
Training Time: 1007 ms
DEBUG:ai_benchmark:Training Time: 1007 ms
Training Time: 1009 ms
DEBUG:ai_benchmark:Training Time: 1009 ms
Training Time: 1010 ms
DEBUG:ai_benchmark:Training Time: 1010 ms
Training Time: 1005 ms
DEBUG:ai_benchmark:Training Time: 1005 ms
Training Time: 1004 ms
DEBUG:ai_benchmark:Training Time: 1004 ms
Training Time: 1000 ms
DEBUG:ai_benchmark:Training Time: 1000 ms
Training Time: 1006 ms
DEBUG:ai_benchmark:Training Time: 1006 ms
Training Time: 1002 ms
DEBUG:ai_benchmark:Training Time: 1002 ms
Training Time: 1011 ms
DEBUG:ai_benchmark:Training Time: 1011 ms
Training Time: 1003 ms
DEBUG:ai_benchmark:Training Time: 1003 ms
Training Time: 1009 ms
DEBUG:ai_benchmark:Training Time: 1009 ms
Training Time: 1000 ms
DEBUG:ai_benchmark:Training Time: 1000 ms
Training Time: 1003 ms
DEBUG:ai_benchmark:Training Time: 1003 ms
Training Time: 1011 ms
DEBUG:ai_benchmark:Training Time: 1011 ms
Training Time: 1011 ms
DEBUG:ai_benchmark:Training Time: 1011 ms
Training Time: 1001 ms
DEBUG:ai_benchmark:Training Time: 1001 ms
Training Time: 1005 ms
DEBUG:ai_benchmark:Training Time: 1005 ms
4.2 - training | batch=8, size=346x346: 1005 ± 4 ms
INFO:ai_benchmark:4.2 - training | batch=8, size=346x346: 1005 ± 4 ms
5/19. ResNet-V2-50
INFO:ai_benchmark:
5/19. ResNet-V2-50
2023-10-22 19:34:04.052089: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:04.052224: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:04.052515: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:04.052556: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:04.052576: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:04.052586: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:04.299069: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:04.454051: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 144 ms
DEBUG:ai_benchmark:Inference Time: 144 ms
Inference Time: 20 ms
DEBUG:ai_benchmark:Inference Time: 20 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 20 ms
DEBUG:ai_benchmark:Inference Time: 20 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 20 ms
DEBUG:ai_benchmark:Inference Time: 20 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
Inference Time: 20 ms
DEBUG:ai_benchmark:Inference Time: 20 ms
Inference Time: 19 ms
DEBUG:ai_benchmark:Inference Time: 19 ms
5.1 - inference | batch=10, size=346x346: 19.2 ± 0.4 ms
INFO:ai_benchmark:5.1 - inference | batch=10, size=346x346: 19.2 ± 0.4 ms
2023-10-22 19:34:06.388479: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:06.836208: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 709 ms
DEBUG:ai_benchmark:Training Time: 709 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 57 ms
DEBUG:ai_benchmark:Training Time: 57 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
Training Time: 58 ms
DEBUG:ai_benchmark:Training Time: 58 ms
5.2 - training | batch=10, size=346x346: 58.0 ± 0.2 ms
INFO:ai_benchmark:5.2 - training | batch=10, size=346x346: 58.0 ± 0.2 ms
6/19. ResNet-V2-152
INFO:ai_benchmark:
6/19. ResNet-V2-152
2023-10-22 19:34:09.039350: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:09.039423: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:09.039444: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:09.039478: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:09.039504: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:09.039514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:09.788447: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:10.169938: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 346 ms
DEBUG:ai_benchmark:Inference Time: 346 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
Inference Time: 28 ms
DEBUG:ai_benchmark:Inference Time: 28 ms
Inference Time: 27 ms
DEBUG:ai_benchmark:Inference Time: 27 ms
6.1 - inference | batch=10, size=256x256: 27.6 ± 0.5 ms
INFO:ai_benchmark:6.1 - inference | batch=10, size=256x256: 27.6 ± 0.5 ms
2023-10-22 19:34:14.159006: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:15.531191: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 1370 ms
DEBUG:ai_benchmark:Training Time: 1370 ms
Training Time: 96 ms
DEBUG:ai_benchmark:Training Time: 96 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 96 ms
DEBUG:ai_benchmark:Training Time: 96 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 96 ms
DEBUG:ai_benchmark:Training Time: 96 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 93 ms
DEBUG:ai_benchmark:Training Time: 93 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 94 ms
DEBUG:ai_benchmark:Training Time: 94 ms
Training Time: 95 ms
DEBUG:ai_benchmark:Training Time: 95 ms
6.2 - training | batch=10, size=256x256: 94.6 ± 0.8 ms
INFO:ai_benchmark:6.2 - training | batch=10, size=256x256: 94.6 ± 0.8 ms
7/19. VGG-16
INFO:ai_benchmark:
7/19. VGG-16
2023-10-22 19:34:18.252627: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:18.253613: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:18.253643: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:18.253679: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:18.253701: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:18.253712: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:18.299092: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:18.376564: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 74 ms
DEBUG:ai_benchmark:Inference Time: 74 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
7.1 - inference | batch=20, size=224x224: 50.4 ± 0.5 ms
INFO:ai_benchmark:7.1 - inference | batch=20, size=224x224: 50.4 ± 0.5 ms
2023-10-22 19:34:20.707271: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:21.061940: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 536 ms
DEBUG:ai_benchmark:Training Time: 536 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
Training Time: 69 ms
DEBUG:ai_benchmark:Training Time: 69 ms
7.2 - training | batch=2, size=224x224: 68.8 ± 0.4 ms
INFO:ai_benchmark:7.2 - training | batch=2, size=224x224: 68.8 ± 0.4 ms
8/19. SRCNN 9-5-5
INFO:ai_benchmark:
8/19. SRCNN 9-5-5
2023-10-22 19:34:23.156146: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:23.156248: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:23.156286: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:23.156338: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:23.156374: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:23.156391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:23.172805: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:23.305943: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 47 ms
DEBUG:ai_benchmark:Inference Time: 47 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 36 ms
DEBUG:ai_benchmark:Inference Time: 36 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
Inference Time: 37 ms
DEBUG:ai_benchmark:Inference Time: 37 ms
8.1 - inference | batch=10, size=512x512: 36.7 ± 0.5 ms
INFO:ai_benchmark:8.1 - inference | batch=10, size=512x512: 36.7 ± 0.5 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 32 ms
DEBUG:ai_benchmark:Inference Time: 32 ms
Inference Time: 33 ms
DEBUG:ai_benchmark:Inference Time: 33 ms
8.2 - inference | batch=1, size=1536x1536: 32.3 ± 0.5 ms
INFO:ai_benchmark:8.2 - inference | batch=1, size=1536x1536: 32.3 ± 0.5 ms
2023-10-22 19:34:27.722717: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:28.065954: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 132 ms
DEBUG:ai_benchmark:Training Time: 132 ms
Training Time: 102 ms
DEBUG:ai_benchmark:Training Time: 102 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 100 ms
DEBUG:ai_benchmark:Training Time: 100 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 100 ms
DEBUG:ai_benchmark:Training Time: 100 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
Training Time: 101 ms
DEBUG:ai_benchmark:Training Time: 101 ms
8.3 - training | batch=10, size=512x512: 101.0 ± 0.4 ms
INFO:ai_benchmark:8.3 - training | batch=10, size=512x512: 101.0 ± 0.4 ms
9/19. VGG-19 Super-Res
INFO:ai_benchmark:
9/19. VGG-19 Super-Res
2023-10-22 19:34:36.639969: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:36.640113: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:36.640138: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:36.640170: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:36.640190: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:36.640200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:36.703489: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:36.816801: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 72 ms
DEBUG:ai_benchmark:Inference Time: 72 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 48 ms
DEBUG:ai_benchmark:Inference Time: 48 ms
Inference Time: 49 ms
DEBUG:ai_benchmark:Inference Time: 49 ms
9.1 - inference | batch=10, size=256x256: 48.3 ± 0.5 ms
INFO:ai_benchmark:9.1 - inference | batch=10, size=256x256: 48.3 ± 0.5 ms
Inference Time: 88 ms
DEBUG:ai_benchmark:Inference Time: 88 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
Inference Time: 85 ms
DEBUG:ai_benchmark:Inference Time: 85 ms
Inference Time: 86 ms
DEBUG:ai_benchmark:Inference Time: 86 ms
9.2 - inference | batch=1, size=1024x1024: 85.3 ± 0.5 ms
INFO:ai_benchmark:9.2 - inference | batch=1, size=1024x1024: 85.3 ± 0.5 ms
2023-10-22 19:34:42.198547: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:42.507050: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 193 ms
DEBUG:ai_benchmark:Training Time: 193 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 116 ms
DEBUG:ai_benchmark:Training Time: 116 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
Training Time: 117 ms
DEBUG:ai_benchmark:Training Time: 117 ms
9.3 - training | batch=10, size=224x224: 116.7 ± 0.5 ms
INFO:ai_benchmark:9.3 - training | batch=10, size=224x224: 116.7 ± 0.5 ms
10/19. ResNet-SRGAN
INFO:ai_benchmark:
10/19. ResNet-SRGAN
2023-10-22 19:34:49.995674: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:49.995743: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:49.995764: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:49.995796: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:49.995822: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:34:49.995835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:34:50.192274: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:50.414929: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 160 ms
DEBUG:ai_benchmark:Inference Time: 160 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 52 ms
DEBUG:ai_benchmark:Inference Time: 52 ms
Inference Time: 52 ms
DEBUG:ai_benchmark:Inference Time: 52 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 52 ms
DEBUG:ai_benchmark:Inference Time: 52 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 54 ms
DEBUG:ai_benchmark:Inference Time: 54 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
Inference Time: 53 ms
DEBUG:ai_benchmark:Inference Time: 53 ms
10.1 - inference | batch=10, size=512x512: 52.9 ± 0.4 ms
INFO:ai_benchmark:10.1 - inference | batch=10, size=512x512: 52.9 ± 0.4 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
10.2 - inference | batch=1, size=1536x1536: 50.3 ± 0.5 ms
INFO:ai_benchmark:10.2 - inference | batch=1, size=1536x1536: 50.3 ± 0.5 ms
2023-10-22 19:34:55.920066: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:34:56.380332: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 392 ms
DEBUG:ai_benchmark:Training Time: 392 ms
Training Time: 79 ms
DEBUG:ai_benchmark:Training Time: 79 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 79 ms
DEBUG:ai_benchmark:Training Time: 79 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 79 ms
DEBUG:ai_benchmark:Training Time: 79 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
Training Time: 77 ms
DEBUG:ai_benchmark:Training Time: 77 ms
Training Time: 79 ms
DEBUG:ai_benchmark:Training Time: 79 ms
Training Time: 78 ms
DEBUG:ai_benchmark:Training Time: 78 ms
10.3 - training | batch=5, size=512x512: 78.1 ± 0.5 ms
INFO:ai_benchmark:10.3 - training | batch=5, size=512x512: 78.1 ± 0.5 ms
11/19. ResNet-DPED
INFO:ai_benchmark:
11/19. ResNet-DPED
2023-10-22 19:35:01.174619: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:01.174689: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:01.174709: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:01.174745: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:01.174766: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:01.174777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:35:01.212501: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:01.327875: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 98 ms
DEBUG:ai_benchmark:Inference Time: 98 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
Inference Time: 61 ms
DEBUG:ai_benchmark:Inference Time: 61 ms
11.1 - inference | batch=10, size=256x256: 61.0 ± 0.0 ms
INFO:ai_benchmark:11.1 - inference | batch=10, size=256x256: 61.0 ± 0.0 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
Inference Time: 110 ms
DEBUG:ai_benchmark:Inference Time: 110 ms
11.2 - inference | batch=1, size=1024x1024: 110.0 ± 0.0 ms
INFO:ai_benchmark:11.2 - inference | batch=1, size=1024x1024: 110.0 ± 0.0 ms
2023-10-22 19:35:07.600917: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:08.172414: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 342 ms
DEBUG:ai_benchmark:Training Time: 342 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 92 ms
DEBUG:ai_benchmark:Training Time: 92 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 90 ms
DEBUG:ai_benchmark:Training Time: 90 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 90 ms
DEBUG:ai_benchmark:Training Time: 90 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
Training Time: 92 ms
DEBUG:ai_benchmark:Training Time: 92 ms
Training Time: 91 ms
DEBUG:ai_benchmark:Training Time: 91 ms
11.3 - training | batch=15, size=128x128: 91.0 ± 0.4 ms
INFO:ai_benchmark:11.3 - training | batch=15, size=128x128: 91.0 ± 0.4 ms
12/19. U-Net
INFO:ai_benchmark:
12/19. U-Net
2023-10-22 19:35:17.154802: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:17.154872: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:17.154893: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:17.154928: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:17.154947: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:17.154957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:35:17.466214: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:17.573649: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 180 ms
DEBUG:ai_benchmark:Inference Time: 180 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 104 ms
DEBUG:ai_benchmark:Inference Time: 104 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 104 ms
DEBUG:ai_benchmark:Inference Time: 104 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 104 ms
DEBUG:ai_benchmark:Inference Time: 104 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
Inference Time: 104 ms
DEBUG:ai_benchmark:Inference Time: 104 ms
Inference Time: 103 ms
DEBUG:ai_benchmark:Inference Time: 103 ms
12.1 - inference | batch=4, size=512x512: 103.2 ± 0.4 ms
INFO:ai_benchmark:12.1 - inference | batch=4, size=512x512: 103.2 ± 0.4 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 124 ms
DEBUG:ai_benchmark:Inference Time: 124 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 114 ms
DEBUG:ai_benchmark:Inference Time: 114 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
Inference Time: 115 ms
DEBUG:ai_benchmark:Inference Time: 115 ms
12.2 - inference | batch=1, size=1024x1024: 115 ± 2 ms
INFO:ai_benchmark:12.2 - inference | batch=1, size=1024x1024: 115 ± 2 ms
2023-10-22 19:35:24.130472: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:24.693498: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 704 ms
DEBUG:ai_benchmark:Training Time: 704 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 139 ms
DEBUG:ai_benchmark:Training Time: 139 ms
Training Time: 138 ms
DEBUG:ai_benchmark:Training Time: 138 ms
12.3 - training | batch=4, size=256x256: 138.7 ± 0.5 ms
INFO:ai_benchmark:12.3 - training | batch=4, size=256x256: 138.7 ± 0.5 ms
13/19. Nvidia-SPADE
INFO:ai_benchmark:
13/19. Nvidia-SPADE
2023-10-22 19:35:28.617241: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:28.617314: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:28.617334: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:28.617366: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:28.617386: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:28.617398: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:35:28.842177: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:29.052412: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 1333 ms
DEBUG:ai_benchmark:Inference Time: 1333 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 50 ms
DEBUG:ai_benchmark:Inference Time: 50 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
Inference Time: 51 ms
DEBUG:ai_benchmark:Inference Time: 51 ms
13.1 - inference | batch=5, size=128x128: 50.8 ± 0.4 ms
INFO:ai_benchmark:13.1 - inference | batch=5, size=128x128: 50.8 ± 0.4 ms
2023-10-22 19:35:33.125496: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:34.150385: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 961 ms
DEBUG:ai_benchmark:Training Time: 961 ms
Training Time: 67 ms
DEBUG:ai_benchmark:Training Time: 67 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 67 ms
DEBUG:ai_benchmark:Training Time: 67 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 67 ms
DEBUG:ai_benchmark:Training Time: 67 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 67 ms
DEBUG:ai_benchmark:Training Time: 67 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
Training Time: 68 ms
DEBUG:ai_benchmark:Training Time: 68 ms
13.2 - training | batch=1, size=128x128: 67.8 ± 0.4 ms
INFO:ai_benchmark:13.2 - training | batch=1, size=128x128: 67.8 ± 0.4 ms
14/19. ICNet
INFO:ai_benchmark:
14/19. ICNet
2023-10-22 19:35:36.003191: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:36.003357: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:36.003642: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:36.003705: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:36.003743: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:36.003760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:35:36.217371: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:36.436207: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 237 ms
DEBUG:ai_benchmark:Inference Time: 237 ms
Inference Time: 104 ms
DEBUG:ai_benchmark:Inference Time: 104 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 97 ms
DEBUG:ai_benchmark:Inference Time: 97 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 100 ms
DEBUG:ai_benchmark:Inference Time: 100 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 98 ms
DEBUG:ai_benchmark:Inference Time: 98 ms
Inference Time: 102 ms
DEBUG:ai_benchmark:Inference Time: 102 ms
Inference Time: 101 ms
DEBUG:ai_benchmark:Inference Time: 101 ms
14.1 - inference | batch=5, size=1024x1536: 101 ± 1 ms
INFO:ai_benchmark:14.1 - inference | batch=5, size=1024x1536: 101 ± 1 ms
2023-10-22 19:35:40.933679: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:41.506316: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 306 ms
DEBUG:ai_benchmark:Training Time: 306 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 197 ms
DEBUG:ai_benchmark:Training Time: 197 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 196 ms
DEBUG:ai_benchmark:Training Time: 196 ms
Training Time: 197 ms
DEBUG:ai_benchmark:Training Time: 197 ms
Training Time: 195 ms
DEBUG:ai_benchmark:Training Time: 195 ms
14.2 - training | batch=10, size=1024x1536: 195.7 ± 0.6 ms
INFO:ai_benchmark:14.2 - training | batch=10, size=1024x1536: 195.7 ± 0.6 ms
15/19. PSPNet
INFO:ai_benchmark:
15/19. PSPNet
2023-10-22 19:35:54.730059: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:54.730128: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:54.730149: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:54.730179: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:54.730200: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:35:54.730210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:35:55.019860: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:35:55.252195: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 425 ms
DEBUG:ai_benchmark:Inference Time: 425 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 263 ms
DEBUG:ai_benchmark:Inference Time: 263 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
Inference Time: 262 ms
DEBUG:ai_benchmark:Inference Time: 262 ms
15.1 - inference | batch=5, size=720x720: 262.3 ± 0.5 ms
INFO:ai_benchmark:15.1 - inference | batch=5, size=720x720: 262.3 ± 0.5 ms
2023-10-22 19:36:02.916428: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:36:03.421037: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 914 ms
DEBUG:ai_benchmark:Training Time: 914 ms
Training Time: 396 ms
DEBUG:ai_benchmark:Training Time: 396 ms
Training Time: 397 ms
DEBUG:ai_benchmark:Training Time: 397 ms
Training Time: 396 ms
DEBUG:ai_benchmark:Training Time: 396 ms
Training Time: 396 ms
DEBUG:ai_benchmark:Training Time: 396 ms
Training Time: 395 ms
DEBUG:ai_benchmark:Training Time: 395 ms
Training Time: 398 ms
DEBUG:ai_benchmark:Training Time: 398 ms
Training Time: 396 ms
DEBUG:ai_benchmark:Training Time: 396 ms
Training Time: 398 ms
DEBUG:ai_benchmark:Training Time: 398 ms
Training Time: 401 ms
DEBUG:ai_benchmark:Training Time: 401 ms
Training Time: 396 ms
DEBUG:ai_benchmark:Training Time: 396 ms
Training Time: 399 ms
DEBUG:ai_benchmark:Training Time: 399 ms
Training Time: 400 ms
DEBUG:ai_benchmark:Training Time: 400 ms
Training Time: 397 ms
DEBUG:ai_benchmark:Training Time: 397 ms
Training Time: 394 ms
DEBUG:ai_benchmark:Training Time: 394 ms
Training Time: 401 ms
DEBUG:ai_benchmark:Training Time: 401 ms
Training Time: 398 ms
DEBUG:ai_benchmark:Training Time: 398 ms
Training Time: 399 ms
DEBUG:ai_benchmark:Training Time: 399 ms
Training Time: 397 ms
DEBUG:ai_benchmark:Training Time: 397 ms
Training Time: 394 ms
DEBUG:ai_benchmark:Training Time: 394 ms
Training Time: 400 ms
DEBUG:ai_benchmark:Training Time: 400 ms
Training Time: 394 ms
DEBUG:ai_benchmark:Training Time: 394 ms
15.2 - training | batch=1, size=512x512: 397 ± 2 ms
INFO:ai_benchmark:15.2 - training | batch=1, size=512x512: 397 ± 2 ms
16/19. DeepLab
INFO:ai_benchmark:
16/19. DeepLab
2023-10-22 19:36:12.622903: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:12.623011: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:12.623050: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:12.623104: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:12.623138: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:12.623155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:36:13.199100: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:36:13.586387: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 431 ms
DEBUG:ai_benchmark:Inference Time: 431 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 55 ms
DEBUG:ai_benchmark:Inference Time: 55 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 55 ms
DEBUG:ai_benchmark:Inference Time: 55 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 55 ms
DEBUG:ai_benchmark:Inference Time: 55 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 57 ms
DEBUG:ai_benchmark:Inference Time: 57 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
Inference Time: 56 ms
DEBUG:ai_benchmark:Inference Time: 56 ms
16.1 - inference | batch=2, size=512x512: 55.9 ± 0.4 ms
INFO:ai_benchmark:16.1 - inference | batch=2, size=512x512: 55.9 ± 0.4 ms
2023-10-22 19:36:16.991554: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:36:17.927186: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 1004 ms
DEBUG:ai_benchmark:Training Time: 1004 ms
Training Time: 120 ms
DEBUG:ai_benchmark:Training Time: 120 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 120 ms
DEBUG:ai_benchmark:Training Time: 120 ms
Training Time: 120 ms
DEBUG:ai_benchmark:Training Time: 120 ms
Training Time: 121 ms
DEBUG:ai_benchmark:Training Time: 121 ms
Training Time: 122 ms
DEBUG:ai_benchmark:Training Time: 122 ms
Training Time: 120 ms
DEBUG:ai_benchmark:Training Time: 120 ms
16.2 - training | batch=1, size=384x384: 121.1 ± 0.7 ms
INFO:ai_benchmark:16.2 - training | batch=1, size=384x384: 121.1 ± 0.7 ms
17/19. Pixel-RNN
INFO:ai_benchmark:
17/19. Pixel-RNN
2023-10-22 19:36:20.991322: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:20.991437: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:20.991475: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:20.991531: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:20.991568: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:20.991585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:36:21.117704: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:21.117763: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:21.117784: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:21.117813: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:21.117832: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:36:21.117842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:36:30.408471: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:36:30.416782: W tensorflow/c/c_api.cc:304] Operation '{name:'conv2d_out_logits/biases/Adam_1/Assign' id:47369 op device:{requested: '', assigned: ''} def:{{{node conv2d_out_logits/biases/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv2d_out_logits/biases/Adam_1, conv2d_out_logits/biases/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
2023-10-22 19:36:31.194235: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:36:33.247421: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 2132 ms
DEBUG:ai_benchmark:Inference Time: 2132 ms
Inference Time: 323 ms
DEBUG:ai_benchmark:Inference Time: 323 ms
Inference Time: 321 ms
DEBUG:ai_benchmark:Inference Time: 321 ms
Inference Time: 322 ms
DEBUG:ai_benchmark:Inference Time: 322 ms
Inference Time: 321 ms
DEBUG:ai_benchmark:Inference Time: 321 ms
Inference Time: 321 ms
DEBUG:ai_benchmark:Inference Time: 321 ms
Inference Time: 320 ms
DEBUG:ai_benchmark:Inference Time: 320 ms
Inference Time: 324 ms
DEBUG:ai_benchmark:Inference Time: 324 ms
Inference Time: 316 ms
DEBUG:ai_benchmark:Inference Time: 316 ms
Inference Time: 324 ms
DEBUG:ai_benchmark:Inference Time: 324 ms
Inference Time: 322 ms
DEBUG:ai_benchmark:Inference Time: 322 ms
Inference Time: 320 ms
DEBUG:ai_benchmark:Inference Time: 320 ms
Inference Time: 324 ms
DEBUG:ai_benchmark:Inference Time: 324 ms
Inference Time: 322 ms
DEBUG:ai_benchmark:Inference Time: 322 ms
Inference Time: 323 ms
DEBUG:ai_benchmark:Inference Time: 323 ms
Inference Time: 321 ms
DEBUG:ai_benchmark:Inference Time: 321 ms
Inference Time: 323 ms
DEBUG:ai_benchmark:Inference Time: 323 ms
Inference Time: 320 ms
DEBUG:ai_benchmark:Inference Time: 320 ms
Inference Time: 320 ms
DEBUG:ai_benchmark:Inference Time: 320 ms
Inference Time: 315 ms
DEBUG:ai_benchmark:Inference Time: 315 ms
Inference Time: 315 ms
DEBUG:ai_benchmark:Inference Time: 315 ms
Inference Time: 317 ms
DEBUG:ai_benchmark:Inference Time: 317 ms
17.1 - inference | batch=50, size=64x64: 321 ± 3 ms
INFO:ai_benchmark:17.1 - inference | batch=50, size=64x64: 321 ± 3 ms
2023-10-22 19:36:56.950409: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 11331 ms
DEBUG:ai_benchmark:Training Time: 11331 ms
Training Time: 2039 ms
DEBUG:ai_benchmark:Training Time: 2039 ms
Training Time: 1986 ms
DEBUG:ai_benchmark:Training Time: 1986 ms
Training Time: 2039 ms
DEBUG:ai_benchmark:Training Time: 2039 ms
Training Time: 2091 ms
DEBUG:ai_benchmark:Training Time: 2091 ms
Training Time: 2226 ms
DEBUG:ai_benchmark:Training Time: 2226 ms
Training Time: 2001 ms
DEBUG:ai_benchmark:Training Time: 2001 ms
Training Time: 2159 ms
DEBUG:ai_benchmark:Training Time: 2159 ms
Training Time: 2192 ms
DEBUG:ai_benchmark:Training Time: 2192 ms
17.2 - training | batch=10, size=64x64: 2092 ± 85 ms
INFO:ai_benchmark:17.2 - training | batch=10, size=64x64: 2092 ± 85 ms
18/19. LSTM-Sentiment
INFO:ai_benchmark:
18/19. LSTM-Sentiment
2023-10-22 19:37:17.849688: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:17.849763: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:17.849784: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:17.849823: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:17.849843: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:17.849853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:37:17.986061: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:37:17.992936: W tensorflow/c/c_api.cc:304] Operation '{name:'Variable_1/Adam_1/Assign' id:351 op device:{requested: '', assigned: ''} def:{{{node Variable_1/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](Variable_1/Adam_1, Variable_1/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
2023-10-22 19:37:18.016768: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:37:18.194147: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 331 ms
DEBUG:ai_benchmark:Inference Time: 331 ms
Inference Time: 319 ms
DEBUG:ai_benchmark:Inference Time: 319 ms
Inference Time: 326 ms
DEBUG:ai_benchmark:Inference Time: 326 ms
Inference Time: 330 ms
DEBUG:ai_benchmark:Inference Time: 330 ms
Inference Time: 327 ms
DEBUG:ai_benchmark:Inference Time: 327 ms
Inference Time: 327 ms
DEBUG:ai_benchmark:Inference Time: 327 ms
Inference Time: 321 ms
DEBUG:ai_benchmark:Inference Time: 321 ms
Inference Time: 323 ms
DEBUG:ai_benchmark:Inference Time: 323 ms
Inference Time: 315 ms
DEBUG:ai_benchmark:Inference Time: 315 ms
Inference Time: 317 ms
DEBUG:ai_benchmark:Inference Time: 317 ms
Inference Time: 308 ms
DEBUG:ai_benchmark:Inference Time: 308 ms
Inference Time: 317 ms
DEBUG:ai_benchmark:Inference Time: 317 ms
Inference Time: 305 ms
DEBUG:ai_benchmark:Inference Time: 305 ms
Inference Time: 319 ms
DEBUG:ai_benchmark:Inference Time: 319 ms
Inference Time: 328 ms
DEBUG:ai_benchmark:Inference Time: 328 ms
Inference Time: 294 ms
DEBUG:ai_benchmark:Inference Time: 294 ms
Inference Time: 329 ms
DEBUG:ai_benchmark:Inference Time: 329 ms
Inference Time: 327 ms
DEBUG:ai_benchmark:Inference Time: 327 ms
Inference Time: 327 ms
DEBUG:ai_benchmark:Inference Time: 327 ms
Inference Time: 323 ms
DEBUG:ai_benchmark:Inference Time: 323 ms
Inference Time: 329 ms
DEBUG:ai_benchmark:Inference Time: 329 ms
Inference Time: 326 ms
DEBUG:ai_benchmark:Inference Time: 326 ms
18.1 - inference | batch=100, size=1024x300: 321 ± 9 ms
INFO:ai_benchmark:18.1 - inference | batch=100, size=1024x300: 321 ± 9 ms
2023-10-22 19:37:27.813396: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Training Time: 1266 ms
DEBUG:ai_benchmark:Training Time: 1266 ms
Training Time: 1022 ms
DEBUG:ai_benchmark:Training Time: 1022 ms
Training Time: 993 ms
DEBUG:ai_benchmark:Training Time: 993 ms
Training Time: 1024 ms
DEBUG:ai_benchmark:Training Time: 1024 ms
Training Time: 1000 ms
DEBUG:ai_benchmark:Training Time: 1000 ms
Training Time: 1178 ms
DEBUG:ai_benchmark:Training Time: 1178 ms
Training Time: 1051 ms
DEBUG:ai_benchmark:Training Time: 1051 ms
Training Time: 1188 ms
DEBUG:ai_benchmark:Training Time: 1188 ms
Training Time: 1046 ms
DEBUG:ai_benchmark:Training Time: 1046 ms
Training Time: 1192 ms
DEBUG:ai_benchmark:Training Time: 1192 ms
Training Time: 1039 ms
DEBUG:ai_benchmark:Training Time: 1039 ms
Training Time: 1193 ms
DEBUG:ai_benchmark:Training Time: 1193 ms
Training Time: 1039 ms
DEBUG:ai_benchmark:Training Time: 1039 ms
Training Time: 983 ms
DEBUG:ai_benchmark:Training Time: 983 ms
Training Time: 1187 ms
DEBUG:ai_benchmark:Training Time: 1187 ms
Training Time: 984 ms
DEBUG:ai_benchmark:Training Time: 984 ms
Training Time: 990 ms
DEBUG:ai_benchmark:Training Time: 990 ms
Training Time: 988 ms
DEBUG:ai_benchmark:Training Time: 988 ms
Training Time: 1043 ms
DEBUG:ai_benchmark:Training Time: 1043 ms
Training Time: 987 ms
DEBUG:ai_benchmark:Training Time: 987 ms
Training Time: 977 ms
DEBUG:ai_benchmark:Training Time: 977 ms
Training Time: 982 ms
DEBUG:ai_benchmark:Training Time: 982 ms
18.2 - training | batch=10, size=1024x300: 1052 ± 79 ms
INFO:ai_benchmark:18.2 - training | batch=10, size=1024x300: 1052 ± 79 ms
19/19. GNMT-Translation
INFO:ai_benchmark:
19/19. GNMT-Translation
2023-10-22 19:37:51.452079: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:51.452178: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:51.452213: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:51.452261: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:51.452295: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-10-22 19:37:51.452310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 31404 MB memory: -> device: 0, name: , pci bus id: 0000:03:00.0
2023-10-22 19:37:51.599735: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:37:51.608421: W tensorflow/c/c_api.cc:304] Operation '{name:'index_to_string/table_init' id:13 op device:{requested: '', assigned: ''} def:{{{node index_to_string/table_init}} = InitializeTableFromTextFileV2[_has_manual_control_dependencies=true, delimiter="\t", key_index=-1, offset=0, value_index=-2, vocab_size=-1](index_to_string, index_to_string/table_init/asset_filepath)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
2023-10-22 19:37:51.624952: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
2023-10-22 19:37:51.757342: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.
Inference Time: 242 ms
DEBUG:ai_benchmark:Inference Time: 242 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 90 ms
DEBUG:ai_benchmark:Inference Time: 90 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 93 ms
DEBUG:ai_benchmark:Inference Time: 93 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 93 ms
DEBUG:ai_benchmark:Inference Time: 93 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 93 ms
DEBUG:ai_benchmark:Inference Time: 93 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 93 ms
DEBUG:ai_benchmark:Inference Time: 93 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
Inference Time: 92 ms
DEBUG:ai_benchmark:Inference Time: 92 ms
Inference Time: 91 ms
DEBUG:ai_benchmark:Inference Time: 91 ms
19.1 - inference | batch=1, size=1x20: 91.7 ± 0.8 ms
INFO:ai_benchmark:19.1 - inference | batch=1, size=1x20: 91.7 ± 0.8 ms
Device Inference Score: 22592
INFO:ai_benchmark:Device Inference Score: 22592
Device Training Score: 11170
INFO:ai_benchmark:Device Training Score: 11170
Device AI Score: 33762
INFO:ai_benchmark:Device AI Score: 33762
For more information and results, please visit http://ai-benchmark.com/alpha
INFO:ai_benchmark:For more information and results, please visit http://ai-benchmark.com/alpha
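
Note on the `mean ± std` summary lines: judging only from the numbers in this log (not from the ai_benchmark source), each summary appears to drop the first, warm-up run and report the mean and population standard deviation of the remaining runs. A minimal sketch that reproduces the 17.2 training figure under that assumption; the `summarize` helper and the `sample` string are hypothetical and exist only for this illustration:

```python
import re
import statistics


def summarize(log_text, kind="Training"):
    """Return (mean, population std) of '<kind> Time: N ms' lines.

    Assumption inferred from this log: the first run is a warm-up and is
    excluded. Lines prefixed with 'DEBUG:ai_benchmark:' are skipped because
    the pattern is anchored at the start of the line.
    """
    pattern = re.compile(rf"^{kind} Time: (\d+) ms$", re.MULTILINE)
    times = [int(value) for value in pattern.findall(log_text)]
    steady = times[1:]  # drop the warm-up run
    return statistics.fmean(steady), statistics.pstdev(steady)


# Training times for test 17.2, copied from the log above.
sample = "\n".join(
    f"Training Time: {t} ms\nDEBUG:ai_benchmark:Training Time: {t} ms"
    for t in [11331, 2039, 1986, 2039, 2091, 2226, 2001, 2159, 2192]
)

mean, std = summarize(sample)
print(f"{mean:.0f} ± {std:.0f} ms")
```

With the first run (11331 ms) excluded, the rounded result matches the reported `2092 ± 85 ms`; including it would give a much larger mean, which is why the warm-up exclusion is assumed here.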