Created October 15, 2023 00:12
Latest ai-benchmark using ROCm 5.7.1 and tensorflow-upstream 10/14/2023 source.
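The summary line printed after each test in this log (for example `22.8 ± 3.5 ms` for test 1.1) appears to be the mean and population standard deviation of the per-run timings with the first warm-up run dropped. That is an inference from the numbers in this log, not something confirmed from the ai-benchmark source. A minimal sketch under that assumption, where `summarize` is a hypothetical helper and the timings are copied from the MobileNet-V2 inference run:

```python
import statistics

def summarize(timings_ms):
    """Return (mean, population std) of the timings, skipping the first run.

    Assumption: the first run is treated as warm-up (graph build, kernel
    compilation) and excluded. This matches the figures in this log but is
    not taken from the ai-benchmark source.
    """
    steady = timings_ms[1:]
    return statistics.fmean(steady), statistics.pstdev(steady)

# "Inference Time" values from test 1.1 (MobileNet-V2, batch=50, 224x224).
mobilenet_inference_ms = [
    2622, 29, 31, 24, 21, 32, 22, 22, 22, 22, 18,
    22, 22, 21, 22, 22, 18, 22, 22, 21, 22, 22,
]

mean, std = summarize(mobilenet_inference_ms)
print(f"{mean:.1f} ± {std:.1f} ms")  # prints "22.8 ± 3.5 ms"
```

Including the 2622 ms first run would pull the mean far above the steady-state value, which is why it is excluded here.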
(tf) root@rocm:~/tmp# python benchmark.py
2023-10-14 15:02:22.116047: E external/local_xla/xla/stream_executor/plugin_registry.cc:93] Invalid plugin kind specified: DNN | |
2023-10-14 15:02:22.348480: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. | |
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2023-10-14 15:02:23.756833: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:23.982269: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:23.982301: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
>> AI-Benchmark - 0.1.3.cm | |
>> Let the AI Games begin | |
2023-10-14 15:02:25.095387: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.095474: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.095505: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.096220: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.096264: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.096317: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.096335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:02:25.388550: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388614: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388631: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388655: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388671: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:02:25.388709: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388735: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388749: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388765: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388779: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:25.388786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
* TF Version: 2.15.0
* Platform: Linux-5.15.0-86-generic-x86_64-with-glibc2.35
* CPU: AMD Ryzen 9 7900X 12-Core Processor
* CPU RAM: 63 GB
* GPU/0: Radeon RX 7900 XTX
* GPU RAM: 23.5 GB
* CUDA Version: N/A
* CUDA Build: N/A
The benchmark is running...
The tests might take up to 20 minutes
Please don't interrupt the script
1/19. MobileNet-V2 | |
2023-10-14 15:02:26.643065: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:26.643180: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:26.643214: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:26.643259: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:26.643288: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:26.643304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:02:26.746956: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled | |
2023-10-14 15:02:26.899813: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:02:27.270635: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2622 ms | |
Inference Time: 29 ms | |
Inference Time: 31 ms | |
Inference Time: 24 ms | |
Inference Time: 21 ms | |
Inference Time: 32 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 18 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 21 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 18 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 21 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
1.1 - inference | batch=50, size=224x224: 22.8 ± 3.5 ms | |
2023-10-14 15:02:33.291002: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:02:33.697929: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 784 ms | |
Training Time: 367 ms | |
Training Time: 363 ms | |
Training Time: 362 ms | |
Training Time: 375 ms | |
Training Time: 366 ms | |
Training Time: 375 ms | |
Training Time: 364 ms | |
Training Time: 359 ms | |
Training Time: 369 ms | |
Training Time: 375 ms | |
Training Time: 374 ms | |
Training Time: 353 ms | |
Training Time: 360 ms | |
Training Time: 358 ms | |
Training Time: 356 ms | |
Training Time: 358 ms | |
Training Time: 356 ms | |
Training Time: 359 ms | |
Training Time: 357 ms | |
Training Time: 353 ms | |
Training Time: 355 ms | |
1.2 - training | batch=50, size=224x224: 363 ± 7 ms | |
2/19. Inception-V3 | |
2023-10-14 15:02:44.470971: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:44.471068: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:44.471101: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:44.471158: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:44.471191: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:02:44.471207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:02:44.741532: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:02:44.954822: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2552 ms | |
Inference Time: 28 ms | |
Inference Time: 28 ms | |
Inference Time: 28 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 28 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 28 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
Inference Time: 27 ms | |
2.1 - inference | batch=20, size=346x346: 27.2 ± 0.4 ms | |
2023-10-14 15:02:50.047277: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:02:50.614592: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 900 ms | |
Training Time: 399 ms | |
Training Time: 392 ms | |
Training Time: 393 ms | |
Training Time: 393 ms | |
Training Time: 392 ms | |
Training Time: 398 ms | |
Training Time: 398 ms | |
Training Time: 400 ms | |
Training Time: 395 ms | |
Training Time: 390 ms | |
Training Time: 390 ms | |
Training Time: 389 ms | |
Training Time: 395 ms | |
Training Time: 390 ms | |
Training Time: 389 ms | |
Training Time: 388 ms | |
Training Time: 389 ms | |
Training Time: 393 ms | |
Training Time: 389 ms | |
Training Time: 405 ms | |
Training Time: 389 ms | |
2.2 - training | batch=20, size=346x346: 393 ± 4 ms | |
3/19. Inception-V4 | |
2023-10-14 15:03:00.614325: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:00.614392: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:00.614410: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:00.614439: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:00.614457: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:00.614467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:01.133304: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:01.416106: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 280 ms | |
Inference Time: 36 ms | |
Inference Time: 40 ms | |
Inference Time: 34 ms | |
Inference Time: 35 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 33 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
3.1 - inference | batch=10, size=346x346: 33.4 ± 1.8 ms | |
2023-10-14 15:03:04.097457: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:04.989052: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1134 ms | |
Training Time: 302 ms | |
Training Time: 301 ms | |
Training Time: 301 ms | |
Training Time: 298 ms | |
Training Time: 300 ms | |
Training Time: 300 ms | |
Training Time: 299 ms | |
Training Time: 300 ms | |
Training Time: 303 ms | |
Training Time: 300 ms | |
Training Time: 298 ms | |
Training Time: 300 ms | |
Training Time: 300 ms | |
Training Time: 300 ms | |
Training Time: 313 ms | |
Training Time: 303 ms | |
Training Time: 301 ms | |
Training Time: 303 ms | |
Training Time: 299 ms | |
Training Time: 302 ms | |
Training Time: 304 ms | |
3.2 - training | batch=10, size=346x346: 301 ± 3 ms | |
4/19. Inception-ResNet-V2 | |
2023-10-14 15:03:12.299818: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:12.300077: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:12.300140: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:12.300169: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:12.300186: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:12.300195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:13.155066: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:13.591312: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 742 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
4.1 - inference | batch=10, size=346x346: 38.1 ± 0.7 ms | |
2023-10-14 15:03:17.797480: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:19.411986: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1822 ms | |
Training Time: 250 ms | |
Training Time: 267 ms | |
Training Time: 252 ms | |
Training Time: 249 ms | |
Training Time: 251 ms | |
Training Time: 249 ms | |
Training Time: 249 ms | |
Training Time: 251 ms | |
Training Time: 249 ms | |
Training Time: 252 ms | |
Training Time: 249 ms | |
Training Time: 252 ms | |
Training Time: 253 ms | |
Training Time: 252 ms | |
Training Time: 251 ms | |
Training Time: 251 ms | |
Training Time: 251 ms | |
Training Time: 251 ms | |
Training Time: 249 ms | |
Training Time: 251 ms | |
Training Time: 251 ms | |
4.2 - training | batch=8, size=346x346: 251 ± 4 ms | |
5/19. ResNet-V2-50 | |
2023-10-14 15:03:25.775048: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:25.775199: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:25.775259: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:25.775324: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:25.775344: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:25.775356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:25.966544: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:26.098802: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 147 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
5.1 - inference | batch=10, size=346x346: 20.5 ± 0.5 ms | |
2023-10-14 15:03:27.892222: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:28.267980: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 523 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 84 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 84 ms | |
Training Time: 85 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 84 ms | |
Training Time: 88 ms | |
Training Time: 85 ms | |
Training Time: 84 ms | |
5.2 - training | batch=10, size=346x346: 84.6 ± 0.9 ms | |
6/19. ResNet-V2-152 | |
2023-10-14 15:03:30.836981: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:30.837044: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:30.837061: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:30.837089: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:30.837107: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:30.837116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:31.654625: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:31.990248: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 340 ms | |
Inference Time: 31 ms | |
Inference Time: 30 ms | |
Inference Time: 31 ms | |
Inference Time: 31 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 31 ms | |
Inference Time: 31 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 31 ms | |
Inference Time: 31 ms | |
6.1 - inference | batch=10, size=256x256: 30.3 ± 0.5 ms | |
2023-10-14 15:03:35.281545: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:36.448473: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1236 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 108 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 108 ms | |
Training Time: 106 ms | |
6.2 - training | batch=10, size=256x256: 106.8 ± 0.8 ms | |
7/19. VGG-16 | |
2023-10-14 15:03:39.442594: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:39.443575: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:39.443892: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:39.443928: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:39.443946: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:39.443956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:39.480703: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:39.556154: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 86 ms | |
Inference Time: 42 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 47 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 46 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 47 ms | |
Inference Time: 46 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
Inference Time: 44 ms | |
Inference Time: 47 ms | |
Inference Time: 43 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 51 ms | |
7.1 - inference | batch=20, size=224x224: 45.0 ± 2.0 ms | |
2023-10-14 15:03:41.816492: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:42.116570: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 184 ms | |
Training Time: 84 ms | |
Training Time: 80 ms | |
Training Time: 81 ms | |
Training Time: 79 ms | |
Training Time: 82 ms | |
Training Time: 80 ms | |
Training Time: 80 ms | |
Training Time: 82 ms | |
Training Time: 83 ms | |
Training Time: 79 ms | |
Training Time: 76 ms | |
Training Time: 79 ms | |
Training Time: 79 ms | |
Training Time: 80 ms | |
Training Time: 79 ms | |
Training Time: 81 ms | |
Training Time: 79 ms | |
Training Time: 82 ms | |
Training Time: 79 ms | |
Training Time: 80 ms | |
Training Time: 83 ms | |
7.2 - training | batch=2, size=224x224: 80.3 ± 1.8 ms | |
8/19. SRCNN 9-5-5 | |
2023-10-14 15:03:44.102987: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:44.103080: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:44.103113: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:44.103163: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:44.103194: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:44.103210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:44.118538: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:44.230837: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 50 ms | |
Inference Time: 30 ms | |
Inference Time: 32 ms | |
Inference Time: 29 ms | |
Inference Time: 34 ms | |
Inference Time: 28 ms | |
Inference Time: 34 ms | |
Inference Time: 27 ms | |
Inference Time: 34 ms | |
Inference Time: 28 ms | |
Inference Time: 32 ms | |
Inference Time: 28 ms | |
Inference Time: 32 ms | |
Inference Time: 28 ms | |
Inference Time: 34 ms | |
Inference Time: 27 ms | |
Inference Time: 36 ms | |
Inference Time: 27 ms | |
Inference Time: 36 ms | |
Inference Time: 27 ms | |
Inference Time: 36 ms | |
Inference Time: 27 ms | |
8.1 - inference | batch=10, size=512x512: 30.8 ± 3.3 ms | |
Inference Time: 24 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 25 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 25 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
8.2 - inference | batch=1, size=1536x1536: 23.0 ± 0.7 ms | |
2023-10-14 15:03:47.984362: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:48.309565: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 197 ms | |
Training Time: 181 ms | |
Training Time: 176 ms | |
Training Time: 175 ms | |
Training Time: 170 ms | |
Training Time: 171 ms | |
Training Time: 170 ms | |
Training Time: 171 ms | |
Training Time: 166 ms | |
Training Time: 171 ms | |
Training Time: 169 ms | |
Training Time: 174 ms | |
Training Time: 172 ms | |
Training Time: 172 ms | |
Training Time: 170 ms | |
Training Time: 170 ms | |
Training Time: 171 ms | |
Training Time: 170 ms | |
Training Time: 168 ms | |
Training Time: 174 ms | |
Training Time: 168 ms | |
Training Time: 169 ms | |
8.3 - training | batch=10, size=512x512: 171 ± 3 ms | |
9/19. VGG-19 Super-Res | |
2023-10-14 15:03:58.190969: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:58.191029: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:58.191046: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:58.191070: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:58.191086: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:03:58.191095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:03:58.242288: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:03:58.344816: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 72 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 38 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
9.1 - inference | batch=10, size=256x256: 35.8 ± 0.7 ms | |
Inference Time: 59 ms | |
Inference Time: 56 ms | |
Inference Time: 59 ms | |
Inference Time: 58 ms | |
Inference Time: 56 ms | |
Inference Time: 58 ms | |
Inference Time: 66 ms | |
Inference Time: 55 ms | |
Inference Time: 56 ms | |
Inference Time: 58 ms | |
Inference Time: 56 ms | |
Inference Time: 57 ms | |
Inference Time: 60 ms | |
Inference Time: 66 ms | |
Inference Time: 60 ms | |
Inference Time: 59 ms | |
Inference Time: 60 ms | |
Inference Time: 60 ms | |
Inference Time: 63 ms | |
Inference Time: 59 ms | |
Inference Time: 59 ms | |
Inference Time: 60 ms | |
9.2 - inference | batch=1, size=1024x1024: 59.1 ± 2.9 ms | |
2023-10-14 15:04:02.714165: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:03.008050: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 276 ms | |
Training Time: 200 ms | |
Training Time: 200 ms | |
Training Time: 199 ms | |
Training Time: 199 ms | |
Training Time: 199 ms | |
Training Time: 199 ms | |
Training Time: 201 ms | |
Training Time: 199 ms | |
Training Time: 200 ms | |
Training Time: 202 ms | |
Training Time: 200 ms | |
Training Time: 200 ms | |
Training Time: 199 ms | |
Training Time: 200 ms | |
Training Time: 200 ms | |
Training Time: 201 ms | |
Training Time: 200 ms | |
Training Time: 200 ms | |
Training Time: 200 ms | |
Training Time: 201 ms | |
Training Time: 200 ms | |
9.3 - training | batch=10, size=224x224: 200.0 ± 0.8 ms | |
10/19. ResNet-SRGAN | |
2023-10-14 15:04:12.157956: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:12.158018: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:12.158037: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:12.158067: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:12.158087: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:12.158097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:04:12.385685: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:12.583018: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2656 ms | |
Inference Time: 45 ms | |
Inference Time: 43 ms | |
Inference Time: 43 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 44 ms | |
Inference Time: 43 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
Inference Time: 43 ms | |
Inference Time: 44 ms | |
Inference Time: 43 ms | |
Inference Time: 43 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
Inference Time: 43 ms | |
Inference Time: 42 ms | |
10.1 - inference | batch=10, size=512x512: 42.9 ± 0.8 ms | |
Inference Time: 40 ms | |
Inference Time: 35 ms | |
Inference Time: 37 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 38 ms | |
Inference Time: 34 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
10.2 - inference | batch=1, size=1536x1536: 35.6 ± 1.0 ms | |
2023-10-14 15:04:19.776953: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:20.131030: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 330 ms | |
Training Time: 121 ms | |
Training Time: 114 ms | |
Training Time: 117 ms | |
Training Time: 120 ms | |
Training Time: 118 ms | |
Training Time: 131 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 116 ms | |
Training Time: 119 ms | |
Training Time: 115 ms | |
Training Time: 114 ms | |
Training Time: 115 ms | |
Training Time: 118 ms | |
Training Time: 114 ms | |
Training Time: 114 ms | |
Training Time: 114 ms | |
Training Time: 115 ms | |
Training Time: 115 ms | |
Training Time: 115 ms | |
10.3 - training | batch=5, size=512x512: 117 ± 4 ms | |
11/19. ResNet-DPED | |
2023-10-14 15:04:25.763317: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:25.763481: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:25.763506: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:25.763535: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:25.763555: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:25.763565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:04:25.793871: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:25.905343: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 636 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 52 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
11.1 - inference | batch=10, size=256x256: 48.2 ± 1.1 ms | |
Inference Time: 4254 ms | |
Inference Time: 78 ms | |
Inference Time: 77 ms | |
Inference Time: 80 ms | |
Inference Time: 77 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 78 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
11.2 - inference | batch=1, size=1024x1024: 78.7 ± 0.9 ms | |
2023-10-14 15:04:35.872063: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:36.375474: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1906 ms | |
Training Time: 106 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 108 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 106 ms | |
Training Time: 107 ms | |
Training Time: 107 ms | |
Training Time: 106 ms | |
11.3 - training | batch=15, size=128x128: 106.7 ± 0.7 ms | |
12/19. U-Net | |
2023-10-14 15:04:47.115840: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:47.115901: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:47.115918: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:47.115942: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:47.115959: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:04:47.115967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:04:47.187508: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:04:47.284195: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 6592 ms | |
Inference Time: 81 ms | |
Inference Time: 87 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 80 ms | |
Inference Time: 78 ms | |
Inference Time: 83 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 80 ms | |
12.1 - inference | batch=4, size=512x512: 79.4 ± 2.1 ms | |
Inference Time: 7581 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 80 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
12.2 - inference | batch=1, size=1024x1024: 81.0 ± 0.7 ms | |
2023-10-14 15:05:06.333452: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:06.760069: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5457 ms | |
Training Time: 112 ms | |
Training Time: 112 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 112 ms | |
Training Time: 112 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 111 ms | |
Training Time: 112 ms | |
Training Time: 111 ms | |
Training Time: 112 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 113 ms | |
Training Time: 112 ms | |
Training Time: 112 ms | |
12.3 - training | batch=4, size=256x256: 112.4 ± 0.7 ms | |
13/19. Nvidia-SPADE | |
2023-10-14 15:05:15.045140: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:15.045205: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:15.045222: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:15.045249: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:15.045266: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:15.045274: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:05:15.216647: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:15.402677: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1868 ms | |
Inference Time: 52 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 52 ms | |
13.1 - inference | batch=5, size=128x128: 50.8 ± 0.6 ms | |
2023-10-14 15:05:19.865705: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:20.769577: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2589 ms | |
Training Time: 80 ms | |
Training Time: 80 ms | |
Training Time: 79 ms | |
Training Time: 81 ms | |
Training Time: 81 ms | |
Training Time: 83 ms | |
Training Time: 81 ms | |
Training Time: 80 ms | |
Training Time: 79 ms | |
Training Time: 82 ms | |
Training Time: 81 ms | |
Training Time: 80 ms | |
Training Time: 80 ms | |
Training Time: 80 ms | |
Training Time: 80 ms | |
Training Time: 81 ms | |
Training Time: 81 ms | |
Training Time: 82 ms | |
Training Time: 79 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
13.2 - training | batch=1, size=128x128: 80.7 ± 1.1 ms | |
14/19. ICNet | |
2023-10-14 15:05:24.692589: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:24.692992: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:24.693033: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:24.693084: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:24.693119: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:24.693137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:05:24.855863: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:25.071520: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1616 ms | |
Inference Time: 80 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 85 ms | |
Inference Time: 90 ms | |
Inference Time: 88 ms | |
Inference Time: 83 ms | |
Inference Time: 89 ms | |
Inference Time: 84 ms | |
Inference Time: 91 ms | |
Inference Time: 84 ms | |
Inference Time: 85 ms | |
Inference Time: 82 ms | |
Inference Time: 87 ms | |
Inference Time: 84 ms | |
Inference Time: 87 ms | |
Inference Time: 87 ms | |
Inference Time: 84 ms | |
Inference Time: 83 ms | |
Inference Time: 86 ms | |
Inference Time: 87 ms | |
14.1 - inference | batch=5, size=1024x1536: 85.2 ± 2.8 ms | |
2023-10-14 15:05:30.605581: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:31.308104: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3481 ms | |
Training Time: 377 ms | |
Training Time: 395 ms | |
Training Time: 383 ms | |
Training Time: 378 ms | |
Training Time: 375 ms | |
Training Time: 399 ms | |
Training Time: 438 ms | |
Training Time: 379 ms | |
Training Time: 395 ms | |
Training Time: 435 ms | |
Training Time: 412 ms | |
Training Time: 407 ms | |
Training Time: 377 ms | |
Training Time: 364 ms | |
Training Time: 388 ms | |
Training Time: 426 ms | |
Training Time: 380 ms | |
Training Time: 376 ms | |
Training Time: 433 ms | |
Training Time: 368 ms | |
Training Time: 404 ms | |
14.2 - training | batch=10, size=1024x1536: 395 ± 22 ms | |
15/19. PSPNet | |
2023-10-14 15:05:51.716627: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:51.716713: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:51.716731: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:51.716760: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:51.716779: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:05:51.716789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:05:51.944695: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:05:52.158135: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 10123 ms | |
Inference Time: 206 ms | |
Inference Time: 211 ms | |
Inference Time: 205 ms | |
Inference Time: 208 ms | |
Inference Time: 210 ms | |
Inference Time: 209 ms | |
Inference Time: 213 ms | |
Inference Time: 215 ms | |
Inference Time: 210 ms | |
Inference Time: 206 ms | |
Inference Time: 211 ms | |
Inference Time: 206 ms | |
Inference Time: 210 ms | |
Inference Time: 211 ms | |
Inference Time: 213 ms | |
Inference Time: 206 ms | |
Inference Time: 205 ms | |
Inference Time: 205 ms | |
Inference Time: 220 ms | |
Inference Time: 214 ms | |
Inference Time: 211 ms | |
15.1 - inference | batch=5, size=720x720: 210 ± 4 ms | |
2023-10-14 15:06:08.352412: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:06:08.815435: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5221 ms | |
Training Time: 124 ms | |
Training Time: 123 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 127 ms | |
Training Time: 124 ms | |
Training Time: 125 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 124 ms | |
Training Time: 129 ms | |
Training Time: 124 ms | |
15.2 - training | batch=1, size=512x512: 125 ± 1 ms | |
16/19. DeepLab | |
2023-10-14 15:06:16.646700: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:16.646798: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:16.646831: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:16.646877: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:16.646909: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:16.646928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:06:17.089644: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:06:17.458490: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1179 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 47 ms | |
Inference Time: 49 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
16.1 - inference | batch=2, size=512x512: 47.9 ± 0.7 ms | |
2023-10-14 15:06:21.077653: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:06:21.913419: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2215 ms | |
Training Time: 77 ms | |
Training Time: 76 ms | |
Training Time: 77 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 76 ms | |
Training Time: 77 ms | |
Training Time: 88 ms | |
Training Time: 77 ms | |
Training Time: 77 ms | |
Training Time: 77 ms | |
Training Time: 77 ms | |
Training Time: 77 ms | |
Training Time: 76 ms | |
16.2 - training | batch=1, size=384x384: 77.0 ± 2.5 ms | |
17/19. Pixel-RNN | |
2023-10-14 15:06:25.215820: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.215898: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.215929: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.216048: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.216074: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.216085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:06:25.341668: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.341719: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.341738: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.341762: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.341779: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:06:25.341788: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:06:32.815332: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:06:32.827924: W tensorflow/c/c_api.cc:305] Operation '{name:'conv2d_out_logits/biases/Adam_1/Assign' id:47115 op device:{requested: '', assigned: ''} def:{{{node conv2d_out_logits/biases/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv2d_out_logits/biases/Adam_1, conv2d_out_logits/biases/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-10-14 15:06:33.577408: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:06:35.532577: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2369 ms | |
Inference Time: 307 ms | |
Inference Time: 315 ms | |
Inference Time: 316 ms | |
Inference Time: 312 ms | |
Inference Time: 313 ms | |
Inference Time: 314 ms | |
Inference Time: 308 ms | |
Inference Time: 313 ms | |
Inference Time: 307 ms | |
Inference Time: 310 ms | |
Inference Time: 314 ms | |
Inference Time: 307 ms | |
Inference Time: 323 ms | |
Inference Time: 307 ms | |
Inference Time: 293 ms | |
Inference Time: 300 ms | |
Inference Time: 311 ms | |
Inference Time: 303 ms | |
Inference Time: 305 ms | |
Inference Time: 308 ms | |
Inference Time: 309 ms | |
17.1 - inference | batch=50, size=64x64: 309 ± 6 ms | |
2023-10-14 15:06:58.646138: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 12407 ms | |
Training Time: 1462 ms | |
Training Time: 1558 ms | |
Training Time: 1524 ms | |
Training Time: 1587 ms | |
Training Time: 1639 ms | |
Training Time: 1530 ms | |
Training Time: 1557 ms | |
Training Time: 1536 ms | |
Training Time: 1559 ms | |
Training Time: 1551 ms | |
17.2 - training | batch=10, size=64x64: 1550 ± 43 ms | |
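The 17.2 summary line appears consistent with dropping the first (warm-up) run and reporting the mean and population standard deviation of the remaining runs. A minimal sketch that reproduces it from the timings above; this aggregation rule is an assumption inferred from the numbers, not confirmed from the benchmark source, and `summarize` is a hypothetical helper:

```python
from statistics import fmean, pstdev

# Training times for test 17.2, in ms, copied from the log above.
training_ms = [12407, 1462, 1558, 1524, 1587, 1639, 1530, 1557, 1536, 1559, 1551]

def summarize(times_ms, warmup=1):
    """Return (mean, population std) after skipping `warmup` leading runs."""
    steady = times_ms[warmup:]
    return fmean(steady), pstdev(steady)

mean_ms, std_ms = summarize(training_ms)
print(f"{mean_ms:.0f} ± {std_ms:.0f} ms")
```

Skipping the first run matters here: at 12407 ms it is roughly eight times slower than the steady-state runs and would otherwise dominate both figures.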
18/19. LSTM-Sentiment | |
2023-10-14 15:07:20.838263: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:20.838338: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:20.838356: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:20.838391: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:20.838409: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:20.838420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:07:20.944745: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:07:20.948530: W tensorflow/c/c_api.cc:305] Operation '{name:'Variable_1/Adam_1/Assign' id:325 op device:{requested: '', assigned: ''} def:{{{node Variable_1/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](Variable_1/Adam_1, Variable_1/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-10-14 15:07:20.972233: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:07:21.198197: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 548 ms | |
Inference Time: 352 ms | |
Inference Time: 386 ms | |
Inference Time: 366 ms | |
Inference Time: 359 ms | |
Inference Time: 366 ms | |
Inference Time: 365 ms | |
Inference Time: 360 ms | |
Inference Time: 378 ms | |
Inference Time: 364 ms | |
Inference Time: 359 ms | |
Inference Time: 362 ms | |
Inference Time: 365 ms | |
Inference Time: 382 ms | |
Inference Time: 364 ms | |
Inference Time: 381 ms | |
Inference Time: 379 ms | |
Inference Time: 361 ms | |
Inference Time: 363 ms | |
Inference Time: 362 ms | |
Inference Time: 385 ms | |
Inference Time: 384 ms | |
18.1 - inference | batch=100, size=1024x300: 369 ± 10 ms | |
2023-10-14 15:07:31.951713: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 810 ms | |
Training Time: 707 ms | |
Training Time: 707 ms | |
Training Time: 726 ms | |
Training Time: 716 ms | |
Training Time: 712 ms | |
Training Time: 708 ms | |
Training Time: 705 ms | |
Training Time: 701 ms | |
Training Time: 694 ms | |
Training Time: 692 ms | |
Training Time: 708 ms | |
Training Time: 699 ms | |
Training Time: 705 ms | |
Training Time: 706 ms | |
Training Time: 716 ms | |
Training Time: 725 ms | |
Training Time: 725 ms | |
Training Time: 721 ms | |
Training Time: 718 ms | |
Training Time: 735 ms | |
Training Time: 726 ms | |
18.2 - training | batch=10, size=1024x300: 712 ± 11 ms | |
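To compare runs across ROCm or TensorFlow versions, the per-test summary lines can be pulled out of a log like this one with a regular expression. A minimal sketch, assuming the line layout seen in this log; `SUMMARY_RE` and `parse_summary` are hypothetical names, not part of ai-benchmark:

```python
import re

# Matches summary lines such as
# "18.2 - training | batch=10, size=1024x300: 712 ± 11 ms".
SUMMARY_RE = re.compile(
    r"(?P<test>\d+\.\d+) - (?P<mode>inference|training) \| "
    r"batch=(?P<batch>\d+), size=(?P<size>\d+x\d+): "
    r"(?P<mean>[\d.]+) ± (?P<std>[\d.]+) ms"
)

def parse_summary(line):
    """Return a dict for a summary line, or None for any other line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "test": m["test"],
        "mode": m["mode"],
        "batch": int(m["batch"]),
        "size": m["size"],
        "mean_ms": float(m["mean"]),
        "std_ms": float(m["std"]),
    }

row = parse_summary("18.2 - training | batch=10, size=1024x300: 712 ± 11 ms | |")
print(row)
```

Per-run lines such as `Training Time: 810 ms` do not match the pattern and return `None`, so the function can be mapped over every line of the log.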
19/19. GNMT-Translation | |
2023-10-14 15:07:48.028090: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:48.028179: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:48.028210: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:48.028250: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:48.028279: I external/local_xla/xla/stream_executor/rocm/rocm_gpu_executor.cc:787] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-10-14 15:07:48.028295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1926] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:03:00.0 | |
2023-10-14 15:07:48.685811: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:07:48.696186: W tensorflow/c/c_api.cc:305] Operation '{name:'index_to_string/table_init' id:13 op device:{requested: '', assigned: ''} def:{{{node index_to_string/table_init}} = InitializeTableFromTextFileV2[_has_manual_control_dependencies=true, delimiter="\t", key_index=-1, offset=0, value_index=-2, vocab_size=-1](index_to_string, index_to_string/table_init/asset_filepath)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-10-14 15:07:48.714160: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-10-14 15:07:48.835731: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 707 ms | |
Inference Time: 102 ms | |
Inference Time: 101 ms | |
Inference Time: 101 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 101 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 101 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 100 ms | |
Inference Time: 102 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 101 ms | |
Inference Time: 101 ms | |
Inference Time: 100 ms | |
Inference Time: 100 ms | |
19.1 - inference | batch=1, size=1x20: 100.7 ± 0.6 ms | |
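The reported latencies can be turned into rough throughput figures. A small sketch, assuming each reported time covers one full batch (an interpretation of the log, not something it states); `throughput` is a hypothetical helper:

```python
def throughput(batch, mean_ms):
    """Samples processed per second if one batch takes mean_ms milliseconds."""
    return batch * 1000.0 / mean_ms

# (batch size, mean time in ms) taken from the summary lines above.
results = {
    "17.1 inference": (50, 309),
    "17.2 training": (10, 1550),
    "18.1 inference": (100, 369),
    "18.2 training": (10, 712),
    "19.1 inference": (1, 100.7),
}

for name, (batch, mean_ms) in results.items():
    print(f"{name}: {throughput(batch, mean_ms):.1f} samples/s")
```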
Device Inference Score: 25523 | |
Device Training Score: 15473 | |
Device AI Score: 40996 | |
For more information and results, please visit http://ai-benchmark.com/alpha |
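Most of the log is TensorFlow/XLA runtime chatter (the repeated NUMA and "ROCm Fusion is enabled" messages). A minimal sketch for filtering those lines out and reading the final scores, assuming the timestamp-plus-severity prefix seen in this log; the function and pattern names are hypothetical:

```python
import re

# Runtime messages start with a timestamp and a severity letter,
# e.g. "2023-10-14 15:06:25.341719: I external/...".
RUNTIME_LOG_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: [IWEF] ")
SCORE_RE = re.compile(r"^Device (Inference|Training|AI) Score: (\d+)")

def benchmark_lines(lines):
    """Keep only lines that are not TensorFlow/XLA runtime messages."""
    return [ln for ln in lines if not RUNTIME_LOG_RE.match(ln)]

def scores(lines):
    """Collect the final device scores into a dict keyed by score type."""
    found = {}
    for ln in lines:
        m = SCORE_RE.match(ln)
        if m:
            found[m.group(1)] = int(m.group(2))
    return found

sample = [
    "2023-10-14 15:07:48.835731: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled.",
    "19.1 - inference | batch=1, size=1x20: 100.7 ± 0.6 ms",
    "Device Inference Score: 25523",
    "Device Training Score: 15473",
    "Device AI Score: 40996",
]

kept = benchmark_lines(sample)
s = scores(kept)
# In this run the AI score equals the inference and training scores added together.
print(s["Inference"] + s["Training"] == s["AI"])
```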
pip install new-ai-benchmark
https://pypi.org/project/new-ai-benchmark/
I built from https://github.com/ROCmSoftwarePlatform/tensorflow-upstream
I haven't tried tensorflow-directml on Windows.
Thanks for all the info!
https://pypi.org/project/new-ai-benchmark/ looks interesting.
I'm curious about the versioning: it's 2.2.0 vs 0.1.3.cm.
I will wait for a 2.15 ROCm build from PyPI.
It was built by me; the ai-benchmark is not compatible with Python 3.10+.
You can see the code here: https://github.com/johnnynunez/ai-benchmark
Hi,
thanks for sharing!
A few questions:
1) Where did you obtain version "0.1.3.cm"? I see: AI-Benchmark - 0.1.3.cm
I get 0.1.2 (https://ai-benchmark.com/alpha) from https://pypi.org/project/ai-benchmark/#history
2) By "tensorflow-upstream", do you mean you are building from source from the AMD ROCm TensorFlow repo? On PyPI I only see:
https://pypi.org/project/tensorflow-rocm/#history
pip install tensorflow-rocm==2.13.0.570
and you seem to be using 2.15.
3) Would you be interested in also posting (Windows) tensorflow-directml benchmarks? Its plugin goes up to TensorFlow 2.11.