Last active
September 16, 2023 13:39
ai-benchmark comparison between RX 7900 XTX (ROCm) and RTX 3080 Ti (WSL2) (9/4/2023)
:~/tmp$ python benchmark.py | |
>> AI-Benchmark - 0.1.3.cm | |
>> Let the AI Games begin | |
* TF Version: 2.13.0 | |
* Platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.35 | |
* CPU: AMD Ryzen 9 5950X 16-Core Processor | |
* CPU RAM: 8 GB | |
* GPU/0: NVIDIA GeForce RTX 3080 Ti | |
* GPU RAM: 9.1 GB | |
* CUDA Version: 11.5 | |
* CUDA Build: V11.5.119 | |
The benchmark is running... | |
The tests might take up to 20 minutes | |
Please don't interrupt the script | |
1/19. MobileNet-V2 | |
Inference Time: 1956 ms | |
Inference Time: 24 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 21 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 33 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 34 ms | |
Inference Time: 37 ms | |
Inference Time: 36 ms | |
1.1 - inference | batch=50, size=224x224: 28.1 ± 6.0 ms | |
Training Time: 2979 ms | |
Training Time: 78 ms | |
Training Time: 70 ms | |
Training Time: 67 ms | |
Training Time: 73 ms | |
Training Time: 72 ms | |
Training Time: 73 ms | |
Training Time: 73 ms | |
Training Time: 72 ms | |
Training Time: 108 ms | |
Training Time: 75 ms | |
Training Time: 75 ms | |
Training Time: 74 ms | |
Training Time: 74 ms | |
Training Time: 74 ms | |
Training Time: 82 ms | |
Training Time: 79 ms | |
Training Time: 81 ms | |
Training Time: 75 ms | |
Training Time: 89 ms | |
Training Time: 106 ms | |
Training Time: 117 ms | |
1.2 - training | batch=50, size=224x224: 80.3 ± 13.2 ms | |
2/19. Inception-V3 | |
Inference Time: 1021 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 46 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
2.1 - inference | batch=20, size=346x346: 39.1 ± 4.8 ms | |
Training Time: 5588 ms | |
Training Time: 95 ms | |
Training Time: 101 ms | |
Training Time: 101 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
Training Time: 104 ms | |
Training Time: 94 ms | |
Training Time: 94 ms | |
Training Time: 97 ms | |
Training Time: 100 ms | |
Training Time: 98 ms | |
Training Time: 97 ms | |
Training Time: 98 ms | |
Training Time: 99 ms | |
Training Time: 100 ms | |
Training Time: 99 ms | |
Training Time: 98 ms | |
Training Time: 96 ms | |
Training Time: 99 ms | |
Training Time: 96 ms | |
Training Time: 93 ms | |
2.2 - training | batch=20, size=346x346: 98.0 ± 2.6 ms | |
3/19. Inception-V4 | |
Inference Time: 1241 ms | |
Inference Time: 30 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 30 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 30 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 31 ms | |
Inference Time: 29 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 29 ms | |
3.1 - inference | batch=10, size=346x346: 29.4 ± 0.6 ms | |
Training Time: 5039 ms | |
Training Time: 105 ms | |
Training Time: 101 ms | |
Training Time: 108 ms | |
Training Time: 100 ms | |
Training Time: 121 ms | |
Training Time: 108 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
Training Time: 118 ms | |
Training Time: 105 ms | |
Training Time: 107 ms | |
Training Time: 100 ms | |
Training Time: 97 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
Training Time: 102 ms | |
Training Time: 99 ms | |
Training Time: 102 ms | |
Training Time: 100 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
3.2 - training | batch=10, size=346x346: 103 ± 6 ms | |
4/19. Inception-ResNet-V2 | |
Inference Time: 1099 ms | |
Inference Time: 40 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 40 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
4.1 - inference | batch=10, size=346x346: 37.7 ± 0.9 ms | |
Training Time: 5609 ms | |
Training Time: 112 ms | |
Training Time: 127 ms | |
Training Time: 173 ms | |
Training Time: 112 ms | |
Training Time: 172 ms | |
Training Time: 175 ms | |
Training Time: 177 ms | |
Training Time: 156 ms | |
Training Time: 111 ms | |
Training Time: 113 ms | |
Training Time: 155 ms | |
Training Time: 176 ms | |
Training Time: 171 ms | |
Training Time: 111 ms | |
Training Time: 112 ms | |
Training Time: 163 ms | |
Training Time: 173 ms | |
Training Time: 112 ms | |
Training Time: 114 ms | |
Training Time: 174 ms | |
Training Time: 112 ms | |
4.2 - training | batch=8, size=346x346: 143 ± 29 ms | |
5/19. ResNet-V2-50 | |
Inference Time: 400 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 19 ms | |
Inference Time: 20 ms | |
Inference Time: 20 ms | |
Inference Time: 19 ms | |
Inference Time: 20 ms | |
Inference Time: 19 ms | |
Inference Time: 21 ms | |
Inference Time: 19 ms | |
Inference Time: 20 ms | |
Inference Time: 21 ms | |
Inference Time: 20 ms | |
5.1 - inference | batch=10, size=346x346: 20.0 ± 0.6 ms | |
Training Time: 3081 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 67 ms | |
Training Time: 58 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 60 ms | |
Training Time: 58 ms | |
Training Time: 59 ms | |
Training Time: 58 ms | |
Training Time: 59 ms | |
Training Time: 60 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 58 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 59 ms | |
Training Time: 60 ms | |
5.2 - training | batch=10, size=346x346: 59.3 ± 1.8 ms | |
6/19. ResNet-V2-152 | |
Inference Time: 563 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 27 ms | |
Inference Time: 25 ms | |
Inference Time: 27 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 25 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 25 ms | |
Inference Time: 27 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 25 ms | |
Inference Time: 26 ms | |
Inference Time: 26 ms | |
Inference Time: 25 ms | |
6.1 - inference | batch=10, size=256x256: 25.9 ± 0.6 ms | |
Training Time: 3713 ms | |
Training Time: 88 ms | |
Training Time: 89 ms | |
Training Time: 116 ms | |
Training Time: 89 ms | |
Training Time: 114 ms | |
Training Time: 88 ms | |
Training Time: 89 ms | |
Training Time: 119 ms | |
Training Time: 93 ms | |
Training Time: 93 ms | |
Training Time: 116 ms | |
Training Time: 88 ms | |
Training Time: 90 ms | |
Training Time: 92 ms | |
Training Time: 89 ms | |
Training Time: 112 ms | |
Training Time: 89 ms | |
Training Time: 92 ms | |
Training Time: 88 ms | |
Training Time: 88 ms | |
Training Time: 88 ms | |
6.2 - training | batch=10, size=256x256: 95.7 ± 11.2 ms | |
7/19. VGG-16 | |
Inference Time: 376 ms | |
Inference Time: 41 ms | |
Inference Time: 40 ms | |
Inference Time: 40 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 40 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 40 ms | |
Inference Time: 40 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 40 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 42 ms | |
7.1 - inference | batch=20, size=224x224: 41.0 ± 0.7 ms | |
Training Time: 1495 ms | |
Training Time: 55 ms | |
Training Time: 54 ms | |
Training Time: 55 ms | |
Training Time: 55 ms | |
Training Time: 58 ms | |
Training Time: 58 ms | |
Training Time: 56 ms | |
Training Time: 55 ms | |
Training Time: 55 ms | |
Training Time: 55 ms | |
Training Time: 56 ms | |
Training Time: 54 ms | |
Training Time: 52 ms | |
Training Time: 53 ms | |
Training Time: 53 ms | |
Training Time: 55 ms | |
Training Time: 54 ms | |
Training Time: 52 ms | |
Training Time: 52 ms | |
Training Time: 53 ms | |
Training Time: 53 ms | |
7.2 - training | batch=2, size=224x224: 54.4 ± 1.7 ms | |
8/19. SRCNN 9-5-5 | |
Inference Time: 1446 ms | |
Inference Time: 42 ms | |
Inference Time: 39 ms | |
Inference Time: 42 ms | |
Inference Time: 37 ms | |
Inference Time: 44 ms | |
Inference Time: 34 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 58 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 54 ms | |
Inference Time: 54 ms | |
Inference Time: 56 ms | |
Inference Time: 55 ms | |
Inference Time: 62 ms | |
Inference Time: 58 ms | |
8.1 - inference | batch=10, size=512x512: 47.2 ± 9.7 ms | |
Inference Time: 1760 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 33 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 35 ms | |
Inference Time: 34 ms | |
Inference Time: 33 ms | |
Inference Time: 35 ms | |
Inference Time: 34 ms | |
Inference Time: 33 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 34 ms | |
Inference Time: 34 ms | |
Inference Time: 33 ms | |
8.2 - inference | batch=1, size=1536x1536: 34.1 ± 0.7 ms | |
Training Time: 6951 ms | |
Training Time: 115 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
Training Time: 108 ms | |
Training Time: 119 ms | |
Training Time: 111 ms | |
Training Time: 114 ms | |
Training Time: 110 ms | |
Training Time: 112 ms | |
Training Time: 113 ms | |
Training Time: 111 ms | |
Training Time: 109 ms | |
Training Time: 111 ms | |
Training Time: 107 ms | |
Training Time: 110 ms | |
Training Time: 108 ms | |
Training Time: 111 ms | |
Training Time: 109 ms | |
Training Time: 111 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
8.3 - training | batch=10, size=512x512: 111 ± 3 ms | |
9/19. VGG-19 Super-Res | |
Inference Time: 242 ms | |
Inference Time: 55 ms | |
Inference Time: 56 ms | |
Inference Time: 56 ms | |
Inference Time: 55 ms | |
Inference Time: 56 ms | |
Inference Time: 56 ms | |
Inference Time: 54 ms | |
Inference Time: 55 ms | |
Inference Time: 56 ms | |
Inference Time: 54 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 56 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 57 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
9.1 - inference | batch=10, size=256x256: 54.1 ± 1.7 ms | |
Inference Time: 329 ms | |
Inference Time: 88 ms | |
Inference Time: 88 ms | |
Inference Time: 89 ms | |
Inference Time: 91 ms | |
Inference Time: 90 ms | |
Inference Time: 90 ms | |
Inference Time: 89 ms | |
Inference Time: 88 ms | |
Inference Time: 88 ms | |
Inference Time: 89 ms | |
Inference Time: 88 ms | |
Inference Time: 88 ms | |
Inference Time: 88 ms | |
Inference Time: 90 ms | |
Inference Time: 97 ms | |
Inference Time: 93 ms | |
Inference Time: 91 ms | |
Inference Time: 103 ms | |
Inference Time: 94 ms | |
Inference Time: 92 ms | |
Inference Time: 91 ms | |
9.2 - inference | batch=1, size=1024x1024: 90.7 ± 3.6 ms | |
Training Time: 1143 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 114 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
Training Time: 113 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
Training Time: 111 ms | |
Training Time: 112 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 112 ms | |
Training Time: 113 ms | |
Training Time: 119 ms | |
Training Time: 110 ms | |
Training Time: 111 ms | |
Training Time: 112 ms | |
Training Time: 110 ms | |
Training Time: 109 ms | |
9.3 - training | batch=10, size=224x224: 111 ± 2 ms | |
10/19. ResNet-SRGAN | |
Inference Time: 784 ms | |
Inference Time: 78 ms | |
Inference Time: 76 ms | |
Inference Time: 83 ms | |
Inference Time: 80 ms | |
Inference Time: 74 ms | |
Inference Time: 71 ms | |
Inference Time: 74 ms | |
Inference Time: 66 ms | |
Inference Time: 80 ms | |
Inference Time: 98 ms | |
Inference Time: 107 ms | |
Inference Time: 96 ms | |
Inference Time: 76 ms | |
Inference Time: 69 ms | |
Inference Time: 77 ms | |
Inference Time: 76 ms | |
Inference Time: 78 ms | |
Inference Time: 72 ms | |
Inference Time: 76 ms | |
Inference Time: 69 ms | |
Inference Time: 75 ms | |
10.1 - inference | batch=10, size=512x512: 78.6 ± 9.9 ms | |
Inference Time: 515 ms | |
Inference Time: 57 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
Inference Time: 56 ms | |
Inference Time: 55 ms | |
Inference Time: 55 ms | |
Inference Time: 55 ms | |
Inference Time: 58 ms | |
Inference Time: 53 ms | |
Inference Time: 55 ms | |
Inference Time: 57 ms | |
Inference Time: 55 ms | |
Inference Time: 56 ms | |
Inference Time: 57 ms | |
Inference Time: 54 ms | |
Inference Time: 58 ms | |
Inference Time: 53 ms | |
Inference Time: 54 ms | |
Inference Time: 56 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
10.2 - inference | batch=1, size=1536x1536: 55.9 ± 1.6 ms | |
Training Time: 2012 ms | |
Training Time: 85 ms | |
Training Time: 89 ms | |
Training Time: 87 ms | |
Training Time: 94 ms | |
Training Time: 88 ms | |
Training Time: 85 ms | |
Training Time: 88 ms | |
Training Time: 89 ms | |
Training Time: 84 ms | |
Training Time: 88 ms | |
Training Time: 91 ms | |
Training Time: 91 ms | |
Training Time: 92 ms | |
Training Time: 89 ms | |
Training Time: 91 ms | |
Training Time: 89 ms | |
Training Time: 91 ms | |
Training Time: 91 ms | |
Training Time: 90 ms | |
Training Time: 121 ms | |
Training Time: 88 ms | |
10.3 - training | batch=5, size=512x512: 90.5 ± 7.2 ms | |
11/19. ResNet-DPED | |
Inference Time: 615 ms | |
Inference Time: 67 ms | |
Inference Time: 70 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 69 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 70 ms | |
Inference Time: 73 ms | |
Inference Time: 71 ms | |
Inference Time: 68 ms | |
Inference Time: 69 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 67 ms | |
Inference Time: 68 ms | |
Inference Time: 68 ms | |
Inference Time: 71 ms | |
11.1 - inference | batch=10, size=256x256: 68.7 ± 1.5 ms | |
Inference Time: 1038 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 108 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 110 ms | |
Inference Time: 112 ms | |
Inference Time: 110 ms | |
Inference Time: 106 ms | |
Inference Time: 108 ms | |
Inference Time: 108 ms | |
Inference Time: 106 ms | |
Inference Time: 107 ms | |
Inference Time: 109 ms | |
Inference Time: 108 ms | |
11.2 - inference | batch=1, size=1024x1024: 108 ± 1 ms | |
Training Time: 1478 ms | |
Training Time: 98 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
Training Time: 99 ms | |
Training Time: 100 ms | |
Training Time: 100 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
Training Time: 99 ms | |
Training Time: 99 ms | |
Training Time: 98 ms | |
Training Time: 100 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
Training Time: 99 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
Training Time: 98 ms | |
11.3 - training | batch=15, size=128x128: 98.6 ± 0.7 ms | |
12/19. U-Net | |
Inference Time: 2703 ms | |
Inference Time: 110 ms | |
Inference Time: 116 ms | |
Inference Time: 111 ms | |
Inference Time: 113 ms | |
Inference Time: 113 ms | |
Inference Time: 113 ms | |
Inference Time: 110 ms | |
Inference Time: 111 ms | |
Inference Time: 115 ms | |
Inference Time: 111 ms | |
Inference Time: 110 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 112 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
Inference Time: 111 ms | |
12.1 - inference | batch=4, size=512x512: 112 ± 2 ms | |
Inference Time: 2678 ms | |
Inference Time: 117 ms | |
Inference Time: 113 ms | |
Inference Time: 115 ms | |
Inference Time: 114 ms | |
Inference Time: 114 ms | |
Inference Time: 110 ms | |
Inference Time: 110 ms | |
Inference Time: 111 ms | |
Inference Time: 110 ms | |
Inference Time: 110 ms | |
Inference Time: 109 ms | |
Inference Time: 110 ms | |
Inference Time: 114 ms | |
Inference Time: 114 ms | |
Inference Time: 115 ms | |
Inference Time: 115 ms | |
Inference Time: 122 ms | |
Inference Time: 117 ms | |
Inference Time: 131 ms | |
Inference Time: 117 ms | |
Inference Time: 117 ms | |
12.2 - inference | batch=1, size=1024x1024: 115 ± 5 ms | |
Training Time: 4174 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 125 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 119 ms | |
Training Time: 124 ms | |
Training Time: 116 ms | |
Training Time: 116 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 116 ms | |
Training Time: 117 ms | |
Training Time: 118 ms | |
Training Time: 116 ms | |
12.3 - training | batch=4, size=256x256: 117 ± 2 ms | |
13/19. Nvidia-SPADE | |
Inference Time: 1424 ms | |
Inference Time: 44 ms | |
Inference Time: 43 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 46 ms | |
Inference Time: 46 ms | |
Inference Time: 43 ms | |
Inference Time: 46 ms | |
Inference Time: 46 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 46 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
13.1 - inference | batch=5, size=128x128: 44.8 ± 0.9 ms | |
Training Time: 4303 ms | |
Training Time: 75 ms | |
Training Time: 68 ms | |
Training Time: 71 ms | |
Training Time: 75 ms | |
Training Time: 76 ms | |
Training Time: 68 ms | |
Training Time: 71 ms | |
Training Time: 69 ms | |
Training Time: 70 ms | |
Training Time: 69 ms | |
Training Time: 75 ms | |
Training Time: 69 ms | |
Training Time: 81 ms | |
Training Time: 82 ms | |
Training Time: 71 ms | |
Training Time: 69 ms | |
Training Time: 69 ms | |
Training Time: 77 ms | |
Training Time: 77 ms | |
Training Time: 70 ms | |
Training Time: 77 ms | |
13.2 - training | batch=1, size=128x128: 72.8 ± 4.2 ms | |
14/19. ICNet | |
Inference Time: 616 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 83 ms | |
Inference Time: 84 ms | |
Inference Time: 82 ms | |
Inference Time: 83 ms | |
Inference Time: 82 ms | |
Inference Time: 105 ms | |
Inference Time: 102 ms | |
Inference Time: 104 ms | |
Inference Time: 105 ms | |
Inference Time: 107 ms | |
Inference Time: 105 ms | |
Inference Time: 111 ms | |
Inference Time: 108 ms | |
Inference Time: 107 ms | |
Inference Time: 107 ms | |
Inference Time: 108 ms | |
Inference Time: 110 ms | |
14.1 - inference | batch=5, size=1024x1536: 96.1 ± 12.2 ms | |
Training Time: 2903 ms | |
Training Time: 311 ms | |
Training Time: 306 ms | |
Training Time: 317 ms | |
Training Time: 311 ms | |
Training Time: 344 ms | |
Training Time: 397 ms | |
Training Time: 341 ms | |
Training Time: 365 ms | |
Training Time: 365 ms | |
Training Time: 352 ms | |
Training Time: 306 ms | |
Training Time: 308 ms | |
Training Time: 306 ms | |
Training Time: 311 ms | |
Training Time: 306 ms | |
Training Time: 302 ms | |
Training Time: 320 ms | |
Training Time: 310 ms | |
Training Time: 302 ms | |
Training Time: 326 ms | |
Training Time: 351 ms | |
14.2 - training | batch=10, size=1024x1536: 327 ± 26 ms | |
15/19. PSPNet | |
Inference Time: 3414 ms | |
Inference Time: 176 ms | |
Inference Time: 173 ms | |
Inference Time: 173 ms | |
Inference Time: 172 ms | |
Inference Time: 172 ms | |
Inference Time: 173 ms | |
Inference Time: 174 ms | |
Inference Time: 173 ms | |
Inference Time: 174 ms | |
Inference Time: 173 ms | |
Inference Time: 176 ms | |
Inference Time: 178 ms | |
Inference Time: 174 ms | |
Inference Time: 177 ms | |
Inference Time: 179 ms | |
Inference Time: 176 ms | |
Inference Time: 174 ms | |
Inference Time: 175 ms | |
Inference Time: 173 ms | |
Inference Time: 176 ms | |
Inference Time: 179 ms | |
15.1 - inference | batch=5, size=720x720: 175 ± 2 ms | |
Training Time: 3758 ms | |
Training Time: 72 ms | |
Training Time: 72 ms | |
Training Time: 72 ms | |
Training Time: 77 ms | |
Training Time: 73 ms | |
Training Time: 72 ms | |
Training Time: 72 ms | |
Training Time: 73 ms | |
Training Time: 72 ms | |
Training Time: 72 ms | |
Training Time: 72 ms | |
Training Time: 73 ms | |
Training Time: 74 ms | |
Training Time: 72 ms | |
Training Time: 74 ms | |
Training Time: 76 ms | |
Training Time: 75 ms | |
Training Time: 72 ms | |
Training Time: 75 ms | |
Training Time: 74 ms | |
Training Time: 74 ms | |
15.2 - training | batch=1, size=512x512: 73.2 ± 1.5 ms | |
16/19. DeepLab | |
Inference Time: 1116 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 52 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
16.1 - inference | batch=2, size=512x512: 50.2 ± 0.7 ms | |
Training Time: 3288 ms | |
Training Time: 58 ms | |
Training Time: 88 ms | |
Training Time: 68 ms | |
Training Time: 68 ms | |
Training Time: 88 ms | |
Training Time: 59 ms | |
Training Time: 67 ms | |
Training Time: 61 ms | |
Training Time: 60 ms | |
Training Time: 59 ms | |
Training Time: 68 ms | |
Training Time: 70 ms | |
Training Time: 60 ms | |
Training Time: 58 ms | |
Training Time: 77 ms | |
Training Time: 69 ms | |
Training Time: 60 ms | |
Training Time: 59 ms | |
Training Time: 86 ms | |
Training Time: 67 ms | |
Training Time: 66 ms | |
16.2 - training | batch=1, size=384x384: 67.4 ± 9.5 ms | |
17/19. Pixel-RNN | |
Inference Time: 3993 ms | |
Inference Time: 1675 ms | |
Inference Time: 1709 ms | |
Inference Time: 1756 ms | |
Inference Time: 1630 ms | |
Inference Time: 1782 ms | |
Inference Time: 1662 ms | |
Inference Time: 1665 ms | |
Inference Time: 1793 ms | |
Inference Time: 1620 ms | |
Inference Time: 1717 ms | |
Inference Time: 1617 ms | |
Inference Time: 1742 ms | |
17.1 - inference | batch=50, size=64x64: 1697 ± 59 ms | |
Training Time: 20502 ms | |
Training Time: 7568 ms | |
Training Time: 8171 ms | |
Training Time: 6588 ms | |
Training Time: 7508 ms | |
17.2 - training | batch=10, size=64x64: 7459 ± 566 ms | |
18/19. LSTM-Sentiment | |
Inference Time: 1029 ms | |
Inference Time: 816 ms | |
Inference Time: 794 ms | |
Inference Time: 801 ms | |
Inference Time: 770 ms | |
Inference Time: 773 ms | |
Inference Time: 784 ms | |
Inference Time: 837 ms | |
Inference Time: 835 ms | |
Inference Time: 822 ms | |
Inference Time: 825 ms | |
Inference Time: 769 ms | |
Inference Time: 772 ms | |
Inference Time: 779 ms | |
Inference Time: 791 ms | |
Inference Time: 758 ms | |
Inference Time: 738 ms | |
Inference Time: 759 ms | |
Inference Time: 757 ms | |
Inference Time: 822 ms | |
Inference Time: 780 ms | |
Inference Time: 768 ms | |
18.1 - inference | batch=100, size=1024x300: 788 ± 28 ms | |
Training Time: 2808 ms | |
Training Time: 2808 ms | |
Training Time: 3105 ms | |
Training Time: 2882 ms | |
Training Time: 3195 ms | |
Training Time: 2605 ms | |
Training Time: 2594 ms | |
Training Time: 3228 ms | |
Training Time: 3112 ms | |
Training Time: 2561 ms | |
Training Time: 3042 ms | |
18.2 - training | batch=10, size=1024x300: 2913 ± 246 ms | |
19/19. GNMT-Translation | |
Inference Time: 502 ms | |
Inference Time: 315 ms | |
Inference Time: 316 ms | |
Inference Time: 310 ms | |
Inference Time: 290 ms | |
Inference Time: 314 ms | |
Inference Time: 311 ms | |
Inference Time: 283 ms | |
Inference Time: 292 ms | |
Inference Time: 287 ms | |
Inference Time: 287 ms | |
Inference Time: 284 ms | |
Inference Time: 314 ms | |
Inference Time: 289 ms | |
Inference Time: 288 ms | |
Inference Time: 309 ms | |
Inference Time: 294 ms | |
Inference Time: 314 ms | |
Inference Time: 280 ms | |
Inference Time: 309 ms | |
Inference Time: 297 ms | |
Inference Time: 281 ms | |
19.1 - inference | batch=1, size=1x20: 298 ± 13 ms | |
Device Inference Score: 18566 | |
Device Training Score: 20064 | |
Device AI Score: 38630 | |
For more information and results, please visit http://ai-benchmark.com/alpha |
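The per-test summary lines in the log above (for example `1.1 - inference | batch=50, size=224x224: 28.1 ± 6.0 ms`) can be checked against the individual timings. The sketch below is not taken from the ai-benchmark source. It assumes that the first run of each test is a warm-up and is excluded, and that the reported spread is a population standard deviation. Both assumptions are consistent with the numbers for test 1.1, but they are inferred from the log only. The names `LOG`, `TIME_RE`, and `summarize` are hypothetical helpers introduced for illustration.

```python
import re
import statistics

# Timings for test 1.1 (MobileNet-V2 inference), copied from the log above.
# The first value (1956 ms) is much larger than the rest and is treated
# here as a warm-up run (an assumption, not confirmed by the benchmark).
LOG = """\
Inference Time: 1956 ms
Inference Time: 24 ms
Inference Time: 23 ms
Inference Time: 22 ms
Inference Time: 23 ms
Inference Time: 22 ms
Inference Time: 22 ms
Inference Time: 23 ms
Inference Time: 22 ms
Inference Time: 21 ms
Inference Time: 23 ms
Inference Time: 23 ms
Inference Time: 32 ms
Inference Time: 32 ms
Inference Time: 33 ms
Inference Time: 33 ms
Inference Time: 35 ms
Inference Time: 35 ms
Inference Time: 36 ms
Inference Time: 34 ms
Inference Time: 37 ms
Inference Time: 36 ms
"""

# Matches lines such as "Inference Time: 24 ms" or "Training Time: 78 ms".
# Anything after "ms" on the line is ignored.
TIME_RE = re.compile(r"^(?:Inference|Training) Time: (\d+) ms")


def summarize(log_text):
    """Return (mean, population std) of the timings, skipping the first run."""
    times = []
    for line in log_text.splitlines():
        match = TIME_RE.match(line.strip())
        if match:
            times.append(int(match.group(1)))
    steady = times[1:]  # assumption: drop the warm-up run
    return statistics.fmean(steady), statistics.pstdev(steady)


mean, std = summarize(LOG)
print(f"{mean:.1f} ± {std:.1f} ms")  # prints "28.1 ± 6.0 ms"
```

Using the sample standard deviation (`statistics.stdev`) instead gives about 6.2 ms for the same data, which does not match the reported 6.0 ms. That is why the population form is assumed here.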
> python3 benchmark.py | |
2023-09-04 15:40:05.360036: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. | |
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
/usr/local/lib/python3.9/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.25.2 | |
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}" | |
2023-09-04 15:40:06.815801: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.828821: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.828887: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
>> AI-Benchmark - 0.1.3.cm | |
INFO:ai_benchmark:>> AI-Benchmark - 0.1.3.cm | |
>> Let the AI Games begin | |
INFO:ai_benchmark:>> Let the AI Games begin | |
2023-09-04 15:40:06.981612: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.981719: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.981765: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.981918: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.981974: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.982028: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:06.982056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:40:07.179161: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179269: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179308: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179366: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179409: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:40:07.179507: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179563: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179599: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179645: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179684: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:07.179704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
* TF Version: 2.15.0 | |
INFO:ai_benchmark:* TF Version: 2.15.0 | |
* Platform: Linux-5.15.0-82-generic-x86_64-with-glibc2.17 | |
INFO:ai_benchmark:* Platform: Linux-5.15.0-82-generic-x86_64-with-glibc2.17 | |
* CPU: AMD Ryzen 9 3900X 12-Core Processor | |
INFO:ai_benchmark:* CPU: AMD Ryzen 9 3900X 12-Core Processor | |
* CPU RAM: 31 GB | |
INFO:ai_benchmark:* CPU RAM: 31 GB | |
* GPU/0: Radeon RX 7900 XTX | |
INFO:ai_benchmark:* GPU/0: Radeon RX 7900 XTX | |
* GPU RAM: 23.5 GB | |
INFO:ai_benchmark:* GPU RAM: 23.5 GB | |
* CUDA Version: N/A | |
INFO:ai_benchmark:* CUDA Version: N/A | |
* CUDA Build: N/A | |
INFO:ai_benchmark:* CUDA Build: N/A | |
The benchmark is running... | |
WARNING:ai_benchmark:The benchmark is running... | |
The tests might take up to 20 minutes | |
WARNING:ai_benchmark:The tests might take up to 20 minutes | |
Please don't interrupt the script | |
WARNING:ai_benchmark:Please don't interrupt the script | |
1/19. MobileNet-V2 | |
INFO:ai_benchmark: | |
1/19. MobileNet-V2 | |
2023-09-04 15:40:08.470243: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:08.470356: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:08.470392: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:08.470456: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:08.470497: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:08.470520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:40:08.736260: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:386] MLIR V1 optimization pass is not enabled | |
2023-09-04 15:40:08.959779: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:09.542531: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 229 ms | |
DEBUG:ai_benchmark:Inference Time: 229 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
1.1 - inference | batch=50, size=224x224: 32.9 ± 0.5 ms | |
INFO:ai_benchmark:1.1 - inference | batch=50, size=224x224: 32.9 ± 0.5 ms | |
2023-09-04 15:40:15.501624: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:16.260700: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 961 ms | |
DEBUG:ai_benchmark:Training Time: 961 ms | |
Training Time: 372 ms | |
DEBUG:ai_benchmark:Training Time: 372 ms | |
Training Time: 370 ms | |
DEBUG:ai_benchmark:Training Time: 370 ms | |
Training Time: 367 ms | |
DEBUG:ai_benchmark:Training Time: 367 ms | |
Training Time: 369 ms | |
DEBUG:ai_benchmark:Training Time: 369 ms | |
Training Time: 364 ms | |
DEBUG:ai_benchmark:Training Time: 364 ms | |
Training Time: 365 ms | |
DEBUG:ai_benchmark:Training Time: 365 ms | |
Training Time: 366 ms | |
DEBUG:ai_benchmark:Training Time: 366 ms | |
Training Time: 362 ms | |
DEBUG:ai_benchmark:Training Time: 362 ms | |
Training Time: 362 ms | |
DEBUG:ai_benchmark:Training Time: 362 ms | |
Training Time: 365 ms | |
DEBUG:ai_benchmark:Training Time: 365 ms | |
Training Time: 362 ms | |
DEBUG:ai_benchmark:Training Time: 362 ms | |
Training Time: 366 ms | |
DEBUG:ai_benchmark:Training Time: 366 ms | |
Training Time: 363 ms | |
DEBUG:ai_benchmark:Training Time: 363 ms | |
Training Time: 366 ms | |
DEBUG:ai_benchmark:Training Time: 366 ms | |
Training Time: 361 ms | |
DEBUG:ai_benchmark:Training Time: 361 ms | |
Training Time: 363 ms | |
DEBUG:ai_benchmark:Training Time: 363 ms | |
Training Time: 361 ms | |
DEBUG:ai_benchmark:Training Time: 361 ms | |
Training Time: 363 ms | |
DEBUG:ai_benchmark:Training Time: 363 ms | |
Training Time: 361 ms | |
DEBUG:ai_benchmark:Training Time: 361 ms | |
Training Time: 363 ms | |
DEBUG:ai_benchmark:Training Time: 363 ms | |
Training Time: 363 ms | |
DEBUG:ai_benchmark:Training Time: 363 ms | |
1.2 - training | batch=50, size=224x224: 364 ± 3 ms | |
INFO:ai_benchmark:1.2 - training | batch=50, size=224x224: 364 ± 3 ms | |
2/19. Inception-V3 | |
INFO:ai_benchmark: | |
2/19. Inception-V3 | |
2023-09-04 15:40:28.098164: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:28.098279: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:28.098317: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:28.098380: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:28.098448: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:28.098477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:40:28.884284: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:29.284076: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 323 ms | |
DEBUG:ai_benchmark:Inference Time: 323 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 33 ms | |
DEBUG:ai_benchmark:Inference Time: 33 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
2.1 - inference | batch=20, size=346x346: 33.2 ± 1.4 ms | |
INFO:ai_benchmark:2.1 - inference | batch=20, size=346x346: 33.2 ± 1.4 ms | |
2023-09-04 15:40:33.857562: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:35.007305: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1413 ms | |
DEBUG:ai_benchmark:Training Time: 1413 ms | |
Training Time: 407 ms | |
DEBUG:ai_benchmark:Training Time: 407 ms | |
Training Time: 404 ms | |
DEBUG:ai_benchmark:Training Time: 404 ms | |
Training Time: 401 ms | |
DEBUG:ai_benchmark:Training Time: 401 ms | |
Training Time: 400 ms | |
DEBUG:ai_benchmark:Training Time: 400 ms | |
Training Time: 410 ms | |
DEBUG:ai_benchmark:Training Time: 410 ms | |
Training Time: 405 ms | |
DEBUG:ai_benchmark:Training Time: 405 ms | |
Training Time: 402 ms | |
DEBUG:ai_benchmark:Training Time: 402 ms | |
Training Time: 400 ms | |
DEBUG:ai_benchmark:Training Time: 400 ms | |
Training Time: 405 ms | |
DEBUG:ai_benchmark:Training Time: 405 ms | |
Training Time: 407 ms | |
DEBUG:ai_benchmark:Training Time: 407 ms | |
Training Time: 398 ms | |
DEBUG:ai_benchmark:Training Time: 398 ms | |
Training Time: 398 ms | |
DEBUG:ai_benchmark:Training Time: 398 ms | |
Training Time: 403 ms | |
DEBUG:ai_benchmark:Training Time: 403 ms | |
Training Time: 403 ms | |
DEBUG:ai_benchmark:Training Time: 403 ms | |
Training Time: 402 ms | |
DEBUG:ai_benchmark:Training Time: 402 ms | |
Training Time: 400 ms | |
DEBUG:ai_benchmark:Training Time: 400 ms | |
Training Time: 398 ms | |
DEBUG:ai_benchmark:Training Time: 398 ms | |
Training Time: 403 ms | |
DEBUG:ai_benchmark:Training Time: 403 ms | |
Training Time: 407 ms | |
DEBUG:ai_benchmark:Training Time: 407 ms | |
Training Time: 402 ms | |
DEBUG:ai_benchmark:Training Time: 402 ms | |
Training Time: 401 ms | |
DEBUG:ai_benchmark:Training Time: 401 ms | |
2.2 - training | batch=20, size=346x346: 403 ± 3 ms | |
INFO:ai_benchmark:2.2 - training | batch=20, size=346x346: 403 ± 3 ms | |
3/19. Inception-V4 | |
INFO:ai_benchmark: | |
3/19. Inception-V4 | |
2023-09-04 15:40:45.852720: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:45.852835: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:45.852874: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:45.852936: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:45.852979: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:40:45.853003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:40:47.015716: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:47.577671: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 481 ms | |
DEBUG:ai_benchmark:Inference Time: 481 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
3.1 - inference | batch=10, size=346x346: 36.9 ± 1.9 ms | |
INFO:ai_benchmark:3.1 - inference | batch=10, size=346x346: 36.9 ± 1.9 ms | |
2023-09-04 15:40:52.700491: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:40:54.700566: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2168 ms | |
DEBUG:ai_benchmark:Training Time: 2168 ms | |
Training Time: 315 ms | |
DEBUG:ai_benchmark:Training Time: 315 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
Training Time: 318 ms | |
DEBUG:ai_benchmark:Training Time: 318 ms | |
Training Time: 318 ms | |
DEBUG:ai_benchmark:Training Time: 318 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
Training Time: 318 ms | |
DEBUG:ai_benchmark:Training Time: 318 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 318 ms | |
DEBUG:ai_benchmark:Training Time: 318 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 318 ms | |
DEBUG:ai_benchmark:Training Time: 318 ms | |
Training Time: 319 ms | |
DEBUG:ai_benchmark:Training Time: 319 ms | |
Training Time: 320 ms | |
DEBUG:ai_benchmark:Training Time: 320 ms | |
3.2 - training | batch=10, size=346x346: 319 ± 1 ms | |
INFO:ai_benchmark:3.2 - training | batch=10, size=346x346: 319 ± 1 ms | |
4/19. Inception-ResNet-V2 | |
INFO:ai_benchmark: | |
4/19. Inception-ResNet-V2 | |
2023-09-04 15:41:02.819883: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:02.820008: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:02.820046: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:02.820112: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:02.820155: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:02.820179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:41:04.929646: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:05.882172: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 983 ms | |
DEBUG:ai_benchmark:Inference Time: 983 ms | |
Inference Time: 47 ms | |
DEBUG:ai_benchmark:Inference Time: 47 ms | |
Inference Time: 51 ms | |
DEBUG:ai_benchmark:Inference Time: 51 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 46 ms | |
DEBUG:ai_benchmark:Inference Time: 46 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 43 ms | |
DEBUG:ai_benchmark:Inference Time: 43 ms | |
Inference Time: 46 ms | |
DEBUG:ai_benchmark:Inference Time: 46 ms | |
Inference Time: 43 ms | |
DEBUG:ai_benchmark:Inference Time: 43 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 46 ms | |
DEBUG:ai_benchmark:Inference Time: 46 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
4.1 - inference | batch=10, size=346x346: 45.0 ± 1.6 ms | |
INFO:ai_benchmark:4.1 - inference | batch=10, size=346x346: 45.0 ± 1.6 ms | |
2023-09-04 15:41:13.756770: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:17.684396: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 4046 ms | |
DEBUG:ai_benchmark:Training Time: 4046 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 284 ms | |
DEBUG:ai_benchmark:Training Time: 284 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 278 ms | |
DEBUG:ai_benchmark:Training Time: 278 ms | |
Training Time: 275 ms | |
DEBUG:ai_benchmark:Training Time: 275 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 276 ms | |
DEBUG:ai_benchmark:Training Time: 276 ms | |
Training Time: 275 ms | |
DEBUG:ai_benchmark:Training Time: 275 ms | |
Training Time: 275 ms | |
DEBUG:ai_benchmark:Training Time: 275 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 276 ms | |
DEBUG:ai_benchmark:Training Time: 276 ms | |
Training Time: 274 ms | |
DEBUG:ai_benchmark:Training Time: 274 ms | |
Training Time: 274 ms | |
DEBUG:ai_benchmark:Training Time: 274 ms | |
Training Time: 274 ms | |
DEBUG:ai_benchmark:Training Time: 274 ms | |
Training Time: 276 ms | |
DEBUG:ai_benchmark:Training Time: 276 ms | |
Training Time: 275 ms | |
DEBUG:ai_benchmark:Training Time: 275 ms | |
Training Time: 277 ms | |
DEBUG:ai_benchmark:Training Time: 277 ms | |
Training Time: 275 ms | |
DEBUG:ai_benchmark:Training Time: 275 ms | |
Training Time: 276 ms | |
DEBUG:ai_benchmark:Training Time: 276 ms | |
Training Time: 278 ms | |
DEBUG:ai_benchmark:Training Time: 278 ms | |
4.2 - training | batch=8, size=346x346: 276 ± 2 ms | |
INFO:ai_benchmark:4.2 - training | batch=8, size=346x346: 276 ± 2 ms | |
5/19. ResNet-V2-50 | |
INFO:ai_benchmark: | |
5/19. ResNet-V2-50 | |
2023-09-04 15:41:24.771442: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:24.771562: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:24.771602: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:24.771667: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:24.771711: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:24.771737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:41:25.351442: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:25.639592: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 259 ms | |
DEBUG:ai_benchmark:Inference Time: 259 ms | |
Inference Time: 30 ms | |
DEBUG:ai_benchmark:Inference Time: 30 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
Inference Time: 23 ms | |
DEBUG:ai_benchmark:Inference Time: 23 ms | |
Inference Time: 24 ms | |
DEBUG:ai_benchmark:Inference Time: 24 ms | |
5.1 - inference | batch=10, size=346x346: 24.1 ± 1.5 ms | |
INFO:ai_benchmark:5.1 - inference | batch=10, size=346x346: 24.1 ± 1.5 ms | |
2023-09-04 15:41:28.815530: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:29.606853: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 815 ms | |
DEBUG:ai_benchmark:Training Time: 815 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 88 ms | |
DEBUG:ai_benchmark:Training Time: 88 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 88 ms | |
DEBUG:ai_benchmark:Training Time: 88 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 88 ms | |
DEBUG:ai_benchmark:Training Time: 88 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 88 ms | |
DEBUG:ai_benchmark:Training Time: 88 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
5.2 - training | batch=10, size=346x346: 89.2 ± 0.8 ms | |
INFO:ai_benchmark:5.2 - training | batch=10, size=346x346: 89.2 ± 0.8 ms | |
6/19. ResNet-V2-152 | |
INFO:ai_benchmark: | |
6/19. ResNet-V2-152 | |
2023-09-04 15:41:32.605342: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:32.606920: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:32.607007: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:32.607080: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:32.607125: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:32.607151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:41:34.775272: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:35.540567: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 674 ms | |
DEBUG:ai_benchmark:Inference Time: 674 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
Inference Time: 35 ms | |
DEBUG:ai_benchmark:Inference Time: 35 ms | |
Inference Time: 34 ms | |
DEBUG:ai_benchmark:Inference Time: 34 ms | |
6.1 - inference | batch=10, size=256x256: 34.5 ± 0.6 ms | |
INFO:ai_benchmark:6.1 - inference | batch=10, size=256x256: 34.5 ± 0.6 ms | |
2023-09-04 15:41:42.303121: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:45.024722: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2648 ms | |
DEBUG:ai_benchmark:Training Time: 2648 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
Training Time: 134 ms | |
DEBUG:ai_benchmark:Training Time: 134 ms | |
6.2 - training | batch=10, size=256x256: 133.6 ± 0.5 ms | |
INFO:ai_benchmark:6.2 - training | batch=10, size=256x256: 133.6 ± 0.5 ms | |
7/19. VGG-16 | |
INFO:ai_benchmark: | |
7/19. VGG-16 | |
2023-09-04 15:41:48.974847: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:48.974977: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:48.975018: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:48.975083: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:48.975129: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:48.975155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:41:49.079823: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:49.205783: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 118 ms | |
DEBUG:ai_benchmark:Inference Time: 118 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 51 ms | |
DEBUG:ai_benchmark:Inference Time: 51 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 51 ms | |
DEBUG:ai_benchmark:Inference Time: 51 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 52 ms | |
DEBUG:ai_benchmark:Inference Time: 52 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 54 ms | |
DEBUG:ai_benchmark:Inference Time: 54 ms | |
7.1 - inference | batch=20, size=224x224: 49.8 ± 1.4 ms | |
INFO:ai_benchmark:7.1 - inference | batch=20, size=224x224: 49.8 ± 1.4 ms | |
2023-09-04 15:41:52.221623: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:52.847856: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 232 ms | |
DEBUG:ai_benchmark:Training Time: 232 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 86 ms | |
DEBUG:ai_benchmark:Training Time: 86 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 87 ms | |
DEBUG:ai_benchmark:Training Time: 87 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 87 ms | |
DEBUG:ai_benchmark:Training Time: 87 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 87 ms | |
DEBUG:ai_benchmark:Training Time: 87 ms | |
Training Time: 92 ms | |
DEBUG:ai_benchmark:Training Time: 92 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 91 ms | |
DEBUG:ai_benchmark:Training Time: 91 ms | |
Training Time: 89 ms | |
DEBUG:ai_benchmark:Training Time: 89 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
Training Time: 90 ms | |
DEBUG:ai_benchmark:Training Time: 90 ms | |
7.2 - training | batch=2, size=224x224: 89.4 ± 1.6 ms | |
INFO:ai_benchmark:7.2 - training | batch=2, size=224x224: 89.4 ± 1.6 ms | |
8/19. SRCNN 9-5-5 | |
INFO:ai_benchmark: | |
8/19. SRCNN 9-5-5 | |
2023-09-04 15:41:55.001108: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:55.001252: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:55.001309: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:55.001395: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:55.001452: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:41:55.001486: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:41:55.020559: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:41:55.205874: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 76 ms | |
DEBUG:ai_benchmark:Inference Time: 76 ms | |
Inference Time: 39 ms | |
DEBUG:ai_benchmark:Inference Time: 39 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 30 ms | |
DEBUG:ai_benchmark:Inference Time: 30 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 30 ms | |
DEBUG:ai_benchmark:Inference Time: 30 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 30 ms | |
DEBUG:ai_benchmark:Inference Time: 30 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
Inference Time: 32 ms | |
DEBUG:ai_benchmark:Inference Time: 32 ms | |
Inference Time: 31 ms | |
DEBUG:ai_benchmark:Inference Time: 31 ms | |
8.1 - inference | batch=10, size=512x512: 31.5 ± 1.8 ms | |
INFO:ai_benchmark:8.1 - inference | batch=10, size=512x512: 31.5 ± 1.8 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 25 ms | |
DEBUG:ai_benchmark:Inference Time: 25 ms | |
Inference Time: 27 ms | |
DEBUG:ai_benchmark:Inference Time: 27 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 27 ms | |
DEBUG:ai_benchmark:Inference Time: 27 ms | |
Inference Time: 27 ms | |
DEBUG:ai_benchmark:Inference Time: 27 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 27 ms | |
DEBUG:ai_benchmark:Inference Time: 27 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
Inference Time: 27 ms | |
DEBUG:ai_benchmark:Inference Time: 27 ms | |
Inference Time: 26 ms | |
DEBUG:ai_benchmark:Inference Time: 26 ms | |
8.2 - inference | batch=1, size=1536x1536: 26.1 ± 0.6 ms | |
INFO:ai_benchmark:8.2 - inference | batch=1, size=1536x1536: 26.1 ± 0.6 ms | |
2023-09-04 15:42:00.974126: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:01.530477: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 272 ms | |
DEBUG:ai_benchmark:Training Time: 272 ms | |
Training Time: 195 ms | |
DEBUG:ai_benchmark:Training Time: 195 ms | |
Training Time: 193 ms | |
DEBUG:ai_benchmark:Training Time: 193 ms | |
Training Time: 191 ms | |
DEBUG:ai_benchmark:Training Time: 191 ms | |
Training Time: 193 ms | |
DEBUG:ai_benchmark:Training Time: 193 ms | |
Training Time: 195 ms | |
DEBUG:ai_benchmark:Training Time: 195 ms | |
Training Time: 193 ms | |
DEBUG:ai_benchmark:Training Time: 193 ms | |
Training Time: 192 ms | |
DEBUG:ai_benchmark:Training Time: 192 ms | |
Training Time: 195 ms | |
DEBUG:ai_benchmark:Training Time: 195 ms | |
Training Time: 191 ms | |
DEBUG:ai_benchmark:Training Time: 191 ms | |
Training Time: 194 ms | |
DEBUG:ai_benchmark:Training Time: 194 ms | |
Training Time: 192 ms | |
DEBUG:ai_benchmark:Training Time: 192 ms | |
Training Time: 191 ms | |
DEBUG:ai_benchmark:Training Time: 191 ms | |
Training Time: 189 ms | |
DEBUG:ai_benchmark:Training Time: 189 ms | |
Training Time: 196 ms | |
DEBUG:ai_benchmark:Training Time: 196 ms | |
Training Time: 191 ms | |
DEBUG:ai_benchmark:Training Time: 191 ms | |
Training Time: 188 ms | |
DEBUG:ai_benchmark:Training Time: 188 ms | |
Training Time: 191 ms | |
DEBUG:ai_benchmark:Training Time: 191 ms | |
Training Time: 193 ms | |
DEBUG:ai_benchmark:Training Time: 193 ms | |
Training Time: 190 ms | |
DEBUG:ai_benchmark:Training Time: 190 ms | |
Training Time: 193 ms | |
DEBUG:ai_benchmark:Training Time: 193 ms | |
Training Time: 192 ms | |
DEBUG:ai_benchmark:Training Time: 192 ms | |
8.3 - training | batch=10, size=512x512: 192 ± 2 ms | |
INFO:ai_benchmark:8.3 - training | batch=10, size=512x512: 192 ± 2 ms | |
9/19. VGG-19 Super-Res | |
INFO:ai_benchmark: | |
9/19. VGG-19 Super-Res | |
2023-09-04 15:42:15.529053: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:15.529191: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:15.529246: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:15.529321: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:15.529369: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:15.529390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:42:15.659972: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:15.838272: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 106 ms | |
DEBUG:ai_benchmark:Inference Time: 106 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 36 ms | |
DEBUG:ai_benchmark:Inference Time: 36 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
9.1 - inference | batch=10, size=256x256: 36.7 ± 0.5 ms | |
INFO:ai_benchmark:9.1 - inference | batch=10, size=256x256: 36.7 ± 0.5 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 58 ms | |
DEBUG:ai_benchmark:Inference Time: 58 ms | |
Inference Time: 60 ms | |
DEBUG:ai_benchmark:Inference Time: 60 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 60 ms | |
DEBUG:ai_benchmark:Inference Time: 60 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 58 ms | |
DEBUG:ai_benchmark:Inference Time: 58 ms | |
Inference Time: 60 ms | |
DEBUG:ai_benchmark:Inference Time: 60 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 60 ms | |
DEBUG:ai_benchmark:Inference Time: 60 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 59 ms | |
DEBUG:ai_benchmark:Inference Time: 59 ms | |
Inference Time: 58 ms | |
DEBUG:ai_benchmark:Inference Time: 58 ms | |
9.2 - inference | batch=1, size=1024x1024: 59.1 ± 0.7 ms | |
INFO:ai_benchmark:9.2 - inference | batch=1, size=1024x1024: 59.1 ± 0.7 ms | |
2023-09-04 15:42:21.519407: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:22.034918: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 374 ms | |
DEBUG:ai_benchmark:Training Time: 374 ms | |
Training Time: 223 ms | |
DEBUG:ai_benchmark:Training Time: 223 ms | |
Training Time: 219 ms | |
DEBUG:ai_benchmark:Training Time: 219 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 222 ms | |
DEBUG:ai_benchmark:Training Time: 222 ms | |
Training Time: 219 ms | |
DEBUG:ai_benchmark:Training Time: 219 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 221 ms | |
DEBUG:ai_benchmark:Training Time: 221 ms | |
Training Time: 222 ms | |
DEBUG:ai_benchmark:Training Time: 222 ms | |
Training Time: 221 ms | |
DEBUG:ai_benchmark:Training Time: 221 ms | |
Training Time: 221 ms | |
DEBUG:ai_benchmark:Training Time: 221 ms | |
Training Time: 222 ms | |
DEBUG:ai_benchmark:Training Time: 222 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 221 ms | |
DEBUG:ai_benchmark:Training Time: 221 ms | |
Training Time: 225 ms | |
DEBUG:ai_benchmark:Training Time: 225 ms | |
Training Time: 217 ms | |
DEBUG:ai_benchmark:Training Time: 217 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 220 ms | |
DEBUG:ai_benchmark:Training Time: 220 ms | |
Training Time: 222 ms | |
DEBUG:ai_benchmark:Training Time: 222 ms | |
9.3 - training | batch=10, size=224x224: 221 ± 2 ms | |
INFO:ai_benchmark:9.3 - training | batch=10, size=224x224: 221 ± 2 ms | |
10/19. ResNet-SRGAN | |
INFO:ai_benchmark: | |
10/19. ResNet-SRGAN | |
2023-09-04 15:42:34.314120: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:34.314257: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:34.314312: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:34.314394: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:34.314449: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:34.314483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:42:34.749338: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:35.127095: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 415 ms | |
DEBUG:ai_benchmark:Inference Time: 415 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 46 ms | |
DEBUG:ai_benchmark:Inference Time: 46 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 46 ms | |
DEBUG:ai_benchmark:Inference Time: 46 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 44 ms | |
DEBUG:ai_benchmark:Inference Time: 44 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
Inference Time: 45 ms | |
DEBUG:ai_benchmark:Inference Time: 45 ms | |
10.1 - inference | batch=10, size=512x512: 44.8 ± 0.6 ms | |
INFO:ai_benchmark:10.1 - inference | batch=10, size=512x512: 44.8 ± 0.6 ms | |
Inference Time: 40 ms | |
DEBUG:ai_benchmark:Inference Time: 40 ms | |
Inference Time: 37 ms | |
DEBUG:ai_benchmark:Inference Time: 37 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 39 ms | |
DEBUG:ai_benchmark:Inference Time: 39 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 39 ms | |
DEBUG:ai_benchmark:Inference Time: 39 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 39 ms | |
DEBUG:ai_benchmark:Inference Time: 39 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 39 ms | |
DEBUG:ai_benchmark:Inference Time: 39 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
Inference Time: 38 ms | |
DEBUG:ai_benchmark:Inference Time: 38 ms | |
10.2 - inference | batch=1, size=1536x1536: 38.1 ± 0.5 ms | |
INFO:ai_benchmark:10.2 - inference | batch=1, size=1536x1536: 38.1 ± 0.5 ms | |
2023-09-04 15:42:42.313045: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:43.081593: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 681 ms | |
DEBUG:ai_benchmark:Training Time: 681 ms | |
Training Time: 133 ms | |
DEBUG:ai_benchmark:Training Time: 133 ms | |
Training Time: 132 ms | |
DEBUG:ai_benchmark:Training Time: 132 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 130 ms | |
DEBUG:ai_benchmark:Training Time: 130 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 130 ms | |
DEBUG:ai_benchmark:Training Time: 130 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 130 ms | |
DEBUG:ai_benchmark:Training Time: 130 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 131 ms | |
DEBUG:ai_benchmark:Training Time: 131 ms | |
Training Time: 130 ms | |
DEBUG:ai_benchmark:Training Time: 130 ms | |
Training Time: 132 ms | |
DEBUG:ai_benchmark:Training Time: 132 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 130 ms | |
DEBUG:ai_benchmark:Training Time: 130 ms | |
Training Time: 129 ms | |
DEBUG:ai_benchmark:Training Time: 129 ms | |
Training Time: 139 ms | |
DEBUG:ai_benchmark:Training Time: 139 ms | |
10.3 - training | batch=5, size=512x512: 131 ± 2 ms | |
INFO:ai_benchmark:10.3 - training | batch=5, size=512x512: 131 ± 2 ms | |
11/19. ResNet-DPED | |
INFO:ai_benchmark: | |
11/19. ResNet-DPED | |
2023-09-04 15:42:50.745526: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:50.745647: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:50.745686: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:50.745751: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:50.745792: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:42:50.745816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:42:50.827978: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:51.025158: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 482 ms | |
DEBUG:ai_benchmark:Inference Time: 482 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 53 ms | |
DEBUG:ai_benchmark:Inference Time: 53 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 53 ms | |
DEBUG:ai_benchmark:Inference Time: 53 ms | |
Inference Time: 47 ms | |
DEBUG:ai_benchmark:Inference Time: 47 ms | |
Inference Time: 50 ms | |
DEBUG:ai_benchmark:Inference Time: 50 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 47 ms | |
DEBUG:ai_benchmark:Inference Time: 47 ms | |
11.1 - inference | batch=10, size=256x256: 49.1 ± 1.6 ms | |
INFO:ai_benchmark:11.1 - inference | batch=10, size=256x256: 49.1 ± 1.6 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 78 ms | |
DEBUG:ai_benchmark:Inference Time: 78 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 78 ms | |
DEBUG:ai_benchmark:Inference Time: 78 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 78 ms | |
DEBUG:ai_benchmark:Inference Time: 78 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
11.2 - inference | batch=1, size=1024x1024: 79.2 ± 0.9 ms | |
INFO:ai_benchmark:11.2 - inference | batch=1, size=1024x1024: 79.2 ± 0.9 ms | |
2023-09-04 15:42:58.036611: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:42:58.930243: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 810 ms | |
DEBUG:ai_benchmark:Training Time: 810 ms | |
Training Time: 107 ms | |
DEBUG:ai_benchmark:Training Time: 107 ms | |
Training Time: 107 ms | |
DEBUG:ai_benchmark:Training Time: 107 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 108 ms | |
DEBUG:ai_benchmark:Training Time: 108 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 108 ms | |
DEBUG:ai_benchmark:Training Time: 108 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 110 ms | |
DEBUG:ai_benchmark:Training Time: 110 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 108 ms | |
DEBUG:ai_benchmark:Training Time: 108 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
Training Time: 109 ms | |
DEBUG:ai_benchmark:Training Time: 109 ms | |
11.3 - training | batch=15, size=128x128: 108.7 ± 0.7 ms | |
INFO:ai_benchmark:11.3 - training | batch=15, size=128x128: 108.7 ± 0.7 ms | |
12/19. U-Net | |
INFO:ai_benchmark: | |
12/19. U-Net | |
2023-09-04 15:43:12.089042: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:12.089159: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:12.089209: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:12.089286: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:12.089341: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:12.089373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:43:12.281416: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:43:12.466331: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 8843 ms | |
DEBUG:ai_benchmark:Inference Time: 8843 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 79 ms | |
DEBUG:ai_benchmark:Inference Time: 79 ms | |
12.1 - inference | batch=4, size=512x512: 80.0 ± 0.7 ms | |
INFO:ai_benchmark:12.1 - inference | batch=4, size=512x512: 80.0 ± 0.7 ms | |
Inference Time: 9645 ms | |
DEBUG:ai_benchmark:Inference Time: 9645 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 80 ms | |
DEBUG:ai_benchmark:Inference Time: 80 ms | |
Inference Time: 82 ms | |
DEBUG:ai_benchmark:Inference Time: 82 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
Inference Time: 81 ms | |
DEBUG:ai_benchmark:Inference Time: 81 ms | |
12.2 - inference | batch=1, size=1024x1024: 81.2 ± 0.6 ms | |
INFO:ai_benchmark:12.2 - inference | batch=1, size=1024x1024: 81.2 ± 0.6 ms | |
2023-09-04 15:43:37.220648: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:43:38.261203: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 7317 ms | |
DEBUG:ai_benchmark:Training Time: 7317 ms | |
Training Time: 127 ms | |
DEBUG:ai_benchmark:Training Time: 127 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 127 ms | |
DEBUG:ai_benchmark:Training Time: 127 ms | |
Training Time: 127 ms | |
DEBUG:ai_benchmark:Training Time: 127 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
Training Time: 125 ms | |
DEBUG:ai_benchmark:Training Time: 125 ms | |
Training Time: 126 ms | |
DEBUG:ai_benchmark:Training Time: 126 ms | |
12.3 - training | batch=4, size=256x256: 125.8 ± 0.7 ms | |
INFO:ai_benchmark:12.3 - training | batch=4, size=256x256: 125.8 ± 0.7 ms | |
13/19. Nvidia-SPADE | |
INFO:ai_benchmark: | |
13/19. Nvidia-SPADE | |
2023-09-04 15:43:48.612494: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:48.612610: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:48.612662: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:48.612738: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:48.612798: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:43:48.612836: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:43:49.097222: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:43:49.470627: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2933 ms | |
DEBUG:ai_benchmark:Inference Time: 2933 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
Inference Time: 48 ms | |
DEBUG:ai_benchmark:Inference Time: 48 ms | |
Inference Time: 49 ms | |
DEBUG:ai_benchmark:Inference Time: 49 ms | |
13.1 - inference | batch=5, size=128x128: 48.4 ± 0.5 ms | |
INFO:ai_benchmark:13.1 - inference | batch=5, size=128x128: 48.4 ± 0.5 ms | |
2023-09-04 15:43:56.321476: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:43:58.402851: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 4250 ms | |
DEBUG:ai_benchmark:Training Time: 4250 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 84 ms | |
DEBUG:ai_benchmark:Training Time: 84 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 83 ms | |
DEBUG:ai_benchmark:Training Time: 83 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
Training Time: 82 ms | |
DEBUG:ai_benchmark:Training Time: 82 ms | |
13.2 - training | batch=1, size=128x128: 82.5 ± 0.6 ms | |
INFO:ai_benchmark:13.2 - training | batch=1, size=128x128: 82.5 ± 0.6 ms | |
14/19. ICNet | |
INFO:ai_benchmark: | |
14/19. ICNet | |
2023-09-04 15:44:03.065577: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:03.065709: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:03.065755: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:03.065833: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:03.065882: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:03.065918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:44:03.505339: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:44:03.897380: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 3419 ms | |
DEBUG:ai_benchmark:Inference Time: 3419 ms | |
Inference Time: 104 ms | |
DEBUG:ai_benchmark:Inference Time: 104 ms | |
Inference Time: 105 ms | |
DEBUG:ai_benchmark:Inference Time: 105 ms | |
Inference Time: 106 ms | |
DEBUG:ai_benchmark:Inference Time: 106 ms | |
Inference Time: 111 ms | |
DEBUG:ai_benchmark:Inference Time: 111 ms | |
Inference Time: 107 ms | |
DEBUG:ai_benchmark:Inference Time: 107 ms | |
Inference Time: 110 ms | |
DEBUG:ai_benchmark:Inference Time: 110 ms | |
Inference Time: 114 ms | |
DEBUG:ai_benchmark:Inference Time: 114 ms | |
Inference Time: 108 ms | |
DEBUG:ai_benchmark:Inference Time: 108 ms | |
Inference Time: 108 ms | |
DEBUG:ai_benchmark:Inference Time: 108 ms | |
Inference Time: 110 ms | |
DEBUG:ai_benchmark:Inference Time: 110 ms | |
Inference Time: 108 ms | |
DEBUG:ai_benchmark:Inference Time: 108 ms | |
Inference Time: 110 ms | |
DEBUG:ai_benchmark:Inference Time: 110 ms | |
Inference Time: 109 ms | |
DEBUG:ai_benchmark:Inference Time: 109 ms | |
Inference Time: 109 ms | |
DEBUG:ai_benchmark:Inference Time: 109 ms | |
Inference Time: 111 ms | |
DEBUG:ai_benchmark:Inference Time: 111 ms | |
Inference Time: 108 ms | |
DEBUG:ai_benchmark:Inference Time: 108 ms | |
Inference Time: 111 ms | |
DEBUG:ai_benchmark:Inference Time: 111 ms | |
Inference Time: 109 ms | |
DEBUG:ai_benchmark:Inference Time: 109 ms | |
Inference Time: 111 ms | |
DEBUG:ai_benchmark:Inference Time: 111 ms | |
Inference Time: 106 ms | |
DEBUG:ai_benchmark:Inference Time: 106 ms | |
Inference Time: 110 ms | |
DEBUG:ai_benchmark:Inference Time: 110 ms | |
14.1 - inference | batch=5, size=1024x1536: 109 ± 2 ms | |
INFO:ai_benchmark:14.1 - inference | batch=5, size=1024x1536: 109 ± 2 ms | |
2023-09-04 15:44:12.763925: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:44:13.750116: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5732 ms | |
DEBUG:ai_benchmark:Training Time: 5732 ms | |
Training Time: 851 ms | |
DEBUG:ai_benchmark:Training Time: 851 ms | |
Training Time: 857 ms | |
DEBUG:ai_benchmark:Training Time: 857 ms | |
Training Time: 858 ms | |
DEBUG:ai_benchmark:Training Time: 858 ms | |
Training Time: 853 ms | |
DEBUG:ai_benchmark:Training Time: 853 ms | |
Training Time: 856 ms | |
DEBUG:ai_benchmark:Training Time: 856 ms | |
Training Time: 862 ms | |
DEBUG:ai_benchmark:Training Time: 862 ms | |
Training Time: 854 ms | |
DEBUG:ai_benchmark:Training Time: 854 ms | |
Training Time: 852 ms | |
DEBUG:ai_benchmark:Training Time: 852 ms | |
Training Time: 855 ms | |
DEBUG:ai_benchmark:Training Time: 855 ms | |
Training Time: 853 ms | |
DEBUG:ai_benchmark:Training Time: 853 ms | |
Training Time: 855 ms | |
DEBUG:ai_benchmark:Training Time: 855 ms | |
Training Time: 859 ms | |
DEBUG:ai_benchmark:Training Time: 859 ms | |
Training Time: 852 ms | |
DEBUG:ai_benchmark:Training Time: 852 ms | |
Training Time: 857 ms | |
DEBUG:ai_benchmark:Training Time: 857 ms | |
Training Time: 852 ms | |
DEBUG:ai_benchmark:Training Time: 852 ms | |
Training Time: 861 ms | |
DEBUG:ai_benchmark:Training Time: 861 ms | |
14.2 - training | batch=10, size=1024x1536: 855 ± 3 ms | |
INFO:ai_benchmark:14.2 - training | batch=10, size=1024x1536: 855 ± 3 ms | |
15/19. PSPNet | |
INFO:ai_benchmark: | |
15/19. PSPNet | |
2023-09-04 15:44:42.700521: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:42.700636: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:42.700673: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:42.700736: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:42.700776: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:44:42.700795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:44:43.308172: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:44:43.714808: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 11204 ms | |
DEBUG:ai_benchmark:Inference Time: 11204 ms | |
Inference Time: 243 ms | |
DEBUG:ai_benchmark:Inference Time: 243 ms | |
Inference Time: 246 ms | |
DEBUG:ai_benchmark:Inference Time: 246 ms | |
Inference Time: 244 ms | |
DEBUG:ai_benchmark:Inference Time: 244 ms | |
Inference Time: 246 ms | |
DEBUG:ai_benchmark:Inference Time: 246 ms | |
Inference Time: 246 ms | |
DEBUG:ai_benchmark:Inference Time: 246 ms | |
Inference Time: 250 ms | |
DEBUG:ai_benchmark:Inference Time: 250 ms | |
Inference Time: 251 ms | |
DEBUG:ai_benchmark:Inference Time: 251 ms | |
Inference Time: 242 ms | |
DEBUG:ai_benchmark:Inference Time: 242 ms | |
Inference Time: 244 ms | |
DEBUG:ai_benchmark:Inference Time: 244 ms | |
Inference Time: 242 ms | |
DEBUG:ai_benchmark:Inference Time: 242 ms | |
Inference Time: 247 ms | |
DEBUG:ai_benchmark:Inference Time: 247 ms | |
Inference Time: 240 ms | |
DEBUG:ai_benchmark:Inference Time: 240 ms | |
Inference Time: 241 ms | |
DEBUG:ai_benchmark:Inference Time: 241 ms | |
Inference Time: 241 ms | |
DEBUG:ai_benchmark:Inference Time: 241 ms | |
Inference Time: 241 ms | |
DEBUG:ai_benchmark:Inference Time: 241 ms | |
Inference Time: 241 ms | |
DEBUG:ai_benchmark:Inference Time: 241 ms | |
Inference Time: 240 ms | |
DEBUG:ai_benchmark:Inference Time: 240 ms | |
Inference Time: 245 ms | |
DEBUG:ai_benchmark:Inference Time: 245 ms | |
Inference Time: 243 ms | |
DEBUG:ai_benchmark:Inference Time: 243 ms | |
Inference Time: 245 ms | |
DEBUG:ai_benchmark:Inference Time: 245 ms | |
Inference Time: 246 ms | |
DEBUG:ai_benchmark:Inference Time: 246 ms | |
15.1 - inference | batch=5, size=720x720: 244 ± 3 ms | |
INFO:ai_benchmark:15.1 - inference | batch=5, size=720x720: 244 ± 3 ms | |
2023-09-04 15:45:03.100497: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:45:04.068987: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 7152 ms | |
DEBUG:ai_benchmark:Training Time: 7152 ms | |
Training Time: 157 ms | |
DEBUG:ai_benchmark:Training Time: 157 ms | |
Training Time: 154 ms | |
DEBUG:ai_benchmark:Training Time: 154 ms | |
Training Time: 153 ms | |
DEBUG:ai_benchmark:Training Time: 153 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 153 ms | |
DEBUG:ai_benchmark:Training Time: 153 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 151 ms | |
DEBUG:ai_benchmark:Training Time: 151 ms | |
Training Time: 151 ms | |
DEBUG:ai_benchmark:Training Time: 151 ms | |
Training Time: 153 ms | |
DEBUG:ai_benchmark:Training Time: 153 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 151 ms | |
DEBUG:ai_benchmark:Training Time: 151 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 151 ms | |
DEBUG:ai_benchmark:Training Time: 151 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
Training Time: 150 ms | |
DEBUG:ai_benchmark:Training Time: 150 ms | |
Training Time: 152 ms | |
DEBUG:ai_benchmark:Training Time: 152 ms | |
15.2 - training | batch=1, size=512x512: 152 ± 1 ms | |
INFO:ai_benchmark:15.2 - training | batch=1, size=512x512: 152 ± 1 ms | |
16/19. DeepLab | |
INFO:ai_benchmark: | |
16/19. DeepLab | |
2023-09-04 15:45:14.005935: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:14.006056: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:14.006095: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:14.006159: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:14.006202: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:14.006226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:45:15.220997: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:45:15.971717: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2029 ms | |
DEBUG:ai_benchmark:Inference Time: 2029 ms | |
Inference Time: 63 ms | |
DEBUG:ai_benchmark:Inference Time: 63 ms | |
Inference Time: 63 ms | |
DEBUG:ai_benchmark:Inference Time: 63 ms | |
Inference Time: 63 ms | |
DEBUG:ai_benchmark:Inference Time: 63 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 63 ms | |
DEBUG:ai_benchmark:Inference Time: 63 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
Inference Time: 61 ms | |
DEBUG:ai_benchmark:Inference Time: 61 ms | |
Inference Time: 62 ms | |
DEBUG:ai_benchmark:Inference Time: 62 ms | |
16.1 - inference | batch=2, size=512x512: 61.9 ± 0.7 ms | |
INFO:ai_benchmark:16.1 - inference | batch=2, size=512x512: 61.9 ± 0.7 ms | |
2023-09-04 15:45:22.157052: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:45:24.000836: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3498 ms | |
DEBUG:ai_benchmark:Training Time: 3498 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 93 ms | |
DEBUG:ai_benchmark:Training Time: 93 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
Training Time: 94 ms | |
DEBUG:ai_benchmark:Training Time: 94 ms | |
16.2 - training | batch=1, size=384x384: 93.7 ± 0.5 ms | |
INFO:ai_benchmark:16.2 - training | batch=1, size=384x384: 93.7 ± 0.5 ms | |
17/19. Pixel-RNN | |
INFO:ai_benchmark: | |
17/19. Pixel-RNN | |
2023-09-04 15:45:28.181679: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.181795: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.181827: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.181885: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.181922: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.181941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:45:28.414678: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.414806: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.414848: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.414922: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.414987: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:45:28.415011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:45:45.467070: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:45:45.484360: W tensorflow/c/c_api.cc:305] Operation '{name:'conv2d_out_logits/biases/Adam_1/Assign' id:47369 op device:{requested: '', assigned: ''} def:{{{node conv2d_out_logits/biases/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv2d_out_logits/biases/Adam_1, conv2d_out_logits/biases/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-04 15:45:47.055800: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:45:50.984216: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 29323 ms | |
DEBUG:ai_benchmark:Inference Time: 29323 ms | |
Inference Time: 448 ms | |
DEBUG:ai_benchmark:Inference Time: 448 ms | |
Inference Time: 455 ms | |
DEBUG:ai_benchmark:Inference Time: 455 ms | |
Inference Time: 465 ms | |
DEBUG:ai_benchmark:Inference Time: 465 ms | |
Inference Time: 463 ms | |
DEBUG:ai_benchmark:Inference Time: 463 ms | |
17.1 - inference | batch=50, size=64x64: 458 ± 7 ms | |
INFO:ai_benchmark:17.1 - inference | batch=50, size=64x64: 458 ± 7 ms | |
2023-09-04 15:46:38.798519: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 22011 ms | |
DEBUG:ai_benchmark:Training Time: 22011 ms | |
Training Time: 3473 ms | |
DEBUG:ai_benchmark:Training Time: 3473 ms | |
Training Time: 3541 ms | |
DEBUG:ai_benchmark:Training Time: 3541 ms | |
Training Time: 3514 ms | |
DEBUG:ai_benchmark:Training Time: 3514 ms | |
Training Time: 3435 ms | |
DEBUG:ai_benchmark:Training Time: 3435 ms | |
17.2 - training | batch=10, size=64x64: 3491 ± 40 ms | |
INFO:ai_benchmark:17.2 - training | batch=10, size=64x64: 3491 ± 40 ms | |
18/19. LSTM-Sentiment | |
INFO:ai_benchmark: | |
18/19. LSTM-Sentiment | |
2023-09-04 15:46:58.705347: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:46:58.705467: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:46:58.705507: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:46:58.705567: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:46:58.705607: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:46:58.705636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:46:58.975533: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:46:58.986090: W tensorflow/c/c_api.cc:305] Operation '{name:'Variable_1/Adam_1/Assign' id:351 op device:{requested: '', assigned: ''} def:{{{node Variable_1/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](Variable_1/Adam_1, Variable_1/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-04 15:46:59.028116: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:46:59.363607: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 875 ms | |
DEBUG:ai_benchmark:Inference Time: 875 ms | |
Inference Time: 591 ms | |
DEBUG:ai_benchmark:Inference Time: 591 ms | |
Inference Time: 579 ms | |
DEBUG:ai_benchmark:Inference Time: 579 ms | |
Inference Time: 610 ms | |
DEBUG:ai_benchmark:Inference Time: 610 ms | |
Inference Time: 601 ms | |
DEBUG:ai_benchmark:Inference Time: 601 ms | |
Inference Time: 587 ms | |
DEBUG:ai_benchmark:Inference Time: 587 ms | |
Inference Time: 593 ms | |
DEBUG:ai_benchmark:Inference Time: 593 ms | |
Inference Time: 583 ms | |
DEBUG:ai_benchmark:Inference Time: 583 ms | |
Inference Time: 564 ms | |
DEBUG:ai_benchmark:Inference Time: 564 ms | |
Inference Time: 590 ms | |
DEBUG:ai_benchmark:Inference Time: 590 ms | |
Inference Time: 591 ms | |
DEBUG:ai_benchmark:Inference Time: 591 ms | |
Inference Time: 580 ms | |
DEBUG:ai_benchmark:Inference Time: 580 ms | |
Inference Time: 604 ms | |
DEBUG:ai_benchmark:Inference Time: 604 ms | |
Inference Time: 586 ms | |
DEBUG:ai_benchmark:Inference Time: 586 ms | |
Inference Time: 597 ms | |
DEBUG:ai_benchmark:Inference Time: 597 ms | |
Inference Time: 573 ms | |
DEBUG:ai_benchmark:Inference Time: 573 ms | |
Inference Time: 584 ms | |
DEBUG:ai_benchmark:Inference Time: 584 ms | |
Inference Time: 583 ms | |
DEBUG:ai_benchmark:Inference Time: 583 ms | |
Inference Time: 586 ms | |
DEBUG:ai_benchmark:Inference Time: 586 ms | |
Inference Time: 605 ms | |
DEBUG:ai_benchmark:Inference Time: 605 ms | |
Inference Time: 583 ms | |
DEBUG:ai_benchmark:Inference Time: 583 ms | |
Inference Time: 600 ms | |
DEBUG:ai_benchmark:Inference Time: 600 ms | |
18.1 - inference | batch=100, size=1024x300: 589 ± 11 ms | |
INFO:ai_benchmark:18.1 - inference | batch=100, size=1024x300: 589 ± 11 ms | |
2023-09-04 15:47:16.964090: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1984 ms | |
DEBUG:ai_benchmark:Training Time: 1984 ms | |
Training Time: 1732 ms | |
DEBUG:ai_benchmark:Training Time: 1732 ms | |
Training Time: 1732 ms | |
DEBUG:ai_benchmark:Training Time: 1732 ms | |
Training Time: 1663 ms | |
DEBUG:ai_benchmark:Training Time: 1663 ms | |
Training Time: 1711 ms | |
DEBUG:ai_benchmark:Training Time: 1711 ms | |
Training Time: 1747 ms | |
DEBUG:ai_benchmark:Training Time: 1747 ms | |
Training Time: 1798 ms | |
DEBUG:ai_benchmark:Training Time: 1798 ms | |
Training Time: 1704 ms | |
DEBUG:ai_benchmark:Training Time: 1704 ms | |
Training Time: 1774 ms | |
DEBUG:ai_benchmark:Training Time: 1774 ms | |
Training Time: 1747 ms | |
DEBUG:ai_benchmark:Training Time: 1747 ms | |
Training Time: 1779 ms | |
DEBUG:ai_benchmark:Training Time: 1779 ms | |
Training Time: 1833 ms | |
DEBUG:ai_benchmark:Training Time: 1833 ms | |
Training Time: 1738 ms | |
DEBUG:ai_benchmark:Training Time: 1738 ms | |
Training Time: 2088 ms | |
DEBUG:ai_benchmark:Training Time: 2088 ms | |
Training Time: 1736 ms | |
DEBUG:ai_benchmark:Training Time: 1736 ms | |
Training Time: 1783 ms | |
DEBUG:ai_benchmark:Training Time: 1783 ms | |
Training Time: 1706 ms | |
DEBUG:ai_benchmark:Training Time: 1706 ms | |
18.2 - training | batch=10, size=1024x300: 1767 ± 92 ms | |
INFO:ai_benchmark:18.2 - training | batch=10, size=1024x300: 1767 ± 92 ms | |
19/19. GNMT-Translation | |
INFO:ai_benchmark: | |
19/19. GNMT-Translation | |
2023-09-04 15:47:47.390113: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:47:47.390226: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:47:47.390265: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:47:47.390327: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:47:47.390373: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-04 15:47:47.390397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:01:00.0 | |
2023-09-04 15:47:47.662655: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:47:47.677094: W tensorflow/c/c_api.cc:305] Operation '{name:'index_to_string/table_init' id:13 op device:{requested: '', assigned: ''} def:{{{node index_to_string/table_init}} = InitializeTableFromTextFileV2[_has_manual_control_dependencies=true, delimiter="\t", key_index=-1, offset=0, value_index=-2, vocab_size=-1](index_to_string, index_to_string/table_init/asset_filepath)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-04 15:47:47.709007: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-04 15:47:47.918263: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1030 ms | |
DEBUG:ai_benchmark:Inference Time: 1030 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
Inference Time: 187 ms | |
DEBUG:ai_benchmark:Inference Time: 187 ms | |
Inference Time: 187 ms | |
DEBUG:ai_benchmark:Inference Time: 187 ms | |
Inference Time: 185 ms | |
DEBUG:ai_benchmark:Inference Time: 185 ms | |
Inference Time: 184 ms | |
DEBUG:ai_benchmark:Inference Time: 184 ms | |
Inference Time: 184 ms | |
DEBUG:ai_benchmark:Inference Time: 184 ms | |
Inference Time: 184 ms | |
DEBUG:ai_benchmark:Inference Time: 184 ms | |
Inference Time: 183 ms | |
DEBUG:ai_benchmark:Inference Time: 183 ms | |
Inference Time: 184 ms | |
DEBUG:ai_benchmark:Inference Time: 184 ms | |
Inference Time: 185 ms | |
DEBUG:ai_benchmark:Inference Time: 185 ms | |
Inference Time: 187 ms | |
DEBUG:ai_benchmark:Inference Time: 187 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
Inference Time: 185 ms | |
DEBUG:ai_benchmark:Inference Time: 185 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
Inference Time: 185 ms | |
DEBUG:ai_benchmark:Inference Time: 185 ms | |
Inference Time: 188 ms | |
DEBUG:ai_benchmark:Inference Time: 188 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
Inference Time: 184 ms | |
DEBUG:ai_benchmark:Inference Time: 184 ms | |
Inference Time: 185 ms | |
DEBUG:ai_benchmark:Inference Time: 185 ms | |
Inference Time: 186 ms | |
DEBUG:ai_benchmark:Inference Time: 186 ms | |
19.1 - inference | batch=1, size=1x20: 185 ± 1 ms | |
INFO:ai_benchmark:19.1 - inference | batch=1, size=1x20: 185 ± 1 ms | |
Device Inference Score: 21950 | |
INFO:ai_benchmark:Device Inference Score: 21950 | |
Device Training Score: 12434 | |
INFO:ai_benchmark:Device Training Score: 12434 | |
Device AI Score: 34384 | |
INFO:ai_benchmark:Device AI Score: 34384 | |
For more information and results, please visit http://ai-benchmark.com/alpha | |
INFO:ai_benchmark:For more information and results, please visit http://ai-benchmark.com/alpha |
(tf) root@rocm:~/tmp# export ROCM_PATH=/opt/rocm | |
(tf) root@rocm:~/tmp# python bench.py | |
2023-09-16 01:08:01.778483: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. | |
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2023-09-16 01:08:02.688377: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.699485: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.699534: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713262: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713334: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713362: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713692: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713727: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713764: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:08:02.713779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:08:02.913471: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.915442: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.916020: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.919301: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.919896: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.960290: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.962323: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.964550: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.965338: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.966102: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.966905: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.967434: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.968667: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.971729: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.972210: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.978303: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.978835: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.979620: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.983667: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.984152: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.984875: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.985356: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.989608: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.990085: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.990807: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.991288: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.994083: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.994899: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.995378: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.997529: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:02.998368: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:04.913736: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 2400000000 exceeds 10% of free system memory. | |
2023-09-16 01:08:05.553178: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:05.995288: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:05.995513: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 2400000000 exceeds 10% of free system memory. | |
2023-09-16 01:08:06.578404: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.579381: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.580661: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.585309: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.586129: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.586999: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.588190: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.590628: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.591626: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.592076: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:06.592347: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 2400000000 exceeds 10% of free system memory. | |
Epoch 1/2 | |
2023-09-16 01:08:07.179126: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.179954: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.180669: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.181406: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.182121: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.182834: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.183548: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.184262: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.184970: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.185706: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.186421: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.187131: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.187840: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.188569: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.189275: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.190003: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:07.438768: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:08.420273: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f7aaa242dd0 initialized for platform ROCM (this does not guarantee that XLA will be used). Devices: | |
2023-09-16 01:08:08.420307: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Radeon RX 7900 XTX, AMDGPU ISA version: gfx1100 | |
2023-09-16 01:08:08.423665: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable. | |
2023-09-16 01:08:08.485395: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process. | |
20/20 [==============================] - ETA: 0s - loss: 1.1335 | |
2023-09-16 01:08:25.935172: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:25.935742: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:08:25.936198: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
20/20 [==============================] - 19s 828ms/step - loss: 1.1335 | |
2023-09-16 01:08:25.938763: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Epoch 2/2 | |
20/20 [==============================] - 17s 834ms/step - loss: 1.1179 | |
(tf) root@rocm:~/tmp# export ROCM_PATH=/opt/rocm^C | |
(tf) root@rocm:~/tmp# nano ~/.bashrc | |
(tf) root@rocm:~/tmp# python benchmark.py | |
2023-09-16 01:09:24.012526: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. | |
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2023-09-16 01:09:24.920123: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:24.931203: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:24.931251: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
>> AI-Benchmark - 0.1.3.cm | |
>> Let the AI Games begin | |
2023-09-16 01:09:26.107868: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.107982: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.108079: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.108247: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.108306: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.108366: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.108392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:09:26.330448: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330536: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330564: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330611: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330643: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:09:26.330705: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330744: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330772: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330806: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330836: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:26.330848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
* TF Version: 2.15.0 | |
* Platform: Linux-5.15.0-83-generic-x86_64-with-glibc2.35 | |
* CPU: AMD Ryzen 9 3900X 12-Core Processor | |
* CPU RAM: 47 GB | |
* GPU/0: Radeon RX 7900 XTX | |
* GPU RAM: 23.5 GB | |
* CUDA Version: N/A | |
* CUDA Build: N/A | |
The benchmark is running... | |
The tests might take up to 20 minutes | |
Please don't interrupt the script | |
1/19. MobileNet-V2 | |
2023-09-16 01:09:27.599471: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:27.599596: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:27.599652: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:27.599742: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:27.599801: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:27.599829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:09:27.793908: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:386] MLIR V1 optimization pass is not enabled | |
2023-09-16 01:09:28.010852: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:09:28.553784: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2345 ms | |
Inference Time: 31 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
Inference Time: 22 ms | |
1.1 - inference | batch=50, size=224x224: 22.5 ± 1.9 ms | |
2023-09-16 01:09:35.785411: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:09:36.456047: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2038 ms | |
Training Time: 363 ms | |
Training Time: 364 ms | |
Training Time: 351 ms | |
Training Time: 360 ms | |
Training Time: 356 ms | |
Training Time: 356 ms | |
Training Time: 352 ms | |
Training Time: 356 ms | |
Training Time: 350 ms | |
Training Time: 354 ms | |
Training Time: 347 ms | |
Training Time: 352 ms | |
Training Time: 352 ms | |
Training Time: 352 ms | |
Training Time: 353 ms | |
Training Time: 355 ms | |
Training Time: 351 ms | |
Training Time: 355 ms | |
Training Time: 354 ms | |
Training Time: 357 ms | |
Training Time: 351 ms | |
1.2 - training | batch=50, size=224x224: 354 ± 4 ms | |
2/19. Inception-V3 | |
2023-09-16 01:09:48.934252: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:48.934335: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:48.934363: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:48.934415: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:48.934448: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:09:48.934464: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:09:49.413365: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:09:49.762318: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2285 ms | |
Inference Time: 39 ms | |
Inference Time: 33 ms | |
Inference Time: 40 ms | |
Inference Time: 35 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 42 ms | |
Inference Time: 33 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 30 ms | |
Inference Time: 32 ms | |
Inference Time: 30 ms | |
Inference Time: 33 ms | |
Inference Time: 29 ms | |
Inference Time: 33 ms | |
Inference Time: 32 ms | |
Inference Time: 30 ms | |
2.1 - inference | batch=20, size=346x346: 33.1 ± 3.2 ms | |
2023-09-16 01:09:55.682665: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:09:56.634337: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5225 ms | |
Training Time: 385 ms | |
Training Time: 380 ms | |
Training Time: 376 ms | |
Training Time: 379 ms | |
Training Time: 374 ms | |
Training Time: 377 ms | |
Training Time: 378 ms | |
Training Time: 379 ms | |
Training Time: 379 ms | |
Training Time: 381 ms | |
Training Time: 378 ms | |
Training Time: 381 ms | |
Training Time: 375 ms | |
Training Time: 375 ms | |
Training Time: 378 ms | |
Training Time: 375 ms | |
Training Time: 376 ms | |
Training Time: 376 ms | |
Training Time: 377 ms | |
Training Time: 377 ms | |
Training Time: 392 ms | |
2.2 - training | batch=20, size=346x346: 378 ± 4 ms | |
3/19. Inception-V4 | |
2023-09-16 01:10:10.904575: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:10.904662: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:10.904691: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:10.904745: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:10.904778: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:10.904793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:10:11.799543: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:12.285430: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1059 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
Inference Time: 35 ms | |
Inference Time: 36 ms | |
3.1 - inference | batch=10, size=346x346: 35.5 ± 0.5 ms | |
2023-09-16 01:10:16.971001: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:18.599017: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3954 ms | |
Training Time: 306 ms | |
Training Time: 312 ms | |
Training Time: 305 ms | |
Training Time: 306 ms | |
Training Time: 310 ms | |
Training Time: 306 ms | |
Training Time: 306 ms | |
Training Time: 306 ms | |
Training Time: 306 ms | |
Training Time: 309 ms | |
Training Time: 306 ms | |
Training Time: 305 ms | |
Training Time: 307 ms | |
Training Time: 303 ms | |
Training Time: 306 ms | |
Training Time: 308 ms | |
Training Time: 305 ms | |
Training Time: 305 ms | |
Training Time: 303 ms | |
Training Time: 304 ms | |
Training Time: 302 ms | |
3.2 - training | batch=10, size=346x346: 306 ± 2 ms | |
4/19. Inception-ResNet-V2 | |
2023-09-16 01:10:28.544092: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:28.544284: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:28.544368: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:28.544426: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:28.544461: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:28.544476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:10:30.124825: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:30.909275: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1364 ms | |
Inference Time: 43 ms | |
Inference Time: 41 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 42 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
Inference Time: 42 ms | |
Inference Time: 41 ms | |
Inference Time: 41 ms | |
4.1 - inference | batch=10, size=346x346: 41.7 ± 0.6 ms | |
2023-09-16 01:10:37.645296: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:40.838021: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5193 ms | |
Training Time: 260 ms | |
Training Time: 258 ms | |
Training Time: 259 ms | |
Training Time: 257 ms | |
Training Time: 258 ms | |
Training Time: 258 ms | |
Training Time: 259 ms | |
Training Time: 258 ms | |
Training Time: 256 ms | |
Training Time: 259 ms | |
Training Time: 261 ms | |
Training Time: 257 ms | |
Training Time: 258 ms | |
Training Time: 256 ms | |
Training Time: 260 ms | |
Training Time: 259 ms | |
Training Time: 258 ms | |
Training Time: 257 ms | |
Training Time: 258 ms | |
Training Time: 257 ms | |
Training Time: 256 ms | |
4.2 - training | batch=8, size=346x346: 258 ± 1 ms | |
5/19. ResNet-V2-50 | |
2023-09-16 01:10:49.743923: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:49.744131: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:49.744240: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:49.744299: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:49.744332: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:49.744347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:10:50.091722: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:50.321624: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 825 ms | |
Inference Time: 24 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 24 ms | |
Inference Time: 23 ms | |
Inference Time: 36 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 22 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
Inference Time: 23 ms | |
5.1 - inference | batch=10, size=346x346: 23.5 ± 2.8 ms | |
2023-09-16 01:10:53.627191: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:10:54.287310: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2376 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 86 ms | |
Training Time: 86 ms | |
Training Time: 86 ms | |
Training Time: 86 ms | |
Training Time: 86 ms | |
Training Time: 86 ms | |
Training Time: 85 ms | |
Training Time: 86 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 86 ms | |
Training Time: 84 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
5.2 - training | batch=10, size=346x346: 85.3 ± 0.6 ms | |
6/19. ResNet-V2-152 | |
2023-09-16 01:10:58.943750: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:58.943890: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:58.943952: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:58.944046: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:58.944116: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:10:58.944146: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:11:00.440559: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:01.041025: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 980 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 33 ms | |
Inference Time: 31 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 31 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 31 ms | |
Inference Time: 32 ms | |
Inference Time: 31 ms | |
Inference Time: 32 ms | |
Inference Time: 32 ms | |
Inference Time: 33 ms | |
Inference Time: 31 ms | |
Inference Time: 32 ms | |
Inference Time: 31 ms | |
6.1 - inference | batch=10, size=256x256: 31.9 ± 0.7 ms | |
2023-09-16 01:11:06.748259: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:09.001562: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3261 ms | |
Training Time: 110 ms | |
Training Time: 108 ms | |
Training Time: 107 ms | |
Training Time: 109 ms | |
Training Time: 108 ms | |
Training Time: 109 ms | |
Training Time: 107 ms | |
Training Time: 110 ms | |
Training Time: 109 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 121 ms | |
Training Time: 111 ms | |
Training Time: 108 ms | |
Training Time: 108 ms | |
Training Time: 109 ms | |
Training Time: 109 ms | |
Training Time: 111 ms | |
Training Time: 111 ms | |
Training Time: 110 ms | |
6.2 - training | batch=10, size=256x256: 109 ± 3 ms | |
7/19. VGG-16 | |
2023-09-16 01:11:13.609140: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:13.609657: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:13.609693: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:13.609746: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:13.609780: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:13.609795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:11:13.671676: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:13.780649: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 3103 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 51 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 50 ms | |
Inference Time: 47 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 51 ms | |
7.1 - inference | batch=20, size=224x224: 49.4 ± 1.0 ms | |
2023-09-16 01:11:19.507927: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:20.043183: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2170 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 96 ms | |
Training Time: 82 ms | |
Training Time: 83 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 87 ms | |
Training Time: 88 ms | |
Training Time: 84 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
Training Time: 87 ms | |
Training Time: 86 ms | |
Training Time: 85 ms | |
Training Time: 86 ms | |
Training Time: 87 ms | |
Training Time: 89 ms | |
Training Time: 85 ms | |
Training Time: 85 ms | |
7.2 - training | batch=2, size=224x224: 86.0 ± 2.7 ms | |
8/19. SRCNN 9-5-5 | |
2023-09-16 01:11:24.180499: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:24.180626: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:24.180680: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:24.180764: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:24.180826: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:24.180853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:11:24.207632: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:24.401961: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1680 ms | |
Inference Time: 30 ms | |
Inference Time: 30 ms | |
Inference Time: 28 ms | |
Inference Time: 28 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 28 ms | |
Inference Time: 30 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 29 ms | |
Inference Time: 28 ms | |
Inference Time: 28 ms | |
Inference Time: 28 ms | |
Inference Time: 29 ms | |
Inference Time: 28 ms | |
8.1 - inference | batch=10, size=512x512: 28.8 ± 0.7 ms | |
Inference Time: 5214 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 25 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 25 ms | |
Inference Time: 23 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 27 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 25 ms | |
Inference Time: 25 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
Inference Time: 24 ms | |
8.2 - inference | batch=1, size=1536x1536: 24.3 ± 0.8 ms | |
2023-09-16 01:11:36.861344: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:37.401425: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 7170 ms | |
Training Time: 191 ms | |
Training Time: 189 ms | |
Training Time: 185 ms | |
Training Time: 187 ms | |
Training Time: 187 ms | |
Training Time: 186 ms | |
Training Time: 182 ms | |
Training Time: 186 ms | |
Training Time: 188 ms | |
Training Time: 184 ms | |
Training Time: 187 ms | |
Training Time: 186 ms | |
Training Time: 188 ms | |
Training Time: 187 ms | |
Training Time: 185 ms | |
Training Time: 186 ms | |
Training Time: 188 ms | |
Training Time: 187 ms | |
Training Time: 188 ms | |
Training Time: 188 ms | |
Training Time: 187 ms | |
8.3 - training | batch=10, size=512x512: 187 ± 2 ms | |
9/19. VGG-19 Super-Res | |
2023-09-16 01:11:58.573800: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:58.573879: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:58.573909: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:58.573957: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:58.573990: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:11:58.574004: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:11:58.660685: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:11:58.828911: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 342 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 36 ms | |
Inference Time: 37 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
Inference Time: 36 ms | |
9.1 - inference | batch=10, size=256x256: 36.1 ± 0.3 ms | |
Inference Time: 1210 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 57 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
Inference Time: 58 ms | |
9.2 - inference | batch=1, size=1024x1024: 57.9 ± 0.3 ms | |
2023-09-16 01:12:06.219465: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:12:06.718707: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1261 ms | |
Training Time: 209 ms | |
Training Time: 200 ms | |
Training Time: 202 ms | |
Training Time: 201 ms | |
Training Time: 202 ms | |
Training Time: 202 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 200 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
Training Time: 202 ms | |
Training Time: 202 ms | |
Training Time: 201 ms | |
Training Time: 201 ms | |
9.3 - training | batch=10, size=224x224: 202 ± 2 ms | |
10/19. ResNet-SRGAN | |
2023-09-16 01:12:19.894846: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:19.894926: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:19.894955: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:19.895008: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:19.895041: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:19.895056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:12:20.170133: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:12:20.512697: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1377 ms | |
Inference Time: 55 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 44 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
Inference Time: 45 ms | |
10.1 - inference | batch=10, size=512x512: 45.0 ± 2.3 ms | |
Inference Time: 1276 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 39 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 38 ms | |
Inference Time: 37 ms | |
10.2 - inference | batch=1, size=1536x1536: 37.9 ± 0.5 ms | |
2023-09-16 01:12:29.654666: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:12:30.354905: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2828 ms | |
Training Time: 130 ms | |
Training Time: 122 ms | |
Training Time: 120 ms | |
Training Time: 118 ms | |
Training Time: 121 ms | |
Training Time: 118 ms | |
Training Time: 120 ms | |
Training Time: 118 ms | |
Training Time: 122 ms | |
Training Time: 117 ms | |
Training Time: 117 ms | |
Training Time: 118 ms | |
Training Time: 119 ms | |
Training Time: 119 ms | |
Training Time: 118 ms | |
Training Time: 119 ms | |
Training Time: 120 ms | |
Training Time: 119 ms | |
Training Time: 119 ms | |
Training Time: 119 ms | |
Training Time: 120 ms | |
10.3 - training | batch=5, size=512x512: 120 ± 3 ms | |
11/19. ResNet-DPED | |
2023-09-16 01:12:40.172665: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:40.172746: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:40.172774: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:40.172826: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:40.172857: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:12:40.172872: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:12:40.223224: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:12:40.411515: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 793 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
Inference Time: 47 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 47 ms | |
Inference Time: 47 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
11.1 - inference | batch=10, size=256x256: 47.7 ± 0.6 ms | |
Inference Time: 4246 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
Inference Time: 79 ms | |
11.2 - inference | batch=1, size=1024x1024: 79.1 ± 0.5 ms | |
2023-09-16 01:12:52.253098: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:12:53.148495: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2086 ms | |
Training Time: 112 ms | |
Training Time: 111 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 111 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 109 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 110 ms | |
Training Time: 111 ms | |
Training Time: 109 ms | |
Training Time: 111 ms | |
Training Time: 110 ms | |
11.3 - training | batch=15, size=128x128: 110.1 ± 0.7 ms | |
12/19. U-Net | |
2023-09-16 01:13:08.590230: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:08.590310: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:08.590338: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:08.590393: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:08.590427: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:08.590442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:13:08.716029: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:13:08.878830: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 6404 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 78 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 79 ms | |
Inference Time: 81 ms | |
Inference Time: 79 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 80 ms | |
Inference Time: 81 ms | |
12.1 - inference | batch=4, size=512x512: 79.8 ± 0.7 ms | |
Inference Time: 7562 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 83 ms | |
Inference Time: 81 ms | |
Inference Time: 81 ms | |
Inference Time: 83 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 81 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
Inference Time: 82 ms | |
12.2 - inference | batch=1, size=1024x1024: 81.7 ± 0.6 ms | |
2023-09-16 01:13:28.853492: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:13:29.664292: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 6032 ms | |
Training Time: 122 ms | |
Training Time: 120 ms | |
Training Time: 131 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
Training Time: 122 ms | |
Training Time: 120 ms | |
Training Time: 120 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
Training Time: 122 ms | |
Training Time: 122 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
Training Time: 120 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
Training Time: 120 ms | |
Training Time: 121 ms | |
Training Time: 121 ms | |
12.3 - training | batch=4, size=256x256: 121 ± 2 ms | |
13/19. Nvidia-SPADE | |
2023-09-16 01:13:38.996603: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:38.996736: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:38.996792: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:38.996883: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:38.996951: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:38.996978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:13:39.647058: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:13:39.964882: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2088 ms | |
Inference Time: 51 ms | |
Inference Time: 52 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 51 ms | |
Inference Time: 52 ms | |
Inference Time: 53 ms | |
Inference Time: 53 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 54 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 51 ms | |
Inference Time: 53 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 52 ms | |
Inference Time: 51 ms | |
13.1 - inference | batch=5, size=128x128: 52.2 ± 0.8 ms | |
2023-09-16 01:13:45.599243: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:13:47.300704: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3643 ms | |
Training Time: 82 ms | |
Training Time: 81 ms | |
Training Time: 82 ms | |
Training Time: 83 ms | |
Training Time: 84 ms | |
Training Time: 81 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 81 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 82 ms | |
Training Time: 83 ms | |
Training Time: 81 ms | |
Training Time: 83 ms | |
Training Time: 82 ms | |
13.2 - training | batch=1, size=128x128: 82.0 ± 0.7 ms | |
14/19. ICNet | |
2023-09-16 01:13:51.935487: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:51.935619: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:51.935675: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:51.935762: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:51.935824: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:13:51.935854: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:13:52.244352: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:13:52.589237: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 2384 ms | |
Inference Time: 112 ms | |
Inference Time: 118 ms | |
Inference Time: 120 ms | |
Inference Time: 118 ms | |
Inference Time: 115 ms | |
Inference Time: 116 ms | |
Inference Time: 116 ms | |
Inference Time: 112 ms | |
Inference Time: 116 ms | |
Inference Time: 117 ms | |
Inference Time: 119 ms | |
Inference Time: 117 ms | |
Inference Time: 118 ms | |
Inference Time: 117 ms | |
Inference Time: 118 ms | |
Inference Time: 119 ms | |
Inference Time: 119 ms | |
Inference Time: 115 ms | |
Inference Time: 118 ms | |
Inference Time: 117 ms | |
Inference Time: 117 ms | |
14.1 - inference | batch=5, size=1024x1536: 117 ± 2 ms | |
2023-09-16 01:14:00.527180: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:14:01.451875: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 3562 ms | |
Training Time: 410 ms | |
Training Time: 394 ms | |
Training Time: 413 ms | |
Training Time: 406 ms | |
Training Time: 403 ms | |
Training Time: 394 ms | |
Training Time: 412 ms | |
Training Time: 394 ms | |
Training Time: 413 ms | |
Training Time: 392 ms | |
Training Time: 410 ms | |
Training Time: 406 ms | |
Training Time: 422 ms | |
Training Time: 406 ms | |
Training Time: 411 ms | |
Training Time: 374 ms | |
Training Time: 392 ms | |
Training Time: 385 ms | |
Training Time: 413 ms | |
Training Time: 414 ms | |
Training Time: 407 ms | |
14.2 - training | batch=10, size=1024x1536: 403 ± 11 ms | |
15/19. PSPNet | |
2023-09-16 01:14:26.256758: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:26.256842: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:26.256871: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:26.256923: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:26.256958: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:26.256973: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:14:26.673789: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:14:27.041583: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 10691 ms | |
Inference Time: 230 ms | |
Inference Time: 229 ms | |
Inference Time: 234 ms | |
Inference Time: 229 ms | |
Inference Time: 229 ms | |
Inference Time: 229 ms | |
Inference Time: 231 ms | |
Inference Time: 227 ms | |
Inference Time: 235 ms | |
Inference Time: 236 ms | |
Inference Time: 229 ms | |
Inference Time: 229 ms | |
Inference Time: 233 ms | |
Inference Time: 234 ms | |
Inference Time: 235 ms | |
Inference Time: 241 ms | |
Inference Time: 228 ms | |
Inference Time: 229 ms | |
Inference Time: 231 ms | |
Inference Time: 231 ms | |
Inference Time: 230 ms | |
15.1 - inference | batch=5, size=720x720: 231 ± 3 ms | |
2023-09-16 01:14:45.300386: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:14:46.123581: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 5854 ms | |
Training Time: 128 ms | |
Training Time: 126 ms | |
Training Time: 125 ms | |
Training Time: 126 ms | |
Training Time: 125 ms | |
Training Time: 126 ms | |
Training Time: 125 ms | |
Training Time: 124 ms | |
Training Time: 126 ms | |
Training Time: 125 ms | |
Training Time: 126 ms | |
Training Time: 126 ms | |
Training Time: 126 ms | |
Training Time: 127 ms | |
Training Time: 126 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 125 ms | |
Training Time: 126 ms | |
15.2 - training | batch=1, size=512x512: 125.6 ± 0.8 ms | |
16/19. DeepLab | |
2023-09-16 01:14:54.647838: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:54.647981: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:54.648043: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:54.648139: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:54.648215: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:14:54.648246: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:14:55.486111: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:14:56.103411: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 1426 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 48 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 50 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
Inference Time: 49 ms | |
16.1 - inference | batch=2, size=512x512: 49.1 ± 0.6 ms | |
2023-09-16 01:15:01.135335: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:15:02.648263: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 2878 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 79 ms | |
Training Time: 78 ms | |
Training Time: 79 ms | |
Training Time: 78 ms | |
Training Time: 79 ms | |
Training Time: 78 ms | |
Training Time: 79 ms | |
Training Time: 78 ms | |
Training Time: 78 ms | |
Training Time: 79 ms | |
Training Time: 79 ms | |
Training Time: 79 ms | |
Training Time: 79 ms | |
Training Time: 79 ms | |
Training Time: 82 ms | |
16.2 - training | batch=1, size=384x384: 78.6 ± 0.9 ms | |
17/19. Pixel-RNN | |
2023-09-16 01:15:06.435202: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.435363: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.435612: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.435715: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.435784: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.435814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:15:06.619004: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.619087: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.619114: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.619166: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.619195: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:15:06.619209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:15:19.205765: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:15:19.218428: W tensorflow/c/c_api.cc:305] Operation '{name:'conv2d_out_logits/biases/Adam_1/Assign' id:47369 op device:{requested: '', assigned: ''} def:{{{node conv2d_out_logits/biases/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv2d_out_logits/biases/Adam_1, conv2d_out_logits/biases/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-16 01:15:20.533055: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:15:23.932621: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 4019 ms | |
Inference Time: 394 ms | |
Inference Time: 381 ms | |
Inference Time: 394 ms | |
Inference Time: 385 ms | |
Inference Time: 386 ms | |
Inference Time: 391 ms | |
Inference Time: 377 ms | |
Inference Time: 382 ms | |
Inference Time: 371 ms | |
Inference Time: 375 ms | |
Inference Time: 373 ms | |
Inference Time: 416 ms | |
Inference Time: 373 ms | |
Inference Time: 391 ms | |
Inference Time: 376 ms | |
Inference Time: 378 ms | |
Inference Time: 418 ms | |
Inference Time: 427 ms | |
Inference Time: 418 ms | |
Inference Time: 415 ms | |
Inference Time: 408 ms | |
17.1 - inference | batch=50, size=64x64: 392 ± 17 ms | |
2023-09-16 01:16:00.780439: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 17809 ms | |
Training Time: 2783 ms | |
Training Time: 2690 ms | |
Training Time: 2772 ms | |
Training Time: 2720 ms | |
17.2 - training | batch=10, size=64x64: 2741 ± 38 ms | |
18/19. LSTM-Sentiment | |
2023-09-16 01:16:16.892938: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:16:16.893023: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:16:16.893052: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:16:16.893111: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:16:16.893145: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:16:16.893160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:16:17.096621: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:16:17.105792: W tensorflow/c/c_api.cc:305] Operation '{name:'Variable_1/Adam_1/Assign' id:351 op device:{requested: '', assigned: ''} def:{{{node Variable_1/Adam_1/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](Variable_1/Adam_1, Variable_1/Adam_1/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-16 01:16:17.144163: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:16:17.480118: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 743 ms | |
Inference Time: 448 ms | |
Inference Time: 452 ms | |
Inference Time: 436 ms | |
Inference Time: 434 ms | |
Inference Time: 445 ms | |
Inference Time: 445 ms | |
Inference Time: 442 ms | |
Inference Time: 448 ms | |
Inference Time: 448 ms | |
Inference Time: 449 ms | |
Inference Time: 448 ms | |
Inference Time: 446 ms | |
Inference Time: 449 ms | |
Inference Time: 447 ms | |
Inference Time: 442 ms | |
Inference Time: 440 ms | |
Inference Time: 452 ms | |
Inference Time: 455 ms | |
Inference Time: 439 ms | |
Inference Time: 448 ms | |
Inference Time: 452 ms | |
18.1 - inference | batch=100, size=1024x300: 446 ± 5 ms | |
2023-09-16 01:16:32.039514: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Training Time: 1554 ms | |
Training Time: 1441 ms | |
Training Time: 1414 ms | |
Training Time: 1400 ms | |
Training Time: 1390 ms | |
Training Time: 1392 ms | |
Training Time: 1374 ms | |
Training Time: 1378 ms | |
Training Time: 1416 ms | |
Training Time: 1389 ms | |
Training Time: 1384 ms | |
Training Time: 1415 ms | |
Training Time: 1381 ms | |
Training Time: 1410 ms | |
Training Time: 1365 ms | |
Training Time: 1387 ms | |
Training Time: 1394 ms | |
Training Time: 1381 ms | |
Training Time: 1359 ms | |
Training Time: 1368 ms | |
Training Time: 1431 ms | |
18.2 - training | batch=10, size=1024x300: 1393 ± 21 ms | |
19/19. GNMT-Translation | |
2023-09-16 01:17:01.992496: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:17:01.992623: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:17:01.992679: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:17:01.992762: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:17:01.992823: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:761] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero | |
2023-09-16 01:17:01.992849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1911] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24020 MB memory: -> device: 0, name: Radeon RX 7900 XTX, pci bus id: 0000:2d:00.0 | |
2023-09-16 01:17:02.196492: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:17:02.209058: W tensorflow/c/c_api.cc:305] Operation '{name:'index_to_string/table_init' id:13 op device:{requested: '', assigned: ''} def:{{{node index_to_string/table_init}} = InitializeTableFromTextFileV2[_has_manual_control_dependencies=true, delimiter="\t", key_index=-1, offset=0, value_index=-2, vocab_size=-1](index_to_string, index_to_string/table_init/asset_filepath)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session. | |
2023-09-16 01:17:02.238052: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
2023-09-16 01:17:02.429881: I tensorflow/core/common_runtime/gpu_fusion_pass.cc:508] ROCm Fusion is enabled. | |
Inference Time: 988 ms | |
Inference Time: 122 ms | |
Inference Time: 121 ms | |
Inference Time: 121 ms | |
Inference Time: 122 ms | |
Inference Time: 124 ms | |
Inference Time: 122 ms | |
Inference Time: 121 ms | |
Inference Time: 122 ms | |
Inference Time: 122 ms | |
Inference Time: 124 ms | |
Inference Time: 128 ms | |
Inference Time: 124 ms | |
Inference Time: 121 ms | |
Inference Time: 122 ms | |
Inference Time: 122 ms | |
Inference Time: 121 ms | |
Inference Time: 122 ms | |
Inference Time: 124 ms | |
Inference Time: 125 ms | |
Inference Time: 121 ms | |
Inference Time: 123 ms | |
19.1 - inference | batch=1, size=1x20: 123 ± 2 ms | |
Device Inference Score: 23681 | |
Device Training Score: 14143 | |
Device AI Score: 37824 | |
For more information and results, please visit http://ai-benchmark.com/alpha |
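
The per-test summary lines above (for example `17.2 - training | batch=10, size=64x64: 2741 ± 38 ms`) appear consistent with taking the mean and population standard deviation of the listed timings after dropping the first, much slower warm-up run. The sketch below reproduces that figure from the 17.2 timings. It is an assumption inferred from the numbers in this log, not taken from the ai-benchmark source, and the names `TIMING` and `summarize` are hypothetical helpers.

```python
import re
import statistics

# Matches lines such as "Training Time: 2783 ms | |" (trailing text is ignored).
TIMING = re.compile(r"^(Inference|Training) Time:\s*(\d+)\s*ms")

def summarize(lines):
    """Return (mean, population std) of the timings, skipping the first run.

    Assumption: the first measurement is a warm-up and is excluded,
    which matches the summary values printed in this log.
    """
    times = [int(m.group(2)) for line in lines if (m := TIMING.match(line.strip()))]
    steady = times[1:]  # drop the warm-up iteration
    return statistics.fmean(steady), statistics.pstdev(steady)

# Timings copied from test 17.2 (Pixel-RNN training) in the log above.
sample = """Training Time: 17809 ms
Training Time: 2783 ms
Training Time: 2690 ms
Training Time: 2772 ms
Training Time: 2720 ms""".splitlines()

mean, std = summarize(sample)
print(f"{mean:.0f} ± {std:.0f} ms")
```

The same helper can be pointed at any other block of `Inference Time` or `Training Time` lines in the log to cross-check its summary line.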