julia> function f(x::Float64)
           @iaca for i = 1:1000
               x = x*x - 2*x
           end
           x
       end
f (generic function with 2 methods)

julia> println(analyze(f, Tuple{Float64}))
Intel(R) Architecture Code Analyzer Version - 2.1
Analyzed File - /tmp/tmpoiDgjt
Binary Format - 64Bit
Architecture  - HSW
Analysis Type - Throughput

Throughput Analysis Report
--------------------------
Block Throughput: 8.50 Cycles       Throughput Bottleneck: InterIteration

Port Binding In Cycles Per Iteration:
---------------------------------------------------------------------------------------
|  Port  |  0   -  DV  |  1   |  2   -  D   |  3   -  D   |  4   |  5   |  6   |  7   |
---------------------------------------------------------------------------------------
| Cycles | 1.5    0.0  | 1.5  | 0.5    0.5  | 0.5    0.5  | 0.0  | 1.5  | 1.5  | 0.0  |
---------------------------------------------------------------------------------------

N - port number or number of cycles resource conflict caused delay, DV - Divider pipe (on port 0)
D - Data fetch pipe (on ports 2 and 3), CP - on a critical path
F - Macro Fusion with the previous instruction occurred
* - instruction micro-ops not bound to a port
^ - Micro Fusion happened
# - ESP Tracking sync uop was issued
@ - SSE instruction followed an AVX256 instruction, dozens of cycles penalty is expected
! - instruction not supported, was not accounted in Analysis

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   1    |           |     |           |           |     | 1.0 |     |     |    | mov eax, 0x3e8
|   1    |           |     |           |           |     | 0.5 | 0.5 |     |    | mov rcx, 0x7ffde31fe000
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | vmovsd xmm1, qword ptr [rcx]
|   0*   |           |     |           |           |     |     |     |     |    | nop dword ptr [rax], eax
|   1    | 0.9       | 0.1 |           |           |     |     |     |     | CP | vmulsd xmm2, xmm0, xmm0
|   1    | 0.6       | 0.4 |           |           |     |     |     |     | CP | vmulsd xmm0, xmm0, xmm1
|   1    |           | 1.0 |           |           |     |     |     |     | CP | vaddsd xmm0, xmm2, xmm0
|   1    |           |     |           |           |     |     | 1.0 |     |    | add rax, 0xffffffffffffffff
|   0F   |           |     |           |           |     |     |     |     |    | jnz 0xfffffffffffffff0
Total Num Of Uops: 7

julia>