8 . . TEXT ·__mm_add_epi32(SB),0,$0
9 640ms 640ms VMOVDQU x+0(FP), Y0
10 5.62s 5.62s VMOVDQU y+32(FP), Y1
11 4.81s 4.81s VPADDD Y1, Y0, Y0
12 1.16s 1.16s VMOVDQU Y0, q+64(FP)
13 1.30s 1.30s VZEROUPPER
14 . . RET
Created August 4, 2020 12:10
I see. I'll post another thread.
Thanks again for your sincere advice. I'll try to learn assembly, which is so useful for improving performance.