@shenwei356
Created August 4, 2020 12:10
  8            .          .           TEXT ·__mm_add_epi32(SB),0,$0 
  9        640ms      640ms               VMOVDQU x+0(FP), Y0 
 10        5.62s      5.62s               VMOVDQU y+32(FP), Y1 
 11        4.81s      4.81s               VPADDD  Y1, Y0, Y0 
 12        1.16s      1.16s               VMOVDQU Y0, q+64(FP) 
 13        1.30s      1.30s               VZEROUPPER 
 14            .          .               RET 
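
For readers less familiar with the instruction, here is a rough pure-Go sketch of what the `VPADDD Y1, Y0, Y0` step in the listing computes: a lane-wise add of eight 32-bit integers held in two 256-bit registers. The name `addEpi32Ref` and the `[8]int32` argument types are assumptions made for illustration (the frame offsets 0, 32, and 64 suggest 32-byte arguments, but the actual Go declaration is not shown here).

```go
package main

import "fmt"

// addEpi32Ref is a hypothetical scalar reference for the assembly routine
// in the listing. It assumes each 32-byte argument is eight int32 lanes.
func addEpi32Ref(x, y [8]int32) (q [8]int32) {
	for i := range x {
		// Go int32 addition wraps on overflow, matching VPADDD,
		// which does not saturate.
		q[i] = x[i] + y[i]
	}
	return q
}

func main() {
	x := [8]int32{1, 2, 3, 4, 5, 6, 7, 8}
	y := [8]int32{10, 20, 30, 40, 50, 60, 70, 80}
	fmt.Println(addEpi32Ref(x, y))
}
```

A scalar version like this can serve as a correctness check and as a benchmark baseline when judging whether the cost of calling the assembly routine is worth it for only eight lanes per call.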
@shenwei356
Author

I see. I'll post another thread.

Thanks again for your sincere advice. I'll try to learn assembly, which is so useful for improving performance.