@Flameeyes
Last active August 29, 2015 14:08
Manually reduced testcase
extern void externCall(float);
extern float sinrotation();
extern float cosrotation();
static const float midX = 850.5f;
static const float midY = 1753.5f;
void main() {
    const float srcX = midX * cosrotation() - midY * sinrotation();
    externCall(srcX);
}
--- deskew-bdver1-noavx.S 2014-10-27 10:03:19.396965654 -0700
+++ deskew-bdver1.S 2014-10-27 10:03:19.372966911 -0700
@@ -25,13 +25,12 @@
movq -8(%rbp), %rax
xorq %fs:40, %rax
jne .L6
- vmovss -20(%rbp), %xmm2
- vmulss .LC1(%rip), %xmm0, %xmm0
- vmulss .LC0(%rip), %xmm2, %xmm1
+ vmulss .LC1(%rip), %xmm0, %xmm0
+ vmovss -20(%rbp), %xmm1
+ vfmsubss %xmm0, .LC0(%rip), %xmm1, %xmm0
leave
.cfi_remember_state
.cfi_def_cfa 7, 8
- vsubss %xmm0, %xmm1, %xmm0
jmp externCall@PLT
.L6:
.cfi_restore_state
@Flameeyes
Author

Note that if I change the rotation value, the problem can go away: 0.05f gives the same result with both settings, while 0.06f triggers the bug too.

@mansr

mansr commented Oct 27, 2014

0.06f isn't exact.
