Later on, I found out how to display those values directly in LLDB.
In fact, XMM registers are 128 bits wide - there's room for 4 single-precision, 32-bit floating point numbers in there.
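To make the "four floats in 128 bits" layout concrete, here is a minimal sketch that models one XMM register as 16 raw bytes and reads it back lane by lane. The names `xmm_model`, `xmm_pack`, `xmm_lane` and `xmm_print` are made up for this illustration, and it assumes `float` is 32 bits wide (true on the x86-64 targets discussed here).

```c
#include <stdio.h>
#include <string.h>

/* Assumption: float is 32 bits, so four of them fill 128 bits. */
_Static_assert(sizeof(float) == 4, "this sketch assumes 32-bit floats");

/* Hypothetical stand-in for one 128-bit XMM register: 16 raw bytes. */
typedef struct { unsigned char bytes[16]; } xmm_model;

/* Pack four floats into the 16-byte model, lane 0 first. */
xmm_model xmm_pack(float a, float b, float c, float d) {
    float lanes[4] = { a, b, c, d };
    xmm_model r;
    memcpy(r.bytes, lanes, sizeof r.bytes);
    return r;
}

/* Read one 32-bit lane (0..3) back as a float. */
float xmm_lane(const xmm_model *r, int lane) {
    float f;
    memcpy(&f, r->bytes + (size_t)lane * sizeof f, sizeof f);
    return f;
}

/* Print all four lanes, one value per 32-bit slot. */
void xmm_print(const xmm_model *r) {
    printf("{%g %g %g %g}\n",
           xmm_lane(r, 0), xmm_lane(r, 1), xmm_lane(r, 2), xmm_lane(r, 3));
}
```

A scalar float argument like our `3.0f` only occupies lane 0; whatever sits in the other three lanes is leftover junk as far as the function is concerned.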
We can ask lldb to read a register in 'float' format but it doesn't help us:
(lldb) register read xmm0 -f f
xmm0 = 3.92929226918480722e-4942
Since it interprets xmm0 as... a 128-bit floating point value? Or perhaps 80-bit wide, not sure (it can't be a 64-bit double, whose exponent range stops around e-324). In any case it should be 3 and it's 3.9 times ten to the power of minus five thousand or so. So nope.
But we can also ask in 'float32' format and that'll give us four entries:
So... yep. It adds an undefined value to z, then another undefined value again. And yet in the assembly it chooses to do something in takes_three_floats (which yields 5.0f) and something else in main (which yields 6.0f and passes the assert).
The mystery remains!
The mystery of the 5.0f solved!
In fact it's not 5.0f at all!
Remember how in the O1 version, when it was calling takes_three_floats it didn't even bother passing the first two arguments? That's what's happening in O2!
For main
Here's what LLVM thought.
// in main
assert(takes_three_floats(1, 2, 3) == 6)
// alright, we would pass those on the stack... and then in takes_three_floats we have
takes_a_vec3(&z)
// so we'd pass the address of the last one
// and then in takes_a_vec3 we'd do
return v[0] + v[1] + v[2];
So it'd rewrite that to:
// in main we could basically do
float x = 1.0;
float y = 2.0;
float z = 3.0;
assert(takes_a_vec3(&z) == 6);
// that'd be the same thing right?
And then it'd rewrite that to:
// well, in takes_a_vec3 we add'em all
// so we could do:
float x = 1.0;
float y = 2.0;
float z = 3.0;
assert((x + y + z) == 6);
Or just:
assert((int) (1.0 + 2.0 + 3.0) == 6);
And everybody knows 1.0 + 2.0 = 3.0, for sure, so we can do:
assert((3.0 + 3.0) == 6.0)
And that's what it does.
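As a sanity check on that chain of rewrites, here is a small sketch where each function mirrors one of the last few steps. The function names are invented for this example, and it only covers the steps after the out-of-bounds read has been "optimized away", since those are the ones that can be run safely. All the values involved are small integers, which floats represent exactly, so every step really does compare equal to 6.

```c
/* Step: named locals, summed explicitly. */
int step_named_locals(void) {
    float x = 1.0f;
    float y = 2.0f;
    float z = 3.0f;
    return (x + y + z) == 6.0f;
}

/* Step: the locals replaced by their constant values. */
int step_literals(void) {
    return (1.0f + 2.0f + 3.0f) == 6.0f;
}

/* Step: 1.0 + 2.0 folded into 3.0, leaving 3.0 + 3.0. */
int step_folded(void) {
    return (3.0f + 3.0f) == 6.0f;
}
```

Each function returns 1, which is why the assert in main passes no matter which of these forms the compiler settles on.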
For takes_three_floats
Alright, so we have:
float takes_three_floats(float x, float y, float z) {
// look ma, no hands!
return takes_a_vec3(&z);
}
But it's stupid cause we only use z! So let's just pretend the first two arguments don't matter.
And we know what takes_a_vec3 does so we can just go ahead and:
But wait, we said x and y don't matter, we can't use them. Well it just so happens that z = x + y in the only case we're ever called, so we can just do:
float takes_three_floats(float x, float y, float z) {
// z is stored in %xmm2
// %xmm0 and %xmm1 are not used at all.
return z + z;
}
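The root of all this is that `&z` points at a single float, so reading `v[1]` and `v[2]` through it is out of bounds and the compiler is free to do whatever it likes. For contrast, here is a minimal sketch of a well-defined variant. The signature of `takes_a_vec3` is an assumption (the text only shows its body), and `takes_three_floats_defined` is a hypothetical name for the fixed version.

```c
/* Assumed signature: the text only shows the body v[0] + v[1] + v[2]. */
float takes_a_vec3(const float *v) {
    return v[0] + v[1] + v[2];
}

/* Hypothetical well-defined variant: put all three arguments in a real
 * array first, so v[0], v[1] and v[2] all refer to storage that exists. */
float takes_three_floats_defined(float x, float y, float z) {
    float v[3] = { x, y, z };
    return takes_a_vec3(v);
}
```

With a real array behind the pointer, the optimizer can no longer pretend x and y don't matter, and the result is the plain sum whatever the inputs are.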
Further confusion
But wait, in our example it just so happened that 1 + 2 + 3 = 3 + 3. Surely that is not the case everywhere, right?
What if we try with takes_three_floats(1,2,4) == 7 ?
$ ./fun
Assertion failed: (takes_three_floats(1,2,4) == 7), function main, file fun.c, line 15.
[1] 72759 abort ./fun
After inspection, the result returned is 8. Same code in takes_three_floats.
How about takes_three_floats(1,32767,32768) == 65536 ?
$ ./fun
All good.
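Those two runs fit the "it just returns z + z" reading. Here is a rough model of it in plain C, with no undefined behavior involved: `intended_sum`, `observed_model` and `models_agree` are names made up for this sketch, and `observed_model` is only a model of the generated code described above, not the generated code itself.

```c
/* What the source code reads like it should compute. */
float intended_sum(float x, float y, float z) {
    return x + y + z;
}

/* Model of the O2 code described above: only z is read, x and y are ignored. */
float observed_model(float x, float y, float z) {
    (void)x;
    (void)y;
    return z + z;
}

/* Returns 1 when both versions give the same answer.
 * They agree whenever x + y == z, as in the 1, 2, 3 case. */
int models_agree(float x, float y, float z) {
    return intended_sum(x, y, z) == observed_model(x, y, z);
}
```

For (1, 2, 4) the model gives 8 against an intended 7, so the assert fails; for (1, 32767, 32768) both give 65536, so it passes.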
How about we add a fourth parameter? Have a takes_four_floats(1,2,3,4) == 10 ?
$ ./fun
Assertion failed: (takes_four_floats(1, 2, 3, 4) == 12), function main, file fun.c, line 15.
[1] 73583 abort ./fun
After inspection, the result is 16, and the generated code is... :
Hey readers! Tell me if you liked it - I might want to do more frequent studies of assembly code like that, going very much in-depth, in a better presentation format on http://amos.me.
Let me know if you'd like that! I'm @nddrylliog on Twitter.