Sometimes more code in a source language, can mean less code in a generated language.
NOTE: below we used
gcc 5.3
with-std=c++14 -O3
flags.
On the example of C programming language, and its translation to x64 assembly, let's take a look at the following example:
// 7 times `x`.
int foo(int x) {
return x + x + x + x + x + x + x;
}
int main() {
return foo(1);
}
A generated code (applying an optimizing compiler) would look something like this:
foo(int):
leal (%rdi,%rdi,4), %eax
leal (%rax,%rdi,2), %eax
ret
main:
movl $7, %eax
ret
While if we add even more code in the source language, say 8 times for x
:
// 8 times `x`.
int foo(int x) {
return x + x + x + x + x + x + x + x;
}
int main() {
return foo(1);
}
We can get less generated code, taking just one instruction for the actual calculation:
foo(int):
leal 0(,%rdi,8), %eax
ret
main:
movl $8, %eax
ret
How this happens and why, is related to the format of the effective address calculation arithmetics. And in fact, the lea
instruction semantically is more related to exactly address manipulations (like array elements access, fields of objects, etc).
However, compilers often reuse lea
to do simple arithmetic evaluations, although only those that fit the format of the lea
(we covered such optimizations in this gist). You can also get additional info how this generic memory access format looks like here.
In case when the capabilities of lea
are exceeded, compiler should fall-back to e.g. add
instruction since the scale can only be 1, 2, 4 or 8. Having the code for 6 times x
, we again get two instructions, but now already with less optimal addl
instruction:
// 6 times `x`.
int foo(int x) {
return x + x + x + x + x + x;
}
int main() {
return foo(1);
}
In assembly they are leal
and addl
:
foo(int):
leal (%rdi,%rdi,4), %eax
addl %edi, %eax
ret
main:
movl $6, %eax
ret
In any case though, an optimizing compiler evaluated the whole resulting expression at static time, and the result from main
function is a direct constant, instead of a runtime function call. Also notice, the optimizing compiler of course doesn't do a bunch of add
instructions (6, 7 or 8 times), but instead as we said uses an "unrelated" lea
instruction for this.
NOTE: try compiling without full optimization, i.e. excluding
-O3
flag.
Have fun with compilation ;)
P.S.: And of course depends on a compiler version, different translation strategies can be chosen. E.g. that example with 7 times x
instead of two lea
s can still be compiled with one, though less optimal instruction -- imul
(actual multiplication). This is what we get in gcc 6:
foo(int):
imull $7, %edi, %eax
ret
main:
movl $7, %eax
ret
Written by Dmitry Soshnikov.
Cool, thanks for the write up, I hadn't looked at the compiler's assembly output for ages.