Vladimir's blog post says that GCC can remove a loop that the JVM cannot. This isn't true: the JVM and our Graal JIT compiler can remove the loop in the example.
The loop is in this method, which we then run inside another loop to trigger compilation of the method foo.
def foo
i = 0; while i < 100_000; i += 1; end;
i
end
loop do
foo
end
You can use our tool, IGV, to understand what the compiler is doing. You need to build Graal from source and then run mx igv. Then you can run:
$ graalvm-0.32/Contents/Home/bin/ruby -J-Dgraal.Dump=:2 -J-Dgraal.TruffleCompileOnly=foo test.rb
If you look at the graph of foo after the last mid-tier has run, you can see it has no loop and just returns the constant value 100_000.
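To make the claim concrete, here is a minimal sketch of what the optimised graph amounts to. The name foo_folded is hypothetical and only for illustration. It is not something the compiler emits, just a Ruby method with the same observable result as foo once the loop has been removed.

```ruby
# The method from the example, with the same behaviour, laid out over several lines.
def foo
  i = 0
  while i < 100_000
    i += 1
  end
  i
end

# Hypothetical stand-in for the compiled code: no loop, just the constant.
def foo_folded
  100_000
end

# Both methods return the same value, so replacing one with the other is safe.
puts foo == foo_folded
```

The loop has no side effects and its trip count is known, so the only thing the caller can observe is the final value of i.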
You can also look at the assembly:
$ graalvm-0.32/Contents/Home/bin/ruby -J-XX:+UnlockDiagnosticVMOptions -J-XX:+PrintAssembly -J-Dgraal.TruffleCompileOnly=foo test.rb
0x000000011158e140: mov DWORD PTR [rsp-0x14000],eax
0x000000011158e147: sub rsp,0x18
0x000000011158e14b: mov QWORD PTR [rsp+0x10],rbp
0x000000011158e150: mov rbx,rdx
0x000000011158e153: mov rax,QWORD PTR [r15+0x60]
0x000000011158e157: lea rsi,[rax+0x10]
0x000000011158e15b: movabs r10,0x7c0011338 ; {metadata('java/lang/Integer')}
0x000000011158e165: cmp rsi,QWORD PTR [r15+0x70]
0x000000011158e169: ja 0x000000011158e1a9
0x000000011158e16f: mov QWORD PTR [r15+0x60],rsi
0x000000011158e173: prefetchnta BYTE PTR [rax+0xd0]
0x000000011158e17a: mov rsi,QWORD PTR [r10+0xa8]
0x000000011158e181: mov QWORD PTR [rax],rsi
0x000000011158e184: mov DWORD PTR [rax+0x8],0xf8002267
; {metadata('java/lang/Integer')}
0x000000011158e18b: mov DWORD PTR [rax+0xc],r12d
0x000000011158e18f: mov DWORD PTR [rax+0xc],0x186a0
0x000000011158e196: mov rbp,QWORD PTR [rsp+0x10]
0x000000011158e19b: add rsp,0x18
0x000000011158e19f: test DWORD PTR [rip+0xfffffffff493be61],eax # 0x0000000105eca006
; {poll_return}
0x000000011158e1a5: vzeroupper
0x000000011158e1a8: ret
There's no loop in there (the ja is a jump forward to a deoptimisation stub, not a backwards jump for a loop), and you can see that the instruction at 0x000000011158e18f just stores 0x186a0, which is hexadecimal for 100_000, into the object to return.
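If you want to check the hexadecimal value yourself, a quick sketch in Ruby is enough. The constant name IMMEDIATE is just an illustrative label for the value seen in the assembly.

```ruby
# The immediate operand from the mov instruction, written as a hex literal.
IMMEDIATE = 0x186a0

# Print it in decimal, then print 100_000 in hexadecimal for the reverse check.
puts IMMEDIATE
puts 100_000.to_s(16)
```

This prints 100000 and then 186a0.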