This is a lab that walks you through one approach to the challenge of how we can have easy and high performance language virtual machines.
It's designed to come after Mario Wolczko's lecture on a Concise and Opinionated History of Virtual Machines, and illustrates practically some of the concepts and techniques introduced there.
This lab is about just one approach to easy and high performance VMs, and that is the Truffle and Graal approach. Truffle is a framework for developing self-specialising AST interpreters in Java, and Graal is a dynamic compiler, which Truffle uses on your behalf to give you a just-in-time compiler for your interpreter, using partial evaluation.
This lab of course won't teach you how to work on a production language or compiler in an hour and a half, but I hope it can concretely do is to show how Truffle makes it easy to implement a language, and how accessible the Graal compiler is, so you know you can reach for these tools if you need them in future.
- Chris Seaton
- [email protected]
- https://twitter.com/ChrisGSeaton
-
Refresh the big ideas of self-specialising AST interpreters and partial evaluation
-
Set up for Truffle development
-
Tour a simple language interpreter in Truffle
-
Add new functionality to the simple language
-
Combining Truffle with Graal for partial evaluation
-
Tour Graal
-
Stretch ideas
You will need:
-
Internet access
-
Some basic development tools like Python 2 and Git
-
A Java 8 JDK and some Java ecosystem tools like Maven 3
-
An IDE like Eclipse (Neon and Oxygen version both work), or a plain text editor and
grep
if you prefer -
Probably in practice you need me to talk you through the tasks - it's supposed to be interactive
Mario talked about two techniques used in implementing high-performance language implementations easily - AST interpretation and partial evaluation. We can demonstrate these two key ideas to ourselves in order to make sure that they're clear.
Here's a simple AST interpreter in Python.
I haven't written a parser, as we'll leave parsers as a solved problem for this lab, and instead we'll create AST nodes directly by using their constructors.
You can see the essential parts of the AST interpreter - the node classes,
and the execute
methods. See how the execute
method of the AddNode
calls execute
on the left and right operands.
class ConstantNode:
def __init__(self, value):
self.value = value
def execute(self):
return self.value
class AddNode:
def __init__(self, left, right):
self.left = left
self.right = right
def execute(self):
return self.left.execute() + self.right.execute()
class SubNode:
def __init__(self, left, right):
self.left = left
self.right = right
def execute(self):
return self.left.execute() - self.right.execute()
print AddNode(ConstantNode(6),
SubNode(ConstantNode(4), ConstantNode(2))).execute()
Tasks:
- Add
MulNode
andDivNode
for a multiply and divide operation.
This just means adding a new class, and then you can see how you add the logic
for the operation in the execute
method. Notice how the code needed for the
new operations are contained within the new classes - we don't have to modify
a central execute
method. This is an advantage of writing AST interpreters.
- Manually partially evaluate your interpreter for the expression
1 + 2 - 3
Partial evaluation is a technique for compiling or executing an AST interpreter. What it means in practice is that we take the code for a call to an execute method and inline that code.
AddNode(ConstantNode(1), ConstantNode(2)).execute()
ConstantNode(1).execute() + ConstantNode(2).execute()
1 + 2
3
This is effectively what the Graal partial evaluator we're going to be using does.
To look at a Truffle language we're going to need the latest release of GraalVM. You can download it for Linux from GitHub, or from OTN for macOS:
- https://github.com/oracle/graal/releases
- http://www.oracle.com/technetwork/oracle-labs/program-languages/downloads/index.html
After extracting it, set the JAVA_HOME
environment variable to point to it (to
Contents/Home
on macOS).
You then want to clone an example Truffle language, SimpleLanguage. SimpleLanguage is a bit like a simpler version of JavaScript and is heavily commented to show how Truffle is used to build a language. After cloning it you can build it using Maven - it's just a normal Java application.
$ export JAVA_HOME=`pwd`/graalvm-ee-1.0.0-rc3
$ git clone https://github.com/graalvm/simplelanguage.git
$ cd simplelanguage
$ mvn package
Tasks:
-
Compile SimpleLanguage
-
Run the SimpleLangauge
HelloWorld.sl
test program
Look at the sl
launcher program, and the simplelanguage/language/tests
directory.
- Run some more test programs that look interesting.
SimpleLanguage is designed to be easy to read. A good way to navigate it is to
also use the Eclipse IDE. Install the m2e
and m2e-apt
plugins from Help,
Eclipse Marketplace, and then do File, Import, Maven, Existing Maven
Projects to import SimpleLanguage.
Tasks:
- Find the AST node Java class that implements the addition operator
Use Navigate, Open Type to search for classes.
You'll see that it appears to have multiple execute
methods, unlike our Python
example, and that the results of executing child nodes are passed in as
parameters to the method, rather than executing them yourself. Each overload
matches different pairs of types from the results from the children. This
reduces the number of if
branches you need to write to match the types that
your operators support.
Read the comments in this file.
- Find where the parser emits the node
A companion class called SLAddNodeGen
is created automatically by Truffle.
Look at SLNodeFactory
to see the node being created. Note how the parser
directly creates a SLAddNode
- there are no other intermediate
representations.
- Find the
if
node, and where the parser emits it
The SLIfNode
looks different to SLAddNode
and doesn't have child values
passed in as parameters. Why do you think that is? What does this logic talking
about profiling do?
- Find the
return
node - who catches the exception it throws?
SLReturnNode
throws an exception. Why is it doing that if there's no error.
Who catches this exception? Use Eclipse's right click and Find References on
the class name to find out.
- Add support for the
<
operator comparing two strings for lexical ordering
Can you get this program working by modifying SLLessThanNode
? You will need
to run mvn package
before running sl
again.
function main() {
println("a" < "b");
println("b" < "a");
println("a" < "a");
}
- Run the SimpleLanguage
LoopCall.sl
test program
LoopCall.sl
runs in a loop, so we'd expect parts of it to get hot and get
compiled. Run with the -J-Dgraal.TruffleBackgroundCompilation=false -J-Dgraal.TraceTruffleCompilation=true
options to see this happening. root add
is the root node, of the add
function, being compiled.
- Dump Graal graphs from
LoopCall.sl
to IGV
IGV is a graphical tool to understand what Graal and Truffle are doing. Download IGV from GitHub:
https://github.com/oracle/graal/releases
$ idealgraphvisualizer/bin/idealgraphvisualizer &
When you open IGV, click Remove State on the right-hand side, to make the graphs simpler.
Then run sl
with the -dump
option. Explore the tree on the left of the IGV
screen.
-
Find the Truffle AST graph in IGV
-
Find the add node in the Graal graph for the
add
function in IGV
Look at After TruffleTier
and see if you can make some sense of the compiler
nodes you see there. Can you relate them back to the Java code for the AST
interpreter?
-
Look at the graphics for the
loop
function. How does Graal represent thewhile
loop? How is it different to how thewhile
loop is represented in the Truffle AST? -
Dump assembly from
LoopCall.sl
, and find the relevantadd
assembly instruction in the compiledadd
function
Use the -disassemble
option.
-
Why is the
add
instruction followed by ajo
instruction and what does this do? -
Where does the
jo
instruction jump to?
To build Graal in order to open it in an IDE, we need to install a special JVM with a feature called JVMCI. You can get this for Linux from GitHub or OTN for macOS.
- http://www.oracle.com/technetwork/oracle-labs/program-languages/downloads/index.html
- https://github.com/graalvm/openjdk8-jvmci-builder/releases/tag/jvmci-0.46
Then clone our special build tool, mx
, and then Graal (clone at depth 1 to
save time).
$ git clone https://github.com/graalvm/mx.git
$ export PATH=`pwd`/mx:$PATH
$ git clone --depth 1 https://github.com/oracle/graal.git
$ cd graal/compiler
$ JAVA_HOME=~/Documents/labsjdk1.8.0_172-jvmci-0.46/Contents/Home mx eclipseinit
Open Eclipse, and open the graal
directory as a workspace.
File, Import, General, Existing Projects Into Workspace and select the
graal
directory again.
- Find the Graal IR graph node Java class that represents the
add
instruction we had in theadd
SimpleLanguage function
Like SimpleLanguage had SLAdd
, Graal has a node in the compiler for
IntegerAddExactNode
, that we saw when we used IGV. How does this class differ
from the AST node? What do you think the canonical
method does?
- Find where this node is created from Java bytecode
Can you find where this node is created? Is is created by a parser? What kind of parser is it?
- Can you follow the code from
IntegerAddExactNode
to find where that jump on overflow branch is created, and what bytes get emitted?
You should end at the jcc
method in AMD64Assembler
. What is addPatchAt
all
about?
-
What happens in
IntegerAddExactSplitNode
if the operation cannot overflow? -
Follow this code through to find what machine-code bytes are emitted for add-with-overflow on the AMD64 architecture
-
Find the Truffle JavaScript implementation, clone it and explore it
-
See if you can add a new language feature to JavaScript