- Assigned: Oct. 14
- Due: Oct. 30, 11:59 PM
Note: Keshav Kini ([email protected]) is the lead TA for this lab assignment.
This lab is designed mainly to give you some experience in programming in C, and a bit of experience with x86-64 assembly. You will be writing a very basic compiler – in other words, a program that produces x86-64 assembly code when you run it.
Your program will take in a reverse Polish notation arithmetic expression and produce assembly code for a function that computes the value of that expression. This will be explained in detail later in this document.
These instructions are quite long. Read them carefully!
This is an individual project. You may not share your work on lab assignments with other students, but do feel free to ask instructors for help if you are stuck, for example during office hours or a discussion section.
You will turn in a tarball (a .tar.gz
file) containing the source
code for your program, written in C. You must include a Makefile
,
such that we can build your code using the make
command. (If you
did Lab 0, you will already be familiar with writing makefiles.)
Ensure that your code compiles and runs correctly on
skipper.cs.utexas.edu
(I’m just choosing a fixed machine for
consistency).
Any clarifications and corrections for this lab will be posted on Canvas and Piazza, and if necessary, fixes will be committed to this file.
Reverse Polish notation (RPN) is an alternative way of representing arithmetic expressions like those you might type into a calculator. At one point in time, it was popular to use RPN in physical calculators because it was easier to implement a calculator that read RPN. Similarly, it will be easier for you to do this assignment with RPN than with the normal style of writing arithmetic expressions.
Here are some examples to give you an idea of how RPN works.
- Normal:
1 + 1
RPN:
1 1 +
- Normal:
3 * (4 + 7)
RPN:
3 4 7 + *
- Normal:
(8 + 9) * (4 + 3 * 10) - 5
RPN:
8 9 + 4 3 10 * + * 5 -
Do you see how it works? In RPN, arithmetic operators like +
, *
,
and -
come after their two operands, rather than in between them.
As you enter each number or operator, the RPN calculator proceeds with
its computation. When you enter a number, the calculator stores the
number on a stack. When you enter an operator, it pops the latest two
numbers off of the stack, applies the operator to them, and pushes
the result back onto the stack.
Notice how we didn’t need any parentheses to write expressions in RPN! Yes, you might think I cherry-picked examples where RPN didn’t need parentheses, but if you think about it for a while longer you’ll see that parentheses are never necessary in RPN! (Hint: it’s because all the operators take exactly 2 arguments.) That certainly makes it simpler to design the calculator, since you don’t have to keep track of nested parentheses.
Here is an example of how an RPN calculator would process that last example from above, as given by how the calculator’s stack looks after each new input.
8
8 9
17
17 4
17 4 3
17 4 3 10
17 4 30
17 34
578
578 5
573
You can read more about reverse Polish notation on Wikipedia.
We define in this section a simple compiler that takes an expression
on the command line and dumps assembly code to stdout
(standard
output). The assembly code will be a function called compute
that
accepts as arguments three values for the variables x
, y
, and z
respectively, and returns result of the expression.
Input to your program is an expression (sequence of tokens) given on
the command line. This expression may contain integer constants,
variables, and operators. Integer constants may be positive, zero, or
negative, meaning that they will consist of numeric digits, possibly
with a minus sign at the beginning. The only variables allowed are
x
, y
, and z
(because the assembly function your program will
produce will be a function that accepts three arguments, x
, y
, and
z
). The following operators are allowed: +
, *
, -
(plus,
times, minus). Expressions are written in reverse Polish notation (as
described earlier).
A couple of sample invocations of your program might be:
$ ./compiler x 4 + 2 "*" 4 -
[some assembly code is printed here]
$ ./compiler x 4 + 2 4 "*" -
[some assembly code is printed here]
Note: In most shells (including bash
, the one you are probably
using), the *
character is used to mean “all the files in the
current directory”, so unfortunately, the *
needs to be put in
quotation marks for our purposes, otherwise the shell will replace it
with a list of all the files in the current directory.
If variable x
has value 3, these expressions should equal 10 and -1,
respectively. But if variable x
has value 4, the expressions should
equal 12 and 0, respectively.
Therefore, the assembly code produced by the first run of your
compiler should be a function that returns the number 10 when called
with argument x
set to 3, and the number 12 when called with
argument x
set to 4. Meanwhile, the second piece of assembly code
should be a function that returns -1 when 3 is passed as argument x
,
and returns 0 when 4 is passed as argument x
.
The output of your program will be some x86-64 assembly code, printed
to stdout
(standard output). This code should contain a function
named compute
which accepts three arguments according to the
standard x86-64 calling convention. The function should be designed
such that it always returns the value of the input RPN expression
where the values of x
, y
, and z
have been substituted in. More
on that below.
First, think about how you evaluate RPN. Given an RPN expression, do the following:
- If you encounter a variable, push its value on the runtime stack.
- If you encounter a constant, push it on the runtime stack.
- If you encounter an operator, pop the right number of arguments, apply the operator, and push the result.
- If done, the runtime stack should simply contain the answer, so retrieve the answer and return it.
Note that we are directly using “the stack” as our RPN calculation stack, rather than making a separate stack data structure or something.
Various problems might occur. For example,
- You might encounter an unrecognizable symbol.
- There may not be enough arguments on the stack for the operator you are attempting to apply.
- There may be more than one value left on the stack when you are finished.
I’d suggest maintaining an integer variable to keep track of how many values would be on the stack at any point during your compilation / execution.
If you encounter any of these errors, print a useful error message to stderr, not stdout, and terminate with a nonzero exit code. Examples of illegal inputs are:
a
(because the variablea
is not allowed – onlyx
,y
, andz
are allowed)4 +
(because+
needs two arguments)3 4
(because it will leave two values on the stack)3 4 4 2 +
(because it will leave three values on the stack)+
(because+
needs two arguments)
Aside: It’s important to print to stderr
rather than stdout
so that
the user can separate the assembly code you’re trying to output from
any error messages you might produce. For example, when you redirect
stdout
to a file using the special shell directive ”>
”, stderr
messages will still be printed to the screen. Sample invocations
would look like this:
$ ./compiler 1 2 3 4 + + + > output1.s
$ ./compiler 1 2 3 4 "*" - 5 6 7 + + - "*" 8 9 "*" + > output2.s
$ ./compiler 2 "*" 2 > output3.s
Error: The * operator requires two arguments
$ ./compiler 2 2 "*" > output4.s
In all four invocations, any assembly instructions printed to stdout
will be silently put in the specified files, which we are naming with
the .s
extension because that extension is used for assembly code
files. But in the third invocation, there was an error message, and
that didn’t get sent to the output3.s
file, but rather to the
screen, because ”>
” only redirects stdout
, not stderr
.
Now, imagine parsing a legal RPN expression and generating an x86-64
subroutine that would evaluate it. Our expressions may contain only
variables x
, y
, and z
. We assume that these are passed (in
order) to your program via the usual calling mechanism. That is, x
,
y
, and z
are passed in the registers %rdi
, %rsi
, and %rdx
,
respectively. You should return your answer in register %rax
, which
again is specified by the standard calling conventions.
Consider the following RPN expression:
x 4 + 2 * 4 -
A first cut at compiling this might be to generate the following sequence of x86-64 instructions (probably without the comments):
.globl compute # Note: this marks that the label
# "compute:" is a function name
# that can be used from other code.
compute:
pushq %rdi # put x on the stack
pushq $4 # put 4 on the stack
popq %r10 # . add the top two things
popq %r11 # |
addq %r10, %r11 # |
pushq %r11 # '
pushq $2 # put 2 on the stack
popq %r10 # . multiply the top two things
popq %r11 # |
imulq %r10, %r11 # |
pushq %r11 # '
pushq $4 # put 4 on the stack
popq %r10 # . subtract the top two things
popq %r11 # |
subq %r10, %r11 # |
pushq %r11 # '
popq %rax # prepare to return the answer
retq # return from the function
But this is very inefficient. We are pushing numbers on the stack and
immediately popping them off again into registers. Also, we’re
popping numbers off the stack into registers before performing
arithmetic operations on them. It’s true that in x86-64, we can’t
write arithmetic instructions where both the source and destination
are in memory. But sometimes we’ll want to operate on an integer
constant and something on the stack, which can be done with some
instructions (like add
) and not others (like imul
). So maybe we
can optimize a bit.
Here’s a second attempt:
.globl compute
compute:
pushq %rdi # put x on the stack
addq $4, (%rsp) # add 4 (directly in memory)
popq %r8 # . multiply by 2 (using a temporary
imulq $2, %r8 # | register)
pushq %r8 # '
subq $4, (%rsp) # subtract 4 (directly in memory)
popq %rax # prepare to return the answer
retq # return from the function
Notice that add
and sub
can target a memory location, but imul
must target a register. Experiment with (or search the web for
information about) different instructions to see which ones can target
a memory location and which ones have to use registers.
You can use the as
command to call the GNU assembler on your
assembly code, to check if it is valid x86-64 assembly. Use the
--64
flag to ensure you’re assembling as x86-64 as opposed to 32-bit
x86 or any other kind of assembly code.
Now, there’s even more optimization we might be able to do. Notice
how we happened to never put more than one thing on the stack at a
time. So why don’t we just use a register for that one location on
the stack we keep using? Then we don’t need to access memory at all.
And in fact we can just use the %rax
register, into which we were
eventually going to pop that stack value anyway.
Here’s a third attempt:
.globl compute
compute:
movq %rdi, %rax # put x in %rax
addq $4, %rax # add 4 to %rax
imulq $2, %rax # multiply %rax by 2
subq $4, %rax # subtract 4 from %rax
retq # return from the function
Note that we’ve started doing optimizations that are pretty specific
to the exact expression we started out with, x 4 + 2 * 4 -
. They
might not work for all expressions. That means that if you want to
implement these optimizations, your C code that generates this
assembly will have to be smart and know when it can and can’t do
certain optimizations.
For your compilation, do not use callee saved registers – in other words, only use the following registers:
%rax
: for the return value and/or intermediate storage%rsp
: the stack pointer%rdi
,%rsi
, and%rdx
: the registers used for passing in the three arguments%rcx
,%r8
, and%r9
: the three unused argument-passing registers%r10
, and%r11
: the other two caller-saved registers
You can reuse registers if it’s safe to do so, and you can push things onto the stack if you run out of registers at any point.
Your code should be able to handle arbitrary RPN expressions that follow our constraints on variables.
You don’t want to submit your code without testing it first, of course! In this section we’ll cover the various ways in which you can make sure your code is working correctly.
This one is obvious. Make sure your C code doesn’t have any syntax
errors, etc., and make sure that your Makefile
is in order. You
should be able to go to the directory where your code is, type make
,
and have an executable file called compiler
be created in the
directory. Test this on skipper.cs.utexas.edu
as well – that
machine has an environment similar to what we’ll use for grading.
Make sure that when you run the compiler
executable that you
created, with a valid RPN expression passed in as a series of command
line arguments, you see valid assembly code printed to the screen.
Try some of the example invocations mentioned earlier in this README, and make up some more of your own.
Beyond just eyeballing it, make sure that the output is valid assembly
code by using the special shell directive ”>
” (as mentioned earlier)
to make your program direct its output into a .s
file such as
test.s
, and then run the assembler on it to see if there are any
syntax errors or other problems:
$ as --64 test.s
(As mentioned before, the --64
is to ensure that the assembly code
is treated as x86-64 assembly code.)
Your program is required to produce assembly code for a function
called compute
which takes three 64-bit signed integer arguments and
returns a 64-bit signed integer argument. In C, that means the
function would have looked something like this:
#include <stdint.h> // this needs to be at the top of the file
int64_t compute (int64_t x, int64_t y, int64_t z) {
// do stuff
return something;
}
(In practice you’d often write long long
rather than int64_t
, and
then you wouldn’t need the #include <stdint.h>
, but unlike long
long
, the int64_t
type is guaranteed to be exactly 64 bits long,
no matter what compiler or architecture you use.)
But since you’re writing the function in assembly, if you wanted to
use it in some C code, you would want to sort of “import” it. This is
how you do it in C: first, somewhere in your C code, you make an
extern
declaration of the function, like so:
#include <stdint.h> // this needs to be at the top of the file
extern int64_t compute (int64_t x, int64_t y, int64_t z);
Notice that no function body is provided, only the argument types and
return type and the name of the function, with the word extern
(for
“externally defined”). Then, when you compile your C code, just
specify the name of the assembly file which actually contains the
function, and gcc will hook it all up nicely. For example,
$ gcc -o full_program some_program.c compute_function.s
So, to test that your assembly code provides the correct function,
write some random little C program which uses the compute
function
you defined (maybe it calls compute
on some arguments and prints the
result? or whatever), making sure to put in the extern
definition
shown above.
Then compile the program, while telling gcc the name of the file where
you saved the assembly code output from your compiler
. If there
were no errors, gcc was able to read your assembly code as a
definition of a compute
function, as desired. (If you follow the
format shown in the assembly code examples above, you shouldn’t have
any problems.)
By the way, if you’re interested in how this linking process works, check out Chapter 7 in the textbook, which we may get to later in the semester.
Try compiling a few RPN expressions of your own choosing. For each
one, your program will produce some assembly code. Create a C program
that will link with your assembly code (as described in the previous
section) and test it by calling it with some particular values of x
,
y
, and z
and printing the result. Compile the test program
together with the assembly code and run the resulting executable to
see if it prints the number you expected.
Make sure to try various different kinds of RPN expressions, to make sure you did everything right. Try ones with many operators, ones with many numbers in a row, ones with negative numbers, ones with really large numbers, etc.
If you’re feeling really ambitious, you might try to write a
randomizing test harness that automatically tests your compiler on
many different randomly generated RPN expressions, and substitutes in
many different randomly generated values of x
, y
, and z
for each
RPN expression!
Once you’ve finished programming, it’s time to submit your work. Add
a README file explaining the optimization tricks you used to make your
assembly code faster, if any, and then use the tar
utility to create
an archive (“tarball”) of the directory in which you’ve created your
code.
Ideally the tarball should only contain source code, so you should get
rid of any compiled files, generated assembly files, and other such
cruft before making the tarball. But really, as long as we can type
make
after extracting your tarball and get a working compiled
version of your code, that’s good enough. (Make sure to test that
your tarball works on skipper.cs.utexas.edu
, as mentioned earlier.)
Your assignment is to write a compiler program as described above.
The basic grading policy for this lab is that you’ll get 90 points if
you create a program that correctly does what it is supposed to do –
that is, accepts RPN expressions on the command line and dumps
assembly code to stdout
which defines a function called compute
that takes arguments x
, y
, and z
and returns the value of the
RPN expression when those values are substituted in for the variables
x
, y
, and z
in the expression.
Even if you produce straightforward but inefficient assembly code such as that shown in the first example above, you will still earn the 90 points if the code behaves correctly.
The remaining 10 points will be assigned if you optimize the assembly code you generate in some significant way. Two such ways are as follows.
- Avoid using the stack when you can help it, and instead use the various registers you have at your disposal, which are much faster than accessing memory.
- Use constant folding to reduce the number of actual computations
that would take place at runtime.
For example, if your compiler was given an RPN expression that was constant (i.e. didn’t involve any variables), then your compiler could just figure out the integer value of the expression and create an extremely fast, two-instruction function that just wrote the correct answer into
%rax
and returned, without doing any work!Even for RPN expressions that do involve variables, maybe you could do partial constant folding – for example, simplify
x 1 2 3 4 + + + +
tox 10 +
.If you are feeling really ambitious, there are even more sophisticated forms of partial constant folding that you might try to explore, such as exploiting commutativity, factoring polynomial-like expressions, etc.
With your submission, you should include a README file explaining what optimization techniques your compiler uses.
Beyond the 90 + 10 breakdown given above, determination of further partial credit will be left up to the discretion of the grader.