Back in January I wrote bf2y which is a brainfuck to YARA compiler. bf2y takes in an arbitrary brainfuck program and outputs the instructions to execute the brainfuck code on the YARA virtual machine (well, a slightly modified VM). If you want the full details of how it works go read the code, but I want to talk about writing a Fibonacii number generator for it.
Brainfuck operates on a simple virtual machine with a very limited set of instructions. The virtual machine has a number of memory cells and a pointer. You can move the pointer forwards and backwards one cell at a time and increment and decrement the value in the memory cell. You can also read a byte of input into the current memory cell, and output the value in the cell. Lastly, you have the ability to jump forwards and backwards depending upon the value in the current memory cell.
These are the commands you get:
+ Increment the value in the current cell by 1
- Decrement the value in the current cell by 1
> Move the pointer to the next cell
- Move the pointer to the previous cell
, Read a byte of input into the current cell
. Output the value in the current cell
[ Move past the matching ] if the value in the current cell is zero
] Move back to the matching [ if the value in the current cell is non-zero
That is all you have to program this machine. It isn't much. I had already demonstrated you could compile brainfuck programs down to YARA bytecode and have it execute properly, but I had never written a program of my own. When I did the initial work back in January the point wasn't to learn how to write a brainfuck program, so I never wrote one of my own from scratch. Since I already had a solid understanding of the instructions by implementing them in the YARA virtual machine, I decided to write a Fibonacci generator in brainfuck and execute it on the YARA VM.
,>+>+<<[->>[->+>+<<]<[->>+<<]>>[-<+>]>[-<<<+>>>]<<<<]>>.
That is it. That is the entirety of the program. Let's break it down by adding some formatting and comments.
, Read into c0
>+>+ c1 = 1; c2 = 1
<< Move back to c0
[ Main loop start
->> Decrement c0 move to c2
[->+>+<<] Add c2 to c3 and c4; c2 = 0
< Move to c1
[->>+<<] Add c1 to c3; c1 = 0
>> Move to c3
[-<+>] Add c3 to c2; c3 = 0
> Move to c4
[-<<<+>>>] Move c4 to c1
<<<< Move to c0
] End main loop; jump back if c0 != 0
>> Result is in c2
. Print it
I find it easier to track this if you follow along with a pen and paper, tracking the value in each of the cells at each instruction. As you do this you will start to see some patterns:
- c0 is our main loop counter
- c1 and c2 hold the current values to add
- c3 holds the sum of c1 and c2 (and eventually gets moved to c2)
- c4 holds the old value of c2 (and eventually gets moved to c1)
The program reads a byte of input and prints out the Nth + 2 Fibonacci number. It does this because I didn't count the first two numbers in the sequence, which I initalized early in the program. Another thing to note is that the program does not print the Fibonacci numbers as it goes, it simply prints the Nth + 2 number in the sequence.
Yes, it works. It starts out fast but becomes unbearably slow:
wxs@wxs-mbp yara % for i in $(jot 50 1); do x=$(printf "%.2x" $i); echo -n "0x$x -> "; /usr/bin/time /bin/sh -c "echo \"\x$x\" | ./bf2y"; done
0x01 -> 2 0.03 real 0.01 user 0.02 sys
0x02 -> 3 0.02 real 0.00 user 0.02 sys
0x03 -> 5 0.03 real 0.01 user 0.03 sys
0x04 -> 8 0.03 real 0.00 user 0.02 sys
0x05 -> 13 0.03 real 0.00 user 0.02 sys
0x06 -> 21 0.02 real 0.00 user 0.02 sys
0x07 -> 34 0.02 real 0.00 user 0.01 sys
0x08 -> 55 0.02 real 0.00 user 0.01 sys
0x09 -> 89 0.02 real 0.00 user 0.01 sys
0x0a -> 144 0.02 real 0.00 user 0.02 sys
0x0b -> 233 0.03 real 0.00 user 0.02 sys
0x0c -> 377 0.02 real 0.00 user 0.01 sys
0x0d -> 610 0.02 real 0.00 user 0.01 sys
0x0e -> 987 0.02 real 0.00 user 0.01 sys
0x0f -> 1597 0.02 real 0.00 user 0.01 sys
0x10 -> 2584 0.02 real 0.00 user 0.01 sys
0x11 -> 4181 0.02 real 0.00 user 0.01 sys
0x12 -> 6765 0.02 real 0.01 user 0.01 sys
0x13 -> 10946 0.03 real 0.01 user 0.02 sys
0x14 -> 17711 0.03 real 0.01 user 0.01 sys
0x15 -> 28657 0.03 real 0.01 user 0.01 sys
0x16 -> 46368 0.04 real 0.02 user 0.02 sys
0x17 -> 75025 0.04 real 0.02 user 0.02 sys
0x18 -> 121393 0.05 real 0.03 user 0.01 sys
0x19 -> 196418 0.07 real 0.05 user 0.01 sys
0x1a -> 317811 0.10 real 0.08 user 0.01 sys
0x1b -> 514229 0.15 real 0.13 user 0.02 sys
0x1c -> 832040 0.23 real 0.21 user 0.02 sys
0x1d -> 1346269 0.35 real 0.33 user 0.02 sys
0x1e -> 2178309 0.56 real 0.54 user 0.02 sys
0x1f -> 3524578 0.89 real 0.86 user 0.02 sys
0x20 -> 5702887 1.42 real 1.39 user 0.02 sys
0x21 -> 9227465 2.24 real 2.22 user 0.02 sys
0x22 -> 14930352 3.65 real 3.62 user 0.02 sys
0x23 -> 24157817 5.94 real 5.91 user 0.02 sys
0x24 -> 39088169 9.75 real 9.69 user 0.03 sys
0x25 -> 63245986 17.05 real 16.67 user 0.08 sys
0x26 -> 102334155 27.23 real 26.89 user 0.14 sys
0x27 -> 165580141 43.22 real 42.85 user 0.14 sys
0x28 -> 267914296 67.14 real 66.84 user 0.12 sys
0x29 -> 433494437 109.93 real 109.26 user 0.26 sys
0x2a -> 701408733 180.41 real 179.06 user 0.52 sys
0x2b -> ^C 268.46 real 265.72 user 0.74 sys
0x2c -> ^C 1.91 real 1.87 user 0.02 sys
0x2d -> ^C 0.24 real 0.22 user 0.02 sys
0x2e -> ^C 0.03 real 0.01 user 0.01 sys
0x2f -> ^C 0.03 real 0.01 user 0.01 sys
0x30 -> ^C 0.03 real 0.01 user 0.01 sys
0x31 -> ^C 0.03 real 0.01 user 0.01 sys
0x32 -> ^C 0.03 real 0.01 user 0.01 sys
wxs@wxs-mbp yara %
As you can see, I killed the run because it was taking too long to generate each sucessive number from scratch. Maybe in the future I will convert the brainfuck code to print the number as it goes as a party trick.
I'm willing to bet there are some clever optimizations to be made here too. ;)
Grab the latest code from my bf2y branch and build it. The build process is documented in the main gist I wrote in January. Once you have it built you can write your own brainfuck program and execute it on the YARA VM yourself. One thing to note is that in the process of writing this I made the output command print an unsigned long long, so if you want to experiment with it printing characters you need to do the ASCII conversion in your head or revert the bottom part of this commit.
Because exploring computation, new machines and new languages is fun. You don't really know how something works until you see it in action for yourself. Everything else is just computing via abstractions.