(a deep dive into trampolines)
Divyansh Prakash, September 2023
In this post, I'm going to talk about one of the tradeoffs Clojure makes to be a good hosted language. That tradeoff is sticking to the host platform's function calling convention.
Unfortunately, despite being considered "high level languages", popular Java and JS VMs are still constrained by the machine's physical stack space.
This is not just a theoretical problem that fringe loons think about - it has real consequences like lazy sequences blowing the stack in Clojure, leading to bugs and lots | of | discussion.
What this means is that code like the following:
;; Clojure
(defn fact [x]
(if (< x 2)
1
(*' x (fact (dec x)))))
;; (fact 5) => 120
;; (fact 20000) => stack overflow
which is roughly equivalent to the following JS:
// JS
function fact(x) {
x = BigInt(x);
if (x < 2)
return 1n;
return x * fact(x - 1n);
}
// fact(5) => 120n
// fact(20000) => stack overflow
will quickly run out of stack space and throw a stack overflow
error,
even if you have TerraBytes of RAM lying idle.
It should be noted that this "error" is infamous enough that the planet's largest programming forum is named after it.
(You will also notice that the above code uses big integers
.
That's because integer overflow
is another restriction imposed by these
"high level" languages. But let's ignore that for the purpose of this post.)
This begs the question - why are Virtual Machines for High Level Languages constrained by the stack size of my physical machine? Yes, I can tune the JVM to have a bigger stack size, but that's still weird. You can't even control this on client browsers in the case of JS.
Since Clojure decides to be a good tenant and plays along with the host's constraints, it too ends up with the same issue.
However, Clojure gives us a tool to help us in such cases: trampoline.
The docs state that:
trampoline can be used to convert algorithms requiring mutual
recursion without stack consumption. [...]
You will notice that close to 100% of trampoline
examples online show mutually tail-recursive functions only.
Well, this is only partially correct - trampoline can also be used to "fix"
non mutually recursive functions like fact
above, as well as mutually recursive functions
whether recursive at the tail or not.
This means that there are almost NO examples out there showing the full power of trampoline!
That's very sad since I believe it was invented and put to good use by Schemers decades ago, and is integrated into most functional programming platforms by default. And Clojurians, despite being Lispers and functional programmers, are mostly unaware of this beautiful construct!
In this post, I'm going to implement a trampoline and show how it can be used to fix the factorial function above.
Here's a JS implementation of a trampoline:
class Thunk {
constructor(f) {
// f should be a function of no args
this.f = f;
}
}
function trampoline(f, ...args) {
// convert f(...args) into a function g of no args
let g = () => f(...args);
// make a Thunk from g
let t = new Thunk(g);
// process thunks
while(t instanceof Thunk)
t = t.f();
// return the non-thunk
return t;
}
Now we will see how trampoline
can be used to fix our factorial function.
The first step is to convert fact
to be tail recursive.
NOTE: I will leave implementing the Ackermann Function
as an excercise for curious readers to illustrate the difficulty of such a conversion. Try writing an implementation that can compute A(3, 10)
before reading further.
There is a very obvious way to convert fact
to be tail recursive using an accumulator,
but we are going to focus on a more general, mechanical way that can be done for arbitrary functions.
This process is called a CPS transform.
What this does is that instead of returning the result, fact
will now pass the result to a callback.
// k is the callback (continuation)
function k_fact(k, x) {
x = BigInt(x);
if (x < 2)
// call the callback with the result
return k(1n);
// tail recursion!
return k_fact(
// new k
(xMinusOneFactorial) =>
// call the callback with the result
k(x * xMinusOneFactorial),
// new x
x - 1n
);
}
Note: the tail recursion above can be optimized into a while loop which
updates the values of k
and x
instead of calling k_fact
again. This is called
Tail Call Optimization.
This new function can be called in the following manner:
// identity function
let id = (x) => x;
// new calling convention
k_fact(id, 5); // 120n
k_fact(id, 20000); // stack overflow
As you can see, k_fact
still consumes and blows the stack.
This is to be expected, since every recursive call to k_fact
creates a deeper nested callback k
.
We need one final transformation:
- instead of calling the continuation, the call should be wrapped in a
Thunk
- the function should return thunks representing "the rest of the computation"
function t_fact(k, x) {
x = BigInt(x);
if (x < 2)
// call to callback wrapped in thunk
return new Thunk(() => k(1n));
// recursive call wrapped in thunk
return new Thunk(() =>
t_fact(
(xMinusOneFactorial) =>
// call to callback wrapped in thunk
new Thunk(() => k(x * xMinusOneFactorial)),
x - 1n
)
);
}
Here's how we can call this function:
trampoline(t_fact, id, 5) // => 120n
trampoline(t_fact, id, 20000) // => 181920632...0000000000n
Magic! 🎉
We can even automate some of this stuff:
function stackless(t_fn) {
return (...args) =>
trampoline(t_fn, id, ...args);
}
which can be used in the following manner:
let factorial = stackless(t_fact);
factorial(5) // 120n
factorial(20000) // 181920632...0000000000n
Incredible!
This transformation, in my humble opinion, is second only to the Y Combinator in its beauty.
Let us now think about WHY our original (stackful) fact
implementation blows the stack?
The deliciously meta answer is that the JS interpreter's eval
function is a complex mutually recursive function!
It is the implementation's eval
function (written probably in C++) that is stackful!
Just imagine how an expression like f(a + b, c)
is eval
uated:
f
iseval
ed: it is looked up in the var table, and an object{params, body}
is received- the arguments to
f
areeval
eda + b
iseval
eda
iseval
ed: it is looked up in the var table, and1
is receivedb
iseval
ed: it is looked up in the var table, and2
is received1 + 2
iseval
ed and3
is received
c
iseval
ed: it is looked up in the var table, and4
is received
apply({params, body}, [3, 4])
is called- a new lexical scope is created, extending the current one
- in this new scope,
f
'sparams
are bound to3
and4
eval
is called for each statement inf
'sbody
This dance of eval
/ apply
, present in every evaluator, is a case of complex mutual recursion!
And since these functions are usually written in a lower-level machine-oriented language, they are susceptible to blowing up the machine's stack!
So the REAL fix here would be to go and rewrite my browser's JS evaluator using a trampoline.
That would remove an entire class of bugs - stack overflows - from JS and all languages that compile to JS, such as Clojure, TS, etc.
I've been working on my own language. The core
and impl
namespaces contain the code for a stackless evaluator. It started off as a stackless version of Clojure, something similar to SCI, but has since become my testbed for playing with langdev.
In a future post, we will write a small stackless evaluator.
We will also see how this stackless tranformation opens the door to the mind-bending world of continuations.
// trampoline implementation
// =========================
class Thunk {
constructor(f) {
this.f = f;
}
}
function trampoline(f, ...args) {
let g = () => f(...args),
t = new Thunk(g);
while(t instanceof Thunk)
t = t.f();
return t;
}
let id = (x) => x;
function stackless(t_fn) {
return (...args) =>
trampoline(t_fn, id, ...args);
}
// factorial implementation
// ========================
function t_fact(k, x) {
x = BigInt(x);
if (x < 2)
return new Thunk(() => k(1n));
return new Thunk(() =>
t_fact(
(xMinusOneFactorial) =>
new Thunk(() => k(x * xMinusOneFactorial)),
x - 1n
)
);
}
let factorial = stackless(t_fact);