Unbound Vars and Gradual Typing

Vars are one of the most beloved features of Clojure, whether users know it or not. Vars are global top-level variables in Clojure, implemented as unmanaged references (with respect to Clojure's STM implementation).

(def a 1)

The previous expression defines a new var a and assigns 1 as its root binding. To access the current value of a var, you just name it.

(assert (= a 1))

Here we provided a root binding, but vars can also be unbound.

(def u)

This is useful for forward declarations and provides flexibility in the order vars are defined.

(ann u Int)
(def u)

(ann b [-> Int])
(defn b [] u)

This code assumes at some later stage u will be defined with a root binding. But observe what happens if we run this program as-is:

(b) ;=> <Unbound clojure.lang.Var>

Dereferencing an unbound var does not throw; it returns a sentinel object, an instance of clojure.lang.Var$Unbound. But wait, didn't we annotate u to be an Int? This is a nightmare for gradual typing because we must assume that every occurrence of u could be unbound!
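To make the sentinel concrete, here is a small REPL sketch in plain Clojure, without any type annotations. The var name demo-unbound is made up for this illustration.

```clojure
;; `demo-unbound` is a hypothetical name used only for this illustration.
(def demo-unbound)

;; The var exists, but it has no root binding.
(bound? #'demo-unbound)                             ; false

;; Naming the var does not throw; it yields the sentinel object.
(instance? clojure.lang.Var$Unbound demo-unbound)   ; true
```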

We could solve this by annotating b so that it may return the unbound sentinel, like so:

(ann u Int)
(def u)

(ann b [-> (U Int UnboundVar)])
(defn b [] u)

Another idea is to use the spectrum gradual typing provides for when program invariants are checked. Let's insert a cast to ensure u is bound before we use it:

(ann u Int)
(def u)

(ann b [-> Int])
(defn b []
  (assert (bound? #'u))
  u)

The syntax #'u returns the actual var u, and the predicate bound? returns true if a var has a root binding. This seems like it should work, but assuming Typed Clojure type checks u as (U Int UnboundVar), this isn't quite enough: how does Typed Clojure know statically that u didn't change between the assertion and the dereference?

It doesn't. Another thread could easily remove the root binding of this var and we're back to square one. Instead, let's play to the strengths of occurrence typing and use Clojure's immutable local bindings to reason about u.
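The gap between the check and the read can be simulated on a single thread. In this sketch, the var flaky and the function check-then-read are hypothetical, and the interfere callback stands in for whatever another thread might do in between.

```clojure
;; `flaky` and `check-then-read` are hypothetical names for this illustration.
(def flaky 1)

(defn check-then-read
  "Asserts that the var is bound, runs `interfere`, then reads the var.
  `interfere` stands in for work done by another thread in the gap."
  [interfere]
  (assert (bound? #'flaky))
  (interfere)
  flaky)

;; With no interference, the check is meaningful.
(check-then-read (fn []))                              ; 1

;; Removing the root binding in the gap defeats the check:
;; the assertion passed, yet the sentinel is returned.
(instance? clojure.lang.Var$Unbound
           (check-then-read #(.unbindRoot #'flaky)))   ; true
```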

(ann u Int)
(def u)

(ann b [-> Int])
(defn b []
  (let [u u]
    (assert (not (instance? clojure.lang.Var$Unbound u)))
    u))

Now we're getting somewhere. Assuming the right-hand-side of the let-binding has type (U Int UnboundVar), the assertion must protect us from accidentally returning an unbound variable, because the local binding u is immutable.

Performance

Clojure's vars did not always behave this way. In 2010, dereferencing an unbound var threw an exception, but this was changed, presumably for performance reasons.

Vars are often an important factor in Clojure performance, as every occurrence of a var corresponds to a dereference, even in a loop! This means you can change running code via def at any time, which is great during development and for long-running services.
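As a small illustration of that indirection, the sketch below redefines a function and observes the caller picking up the change. The names greeting and greet are made up, and the behaviour shown assumes the default compiler settings, where direct linking is not enabled.

```clojure
;; `greeting` and `greet` are hypothetical names for this illustration.
(defn greeting [] "hello")

;; `greet` looks up the var `greeting` every time it is called.
(defn greet [] (str (greeting) ", world"))

(greet)   ; "hello, world"

;; Redefining the var changes what already-compiled callers see.
(defn greeting [] "goodbye")

(greet)   ; "goodbye, world"
```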

It is important that the final code emitted by Typed Clojure performs no more checks than needed. Our first version, for example, which uses #'u, effectively dereferences the var twice, which is probably bad.

Our final version dereferences u once, then performs a single check of the local value against the unbound sentinel, a cheap operation on the JVM.
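The single-read pattern can also be written as an ordinary function in plain Clojure. This is only a sketch: bound-value is a hypothetical helper, not part of Typed Clojure, and it uses an instance? test against clojure.lang.Var$Unbound, the class of the sentinel object, as the cheap check.

```clojure
(defn bound-value
  "Hypothetical helper: reads var `v` exactly once and returns its value,
  throwing if the value read is the unbound sentinel."
  [v]
  (let [x @v]                                    ; the only read of the var
    (if (instance? clojure.lang.Var$Unbound x)   ; check on the immutable local
      (throw (IllegalStateException. (str "Var is unbound: " v)))
      x)))

;; Hypothetical vars for the usage example.
(def answer 42)
(def pending)

(bound-value #'answer)    ; 42
;; (bound-value #'pending) throws IllegalStateException
```

Because x is an immutable local, nothing that happens to the var after the read can invalidate the check.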

frenchy64/vars.md
