Instantly share code, notes, and snippets.
Last active
August 20, 2024 16:07
-
Star
11
(11)
You must be signed in to star a gist -
Fork
0
(0)
You must be signed in to fork a gist
-
Save AndrasKovacs/0e1bc2532e071cef5974ca582f8090ad to your computer and use it in GitHub Desktop.
NbE with implicit sharing in quotation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{- | |
At ICFP 2022 I attended a talk given by Tomasz Drab, about this paper: | |
"A simple and efficient implementation of strong call by need by an abstract machine" | |
https://dl.acm.org/doi/10.1145/3549822 | |
This is right up my alley since I've implemented strong call-by-need evaluation | |
quite a few times (without ever doing more formal analysis of it) and I'm also | |
interested in performance improvements. Such evaluation is required in | |
conversion checking in dependently typed languages. | |
I was surprised to hear in the talk that the following can be normalized in | |
linear time: | |
λ x. N ω x | |
where `ω = λ x. x x` and `N` is a Church numeral. Normalizing this expression | |
iterates self-application N times on `x`, and in the end we get a tree of | |
self-applications with depth N. | |
The familiar NbE (normalization-by-evaluation) has a lot of implicit sharing in | |
runtime values (which is one reason why it's better than naive term | |
substitution), but when we "quote" values back to terms we usually don't get any | |
implicit sharing, because we create fresh term copies from values. For example, | |
when we evaluate | |
let x = expensive in cons x (cons x (cons x nil)) | |
the value of `x` is shared across the three copies in the list, but when we | |
quote the value, the implicit sharing is discarded and we get three copies | |
of `x` as a term. | |
The idea of the paper/talk is to *also* have sharing when we compute | |
terms. In this file I present a small Haskell implementation for an NbE | |
algorithm in this style. | |
Regarding applications: in short, I don't think I would use this algorithm in | |
practical elaborators, but I do think that it's important to be aware of it. | |
The algorithm has two primary design requirements. | |
1. We must use a fresh name convention in terms, we can't use De Bruijn indices | |
or levels. The reason is that we need term weakening for free. | |
2. We must represent neutral values as embedded terms. This is essential for | |
avoiding fresh copying in quotation. | |
(1) is not too problematic; it's only a small overhead and it's not too | |
annoying. However, (2) is not really workable in elaboration, because we really | |
want neutral values with carefully chosen structure, for convenience and | |
efficiency in unification. Also: | |
- We get *non-observable sharing* from the algorithm. Such sharing is destroyed | |
when we copy the resulting terms! If we want to e.g. serialize terms, we need | |
to convert sharing to observable sharing, which is usually not cheap. | |
- In elaboration we very rarely quote to actual beta-normal terms. Instead, we | |
get the greatest amount of size compression and observable sharing from | |
*preserving definition unfoldings*. | |
-} | |
-- Mind the Strict pragma. | |
{-# language Strict, LambdaCase #-} | |
{-# options_ghc -Wincomplete-patterns #-} | |
{- | |
One important point is that we *can't* use De Bruijn convention, IIUC. The `Int` | |
variables here behave like names. The reason is that weakening for *terms* must | |
be for free in this algorithm, or else we don't get the nice asymptotics. | |
-} | |
data Tm = Var Int | App Tm Tm | Lam Int Tm deriving (Show) | |
-- Association lists of names and lazy values. | |
data Env = Nil | Def Env Int ~Val | |
-- Name of binder, env, closure body. | |
data Closure = Cl Int Env Tm | |
{- | |
Here's the main departure from vanilla NbE: neutrals are literally terms. | |
Also, VLam is annotated with a lazy term which is just the normal form | |
of the value. | |
-} | |
data Val = Ne Tm | VLam {-# unpack #-} Closure ~Tm | |
-- This means that quotation is not even recursive! It's just a projection. | |
-- This is pretty much the main point: quotation does not copy anything. | |
quote :: Val -> Tm | |
quote = \case | |
VLam _ t -> t | |
Ne t -> t | |
{-# inline quote #-} | |
-- evaluation | |
------------------------------------------------------------ | |
{- | |
We pass the next fresh variable along with the environment. The fresh variable | |
must not appear free in any of the values in the env. This is IMO the most | |
efficient way to implement fresh name generation for our purposes. We don't | |
really need global state. | |
-} | |
var :: Env -> Int -> Val | |
var e x = case e of | |
Def e x' v | x == x' -> v | |
| True -> var e x | |
_ -> undefined | |
nf :: Int -> Env -> Tm -> Tm | |
nf fresh e t = quote (eval fresh e t) | |
{-# inline nf #-} | |
-- apply a closure to a value. | |
appCl :: Int -> Closure -> Val -> Val | |
appCl fresh (Cl x e t) ~u = eval fresh (Def e x u) t | |
{-# inline appCl #-} | |
eval :: Int -> Env -> Tm -> Val | |
eval fresh e = \case | |
Var x -> var e x | |
App t u -> case eval fresh e t of | |
VLam t _ -> appCl fresh t (eval fresh e u) | |
Ne t -> Ne (App t (nf fresh e u)) | |
Lam x t -> VLam (Cl x e t) | |
(Lam fresh (nf (fresh+1) (Def e x (Ne (Var fresh))) t)) | |
{- In eval, in the neutral application case we immediately normalize the | |
argument and build the normal term. | |
In the lambda case we save a closure and compute a lazy normal form. | |
-} | |
-- Normalize a closed term. | |
nf0 :: Tm -> Tm | |
nf0 = nf 0 Nil | |
-- Syntactic sugar for variables. | |
instance Num Tm where | |
fromInteger = Var . fromInteger | |
(+) = undefined; (*) = undefined; abs = undefined | |
signum = undefined; negate = undefined | |
($$) = App | |
infixl 7 $$ | |
-- Church natural number literal. | |
cnat :: Int -> Tm | |
cnat n = Lam 0 $ Lam 1 (go n) where | |
go n | n <= 0 = 1 | |
| True = 0 $$ go (n - 1) | |
-- This one is the `λ x. N ω x` expression that I mentioned before. It runs | |
-- reasonably quickly in ghci even for args around one million. | |
test :: Int -> () | |
test n = seq (nf0 (Lam 0 $ cnat n $$ (Lam 0 $ 0 $$ 0) $$ 0)) () |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Bad wording, edited.