Last active
August 5, 2024 15:57
-
-
Save AndrasKovacs/f268c0311437f7a8759d5bac57495f8b to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{-# language | |
BlockArguments | |
, ConstraintKinds | |
, ImplicitParams | |
, LambdaCase | |
, OverloadedStrings | |
, PatternSynonyms | |
, Strict | |
, UnicodeSyntax | |
#-} | |
{-# options_ghc -Wincomplete-patterns -Wunused-imports #-} | |
{- | |
-- Practical eta conversion for the unit type. Also eta-long normalization. | |
------------------------------------------------------------------------------- | |
This a small demo of doing eta-conversion for the unit type in the practically | |
nicest way that I know. I don't think it's widely known. | |
This requires retaining types in values, since purely syntax-directed conversion | |
isn't possible for the unit type. The setup also makes it possible to do | |
eta-long normalization in an easy way. | |
There's a significant difference between this and the nicer formal NbE | |
definitions. | |
In formal NbE, there's a quote and an unquote operation (also called: reify and | |
reflect), both of which are type-directed. We decide conversion by first | |
computing eta-long normal forms, then comparing them. | |
In practice we want to avoid computing types and normal forms as much as | |
possible. | |
What we have here: | |
- Neutrals and binders are annotated with lazily computed types. | |
- Conversion is by recursion on values. | |
- Deciding eta for Pi and Sigma is purely syntax-directed. | |
- Conversion only computes types of neutrals, and only when necessary for | |
deciding unit eta. If a conversion problem does not actually depend on unit | |
eta, *no* types are computed, and we only have a modest constant overhead | |
compared to purely syntax-directed conversion. | |
- There is *no* unquote/reflect operation that maps neutrals to values. | |
Instead, we only ever eta-expand neutrals directly to terms, during quotation. | |
- Switching between eta-long quotation and eta-ignoring quotation is very | |
easy with a single branching on some flag. | |
- Eta-long quotation also only computes types of neutrals. | |
- There's a large amount of computation sharing in types. | |
Let's compare Agda as well: | |
- Conversion is type-directed, i.e. a type is passed to conversion at all times. | |
So, a head-normal type is computed for everything, including canonical values. | |
- For record types, the syntax-directed eta-expansion rule is being used, since | |
eager type-directed expansion would be too explosive. | |
- When comparing neutrals, types are recomputed for each value in a spine, | |
starting from the type of the head variable. There is no sharing when the same | |
neutrals are being compared multiple times. | |
I also demonstrate usage of ImplicitParams here, a rarely used GHC feature | |
that's nonetheless very useful in type theory implementations. | |
-} | |
import Prelude hiding (pi) | |
import Control.Monad | |
import Data.String | |
import GHC.Stack | |
-- import Debug.Trace | |
-- We use plain String names everywhere. No De Bruijn. | |
-- This not best practice, just laziness. | |
type Name = String | |
-- "Raw terms" are the input of elaboration. They have type-annotated let-s but | |
-- no annotation on lambda binders. Elaboration annotates the lambdas. | |
data RTm | |
= RU | |
| RPi Name RTm RTm | |
| RLam Name RTm | |
| RApp RTm RTm | |
| RVar Name | |
| RUnit | |
| RTt | |
| RLet Name RTm RTm RTm | |
deriving Show | |
type Ty = Tm | |
-- Core terms have type-annotated lambdas. | |
data Tm | |
= U | |
| Pi Name Ty Ty | |
| Lam Name ~Ty Tm | |
| App Tm Tm | |
| Var Name | |
| Unit | |
| Tt | |
| Let Name Ty Tm Tm | |
deriving Show | |
type VTy = Val | |
data Val | |
= VNe Ne ~VTy -- note the lazy type | |
| VPi Name VTy (Val -> VTy) | |
| VLam Name ~VTy (Val -> Val) -- also here | |
| VU | |
| VUnit | |
| VTt | |
data Ne = NVar String | NApp Ne Val | |
pattern VVar x a = VNe (NVar x) a | |
-- Here's a bit of lesser-known GHC magic that can be tremendously useful. | |
-- Below I define synonyms for *implicit parameter* constraints. | |
-- https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/implicit_parameters.html | |
-- These are arguments that are passed implicitly like instances, but which are | |
-- passed in a lexical manner instead of via instance resolution: the lexically | |
-- innermost "let ?name" binding or passed (?name :: ...) constraint determines | |
-- the value of ?name. | |
-- The list of variables that can possibly occur freely in a semantic value. | |
-- It's called "domain" because it's the domain of the value environment in NbE. | |
-- Operationally, it's used in fresh name generation. | |
type Domain = (?domain :: [Name]) | |
-- A mapping from variables which can occur in *terms*, to values. We use this in evaluation | |
type Env = (?env :: [(Name, Val)]) | |
-- A mapping from variables in *raw terms* to types and non-shadowing names (we get rid of shadowing | |
-- during elaboration). | |
type Cxt = (?cxt :: [(Name, (Name, VTy))]) | |
type Elab = (Domain, Env, Cxt) | |
-- Create a fresh variable as value, while extending the domain with the fresh name. | |
fresh :: Name -> VTy -> (Domain => Name -> Val -> a) -> Domain => a | |
fresh x ~a act = | |
let go x | x == "_" = x | |
| elem x ?domain = go (x ++ "'") | |
| otherwise = x | |
x' = go x | |
v = VVar x' a in | |
let ?domain = x' : ?domain in | |
act x' v | |
defineEnv :: Name -> Val -> (Env => a) -> Env => a | |
defineEnv x ~v act = let ?env = (x, v) : ?env in act | |
defineElab :: Name -> Val -> VTy -> (Elab => Name -> a) -> Elab => a | |
defineElab x a ~v act = | |
fresh x a \x' _ -> defineEnv x' v $ let ?cxt = (x, (x', a)) : ?cxt in act x' | |
bindElab :: Name -> VTy -> (Elab => Name -> Val -> a) -> Elab => a | |
bindElab x ~a act = | |
fresh x a \x' v -> defineEnv x' v $ let ?cxt = (x, (x', a)) : ?cxt in act x' v | |
impossible :: HasCallStack => a | |
impossible = error "impossible" | |
vapp :: Val -> Val -> Val | |
vapp t u = case t of | |
-- the type is lazy, so the expression after the $ is a thunk at this point that captures "a" and "u" | |
VNe n a -> VNe (NApp n u) $ case a of | |
VPi _ _ b -> b u | |
_ -> impossible | |
VLam _ _ t -> t u | |
_ -> impossible | |
eval :: Env => Tm -> Val | |
eval = \case | |
Var x -> maybe (error x) id $ lookup x ?env | |
App t u -> vapp (eval t) (eval u) | |
Let x a t u -> defineEnv x (eval t) (eval u) | |
U -> VU | |
Pi x a b -> let va = eval a in VPi x va \v -> defineEnv x v (eval b) | |
Lam x a t -> let ~va = eval a in VLam x va \v -> defineEnv x v (eval t) | |
Unit -> VUnit | |
Tt -> VTt | |
-- Are all values of a type definitionally equal? Here the irrelevant types | |
-- are just Unit and the functions returning in Unit. | |
isIrrelevant :: Domain => VTy -> Bool | |
isIrrelevant = \case | |
VUnit -> True | |
VPi x a b -> fresh x a \_ v -> isIrrelevant (b v) | |
_ -> False | |
convNe :: Domain => Ne -> Ne -> Bool | |
convNe n n' = case (n, n') of | |
(NVar x , NVar x' ) -> x == x' | |
(NApp t u, NApp t' u') -> convNe t t' && conv u u' | |
_ -> False | |
-- When comparing neutrals, we first try to recursively compare them, and if | |
-- that fails we check whether the type is irrelevant. | |
-- Alternatively, we could test relevance first. | |
-- Actually relying on unit eta is relatively rare, so it makes sense to only check | |
-- for it on demand. | |
-- The story could be different in a system with strict Prop, where heavy use of Prop | |
-- may warrant more eagerness in computing relevance. | |
conv :: Domain => Val -> Val -> Bool | |
conv t t' = case (t, t') of | |
(VU , VU ) -> True | |
(VUnit , VUnit ) -> True | |
(VPi x a b , VPi _ a' b' ) -> conv a a' && fresh x a \_ v -> conv (b v) (b' v) | |
(VLam x a t, VLam x' a' t') -> fresh x a \_ v -> conv (t v) (t' v) | |
(VNe n a , VNe n' _ ) -> convNe n n' || isIrrelevant a | |
(VTt , _ ) -> True | |
(_ , VTt ) -> True | |
(VLam x a t, t' ) -> fresh x a \_ v -> conv (t v) (vapp t' v) | |
(t , VLam x' a' t') -> fresh x' a' \_ v -> conv (vapp t v) (t' v) | |
_ -> False | |
data EtaOpt = Eta | NoEta deriving (Eq, Show) | |
type Eta = (?eta :: EtaOpt) | |
quoteNe :: Domain => Eta => Ne -> Tm | |
quoteNe = \case | |
NVar x -> Var x | |
NApp t u -> App (quoteNe t) (quote u) | |
etaExpand :: Domain => Eta => Ne -> VTy -> Tm | |
etaExpand n = \case | |
VUnit -> Tt | |
VPi y a b -> let qa = quote a in | |
fresh y a \y v -> Lam y qa (etaExpand (NApp n v) (b v)) | |
_ -> quoteNe n | |
quote :: Domain => Eta => Val -> Tm | |
quote = \case | |
VNe n a | ?eta == Eta -> etaExpand n a | |
| otherwise -> quoteNe n | |
VPi x a b -> let qa = quote a in fresh x a \x v -> Pi x qa (quote (b v)) | |
VLam x a t -> let qa = quote a in fresh x a \x v -> Lam x qa (quote (t v)) | |
VU -> U | |
VUnit -> Unit | |
VTt -> Tt | |
quoteNoEta :: Domain => Val -> Tm | |
quoteNoEta = let ?eta = NoEta in quote | |
type ElabM = Either String | |
check :: Elab => RTm -> VTy -> ElabM Tm | |
check t a = case (t, a) of | |
(RLam x t, VPi _ a b) -> do | |
let ~qa = quoteNoEta a | |
bindElab x a \x v -> | |
Lam x qa <$> check t (b v) | |
(RLet x a t u, b) -> do | |
a <- check a VU | |
let va = eval a | |
t <- check t va | |
defineElab x va (eval t) \x -> do | |
u <- check u b | |
pure (Let x a t u) | |
(t, a) -> do | |
(t, a') <- infer t | |
unless (conv a a') $ do | |
Left ("type mismatch, expected:\n\n " | |
++ show (quoteNoEta a) | |
++ "\n\ninferred:\n\n " ++ show (quoteNoEta a') | |
++ "\n\nwhile inferring type for" | |
++ "\n\n " ++ show t) | |
pure t | |
infer :: Elab => RTm -> ElabM (Tm, VTy) | |
infer = \case | |
RVar x -> | |
case lookup x ?cxt of | |
Just (x', a) -> pure (Var x', a) | |
Nothing -> Left $ "name not in scope: " ++ x | |
RU -> | |
pure (U, VU) | |
RPi x a b -> do | |
a <- check a VU | |
bindElab x (eval a) \x _-> do | |
b <- check b VU | |
pure (Pi x a b, VU) | |
t@RLam{} -> | |
Left "can't infer type for lambda" | |
RApp t u -> do | |
(t, a) <- infer t | |
case a of | |
VPi x a b -> do | |
u <- check u a | |
pure (App t u, b (eval u)) | |
a -> | |
Left $ show $ quoteNoEta a | |
RUnit -> | |
pure (Unit, VU) | |
RTt -> | |
pure (Tt, VUnit) | |
RLet x a t u -> do | |
a <- check a VU | |
let va = eval a | |
t <- check t va | |
defineElab x va (eval t) \x -> do | |
(u, b) <- infer u | |
pure (Let x a t u, b) | |
nf :: EtaOpt -> RTm -> IO () | |
nf eta t = do | |
let ?env = [] | |
let ?domain = [] | |
let ?eta = eta | |
let ?cxt = [] | |
case infer t of | |
Left e -> do | |
putStrLn "ERROR" | |
putStrLn e | |
Right (t, _) -> do | |
t <- pure $ quote $ eval t | |
print t | |
-- Testing | |
-------------------------------------------------------------------------------- | |
instance IsString RTm where | |
fromString = RVar | |
(∙) = RApp; infixl 8 ∙ | |
(==>) = RPi "_"; infixr 4 ==> | |
λ = RLam | |
u = RU | |
let_ = RLet | |
pi = RPi | |
unit = RUnit | |
tt = RTt | |
t1 :: RTm | |
t1 = | |
-- Type annotations | |
let_ "the" (pi "A" u $ "A" ==> "A") (λ "A" $ λ "x" "x") $ | |
-- -- Church numbers | |
let_ "nat" u (pi "N" u $ ("N" ==> "N") ==> "N" ==> "N") $ | |
-- let_ "zero" "nat" (λ "N" $ λ "s" $ λ "z" "z") $ | |
let_ "suc" ("nat" ==> "nat") | |
(λ "n" $ λ "N" $ λ "s" $ λ "z" $ "s" ∙ ("n" ∙ "N" ∙ "s" ∙ "z")) $ | |
let_ "add" ("nat" ==> "nat" ==> "nat") | |
(λ "n" $ λ "m" $ λ "N" $ λ "s" $ λ "z" | |
$ "n" ∙ "N" ∙ "s" ∙ ("m" ∙ "N" ∙ "s" ∙ "z")) $ | |
let_ "mul" ("nat" ==> "nat" ==> "nat") | |
(λ "n" $ λ "m" $ λ "N" $ λ "s" $ λ "z" $ | |
"n" ∙ "N" ∙ ("m" ∙ "N" ∙ "s") ∙ "z") $ | |
let_ "n5" "nat" (λ "N" $ λ "s" $ λ "z" $ | |
"s" ∙ ("s" ∙ ("s" ∙ ("s" ∙ ("s" ∙ "z"))))) $ | |
let_ "n2" "nat" (λ "N" $ λ "s" $ λ "z" $ | |
("s" ∙ ("s" ∙ "z"))) $ | |
let_ "n10" "nat" ("mul" ∙ "n5" ∙ "n2") $ | |
let_ "n100" "nat" ("mul" ∙ "n10" ∙ "n10") $ | |
-- Leibniz equality | |
let_ "Id" (pi "A" u $ "A" ==> "A" ==> u) | |
(λ "A" $ λ "x" $ λ "y" $ pi "P" ("A" ==> u) $ "P" ∙ "x" ==> "P" ∙ "y") $ | |
let_ "refl" (pi "A" u $ pi "x" "A" $ "Id" ∙ "A" ∙ "x" ∙ "x") | |
(λ "A" $ λ "x" $ λ "P" $ λ "px" "px") $ | |
-- Testing eta conversion | |
-- (f : U → U) → Id f (λ x. f x) | |
let_ "eta1" (pi "f" (u ==> u) $ "Id" ∙ (u ==> u) ∙ "f" ∙ (λ "x" $ "f" ∙ "x")) | |
(λ "f" $ "refl" ∙ (u ==> u) ∙ "f") $ | |
let_ "ty" u (u ==> (u ==> u) ==> unit) $ | |
-- (f g : U → U → ⊤) → Id f g | |
let_ "eta2" (pi "f" "ty" $ pi "g" "ty" $ "Id" ∙ "ty" ∙ "f" ∙ "g") | |
(λ "f" $ λ "g" $ "refl" ∙ "ty" ∙ "f") $ | |
-- eta-long nf, to be printed with "nf Eta t" | |
let_ "ty" u (pi "A" u $ "A" ==> (u ==> (unit ==> unit) ==> u) ==> u) $ | |
let_ "etanf" ("ty" ==> "ty") (λ "f" "f") $ | |
"etanf" | |
t2 :: RTm | |
t2 = | |
let_ "the" (pi "A" u $ "A" ==> "A") (λ "A" $ λ "x" "x") $ | |
"the" ∙ (pi "A" u $ pi "A" u $ pi "a" "A" $ "A") ∙ (λ "A" $ λ "A" $ λ "a" "a") | |
main :: IO () | |
main = nf Eta t1 |
Good point. I usually use de Bruijn where it's harder to trip up on fresh naming. I'll fix this.
Updated, the same way as you suggested.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
There's an interesting bug which I ran into in my own language. Type checking then normalizing the following term causes the program to crash due to an unbound variable introduced by
check
.The problem stems from lambdas which don't have a type annotation on their argument being given one by quoting a value. Specifically here the problem is on line 245-246 where
a
is quoted then added as an annotation to theLam
.I think the best fix is to rename the argument of the lambda to the generated one, then keep track of the new name in a new field of the
Env
and replace all references to the argument with the new name while type checking. So for example(λ x. λx. x) : Unit -> Unit -> Unit
would becomeλ x : Unit. λ x' : Unit. x'
.