In Part 1 we developed a basic set of rules for
a simple quantitative dependently typed theory. The rules were, however, non-directional (or nondeterministic): there
could be multiple derivations leading to the same conclusion. For example, consider the
sequent 0 A : *, 0 B : *, ω f : A → B, ω a : A ⊢ ω (f a) : B.
We could derive this sequent in at least two ways:
---------------------------------------(A-id) -----------------------------------(A-id)
..., ω f : A → B, 0 a : A ⊢ ω f : A → B ..., 0 f : A → B, ω a : A ⊢ ω a : A
-----------------------------------------------------------------------------------(→E)
0 A : *, 0 B : *, ω f : A → B, ω a : A ⊢ ω (f a) : B
or even,
---------------------------------------(A-id) -----------------------------------(A-id)
..., ω f : A → B, ω a : A ⊢ ω f : A → B ..., ω f : A → B, ω a : A ⊢ ω a : A
-----------------------------------------------------------------------------------(→E)
0 A : *, 0 B : *, ω f : A → B, ω a : A ⊢ ω (f a) : B
This is because the addition of contexts is not a reversible operation. But while this issue only arises in QTT, the →E
rule has another problem, which is not exclusive to QTT: it is not possible, in general, to derive the types of the premises from the
conclusion. More specifically, it is not possible to derive (π x : A) → B from B[a/x] in the rule:
Δ ⊢ ρ f : (π x : A) → B Δ' ⊢ ρ*π a : A
------------------------------------------(→E)
Δ + Δ' ⊢ ρ (f a) : B[a/x]
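To see concretely why B[a/x] does not determine the Pi type, note that substitution is not injective: two distinct codomains can have the same substitution instance. A minimal sketch (the tagged-tuple term representation here is my own illustration, not the article's):

```python
# Toy term representation (my assumption): tagged tuples for a tiny term language.
def subst(term, var, value):
    """Capture-naive substitution term[value/var]."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == var else term
    if kind == "app":
        return ("app", subst(term[1], var, value), subst(term[2], var, value))
    return term  # constants are left untouched

# Two distinct codomains B ...
B1 = ("var", "x")       # codomain mentions x
B2 = ("const", "C")     # codomain is the constant C
a  = ("const", "C")     # the argument being substituted

# ... whose substitution instances B[a/x] coincide:
print(subst(B1, "x", a) == subst(B2, "x", a))  # True
```

So a checker working bottom-up from the conclusion's type cannot recover which Pi type the function had.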
All of this means, in particular, that it is hard to devise a deterministic typechecker (which should follow the rules bottom-up). We must therefore develop another set of rules that is equivalent to these rules, but deterministic. A common technique for solving this issue is to define a bidirectional set of rules.
A bidirectional system (non-QTT) has two judgments: Γ ⊢ a ↑ A, meaning "check that a has type A, given a context Γ",
and Γ ⊢ a ↓ A, meaning "infer that a has type A, given a context Γ".1 In a practical implementation, these two
judgments will respectively correspond to procedures check(ctx, term, type), which may or may not succeed (what it returns is irrelevant),
and infer(ctx, term), which may fail or return an inferred type. These judgments can be converted into one another with the following rules:
Γ ⊢ a ↓ A Γ ⊢ a ↑ A
----------(↓↑) ----------------(↑↓)
Γ ⊢ a ↑ A Γ ⊢ (a :: A) ↓ A
where :: is a new construction, called the type annotation. Intuitively, the rules read as follows: if we can infer that a has type A, then we can check that a has type A; and, conversely, if we can check that a has type A, then we can infer that a :: A (a annotated by A) has type A. We will postpone the other bidirectional rules until we deal with the QTT case. It suffices to say, for now, that this bidirectional system is deterministic, and has the property that if Γ ⊢ a ↑ A or Γ ⊢ a ↓ A, then Γ ⊢ a : A; and that if Γ ⊢ a : A, then we can obtain a new term a' by only annotating parts of a, such that Γ ⊢ a' ↑ A and Γ ⊢ a' ↓ A.
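The mutual recursion between the two judgments can be sketched for a simply typed fragment (variables, application, lambda, annotation). This is a minimal sketch assuming a tagged-tuple representation of terms and types of my own invention; only the names check/infer mirror the procedures described above:

```python
def infer(ctx, term):
    """Return the inferred type of `term`, or raise TypeError."""
    kind = term[0]
    if kind == "var":                      # variables: look the type up
        return ctx[term[1]]
    if kind == "ann":                      # rule (↑↓): infer (a :: A) as A
        _, a, ty = term
        check(ctx, a, ty)
        return ty
    if kind == "app":                      # →E: infer f, then check the argument
        _, f, a = term
        fty = infer(ctx, f)
        if fty[0] != "fun":
            raise TypeError("applying a non-function")
        _, dom, cod = fty
        check(ctx, a, dom)
        return cod
    raise TypeError("cannot infer; add an annotation")

def check(ctx, term, ty):
    """Check `term` against `ty`, or raise TypeError."""
    if term[0] == "lam":                   # →I: check the body under x : dom
        _, x, body = term
        if ty[0] != "fun":
            raise TypeError("lambda must have a function type")
        _, dom, cod = ty
        check({**ctx, x: dom}, body, cod)
        return
    if infer(ctx, term) != ty:             # rule (↓↑): fall back to inference
        raise TypeError("type mismatch")
```

For example, `infer({}, ("ann", ("lam", "x", ("var", "x")), ("fun", ("base", "A"), ("base", "A"))))` succeeds and returns the annotated function type, while the unannotated lambda alone cannot be inferred.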
When extending to QTT, we might be tempted to define the bidirectional judgments analogously as Δ ⊢ ρ a ↑ A and Δ ⊢ ρ a ↓ A.
However, this will not solve the problem of nondeterminism arising from the addition of contexts in →E. In order to solve this,
the judgments will also return an additional inferred context, and will take as inputs only erased contexts (or precontexts).
The inferred context, as we will see, is the smallest context that suffices for checking/inference.
The judgments will therefore have the form Γ ↓ Δ ⊢ ρ a ↑ A and Γ ↓ Δ ⊢ ρ a ↓ A,2 respectively check and infer, corresponding to procedures
check(prectx, use, trm, type), which will either return a context or fail, and infer(prectx, use, trm), which will either return
a context and a type or fail.
The conversion rules are completely analogous:
Γ ↓ Δ ⊢ ρ a ↓ A Γ ↓ Δ ⊢ ρ a ↑ A
----------------(↓↑) ----------------------(↑↓)
Γ ↓ Δ ⊢ ρ a ↑ A Γ ↓ Δ ⊢ ρ (a :: A) ↓ A
The identity rule is defined as
-----------------------------------------(id)
Γ, 0 x : A, Γ' ↓ Γ, ρ x : A, Γ' ⊢ ρ x ↓ A
the function elimination rule as:
Γ ↓ Δ ⊢ ρ f ↓ (π x : A) → B Γ ↓ Δ' ⊢ ρ*π a ↑ A
--------------------------------------------------(→E)
Γ ↓ Δ + Δ' ⊢ ρ (f a) ↓ B[a/x]
and function introduction rule as:
Γ, 0 x : A ↓ Δ, π' x : A ⊢ 1 b ↑ B π' ≤ π
---------------------------------------------(→I)
Γ ↓ ρ*Δ ⊢ ρ λx. b ↑ (π x : A) → B
A few interesting things to note.
- The bidirectional identity rule is quite similar to the linear identity rule in the original system; in fact, the affinity of the system will come from the function introduction rule. Also, the identity rule will give us the minimal context needed for the term to be well-typed. This property is preserved by the other rules, making the inferred context minimal.
- The function elimination rule is now deterministic: since (π x : A) → B is given as output (and not as input) of the premise, only a single type (namely B[a/x]) can be given as output of the conclusion (the arrows were reversed). The same thing can be said about the contexts: since Δ and Δ' are outputs of the premises, only a unique context (namely Δ + Δ') can be given as output.
- The function introduction rule does not need division anymore; the reversal of the context arrows also reversed the operation, so we see a multiplication in the bottom sequent instead of a division in the upper sequent. Affinity/linearity of the system is now given by a simple choice of either π' ≤ π or π' = π, respectively.
In fact, we are now in a position to extend the system to allow linear, affine, relevant, or many other kinds of variables, all at the same time, without changing the rules at all! The same rules will work for any ordered rig (refer to the original article for more information about rigs). For example, we can have linear (denoted by 1) and affine (denoted by &) variables by defining the (partial) ordering:
- 0 ≤ 0, 0 ≤ &, 0 ≤ ω
- 1 ≤ 1, 1 ≤ &, 1 ≤ ω (notice how 1 ≰ 0 and 0 ≰ 1, which gives it its linearity)
- & ≤ &, & ≤ ω
- ω ≤ ω
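The ordering above can be written down directly as a finite relation; note that 1 and 0 are incomparable, which is what enforces linearity. A sketch (the string encoding of the four elements is mine):

```python
# The partial order on {0, 1, &, ω} from the list above, as a set of pairs.
LE = {(p, p) for p in ("0", "1", "&", "w")} | {
    ("0", "&"), ("0", "w"), ("1", "&"), ("1", "w"), ("&", "w"),
}

def le(p, q):
    """p ≤ q in the ordered rig."""
    return (p, q) in LE

print(le("1", "&"), le("1", "0"), le("0", "1"))  # True False False
```

With this table, the side condition π' ≤ π of the →I rule is a single lookup, regardless of which mix of linear and affine variables the program uses.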
1 The up/down arrow notation has a reason: when we check that a has type A, we only need to follow the rules upwards;
but when we try to infer a type for a term a, we must follow the derivation upwards, leaving the inferred types
empty until we reach the leaves of the derivation, where we get an immediate type output, which we then follow downwards,
filling in the empty slots of inferred types. That is, the inferred type is built from top to bottom.
2 Again, the down arrow signifies inference. While the input precontexts Γ are built from the bottom up, the output
contexts Δ are built from top to bottom.
To prove that this bidirectional system is not just a random set of rules, we need to find a relationship between it and the non-directional system. This is given by the following theorem, the proof of which is out of the scope of this article:
- if Γ ↓ Δ ⊢ ρ a ↑ A, then Δ ⊢ ρ a : A
- if Δ ⊢ ρ a : A, then there exists a term a' such that erasAnn(a') = a, Γ ↓ Δ' ⊢ ρ a' ↑ A and Δ' ≤ Δ, where erasAnn is the function that erases all type annotations of a term.
The theorem basically says that the bidirectional system is not stronger than the non-directional system, and that, by annotating terms, it is not weaker either. This, coupled with the deterministic nature of the bidirectional system, makes it suitable as the basis for a practical implementation of a sound type system.
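For concreteness, erasAnn can be sketched over a toy term representation (tagged tuples; the representation is my assumption, not the article's):

```python
def eras_ann(term):
    """Erase all type annotations: (a :: A) becomes eras_ann(a)."""
    kind = term[0]
    if kind == "ann":                  # drop the annotation, recurse on a
        return eras_ann(term[1])
    if kind == "app":                  # recurse into both sides of an application
        return ("app", eras_ann(term[1]), eras_ann(term[2]))
    if kind == "lam":                  # recurse into the body of a lambda
        return ("lam", term[1], eras_ann(term[2]))
    return term                        # variables and constants are unchanged

print(eras_ann(("ann", ("var", "x"), ("base", "A"))))  # ('var', 'x')
```

The second part of the theorem then says annotation is all that separates the two systems: any non-directionally typeable term is the erasure of some bidirectionally typeable one.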