Last active
March 1, 2025 23:03
-
-
Save VictorTaelin/7a343f542f9c9e46dcab212a246b72dd to your computer and use it in GitHub Desktop.
Hard AI Debugging Prompt / SupTT Codebase
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is a debugging challenge. | |
Read the document below, and identify the bug. | |
--- | |
# The Interaction Calculus | |
The Interaction Calculus (IC) is term rewriting system inspired by the Lambda | |
Calculus (λC), but with some major differences: | |
1. Vars are affine: they can only occur up to one time. | |
2. Vars are global: they can occur anywhere in the program. | |
3. There is a new core primitive: the superposition. | |
An IC term is defined by the following grammar: | |
``` | |
Term ::= | |
| VAR: Name | |
| LAM: "λ" Name "." Term | |
| APP: "(" Term " " Term ")" | |
| SUP: "{" Term "," Term "}" | |
| COL: "!" "{" Name "," Name "}" "=" Term ";" Term | |
``` | |
Where: | |
- VAR represents a named variable. | |
- LAM represents a lambda. | |
- APP represents a application. | |
- SUP represents a superposition. | |
- COL represents a collapse. | |
Lambdas are curried, and work like their λC counterpart, except with a relaxed | |
scope, and with affine usage. Applications eliminate lambdas, like in λC, | |
through the beta-reduce (APP-LAM) interaction. | |
Superpositions work like pairs. Collapsers eliminate superpositions through | |
the collapse (COL-SUP) interaction, which works exactly like a pair projection. | |
What makes SUPs and COLs unique is how they interact with LAMs and APPs. When a | |
SUP is applied to an argument, it reduces through the overlap interaction | |
(APP-SUP), and when a LAM is projected, it reduces through the entangle | |
interaction (COL-LAM). This gives a computational behavior for every possible | |
interaction: there are no runtime errors on IC. | |
The interaction rules are defined below: | |
Beta-Reduce: | |
``` | |
(λx.f a) | |
-------- APP-LAM | |
x <- a | |
f | |
``` | |
Superpose: | |
``` | |
({a,b} c) | |
--------------- APP-SUP | |
! {x0,x1} = c; | |
{(a x0),(b x1)} | |
``` | |
Entangle: | |
``` | |
! {r,s} = λx.f; | |
K | |
--------------- COL-LAM | |
r <- λx0.f0 | |
s <- λx1.f1 | |
x <- {x0,x1} | |
! {f0,f1} = f; | |
K | |
``` | |
Collapse: | |
``` | |
! {x,y} = {a,b}; | |
K | |
--------------- COL-SUP | |
x <- a | |
y <- b | |
K | |
``` | |
Where `x <- t` stands for a global substitution of `x` by `t`. | |
Since variables are affine, substitutions can be implemented efficiently by just | |
inserting an entry in a global substitution map (`sub[var] = value`). There is | |
no need to traverse the target term, or to handle name capture, as long as fresh | |
variable names are globally unique. It can also be implemented in a concurrent | |
setup with a single atomic-swap. | |
Below is a pseudocode implementation of these interaction rules: | |
``` | |
def app_lam(app, lam): | |
sub[lam.nam] = app.arg | |
return lam.bod | |
def app_sup(app, sup): | |
x0 = fresh() | |
x1 = fresh() | |
a0 = App(sup.lft, Var(x0)) | |
a1 = App(sup.rgt, Var(x1)) | |
return Col(x0, x1, app.arg, Sup(a0, a1)) | |
def dup_lam(dup, lam): | |
x0 = fresh() | |
x1 = fresh() | |
f0 = fresh() | |
f1 = fresh() | |
sub[dup.lft] = Lam(x0, Var(f0)) | |
sub[dup.rgt] = Lam(x1, Var(f1)) | |
sub[lam.nam] = Sup(Var(x0), Var(x1)) | |
return Col(f0, f1, lam.bod, dup.bod) | |
def dup_sup(dup, sup): | |
sub[dup.lft] = sup.lft | |
sub[dup.rgt] = sup.rgt | |
return dup.bod | |
``` | |
Terms can be reduced to weak head normal form, which means reducing until the | |
outermost constructor is a value (LAM, SUP, etc.), or until no more reductions | |
are possible. Example: | |
``` | |
def whnf(term): | |
while True: | |
match term: | |
case Var(nam): | |
if nam in sub: | |
term = sub[nam] | |
else: | |
return term | |
case App(fun, arg): | |
fun = whnf(fun) | |
match fun: | |
case Lam(_, _): | |
term = app_lam(term, fun) | |
case Sup(_, _): | |
term = app_sup(term, fun) | |
case _: | |
return App(fun, arg) | |
case Col(lft, rgt, val, bod): | |
val = whnf(val) | |
match val: | |
case Lam(_, _): | |
term = dup_lam(term, val) | |
case Sup(_, _): | |
term = dup_sup(term, val) | |
case _: | |
return Col(lft, rgt, val, bod) | |
case _: | |
return term | |
``` | |
Terms can be reduced to full normal form by recursively taking the whnf: | |
``` | |
def normal(term): | |
term = whnf(term) | |
match term: | |
case Lam(nam, bod): | |
bod_nf = normal(bod) | |
return Lam(nam, bod_nf) | |
case App(fun, arg): | |
fun_nf = normal(fun) | |
arg_nf = normal(arg) | |
return App(fun_nf, arg_nf) | |
case Sup(lft, rgt): | |
lft_nf = normal(lft) | |
rgt_nf = normal(rgt) | |
return Sup(lft_nf, rgt_nf) | |
case Col(lft, rgt, val, bod): | |
val_nf = normal(val) | |
bod_nf = normal(bod) | |
return Col(lft, rgt, val_nf, bod_nf) | |
case _: | |
return term | |
``` | |
Below are some normalization examples. | |
Example 0: (simple λ-term) | |
``` | |
(λx.λt.(t x) λy.y) | |
------------------ APP-LAM | |
λt.(t λy.y) | |
``` | |
Example 1: (larger λ-term) | |
``` | |
(λb.λt.λf.((b f) t) λT.λF.T) | |
---------------------------- APP-LAM | |
λt.λf.((λT.λF.T f) t) | |
----------------------- APP-LAM | |
λt.λf.(λF.t f) | |
-------------- APP-LAM | |
λt.λf.t | |
``` | |
Example 2: (global scopes) | |
``` | |
{x,(λx.λy.y λk.k)} | |
------------------ APP-LAM | |
{λk.k,λy.y} | |
``` | |
Example 3: (superposition) | |
``` | |
!{a,b} = {λx.x,λy.y}; (a b) | |
--------------------------- COL-SUP | |
(λx.x λy.y) | |
----------- APP-LAM | |
λy.y | |
``` | |
Example 4: (overlap) | |
``` | |
({λx.x,λy.y} λz.z) | |
------------------ APP-SUP | |
! {x0,x1} = λz.z; {(λx.x x0),(λy.y x1)} | |
--------------------------------------- COL-LAM | |
! {f0,f1} = {r,s}; {(λx.x λr.f0),(λy.y λs.f1)} | |
---------------------------------------------- COL-SUP | |
{(λx.x λr.r),(λy.y λs.s)} | |
------------------------- APP-LAM | |
{λr.r,(λy.y λs.s)} | |
------------------ APP-LAM | |
{λr.r,λs.s} | |
``` | |
Example 5: (default test term) | |
The following term can be used to test all interactions: | |
``` | |
((λf.λx.!{f0,f1}=f;(f0 (f1 x)) λB.λT.λF.((B F) T)) λa.λb.a) | |
----------------------------------------------------------- 16 interactions | |
λa.λb.a | |
``` | |
# The Superposed Type Theory (SupTT) | |
The Superposed Type Theory (SupTT) is an extension of the Interaction Calculus | |
with a type system, labelled superpositions, and new primitives. Its grammar is: | |
``` | |
Term ::= | |
-- Variables | |
| VAR: Name | |
| LET: "!" Name "=" Term ";" Term | |
-- Superpositions | |
| SUP: "&" Uint "{" Term "," Term "}" | |
| COL: "!" "&" Uint "{" Name "," Name "}" "=" Term ";" Term | |
-- Universe Type | |
| SET: "*" | |
-- Empty Type | |
| EMP: "⊥" | |
| EFQ: "¬" Term | |
-- Unit Type | |
| UNI: "⊤" | |
| NIL: "()" | |
| USE: "-" Term ";" Term | |
-- Bool Type | |
| BIT: "𝔹" | |
| BT0: "0" | |
| BT1: "1" | |
| ITE: "?" Term "{" Term "}" ";" "{" Term "}" | |
-- Sigma Type | |
| SIG: "Σ" Name ":" Term "." Term | |
| TUP: "[" Term "," Term "]" | |
| GET: "!" "[" Name "," Name "]" "=" Term ";" Term | |
-- Pi Type | |
| ALL: "Π" Name ":" Term "." Term | |
| LAM: "λ" Name "." Term | |
| APP: "(" Term " " Term ")" | |
-- Identity Type | |
| EQL: "<" Term "=" Term ">" | |
| RFL: "θ" | |
| RWT: "%" Term ";" Term | |
``` | |
The Universe Type is the type of types. | |
The Empty Type is a set without elements. It is eliminated as `¬x`, which | |
corresponds to an empty pattern-match (agda: `case x of λ{}`). | |
The Unit Type is a set with one element, `()`. It is eliminated as `-x; body`, | |
which corresponds to an unitary pattern-match (agda: `case x of λ{(): body}`). | |
The Bool Type is a set with two elements, `0` and `1`. It is eliminated as `?x { | |
t } ; { f }`, which corresponds to a boolean pattern-match (agda: `case x of λ{ | |
true → t ; false → f }`). | |
The Sigma Type is a dependent pair, constructed as `[a,b]`. It is eliminated by | |
the projection `! [a,b] = p; body`, which extracts `a` and `b` from `p`. | |
The Pi Type is a dependent function, constructed as `λx. body`. It is eliminated | |
by an application, `(f x)`, as discussed before. | |
The Identity Type represents a propositional equality, constructed by `θ`, which | |
stands for reflexivity. It is eliminated as `% e; body`, which corresponds to an | |
identity type pattern-match (agda: `case e of λ{ refl: body }`). | |
The computation rules (interactions) of SupTT are written below. | |
Constructors interact with their eliminators, as expected: | |
``` | |
- () t | |
------ USE-NIL | |
t | |
? 0 { t } ; { f } | |
----------------- ITE-BT0 | |
f | |
? 1 { t } ; { f } | |
----------------- ITE-BT1 | |
t | |
! [x,y] = [a,b]; t | |
------------------ GET-TUP | |
x <- a | |
y <- b | |
t | |
% θ; t | |
------ RWT-RFL | |
t | |
(λx.f a) | |
-------- APP-LAM | |
x <- a | |
f | |
``` | |
Constructors also interact with collapsers: | |
``` | |
! &L{x0,x1} = (); K | |
------------------- COL-NIL | |
x0 <- () | |
x1 <- () | |
K | |
! &L{x0,x1} = 0; K | |
------------------ COL-BT0 | |
x0 <- 0 | |
x1 <- 0 | |
K | |
! &L{x0,x1} = 1; K | |
------------------ COL-BT1 | |
x0 <- 1 | |
x1 <- 1 | |
K | |
! &L{x0,x1} = [a,b]; K | |
---------------------- COL-TUP | |
x0 <- [a0,b0] | |
x1 <- [a1,b1] | |
! &L{a0,a1} = a | |
! &L{b0,b1} = b | |
K | |
! &L{r,s} = λx.f; | |
K | |
----------------- COL-LAM | |
r <- λx0.f0 | |
s <- λx1.f1 | |
x <- &L{x0,x1} | |
! &L{f0,f1} = f; | |
K | |
``` | |
Eliminators interact with superpositions: | |
``` | |
- &L{a,b}; k | |
-------------------- USE-SUP | |
! &L{k0,k1} = k; | |
&L{-a;k0, -b;k1} | |
? &L{a,b} {t} ; {f} | |
---------------------------- ITE-SUP | |
! &L{t0,t1} = t; | |
! &L{f0,f1} = f; | |
&L{?a{t0};{f0}, ?b{t1};{f1}} | |
! [x,y] = &L{a,b}; k | |
---------------------------- GET-SUP | |
! &L{k0,k1} = k; | |
&L{![x,y]=a;k0, ![x,y]=b;k1} | |
% &L{a,b}; k | |
---------------- RWT-SUP | |
! &L{k0,k1} = k; | |
&L{%a;k0, %b;k1} | |
(&L{a,b} c) | |
----------------- APP-SUP | |
! &L{c0,c1} = c; | |
&L{(a c0),(b c1)} | |
``` | |
Superpositions and collapsers interact: | |
``` | |
! &L{x,y} = &L{a,b}; | |
K | |
-------------------- COL-SUP (if equal labels) | |
x <- a | |
y <- b | |
K | |
! &L{x,y} = &R{a,b}; | |
K | |
-------------------- COL-SUP (if different labels) | |
x <- &R{a0,b0} | |
y <- &R{a1,b1} | |
! &L{a0,a1} = a | |
! &L{b0,b1} = b | |
K | |
``` | |
Finally, LET interacts with anything: | |
``` | |
! x = t; body | |
------------- LET | |
x <- t | |
body | |
``` | |
# SupTT's Runtime (32-Bit) | |
SupTT-32 is implemented in portable C. | |
It represents terms with u32-pointers, which store 3 fields: | |
- sub (1-bit): true if this is a substitution | |
- tag (5-bit): the term tag | |
- lab (2-bit): the label (optionally) | |
- val (24-bit): the value of this pointer | |
The tag field can be: | |
- `VAR` | |
- `LET` | |
- `SUP` | |
- `CO0` | |
- `CO1` | |
- `SET` | |
- `EMP` | |
- `EFQ` | |
- `UNI` | |
- `NIL` | |
- `USE` | |
- `BIT` | |
- `BT0` | |
- `BT1` | |
- `ITE` | |
- `SIG` | |
- `TUP` | |
- `ALL` | |
- `LAM` | |
- `APP` | |
- `EQL` | |
- `RFL` | |
- `RWT` | |
The lab field stores a label on SUP, CO0 and CO1 terms. | |
The val field depends on the label: | |
- `VAR`: points to a Subst location | |
- `CO0`: points to a Subst location | |
- `CO1`: points to a Subst location | |
- `LET`: points to a Let Node ({val: Term, bod: Term}) | |
- `SUP`: points to a Sup Node ({lft: Term, rgt: Term}) | |
- `SET`: unused | |
- `EMP`: unused | |
- `EFQ`: points to an Efq Node ({val: Term}) | |
- `UNI`: unused | |
- `NIL`: unused | |
- `USE`: points to a Use Node ({val: Term, bod: Term}) | |
- `BIT`: unused | |
- `BT0`: unused | |
- `BT1`: unused | |
- `ITE`: points to a Ite Node ({cnd: Term, thn: Term, els: Term}) | |
- `SIG`: points to a Sig Node ({fst: Term, snd: Term}) | |
- `TUP`: points to a Tup Node ({fst: Term, snd: Term}) | |
- `GET`: points to a Get Node ({val: Term, bod: Term}) | |
- `ALL`: points to an All Node ({inp: Term, out: Term}) | |
- `LAM`: points to a Lam Node ({bod: Term}) | |
- `APP`: points to an App Node ({fun: Term, arg: Term}) | |
- `EQL`: points to an Eql Node ({lft: Term, rgt: Term}) | |
- `RFL`: unused | |
- `RWT`: points to a Rwt Node ({val: Term, bod: Term}) | |
Non-nullary terms point to a Node, i.e., a consecutive block of its child terms. | |
For example, the SUP term points to the memory location where its two child | |
terms are stored. | |
Variable terms (VAR, CO0 and CO1) point to a location in memory where the | |
substitution will be inserted. As an optimization, rather than keeping a | |
separate subst map in memory, we just re-use the location of the corresponding | |
binder. For example, the VAR term of a lambda points either to its corresponding | |
Lam Node (before the beta reduction), or to the substituted term (after the beta | |
reduction). To distinguish, we set the 'sub' bit to signal that a memory | |
location is a substitution entry. | |
Note that there is no COL term. That's because collapser nodes are special: | |
1. they aren't part of the AST | |
2. they don't store a body | |
3. their bound vars are represented by CO0 and CO1 instead of VAR | |
Because of that, the only way to access them is via the CO0 and CO1 terms, which | |
point to the collapser node. When a collapse interaction takes place, the bound | |
var that triggered the interaction immediatelly gets its half of the collapse, | |
and the collapser node is replaced by the other half, with the sub bit set, | |
allowing the other bound var to get it. For example, the COL-SUP interaction | |
could be implemented as: | |
``` | |
def col_sup(col, sup): | |
if col.lab == sup.lab: | |
tm0 = heap[sup.loc + 0] | |
tm1 = heap[sup.loc + 1] | |
heap[col.loc] = as_sub(tm1 if col.tag == CO0 else tm0) | |
return (tm0 if col.tag == CO0 else tm1) | |
else: | |
co0_loc = alloc(1) | |
co1_loc = alloc(1) | |
su0_loc = alloc(2) | |
su1_loc = alloc(2) | |
su0_val = Term(SUP, sup.lab, su0_loc) | |
su1_val = Term(SUP, sup.lab, su1_loc) | |
heap[co0_loc] = heap[sup.loc + 0] | |
heap[co1_loc] = heap[sup.loc + 1] | |
heap[su0_loc + 0] = Term(DP0, col.lab, co0_loc) | |
heap[su0_loc + 1] = Term(DP0, col.lab, co1_loc) | |
heap[su1_loc + 0] = Term(DP1, col.lab, co0_loc) | |
heap[su1_loc + 1] = Term(DP1, col.lab, co1_loc) | |
heap[col.loc] = as_sub(su1_val if col.tag == CO0 else tm1) | |
return (su0_val if col.tag == CO0 else su1_val) | |
``` | |
Note how the var (CO0 or CO1) that triggers col_sup is given one of the half of | |
the collapse, while the other half is stored on the collapser node memory | |
location, now reinterpreted as a subst entry, allowing the other var to get it. | |
# Parsing SupTT | |
On SupTT, all bound variables have global range. For example, consider the term: | |
λt.((t x) λx.λy.y) | |
Here, the `x` variable appears before its binder, `λx`. Since runtime variables | |
must point to their bound λ's, linking them correctly requires caution. A way to | |
do it is to store two structures at parse-time: | |
1. lcs: an array from names to locations | |
2. vrs: a map from names to var terms | |
Whenever we parse a name, we add the current location to the 'uses' array, and | |
whenever we parse a binder (lams, lets, etc.), we add a variable term pointing | |
to it to the 'vars' map. Then, once the parsing is done, we run iterate through | |
the 'uses' array, and write, to each location, the corresponding term. Below | |
are some example parsers using this strategy: | |
``` | |
def parse_var(loc): | |
nam = parse_name() | |
uses.push((nam,loc)) | |
def parse_lam(loc): | |
lam = alloc(1) | |
consume("λ") | |
nam = parse_name() | |
consume(".") | |
vars[nam] = Term(VAR, 0, lam) | |
parse_term(lam) | |
heap[loc] = Term(LAM, 0, lam) | |
def parse_app(loc): | |
app = alloc(2) | |
consume("(") | |
parse_term(app + 0) | |
consume(" ") | |
parse_term(app + 1) | |
consume(")") | |
heap[loc] = Term(APP, 0, app) | |
def parse_sup(loc): | |
sup = alloc(2) | |
consume("&") | |
lab = parse_uint() | |
consume("{") | |
lft = parse_term(sup + 0) | |
consume(",") | |
rgt = parse_term(sup + 1) | |
consume("}") | |
heap[loc] = Term(SUP, lab, sup) | |
def parse_col(loc): | |
col = alloc(1) | |
consume("!") | |
consume("&") | |
lab = parse_uint() | |
consume("{") | |
co0 = parse_name() | |
consume(",") | |
co1 = parse_name() | |
consume("}") | |
consume("=") | |
val = parse_term(col) | |
bod = parse_term(loc) | |
vars[co0] = Term(CO0, lab, loc) | |
vars[co1] = Term(CO1, lab, loc) | |
``` | |
# Stringifying SupTT | |
Converting SupTT terms to strings faces two challenges: | |
First, SupTT terms and nodes don't store variable names. As such, | |
Second, on SupTT, Col Nodes aren't part of the main program's AST. Instead, they | |
"float" on the heap, and are only reachable via CO0 and CO1 variables. Because | |
of that, by stringifying a term naively, collapser nodes will be missing. | |
To solve these, we proceed as follows: | |
1. Before stringifying, we pass through the full term, and assign a id to each | |
variable binder we find (on lam, let, col nodes, etc.) | |
2. We also register every col node we found, avoiding duplicates (remember the | |
same col node is pointed to by up to 2 variables, CO0 and CO1) | |
Then, to stringify the term, we first stringify each COL node, and then we | |
stringify the actual term. As such, the result will always be in the form: | |
! &{x0 x1} = t0 | |
! &{x2 x3} = t1 | |
! &{x4 x5} = t2 | |
... | |
term | |
With no COL nodes inside the ASTs of t0, t1, t2 ... and term. | |
# SupTT Project Guide | |
## What is SupTT? | |
SupTT (Superposed Type Theory) is an implementation of the Interaction Calculus | |
(IC) with a type system, labeled superpositions, and additional primitives. It | |
provides a term rewriting system inspired by Lambda Calculus but with affine | |
variables, global scope, and superpositions as a core primitive. | |
## Important Files | |
- `suptt.md` - PRIMARY SPECIFICATION for the entire project | |
- `src/main.c` - Program entry point and test runner | |
- `src/types.h` - Core data structures and term representation | |
- `src/memory.c/h` - Heap management and allocation | |
- `src/parse.c/h` - Term parsing implementation | |
- `src/whnf.c/h` - Term normalization | |
- `src/show.c/h` - Term to string conversions | |
- `src/interactions/` - Future directory for interaction implementations | |
## Build Commands | |
- Build the project: `make` | |
- Clean build artifacts: `make clean` | |
- Run with default test term: `./bin/main` | |
- Run with custom term: `./bin/main "((λf.λx.!&0{f0,f1}=f;(f0 (f1 x)) λB.λT.λF.((B F) T)) λa.λb.a)"` | |
## Code Style | |
- Use C99 standard with portable implementation | |
- Implement 32-bit term representation as specified in suptt.md | |
- Follow the memory model from suptt.md - terms as 32-bit pointers with sub/tag/lab/val fields | |
- Use `snake_case` for functions and variables | |
- Constants should be in `UPPER_CASE` | |
- 2 space indentation | |
- Include error handling for all parser functions | |
- Use descriptive variable names that reflect domain terminology | |
- Add comments for complex interactions and term operations | |
- Keep header files minimal with only necessary declarations | |
- Implement core interactions as specified in the suptt.md spec | |
#include <stdio.h> | |
#include <stdlib.h> | |
#include <string.h> | |
#include <stdbool.h> | |
#include "types.h" | |
#include "memory.h" | |
#include "show.h" | |
// Maximum string length for term representation | |
#define MAX_STR_LEN 65536 | |
// Structure to track variable names | |
typedef struct { | |
uint32_t count; // Number of variables encountered | |
uint32_t* locations; // Array of variable locations | |
TermTag* types; // Array of variable types (VAR, CO0, CO1) | |
char** names; // Array of variable names | |
uint32_t capacity; // Capacity of the arrays | |
} VarNameTable; | |
// Structure to track collapser nodes | |
typedef struct { | |
uint32_t* locations; // Array of collapser locations | |
uint32_t count; // Number of collapsers | |
uint32_t capacity; // Capacity of the array | |
} ColTable; | |
// Initialize variable name table | |
void init_var_table(VarNameTable* table) { | |
table->count = 0; | |
table->capacity = 64; | |
table->locations = (uint32_t*)malloc(table->capacity * sizeof(uint32_t)); | |
table->types = (TermTag*)malloc(table->capacity * sizeof(TermTag)); | |
table->names = (char**)malloc(table->capacity * sizeof(char*)); | |
} | |
// Free variable name table | |
void free_var_table(VarNameTable* table) { | |
for (uint32_t i = 0; i < table->count; i++) { | |
free(table->names[i]); | |
} | |
free(table->locations); | |
free(table->types); | |
free(table->names); | |
} | |
// Initialize collapser table | |
void init_col_table(ColTable* table) { | |
table->count = 0; | |
table->capacity = 64; | |
table->locations = (uint32_t*)malloc(table->capacity * sizeof(uint32_t)); | |
} | |
// Free collapser table | |
void free_col_table(ColTable* table) { | |
free(table->locations); | |
} | |
// Add a variable to the table and return its name | |
char* add_variable(VarNameTable* table, uint32_t location, TermTag type) { | |
// Check if we need to expand the table | |
if (table->count >= table->capacity) { | |
table->capacity *= 2; | |
table->locations = (uint32_t*)realloc(table->locations, table->capacity * sizeof(uint32_t)); | |
table->types = (TermTag*)realloc(table->types, table->capacity * sizeof(TermTag)); | |
table->names = (char**)realloc(table->names, table->capacity * sizeof(char*)); | |
} | |
// Check if the variable is already in the table | |
for (uint32_t i = 0; i < table->count; i++) { | |
if (table->locations[i] == location && table->types[i] == type) { | |
return table->names[i]; | |
} | |
} | |
// Add the new variable | |
table->locations[table->count] = location; | |
table->types[table->count] = type; | |
// Generate a name for the variable based on its type | |
char* name = (char*)malloc(16); | |
if (type == CO0) { | |
sprintf(name, "a%u", table->count); | |
} else if (type == CO1) { | |
sprintf(name, "b%u", table->count); | |
} else { | |
sprintf(name, "x%u", table->count); | |
} | |
table->names[table->count] = name; | |
table->count++; | |
return name; | |
} | |
// Get a variable name from the table | |
char* get_var_name(VarNameTable* table, uint32_t location, TermTag type) { | |
for (uint32_t i = 0; i < table->count; i++) { | |
if (table->locations[i] == location && table->types[i] == type) { | |
return table->names[i]; | |
} | |
} | |
return "?"; // Unknown variable | |
} | |
// Forward declarations | |
void assign_var_ids(Term term, VarNameTable* var_table, ColTable* col_table); | |
void stringify_term(Term term, VarNameTable* var_table, char* buffer, int* pos, int max_len); | |
void stringify_collapsers(ColTable* col_table, VarNameTable* var_table, char* buffer, int* pos, int max_len); | |
// Register a collapser in the table | |
bool register_collapser(ColTable* table, uint32_t location) { | |
// Check if the collapser is already in the table | |
for (uint32_t i = 0; i < table->count; i++) { | |
if (table->locations[i] == location) { | |
return false; // Already registered | |
} | |
} | |
// Check if we need to expand the table | |
if (table->count >= table->capacity) { | |
table->capacity *= 2; | |
table->locations = (uint32_t*)realloc(table->locations, table->capacity * sizeof(uint32_t)); | |
} | |
// Add the new collapser | |
table->locations[table->count++] = location; | |
return true; // Newly registered | |
} | |
// Assign IDs to variables and register collapsers | |
void assign_var_ids(Term term, VarNameTable* var_table, ColTable* col_table) { | |
// Follow substitutions | |
if (TERM_SUB(term)) { | |
return; // No need to process substitutions | |
} | |
TermTag tag = TERM_TAG(term); | |
uint32_t val = TERM_VAL(term); | |
switch (tag) { | |
case VAR: | |
// Nothing specific to do for regular variables | |
break; | |
case CO0: | |
case CO1: { | |
// Register the collapser and process its value if newly added | |
if (register_collapser(col_table, val)) { | |
Term val_term = heap[val]; | |
assign_var_ids(val_term, var_table, col_table); | |
} | |
break; | |
} | |
case LAM: { | |
// Lambda has a body | |
uint32_t lam_loc = val; | |
add_variable(var_table, lam_loc, VAR); | |
assign_var_ids(heap[lam_loc], var_table, col_table); | |
break; | |
} | |
case LET: { | |
// Let has a value and a body | |
uint32_t let_loc = val; | |
add_variable(var_table, let_loc, VAR); | |
assign_var_ids(heap[let_loc], var_table, col_table); | |
assign_var_ids(heap[let_loc + 1], var_table, col_table); | |
break; | |
} | |
case APP: { | |
// Application has function and argument | |
uint32_t app_loc = val; | |
assign_var_ids(heap[app_loc], var_table, col_table); | |
assign_var_ids(heap[app_loc + 1], var_table, col_table); | |
break; | |
} | |
case SUP: { | |
// Superposition has left and right | |
uint32_t sup_loc = val; | |
assign_var_ids(heap[sup_loc], var_table, col_table); | |
assign_var_ids(heap[sup_loc + 1], var_table, col_table); | |
break; | |
} | |
case EFQ: { | |
// EFQ has a value | |
uint32_t efq_loc = val; | |
assign_var_ids(heap[efq_loc], var_table, col_table); | |
break; | |
} | |
case USE: { | |
// USE has a value and a body | |
uint32_t use_loc = val; | |
assign_var_ids(heap[use_loc], var_table, col_table); | |
assign_var_ids(heap[use_loc + 1], var_table, col_table); | |
break; | |
} | |
case ITE: { | |
// ITE has condition, then, else | |
uint32_t ite_loc = val; | |
assign_var_ids(heap[ite_loc], var_table, col_table); | |
assign_var_ids(heap[ite_loc + 1], var_table, col_table); | |
assign_var_ids(heap[ite_loc + 2], var_table, col_table); | |
break; | |
} | |
case SIG: | |
case TUP: { | |
// Sig and Tup have first and second | |
uint32_t pair_loc = val; | |
assign_var_ids(heap[pair_loc], var_table, col_table); | |
assign_var_ids(heap[pair_loc + 1], var_table, col_table); | |
break; | |
} | |
case GET: { | |
// GET has value and body | |
uint32_t get_loc = val; | |
add_variable(var_table, get_loc, VAR); // First var | |
add_variable(var_table, get_loc + 1, VAR); // Second var | |
assign_var_ids(heap[get_loc + 2], var_table, col_table); | |
assign_var_ids(heap[get_loc + 3], var_table, col_table); | |
break; | |
} | |
case ALL: { | |
// ALL has input and output | |
uint32_t all_loc = val; | |
add_variable(var_table, all_loc, VAR); | |
assign_var_ids(heap[all_loc], var_table, col_table); | |
assign_var_ids(heap[all_loc + 1], var_table, col_table); | |
break; | |
} | |
case EQL: { | |
// EQL has left and right | |
uint32_t eql_loc = val; | |
assign_var_ids(heap[eql_loc], var_table, col_table); | |
assign_var_ids(heap[eql_loc + 1], var_table, col_table); | |
break; | |
} | |
case RWT: { | |
// RWT has value and body | |
uint32_t rwt_loc = val; | |
assign_var_ids(heap[rwt_loc], var_table, col_table); | |
assign_var_ids(heap[rwt_loc + 1], var_table, col_table); | |
break; | |
} | |
// No need to process leaf nodes (SET, EMP, UNI, NIL, BIT, BT0, BT1, RFL) | |
default: | |
break; | |
} | |
} | |
// Stringify collapsers | |
void stringify_collapsers(ColTable* col_table, VarNameTable* var_table, char* buffer, int* pos, int max_len) { | |
// First, add all collapser variables | |
for (uint32_t i = 0; i < col_table->count; i++) { | |
uint32_t col_loc = col_table->locations[i]; | |
add_variable(var_table, col_loc, CO0); | |
add_variable(var_table, col_loc, CO1); | |
} | |
// Then, stringify each collapser | |
for (uint32_t i = 0; i < col_table->count; i++) { | |
uint32_t col_loc = col_table->locations[i]; | |
Term val_term = heap[col_loc]; | |
uint8_t lab = TERM_LAB(val_term); | |
// Get variable names | |
char* var0 = get_var_name(var_table, col_loc, CO0); | |
char* var1 = get_var_name(var_table, col_loc, CO1); | |
// Add collapser header | |
*pos += snprintf(buffer + *pos, max_len - *pos, "! &%u{%s,%s} = ", lab, var0, var1); | |
// Add the value | |
stringify_term(val_term, var_table, buffer, pos, max_len); | |
// Add separator | |
*pos += snprintf(buffer + *pos, max_len - *pos, ";\n"); | |
} | |
} | |
// Stringify a term | |
void stringify_term(Term term, VarNameTable* var_table, char* buffer, int* pos, int max_len) { | |
// Check for substitution | |
if (TERM_SUB(term)) { | |
term = heap[TERM_VAL(term)]; // Follow the substitution | |
} | |
TermTag tag = TERM_TAG(term); | |
uint32_t val = TERM_VAL(term); | |
uint8_t lab = TERM_LAB(term); | |
switch (tag) { | |
case VAR: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "%s", get_var_name(var_table, val, VAR)); | |
break; | |
case CO0: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "%s", get_var_name(var_table, val, CO0)); | |
break; | |
case CO1: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "%s", get_var_name(var_table, val, CO1)); | |
break; | |
case LAM: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "λ%s.", get_var_name(var_table, val, VAR)); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
break; | |
case LET: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "!%s = ", get_var_name(var_table, val, VAR)); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "; "); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
break; | |
case APP: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "("); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, " "); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, ")"); | |
break; | |
case SUP: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "&%u{", lab); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, ","); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "}"); | |
break; | |
case SET: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "*"); | |
break; | |
case EMP: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "⊥"); | |
break; | |
case EFQ: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "¬"); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
break; | |
case UNI: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "⊤"); | |
break; | |
case NIL: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "()"); | |
break; | |
case USE: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "-"); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "; "); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
break; | |
case BIT: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "𝔹"); | |
break; | |
case BT0: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "0"); | |
break; | |
case BT1: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "1"); | |
break; | |
case ITE: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "?"); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "{"); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "}; {"); | |
stringify_term(heap[val + 2], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "}"); | |
break; | |
case SIG: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "Σ%s:", get_var_name(var_table, val, VAR)); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "."); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
break; | |
case TUP: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "["); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, ","); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "]"); | |
break; | |
case GET: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "![%s,%s] = ", | |
get_var_name(var_table, val, VAR), | |
get_var_name(var_table, val + 1, VAR)); | |
stringify_term(heap[val + 2], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "; "); | |
stringify_term(heap[val + 3], var_table, buffer, pos, max_len); | |
break; | |
case ALL: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "Π%s:", get_var_name(var_table, val, VAR)); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "."); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
break; | |
case EQL: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "<"); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "="); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, ">"); | |
break; | |
case RFL: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "θ"); | |
break; | |
case RWT: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "%%"); | |
stringify_term(heap[val], var_table, buffer, pos, max_len); | |
*pos += snprintf(buffer + *pos, max_len - *pos, "; "); | |
stringify_term(heap[val + 1], var_table, buffer, pos, max_len); | |
break; | |
default: | |
*pos += snprintf(buffer + *pos, max_len - *pos, "<?unknown term>"); | |
break; | |
} | |
} | |
// Main function to convert a term to string | |
char* term_to_string(Term term) { | |
// Initialize tables | |
VarNameTable var_table; | |
ColTable col_table; | |
init_var_table(&var_table); | |
init_col_table(&col_table); | |
// Assign IDs to variables and register collapsers | |
assign_var_ids(term, &var_table, &col_table); | |
// Allocate buffer for the string representation | |
char* buffer = (char*)malloc(MAX_STR_LEN); | |
int pos = 0; | |
// First stringify all collapsers | |
stringify_collapsers(&col_table, &var_table, buffer, &pos, MAX_STR_LEN); | |
// Then stringify the main term | |
stringify_term(term, &var_table, buffer, &pos, MAX_STR_LEN); | |
// Free tables | |
free_var_table(&var_table); | |
free_col_table(&col_table); | |
return buffer; | |
} | |
// Display a term to the specified output stream | |
void show_term(FILE* stream, Term term) { | |
char* str = term_to_string(term); | |
fprintf(stream, "%s", str); | |
free(str); | |
} | |
#include <stdio.h> | |
#include "normal.h" | |
#include "whnf.h" | |
#include "memory.h" | |
// Reduce a term to full normal form | |
Term normal(Term term) { | |
//printf("normal\n"); | |
// First reduce to WHNF | |
term = whnf(term); | |
// Get term details | |
TermTag tag = TERM_TAG(term); | |
uint32_t val = TERM_VAL(term); | |
// Recursively normalize subterms based on the term type | |
switch (tag) { | |
case LAM: { | |
Term body = normal(heap[val]); | |
heap[val] = body; | |
break; | |
} | |
case APP: { | |
Term fun = normal(heap[val]); | |
Term arg = normal(heap[val + 1]); | |
heap[val + 0] = fun; | |
heap[val + 1] = arg; | |
break; | |
} | |
case SUP: { | |
Term left = normal(heap[val]); | |
Term right = normal(heap[val + 1]); | |
heap[val + 0] = left; | |
heap[val + 1] = right; | |
break; | |
} | |
case LET: { | |
Term binding = normal(heap[val]); | |
Term body = normal(heap[val + 1]); | |
heap[val + 0] = binding; | |
heap[val + 1] = body; | |
break; | |
} | |
case EFQ: { | |
Term arg = normal(heap[val]); | |
heap[val] = arg; | |
break; | |
} | |
case USE: { | |
Term arg = normal(heap[val]); | |
Term body = normal(heap[val + 1]); | |
heap[val + 0] = arg; | |
heap[val + 1] = body; | |
break; | |
} | |
case ITE: { | |
Term cond = normal(heap[val]); | |
Term then_branch = normal(heap[val + 1]); | |
Term else_branch = normal(heap[val + 2]); | |
heap[val + 0] = cond; | |
heap[val + 1] = then_branch; | |
heap[val + 2] = else_branch; | |
break; | |
} | |
case SIG: { | |
Term arg = normal(heap[val]); | |
Term body = normal(heap[val + 1]); | |
heap[val + 0] = arg; | |
heap[val + 1] = body; | |
break; | |
} | |
case TUP: { | |
Term first = normal(heap[val]); | |
Term second = normal(heap[val + 1]); | |
heap[val + 0] = first; | |
heap[val + 1] = second; | |
break; | |
} | |
case GET: { | |
Term pair = normal(heap[val + 2]); | |
Term body = normal(heap[val + 3]); | |
heap[val + 2] = pair; | |
heap[val + 3] = body; | |
break; | |
} | |
case ALL: { | |
Term arg = normal(heap[val]); | |
Term body = normal(heap[val + 1]); | |
heap[val + 0] = arg; | |
heap[val + 1] = body; | |
break; | |
} | |
case EQL: { | |
Term left = normal(heap[val]); | |
Term right = normal(heap[val + 1]); | |
heap[val + 0] = left; | |
heap[val + 1] = right; | |
break; | |
} | |
case RWT: { | |
Term eq = normal(heap[val]); | |
Term body = normal(heap[val + 1]); | |
heap[val + 0] = eq; | |
heap[val + 1] = body; | |
break; | |
} | |
default: | |
break; | |
} | |
return term; | |
} | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! &L{x0,x1} = (); K | |
// ------------------- COL-NIL | |
// x0 <- () | |
// x1 <- () | |
// K | |
Term col_nil(Term col, Term nil) { | |
interaction_count++; | |
printf("col_nil\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
// Create the nil term to be returned and for substitution | |
Term nil_term = make_term(NIL, 0, 0); | |
// Substitute the appropriate collapser variable with nil | |
heap[col_loc] = make_sub(nil_term); | |
// Return nil | |
return nil_term; | |
} | |
// TODO: get_tup | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! [x,y] = [a,b]; t | |
// ------------------ GET-TUP | |
// x <- a | |
// y <- b | |
// t | |
Term get_tup(Term get, Term tup) { | |
interaction_count++; | |
printf("get_tup\n"); | |
uint32_t get_loc = TERM_VAL(get); | |
uint32_t tup_loc = TERM_VAL(tup); | |
// Extract tuple values | |
Term a = heap[tup_loc + 0]; | |
Term b = heap[tup_loc + 1]; | |
// Extract GET body | |
Term bod = heap[get_loc + 1]; | |
// Update locations for GET variables with substitutions | |
heap[get_loc + 2] = make_sub(a); // Substitute a for x | |
heap[get_loc + 3] = make_sub(b); // Substitute b for y | |
return bod; | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! &L{x0,x1} = 0; K | |
// ------------------ COL-BT0 | |
// x0 <- 0 | |
// x1 <- 0 | |
// K | |
Term col_b_0(Term col, Term bt0) { | |
interaction_count++; | |
printf("col_b_0\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
heap[col_loc] = make_sub(bt0); | |
return bt0; | |
} | |
// TODO: rwt_rfl.c | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// % θ; t | |
// ------ RWT-RFL | |
// t | |
Term rwt_rfl(Term rwt, Term rfl) { | |
interaction_count++; | |
printf("rwt_rfl\n"); | |
uint32_t rwt_loc = TERM_VAL(rwt); | |
Term body = heap[rwt_loc + 1]; | |
return body; | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ({a,b} c) | |
// --------------- APP-SUP | |
// ! {x0,x1} = c; | |
// {(a x0),(b x1)} | |
Term app_sup(Term app, Term sup) { | |
interaction_count++; | |
printf("app_sup\n"); | |
uint32_t app_loc = TERM_VAL(app); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t sup_lab = TERM_LAB(sup); | |
Term arg = heap[app_loc + 1]; | |
Term lft = heap[sup_loc + 0]; | |
Term rgt = heap[sup_loc + 1]; | |
uint32_t col_loc = alloc(1); | |
heap[col_loc] = arg; | |
Term x0 = make_term(CO0, sup_lab, col_loc); | |
Term x1 = make_term(CO1, sup_lab, col_loc); | |
uint32_t app0_loc = alloc(2); | |
heap[app0_loc + 0] = lft; | |
heap[app0_loc + 1] = x0; | |
uint32_t app1_loc = alloc(2); | |
heap[app1_loc + 0] = rgt; | |
heap[app1_loc + 1] = x1; | |
uint32_t sup_new_loc = alloc(2); | |
heap[sup_new_loc + 0] = make_term(APP, 0, app0_loc); | |
heap[sup_new_loc + 1] = make_term(APP, 0, app1_loc); | |
return make_term(SUP, sup_lab, sup_new_loc); | |
} | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! [x,y] = &L{a,b}; k | |
// ---------------------------- GET-SUP | |
// ! &L{k0,k1} = k; | |
// &L{![x,y]=a;k0, ![x,y]=b;k1} | |
Term get_sup(Term get, Term sup) { | |
interaction_count++; | |
printf("get_sup\n"); | |
uint32_t get_loc = TERM_VAL(get); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t sup_lab = TERM_LAB(sup); | |
Term a = heap[sup_loc + 0]; | |
Term b = heap[sup_loc + 1]; | |
Term k = heap[get_loc + 1]; | |
// Create a new collapser for the body | |
uint32_t col_loc = alloc(1); | |
heap[col_loc] = k; | |
// Create collapse variables | |
Term k0 = make_term(CO0, sup_lab, col_loc); | |
Term k1 = make_term(CO1, sup_lab, col_loc); | |
// Create two new GET eliminators | |
uint32_t get0_loc = alloc(2); | |
heap[get0_loc + 0] = a; | |
heap[get0_loc + 1] = k0; | |
uint32_t get1_loc = alloc(2); | |
heap[get1_loc + 0] = b; | |
heap[get1_loc + 1] = k1; | |
// Create a new superposition | |
uint32_t new_sup_loc = alloc(2); | |
heap[new_sup_loc + 0] = make_term(GET, 0, get0_loc); | |
heap[new_sup_loc + 1] = make_term(GET, 0, get1_loc); | |
return make_term(SUP, sup_lab, new_sup_loc); | |
} | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// - &L{a,b}; k | |
// -------------------- USE-SUP | |
// ! &L{k0,k1} = k; | |
// &L{-a;k0, -b;k1} | |
Term use_sup(Term use, Term sup) { | |
interaction_count++; | |
printf("use_sup\n"); | |
uint32_t use_loc = TERM_VAL(use); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t sup_lab = TERM_LAB(sup); | |
Term a = heap[sup_loc + 0]; | |
Term b = heap[sup_loc + 1]; | |
Term k = heap[use_loc + 1]; | |
// Create a collapser for k | |
uint32_t col_loc = alloc(1); | |
heap[col_loc] = k; | |
// Create new USE nodes for each branch | |
uint32_t use_a_loc = alloc(2); | |
heap[use_a_loc + 0] = a; | |
heap[use_a_loc + 1] = make_term(CO0, sup_lab, col_loc); | |
uint32_t use_b_loc = alloc(2); | |
heap[use_b_loc + 0] = b; | |
heap[use_b_loc + 1] = make_term(CO1, sup_lab, col_loc); | |
// Create new superposition with the USE nodes | |
uint32_t new_sup_loc = alloc(2); | |
heap[new_sup_loc + 0] = make_term(USE, 0, use_a_loc); | |
heap[new_sup_loc + 1] = make_term(USE, 0, use_b_loc); | |
return make_term(SUP, sup_lab, new_sup_loc); | |
} | |
// TODO: ite_b_0.c | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ? 0 { t } ; { f } | |
// ----------------- ITE-BT0 | |
// f | |
Term ite_b_0(Term ite, Term bt0) { | |
interaction_count++; | |
printf("ite_b_0\n"); | |
uint32_t ite_loc = TERM_VAL(ite); | |
// ITE structure: {cnd, thn, els} | |
// We need to return the else branch | |
Term els = heap[ite_loc + 2]; | |
return els; | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! &L{r,s} = λx.f; | |
// K | |
// ----------------- COL-LAM | |
// r <- λx0.f0 | |
// s <- λx1.f1 | |
// x <- &L{x0,x1} | |
// ! &L{f0,f1} = f; | |
// K | |
Term col_lam(Term col, Term lam) { | |
interaction_count++; | |
printf("col_lam\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
uint32_t lam_loc = TERM_VAL(lam); | |
uint8_t col_lab = TERM_LAB(col); | |
uint8_t is_co0 = TERM_TAG(col) == CO0; | |
Term bod = heap[lam_loc + 0]; | |
// Create new lambda variables | |
uint32_t lam0_loc = alloc(1); | |
uint32_t lam1_loc = alloc(1); | |
// Create a superposition for lambda's variable | |
uint32_t sup_loc = alloc(2); | |
heap[sup_loc + 0] = make_term(VAR, 0, lam0_loc); | |
heap[sup_loc + 1] = make_term(VAR, 0, lam1_loc); | |
// Replace lambda's variable with the superposition | |
heap[lam_loc] = make_sub(make_term(SUP, col_lab, sup_loc)); | |
// Create a collapser for lambda's body | |
uint32_t col_new_loc = alloc(1); | |
heap[col_new_loc] = bod; | |
// Set up new lambda bodies | |
heap[lam0_loc] = make_term(CO0, col_lab, col_new_loc); | |
heap[lam1_loc] = make_term(CO1, col_lab, col_new_loc); | |
// Create and return the appropriate lambda | |
Term new_lam; | |
if (is_co0) { | |
heap[col_loc] = make_sub(make_term(LAM, 0, lam1_loc)); | |
new_lam = make_term(LAM, 0, lam0_loc); | |
} else { | |
heap[col_loc] = make_sub(make_term(LAM, 0, lam0_loc)); | |
new_lam = make_term(LAM, 0, lam1_loc); | |
} | |
return new_lam; | |
} | |
// TODO: rwt_sup.c | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// % &L{a,b}; k | |
// ---------------- RWT-SUP | |
// ! &L{k0,k1} = k; | |
// &L{%a;k0, %b;k1} | |
Term rwt_sup(Term rwt, Term sup) { | |
interaction_count++; | |
printf("rwt_sup\n"); | |
uint32_t rwt_loc = TERM_VAL(rwt); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t sup_lab = TERM_LAB(sup); | |
Term bod = heap[rwt_loc + 1]; // The body of the rewrite | |
Term lft = heap[sup_loc + 0]; // Left side of superposition | |
Term rgt = heap[sup_loc + 1]; // Right side of superposition | |
// Create a collapser for the rewrite body | |
uint32_t col_loc = alloc(1); | |
heap[col_loc] = bod; | |
// Create the collapser variables | |
Term k0 = make_term(CO0, sup_lab, col_loc); | |
Term k1 = make_term(CO1, sup_lab, col_loc); | |
// Create the new rewrites | |
uint32_t rwt0_loc = alloc(2); | |
heap[rwt0_loc + 0] = lft; | |
heap[rwt0_loc + 1] = k0; | |
uint32_t rwt1_loc = alloc(2); | |
heap[rwt1_loc + 0] = rgt; | |
heap[rwt1_loc + 1] = k1; | |
// Create the resulting superposition | |
uint32_t sup_new_loc = alloc(2); | |
heap[sup_new_loc + 0] = make_term(RWT, 0, rwt0_loc); | |
heap[sup_new_loc + 1] = make_term(RWT, 0, rwt1_loc); | |
return make_term(SUP, sup_lab, sup_new_loc); | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! x = t; body | |
// ------------- LET | |
// x <- t | |
// body | |
Term let_red(Term let) { | |
interaction_count++; | |
printf("let_red\n"); | |
uint32_t let_loc = TERM_VAL(let); | |
uint32_t val_loc = let_loc; | |
uint32_t bod_loc = let_loc + 1; | |
Term val = heap[val_loc]; | |
Term bod = heap[bod_loc]; | |
heap[val_loc] = make_sub(val); | |
return bod; | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// (λx.f a) | |
// -------- APP-LAM | |
// x <- a | |
// f | |
Term app_lam(Term app, Term lam) { | |
interaction_count++; | |
printf("app_lam\n"); | |
uint32_t app_loc = TERM_VAL(app); | |
uint32_t lam_loc = TERM_VAL(lam); | |
uint32_t arg_loc = app_loc + 1; | |
Term arg = heap[arg_loc + 0]; | |
Term bod = heap[lam_loc + 0]; | |
heap[lam_loc] = make_sub(arg); | |
return bod; | |
} | |
// TODO: ite_b_1.c | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ? 1 { t } ; { f } | |
// ----------------- ITE-BT1 | |
// t | |
Term ite_b_1(Term ite, Term bt1) { | |
interaction_count++; | |
printf("ite_b_1\n"); | |
uint32_t ite_loc = TERM_VAL(ite); | |
// ITE structure: {cnd, thn, els} | |
// We need to return the then branch | |
Term thn = heap[ite_loc + 1]; | |
return thn; | |
} | |
// TODO: ite_sup.c | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ? &L{a,b} {t} ; {f} | |
// ---------------------------- ITE-SUP | |
// ! &L{t0,t1} = t; | |
// ! &L{f0,f1} = f; | |
// &L{?a{t0};{f0}, ?b{t1};{f1}} | |
Term ite_sup(Term ite, Term sup) { | |
interaction_count++; | |
printf("ite_sup\n"); | |
uint32_t ite_loc = TERM_VAL(ite); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t sup_lab = TERM_LAB(sup); | |
Term thn = heap[ite_loc + 1]; | |
Term els = heap[ite_loc + 2]; | |
Term lft = heap[sup_loc + 0]; | |
Term rgt = heap[sup_loc + 1]; | |
// Create collapsers for then and else branches | |
uint32_t col_thn_loc = alloc(1); | |
uint32_t col_els_loc = alloc(1); | |
heap[col_thn_loc] = thn; | |
heap[col_els_loc] = els; | |
// Create variables for the collapsers | |
Term t0 = make_term(CO0, sup_lab, col_thn_loc); | |
Term t1 = make_term(CO1, sup_lab, col_thn_loc); | |
Term f0 = make_term(CO0, sup_lab, col_els_loc); | |
Term f1 = make_term(CO1, sup_lab, col_els_loc); | |
// Create new ITE terms for each side of the superposition | |
uint32_t ite0_loc = alloc(3); | |
heap[ite0_loc + 0] = lft; | |
heap[ite0_loc + 1] = t0; | |
heap[ite0_loc + 2] = f0; | |
uint32_t ite1_loc = alloc(3); | |
heap[ite1_loc + 0] = rgt; | |
heap[ite1_loc + 1] = t1; | |
heap[ite1_loc + 2] = f1; | |
// Create new superposition with the two ITEs | |
uint32_t sup_new_loc = alloc(2); | |
heap[sup_new_loc + 0] = make_term(ITE, 0, ite0_loc); | |
heap[sup_new_loc + 1] = make_term(ITE, 0, ite1_loc); | |
return make_term(SUP, sup_lab, sup_new_loc); | |
} | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// - () t | |
// ------ USE-NIL | |
// t | |
Term use_nil(Term use, Term nil) { | |
interaction_count++; | |
printf("use_nil\n"); | |
uint32_t use_loc = TERM_VAL(use); | |
// The body is the second term in the Use Node | |
Term body = heap[use_loc + 1]; | |
return body; | |
} | |
#include <stdio.h> | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! &L{x0,x1} = [a,b]; K | |
// ---------------------- COL-TUP | |
// x0 <- [a0,b0] | |
// x1 <- [a1,b1] | |
// ! &L{a0,a1} = a | |
// ! &L{b0,b1} = b | |
// K | |
Term col_tup(Term col, Term tup) { | |
interaction_count++; | |
printf("col_tup\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
uint32_t tup_loc = TERM_VAL(tup); | |
uint8_t col_lab = TERM_LAB(col); | |
uint8_t is_co0 = TERM_TAG(col) == CO0; | |
Term fst = heap[tup_loc + 0]; | |
Term snd = heap[tup_loc + 1]; | |
// Create new collapsers for fst and snd | |
uint32_t col_fst_loc = alloc(1); | |
uint32_t col_snd_loc = alloc(1); | |
heap[col_fst_loc] = fst; | |
heap[col_snd_loc] = snd; | |
// Create new variables for the collapsers | |
Term a0 = make_term(CO0, col_lab, col_fst_loc); | |
Term a1 = make_term(CO1, col_lab, col_fst_loc); | |
Term b0 = make_term(CO0, col_lab, col_snd_loc); | |
Term b1 = make_term(CO1, col_lab, col_snd_loc); | |
// Create two new tuples | |
uint32_t tup0_loc = alloc(2); | |
uint32_t tup1_loc = alloc(2); | |
heap[tup0_loc + 0] = a0; | |
heap[tup0_loc + 1] = b0; | |
heap[tup1_loc + 0] = a1; | |
heap[tup1_loc + 1] = b1; | |
// Create the new tuples | |
Term tup0 = make_term(TUP, 0, tup0_loc); | |
Term tup1 = make_term(TUP, 0, tup1_loc); | |
// Substitute the variables | |
if (is_co0) { | |
heap[col_loc] = make_sub(tup1); | |
return tup0; | |
} else { | |
heap[col_loc] = make_sub(tup0); | |
return tup1; | |
} | |
} | |
#include <stdio.h> | |
#include "../whnf.h" | |
#include "../memory.h" | |
#include "../types.h" | |
// ! &L{x0,x1} = 1; K | |
// ------------------ COL-BT1 | |
// x0 <- 1 | |
// x1 <- 1 | |
// K | |
Term col_b_1(Term col, Term bt1) { | |
interaction_count++; | |
printf("col_b_1\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
heap[col_loc] = make_sub(bt1); | |
return bt1; | |
} | |
#include "../memory.h" | |
#include "../types.h" | |
#include "../whnf.h" | |
#include "../types.h" | |
#include <stdio.h> | |
#include "../types.h" | |
// ! &L{x,y} = &L{a,b}; K (if equal labels) | |
// -------------------- COL-SUP | |
// x <- a | |
// y <- b | |
// K | |
// | |
// ! &L{x,y} = &R{a,b}; K (if different labels) | |
// -------------------- COL-SUP | |
// x <- &R{a0,b0} | |
// y <- &R{a1,b1} | |
// ! &L{a0,a1} = a | |
// ! &L{b0,b1} = b | |
// K | |
Term col_sup(Term col, Term sup) { | |
interaction_count++; | |
printf("col_sup\n"); | |
uint32_t col_loc = TERM_VAL(col); | |
uint32_t sup_loc = TERM_VAL(sup); | |
uint8_t col_lab = TERM_LAB(col); | |
uint8_t sup_lab = TERM_LAB(sup); | |
uint8_t is_co0 = TERM_TAG(col) == CO0; | |
Term lft = heap[sup_loc + 0]; | |
Term rgt = heap[sup_loc + 1]; | |
if (col_lab == sup_lab) { | |
// Labels match: simple substitution | |
if (is_co0) { | |
heap[col_loc] = make_sub(rgt); | |
return lft; | |
} else { | |
heap[col_loc] = make_sub(lft); | |
return rgt; | |
} | |
} else { | |
// Labels don't match: create nested collapsers | |
uint32_t col_lft_loc = alloc(1); | |
uint32_t col_rgt_loc = alloc(1); | |
heap[col_lft_loc] = lft; | |
heap[col_rgt_loc] = rgt; | |
uint32_t sup0_loc = alloc(2); | |
heap[sup0_loc + 0] = make_term(CO0, col_lab, col_lft_loc); | |
heap[sup0_loc + 1] = make_term(CO0, col_lab, col_rgt_loc); | |
uint32_t sup1_loc = alloc(2); | |
heap[sup1_loc + 0] = make_term(CO1, col_lab, col_lft_loc); | |
heap[sup1_loc + 1] = make_term(CO1, col_lab, col_rgt_loc); | |
if (is_co0) { | |
heap[col_loc] = make_sub(make_term(SUP, sup_lab, sup1_loc)); | |
return make_term(SUP, sup_lab, sup0_loc); | |
} else { | |
heap[col_loc] = make_sub(make_term(SUP, sup_lab, sup0_loc)); | |
return make_term(SUP, sup_lab, sup1_loc); | |
} | |
} | |
} | |
#include <stdio.h> | |
#include <stdlib.h> | |
#include "memory.h" | |
// The heap (global memory buffer) | |
Term heap[HEAP_SIZE]; | |
// The current allocation pointer | |
uint32_t heap_ptr = 0; | |
// Initialize the memory system | |
void init_memory() { | |
heap_ptr = 0; | |
for (uint32_t i = 0; i < HEAP_SIZE; i++) { | |
heap[i] = 0; | |
} | |
} | |
// Allocate n consecutive terms in memory | |
uint32_t alloc(uint32_t n) { | |
if (heap_ptr + n >= HEAP_SIZE) { | |
fprintf(stderr, "Error: Out of memory\n"); | |
exit(1); | |
} | |
uint32_t ptr = heap_ptr; | |
heap_ptr += n; | |
return ptr; | |
} | |
// Create a term with the given tag and value | |
Term make_term(TermTag tag, uint8_t lab, uint32_t val) { | |
return MAKE_TERM(false, tag, lab, val); | |
} | |
// Create a substitution term | |
Term make_sub(Term term) { | |
return MAKE_TERM(true, TERM_TAG(term), TERM_LAB(term), TERM_VAL(term)); | |
} | |
// Remove the substitution bit from a term | |
Term clear_sub(Term term) { | |
return MAKE_TERM(false, TERM_TAG(term), TERM_LAB(term), TERM_VAL(term)); | |
} | |
// Check if a term is a substitution | |
bool has_sub(Term term) { | |
return TERM_SUB(term); | |
}#ifndef TYPES_H | |
#define TYPES_H | |
#include <stdint.h> | |
#include <stdbool.h> | |
// Term tags | |
typedef enum { | |
VAR, // Variable | |
LET, // Let binding | |
SUP, // Superposition | |
CO0, // Collapser first variable | |
CO1, // Collapser second variable | |
SET, // Universe type | |
EMP, // Empty type | |
EFQ, // Empty type elimination | |
UNI, // Unit type | |
NIL, // Unit value | |
USE, // Unit elimination | |
BIT, // Bool type | |
BT0, // False value | |
BT1, // True value | |
ITE, // Bool elimination | |
SIG, // Sigma type | |
TUP, // Tuple (pair) | |
GET, // Sigma elimination | |
ALL, // Pi type | |
LAM, // Lambda | |
APP, // Application | |
EQL, // Equality type | |
RFL, // Reflexivity | |
RWT // Equality elimination | |
} TermTag; | |
// Term 32-bit packed representation | |
typedef uint32_t Term; | |
// Term components | |
#define TERM_SUB_MASK 0x80000000 // 1-bit: Is this a substitution? | |
#define TERM_TAG_MASK 0x7C000000 // 5-bits: Term tag | |
#define TERM_LAB_MASK 0x03000000 // 2-bits: Label for superpositions | |
#define TERM_VAL_MASK 0x00FFFFFF // 24-bits: Value/pointer | |
// Term manipulation macros | |
#define TERM_SUB(term) (((term) & TERM_SUB_MASK) != 0) | |
#define TERM_TAG(term) (((term) & TERM_TAG_MASK) >> 26) | |
#define TERM_LAB(term) (((term) & TERM_LAB_MASK) >> 24) | |
#define TERM_VAL(term) ((term) & TERM_VAL_MASK) | |
// Term creation macro | |
#define MAKE_TERM(sub, tag, lab, val) \ | |
(((sub) ? TERM_SUB_MASK : 0) | \ | |
(((tag) << 26) & TERM_TAG_MASK) | \ | |
(((lab) << 24) & TERM_LAB_MASK) | \ | |
((val) & TERM_VAL_MASK)) | |
// Max heap size (2^24 terms) | |
#define HEAP_SIZE (1 << 24) | |
// Interaction counter | |
extern uint64_t interaction_count; | |
#endif // TYPES_H#ifndef PARSE_H | |
#define PARSE_H | |
#include "types.h" | |
#include <stdint.h> | |
#include <stdbool.h> | |
// Maximum name length for identifiers | |
#define MAX_NAME_LEN 64 | |
// Maximum number of variable uses and bindings to track | |
#define MAX_USES 1024 | |
#define MAX_VARS 1024 | |
// Variable use structure to track unresolved variable uses | |
typedef struct { | |
char name[MAX_NAME_LEN]; // Variable name | |
uint32_t loc; // Location in heap to update | |
} VarUse; | |
// Variable binding structure | |
typedef struct { | |
char name[MAX_NAME_LEN]; // Variable name | |
Term term; // Term representing the variable (VAR, CO0, CO1) | |
} VarBinding; | |
// Parser state structure | |
typedef struct { | |
const char* input; // Input string | |
size_t pos; // Current position | |
size_t line; // Current line number | |
size_t col; // Current column number | |
// Uses and bindings for variable resolution | |
VarUse lcs[MAX_USES]; // Array of unresolved variable uses | |
size_t lcs_count; // Number of uses | |
VarBinding vrs[MAX_VARS]; // Map from names to variable terms | |
size_t vrs_count; // Number of bindings | |
} Parser; | |
// Initialize a parser with the given input string | |
void init_parser(Parser* parser, const char* input); | |
// Main parsing functions | |
Term parse_string(const char* input); | |
Term parse_file(const char* filename); | |
uint32_t parse_term_alloc(Parser* parser); | |
void parse_term(Parser* parser, uint32_t loc); | |
// Variable management | |
void add_var_use(Parser* parser, const char* name, uint32_t loc); | |
void bind_var(Parser* parser, const char* name, Term term); | |
Term* lookup_var_binding(Parser* parser, const char* name); | |
void resolve_var_uses(Parser* parser); | |
// Helper parsing functions | |
void skip(Parser* parser); | |
char peek_char(Parser* parser); | |
char next_char(Parser* parser); | |
bool peek_is(Parser* parser, char c); | |
bool consume(Parser* parser, const char* str); | |
bool expect(Parser* parser, const char* token, const char* error_context); | |
uint32_t parse_uint(Parser* parser); | |
char* parse_name(Parser* parser); | |
void parse_error(Parser* parser, const char* message); | |
// UTF-8 helpers | |
bool check_utf8(Parser* parser, uint8_t b1, uint8_t b2); | |
bool check_utf8_3bytes(Parser* parser, uint8_t b1, uint8_t b2, uint8_t b3); | |
bool check_utf8_4bytes(Parser* parser, uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4); | |
void consume_utf8(Parser* parser, int bytes); | |
// Term creation helpers | |
uint32_t alloc_term(uint32_t n); | |
void store_term(uint32_t loc, TermTag tag, uint8_t label, uint32_t value); | |
// Individual term parsers | |
void parse_term_var(Parser* parser, uint32_t loc); | |
void parse_term_let(Parser* parser, uint32_t loc); | |
void parse_term_sup(Parser* parser, uint32_t loc); | |
void parse_term_col(Parser* parser, uint32_t loc); | |
void parse_term_set(Parser* parser, uint32_t loc); | |
void parse_term_emp(Parser* parser, uint32_t loc); | |
void parse_term_efq(Parser* parser, uint32_t loc); | |
void parse_term_uni(Parser* parser, uint32_t loc); | |
void parse_term_nil(Parser* parser, uint32_t loc); | |
void parse_term_use(Parser* parser, uint32_t loc); | |
void parse_term_bit(Parser* parser, uint32_t loc); | |
void parse_term_bt0(Parser* parser, uint32_t loc); | |
void parse_term_bt1(Parser* parser, uint32_t loc); | |
void parse_term_ite(Parser* parser, uint32_t loc); | |
void parse_term_sig(Parser* parser, uint32_t loc); | |
void parse_term_tup(Parser* parser, uint32_t loc); | |
void parse_term_get(Parser* parser, uint32_t loc); | |
void parse_term_all(Parser* parser, uint32_t loc); | |
void parse_term_lam(Parser* parser, uint32_t loc); | |
void parse_term_app(Parser* parser, uint32_t loc); | |
void parse_term_eql(Parser* parser, uint32_t loc); | |
void parse_term_rfl(Parser* parser, uint32_t loc); | |
void parse_term_rwt(Parser* parser, uint32_t loc); | |
#endif // PARSE_H#ifndef WHNF_H | |
#define WHNF_H | |
#include "types.h" | |
// Global interaction counter | |
extern uint64_t interaction_count; | |
// Function declarations for interactions | |
Term app_lam(Term app, Term lam); | |
Term app_sup(Term app, Term sup); | |
Term col_lam(Term col, Term lam); | |
Term col_sup(Term col, Term sup); | |
Term col_nil(Term col, Term nil); | |
Term col_b_0(Term col, Term bt0); | |
Term col_b_1(Term col, Term bt1); | |
Term col_tup(Term col, Term tup); | |
Term get_sup(Term get, Term sup); | |
Term get_tup(Term get, Term tup); | |
Term ite_b_0(Term ite, Term bt0); | |
Term ite_b_1(Term ite, Term bt1); | |
Term ite_sup(Term ite, Term sup); | |
Term rwt_rfl(Term rwt, Term rfl); | |
Term rwt_sup(Term rwt, Term sup); | |
Term use_nil(Term use, Term nil); | |
Term use_sup(Term use, Term sup); | |
Term let_red(Term let); | |
// Reduce a term to weak head normal form | |
Term whnf(Term term); | |
#endif // WHNF_H#include <stdio.h> | |
#include <stdlib.h> | |
#include <string.h> | |
#include <ctype.h> | |
#include "../parse.h" | |
#include "../memory.h" | |
// Main term parser - dispatcher for specific term types | |
void parse_term(Parser* parser, uint32_t loc) { | |
skip(parser); | |
if (parser->input[parser->pos] == '\0') { | |
parse_error(parser, "Unexpected end of input"); | |
} | |
unsigned char c = (unsigned char)parser->input[parser->pos]; | |
if (isalpha(c) || c == '_') { | |
parse_term_var(parser, loc); | |
} else if (c == '!') { | |
parser->pos++; // Peek ahead | |
char next = peek_char(parser); | |
parser->pos--; // Restore position | |
if (next == '&') { | |
parse_term_col(parser, loc); | |
} else if (next == '[') { | |
parse_term_get(parser, loc); | |
} else { | |
parse_term_let(parser, loc); | |
} | |
} else if (c == '&') { | |
parse_term_sup(parser, loc); | |
} else if (c == '*') { | |
parse_term_set(parser, loc); | |
} else if (c == 0xE2) { | |
if ((unsigned char)parser->input[parser->pos + 1] == 0x8A) { | |
if ((unsigned char)parser->input[parser->pos + 2] == 0xA5) { | |
parse_term_emp(parser, loc); | |
} else if ((unsigned char)parser->input[parser->pos + 2] == 0xA4) { | |
parse_term_uni(parser, loc); | |
} else { | |
parse_error(parser, "Unknown Unicode symbol starting with E2 8A"); | |
} | |
} else { | |
parse_error(parser, "Unknown Unicode symbol starting with E2"); | |
} | |
} else if (c == 0xC2 && (unsigned char)parser->input[parser->pos + 1] == 0xAC) { | |
parse_term_efq(parser, loc); | |
} else if (c == '(' && parser->input[parser->pos + 1] == ')') { | |
parse_term_nil(parser, loc); | |
} else if (c == '-') { | |
parse_term_use(parser, loc); | |
} else if (c == 0xF0 && | |
(unsigned char)parser->input[parser->pos + 1] == 0x9D && | |
(unsigned char)parser->input[parser->pos + 2] == 0x94 && | |
(unsigned char)parser->input[parser->pos + 3] == 0xB9) { | |
parse_term_bit(parser, loc); | |
} else if (c == '0') { | |
parse_term_bt0(parser, loc); | |
} else if (c == '1') { | |
parse_term_bt1(parser, loc); | |
} else if (c == '?') { | |
parse_term_ite(parser, loc); | |
} else if (c == 0xCE) { | |
unsigned char next_byte = (unsigned char)parser->input[parser->pos + 1]; | |
if (next_byte == 0xA3) { | |
parse_term_sig(parser, loc); | |
} else if (next_byte == 0xA0) { | |
parse_term_all(parser, loc); | |
} else if (next_byte == 0xBB) { | |
parse_term_lam(parser, loc); | |
} else if (next_byte == 0xB8) { | |
parse_term_rfl(parser, loc); | |
} else { | |
parse_error(parser, "Unknown Unicode symbol starting with CE"); | |
} | |
} else if (c == '[') { | |
parse_term_tup(parser, loc); | |
} else if (c == '(') { | |
parse_term_app(parser, loc); | |
} else if (c == '<') { | |
parse_term_eql(parser, loc); | |
} else if (c == '%') { | |
parse_term_rwt(parser, loc); | |
} else { | |
char error_msg[100]; | |
snprintf(error_msg, sizeof(error_msg), "Unexpected character: %c (code: %d)", c, (int)c); | |
parse_error(parser, error_msg); | |
} | |
} | |
#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a sigma type | |
void parse_term_sig(Parser* parser, uint32_t loc) { | |
if (check_utf8(parser, 0xCE, 0xA3)) { | |
consume_utf8(parser, 2); | |
} else if (!consume(parser, "Σ")) { | |
parse_error(parser, "Expected 'Σ' for sigma type"); | |
} | |
char* name = parse_name(parser); | |
expect(parser, ":", "after name in sigma type"); | |
uint32_t sig_node = alloc_term(2); | |
uint32_t fst_loc = sig_node; | |
uint32_t snd_loc = sig_node + 1; | |
parse_term(parser, fst_loc); | |
expect(parser, ".", "after first type in sigma type"); | |
Term var_term = make_term(VAR, 0, fst_loc); | |
bind_var(parser, name, var_term); | |
parse_term(parser, snd_loc); | |
store_term(loc, SIG, 0, sig_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a unit value | |
void parse_term_nil(Parser* parser, uint32_t loc) { | |
expect(parser, "()", "for unit value"); | |
store_term(loc, NIL, 0, 0); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse an equality elimination | |
void parse_term_rwt(Parser* parser, uint32_t loc) { | |
expect(parser, "%", "for equality elimination"); | |
uint32_t rwt_node = alloc_term(2); | |
uint32_t eq_loc = rwt_node; | |
uint32_t bod_loc = rwt_node + 1; | |
parse_term(parser, eq_loc); | |
expect(parser, ";", "after equality proof"); | |
parse_term(parser, bod_loc); | |
store_term(loc, RWT, 0, rwt_node); | |
}#include <string.h> | |
#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a collapser | |
void parse_term_col(Parser* parser, uint32_t loc) { | |
expect(parser, "!&", "for collapser"); | |
uint8_t label = parse_uint(parser) & 3; | |
expect(parser, "{", "after label in collapser"); | |
static char var1[MAX_NAME_LEN], var2[MAX_NAME_LEN]; | |
char* temp_name = parse_name(parser); | |
strncpy(var1, temp_name, MAX_NAME_LEN - 1); | |
var1[MAX_NAME_LEN - 1] = '\0'; | |
expect(parser, ",", "between names in collapser"); | |
temp_name = parse_name(parser); | |
strncpy(var2, temp_name, MAX_NAME_LEN - 1); | |
var2[MAX_NAME_LEN - 1] = '\0'; | |
expect(parser, "}", "after names in collapser"); | |
expect(parser, "=", "after names in collapser"); | |
// Allocate a node specifically for the collapse value | |
uint32_t col_node = alloc_term(1); | |
// Create collapse variable terms that point to the col_node, NOT to the loc | |
Term co0_term = make_term(CO0, label, col_node); | |
Term co1_term = make_term(CO1, label, col_node); | |
bind_var(parser, var1, co0_term); | |
bind_var(parser, var2, co1_term); | |
// Parse the collapse value into col_node | |
parse_term(parser, col_node); | |
expect(parser, ";", "after value in collapser"); | |
// Parse the body of the collapse into loc | |
parse_term(parser, loc); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a false value | |
void parse_term_bt0(Parser* parser, uint32_t loc) { | |
expect(parser, "0", "for false value"); | |
store_term(loc, BT0, 0, 0); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a lambda | |
void parse_term_lam(Parser* parser, uint32_t loc) { | |
if (check_utf8(parser, 0xCE, 0xBB)) { | |
consume_utf8(parser, 2); | |
} else if (!consume(parser, "λ")) { | |
parse_error(parser, "Expected 'λ' for lambda"); | |
} | |
char* name = parse_name(parser); | |
expect(parser, ".", "after name in lambda"); | |
uint32_t lam_node = alloc_term(1); | |
Term var_term = make_term(VAR, 0, lam_node); | |
bind_var(parser, name, var_term); | |
parse_term(parser, lam_node); | |
store_term(loc, LAM, 0, lam_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a bool elimination | |
void parse_term_ite(Parser* parser, uint32_t loc) { | |
expect(parser, "?", "for bool elimination"); | |
uint32_t ite_node = alloc_term(3); | |
uint32_t cnd_loc = ite_node; | |
uint32_t thn_loc = ite_node + 1; | |
uint32_t els_loc = ite_node + 2; | |
parse_term(parser, cnd_loc); | |
expect(parser, "{", "after condition in bool elimination"); | |
parse_term(parser, thn_loc); | |
expect(parser, "};{", "between branches in bool elimination"); | |
parse_term(parser, els_loc); | |
expect(parser, "}", "after else branch in bool elimination"); | |
store_term(loc, ITE, 0, ite_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse an empty type elimination | |
void parse_term_efq(Parser* parser, uint32_t loc) { | |
if (check_utf8(parser, 0xC2, 0xAC)) { | |
consume_utf8(parser, 2); | |
uint32_t efq_node = alloc_term(1); | |
parse_term(parser, efq_node); | |
store_term(loc, EFQ, 0, efq_node); | |
} else { | |
parse_error(parser, "Expected '¬' for empty type elimination"); | |
} | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a universe type | |
void parse_term_set(Parser* parser, uint32_t loc) { | |
expect(parser, "*", "for universe type"); | |
store_term(loc, SET, 0, 0); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a sigma type elimination | |
void parse_term_get(Parser* parser, uint32_t loc) { | |
expect(parser, "![", "for sigma elimination"); | |
char* name1 = parse_name(parser); | |
expect(parser, ",", "between names in sigma elimination"); | |
char* name2 = parse_name(parser); | |
expect(parser, "]", "after names in sigma elimination"); | |
expect(parser, "=", "after names in sigma elimination"); | |
uint32_t get_node = alloc_term(3); | |
uint32_t val_loc = get_node; | |
uint32_t bod_loc = get_node + 1; | |
parse_term(parser, val_loc); | |
expect(parser, ";", "after pair in sigma elimination"); | |
Term fst_var = make_term(VAR, 0, get_node + 2); | |
Term snd_var = make_term(VAR, 0, get_node + 2); | |
bind_var(parser, name1, fst_var); | |
bind_var(parser, name2, snd_var); | |
parse_term(parser, bod_loc); | |
store_term(loc, GET, 0, get_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a unit elimination | |
void parse_term_use(Parser* parser, uint32_t loc) { | |
expect(parser, "-", "for unit elimination"); | |
uint32_t use_node = alloc_term(2); | |
uint32_t val_loc = use_node; | |
uint32_t bod_loc = use_node + 1; | |
parse_term(parser, val_loc); | |
expect(parser, ";", "after unit value"); | |
parse_term(parser, bod_loc); | |
store_term(loc, USE, 0, use_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a reflexivity proof | |
void parse_term_rfl(Parser* parser, uint32_t loc) { | |
if (check_utf8(parser, 0xCE, 0xB8)) { | |
consume_utf8(parser, 2); | |
} else if (!consume(parser, "θ")) { | |
parse_error(parser, "Expected 'θ' for reflexivity"); | |
} | |
store_term(loc, RFL, 0, 0); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a let binding | |
void parse_term_let(Parser* parser, uint32_t loc) { | |
expect(parser, "!", "for let binding"); | |
char* name = parse_name(parser); | |
expect(parser, "=", "after name in let binding"); | |
uint32_t let_node = alloc_term(2); | |
uint32_t val_loc = let_node; | |
uint32_t bod_loc = let_node + 1; | |
bind_var(parser, name, make_term(VAR, 0, val_loc)); | |
parse_term(parser, val_loc); | |
expect(parser, ";", "after value in let binding"); | |
parse_term(parser, bod_loc); | |
store_term(loc, LET, 0, let_node); | |
} | |
#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a unit type | |
void parse_term_uni(Parser* parser, uint32_t loc) { | |
if (check_utf8_3bytes(parser, 0xE2, 0x8A, 0xA4)) { | |
consume_utf8(parser, 3); | |
store_term(loc, UNI, 0, 0); | |
} else { | |
parse_error(parser, "Expected '⊤' for unit type"); | |
} | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a bool type | |
void parse_term_bit(Parser* parser, uint32_t loc) { | |
if (check_utf8_4bytes(parser, 0xF0, 0x9D, 0x94, 0xB9)) { | |
consume_utf8(parser, 4); | |
} else if (!consume(parser, "𝔹")) { | |
parse_error(parser, "Expected '𝔹' for bool type"); | |
} | |
store_term(loc, BIT, 0, 0); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a tuple | |
void parse_term_tup(Parser* parser, uint32_t loc) { | |
expect(parser, "[", "for tuple"); | |
uint32_t tup_node = alloc_term(2); | |
uint32_t fst_loc = tup_node; | |
uint32_t snd_loc = tup_node + 1; | |
parse_term(parser, fst_loc); | |
expect(parser, ",", "between elements in tuple"); | |
parse_term(parser, snd_loc); | |
expect(parser, "]", "after elements in tuple"); | |
store_term(loc, TUP, 0, tup_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse an equality type | |
void parse_term_eql(Parser* parser, uint32_t loc) { | |
expect(parser, "<", "for equality type"); | |
uint32_t eql_node = alloc_term(2); | |
uint32_t lft_loc = eql_node; | |
uint32_t rgt_loc = eql_node + 1; | |
parse_term(parser, lft_loc); | |
expect(parser, "=", "between terms in equality type"); | |
parse_term(parser, rgt_loc); | |
expect(parser, ">", "after terms in equality type"); | |
store_term(loc, EQL, 0, eql_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
// Parse a variable | |
void parse_term_var(Parser* parser, uint32_t loc) { | |
char* name = parse_name(parser); | |
add_var_use(parser, name, loc); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a pi type | |
void parse_term_all(Parser* parser, uint32_t loc) { | |
if (check_utf8(parser, 0xCE, 0xA0)) { | |
consume_utf8(parser, 2); | |
} else if (!consume(parser, "Π")) { | |
parse_error(parser, "Expected 'Π' for pi type"); | |
} | |
char* name = parse_name(parser); | |
expect(parser, ":", "after name in pi type"); | |
uint32_t all_node = alloc_term(2); | |
uint32_t inp_loc = all_node; | |
uint32_t out_loc = all_node + 1; | |
parse_term(parser, inp_loc); | |
expect(parser, ".", "after input type in pi type"); | |
Term var_term = make_term(VAR, 0, inp_loc); | |
bind_var(parser, name, var_term); | |
parse_term(parser, out_loc); | |
store_term(loc, ALL, 0, all_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse an empty type | |
void parse_term_emp(Parser* parser, uint32_t loc) { | |
if (check_utf8_3bytes(parser, 0xE2, 0x8A, 0xA5)) { | |
consume_utf8(parser, 3); | |
store_term(loc, EMP, 0, 0); | |
} else { | |
parse_error(parser, "Expected '⊥' for empty type"); | |
} | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a superposition | |
void parse_term_sup(Parser* parser, uint32_t loc) { | |
expect(parser, "&", "for superposition"); | |
uint8_t label = parse_uint(parser) & 3; // Ensure it fits in 2 bits | |
expect(parser, "{", "after label in superposition"); | |
uint32_t sup_node = alloc_term(2); | |
uint32_t lft_loc = sup_node; | |
uint32_t rgt_loc = sup_node + 1; | |
parse_term(parser, lft_loc); | |
expect(parser, ",", "between terms in superposition"); | |
parse_term(parser, rgt_loc); | |
expect(parser, "}", "after terms in superposition"); | |
store_term(loc, SUP, label, sup_node); | |
}#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse a true value | |
void parse_term_bt1(Parser* parser, uint32_t loc) { | |
expect(parser, "1", "for true value"); | |
store_term(loc, BT1, 0, 0); | |
}#include <ctype.h> | |
#include <stddef.h> | |
#include "../../parse.h" | |
#include "../../memory.h" | |
// Parse an application | |
void parse_term_app(Parser* parser, uint32_t loc) { | |
expect(parser, "(", "for application"); | |
uint32_t app_node = alloc_term(2); | |
uint32_t fun_loc = app_node; | |
uint32_t arg_loc = app_node + 1; | |
parse_term(parser, fun_loc); | |
if (!isspace(parser->input[parser->pos])) { | |
parse_error(parser, "Expected whitespace between function and argument in application"); | |
} | |
skip(parser); | |
parse_term(parser, arg_loc); | |
expect(parser, ")", "after application"); | |
store_term(loc, APP, 0, app_node); | |
}#include <stdio.h> | |
#include <stdlib.h> | |
#include <string.h> | |
#include "types.h" | |
#include "memory.h" | |
#include "parse.h" | |
#include "whnf.h" | |
#include "normal.h" | |
#include "show.h" | |
// Default test term string | |
const char* DEFAULT_TEST_TERM = "((λf.λx.!&0{f0,f1}=f;(f0 (f1 x)) λB.λT.λF.((B F) T)) λa.λb.a)"; | |
// Run a term through normalization and print results | |
void process_term(Term term) { | |
printf("Original term:\n"); | |
show_term(stdout, term); | |
printf("\n\n"); | |
// Reset interaction counter | |
interaction_count = 0; | |
// Normalize the term | |
term = normal(term); | |
printf("Normal form:\n"); | |
show_term(stdout, term); | |
printf("\n\n"); | |
printf("Total interactions: %llu\n", interaction_count); | |
} | |
// Test function with the default term | |
void test() { | |
printf("Running with default test term: %s\n", DEFAULT_TEST_TERM); | |
// Parse the term | |
Term term = parse_string(DEFAULT_TEST_TERM); | |
// Process the term | |
process_term(term); | |
} | |
// Parse a file and return the term | |
Term parse_file(const char* filename) { | |
FILE* file = fopen(filename, "r"); | |
if (!file) { | |
fprintf(stderr, "Error: Could not open file '%s'\n", filename); | |
exit(1); | |
} | |
// Get file size | |
fseek(file, 0, SEEK_END); | |
long size = ftell(file); | |
fseek(file, 0, SEEK_SET); | |
// Allocate buffer | |
char* buffer = (char*)malloc(size + 1); | |
if (!buffer) { | |
fprintf(stderr, "Error: Memory allocation failed\n"); | |
fclose(file); | |
exit(1); | |
} | |
// Read file contents | |
size_t read_size = fread(buffer, 1, size, file); | |
fclose(file); | |
buffer[read_size] = '\0'; | |
// Parse the string | |
Term term = parse_string(buffer); | |
// Free the buffer | |
free(buffer); | |
return term; | |
} | |
void print_usage() { | |
printf("Usage: suptt <command> [arguments]\n\n"); | |
printf("Commands:\n"); | |
printf(" run <file> - Parse and normalize a SupTT file\n"); | |
printf(" eval <expr> - Parse and normalize a SupTT expression\n"); | |
printf("\n"); | |
} | |
int main(int argc, char* argv[]) { | |
// Initialize memory | |
init_memory(); | |
// Check if no arguments provided | |
if (argc < 2) { | |
print_usage(); | |
return 1; | |
} | |
// Get command | |
const char* command = argv[1]; | |
// Handle commands | |
if (strcmp(command, "run") == 0) { | |
// Check if filename is provided | |
if (argc < 3) { | |
fprintf(stderr, "Error: No file specified\n"); | |
print_usage(); | |
return 1; | |
} | |
// Parse and process the file | |
const char* filename = argv[2]; | |
Term term = parse_file(filename); | |
process_term(term); | |
} else if (strcmp(command, "eval") == 0) { | |
// Check if expression is provided | |
if (argc < 3) { | |
fprintf(stderr, "Error: No expression specified\n"); | |
print_usage(); | |
return 1; | |
} | |
// Parse and process the expression | |
const char* expression = argv[2]; | |
Term term = parse_string(expression); | |
process_term(term); | |
} else { | |
fprintf(stderr, "Error: Unknown command '%s'\n", command); | |
print_usage(); | |
return 1; | |
} | |
return 0; | |
} | |
#ifndef SHOW_H | |
#define SHOW_H | |
#include <stdio.h> | |
#include "types.h" | |
// Convert a term to its string representation | |
// The returned string is dynamically allocated and must be freed by the caller | |
char* term_to_string(Term term); | |
// Display a term to the specified output stream | |
void show_term(FILE* stream, Term term); | |
#endif // SHOW_H | |
#ifndef NORMAL_H | |
#define NORMAL_H | |
#include "types.h" | |
// Reduce a term to full normal form | |
Term normal(Term term); | |
#endif // NORMAL_H#ifndef MEMORY_H | |
#define MEMORY_H | |
#include "types.h" | |
// The heap (global memory buffer) | |
extern Term heap[HEAP_SIZE]; | |
// The current allocation pointer | |
extern uint32_t heap_ptr; | |
// Initialize the memory system | |
void init_memory(); | |
// Allocate n consecutive terms in memory | |
uint32_t alloc(uint32_t n); | |
// Create a term with the given tag and value | |
Term make_term(TermTag tag, uint8_t lab, uint32_t val); | |
// Create a substitution term | |
Term make_sub(Term term); | |
// Remove the substitution bit from a term | |
Term clear_sub(Term term); | |
// Check if a term has a substitution | |
bool has_sub(Term term); | |
#endif // MEMORY_H | |
#include "memory.h" | |
#include "show.h" | |
#include "whnf.h" | |
#include <stdio.h> | |
// Global interaction counter | |
uint64_t interaction_count = 0; | |
// Reduce a term to weak head normal form | |
Term whnf(Term term) { | |
while (1) { | |
printf("%d\n", interaction_count); | |
if (interaction_count >= 1) return term; | |
TermTag tag = TERM_TAG(term); | |
switch (tag) { | |
case VAR: { | |
uint32_t var_loc = TERM_VAL(term); | |
Term subst = heap[var_loc]; | |
if (TERM_SUB(subst)) { | |
term = clear_sub(subst); | |
continue; | |
} | |
return term; | |
} | |
case LET: { | |
term = let_red(term); | |
continue; | |
} | |
case CO0: | |
case CO1: { | |
uint32_t col_loc = TERM_VAL(term); | |
Term val = heap[col_loc]; | |
if (TERM_SUB(val)) { | |
term = clear_sub(val); | |
continue; | |
} | |
val = whnf(val); | |
heap[col_loc] = val; | |
switch (TERM_TAG(val)) { | |
case LAM: term = col_lam(term, val); continue; | |
case SUP: term = col_sup(term, val); continue; | |
case NIL: term = col_nil(term, val); continue; | |
case BT0: term = col_b_0(term, val); continue; | |
case BT1: term = col_b_1(term, val); continue; | |
case TUP: term = col_tup(term, val); continue; | |
default: return term; | |
} | |
} | |
case APP: { | |
uint32_t app_loc = TERM_VAL(term); | |
Term fun = whnf(heap[app_loc]); | |
heap[app_loc] = fun; | |
if (interaction_count >= 1) return term; | |
switch (TERM_TAG(fun)) { | |
case LAM: term = app_lam(term, fun); continue; | |
case SUP: term = app_sup(term, fun); continue; | |
default: return term; | |
} | |
} | |
case USE: { | |
uint32_t use_loc = TERM_VAL(term); | |
Term val = whnf(heap[use_loc]); | |
heap[use_loc] = val; | |
switch (TERM_TAG(val)) { | |
case NIL: term = use_nil(term, val); continue; | |
case SUP: term = use_sup(term, val); continue; | |
default: return term; | |
} | |
} | |
case ITE: { | |
uint32_t ite_loc = TERM_VAL(term); | |
Term cond = whnf(heap[ite_loc]); | |
heap[ite_loc] = cond; | |
switch (TERM_TAG(cond)) { | |
case BT0: term = ite_b_0(term, cond); continue; | |
case BT1: term = ite_b_1(term, cond); continue; | |
case SUP: term = ite_sup(term, cond); continue; | |
default: return term; | |
} | |
} | |
case GET: { | |
uint32_t get_loc = TERM_VAL(term); | |
Term val = whnf(heap[get_loc + 2]); | |
heap[get_loc + 2] = val; | |
switch (TERM_TAG(val)) { | |
case TUP: term = get_tup(term, val); continue; | |
case SUP: term = get_sup(term, val); continue; | |
default: return term; | |
} | |
} | |
case RWT: { | |
uint32_t rwt_loc = TERM_VAL(term); | |
Term val = whnf(heap[rwt_loc]); | |
heap[rwt_loc] = val; | |
switch (TERM_TAG(val)) { | |
case RFL: term = rwt_rfl(term, val); continue; | |
case SUP: term = rwt_sup(term, val); continue; | |
default: return term; | |
} | |
} | |
default: | |
return term; | |
} | |
} | |
return term; | |
} | |
#include <stdio.h> | |
#include <stdlib.h> | |
#include <string.h> | |
#include <ctype.h> | |
#include "parse.h" | |
#include "memory.h" | |
// Forward declarations for internal functions | |
uint32_t parse_term_alloc(Parser* parser); | |
void parse_term(Parser* parser, uint32_t loc); | |
void skip(Parser* parser); | |
char peek_char(Parser* parser); | |
char next_char(Parser* parser); | |
bool peek_is(Parser* parser, char c); | |
void parse_error(Parser* parser, const char* message); | |
Term* lookup_var_binding(Parser* parser, const char* name); | |
// Add a variable binding | |
void bind_var(Parser* parser, const char* name, Term term) { | |
if (parser->vrs_count >= MAX_VARS) { | |
parse_error(parser, "Too many variable bindings"); | |
} | |
strncpy(parser->vrs[parser->vrs_count].name, name, MAX_NAME_LEN - 1); | |
parser->vrs[parser->vrs_count].name[MAX_NAME_LEN - 1] = '\0'; | |
parser->vrs[parser->vrs_count].term = term; | |
parser->vrs_count++; | |
} | |
// Track a variable use to be resolved later | |
void add_var_use(Parser* parser, const char* name, uint32_t loc) { | |
if (parser->lcs_count >= MAX_USES) { | |
parse_error(parser, "Too many variable uses"); | |
} | |
strncpy(parser->lcs[parser->lcs_count].name, name, MAX_NAME_LEN - 1); | |
parser->lcs[parser->lcs_count].name[MAX_NAME_LEN - 1] = '\0'; | |
parser->lcs[parser->lcs_count].loc = loc; | |
parser->lcs_count++; | |
} | |
// Allocate memory for a term of size n | |
uint32_t alloc_term(uint32_t n) { | |
return alloc(n); | |
} | |
// Check if the next bytes match the specified UTF-8 bytes | |
bool check_utf8(Parser* parser, uint8_t b1, uint8_t b2) { | |
return (unsigned char)parser->input[parser->pos] == b1 && | |
(unsigned char)parser->input[parser->pos + 1] == b2; | |
} | |
// Check if the next bytes match the specified 3-byte UTF-8 sequence | |
bool check_utf8_3bytes(Parser* parser, uint8_t b1, uint8_t b2, uint8_t b3) { | |
return (unsigned char)parser->input[parser->pos] == b1 && | |
(unsigned char)parser->input[parser->pos + 1] == b2 && | |
(unsigned char)parser->input[parser->pos + 2] == b3; | |
} | |
// Check if the next bytes match the specified 4-byte UTF-8 sequence | |
bool check_utf8_4bytes(Parser* parser, uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4) { | |
return (unsigned char)parser->input[parser->pos] == b1 && | |
(unsigned char)parser->input[parser->pos + 1] == b2 && | |
(unsigned char)parser->input[parser->pos + 2] == b3 && | |
(unsigned char)parser->input[parser->pos + 3] == b4; | |
} | |
// Consume the next n bytes of UTF-8 character | |
void consume_utf8(Parser* parser, int bytes) { | |
for (int i = 0; i < bytes; i++) { | |
next_char(parser); | |
} | |
} | |
// Try to consume a string, returning whether it matched | |
bool consume(Parser* parser, const char* str) { | |
size_t len = strlen(str); | |
skip(parser); | |
if (strncmp(parser->input + parser->pos, str, len) == 0) { | |
for (size_t i = 0; i < len; i++) { | |
next_char(parser); | |
} | |
return true; | |
} | |
return false; | |
} | |
// Report a parsing error | |
void parse_error(Parser* parser, const char* message) { | |
fprintf(stderr, "Parse error at line %zu, column %zu: %s\n", | |
parser->line, parser->col, message); | |
fprintf(stderr, "Input: %s\n", parser->input); | |
// Create a pointer to the error location | |
fprintf(stderr, " "); | |
for (size_t i = 0; i < parser->pos && i < 40; i++) { | |
fprintf(stderr, " "); | |
} | |
fprintf(stderr, "^\n"); | |
exit(1); | |
} | |
// Expect a specific token, with an error if not found | |
bool expect(Parser* parser, const char* token, const char* error_context) { | |
if (!consume(parser, token)) { | |
char error[256]; | |
snprintf(error, sizeof(error), "Expected '%s' %s", token, error_context); | |
parse_error(parser, error); | |
return false; | |
} | |
return true; | |
} | |
// Initialize a parser with the given input string | |
void init_parser(Parser* parser, const char* input) { | |
parser->input = input; | |
parser->pos = 0; | |
parser->line = 1; | |
parser->col = 1; | |
parser->lcs_count = 0; | |
parser->vrs_count = 0; | |
} | |
// Look up a variable binding by name | |
Term* lookup_var_binding(Parser* parser, const char* name) { | |
for (size_t i = 0; i < parser->vrs_count; i++) { | |
if (strcmp(parser->vrs[i].name, name) == 0) { | |
return &parser->vrs[i].term; | |
} | |
} | |
return NULL; | |
} | |
// Parse an identifier | |
char* parse_name(Parser* parser) { | |
static char name[MAX_NAME_LEN]; | |
size_t i = 0; | |
if (!isalpha(peek_char(parser)) && peek_char(parser) != '_') { | |
parse_error(parser, "Expected name starting with letter or underscore"); | |
} | |
while (isalnum(peek_char(parser)) || peek_char(parser) == '_') { | |
if (i < MAX_NAME_LEN - 1) { | |
name[i++] = next_char(parser); | |
} else { | |
parse_error(parser, "Name too long"); | |
} | |
} | |
name[i] = '\0'; | |
return name; | |
} | |
// Get the next character and advance position | |
char next_char(Parser* parser) { | |
char c = parser->input[parser->pos++]; | |
if (c == '\n') { | |
parser->line++; | |
parser->col = 1; | |
} else { | |
parser->col++; | |
} | |
return c; | |
} | |
// Peek at the next character without advancing | |
char peek_char(Parser* parser) { | |
return parser->input[parser->pos]; | |
} | |
// Check if the next character matches the specified one | |
bool peek_is(Parser* parser, char c) { | |
return peek_char(parser) == c; | |
} | |
// Resolve all variable uses to their bindings | |
void resolve_var_uses(Parser* parser) { | |
for (size_t i = 0; i < parser->lcs_count; i++) { | |
Term* binding = lookup_var_binding(parser, parser->lcs[i].name); | |
if (binding == NULL) { | |
char error[MAX_NAME_LEN + 50]; | |
snprintf(error, sizeof(error), "Undefined variable: %s", parser->lcs[i].name); | |
parse_error(parser, error); | |
} | |
heap[parser->lcs[i].loc] = *binding; | |
} | |
} | |
// Store a term at the given location | |
void store_term(uint32_t loc, TermTag tag, uint8_t label, uint32_t value) { | |
heap[loc] = make_term(tag, label, value); | |
} | |
// Parse a string into a term | |
Term parse_string(const char* input) { | |
Parser parser; | |
init_parser(&parser, input); | |
uint32_t term_loc = parse_term_alloc(&parser); | |
// Resolve variable references | |
resolve_var_uses(&parser); | |
return heap[term_loc]; | |
} | |
// Allocate space for a term and parse into it | |
uint32_t parse_term_alloc(Parser* parser) { | |
uint32_t loc = alloc(1); | |
parse_term(parser, loc); | |
return loc; | |
} | |
// Parse an unsigned integer | |
uint32_t parse_uint(Parser* parser) { | |
uint32_t value = 0; | |
bool has_digit = false; | |
while (isdigit(peek_char(parser))) { | |
value = value * 10 + (next_char(parser) - '0'); | |
has_digit = true; | |
} | |
if (!has_digit) { | |
parse_error(parser, "Expected digit"); | |
} | |
return value; | |
} | |
// Skip over whitespace and comments | |
void skip(Parser* parser) { | |
while (1) { | |
char c = peek_char(parser); | |
if (isspace(c)) { | |
next_char(parser); | |
} else if (c == '/' && parser->input[parser->pos + 1] == '/') { | |
// Skip line comment | |
next_char(parser); // Skip '/' | |
next_char(parser); // Skip '/' | |
while (peek_char(parser) != '\0' && peek_char(parser) != '\n') { | |
next_char(parser); | |
} | |
if (peek_char(parser) == '\n') { | |
next_char(parser); | |
} | |
} else { | |
break; | |
} | |
} | |
} | |
------------------- | |
PROBLEM: for some reason, the following term isn't reducing (or stringifying) properly: | |
Original term: | |
! &0{a3,b4} = λx2.(a5 (b6 x2)); | |
! &0{a5,b6} = x0; | |
λx0.λx1.(a3 (b4 x1)) | |
0 | |
0 | |
0 | |
0 | |
0 | |
col_lam | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
Normal form: | |
! &0{a3,b4} = (a5 (b6 ?)); | |
! &0{a5,b6} = x0; | |
! &0{a7,b8} = b4; | |
λx0.λx1.(λx2.a3 (b8 x1)) | |
notice how it is showing a '?' instead of the actual value. troubleshoot the | |
issue. if you manage to figure out the problem, write it down below. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment