Skip to content

Instantly share code, notes, and snippets.

@VictorTaelin
Created November 18, 2025 18:26
Show Gist options
  • Select an option

  • Save VictorTaelin/a116dcb12966a5b1db156ce0f379f499 to your computer and use it in GitHub Desktop.

Select an option

Save VictorTaelin/a116dcb12966a5b1db156ce0f379f499 to your computer and use it in GitHub Desktop.
Prompt: debug WNF_SPOS issue on HVM4 / SupGen
=== docs/HVM.md ===
# HVM
HVM is a fast C implementation of the Interaction Calculus.
It is a pure functional runtime like Haskell's GHC, with some peculiarities:
- Variables are affine, meaning they can only occur, at most, once.
- To copy values, linear dups are used, where 2 vars refer to the same value.
- Accessing a dup triggers an incremental, layer-by-layer copy of the value.
- Lambdas can also be cloned incrementally by linear dups.
- As a side-effect, lambda-bound vars can escape (occur outside of the body).
## Language
Terms are core expressions, defined by the grammar:
```
Term ::=
# Variables
| VAR ::= Nick
| DP0 ::= Nick "₀"
| DP1 ::= Nick "₁"
# Universe
| SET ::= "*"
# Lambdas
| ALL ::= "∀" Term "." Term
| LAM ::= "λ" Nick "." Term
| APP ::= "(" Term " " Term ")"
# Superpositions
| ERA ::= "&{}"
| SUP ::= "&" Label "{" Term "," Term "}"
# Booleans
| BIT ::= "𝔹"
| BT0 ::= "#0"
| BT1 ::= "#1"
| BITM ::= "λ" "{" "#0" ":" Term ";" "#1" ":" Term ";" "}"
# Naturals
| NAT ::= "ℕ"
| ZER ::= "0n"
| SUC ::= "1n" "+" Term
| INC ::= "↑" Term
| NATM ::= "λ" "{" "0n" ":" Term ";" "1n+" ":" Term ";" "}"
# Lists
| LST ::= Term "[]"
| NIL ::= "[]"
| CON ::= Term "<>" Term
| LSTM ::= "λ" "{" "[]" ":" Term ";" "<>" ":" Term ";" "}"
# Function
| REF ::= "@" Nick
| TYP ::= "@:" Nick
```
Dups are shared values stored on the heap, written as:
```
Dups ::= "!" Nick "&" Label "=" Term
```
## Evaluation
```
wnf : Term → Term
wnf (f x) = app f x
wnf k₀ = dp0 label-of(k₀) k₀ twin-of(k₀)
wnf k₁ = dp1 label-of(k₁) twin-of(k₁) k₁
wnf @F = ref F
wnf @:F = ref-type F
wnf x = x
app : Term → Term → Term
app (wnf → &{}) v = era v
app (wnf → &L{x,y}) v = !V&L = v; &L{(x V₀),(y V₁)}
app (wnf → λx.f) v = x ← v; f
app (wnf → λ{#0:f;#1:t}) v = bitm f t v
app (wnf → λ{0n:z;1n+:s}) v = natm z s v
app (wnf → λ{[]:n;<>:c}) v = lstm n c v
app (wnf → ↑f) v = ↑(f v)
app (wnf → f) v = (f v)
dp0 : Label → Term → Term → Term
dp0 L k₀@(wnf → &{}) k₁ = k₁ ← &{}; &{}
dp0 L k₀@(wnf → &L{x,y}) k₁ = k₁ ← y; x
dp0 L k₀@(wnf → &M{x,y}) k₁ = !X&L = &M{x0,x1}; !Y&L = &M{y0,y1}; k₁ ← &M{X₁,Y₁}; &M{X₀,Y₀}
dp0 L k₀@(wnf → λx.f) k₁ = x ← &L{x0,x1}; !F&L = f; k₁ ← λx1.F₁; λx0.F₀
dp0 L k₀@(wnf → (f x)) k₁ = !F&L = f; !X&L = x; k₁ ← (F₁ X₁); (F₀ X₀)
...
dp1 : Label → Term → Term → Term
dp1 L k₀ k₁@(wnf → &{}) = k₀ ← &{}; &{}
dp1 L k₀ k₁@(wnf → &L{x,y}) = k₀ ← x; y
dp1 L k₀ k₁@(wnf → &M{x,y}) = !X&L = &M{x0,x1}; !Y&L = &M{y0,y1}; k₀ ← &M{X₀,Y₀}; &M{X₁,Y₁}
dp1 L k₀ k₁@(wnf → λx.f) = x ← &L{x0,x1}; !F&L = f; k₀ ← λx0.F₀; λx1.F₁
dp1 L k₀ k₁@(wnf → (f x)) = !F&L = f; !X&L = x; k₀ ← (F₀ X₀); (F₁ X₁)
...
ref : Nick → Term
ref F = body-of(F)
```
## Memory
The global memory (heap) is an array of Term pointers.
### Term
Term pointers are represented in memory as a 64-bit union:
```
TERM ::= SUB (1-bit) & TAG (7-bit) & LAB (24-bit) & VAL (32-bit)
```
Where:
- SUB = is-subst bit
- TAG = term variant
- LAB = nick or label (4 letters)
- VAL = node address (32-bit)
### Defs
Global definitions are stored in a global structure named 'BOOK' (size 2^24),
which maps names to a pair of heap locations: a type and a term. Concretely:
- BOOK[F].type = anchor of the (optional) type term of @F, or 0 when absent
- BOOK[F].term = anchor of the closed body term of @F
Top-level definitions can optionally include a type annotation:
```
@Nick : Type = Term ;
@Nick = Term ; -- type omitted
```
When a @ref is reduced, we expand its body from BOOK[F].term.
## Notes
- VARs point to their binding lambdas (LAM or ABS).
- DP0 and DP1 point to the dup (i.e., the shared value).
- To save space, DP0 and DP1 store the dup's label.
=== docs/linearity.md ===
# On Memory Safety and Linearity Bugs
Since HVM implements a linear graph rewrite runtime, it relies on Term pointer
being unique - i.e., the same Term can't be stored twice on memory. Because of
that, one of the main sources of bugs in this repository is accidentally using a
Term pointer more than once.
## BAD EXAMPLE 3 / ABOUT FORKs
Mixing up "Cloner Dups" and "Fork Dups".
In HVM, a powerful way to use Dups is to "fork" the current execution into
two diverging branches. We exploit that on the enumerator. For example:
... Term term_gen_expr(Term L, Term D, Term K, Term T, Term lhs, Term fol, Term rec, Term lib, Term ctx) {
...
case UVA: {
Lab lab = (Lab)L_val;
Dups D_ = new_dps(lab, D);
Dups T_ = new_dps(lab, T);
Dups LHS = new_dps(lab, lhs);
Dups FOL = new_dps(lab, fol);
Dups REC = new_dps(lab, rec);
Dups LIB_= new_dps(lab, lib);
Dups CTX_= new_dps(lab, ctx);
Term S0 = call9(Pri("GP"), Uva(L_val), D_.dp0, Uva(kv), T_.dp0, LHS.dp0, FOL.dp0, REC.dp0, LIB_.dp0, CTX_.dp0);
Term S1 = call9(Pri("GI"), Uva(L_val), D_.dp1, Uva(kp), T_.dp1, LHS.dp1, FOL.dp1, REC.dp1, LIB_.dp1, CTX_.dp1);
return new_sup(lab, S0, S1);
}
...
In the code above, we Dup:
- Dup every var in context using the same label L.
- Build two continuations: S0 (calls "TGP") | S1 (calls "TGI").
- Use only var[.dp0|.dp1] inside [S0|S1] respectivelly.
- Return a superposition of S0 and S1 with label L.
This is a fork!
=== src/term/_main_.h ===
// term.h — Core Term type, tags, book, and helpers
...
// Term Tags
#define VAR 0x00 // x
#define DEP 0x01 // .fn (free variable)
#define DRY 0x02 // .(f x)
#define ALL 0x03 // ∀x.y
#define LAM 0x04 // λx.f
#define APP 0x05 // (f x)
#define ERA 0x06 // &{}
#define SUP 0x07 // &L{x,y}
#define DUP 0x08 // !F&L=v;t
#define DP0 0x09 // x₀
#define DP1 0x0A // x₁
#define BIT 0x0B // 𝔹
#define BT0 0x0C // #0
#define BT1 0x0D // #1
#define NAT 0x0E // ℕ
#define ZER 0x0F // 0n
#define SUC 0x10 // 1n+x
#define LST 0x11 // x[]
#define NIL 0x12 // []
#define CON 0x13 // x<>y
#define SIG 0x14 // Σx.y
#define TUP 0x15 // #(x,y)
#define REF 0x16 // @F
#define TYP 0x17 // @:F (type ref)
...
#define BITM 0x1B // λ{#0:f;#1:t}
#define NATM 0x1C // λ{0n:z;1n+:s}
#define LSTM 0x1D // λ{[]:n;<>:c}
...
#define SET 0x20 // *
...
#define PRI 0x25 // /name
#define FRZ 0x26 // Frz{t}
#define INC 0x27 // ↑x
...
#define GEN 0x2B // ?N ctx : typ
...
// Book Terms
#define BKL 0x41 // Book LAM
#define BKD 0x42 // Book DUP
#define BKV 0x43 // Book VAR
#define BK0 0x44 // Book DP0
#define BK1 0x45 // Book DP1
...
// Global book of definitions: maps nick -> { type, term } heap locations
typedef struct {
Val term; // anchor of body term
Val type; // anchor of type term (optional; 0 when absent)
char* name; // definition name (owned by BOOK)
} Defn;
static Defn BOOK[BOOK_SIZE];
...
=== src/term/new.h ===
...
#define FN(var, BODY) ({ \
Val __loc = heap_alloc(1); \
Nick __nik = fresh_nick(); \
Term var = new_var(__nik, __loc); \
Term __lam = new_term(false, LAM, __nik, __loc); \
set_at(__loc + 0, (BODY)); \
__lam; \
})
#define FN_(name_cstr, var, BODY) ({ \
Val __loc = heap_alloc(1); \
Nick __nik = nick((name_cstr)); \
Term var = new_var(__nik, __loc); \
Term __lam = new_term(false, LAM, __nik, __loc); \
set_at(__loc + 0, (BODY)); \
__lam; \
})
...
static inline Term new_all(Term x, Term y) {
return new_node_2(ALL, 0, x, y);
}
...
static inline Term new_app(Term f, Term x) {
return new_node_2(APP, 0, f, x);
}
...
static inline Term new_dp0(Lab L, Val loc) {
return new_term(false, DP0, L, loc);
}
static inline Term new_dp1(Lab L, Val loc) {
return new_term(false, DP1, L, loc);
}
...
static inline Term new_sup(Lab L, Term a, Term b) {
return new_node_2(SUP, L, a, b);
}
...
static inline Term new_gen(Nick n, Term ctx, Term typ, Term seed) {
return new_node_3(GEN, n, ctx, typ, seed);
}
...
static inline Term All(Term x, Term y) {
return new_all(x, y);
}
...
static inline Term App(Term f, Term x) {
return new_app(f, x);
}
...
static inline Term Ref(const char* F) {
Lab id = register_def_name(F);
return new_ref(id);
}
...
static inline Term Pri(const char* F) {
return new_pri(nick(F));
}
...
static inline Dups new_dps(Lab L, Term v) {
Val X = anchor(v);
Dups d = { .dp0 = new_dp0(L, X), .dp1 = new_dp1(L, X) };
return d;
}
...
static inline Dups Dps(const char* L, Term v) {
return new_dps(nick(L), v);
}
...
static inline Term call1(Term f, Term a) {
return App(f, a);
}
static inline Term call2(Term f, Term a, Term b) {
return App(App(f, a), b);
}
static inline Term call3(Term f, Term a, Term b, Term c) {
return App(App(App(f, a), b), c);
}
...
static inline Term call9(Term f, Term a1, Term a2, Term a3, Term a4, Term a5, Term a6, Term a7, Term a8, Term a9) {
return App(App(App(App(App(App(App(App(App(f, a1), a2), a3), a4), a5), a6), a7), a8), a9);
}
...
// Inlines `f(x)`
static inline Term new_app_inline(Term f, Term x) {
Tag f_tag = term_tag(f);
Lab f_lab = term_lab(f);
Val f_val = term_val(f);
switch (f_tag) {
case LAM: {
WNF_ITRS[APP_LAM]++;
Val lam = term_val(f);
Term bod = take_at(lam + 0);
set_at(lam + 0, as_sub(x));
return bod;
}
default: {
return App(f, x);
}
}
}
...
=== src/prim/_MACROS_.h ===
...
// Match helper
// ------------
// New form auto-injects ERA/SUP/INC cases to remove boilerplate.
// Also reduces the scrutinee to WNF and exposes the conventional
// locals: `scrutinee##_wnf`, `scrutinee##_tag`, `scrutinee##_lab`,
// and `scrutinee##_val`.
...
#define MATCH(scrutinee, CTX, RECUR, BODY) \
Term scrutinee##_wnf = wnf(scrutinee); \
Tag scrutinee##_tag = term_tag(scrutinee##_wnf); \
Lab scrutinee##_lab = term_lab(scrutinee##_wnf); \
Val scrutinee##_val = term_val(scrutinee##_wnf); \
switch (scrutinee##_tag) { \
case ERA: { ERA_CASE(); } \
case SUP: { SUP_CASE(scrutinee, (RECUR), PRIM_DETAIL_EXPAND CTX); } \
case INC: { INC_CASE(scrutinee, (RECUR)); } \
BODY \
}
...
#define ERA_CASE() return Era()
#define INC_CASE(scrutinee, expr) \
Term scrutinee##_in = take_at(scrutinee##_val + 0); \
heap_free(scrutinee##_val, 1); \
Term S_case_; { \
Term scrutinee = scrutinee##_in; \
S_case_ = (expr); \
} \
return new_inc(S_case_)
...
#define SUP_CASE(scrutinee, expr, ...) \
Term scrutinee##_0 = take_at(scrutinee##_val + 0); \
Term scrutinee##_1 = take_at(scrutinee##_val + 1); \
heap_free(scrutinee##_val, 2); \
PRIM_DETAIL_DUP_LIST(scrutinee##_lab, ##__VA_ARGS__); \
Term S0_case_; { \
Term scrutinee = scrutinee##_0; \
PRIM_DETAIL_BIND_LIST(0, ##__VA_ARGS__); \
S0_case_ = (expr); \
} \
Term S1_case_; { \
Term scrutinee = scrutinee##_1; \
PRIM_DETAIL_BIND_LIST(1, ##__VA_ARGS__); \
S1_case_ = (expr); \
} \
return new_sup(scrutinee##_lab, S0_case_, S1_case_)
...
// Nick-based call helpers
...
#define CALL(nick, ...) CALL_DETAIL_SELECT(nick, __VA_ARGS__)
#define RECALL(func, ...) func(__VA_ARGS__)
...
=== src/prim/when.h ===
...
static inline Term when(Term k, Term v) {
MATCH(k, (v), CALL(W, k, v), {
case BT1: {
return v;
}
case BT0: {
return Era();
}
default: {
printf("Error: ill-typed call to 'when':\n");
print_term_ln(normal_seq(k_wnf));
abort();
}
});
}
static inline Term WHEN(Term* args) {
return when(args[0], args[1]);
}
...
=== src/prim/gen_body.h ===
...
static inline Term gen_body(Seed* seed, Term L, Term D, Term K, Term T, Term spn, Term fol, Term rec, Term lib, Term ctx) {
return CALL(GE, L, D, Uva(MAX_INTR), T, spn, fol, rec, lib, ctx); // gen_expr
...
}
// Wrapper for PRIM registry
static inline Term GEN_BODY(Term* args) {
Seed S = seed_init();
return gen_body(&S, args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8]);
}
...
=== src/prim/gen_call.h ===
...
static inline Term gen_call(Term L, Term D, Term K, Term T, Term spn, Term fol, Term rec, Term lib, Term ctx, Term tm, Term ty) {
MATCH(ty, (L, D, K, T, spn, fol, rec, lib, ctx, tm), CALL(GC, L, D, K, T, spn, fol, rec, lib, ctx, tm, ty), {
case ALL: {
...
}
default: {
...
}
});
}
...
static inline Term GEN_CALL(Term* args) {
return gen_call(args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8], args[9], args[10]);
}
...
=== src/prim/gen_expr.h ===
...
// gen_intr(+l, D, +k, T, spn, fol, rec, lib, ctx)
static inline Term gen_expr(Term L, Term D, Term K, Term T, Term spn, Term fol, Term rec, Term lib, Term ctx) {
MATCH(K, (L, D, T, spn, fol, rec, lib, ctx), CALL(GE, L, D, K, T, spn, fol, rec, lib, ctx), {
case UVA: {
...
MATCH(L, (D, T, spn, fol, rec, lib, ctx), CALL(GE, L, D, Uva(kv), T, spn, fol, rec, lib, ctx), {
...
});
}
}
default: {
abort();
}
});
}
static inline Term GEN_EXPR(Term* args) {
return gen_expr(args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8]);
}
...
=== src/prim/gen_lam.h ===
...
static inline Term gen_lam(Seed* seed, Term L, Term D, Term K, Term T, Term spn, Term fol, Term rec, Term lib, Term ctx) {
// match T:
MATCH(T, (L, D, K, spn, fol, rec, lib, ctx), RECALL(gen_lam, seed, L, D, K, T, spn, fol, rec, lib, ctx), {
case FRZ: {
...
}
case ALL: {
...
}
default: {
...
}
});
}
static inline Term GEN_LAM(Term* args) {
Seed S = seed_init();
return gen_lam(&S, args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8]);
}
...
=== src/prim/equal.h ===
...
static inline Term equal(Term D, Term a, Term b) {
Term a_wnf = thaw(a);
Tag a_tag = term_tag(a_wnf);
Lab a_lab = term_lab(a_wnf);
Val a_val = term_val(a_wnf);
switch (a_tag) {
case ERA: { ERA_CASE(); }
case SUP: { SUP_CASE(a, CALL(E, D, a, b), D, b); }
case INC: { INC_CASE(a, CALL(E, D, a, b)); }
default: {
break;
}
}
Term b_wnf = thaw(b);
Tag b_tag = term_tag(b_wnf);
Lab b_lab = term_lab(b_wnf);
Val b_val = term_val(b_wnf);
switch (b_tag) {
case ERA: { ERA_CASE(); }
case SUP: { SUP_CASE(b, CALL(E, D, a_wnf, b), D, a_wnf); }
case INC: { INC_CASE(b, CALL(E, D, a_wnf, b)); }
default: {
break;
}
}
if (term_sub(a_wnf) && term_sub(b_wnf)) {
abort();
}
if (a_tag == VAR && b_tag == VAR) {
abort();
}
if (a_tag == NUL || b_tag == NUL) {
return Bt1();
}
switch (a_tag) {
...
case ALL: {
if (b_tag == ALL) {
Term a_a = take_at(a_val + 0);
Term a_b = take_at(a_val + 1);
Term b_a = take_at(b_val + 0);
Term b_b = take_at(b_val + 1);
heap_free(a_val, 2);
heap_free(b_val, 2);
Dups D_ = Dps("D", D);
Term e_a = call3(Pri("E"), D_.dp0, a_a, b_a); // equal
Term e_b = call3(Pri("E"), D_.dp1, a_b, b_b); // equal
return call2(Pri("BA"), e_a, e_b); // bool_and
}
heap_free(a_val, 2);
return Bt0();
}
...
default: {
return Bt0();
}
}
}
static inline Term EQUAL(Term* args) {
return equal(args[0], args[1], args[2]);
}
...
=== src/wnf/_main_.h ===
...
// Stack-based WNF dispatcher (modular, hot helpers forced-inline)
static inline Term wnf(Term t) {
wnf_stack_init();
u64 spos = WNF_SPOS;
Term twnf = 0;
Term next = t;
while (1) {
enter: {
Tag tag = term_tag(next);
switch (tag) {
case VAR: { wnf_enter_var(&next); goto enter; }
case APP: { wnf_enter_app(&next); goto enter; }
case DUP: { wnf_enter_dup(&next); goto enter; }
case DP0:
case DP1: { wnf_enter_dps(&next); goto enter; }
case REF: { wnf_enter_ref(&next); goto enter; }
case TYP: { wnf_enter_typ(&next); goto enter; }
case RWT: { wnf_enter_rwt(&next); goto enter; }
case GEN: { wnf_enter_gen(&next); goto enter; }
default: { twnf = next; goto apply; }
}
}
apply: {
while (WNF_SPOS > spos) {
Term prev = WNF_STACK[--WNF_SPOS];
Tag ptag = term_tag(prev);
switch (ptag) {
case APP: { int go = wnf_apply_app(prev, &twnf, spos); if (go) { next = twnf; goto enter; } else continue; }
case DP0:
case DP1: { int go = wnf_apply_dps(prev, &twnf); if (go) { next = twnf; goto enter; } else continue; }
default: {
twnf = prev;
continue;
}
}
}
}
WNF_SPOS = spos;
return twnf;
}
}
...
static WNF_ALWAYS_INLINE void wnf_enter_var(Term* next) {
Val sub_loc = term_loc(*next) + 0;
Term sub_val = take_at(sub_loc);
if (term_sub(sub_val)) {
*next = clear_sub(sub_val);
} else {
set_at(sub_loc, sub_val);
}
}
static WNF_ALWAYS_INLINE void wnf_enter_app(Term* next) {
Val app_loc = term_loc(*next);
WNF_STACK[WNF_SPOS++] = *next;
*next = take_at(app_loc + 0);
}
static WNF_ALWAYS_INLINE void wnf_enter_dup(Term* next) {
Val dup_loc = term_loc(*next);
*next = take_at(dup_loc + 1);
heap_free(dup_loc + 1, 1);
}
static WNF_ALWAYS_INLINE void wnf_enter_dps(Term* next) {
Val dup_loc = term_loc(*next);
Term dup_val = take_at(dup_loc);
if (term_sub(dup_val)) { *next = clear_sub(dup_val); return; }
WNF_STACK[WNF_SPOS++] = *next;
*next = dup_val;
}
static WNF_ALWAYS_INLINE void wnf_enter_ref(Term* next) {
Lab F = term_lab(*next);
WNF_ITRS[REF_EXP]++;
*next = wnf_ref(F, false);
}
static WNF_ALWAYS_INLINE void wnf_enter_typ(Term* next) {
Lab F = term_lab(*next);
WNF_ITRS[REF_EXP]++;
*next = wnf_ref(F, true);
}
...
static WNF_ALWAYS_INLINE void wnf_enter_gen(Term* next) {
*next = wnf_gen(*next);
}
...
// Thaw
// ----
// Force-like eliminator for FRZ: recursively unwraps Frz{t} after WNF.
static inline Term thaw(Term t) {
Term w = wnf(t);
if (term_tag(w) == FRZ) {
Val v = term_val(w);
Term x = take_at(v + 0);
heap_free(v + 0, 1);
return thaw(x);
}
return w;
}
...
=== src/wnf/ref.h ===
...
// REF interaction
static inline Term wnf_ref(Lab F, bool T) {
DF = F;
wnf_ref_stacks_init();
Defn def = BOOK[F & (BOOK_SIZE - 1u)];
Val loc = T ? def.type : def.term;
if (loc == 0) {
printf("Undefined reference %u ", (u32)F);
u32 idx = (u32)(F & (BOOK_SIZE - 1u));
if (BOOK[idx].name) {
printf("@%s", BOOK[idx].name);
}
abort();
} else {
// Record bases; all pushes must be naturally popped before return.
u32 LP0 = LAB_PATH_LEN;
u32 LS0 = LAM_SUBS_LEN;
u32 DS0 = DUP_SUBS_LEN;
Term fun = get_at(loc);
//printf("[WNF_REF] %s %u || %llu\n", def.name, T, wnf_get_total_interaction_count());
// Enable the transformation for only:
// - ∀ℕ[].Λa.ℕ[]
// - ∀ℕ.Λd.∀ℕ[].Λc.∀ℕ[].Λb.ℕ[]
// That causes test/gen_sort.c to abort with a runtime type error!
if (T && (F == 1 || F == 2)) {
printf("[WNF_REF] %u ", F); print_term_ln(fun);
Term top = T ? new_typ(F) : new_ref(F);
while (WNF_SPOS > 0) {
Term app = WNF_STACK[--WNF_SPOS];
Tag app_tag = term_tag(app);
Lab app_lab = term_lab(app);
Val app_val = term_val(app);
if (app_tag == DP0 || app_tag == DP1) {
//if (T) printf("CONSUMED %u %u\n", app_tag == DP0 ? 0 : 1, app_lab);
set_at(app_val, top);
top = app;
LAB_PATH[LAB_PATH_LEN++] = new_step(app_tag == DP0 ? 0 : 1, app_lab);
} else {
WNF_STACK[WNF_SPOS++] = app;
break;
}
}
}
Term tm = wnf_ref_alloc(LP0, LS0, DS0, 0, 0, fun);
// RECONSTRUCT
for (u32 i = 0; i < LAB_PATH_LEN - LP0; ++i) {
u8 dir = step_dir(LAB_PATH[LP0 + i]);
Lab lab = step_lab(LAB_PATH[LP0 + i]);
//if (T) printf("RE-ADDED-3 %u %u\n", dir, lab);
Dups dps = new_dps(lab, tm);
if (dir == 0) {
tm = dps.dp0;
} else {
tm = dps.dp1;
}
}
LAB_PATH_LEN = LP0;
LAM_SUBS_LEN = LS0;
DUP_SUBS_LEN = DS0;
return tm;
}
}
static inline Term wnf_ref_alloc(u32 LP0, u32 LS0, u32 DS0, u32 LS, u32 DS, Term t) {
//if (1 || DF == 2) { printf("[wnf_ref_alloc] %u %u ", LS, DS); print_term_ln(t); }
Tag t_tag = term_tag(t);
Lab t_lab = term_lab(t);
Val t_val = term_val(t);
switch (t_tag) {
// Atomic/opaque
case ERA: case BIT: case BT0: case BT1:
case NAT: case ZER: case U32: case UVA:
case PRI: case NIL: case EMP: case UNI:
case NUL: case SET: case REF: case TYP:
case RFL: {
return t;
}
case VAR:
case DP0:
case DP1: {
printf("TODO VAR/DP0/DP1 ALLOC\n");
abort();
}
// Book variables / dup variables
case BKV:
case BK0:
case BK1: {
Term got;
switch (t_tag) {
case BKV: { got = LAM_SUBS[LS0 + t_val].var; break; }
case BK0: { got = DUP_SUBS[DS0 + t_val].dp0; break; }
case BK1: { got = DUP_SUBS[DS0 + t_val].dp1; break; }
default : { printf("TODO 15\n"); abort(); }
}
return got;
}
case BKD: {
// Create dup binder and keep it in scope across both branches
Val loc = heap_alloc(1);
DUP_SUBS[DUP_SUBS_LEN].dp0 = new_dp0(t_lab, loc);
DUP_SUBS[DUP_SUBS_LEN].dp1 = new_dp1(t_lab, loc);
++DUP_SUBS_LEN;
// Clone value (left) under current dup depth
Term val = wnf_ref_alloc(LP0, LS0, DS0, LS, DS + 0, get_at(t_val + 0));
set_at(loc + 0, val);
// Clone body (right) with dup in scope
Term bod = wnf_ref_alloc(LP0, LS0, DS0, LS, DS + 1, get_at(t_val + 1));
--DUP_SUBS_LEN; // pop after cloning both sides
return bod;
}
case BKL: {
// Allocate body and bind a new λ variable using the stored binder nick (t_lab)
Val loc = heap_alloc(1);
Term var = new_var(t_lab, loc);
LAM_SUBS[LAM_SUBS_LEN].var = var;
++LAM_SUBS_LEN;
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS + 1, DS, get_at(t_val + 0)));
--LAM_SUBS_LEN; // paired
return new_term(false, LAM, t_lab, loc);
}
// Unary nodes
case SUC: case LST: case INC: case FRZ: {
Val loc = heap_alloc(1);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
return new_term(false, t_tag, t_lab, loc);
}
// Binary nodes
case APP: case CON: case TUP: case ALL: case SIG: {
Val loc = heap_alloc(2);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
return new_term(false, t_tag, t_lab, loc);
}
case SUP: {
Val loc = heap_alloc(2);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
return new_term(false, SUP, t_lab, loc);
}
case GEN: {
// Clone GEN node (ternary: ctx, typ, seed)
Val loc = heap_alloc(3);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
set_at(loc + 2, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 2)));
return new_term(false, GEN, t_lab, loc);
}
...
case BITM: {
//printf("ALLOC OF λ-MATCH DISABLED\n");
//abort();
Val loc = heap_alloc(2);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
return new_term(false, t_tag, t_lab, loc);
}
case NATM: {
//printf("ALLOC OF λ-MATCH DISABLED\n");
//abort();
Val loc = heap_alloc(2);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
return new_term(false, t_tag, t_lab, loc);
}
case LSTM: {
//printf("ALLOC OF λ-MATCH DISABLED\n");
//abort();
Val loc = heap_alloc(2);
set_at(loc + 0, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 0)));
set_at(loc + 1, wnf_ref_alloc(LP0, LS0, DS0, LS, DS, get_at(t_val + 1)));
return new_term(false, t_tag, t_lab, loc);
}
...
default: {
return t;
}
}
}
...
=== src/parser/parse/book/gen.h ===
// src/parser/parse/book/gen.h
static bool parse_book_gen(Parser* P, env* E) {
if (!eat_char(P, '?')) return false;
// Parse generator base name (lowercase recommended by spec)
char* lo_nam = parse_name(P, E);
u32 lo_id = register_def_name(lo_nam);
parse_whitespace(P, E);
u32 seed_bits = 0u;
if (eat_char(P, '#')) {
size_t start = P->idx;
while (P->src[P->idx] == '0' || P->src[P->idx] == '1') P->idx++;
size_t end = P->idx;
for (size_t j = end; j > start; --j) {
char c = P->src[j - 1];
if (seed_bits == 0u) {
seed_bits = (1u << 1) | (c == '1' ? 1u : 0u);
} else {
seed_bits = (seed_bits << 1) | (c == '1' ? 1u : 0u);
}
}
parse_whitespace(P, E);
}
Nick imp_nicks[256];
u32 imp_ids[256];
u32 imp_len = 0;
if (eat_char(P, '[')) {
parse_whitespace(P, E);
if (!eat_char(P, ']')) {
for (;;) {
parse_whitespace(P, E);
parse_expect_char(P, E, '@', "in generator import list");
char* imp_str = parse_name(P, E);
u32 imp_id = register_def_name(imp_str);
Nick imp_nk = nick(imp_str);
free(imp_str);
if (imp_len >= 256) parse_error(P, "too many imports in generator");
imp_ids[imp_len] = imp_id;
imp_nicks[imp_len] = imp_nk;
imp_len++;
parse_whitespace(P, E);
if (eat_char(P, ',')) continue;
if (eat_char(P, ']')) break;
parse_error(P, "expected ',' or ']' in generator import list");
}
}
}
parse_whitespace(P, E);
parse_expect_char(P, E, ':', "after generator declaration");
P->def = lo_nam;
Term typ = parse_term(P, E);
P->def = 0;
char* up_nam = name_to_uppercase(lo_nam);
u32 up_id = register_def_name(up_nam);
if (strcmp(lo_nam, up_nam) == 0) parse_error_name(P, "generator name '", lo_nam, "' must have lowercase letters");
Val lam_locs[256];
for (u32 i = 0; i < imp_len; ++i) { lam_locs[i] = heap_alloc(1); }
Term ctx = new_nil();
Val tail_loc = 0;
bool have_any = false;
for (u32 i = 0; i < imp_len; ++i) {
Term pair = new_tup(new_var(imp_nicks[i], lam_locs[i]), new_typ(imp_ids[i]));
Term cons = new_con(pair, new_nil());
Val loc = term_val(cons);
if (!have_any) { ctx = cons; have_any = true; }
else { set_at(tail_loc + 1, cons); }
tail_loc = loc;
}
Nick lo_nk = nick(lo_nam);
Term inner = new_gen(lo_nk, ctx, new_typ(lo_id), new_uva(seed_bits));
Term upper_body = inner;
for (int i = (int)imp_len - 1; i >= 0; --i) {
set_at(lam_locs[i] + 0, upper_body);
upper_body = new_term(0, LAM, (Lab)imp_nicks[i], lam_locs[i]);
}
Term ent_upper = term_to_entry(0, 0, upper_body);
BOOK[up_id].term = anchor(ent_upper);
BOOK[up_id].type = anchor(term_to_entry(0, 0, Set()));
BOOK[up_id].name = up_nam; // transfer ownership
Term spec = new_ref(up_id);
for (u32 i = 0; i < imp_len; ++i) {
spec = new_app_inline(spec, new_ref(imp_ids[i]));
}
Term ent_wrap = term_to_entry(0, 0, new_app_inline(new_pri(nick("Y")), spec));
BOOK[lo_id].term = anchor(ent_wrap);
BOOK[lo_id].type = anchor(term_to_entry(0, 0, typ));
BOOK[lo_id].name = lo_nam; // transfer ownership
printf("→ %s : ", BOOK[up_id].name); print_term_ln(get_at(BOOK[up_id].type));
printf("→ %s = ", BOOK[up_id].name); print_term_ln(get_at(BOOK[up_id].term));
printf("→ %s : ", BOOK[lo_id].name); print_term_ln(get_at(BOOK[lo_id].type));
printf("→ %s = ", BOOK[lo_id].name); print_term_ln(get_at(BOOK[lo_id].term));
parse_whitespace(P, E);
eat_char(P, ';');
return true;
}
...
=== test/gen_sort.c ===
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "_TEST_.h"
int test_gen_sort(void) {
hvm_init();
// Program identical to examples/gen_sort.hvm4
static const char* SRC =
"@Y = λf.!F&X=f;(F₀ (@Y F₁));"
"@INC : ℕ[] → ℕ[] =\n"
" λ{[]:[];<>:λh.λt.(1n+h)<>(@INC t)};\n\n"
"?sort[@INC] : ℕ → ℕ[] → (ℕ[]) → ℕ[];\n"
" !ℕ[]{(@sort 4n [2n,1n,3n] []) == [1n,2n,3n]};\n"
" !ℕ[]{(@sort 5n [4n,3n,1n] []) == [1n,3n,4n]};\n"
" !ℕ[]{(@sort 5n [3n,4n,2n] []) == [2n,3n,4n]};\n\n"
"@main = @SORT;\n";
Book(SRC);
// Assert exact first collapsed line of the mined function
Term root = check_all();
char* DETECTED = collapse_first_to_string(root);
const char* EXPECTED = "λINC.λsort.λ{0n:λa.λb.[];1n+:λa.λ{[]:λb.(INC (sort a b []));<>:λ{0n:λb.λc.(0n<>(sort 1n+a b c));1n+:λb.λc.λd.(sort 1n+a c (b<>d))}}}";
assert_equal_str("gen_sort/collapsed-first", DETECTED, EXPECTED);
free(DETECTED);
return 0;
}
#ifndef HVM_EMBED_TEST
int main(void) {
return test_gen_sort();
}
#endif
...
=== test/gen_sort_manual.c ===
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>
#include "_TEST_.h"
// Emulates the generator specification in examples/gen_sort.hvm4:
//
// ?sort[@INC] : ℕ → ℕ[] → (∅ℕ[]) → ℕ[];
// !ℕ[]{(@sort 4n [2n,1n,3n] []) == [1n,2n,3n]};
// !ℕ[]{(@sort 5n [4n,3n,1n] []) == [1n,3n,4n]};
// !ℕ[]{(@sort 5n [3n,4n,2n] []) == [2n,3n,4n]};
static void check_gen_sort_manual(void) {
Dups TY = Dps("TY", All(Nat(), Lam("d", All(Lst(Nat()), Lam("xs", All(Frz(Lst(Nat())), Lam("ys", Lst(Nat()))))))));
// Keep manual form but fix only mul later; the sort path already worked.
Dups F0 = Dps("F0", FN_("inc", IC, FN_("srt", F, CALL(GL, Uva(1), Uva(1), Uva(MAX_ELIM), TY.dp0, spine_new(), Nil(), Tup(F,TY.dp1), Con(Tup(IC, All(Lst(Nat()), Lam("xs", Lst(Nat())))), Nil()), Nil()))));
Dups F1 = Dps("F1", F0.dp1);
Dups F2 = Dps("F2", F1.dp1);
Dups F3 = Dps("F3", F2.dp1);
Dups F4 = Dps("F4", F3.dp1);
Term X0 = App(App(App(App(Ref("Y"), App(F0.dp0, Ref("INC"))), nat(4)), Con(nat(2), Con(nat(1), Con(nat(3), Nil())))), Nil());
Term Y0 = Con(nat(1), Con(nat(2), Con(nat(3), Nil())));
Term E0 = equal(Zer(), X0, Y0);
Term X1 = App(App(App(App(Ref("Y"), App(F1.dp0, Ref("INC"))), nat(5)), Con(nat(4), Con(nat(3), Con(nat(1), Nil())))), Nil());
Term Y1 = Con(nat(1), Con(nat(3), Con(nat(4), Nil())));
Term E1 = equal(Zer(), X1, Y1);
Term X2 = App(App(App(App(Ref("Y"), App(F2.dp0, Ref("INC"))), nat(5)), Con(nat(3), Con(nat(4), Con(nat(2), Nil())))), Nil());
Term Y2 = Con(nat(2), Con(nat(3), Con(nat(4), Nil())));
Term E2 = equal(Zer(), X2, Y2);
Term PR = Nul();
Term CT = equal(Uva(0), PR, F3.dp0);
Term OK = when(bool_and(CT, bool_and(bool_and(E0, E1), E2)), F4.dp0);
char* DETECTED = collapse_first_to_string(OK);
const char* EXPECTED = "λinc.λsrt.λ{0n:λb.λc.[];1n+:λb.λ{[]:λc.(inc (srt b c []));<>:λ{0n:λc.λd.(0n<>(srt 1n+b c d));1n+:λc.λd.λe.(srt 1n+b d (c<>e))}}}";
assert_equal_str("gen_sort", DETECTED, EXPECTED);
free(DETECTED);
}
int test_gen_sort_manual(void) {
hvm_init();
static const char* LIB =
"@ID = λa.a;"
"@NOT = λ{#0:#1;#1:#0};"
"@AND = λ{#0:λ{#0:#0;#1:#0}; #1:λ{#0:#0;#1:#1}};"
"@Y = λf.!F&X=f;(F₀ (@Y F₁));"
"@ADD = λ{0n:λb.b;1n+:λa.λb.1n+((@ADD a) b)};"
"@SUM = λ{0n:0n;1n+:λp.!P&S=p;1n+((@ADD P₀) (@SUM P₁))};"
"@INC = λ{[]:[];<>:λh.λt.(1n+h)<>(@INC t)};";
Book(LIB);
check_gen_sort_manual();
return 0;
}
#ifndef HVM_EMBED_TEST
int main(void) {
return test_gen_sort_manual();
}
#endif
...
=== PROMPT.txt ===
we're currently facing a bug that is clearly some kind of type error.
when we run: ./test/gen_sort.c
we get the following output:
→ SORT : *
→ SORT = ΛINC.?sort [#(^a,@:INC)] : @:sort
→ sort : ∀ℕ.Λd.∀ℕ[].Λc.∀ℕ[].Λb.ℕ[]
→ sort = (/Y (@SORT @INC))
[WNF_REF] 1 ∀ℕ[].Λa.ℕ[]
[WNF_REF] 1 ∀ℕ[].Λa.ℕ[]
[WNF_REF] 1 ∀ℕ[].Λa.ℕ[]
... many lines like that ...
Error: ill-typed call to 'when':
∀a:ℕ[].ℕ[]
Note that the when function has type `∀a. 𝔹 → a → a`. It returns the second
argument when the first is true. Otherwise, it aborts. Yet, in the case above,
when is being, at some point, called with `∀a:ℕ[].ℕ[]`, which is NOT a bool, as
the first argument. This is nonsense!
After investigation, it seems like this error is triggered by a recent
optimization on wnf/ref.h - i.e., the function responsible for allocating
top-level definitions and functions to the memory, when running the interpreter.
The transform that triggers it is the following. Suppose we have:
! A &L = @foo
! B &L = A₀
! C &L = B₀
! D &L = A₁
... C₀ C₁ D₀ D₁ ...
In memory, and we're trying to alloc `@foo`, entering it from the `C₀` DP0. In
this case, our WNF_STACK will be:
C₀ → B₀ → A₀ → @foo
And the actual heap memory will be like:
! A &L = _
! B &L = _
! C &L = _
... _ C₁ D₀ D₁ ...
(Since we took these DP0 pointers out of the memory into WNF_STACK.)
So, what the transformation does is: instead of allocating `@foo` directly, we
will take the surrounding DP terms out of the WNF_STACK, we will re-insert then
into the memory (since twin DP's like C₁, D₀ point to it), *then* we will alloc
`@foo`, and then we will re-wrap it with fresh new dups. So, the result is:
! A &L = @foo
! B &L = A₀
! C &L = B₀
... C₁ D₀ D₁ ...
! A_ &L = alloc(@foo)
! B_ &L = A_₀
! C_ &L = B_₀
... C_₀ ...
While some sharing is lost, this is, theoretically, equivalent, since `C₁ D₀ D₁`
still see the same term, and C_₀ created its own copy of `@foo`, with the same
surrounding "DUP path", allowing sound destructuring of SUPs present in `@foo`.
Note that the twin dps (C₁, D₀, D₁) will keep reading from the old @foo / @:foo
at the top of the dup chain (since we're not replacing it by the newly allocated
term). that is expected, and is the whole point: each DP view must call alloc
again, re-allocating @foo/@:foo. while this seems slow, his was done so that we
can trigger a specific optimization later, used inside wnf_ref_apply. Yet, it
seems that, even disabling the wnf_ref_apply optimization, and just using
wnf_ref_alloc, the bug manifests. That means that the programs above are not
behaving identically - but they should! Even more weird is the fact that, it
seems, we're able to expand complex top-level definitions with this transform
on, and run complex programs, and all works fine. The only programs that seem to
be resulting in an issue are:
∀ℕ[].Λa.ℕ[]
and
∀ℕ.Λd.∀ℕ[].Λc.∀ℕ[].Λb.ℕ[]
So, somehow, if we alloc these programs with the transformation enabled, it will
result in the runtime type error I mention. See wnf/ref.h: currently, we only
enable the transformation to them. This is specially suspicious since these
terms are just simple types, and cloning them with and without sharing should be
equivalent, I don't see why it wouldn't. So, how does they end up as the argument
of 'when'? I suspect of some problems:
- Perhaps some function (for example, the 'prim/gen_*' functions?) is using the
MATCH macro incorrectly, causing them to flip arguments on nested matches.
This is a fairly common source of runtime type errors, specially since AIs
struggle with understanding the concept of an identity function call. See
STYLE.md and LINEARITY.md for some thoughts on that.
- Perhaps the DUP-* interactions are wrong for the terms above somehow, causing
this issue. But I've inspected manually and I don't see anything special.
- Perhaps there is some bug on how ALL's are allocated, but if that was the case,
why would it work without this optimization on (it still needs to alloc ALL's).
- Perhaps something else on ref.h, but we're not using the wnf_ref_apply functions
which are the complex parts, and wnf_ref_alloc is pretty straightforward and
looks right to me. So I don't see anything wrong there.
- Perhaps some small typo or silly error somewhere, but, what? Where?
A curious hint is that test/gen_sort_manual.c is working, even though it does
the exact same search as test/gen_sort.c! There are some minor differences on
how we assemble the involved terms on the manual vs parsed versions, and these
small differences are somehow enough to make one work and the other not.
Your goal is to investigate this issue and find the root cause for a type ending
up as the argument of 'when'.
###################################################
Now, complete below with an ANSWER, in PLAIN ENGLISH (NO CODE) explaining the
ROOT CAUSE of the problem described above. DO NOT WRITE CODE BELOW. JUST WRITE
WHAT IS CAUSING THIS BUG, IN PLAIN ENGLISH.
ENGLISH ANSWER:
@VictorTaelin
Copy link
Author

For the record: this also contains some bits of the old HVM4 codebase, which I don't find privacy critical

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment