VictorTaelin · May 30, 2026 00:26
diff --git a/gistfile1.txt b/gistfile1.txt
 consider the text below:

 ```
 CGen Language:

 Type ::=
  Emp ::= "⊥"
  Uni ::= "⊤"
  Eit ::= Type "|" Type
  Par ::= Type "&" Type
  Fun ::= Type "→" Type

 Func ::=
  Lam ::= "λ" Name "." Func
  EFQ ::= "λ" "."
  Mat ::= "λ" "{"? "#0" ":" Func ";"? "#1" ":" Func ";"? "}"?
  Get ::= "λ" "!" Func
  Del ::= "λ" "-" Func
  Dup ::= "λ" "+" Func
  Voi ::= "&{}"
  Frk ::= "&" Label "{" Func "," Func "}"
  Ret ::= "~" Term

 Expr ::=
  Var ::= Name
  One ::= "()"
  Inl ::= "#0" "{"? Expr "}"?
  Inr ::= "#1" "{"? Expr "}"?
  Tup ::= "(" Expr "," Expr ")"
  Ref ::= "@" Name "(" [Expr ","?] ")"
  Era ::= "&" "{" "}"
  Sup ::= "&" Label "{" Expr "," Expr "}"
  
 Book ::=
  Def ::= "@" Name ":" Type "=" Func ";"
  End ::= ";"

 Interaction Rules:

 λx.f v
 -------- app-lam
 f[x ← v]

 λ{#0:f;#1:g} #0{x}
 ------------------ app-mat-inl
 f x

 λ{#0:f;#1:g} #1{x}
 ------------------ app-mat-inr
 g x

 λ!f (x,y)
 --------- app-get-tup
 f x y 

 Structural Rules:

 λ-f v
 ----- app-del
 f

 λ+f v
 ----- app-dup
 f v v

 Erasure Rules:

 λ{#0:f;#1:g} &{}
 ---------------- app-mat-era
 &{}

 λ!f &{}
 ------- app-get-era
 &{}

 Superposition Rules:

 λ{#0:f;#1:g} &L{x,y}
 -------------------- app-mat-sup
 &{}

 λ!f &L{x,y}
 ----------- app-get-sup
 &{}

 &{} x
 ----- app-voi
 &{}

 &L{f,g} x
 ------------ app-frk
 &L{f x, g x}

 Evaluator (CEK machine, Felleisen 1986):

 Three tail-recursive states: app (apply Func to args), eval
 (evaluate Term), cont (continue through frames with a value).
 C = Func | Term, E = Ctx (de Bruijn levels), K = [Frame].

 Frame ::=
  F_DIS  [Term] Ctx Label Func Term
  F_FRK  [Term] Ctx Label Func
  F_INL
  F_INR
  F_TUP  Term Ctx
  F_SUP  Label Term Ctx
  F_TUP2 Term
  F_SUP2 Label Term
  F_REF  Func [Term] [Term] Ctx

 norm t = eval t [] []

 app (Lam b)     (x         : s) fs c = app b s fs (x:c)
 app EFQ         _               fs _ = cont Era fs
 app (Mat a b)   (Inl x     : s) fs c = app a (x:s) fs c
 app (Mat a b)   (Inr x     : s) fs c = app b (x:s) fs c
 app (Mat a b)   (Era       : _) fs _ = cont Era fs
 app (Mat a b)   (Sup l x y : s) fs c = app (Mat a b) (x:s) (F_DIS s c l (Mat a b) y:fs) c
 app (Get b)     (Tup x y   : s) fs c = app b (x:y:s) fs c
 app (Get b)     (Era       : _) fs _ = cont Era fs
 app (Get b)     (Sup l x y : s) fs c = app (Get b) (x:s) (F_DIS s c l (Get b) y:fs) c
 app (Del b)     (_         : s) fs c = app b s fs c
 app (Dup b)     (x         : s) fs c = app b (x:x:s) fs c
 app Voi         _               fs _ = cont Era fs
 app (Frk l a b) s               fs c = app a s (F_FRK s c l b:fs) c
 app (Ret x)     _               fs c = eval x fs c
 app f           (Var i     : s) fs c = app f (c[i]:s) fs c

 eval (Var i)        fs c = eval c[i] fs c
 eval (Inl x)        fs c = eval x (F_INL:fs) c
 eval (Inr x)        fs c = eval x (F_INR:fs) c
 eval (Tup x y)      fs c = eval x (F_TUP y c:fs) c
 eval (Sup l x y)    fs c = eval x (F_SUP l y c:fs) c
 eval (Ref n (t:ts)) fs c = eval t (F_REF bk[n] [] ts c:fs) c
 eval (Ref n [])     fs _ = app bk[n] [] fs []
 eval x              fs _ = cont x fs

 cont x (F_DIS s c l f y      : fs) = app f (y:s) (F_SUP2 l x:fs) c
 cont x (F_FRK s c l g        : fs) = app g s (F_SUP2 l x:fs) c
 cont x (F_TUP y c            : fs) = eval y (F_TUP2 x:fs) c
 cont x (F_SUP l y c          : fs) = eval y (F_SUP2 l x:fs) c
 cont x (F_INL                : fs) = cont (Inl x) fs
 cont x (F_INR                : fs) = cont (Inr x) fs
 cont x (F_TUP2 a             : fs) = cont (Tup a x) fs
 cont x (F_SUP2 l a           : fs) = cont (Sup l a x) fs
 cont x (F_REF f dn (t:ts) c  : fs) = eval t (F_REF f (x:dn) ts c:fs) c
 cont x (F_REF f dn [] _      : fs) = app f (rev (x:dn)) fs []
 cont x []                          = x

 Invariant: cont never re-enters eval with a term it just wrapped.
 Wrapping frames (F_INL, F_INR, F_TUP2, F_SUP2) stay in cont.
 ```

 this text was the spec for a very old iteration of HVM, before we understood how
 DUP's (the dual of SUP's worked) and why they're necessary to make superposition
 interactions sound. since them, we built HVM, which is a very fast evaluator
 that performs ~100m interactions/s in all gen/ hard problems. yet, even though
 HVM is remarkably efficient, it is nowhere even near the old CGen interpreter,
 which performed *1.3 billion* (!!!) interactions/s, and was 10x+ faster than
 ./supgenvm.c in all benchmarks. that is mostly attributable to the CEK like
 architecture, which allowed it to match eliminators like mat and get against
 introducers like inl, inr and tup directly, without intermediate allocations of
 λ's and λ-matches, without ALO nodes, without any of the heavy machinery that
 slown down HVM. now, in this repo, we have a file named ./supgenvm.c, which is a
 distilled version of HVM with just the functionality needed to run supgen.hvm.
 that gives us a small playground which we can edit to explore HVM optimizations,
 without having to mess with the full repo. your goal is to, taking huge
 inspiration from how CGen worked, *reimplement a brand new HVM, from scratch,
 that evaluates supgen.hvm correctly, outputting the same results for all gen/
 problems, except *much* faster in terms of raw interactions/s; expectedly, 10x
 faster or so. this goal might take a long time, and you CANNOT stop working on
 it until you succeed. this is very important! the first and most difficult part
 of this is to define the evaluator; i.e., the formulas above, except redesigned
 to support the full HVM and all its computationally relevant interactions, in
 particular, the most challenging interaction is the DUP-LAM rule:

 ! F &L = λx.f
 ---------------- dup-lam
 ! G &L = f
 F₀ ← λ$x0. G₀
 F₁ ← λ$x1. G₁
 x  ← &L{$x0,$x1}

 the reason this makes it so difficult is that this rule substitutes F₀ and F₁
 with two "globally scoped" lambdas, `λ$x0` and `λ$x1`. the problem is that this
 rule is triggered when we enter the λx.f lambda via either F₀ or F₁, which may
 occur in distant parts of the program, yet, whether we access via F₀ or F₁, this
 rule affects the other variable, because whenever we enter the other variable,
 we will NOT repeat this interaction again, but merely access the duplicated
 lambda that the first visitor allocated. this lambda allocation seems
 unavoidable, and this is what makes CEK-style evaluation difficult, as the CEK
 machine never actually allocs or copies λ-matches or λ's; instead, these are
 just static *code* associated with a top-level function. the lambdas match
 against live terms like inl, inr, tup, but they're never allocated. so, how
 can we implement full HVM in this style?

 an interesting idea would be to re-implement dup-lam as:

 ! F &L = λx.f
 ---------------- dup-lam (via F₀)
 ! G &L = f
 F₀ ← λ$x0. G₀
 x  ← &L{$x0,&{}}

 i.e., if we access it via F₀, we substitute just F₀ (i.e., immediatelly
 returning it), and extend the CEK context with &L{$x0,&{}}. this is technically
 sound per interaction rules; sup / dup paths will annihilate as expected, the
 final result will match; but this destroys all sharing enjoyed by interaction
 net evaluation, resulting in asymptotic complexity explosion. so, it isn't
 really viable. something else is needed.

 as you can tell, I don't have a solution, just rough ideas. perhaps what we want
 isn't even a CEK machine, but the fact a CEK evaluator for CGen achieved 1.3
 billion i/s can serve as an inspiration that, perhaps, the way HVM is currently
 constructed is not ideal, and perhaps we could redesign it in a way that
 achieves a much higher throughput; possibly ways I'm not even thinking of.

 this is your goal: redesign HVM and achieve a much higher performance.

 do not stop working until you have a true breakthrough (at least 3x speedup in
 raw intearctions/s on all hard gen/ tests).

 important: do NOT edit any file in this repository. always work in your own
 isolated worktree branch.
	consider the text below:

	```
	CGen Language:

	Type ::=
	Emp ::= "⊥"
	Uni ::= "⊤"
	Eit ::= Type "\|" Type
	Par ::= Type "&" Type
	Fun ::= Type "→" Type

	Func ::=
	Lam ::= "λ" Name "." Func
	EFQ ::= "λ" "."
	Mat ::= "λ" "{"? "#0" ":" Func ";"? "#1" ":" Func ";"? "}"?
	Get ::= "λ" "!" Func
	Del ::= "λ" "-" Func
	Dup ::= "λ" "+" Func
	Voi ::= "&{}"
	Frk ::= "&" Label "{" Func "," Func "}"
	Ret ::= "~" Term

	Expr ::=
	Var ::= Name
	One ::= "()"
	Inl ::= "#0" "{"? Expr "}"?
	Inr ::= "#1" "{"? Expr "}"?
	Tup ::= "(" Expr "," Expr ")"
	Ref ::= "@" Name "(" [Expr ","?] ")"
	Era ::= "&" "{" "}"
	Sup ::= "&" Label "{" Expr "," Expr "}"

	Book ::=
	Def ::= "@" Name ":" Type "=" Func ";"
	End ::= ";"

	Interaction Rules:

	λx.f v
	-------- app-lam
	f[x ← v]

	λ{#0:f;#1:g} #0{x}
	------------------ app-mat-inl
	f x

	λ{#0:f;#1:g} #1{x}
	------------------ app-mat-inr
	g x

	λ!f (x,y)
	--------- app-get-tup
	f x y

	Structural Rules:

	λ-f v
	----- app-del
	f

	λ+f v
	----- app-dup
	f v v

	Erasure Rules:

	λ{#0:f;#1:g} &{}
	---------------- app-mat-era
	&{}

	λ!f &{}
	------- app-get-era
	&{}

	Superposition Rules:

	λ{#0:f;#1:g} &L{x,y}
	-------------------- app-mat-sup
	&{}

	λ!f &L{x,y}
	----------- app-get-sup
	&{}

	&{} x
	----- app-voi
	&{}

	&L{f,g} x
	------------ app-frk
	&L{f x, g x}

	Evaluator (CEK machine, Felleisen 1986):

	Three tail-recursive states: app (apply Func to args), eval
	(evaluate Term), cont (continue through frames with a value).
	C = Func \| Term, E = Ctx (de Bruijn levels), K = [Frame].

	Frame ::=
	F_DIS [Term] Ctx Label Func Term
	F_FRK [Term] Ctx Label Func
	F_INL
	F_INR
	F_TUP Term Ctx
	F_SUP Label Term Ctx
	F_TUP2 Term
	F_SUP2 Label Term
	F_REF Func [Term] [Term] Ctx

	norm t = eval t [] []

	app (Lam b) (x : s) fs c = app b s fs (x:c)
	app EFQ _ fs _ = cont Era fs
	app (Mat a b) (Inl x : s) fs c = app a (x:s) fs c
	app (Mat a b) (Inr x : s) fs c = app b (x:s) fs c
	app (Mat a b) (Era : _) fs _ = cont Era fs
	app (Mat a b) (Sup l x y : s) fs c = app (Mat a b) (x:s) (F_DIS s c l (Mat a b) y:fs) c
	app (Get b) (Tup x y : s) fs c = app b (x:y:s) fs c
	app (Get b) (Era : _) fs _ = cont Era fs
	app (Get b) (Sup l x y : s) fs c = app (Get b) (x:s) (F_DIS s c l (Get b) y:fs) c
	app (Del b) (_ : s) fs c = app b s fs c
	app (Dup b) (x : s) fs c = app b (x:x:s) fs c
	app Voi _ fs _ = cont Era fs
	app (Frk l a b) s fs c = app a s (F_FRK s c l b:fs) c
	app (Ret x) _ fs c = eval x fs c
	app f (Var i : s) fs c = app f (c[i]:s) fs c

	eval (Var i) fs c = eval c[i] fs c
	eval (Inl x) fs c = eval x (F_INL:fs) c
	eval (Inr x) fs c = eval x (F_INR:fs) c
	eval (Tup x y) fs c = eval x (F_TUP y c:fs) c
	eval (Sup l x y) fs c = eval x (F_SUP l y c:fs) c
	eval (Ref n (t:ts)) fs c = eval t (F_REF bk[n] [] ts c:fs) c
	eval (Ref n []) fs _ = app bk[n] [] fs []
	eval x fs _ = cont x fs

	cont x (F_DIS s c l f y : fs) = app f (y:s) (F_SUP2 l x:fs) c
	cont x (F_FRK s c l g : fs) = app g s (F_SUP2 l x:fs) c
	cont x (F_TUP y c : fs) = eval y (F_TUP2 x:fs) c
	cont x (F_SUP l y c : fs) = eval y (F_SUP2 l x:fs) c
	cont x (F_INL : fs) = cont (Inl x) fs
	cont x (F_INR : fs) = cont (Inr x) fs
	cont x (F_TUP2 a : fs) = cont (Tup a x) fs
	cont x (F_SUP2 l a : fs) = cont (Sup l a x) fs
	cont x (F_REF f dn (t:ts) c : fs) = eval t (F_REF f (x:dn) ts c:fs) c
	cont x (F_REF f dn [] _ : fs) = app f (rev (x:dn)) fs []
	cont x [] = x

	Invariant: cont never re-enters eval with a term it just wrapped.
	Wrapping frames (F_INL, F_INR, F_TUP2, F_SUP2) stay in cont.
	```

	this text was the spec for a very old iteration of HVM, before we understood how
	DUP's (the dual of SUP's worked) and why they're necessary to make superposition
	interactions sound. since them, we built HVM, which is a very fast evaluator
	that performs ~100m interactions/s in all gen/ hard problems. yet, even though
	HVM is remarkably efficient, it is nowhere even near the old CGen interpreter,
	which performed 1.3 billion (!!!) interactions/s, and was 10x+ faster than
	./supgenvm.c in all benchmarks. that is mostly attributable to the CEK like
	architecture, which allowed it to match eliminators like mat and get against
	introducers like inl, inr and tup directly, without intermediate allocations of
	λ's and λ-matches, without ALO nodes, without any of the heavy machinery that
	slown down HVM. now, in this repo, we have a file named ./supgenvm.c, which is a
	distilled version of HVM with just the functionality needed to run supgen.hvm.
	that gives us a small playground which we can edit to explore HVM optimizations,
	without having to mess with the full repo. your goal is to, taking huge
	inspiration from how CGen worked, *reimplement a brand new HVM, from scratch,
	that evaluates supgen.hvm correctly, outputting the same results for all gen/
	problems, except much faster in terms of raw interactions/s; expectedly, 10x
	faster or so. this goal might take a long time, and you CANNOT stop working on
	it until you succeed. this is very important! the first and most difficult part
	of this is to define the evaluator; i.e., the formulas above, except redesigned
	to support the full HVM and all its computationally relevant interactions, in
	particular, the most challenging interaction is the DUP-LAM rule:

	! F &L = λx.f
	---------------- dup-lam
	! G &L = f
	F₀ ← λ$x0. G₀
	F₁ ← λ$x1. G₁
	x ← &L{$x0,$x1}

	the reason this makes it so difficult is that this rule substitutes F₀ and F₁
	with two "globally scoped" lambdas, `λ$x0` and `λ$x1`. the problem is that this
	rule is triggered when we enter the λx.f lambda via either F₀ or F₁, which may
	occur in distant parts of the program, yet, whether we access via F₀ or F₁, this
	rule affects the other variable, because whenever we enter the other variable,
	we will NOT repeat this interaction again, but merely access the duplicated
	lambda that the first visitor allocated. this lambda allocation seems
	unavoidable, and this is what makes CEK-style evaluation difficult, as the CEK
	machine never actually allocs or copies λ-matches or λ's; instead, these are
	just static code associated with a top-level function. the lambdas match
	against live terms like inl, inr, tup, but they're never allocated. so, how
	can we implement full HVM in this style?

	an interesting idea would be to re-implement dup-lam as:

	! F &L = λx.f
	---------------- dup-lam (via F₀)
	! G &L = f
	F₀ ← λ$x0. G₀
	x ← &L{$x0,&{}}

	i.e., if we access it via F₀, we substitute just F₀ (i.e., immediatelly
	returning it), and extend the CEK context with &L{$x0,&{}}. this is technically
	sound per interaction rules; sup / dup paths will annihilate as expected, the
	final result will match; but this destroys all sharing enjoyed by interaction
	net evaluation, resulting in asymptotic complexity explosion. so, it isn't
	really viable. something else is needed.

	as you can tell, I don't have a solution, just rough ideas. perhaps what we want
	isn't even a CEK machine, but the fact a CEK evaluator for CGen achieved 1.3
	billion i/s can serve as an inspiration that, perhaps, the way HVM is currently
	constructed is not ideal, and perhaps we could redesign it in a way that
	achieves a much higher throughput; possibly ways I'm not even thinking of.

	this is your goal: redesign HVM and achieve a much higher performance.

	do not stop working until you have a true breakthrough (at least 3x speedup in
	raw intearctions/s on all hard gen/ tests).

	important: do NOT edit any file in this repository. always work in your own
	isolated worktree branch.
No results found