@VictorTaelin
Last active March 24, 2025 11:29
Kolmogorov Complexity minimizing language
import Prelude.hvml
// Types
// -----
// Type ::= ⊥ | [Type]
// - ⊥ = ⊥ = the Empty type
// - [⊥] = ⊤ = the Unit type
// - [[⊥]] = ℕ = the Nat type
// - [[[⊥]]] = ℕ* = the List type
// - [[[[⊥]]]] = ℕ** = the Grid type
data Type {
#Emp
#Vec{ele}
}
// Core ::=
// | LAM λ<bod>
// | MAT ?<nil><con>
// | PUT +<idx><nxt>
// | EMI ^<idx><nxt>
// | NEW $<nxt>
// | ERA -<nxt>
// | DUP &<nxt>
// | RET .
// | CAL @<fid><ari><ret><nxt>
data Core {
#Lam{bod} // λx. S.push(x); <bod>
#Mat{nil con} // match pop() { []:<nil> (x,xs): S.push(xs); S.push(x); <con> }
#Put{idx nxt} // S[-idx] = S.pop(); <nxt>
#Emi{idx nxt} // R[-idx] = S.pop(); <nxt>
#New{nxt} // S.push([]); <nxt>
#Era{nxt} // S.pop(); <nxt>
#Dup{nxt} // x = S.pop(); S.push(x); S.push(x); <nxt>
#Ret{} // return S;
#Cal{fid ari ret nxt} // S.append(F[fid](S.pop() × ari)); <nxt>
}
// Utils
// -----
@nat = λn ~ n {
0: []
p: #Cons{[] (@nat p)}
}
@u32 = λxs
@len(xs)
@insert_at(i v xs) = ~ i !v !xs {
0: !&0{x xs}=xs &0{#Cons{v x} xs}
i: !&0{x xs}=xs &0{x @insert_at(i v xs)}
}
@copy_at(i xs) = ~ i !xs {
0: !&0{x xs}=xs !&{x0 x1}=x &0{&0{x0 xs} x1}
i: !&0{x xs}=xs !&0{xs xk}=@copy_at(i xs) &0{&0{x xs} xk}
}
@apply(n xs fn) = ~ n !xs !fn {
0: &0{xs fn}
p: !&0{x xs}=xs !&0{xs fn}=@apply(p xs fn) &0{xs (fn x)}
}
@get(n rs fn) = ~ n !rs !fn {
0: fn
p: !&0{r rs}=rs @get(p rs (fn r))
}
@append(n xs ys) = ~ n !xs !ys {
0: ys
p: !&0{x xs}=xs &0{x @append(p xs ys)}
}
// Instructions
// ------------
// Pushes an external argument to the local stack
@lam(len ctx lib bod) = λx
@comp((+ len 1) &0{x ctx} lib bod)
// Pattern-matches on the top stack element
@mat(len ctx lib nil con) =
! &0{x ctx} = ctx
~ x !len !ctx !lib {
#Nil: @comp((- len 1) ctx lib nil)
#Cons{x xs}: @comp((+ len 1) &0{x &0{xs ctx}} lib con)
}
// Pushes a value into the nth element of the stack
@put(len ctx lib idx nxt) =
! &0{x ctx} = ctx
@comp((- len 1) @insert_at(idx x ctx) lib nxt)
// Emits a value into the nth slot of the return list
@emi(len ctx lib idx nxt) =
! &0{x ctx} = ctx
@insert_at(idx x @comp((- len 1) ctx lib nxt))
// Pushes an empty element to the stack
@new(len ctx lib nxt) =
@comp((+ len 1) &0{[] ctx} lib nxt)
// Drops the top element of the stack
@era(len ctx lib nxt) =
! &0{x ctx} = ctx
@comp((- len 1) ctx lib nxt)
// Clones the top element of the stack
@dup(len ctx lib nxt) =
! &0{x ctx} = ctx
! & {x0 x1} = x
@comp((+ len 1) &0{x0 &0{x1 ctx}} lib nxt)
// Returns the remaining stack
@ret(len ctx lib) =
ctx
// Calls the fid-th library function
@cal(len ctx lib fid ari ret nxt) =
!& {ar0 ar1} = ari
!&0{lib fun} = @copy_at(fid lib)
!&0{ctx got} = @apply(ar0 ctx fun)
!& {le0 le1} = (- len ar1)
@comp((+ le0 ret) @append(le1 ctx got) lib nxt)
// Compilation
// -----------
@comp(len ctx lib tm) = ~ tm !len !ctx !lib {
#Lam{bod}: @lam(len ctx lib bod)
#Mat{nil con}: @mat(len ctx lib nil con)
#Put{idx nxt}: @put(len ctx lib idx nxt)
#Emi{idx nxt}: @emi(len ctx lib idx nxt)
#New{nxt}: @new(len ctx lib nxt)
#Era{nxt}: @era(len ctx lib nxt)
#Dup{nxt}: @dup(len ctx lib nxt)
#Ret{}: @ret(len ctx lib)
#Cal{fid ari ret nxt}: @cal(len ctx lib fid ari ret nxt)
}
@make(term lib) = λf @comp(0 * &0{f lib} term)
// Tests
// -----
// Core ::= λ<bod> | ?<nil><con> | +<idx><nxt> | ^<idx><nxt> | $<nxt> | -<nxt> | &<nxt> | . | @<fid><nxt>
// mul2 : (ℕ) → (ℕ)
// mul2 = λ?$.$^0^0@0.
@t_mul2 = #Lam{#Mat{
#New{#Ret{}}
#New{#Emi{0 #Emi{0 #Cal{0 1 1 #Ret{}}}}}}}
@f_mul2 = @fix(λf @comp(0 * &0{f *} @t_mul2))
// half : (ℕ) → (ℕ)
// half = λ?$.-?$.^0@0.
@t_half = #Lam{#Mat{
#New{#Ret{}}
#Era{#Mat{
#New{#Ret{}}
#Emi{0 #Cal{0 1 1 #Ret{}}}}}}}
@f_half = @fix(λf @comp(0 * &0{f *} @t_half))
// Decreases all values by 1, counts the number of 0's.
// decs : ([ℕ]) → ([ℕ],ℕ)
// decs = λ?$$.?$^0@012.-^1@012.
@t_decs = #Lam{#Mat{
#New{#New{#Ret{}}}
#Mat{#New{#Emi{0 #Cal{0 1 2 #Ret{}}}}
#Era{#Emi{1 #Cal{0 1 2 #Ret{}}}}}}}
@f_decs = @fix(λf @comp(0 * &0{f *} @t_decs))
// Converts a list to a histogram.
// xs2h : ([ℕ]) → ([ℕ])
// xs2h = λ?$.+0@012^0@111.
@t_xs2h = #Lam{#Mat{
#New{#Ret{}}
#Put{0 #Cal{0 1 2 #Emi{0 #Cal{1 1 1 #Ret{}}}}}}}
@f_xs2h = @fix(λf @comp(0 * &0{@f_decs &0{f *}} @t_xs2h))
// Converts a histogram to a list.
// h2xs : ([ℕ],ℕ) → [ℕ]
// h2xs = λ?λ-$.?λ$+0@021.-+0λ&^0@021.
@t_h2xs = #Lam{#Mat{
#Lam{#Era{#New{#Ret{}}}}
#Mat{
#Lam{#New{#Put{0 #Cal{0 2 1 #Ret{}}}}}
#Era{#Put{0 #Lam{#Dup{#Emi{0 #Cal{0 2 1 #Ret{}}}}}}}}}}
@f_h2xs = @fix(λf @comp(0 * &0{f *} @t_h2xs))
// sort : ([ℕ]) → ([ℕ])
// sort = λ@011$@121.
@t_sort = #Lam{#Cal{0 1 1 #New{#Cal{1 2 1 #Ret{}}}}}
@f_sort = @fix(λf @comp(0 * &0{@f_xs2h &0{@f_h2xs &0{f *}}} @t_sort))
// Sorting Algorithm in ~150 bits
@list = @map(@nat [3 0 2 1 5 1])
@main = @map(@u32 (@f_sort @list))
@VictorTaelin (Author)

so, today I spent most of the day thinking about a language that minimizes the minimum description length (i.e., kolmogorov complexity) of useful algorithms. I don't think the pure functional language used by SupGen is great for this, because it has too many degrees of freedom, in places where that isn't really needed. for example, the base case of an algorithm is often simple, but it can be arbitrarily complex in SupGen. also, most recursive algorithms keep their arguments in a stable order, but SupGen will swap them viciously. also, constructors can be nested around arguments, resulting in even more combinations. and many other things. so, I've spent this time thinking: "what are the truly meaningful decisions one can make while crafting an algorithm?"

by a distillation process, where I repeatedly removed redundancies, while adding more expressivity, I ended up with a DSL that is, essentially, a stack-based language with some peculiarities. the main unit of computation is a function, which is kinda like a C procedure. instead of fixed length integers, values are either unit (()), or a vector of units, or a vector of vectors of units, and so on. the main computational operation is mat, which performs a list pattern-match (cons/nil) on the first element of the stack. we have a lam primitive which reads the next argument and pushes it to the stack, a use primitive that drops the first element of the stack, and a dup primitive that clones the first element of the stack. at any point, we can use the emi primitive to emit a constructor out - this ability is important to prune search branches. a function can return N values. the rec primitive recurses, recalling the function with the current stack as the arguments, and the ret primitive halts and returns an empty list. that's the only thing we can return on the halt case - more complex values are returned via emi. so, for example, to implement add(a,b), we need to fully traverse a and b, emitting a succ (which is cons(1,x)) for each constructor we pass, and returning [] at the end.
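the primitives described above can be modeled by a tiny interpreter. this is a hypothetical Python sketch, not the gist's implementation: the names `run`, `nat`, `un_nat`, `MUL2` and the tuple encoding are my own, and I'm assuming values are nil `[]` / cons `[head, tail]` pairs, that `ret` returns the remaining stack as the output slots, and that `rec` passes the current stack (bottom-first) as the recursive arguments:

```python
def nat(n):
    """Encode an int as a nested nil/cons value: 0 is [], 1+p is [[], p]."""
    v = []
    for _ in range(n):
        v = [[], v]
    return v

def un_nat(v):
    """Decode a nat value back into an int."""
    n = 0
    while v:
        n, v = n + 1, v[1]
    return n

def run(code, args):
    """Run one function on its argument list; stack top is at index 0."""
    stack = []
    args = list(args)

    def go(c):
        op = c[0]
        if op == 'lam':                # read the next argument, push it
            stack.insert(0, args.pop(0))
            return go(c[1])
        if op == 'mat':                # pop; branch on nil vs cons
            v = stack.pop(0)
            if v == []:
                return go(c[1])
            head, tail = v
            stack.insert(0, tail)
            stack.insert(0, head)      # head ends up on top
            return go(c[2])
        if op == 'new':                # push an empty value
            stack.insert(0, [])
            return go(c[1])
        if op == 'era':                # drop the top of the stack (the "use" primitive)
            stack.pop(0)
            return go(c[1])
        if op == 'dup':                # clone the top of the stack
            stack.insert(0, stack[0])
            return go(c[1])
        if op == 'emi':                # cons pop() onto output slot c[1]
            v = stack.pop(0)
            out = go(c[2])
            out[c[1]] = [v, out[c[1]]]
            return out
        if op == 'rec':                # recurse with the current stack as args
            got = run(code, list(reversed(stack)))
            stack[:] = got
            return go(c[1])
        if op == 'ret':                # halt: remaining stack = output slots
            return list(stack)
        raise ValueError(op)

    return go(code)

# mul2: match the input nat; on nil return a fresh zero; on cons, emit two
# succs onto slot 0 around the recursive call
MUL2 = ('lam', ('mat',
    ('new', ('ret',)),
    ('new', ('emi', 0, ('emi', 0, ('rec', ('ret',)))))))
```

under these assumptions, `un_nat(run(MUL2, [nat(3)])[0])` evaluates to 6: each matched constructor emits two succs, so the single linear loop doubles its input.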

this language seems to have an extremely good bit-length/expressivity ratio. the function equivalent to mul2 = λx. case x { 0:0 ; 1+p: 1+1+mul2(p) } has just 21 bits. for comparison, that function, on the binary lambda calculus, would be λn. ((n λs.λz.z) λp.λs.λz.(s λs.λz.(s (mul2 p)))), which takes about 50 bits. it is just doing more "useful things" per bit, while being in a form that is very efficient to compute, as each function is basically a linear loop that emits constructors.

it is also designed with composition in mind. my idea for that is that functions cannot call other functions - i.e., each function captures a loop that operates on its own local stack, without passing control to other functions. this is good for efficiency, and is also good for enumeration, since a function doesn't need to attempt to call external functions in its synthesis. instead, each function can be seen as an independent representation of some computational behavior or process. as such, composition must be done externally. since a function has N inputs and N outputs, we can connect them for more complex behaviors, in a network-like fashion. functions form building blocks of larger programs. we can form a "symbolic neural net" by using functions as neurons, or we can evolve a book of increasingly more useful functions, essentially evolving a "model" that increases in capacity with compute.

notably, I could express a sorting algorithm with just about 80 bits in that language(!!!). tomorrow I'll try to find it by search. I believe this might even be a new result

@VictorTaelin (Author)

I've updated the language above (which I now call KolmoLang, because it is a Kolmogorov Complexity minimizing language) and it is really pretty right now:

https://gist.github.com/VictorTaelin/41d3c07d62cbb0978924fa6d504c4bee

we now have the ability to call other functions, and the sort function I implemented is actually ~150 bits. now, we could make it much shorter if we could merge the 3 functions into one, but doing so would require a swap primitive. to see why, consider this sort in Haskell:

sort :: [Int] -> [Int] -> Int -> [Int]
sort []     [] n = []
sort []     ys n = sort ys [] (1+n)
sort (0:xs) ys n = n : sort xs ys n
sort (x:xs) ys n = sort xs (x-1:ys) n

main :: IO ()
main = print $ sort [1,7,5,2,4,9] [] 0
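to make the algorithm easy to check, here is a direct Python transcription of the Haskell sort above (the name `sort3` is mine; the logic is clause-for-clause the same):

```python
def sort3(xs, ys, n):
    """Counting-style sort: heads equal to 0 are emitted as the current pass
    number n; nonzero heads are decremented into ys; when xs empties, ys
    becomes the next pass and n is bumped."""
    if not xs and not ys:
        return []
    if not xs:
        return sort3(ys, [], n + 1)          # the clause that "swaps" xs and ys
    x, rest = xs[0], xs[1:]
    if x == 0:
        return [n] + sort3(rest, ys, n)      # a value v surfaces at pass n == v
    return sort3(rest, [x - 1] + ys, n)      # decrement x into ys
```

`sort3([1, 7, 5, 2, 4, 9], [], 0)` returns `[1, 2, 4, 5, 7, 9]`: each value v is decremented once per pass, so it reaches 0 exactly when n == v.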

this is not expressible in the current implementation of KolmoLang, because of the second clause, which swaps xs and ys. now, of course, we could go ahead and just add a swap primitive to KolmoLang, but doing so would have harsh consequences for the enumeration space. that's because swap is an operation that can be called in any context, so, the enumerator would always need to branch the current search, attempting to swap things in context. not only that, but swapping two values will result in wildly divergent unfoldings of the recursion - i.e., SUP nodes would collapse and we'd lose a lot of sharing. and to make things even worse, to make swap really expressive, what we really want is a mov operation that brings the nth element of the stack to the top - and that would result in even more permutations. in short, I predict the "stable argument order" of KolmoLang will be very important for program search, and, as such, I want to avoid swap and mov operations on the stack as much as possible. but how can we do so, while still keeping the language expressive?

so, here's a crazy idea:

what if we stratify the stack in layers, where each layer has a different type?

this is something we can do on KolmoLang, because this language has a countable amount of different value types

to understand why, notice KolmoLang has only 2 type formers:

Type ::= ⊥ | [Type]

I.e., the only types it has are Empty type, and the List type

this allows us to construct the Unit type, Nats, Lists of Nats, Lists of Lists of Nats, and so on:

- ⊥         = ⊥   = the Empty type
- [⊥]       = ⊤   = the Unit type
- [[⊥]]     = ℕ   = the Nat type
- [[[⊥]]]   = ℕ*  = the List type
- [[[[⊥]]]] = ℕ** = the Grid type

and it also allows us to draw an isomorphism between types and natural numbers:

type :: ℕ → Set
type 0   = ⊥
type 1+n = [type n]
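that isomorphism can be sketched in a few lines (hypothetical helper names, with the type rendered as a string just for illustration):

```python
def type_repr(n):
    """The n-th KolmoLang type: type 0 = ⊥, type (1+n) = [type n]."""
    s = "⊥"
    for _ in range(n):
        s = "[" + s + "]"
    return s

def type_index(s):
    """Inverse direction of the isomorphism: count the list wrappers."""
    return s.count("[")
```

so `type_repr(2)` is `"[[⊥]]"` (ℕ), and `type_index` recovers the natural number, making the enumeration of all types a plain walk over ℕ.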

one of the reasons this is cool, and desirable, is that it gives us a very lean "enumeration of all types". one of the problems that SupGen had is that types were very rich, so, when we tried to create an auxiliary function, or to extend an existing function with a "new state", we had to try way too many types, including tons of redundant types, and including types that wouldn't match. for example, the reverse :: [a] → [a] function can't be implemented without an auxiliary list (an extra argument), or an auxiliary function. but adding an extra argument would mean we have to enumerate all S in reverse :: [a] → S → [a], and that would include A LOT of types (bit, trit, pair of bits, pair of trits, list of pairs of ..., etc. etc.). and if we added an extra aux function, like push :: S, we would also need to try countless types that make no sense, like push :: Pokemon → PokerCard, and almost none of the enumerated push functions would even be callable inside the enumeration of reverse. so, this, I believe, will greatly simplify the search of functions that need aux functions, or extra arguments.

but the other reason I believe this is useful - and I'm not sure of this, but it does make sense to me - is that it allows us to stratify the stack in layers, so that, instead of a single stack, we have one stack for x : ⊤, one stack for n : ℕ, one stack for xs : [ℕ], etc. - i.e., stacks group members of the same type (and we don't need to add a type annotation!). and this, I believe, allows the language to be very expressive, even without the ability to swap elements of the stack. how so?

consider, for example, a min function:

min :: [ℕ] → [ℕ] → [ℕ]
...
min (x : xs) (y : ys) = ...

in this clause, we'd have the following stack:

[xs, x, ys, y]

now, what we want to do is take the lte of x and y. but we can't, because these numbers aren't on the top of the stack: there's a ys "in our way". so, to circumvent this situation, we need to perform:

mov(2)

which would bring the stack to:

[xs, ys, y, x]

which is still not what we want. so, now we need a:

mov(1)

giving us:

[xs, ys, x, y]

which now allows us to call lte, and complete the algorithm.
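the two mov steps above can be replayed over a plain Python list (top of stack at the end; mov itself is exactly the primitive the text argues against adding):

```python
def mov(stack, i):
    """Bring the element at depth i (0 = top) to the top of the stack.
    This is the hypothetical primitive KolmoLang wants to avoid."""
    stack.append(stack.pop(-1 - i))
    return stack

s = ["xs", "x", "ys", "y"]   # y on top, as in the min clause above
mov(s, 2)                    # brings x to the top: ["xs", "ys", "y", "x"]
mov(s, 1)                    # brings y to the top: ["xs", "ys", "x", "y"]
```

every such call point multiplies the enumeration space by the stack depth, which is the cost the stratified-stack design avoids.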

now, this may look harmless, but, again, remember mov would greatly increase the search space.

now, if we instead stratified the stack in types, we'd have, from the very beginning:

[x, y] : ℕ
[xs, ys] : [ℕ]

and we can now safely call lte : ℕ → ℕ → ℕ, as it will draw the two elements from the stack!

the implementation of lte becomes much shorter

now, what about sort? how does this help?

oops - I made a mistake in my reasoning.

the problematic clause, on sort, isn't:

sort []     ys n = sort ys [] (1+n)

(where we must swap xs and ys)

but, instead:

sort (x:xs) ys n = sort xs (x-1:ys) n

because, in this clause, our stack would be:

[xs x ys]

and we want to push x into ys, but we can't, as a put instruction would put ys inside x (which is ill typed), and there would be no way to swap them

now, with stacks stratified by types, we'd have, instead:

[x]
[xs ys]

and a put operation will correctly push x into ys, which is what we want!
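a put over type-stratified stacks can be sketched like this (a dict from type layer to stack; all names are mine, and nats are plain ints here for brevity - this is an illustration of the idea, not the gist's code):

```python
def put(stacks, layer):
    """Cons the top of the layer-k stack onto the top of the layer-(k+1)
    stack. No mov/swap is needed: the target is picked by type, not by
    position on a single shared stack."""
    v = stacks[layer].pop()
    stacks[layer + 1][-1] = [v] + stacks[layer + 1][-1]
    return stacks

# the sort clause above: x on the ℕ layer (2), xs and ys on the [ℕ] layer (3)
stacks = {2: [4], 3: [[1, 2], [7]]}   # tops at the end, so ys = [7] is on top
put(stacks, 2)                        # x is consed into ys, as intended
```

because layers are disjoint, put can never accidentally "put ys inside x": the only level-2 value available is x, and the only level-3 target on top is ys.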

now, regarding the clause:

sort []     ys n = sort ys [] (1+n)

this one is different; it doesn't require "swapping" to fix an "ill stack" - it is an actual part of the algorithm: we need to clear ys and move all its elements into xs. instead of a primitive swap, though (which would impact the search space of ALL functions), I think this should be done with an auxiliary move function, which can be expressed, and would move all elements of ys to xs (instead of literally swapping them).

so, basically - one of the key aspects of KolmoLang is that arguments are "sticky" in their positions. there is a notion of "persistence" that forbids us from swapping arguments when recursing - foo(xs,ys) can only recurse as foo(xs,ys), and not foo(ys,xs) - and that, I believe, greatly reduces the search space, while keeping the language very expressive. and partitioning the stack in types seems like the right thing to do, in that context.

finally, note that, if stratification were truly pervasive, then we could implement sort in an even more terse way:

sort :: [Int] -> [Int] -> [Int]
sort []     [] = []
sort []     ys = 1 + sort ys []
sort (0:xs) ys = 0 : sort xs ys
sort (x:xs) ys = sort xs (x-1:ys)

main :: IO ()
main = print $ sort [1,7,5,2,4,9] []

here, we're avoiding the extra n argument (which requires DUP!) by, instead, applying 1 + to the sorted list on the second clause. this would be ill-typed - we can't succ a list. but, if the succ constructor lives on layer 2 (the layer of the ℕ type), and the cons constructor lives on layer 3 (the layer of the [ℕ] type), then succ of a list would be legit: it would "passthrough", and essentially operate as map (1 +) (sort ys [])

this is the concept of "auto vectorization" discovered by T6, and I believe it is very relevant here, since it increases the useful expressivity per bit of the language - which is the main metric that KolmoLang is trying to maximize.
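the passthrough behavior can be sketched as follows (hypothetical names, plain ints and lists instead of nil/cons trees; this is my reading of the idea, not T6's implementation):

```python
def succ_at(value, value_layer, succ_layer):
    """Apply succ at `succ_layer` to a value living at `value_layer`: when
    the value sits above succ's layer, succ passes through as a map - the
    auto-vectorization behavior described above."""
    if value_layer == succ_layer:
        return value + 1
    return [succ_at(v, value_layer - 1, succ_layer) for v in value]
```

so a layer-2 succ applied to a layer-3 list acts as map (1 +): `succ_at([3, 1, 2], 3, 2)` gives `[4, 2, 3]`, and it keeps vectorizing through deeper layers (grids, etc.) for free.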
