@luqui
Last active December 13, 2015 17:39
Module system brainstorming

Goals

It's hard for me to make progress on opusmodus when I don't feel like I can sufficiently modularize my changes. The file is 700 lines, and that does not include test support. It is too much to manage at once. Now I have an algorithm for splitting & beaming which involves a number of auxiliary concepts (a theory of intervals and linear transformations), which don't seem at home at the top level. I want to be able to package them up in a pretty way, and this is an opportunity to explore some of the abstract flexibility I have been pondering for the last two and a half years.

In my case, there is a concept of intervals, which comes with some operations and laws. There is also vocabulary associated with intervals, which may involve some other concepts. This is all independent of their implementations -- the operations and the vocabulary. Traditionally, the vocabulary is a collection of functions implemented in terms of the basic operations on intervals; however, the key idea I have been exploring is a shift of foundation, meaning that sometimes derived vocabulary is recast as the basic operations of the same abstraction or of a new abstraction. I want to blur this line between fundamental operations and derived vocabulary.

In particular, the concept of intervals needed to define my beaming algorithm involves the following vocabulary (in Agda-style notation):

Interval : *
contains     : Interval x Interval -> Bool
disjoint     : Interval x Interval -> Bool
intersection : Interval x Interval -> Interval
length       : Interval -> Rational
split        : Integer  x Interval -> List Interval
transform    : Interval -> (Rational <-> Rational)   -- (<->) is an invertible function
-----
beam : Interval -> Beaming

That is a lot of assumptions, and further these are only the ones that involve Interval directly. There is also a theory of Bools, Rationals, Lists, invertible functions, and Beamings. There are also unmentioned laws describing the algebraic meaning of these operations, or perhaps denotational semantics describing their denotational meaning. It is certainly too much to list all this directly just to define beam. However, beam is truly abstract in these operations, so we would like to express that somehow -- if one can give an alternate model of Intervals, beam can use it for better performance or alternate semantics.
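For concreteness, here is a minimal Haskell sketch of one way to package these assumptions: a record of operations, with beam abstract over the record. Every name here (IntervalOps, the field names, the Beaming placeholder, the stub body) is illustrative rather than fixed vocabulary.

data Beaming = Beaming   -- placeholder for the real theory of Beamings

data IntervalOps i = IntervalOps
  { contains     :: i -> i -> Bool
  , disjoint     :: i -> i -> Bool
  , intersection :: i -> i -> i
  , ilength      :: i -> Rational
  , isplit       :: Integer -> i -> [i]
  , transform    :: i -> (Rational -> Rational, Rational -> Rational)
      -- (<->) rendered as a pair of mutually inverse functions
  }

-- beam can use only what the record provides, so any model will do; the
-- body is a stub here.
beam :: IntervalOps i -> i -> Beaming
beam _ _ = Beaming

Since Beaming is a constant type, the free theorem for beam's type already gives the naturality condition worked out below.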

It seems reasonable that good support for this kind of abstraction should also give us the ability to express properties of more general scope such as homomorphisms and free constructions. It seems that I am looking for an abstract-algebraic approach to programming.

I'm using the relatively trivial theory of intervals to demonstrate this, which might seem silly since intervals over Rationals are pretty obviously implemented, likely optimally, by pairs of rationals under the constraint that the left is less than or equal to the right. But this is something I have had in mind for opusmodus from the beginning, although I never thought I would need to use it (I still don't, but I like to dream), since I am using obviously inefficient basic operations on my notation model to do calculations. Representing ordered keyed sets as unordered lists is ridiculously stupid, and if performance were a concern it would be a game-killer. However, the task is difficult enough that doing things correctly from the beginning would have been too hard. I want a system in which I could start with a model and a naive implementation and gradually move to a better model. If I built a conversion T between the bad representation and the good representation, I could do this gradual move by conjugating (T inverse . f . T) whenever the necessary operations were not implemented on the good representation.
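As a minimal sketch of that conjugation move (the Iso record and its field names are illustrative):

-- T converts the good representation back to the bad one; T inverse . f . T
-- borrows an operation f that is only implemented on the bad representation.
data Iso bad good = Iso { toBad :: good -> bad, fromBad :: bad -> good }

conjugate :: Iso bad good -> (bad -> bad) -> (good -> good)
conjugate t f = fromBad t . f . toBad t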

These are the kinds of abilities I would like my module system to facilitate. Now we reach the harder part, the part of engineering rather than dreaming of engineering -- how?

Theoretical Foundations

We could define a theory as a collection of operations as above, a model as an implementation of those operations, and a morphism as a structure-preserving map between models of the same theory, then go nuts in category land (or begin to, perhaps starting to understand it along the way). That is certainly tempting, but I don't yet understand how to handle relationships between operations of multiple theories. For example:

split : Integer x Interval -> List Interval

This operation connects three different theories -- Integer, Interval, and List (or perhaps List Interval). Let us look at its Haskell implementation:

split :: (Integer, (Rational, Rational)) -> [(Rational, Rational)]
split (n, (a,b)) = [ (a + l*ki, a + l*(ki+1)) | k <- [0..n-1], let ki = fromIntegral k ]
    where
    l = (b-a) / fromIntegral n
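For example, this definition sends split (4, (0, 1)) to [(0, 1/4), (1/4, 1/2), (1/2, 3/4), (3/4, 1)].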

This definition uses the concrete pairs-of-rationals model of Intervals -- it is not defined in terms of the other abstract operations of intervals. We could say that it is defined abstractly on Integers and Lists, perhaps (Haskell sees these as concrete types, so it is difficult to differentiate), but it also makes use of a connection between Integers and Rationals: fromIntegral. Since it is defined concretely in terms of the Rational x Rational model of Interval, we can extend its definition to any model of Interval which is isomorphic (isomorphism is required because Interval appears in both the domain and codomain of this function) to the Rational x Rational model.

Let us back up a moment and look at the parameters of the subtheory needed to define beam. It is a subtheory because in my mind the whole theory of intervals is open: it includes all vocabulary about intervals that ever will be used, whereas here we have a concrete set of operations we need. Anyway, these operations mention the following types: Bool, Rational, List, (<->), and Beaming. (It is odd to me to consider Beaming part of the subtheory of intervals -- but perhaps that is just a naming problem. Maybe we should call this the subtheory of beam, which then makes sense.) But types are just free variables if you don't consider their operations. Technically, the part of each of these theories that we need to define this theory of intervals includes whatever operations and properties we need to express the laws of the theory. And, as I just reminded myself in that parenthetical, we are talking about the subtheory of beam, not intervals, so we only need those laws about intervals that are needed to prove that beam is correct. (Of course we will not include a formal Martin-Löf proof; I find that to be too much work to pay off in most escapades of software engineering, but it is useful to at least think about what you would need to prove it if you wanted to -- this informs the formal models.)

Let's denote by BEAM the subtheory needed for beam. A model of BEAM is an assignment of abstract types to concrete types (relative to some context) and abstract functions to implementations that satisfies the laws. Given two models A and B of BEAM, a morphism f maps types to types and operations to operations in a structure-preserving way. For example, since we have contains : Interval x Interval -> Bool, a necessary property for f to be a morphism would be:

f[contains (a,b)] = f[contains] (f[a], f[b])

(Apparently products and functions are blessed enough that f is assumed to map them to themselves -- in the land of idealistic purity, BEAM would be parametric on the theory of products and functions as well, which would require that f act on them as well. I think I am deriving category theory here. That is consistent with my goals.)
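In terms of the illustrative IntervalOps record sketched above, the contains clause of this condition can be written as a checkable property (f acts as the identity on the blessed type Bool):

-- One clause of "f is a morphism from model opsA to model opsB".
morphismOnContains :: IntervalOps a -> IntervalOps b -> (a -> b) -> a -> a -> Bool
morphismOnContains opsA opsB f i j =
    contains opsB (f i) (f j) == contains opsA i j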

Given a definition of morphisms on BEAM (forming a category of its models), we can state a naturality condition for beam, which we will have if nothing funny is going on in its implementation:

f[beam_A i] = beam_B f[i]

where beam_A and beam_B are the instantiations of beam to the models A and B of BEAM. This condition will be assumed for anything that is defined in terms of beam. These are merely the free theorems for its type.

Stepping back -- each definition has a theory of its assumptions: everything it is defined in terms of, together with the properties about those things required to prove that the definition is correct (in other words, an algebrification of its intended meaning). At least from this angle, theories do have a natural notion of morphism, which boils down to definitions in terms of them being parametric in their assumptions. Perhaps that obvious notion of morphism breaks down as we include properties in the theories (but because of Curry-Howard, it seems reasonable that it would not).

Algebraic Data Types follow as free constructions over theories. Here is a small theory of Intervals with some laws to work with:

Interval : *
interval : (x y : Rational | x <= y) -> Interval 
intersection : Interval x Interval -> Interval
intersection x x = x

The free construction over this should give us approximately this Haskell type:

data Interval 
    = Interval Rational Rational
    | Intersection Interval Interval

However, the law suggests a condition:

for any f : Interval -> A, f (Intersection x x) = f x.

Of course, this is not true of any f, only of "natural" f, for a sort of naturality that I am not comfortable with yet. Instantiating a definition with a free model over its theory corresponds to looking at the syntax of the definition -- the laws of the theory impose restrictions on how you are allowed to make use of the information you learn from looking at the syntax. This seems like an interesting line of inquiry, but it does not need to be considered immediately -- let's table the discussion of free models for now.

Programming

Let's now embark on making these theoretical ideas useful in a programming context. I think it is fairly self-evident that explicitly stating the theory of every definition is intractable; however, to make use of these concepts, it would be useful to be able to view the theory of a definition as a programmatic object. It follows that we should strive to automatically derive the theory of a definition.

Definitions using similar theories are usually made nearby each other (but not always!), so this brings in my desire for Coq-style sections: these introduce vocabulary to be used in a number of definitions. For the toy theory above:

(parameter Interval Type)
(parameter make-interval (-> (* Q Q) Interval))
(parameter intersection (-> (* Interval Interval) Interval))

If these parameters are declared in a block, then any other definitions in the block may mention these names, and as a result those names are included in the theory of the definition (along with any dependencies; e.g. mentioning intersection includes Interval). However, if a definition makes no mention of a parameter in scope, that parameter is not included in the theory.

A sufficiently advanced type system would allow you to assume axioms and prove theorems. Let's not venture there for the time being -- I am interested more in modularity than verification.

What of the split function above? This function is abstract in Integer, Rational, and List, but not in Interval. Its signature

split : Integer x Interval -> List Interval

is only true for a particular model of Interval, namely Rational x Rational. Well, perhaps we should be more precise: the implementation above is defined in terms of that model; the concept of splitting an interval into equal pieces ought to be universal to any implementation of intervals. So maybe the theory of this split implementation includes Interval = Rational x Rational as a hypothesis. Technically if we were working in Martin-Löf type theory we would need to employ an axiom of this type in order to convert between Interval and Rational x Rational, so including it as an assumption is a tantalizing possibility. Something feels off about it -- it may only be that Rational x Rational is not descriptive enough. Alice has written this version, but Bob, who has a different orientation toward direction, has written another version in which Interval = Rational x Rational but the smaller endpoint is the right component of the pair. We would not want to conflate these models.
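To make the danger concrete, a tiny hypothetical illustration of Alice's and Bob's models:

newtype AliceInterval = AliceInterval (Rational, Rational)  -- (lo, hi)
newtype BobInterval   = BobInterval   (Rational, Rational)  -- (hi, lo)

-- The same underlying values, but mixing up the two models would silently
-- negate every length.
aliceLength :: AliceInterval -> Rational
aliceLength (AliceInterval (lo, hi)) = hi - lo

bobLength :: BobInterval -> Rational
bobLength (BobInterval (hi, lo)) = hi - lo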

This is similar to the state of Haskell, in which a typeclass can have a "preferred instance"; e.g. Maybe a is a monoid in at least two ways: mappend = mplus and mappend = liftA2 mappend. The choice was changed from the former to the latter at some point because we started to value composability -- but that caused code to break, and I don't ever want code to break. (My systems desire to change gradually -- they are no good if changing something requires breaking something. This is the same as the shift-of-foundation goal at the beginning of this development.) Haskell requires wrapping newtypes around things if you want different instances, which is a pain and falls out of the unfortunate but pragmatic decision that a type can only be an instance of a typeclass in one way. This is more reasonable than it at first seems if you are thinking of types not as sets but as algebras -- a type is not only the collection of its values, but also the way it is related to other concepts. This is a useful way to think, and it leads to most of the power in advanced Haskell programming. But I feel it is wrong: it is bothersome that we use the same type to refer to the aforementioned monoid and the partiality monad (when all monads are themselves monoids in a different way), but use a different type First to refer to a different monoid with the same values, which is not considered a monad when it could be. Types and algebras are related in Haskell, but their relationship is not crystal-clear.

The easy way out is to do away with typeclasses and pass algebras around as parameters, for example

mconcat :: Monoid a -> [a] -> a

That is, mconcat is defined as a function [a] -> a, but rather than the Haskell way of requiring that a be a monoid -- which is rather blasphemous for mathematical language, in which one would be more inclined to say that (a, 0, +) is the monoid -- mconcat is defined for any monoid over values of type a, which includes the operations 0 and +. This is more composable but also more cumbersome; not only in that it should be pretty obvious which ring on the integers we are referring to when we add, but also in the burden of constructing proof terms for algebras. It is quite a wonderful thing in Haskell that we can write:

foo :: [Int -> Maybe [a]] -> Int -> Maybe [a]
foo = mconcat

rather than

foo = mconcat (funcMonoid (maybeMonoid listMonoid))

and the type checker fills in all that garbage for us. I am willing to consider that that is a happy pinch of sugar that is gained from the unclarity of Haskell's model -- after all, when we write the former, we mean the latter, and this is statically knowable by both the compiler and the programmer. The only benefit is brevity (and agility of thought, since we can see the monoid in question more quickly than we can write it, but only if we have been properly mentally primed -- if this occurred deep within a function where the type signature was not right in front of us, it may not be so obvious).
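Spelling those proof terms out, here is a minimal sketch of what the combinators above could look like when algebras are passed as values; the definitions are illustrative, and the names shadow the Prelude's:

import Prelude hiding (Monoid, mconcat)

data Monoid a = Monoid { munit :: a, mop :: a -> a -> a }

mconcat :: Monoid a -> [a] -> a
mconcat m = foldr (mop m) (munit m)

listMonoid :: Monoid [a]
listMonoid = Monoid [] (++)

-- the liftA2 mappend choice mentioned above, with Just munit as the unit
maybeMonoid :: Monoid a -> Monoid (Maybe a)
maybeMonoid m = Monoid (Just (munit m)) (\x y -> mop m <$> x <*> y)

funcMonoid :: Monoid b -> Monoid (a -> b)
funcMonoid m = Monoid (const (munit m)) (\f g x -> mop m (f x) (g x))

foo :: [Int -> Maybe [a]] -> Int -> Maybe [a]
foo = mconcat (funcMonoid (maybeMonoid listMonoid))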

Passing algebras as parameters is a natural way to go for type-centric languages like Agda (and Haskell, when typeclasses begin to show their limitations); it essentially demotes the concept of an algebra to a value of a type, and we begin to focus our eyes on the type of monoids over a and how we can manipulate them with functions, rather than on the nature of monoids as a structure. However, especially in light of the discussion of morphisms above, this system seems to be focusing differently. Let's remind ourselves that

contains : Interval x Interval -> Bool

induces the naturality condition on f that

f[contains (a,b)] = f[contains] (f[a], f[b]).

This condition arises from the categorical structure of products and functions; products and functions are not just classifications of values that are pairs and computable mappings. The fact that contains is implemented as a value of the form λ(i,j). ... is almost an accident -- a result of our constraint of being situated in a functional programming language rather than a result of the nature of our construction. Indeed, we should be able to think of a Church-pair λf. f a b as just as much a value of a product type as (a,b); we should be able to think of a recursively enumerable list of ordered pairs as just as much a function as a lambda expression (in this latter case, they do have some distinct properties from each other, however). These choices do not affect our theoretical development of the concepts of our program, and thus we should be able to abstract them away.
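A quick Haskell illustration of the Church-pair point (RankNTypes is needed for the encoding; all names are mine):

{-# LANGUAGE RankNTypes #-}

-- A Church pair supports the same introduction and elimination as (a, b);
-- only the representation differs.
type ChurchPair a b = forall r. (a -> b -> r) -> r

toChurch :: (a, b) -> ChurchPair a b
toChurch (a, b) = \f -> f a b

fromChurch :: ChurchPair a b -> (a, b)
fromChurch p = p (,)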

In summary, passing algebras as parameters is not a suitable design decision for our system, and Interval = Rational x Rational is not a suitable hypothesis for our implementation of split. We require a different way to see algebras than is traditional in functional and dependently-typed programming.

The question of how to situate the implementation of split is essential to the design of the entire system -- after all, we can talk of abstract relationships between concepts ad nauseam (and we have), but at the end of the day I have a MusicXML renderer to write, and it will require at least one line of executable code. Every line of code involves at least one concrete instance of an abstraction -- in light of what we have just talked about, even writing a function that takes an argument invokes a particular model of the notion of function. So I will focus on the following question: what is the relationship between Intervals and their implementation as pairs of Rationals?

When you make a definition that in a traditional view involves specializing a type parameter, you are actually simultaneously defining that parameter. These definitions should not be separated. So you place them in an object, which you can think of as an abstraction boundary. So, for example, the object defining split looks like this:

(object
  (Interval Type 
    (* Rational Rational))
  (split    (-> (* Integer Interval) (List Interval))
    ...)).

It is so called because it defines an object of its category, which in this case is

Interval : *
split : Integer x Interval -> List Interval.

If you wish to extend this object with more operations on the same representation, you can define another object with the same representation for Interval and join them. When two objects that both define the same symbol are joined, the symbols are required to be defined to the same thing.
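For comparison, a rough Haskell analogue of this object might look like the following; the module name and its contents are only a rendering, and Haskell approximates joining by adding exports to the module:

module SplitObject (Interval, split) where

-- Representation choice and the definition made against it live together
-- behind one abstraction boundary.
newtype Interval = Interval (Rational, Rational)

split :: Integer -> Interval -> [Interval]
split n (Interval (a, b)) =
    [ Interval (a + l * fromIntegral k, a + l * fromIntegral (k + 1))
    | k <- [0 .. n - 1] ]
  where
    l = (b - a) / fromIntegral n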

Types and Signatures

In the previous section, we saw an object which defined some executable code that split an interval into equal parts. But in order to make use of that code, we need to connect it to something. The natural way to do this would be just to bind it to a variable; e.g.

(defparameter split-code
  (object
    ...))

That is all well and good as a way to embed it into the traditional paradigm (you need to do this at some point in order to run it); but what if you want to stay inside this module system while defining objects? It's a concrete definition, not an abstract declaration, so naturally it should go in an object:

(object
  (split-code ?? 
    (object
      ...)))

in which ?? refers to its type. It has a type; at this level it is actually not an object as the name would have you believe, but a functor between its assumptions and its definitions. So its type would be

(-> () (signature (Interval Type) (split (-> (* Integer Interval) (List Interval)))))

The empty signature in the domain of this functor would be something else if any parameters were declared; for example

(object
  (split-code (-> (signature (Integer Type)) (signature (Interval Type) ...))
    (parameter Integer Type)
    (object ...)))

I think all the objects in a block are merged to form a single functor. I'm not sure how that's useful. But that means that if you want multiple objects of a single abstraction, you should name the abstraction:

(by the way, I'm going to drop the redundant type annotations on object fields and replace them with _; they should be inferrable (or ignorable), at least until we get into Turing-land; we'll address that later)

(object _
  (semigroup (signature (T Type) (x (fun (T T) T))))
  
  (int-plus (-> () semigroup)
    (object
      (T _ Integer)
      (x _ #'+)))
      
  (int-times (-> () semigroup)
    (object
      (T _ Integer)
      (x _ #'*))))

We have begun to see a multi-level distinction between the object language and the meta language. The object language of this system refers to its definitions as objects, but once you unwrap it by a layer it becomes a functor. If we wished to unwrap another level

(object
  (split-code-code ???
    (object
      (split-code ??
        (object
          ...)))))

The outer ??? could be (-> () (signature (split-code (-> () (signature (Interval Type) ...))))). I'm not 100% sure about that one, but it's something like that -- a functor from the empty signature to a functor category of a specific signature. In any case it's a mouthful and it would be nice to have it inferred; it's hard to know whether type inference will be possible -- I think at this level it's possible, until we start having to define things with lambdas (which are in a different category).
