Consider the following straightforward mutable queue implementation using two stacks:
type 'a adt_queue = {
mutable head : 'a head;
mutable tail : 'a tail;
}
and 'a head = Cons of 'a * 'a head | Head
and 'a tail = Snoc of 'a tail * 'a | Tail
let create () = { head = Head; tail = Tail }
let push t x = t.tail <- Snoc (t.tail, x)
let rec rev_to head = function
| Tail -> head
| Snoc (xs, x) -> rev_to (Cons (x, head)) xs
let rev tail = rev_to Head tail
let pop_opt t =
match t.head with
| Cons (x, xs) ->
t.head <- xs;
Some x
| Head ->
match t.tail with
| Tail -> None
| Snoc _ as tail ->
match rev tail with
| Head -> assert false (* <-- This is annoying! *)
| Cons (x, xs) ->
t.head <- xs;
Some x
You might have noticed the use of the separate _ head
and _ tail
list type
constructors for the head and the tail, respectively. For the above having such
types is not strictly necessary, but it is a stepping stone for the changes
we'll introduce soon.
As pointed out by the comment, one annoying thing there is that at the point we
match on the result of rev tail
we know that it will not be an empty list, or
the constructor Head
in this case, but the OCaml compiler doesn't know that.
Could we use GADTs to avoid that unnecessary case? Well, yes, we can. The problem with GADTs, however, is that they tend to be tricky to use. If you've used them in anger, so to speak, you probably know what I'm talking about. OTOH, if you find them entirely intuitive to use, you might want to consider writing a tutorial or two on them! In the following I'll describe a pattern for using GADTs with polymorphic variants and existentials that allows cases like the above to be handled fairly mechanically. It should be noted, however, that this is not the only way to achieve the effect we want, which is to eliminate the unnecessary case from the above. This is just a pattern that makes it straightforward and mechanical.
Consider the following GADT:
type _ t =
| Foo : [> `Foo ] t
| Bar : [> `Bar ] t
| Baz : [> `Baz ] t
Each of the cases mechanically includes the label as a polymorphic variant that
also allows more tags (i.e. they have the >
). This allows us to specify
functions that only accept a subset of the cases as long as we explicitly type
them and mechanically specify them to only allow the desired tags or less (i.e.
they have the <
):
let foo_or_bar (t : [< `Foo | `Bar ] t) =
match t with
| Foo -> "It was a Foo"
| Bar -> "It was a Bar"
let _ = foo_or_bar Foo
let _ = foo_or_bar Bar
let just_baz (t : [< `Baz ] t) =
match t with
| Baz -> "It is a Baz"
let _ = just_baz Baz
Specifying the bounds like this can be a bit cumbersome and it also requires
polymorphism. We can also specify types that only allow a specific subset by
using existentials. For example, we could define an existential for the subset
that includes Foo
and Bar
:
type foo_or_bar = Foo_or_Bar : [< `Foo | `Bar] t -> foo_or_bar [@@unboxed]
Note we marked the existential type as [@@unboxed]
. This is important for
performance (and sometimes other) reasons. It basically means that the
Foo_or_Bar
constructor is the identity function and using it allocates no
memory.
Just like when specifying the bounds explicitly, having a foo_or_bar
value we
can pattern match on the subset of cases it allows:
let foo_or_bar (Foo_or_Bar t) =
match t with
| Foo -> "It was a Foo"
| Bar -> "It was a Bar"
Nice!
Let's then put the above to use and reimplement the queue in a mechanical way avoiding the unnecessary pattern match for the empty list.
First we define a new set of types for the queue. Just like we defined the
foo_or_bar
existential for the _ t
GADT, for the new GADTs, (_, _) head
and (_, _) tail
, we also define _ head_pack
and _ tail_pack
existentials:
type 'a gadt_queue = {
mutable head : 'a head_pack;
mutable tail : 'a tail_pack;
}
and ('a, _) head =
| Cons : 'a * 'a head_pack -> ('a, [> `Cons ]) head
| Head : ('a, [> `Head ]) head
and 'a head_pack =
| H : ('a, [< `Cons | `Head ]) head -> 'a head_pack [@@unboxed]
and ('a, _) tail =
| Snoc : 'a tail_pack * 'a -> ('a, [> `Snoc ]) tail
| Tail : ('a, [> `Tail ]) tail
and 'a tail_pack =
| T : ('a, [< `Snoc | `Tail ]) tail -> 'a tail_pack [@@unboxed]
Complicated as the above might seem at first glance, one can arrive at the above mechanically based on the previous discussion.
The create
function is entirely straightforward:
let create () = { head = H Head; tail = T Tail }
The only addition is that we used the existential constructors H
and T
for
the head and tail, respectively.
The same goes for the push
operation, we simply tag the new tail with the
existential constructor T
:
let push t x = t.tail <- T (Snoc (t.tail, x))
For the rev
and rev_to
functions we use both explicitly bounded types and
also the existential tags where needed:
let rec rev_to (head : (_, [< `Cons ]) head) = function
| T Tail -> head
| T (Snoc (xs, x)) -> rev_to (Cons (x, H head)) xs
let rev = function (Snoc (tail, x) : (_, [< `Snoc ]) tail) ->
rev_to (Cons (x, H Head)) tail
Finally we can write pop_opt
:
let pop_opt t =
match t.head with
| H (Cons (x, xs)) ->
t.head <- xs;
Some x
| H Head ->
match t.tail with
| T Tail -> None
| T (Snoc _ as tail) ->
t.tail <- T Tail;
match rev tail with
| Cons (x, xs) ->
t.head <- xs;
Some x
All of the above compile with no errors nor warnings and we avoid the
unnecessary pattern match case on rev tail
!
Like said earlier, the nice thing about this approach is that this is fairly mechanical. You just tag the GADT constructors with their names and you explicitly specify the subset of constructors when needed. You can then use existentials to specify subsets as convenience types and to encapsulate the polymorphism.
With that said, it is possible to achieve the same with precision use of GADTs, but it gets tricky.