gatlin · September 19, 2017 23:28
diff --git a/list.lhs b/list.lhs
 "Church-encoded" lists and their interesting properties
 ===

 *This is a literate Haskell program which you can run in ghci.*
 *Also, I already corrected a glaring problem. Let us speak no more of it.*

 One of the beautiful things about computer science is that, fundamentally,
 **data is code** and **code is data**. But what does that mean? And how is it
 useful?

 Below I implement and explain a lazy list data structure using (almost) nothing
 but boring old functions. For the most part the actual code is straightforward.

 *This is often called the Church encoding but this is not strictly accurate.*

 The `L a` type
 ---

 > {-# LANGUAGE Rank2Types #-}

 > import Prelude hiding ( foldr  -- we will be reimplementing these,
 >                       , foldl  -- so I have excluded them.
 >                       , map
 >                       , filter
 >                       , length
 >                       , zip
 >                       , zipWith
 >                       )

 > newtype L a = L { foldr :: forall r. (a -> r -> r) -> r -> r }

 This says: "a value of type `L a` wraps a strange-looking function. Applying
 `foldr` to an `L a` value will extract the strange-looking function."

 To get an intuition for how this is related to lists, consider:

    data List a = Cons a (List a) | Nil

 Then the types of the constructors would be:

    Cons :: a -> List a -> List a
    Nil  :: List a

 Replace `List a` with `r`:

    Cons :: a -> r -> r
    Nil  :: r

 Curious. So an `L` value wraps a function which accepts two functions and
 (presumably) evaluates one or the other. These two functions are oddly
 reminiscent of `Cons` and `Nil`.

 How does it know, though? And why the non-specific type `r`? We'll get there.

 Behind the magic
 ---

 What `newtype` has automatically generated is equivalent to the following:

    type L a = forall r. (a -> r -> r) -> r -> r

    L :: L a -> L a
    L x = x

    foldr :: L a -> (a -> r -> r) -> r -> r
    foldr xs op x = xs op x

 Thus `L` is a narrowly-typed identity function and `foldr` is a
 narrowly-typed application function. Composing them together in expressions
 shouldn't affect program semantics at all but it will give you guarantees about
 type safety. This composition also opens the door to a very special
 optimization.

 But first let's play with them a bit.

 List construction: `cons` and `nil`
 ---

 > cons :: a -> L a -> L a
 > cons x xs = L $ \c e -> c x (foldr xs c e)

 Fully evaluated, this is what a cons looks like:

    cons 1 nil                                  =>
    L (\c e -> c 1 (foldr (L (\c e -> e)) c e)) =>

 > nil :: L a
 > nil = L $ \c e -> e

 Example:

 > list1 :: L Int
 > list1 = cons 1 $ cons 2 $ cons 3 $ cons 4 $ cons 5 $ cons 6 nil

 Opening up a list
 ---

 It's all well and good to construct lists but equally, if not more, important
 is the ability to extract data from a list.

 Just as we construct a list by prepending a head to a tail, we will first
 write a function to split a list into its head and tail:

 > split :: L a -> (Maybe a, L a)
 > split xs = foldr xs f (Nothing, nil) where
 >     f y ys = (Just y, L (\c e ->
 >         maybe e (\x -> c x (foldr (snd ys) c e))
 >                (fst ys)))

 This works because `foldr`'s job is to tear down a list and transform it into
 some other type. In this case, we want to construct a tuple of `Maybe a` and
 `List a`. The last argument of `foldr` is the value to use if the list is
 empty.

 Otherwise we pass a function which will receive two values. One is indeed the
 head of the list, but the other is not quite what you might first expect. It
 may seem like something out of *Bill and Ted's Excellent Adventure* but since
 this operation is recursive, you must assume that the rest of the list has been
 taken care of. **Exercise: why?** Thus, `ys` is already an `(Maybe a, List a)`
 value.

 In Lisp languages, the traditional name for the function which extracts the
 first element of a list is called `car`. Since this doesn't clash with any
 Haskell names it's what I'll name the next function:

 > car :: L a -> a
 > car xs = maybe diverge id (fst $ split xs) where
 >     fix f = f (fix f)
 >     diverge = fix id

 `fix` computes the *fixpoint* of a function; that is, the value `x` for which
 `f x == x`. Diverge computes the fixpoint of the `id` function; the upshot is
 that `diverge` is a value which can take any type. `maybe` demands that we
 supply a final argument but if the list is non-empty we will never actually
 need the last element.

 Keeping with our naming scheme, the function to produce the remainder of the
 list is `cdr`:

 > cdr :: L a -> L a
 > cdr = snd . split

 For giggles, let's expose a "safe" `car` variant: one which returns `Just` an
 element if the list has an element, or `Nothing` otherwise:

 > car' :: L a -> Maybe a
 > car' = fst . split


 Demonstration: a `sum` function
 ---

 Let's get a feel for how this all works by writing a function to sum a
 list of numbers.

 > sum' xs = foldl xs (+) 0

 Evaluated on (cons 1 (cons 2 nil)):

    sum' (cons 1 (cons 2 nil))                              =>
    foldr (cons 1 (cons 2 nil)) (+) 0                       =>
    foldr (L (\c e -> c 1 (foldr (cons 2 nil) c e))) (+) 0  =>
    (\c e -> c 1 (foldr (cons 2 nil) c e)) (+) 0            =>
    (+) 1 ((foldr (cons 2 nil) (+) 0))                      =>
    (+) 1 ((foldr (L (\c e -> c 2 (foldr nil c e))) (+) 0)) =>
    (+) 1 ((\c e -> c 2 (foldr nil c e)) (+) 0)             =>
    (+) 1 ((+) 2 (foldr nil (+) 0))                         =>
    (+) 1 ((+) 2 (foldr (L (\c e -> e)) (+) 0))             =>
    (+) 1 ((+) 2 ((\c e -> e) (+) 0))                       =>
    (+) 1 ((+) 2 0)                                         =>
    (+) 1 2                                                 =>
    3

 There is an important symmetry to understand here: lists are an inductive
 type, and so are operations on lists. When performing some computation on a
 list, you must supply a base case (analogous to `Nil`) and an inductive case
 (analogous to `Cons`).

 The naive encoding of the data type creates an object in memory, a *thing*, and
 builds up a structure. The functional encoding, however, simply provides
 possible execution paths. `L a` values are control flow operators, deciding at
 every element of the list whether to continue or halt.

 Thus, data is code and code is data.

 Fusion
 ---

 As for the optimization, we know that when we see `foldr (L f) x y` we can
 easily rewrite it as `f x y` before we ever generate any code. `foldr` and `L`
 disappear when composed correctly. This is called *fusion*. Haskell provides a
 system to tell the compiler about such pairs of functions and this can result
 in much more efficient programs.

 Because ultimately you're not *doing* anything other than proving properties of
 your program. These proofs can be discarded once verified.

 Other examples
 ---

 For completeness here are a few other traditional list functions implemented in
 our scheme. As an exercise, try to figure out how and why they work.

 > foldl :: L a -> (r -> a -> r) -> r -> r
 > foldl bs f a = foldr bs (\b g x -> g (f x b)) id a

 > map :: (a -> b) -> L a -> L b
 > map f xs = foldr xs (\y ys -> cons (f y) ys) nil

 > filter :: (a -> Bool) -> L a -> L a
 > filter pred xs = foldr xs f nil where
 >     f y ys = L $ \c e -> if (pred y)
 >                            then c y (foldr ys c e)
 >                            else foldr ys c e

 > length :: L a -> Int
 > length xs = foldr xs (\_ -> (+1)) 0

 And here are a few more example uses:

 > isEven :: (Integral n, Eq n, Num n) => n -> Bool
 > isEven x = x `mod` 2 == 0

 > sum_1 = sum' list1 -- => 21
 > sum_2 = sum' . map (*2) $ list1 -- => 42
 > sum_3 = sum' . map (*2) . cdr $ list1 -- => 40
 > sum_4 = sum' . filter isEven $ list1 -- => 12

 A note on recursion.
 ---

 One very interesting property of these functions is that none of them are
 recursive. Yet, we define maps, filters, and folds which necessarily iterate
 the items of a list!

 Instead, these functions are *mutually recursive*. No single function calls
 itself directly, but different compositions of the basic functions cause
 program execution to loop as long as necessary. In this way the very concepts
 of iteration and recursion are defined in terms of list processing functions.

 And with tail call optimization and sibling call optimization, the compiler can
 generate machine code which takes up constant space.

 Now if only there were some language dedicated to LISt Processing, I
 bet it would be pretty powerful ...
	"Church-encoded" lists and their interesting properties
	===

	This is a literate Haskell program which you can run in ghci.
	Also, I already corrected a glaring problem. Let us speak no more of it.

	One of the beautiful things about computer science is that, fundamentally,
	data is code and code is data. But what does that mean? And how is it
	useful?

	Below I implement and explain a lazy list data structure using (almost) nothing
	but boring old functions. For the most part the actual code is straightforward.

	This is often called the Church encoding but this is not strictly accurate.

	The `L a` type
	---

	> {-# LANGUAGE Rank2Types #-}

	> import Prelude hiding ( foldr -- we will be reimplementing these,
	> , foldl -- so I have excluded them.
	> , map
	> , filter
	> , length
	> , zip
	> , zipWith
	> )

	> newtype L a = L { foldr :: forall r. (a -> r -> r) -> r -> r }

	This says: "a value of type `L a` wraps a strange-looking function. Applying
	`foldr` to an `L a` value will extract the strange-looking function."

	To get an intuition for how this is related to lists, consider:

	data List a = Cons a (List a) \| Nil

	Then the types of the constructors would be:

	Cons :: a -> List a -> List a
	Nil :: List a

	Replace `List a` with `r`:

	Cons :: a -> r -> r
	Nil :: r

	Curious. So an `L` value wraps a function which accepts two functions and
	(presumably) evaluates one or the other. These two functions are oddly
	reminiscent of `Cons` and `Nil`.

	How does it know, though? And why the non-specific type `r`? We'll get there.

	Behind the magic
	---

	What `newtype` has automatically generated is equivalent to the following:

	type L a = forall r. (a -> r -> r) -> r -> r

	L :: L a -> L a
	L x = x

	foldr :: L a -> (a -> r -> r) -> r -> r
	foldr xs op x = xs op x

	Thus `L` is a narrowly-typed identity function and `foldr` is a
	narrowly-typed application function. Composing them together in expressions
	shouldn't affect program semantics at all but it will give you guarantees about
	type safety. This composition also opens the door to a very special
	optimization.

	But first let's play with them a bit.

	List construction: `cons` and `nil`
	---

	> cons :: a -> L a -> L a
	> cons x xs = L $ \c e -> c x (foldr xs c e)

	Fully evaluated, this is what a cons looks like:

	cons 1 nil =>
	L (\c e -> c 1 (foldr (L (\c e -> e)) c e)) =>

	> nil :: L a
	> nil = L $ \c e -> e

	Example:

	> list1 :: L Int
	> list1 = cons 1 $ cons 2 $ cons 3 $ cons 4 $ cons 5 $ cons 6 nil

	Opening up a list
	---

	It's all well and good to construct lists but equally, if not more, important
	is the ability to extract data from a list.

	Just as we construct a list by prepending a head to a tail, we will first
	write a function to split a list into its head and tail:

	> split :: L a -> (Maybe a, L a)
	> split xs = foldr xs f (Nothing, nil) where
	> f y ys = (Just y, L (\c e ->
	> maybe e (\x -> c x (foldr (snd ys) c e))
	> (fst ys)))

	This works because `foldr`'s job is to tear down a list and transform it into
	some other type. In this case, we want to construct a tuple of `Maybe a` and
	`List a`. The last argument of `foldr` is the value to use if the list is
	empty.

	Otherwise we pass a function which will receive two values. One is indeed the
	head of the list, but the other is not quite what you might first expect. It
	may seem like something out of Bill and Ted's Excellent Adventure but since
	this operation is recursive, you must assume that the rest of the list has been
	taken care of. Exercise: why? Thus, `ys` is already an `(Maybe a, List a)`
	value.

	In Lisp languages, the traditional name for the function which extracts the
	first element of a list is called `car`. Since this doesn't clash with any
	Haskell names it's what I'll name the next function:

	> car :: L a -> a
	> car xs = maybe diverge id (fst $ split xs) where
	> fix f = f (fix f)
	> diverge = fix id

	`fix` computes the fixpoint of a function; that is, the value `x` for which
	`f x == x`. Diverge computes the fixpoint of the `id` function; the upshot is
	that `diverge` is a value which can take any type. `maybe` demands that we
	supply a final argument but if the list is non-empty we will never actually
	need the last element.

	Keeping with our naming scheme, the function to produce the remainder of the
	list is `cdr`:

	> cdr :: L a -> L a
	> cdr = snd . split

	For giggles, let's expose a "safe" `car` variant: one which returns `Just` an
	element if the list has an element, or `Nothing` otherwise:

	> car' :: L a -> Maybe a
	> car' = fst . split


	Demonstration: a `sum` function
	---

	Let's get a feel for how this all works by writing a function to sum a
	list of numbers.

	> sum' xs = foldl xs (+) 0

	Evaluated on (cons 1 (cons 2 nil)):

	sum' (cons 1 (cons 2 nil)) =>
	foldr (cons 1 (cons 2 nil)) (+) 0 =>
	foldr (L (\c e -> c 1 (foldr (cons 2 nil) c e))) (+) 0 =>
	(\c e -> c 1 (foldr (cons 2 nil) c e)) (+) 0 =>
	(+) 1 ((foldr (cons 2 nil) (+) 0)) =>
	(+) 1 ((foldr (L (\c e -> c 2 (foldr nil c e))) (+) 0)) =>
	(+) 1 ((\c e -> c 2 (foldr nil c e)) (+) 0) =>
	(+) 1 ((+) 2 (foldr nil (+) 0)) =>
	(+) 1 ((+) 2 (foldr (L (\c e -> e)) (+) 0)) =>
	(+) 1 ((+) 2 ((\c e -> e) (+) 0)) =>
	(+) 1 ((+) 2 0) =>
	(+) 1 2 =>
	3

	There is an important symmetry to understand here: lists are an inductive
	type, and so are operations on lists. When performing some computation on a
	list, you must supply a base case (analogous to `Nil`) and an inductive case
	(analogous to `Cons`).

	The naive encoding of the data type creates an object in memory, a thing, and
	builds up a structure. The functional encoding, however, simply provides
	possible execution paths. `L a` values are control flow operators, deciding at
	every element of the list whether to continue or halt.

	Thus, data is code and code is data.

	Fusion
	---

	As for the optimization, we know that when we see `foldr (L f) x y` we can
	easily rewrite it as `f x y` before we ever generate any code. `foldr` and `L`
	disappear when composed correctly. This is called fusion. Haskell provides a
	system to tell the compiler about such pairs of functions and this can result
	in much more efficient programs.

	Because ultimately you're not doing anything other than proving properties of
	your program. These proofs can be discarded once verified.

	Other examples
	---

	For completeness here are a few other traditional list functions implemented in
	our scheme. As an exercise, try to figure out how and why they work.

	> foldl :: L a -> (r -> a -> r) -> r -> r
	> foldl bs f a = foldr bs (\b g x -> g (f x b)) id a

	> map :: (a -> b) -> L a -> L b
	> map f xs = foldr xs (\y ys -> cons (f y) ys) nil

	> filter :: (a -> Bool) -> L a -> L a
	> filter pred xs = foldr xs f nil where
	> f y ys = L $ \c e -> if (pred y)
	> then c y (foldr ys c e)
	> else foldr ys c e

	> length :: L a -> Int
	> length xs = foldr xs (\_ -> (+1)) 0

	And here are a few more example uses:

	> isEven :: (Integral n, Eq n, Num n) => n -> Bool
	> isEven x = x `mod` 2 == 0

	> sum_1 = sum' list1 -- => 21
	> sum_2 = sum' . map (*2) $ list1 -- => 42
	> sum_3 = sum' . map (*2) . cdr $ list1 -- => 40
	> sum_4 = sum' . filter isEven $ list1 -- => 12

	A note on recursion.
	---

	One very interesting property of these functions is that none of them are
	recursive. Yet, we define maps, filters, and folds which necessarily iterate
	the items of a list!

	Instead, these functions are mutually recursive. No single function calls
	itself directly, but different compositions of the basic functions cause
	program execution to loop as long as necessary. In this way the very concepts
	of iteration and recursion are defined in terms of list processing functions.

	And with tail call optimization and sibling call optimization, the compiler can
	generate machine code which takes up constant space.

	Now if only there were some language dedicated to LISt Processing, I
	bet it would be pretty powerful ...