Created
          August 7, 2019 01:41 
        
      - 
      
 - 
        
Save solomon-b/5982ba160c66b427217095f793cd7e5f to your computer and use it in GitHub Desktop.  
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
  | {-# LANGUAGE InstanceSigs #-} | |
| {-# LANGUAGE FlexibleInstances #-} | |
| {-# LANGUAGE TypeSynonymInstances #-} | |
| module Main where | |
| {- | |
| When writing programs in a statically typed language such as Haskell, | |
| there are two concepts we often think about and try to reconcile: | |
| Generic Programming - Generally extensible and repurposable code. (This | |
| can be achieved through a variety of strategies). | |
| Algebraic Datatypes as DSLs - Encoding the semantics of your program | |
| into ADTs. In this view we see the data types as a DSL and the | |
| processing of the data types as the interpretation of the DSL. | |
| These two strategies seek to resolve different problems: | |
| Generic Programming has the goal of more universally applicable code. | |
| Embedded DSLs seeks to constrain, via the typesystem, the space of possible | |
| programs to elminate meaningless but otherwise valid expressions. | |
| So the question Oleg's paper wishes to discuss is how can we write | |
| programs that are both highly extensible and generic, but which are | |
| also able to express invariants and validity constraints. | |
| Oleg names the solution "Typed Extensible Interpreters (of Typed | |
| Languages)" and he offers to implementation strategies which he calls | |
| the Initial and Final encodings. | |
| Initial Encodings represent the terms of your Object Language as | |
| values in an ADT in the Metalanguage and interpreters as functions | |
| that recursively traverse and deconstruct the ADT through pattern | |
| matching. | |
| Final Encoding represents the Object Language by expressions in the | |
| Metalanguage. "An object term is hence represented not by its abstract | |
| syntax but by its denotation in a semantic domain. Abstracting over the | |
| domain gives us a family of interpretations." | |
| So now that i've got that out of the way. I'm going to work through | |
| examples of both encodings for a really simple untyped arithmetic | |
| language. This example is taken directly from Oleg's paper. | |
| -} | |
| --- A Unityped Initial Encoding of an object language: | |
| data Exp = Lit Int | |
| | Neg Exp | |
| | Add Exp Exp | |
| -- An expresion of the object language: | |
| ti1 :: Exp | |
| ti1 = Add (Lit 8) (Neg (Add (Lit 1) (Lit 2))) | |
| -- Interpreter One: An Evaluator | |
| eval' :: Exp -> Int | |
| eval' (Lit n) = n | |
| eval' (Neg e) = - eval' e | |
| eval' (Add e1 e2) = eval' e1 + eval' e2 | |
| -- eval ti1 == 5 | |
| view' :: Exp -> String | |
| view' (Lit n) = show n | |
| view' (Neg e) = "(- " ++ view' e ++ ")" | |
| view' (Add e1 e2) = "(" ++ view' e1 ++ " + " ++ view' e2 ++ ")" | |
| -- view ti1 = (8 + (- (1 + 2))) | |
| {- | |
| An intuition towards Final encoding. | |
| Alternatively, we can embed our langugage in haskell by representing | |
| our terms by their untagged values or by a haskell expression that | |
| computes that value. | |
| We use a type alias called Repr to indicate that this haskell value | |
| is actually a value in our embedded language | |
| -} | |
| type Repr = Int | |
| lit' :: Int -> Repr | |
| lit' n = n | |
| neg' :: Repr -> Repr | |
| neg' e = -e | |
| add' :: Repr -> Repr -> Repr | |
| add' e1 e2 = e1 + e2 | |
| -- An expresion of the object language: | |
| tf1' :: Repr | |
| tf1' = add' (lit' 8) (neg' (add' (lit' 1) (lit' 2))) | |
| {- | |
| Note That this is an ordinary haskell expression which evaluates to 5 tf1 | |
| So this isn't the actual Final encoding, however this will give a bit | |
| of an intuition about it. Also the comparison between this encoding | |
| and Initial encoding outlines the problem at the heart of the question posed in the introduction: | |
| The Expression Problem: | |
| How can we write a program that is extensible over both the grammar | |
| (the datatype cases) and its interpretations (functions over the ADTl) | |
| without needing to rewrite code (or recompile it)? | |
| -} | |
| {- | |
| So lets look at the differences: | |
| Initial is not extensible over the grammer | |
| Sorta Final is not extensible over the interpretations. | |
| Given that I called it 'Sorta Final', the Final Encoding is probably a | |
| variation on it.. | |
| The problem with Sorta Final is that its interpreter is hard coded | |
| into its grammar. We want something with the benefits of sorta final | |
| (an extensible and compositional grammar) but which is polymorphic | |
| over its interpretation. | |
| Typeclasses to the rescue.. | |
| -} | |
| class ExpSYM repr where | |
| lit :: Int -> repr | |
| neg :: repr -> repr | |
| add :: repr -> repr -> repr | |
| instance ExpSYM Int where | |
| lit :: Int -> Int | |
| lit n = n | |
| neg :: Int -> Int | |
| neg e = -e | |
| add :: Int -> Int -> Int | |
| add e1 e2 = e1 + e2 | |
| instance ExpSYM String where | |
| lit :: Int -> String | |
| lit n = show n | |
| neg :: String -> String | |
| neg e = "(-" ++ e ++ ")" | |
| add :: String -> String -> String | |
| add e1 e2 = "(" ++ e1 ++ " + " ++ e2 ++ ")" | |
| tf1 :: ExpSYM repr => repr | |
| tf1 = add (lit 8) (neg (add (lit 1) (lit 2))) | |
| eval :: Int -> Int | |
| eval = id | |
| view :: String -> String | |
| view = id | |
| {- | |
| These interpreter functions seem strange. How can `id` be an interpetation? | |
| Well the actual interpreter is encoded into the typeclass | |
| instance. `eval` and `view` are simply asserting the type of `repr`. | |
| Think of them as interpretation selectors. | |
| But what about extensibility over the grammer? | |
| -} | |
| class ExpSYM repr => MulSYM repr where | |
| mul :: repr -> repr -> repr | |
| instance MulSYM Int where | |
| mul :: Int -> Int -> Int | |
| mul e1 e2 = e1 * e2 | |
| instance MulSYM String where | |
| mul :: String -> String -> String | |
| mul e1 e2 = "(" ++ e1 ++ " * " ++ e2 ++ ")" | |
| tf2 :: MulSYM repr => repr | |
| tf2 = mul (lit 7) tf1 | |
| {- | |
| So now we have recovered extensibility over the grammar. | |
| But there are still some open questions: | |
| 1. Can we recover Pattern Matching? | |
| 2. Can we calculate equality? | |
| 3. Can we apply this to a typed Object Language? | |
| The answer to all 3 is yes, but I'm just going to get into pattern matching today. | |
| -} | |
| {- | |
| Lets talk about Compositionality. The Principle of Compositionality states that: | |
| "The meaning of a complex expression is determined by it's structure | |
| and the meanings of its constituents." | |
| This means that in order to undersand the meaning, the value, of some expression: | |
| eval (Add e1 e2) = eval e1 + eval e2 | |
| We need only know the meaning of the components `e1` and `e2` which we | |
| determine independently. We don't need to know anything about what is | |
| nested inside `e1` or `e2`. Additionally, the meanings of `e1` and | |
| `e2` are determined in isolation from one another, that is to say | |
| irrespective of their context. | |
| Both the Intial and Final interpreters are compositional. | |
| Oleg states: | |
| "The compositional interpretation of a term is epitomized in a `fold`" | |
| In the Initial encoding, we write an explicit fold function. | |
| In the Final encoding we also use a fold, however the fold is "wired in" | |
| into the the metalanguage expressions that denote our Object Language. | |
| Theres no recusion in our final interpreter, it is rather, an (f-)algebra | |
| from the metalanguage to its semantic domain (the object language). | |
| So we know that the evaluation of a fold is context free and that the act | |
| of pattern matching is inherently a contextual transformation. | |
| Because the fold is wired into the interpreter in the Final Encoding, we | |
| have no way to make contextual transformations. All we can do is fold the | |
| expression and arrive at an interpretation (semantic domain). | |
| But Oleg says we have a means of pattern matching, an interhently | |
| contexual transformation. He achieves this by showing that | |
| "contextual" transformations are "context-free" when interpreting | |
| expressions as functions parameterized by a context. | |
| So thats a mouthful. He gives two specific examples of how this can be | |
| done ad-hoc and then shows an isomorphism between Initial and Final | |
| encoding which allows you to generally apply pattern matching on Final | |
| Encoded terms. | |
| "with `Denotational Semantics` you are describing the meanings of pure, | |
| composable expressions in mathematically pure forms such that the | |
| _meanings_ themselves can be composed together to create meanings for | |
| composite expressions?" | |
| -} | |
| {- | |
| Pushing Down Negation. | |
| The first contextual transformation is factoring out negations: | |
| Our sample expression in the object language: | |
| "(8 + (- (1 + 2)))" | |
| Factoring out the negation would yield: | |
| "(8 + ((-1) + (-2)))" | |
| This is a contextual transformation because when we encounter a | |
| negation we must know the value in its subexpression in order to | |
| push the negation. | |
| In order to do this we want to transform our grammar from: | |
| e := int | neg e | add e e | |
| To the more restricted form: | |
| e := factor | add e e | |
| factor := int | neg int | |
| In this second form `neg` is not recursive and only allows an integer | |
| to be negated once. | |
| -} | |
| -- Initial Encoding | |
| pushNeg :: Exp -> Exp | |
| pushNeg e@(Lit _) = e | |
| pushNeg e@(Neg (Lit _)) = e | |
| pushNeg (Neg (Neg e)) = pushNeg e | |
| pushNeg (Neg (Add e1 e2)) = Add (pushNeg (Neg e1)) (pushNeg (Neg e2)) | |
| pushNeg (Add e1 e2) = Add (pushNeg e1) (pushNeg e2) | |
| {- | |
| The nested Pattern Matching is the giveaway that this transformation | |
| must be contextual. | |
| We cannot pattern match per se in the Final Encoding, but we can turn | |
| this contextual transformation into a context free one by treating | |
| expressions as functions parameterized by a context. | |
| -} | |
| data Ctx = Pos' | Neg' | |
| instance ExpSYM repr => ExpSYM (Ctx -> repr) where | |
| lit :: Int -> (Ctx -> repr) | |
| lit n Pos' = lit n -- lit n = \ctx -> if ctx == Pos' then 'lit n else neg (lit n) | |
| lit n Neg' = neg (lit n) | |
| neg :: (Ctx -> repr) -> (Ctx -> repr) | |
| neg e Pos' = e Neg' | |
| neg e Neg' = e Pos' | |
| add :: (Ctx -> repr) -> (Ctx -> repr) -> (Ctx -> repr) | |
| add e1 e2 ctx = add (e1 ctx) (e2 ctx) | |
| {- | |
| In the final encoding, pushNeg is now Structurually Inductive, that | |
| is to say it is clearly a fold with a termination. | |
| -} | |
| pushNegFinal :: (Ctx -> repr) -> repr | |
| pushNegFinal e = e Pos' | |
| -- This contextual transformation is easily extensible as well: | |
| instance MulSYM repr => MulSYM (Ctx -> repr) where | |
| mul e1 e2 Pos' = mul (e1 Pos') (e2 Pos') | |
| mul e1 e2 Neg' = mul (e1 Pos') (e2 Neg') | |
| {- | |
| A more general solution: | |
| We can achieve a more general pattern matching solution by implementing | |
| an isomorphism between Initial and Final Encodings: | |
| -} | |
| instance ExpSYM Exp where | |
| lit = Lit | |
| neg = Neg | |
| add = Add | |
| initialize :: Exp -> Exp | |
| initialize = id | |
| finalize :: ExpSYM repr => Exp -> repr | |
| finalize (Lit n) = lit n | |
| finalize (Neg e) = neg (finalize e) | |
| finalize (Add e1 e2) = add (finalize e1) (finalize e2) | |
| -- Then we compose the iso with our initial encoded interpreter: | |
| pushNeg' = initialize . pushNeg . finalize | |
| -- This allows us to do any sort of contextual transformation on a | |
| -- finally encoded term so long as we are willing to lose | |
| -- extensibility in the processes | 
  
    Sign up for free
    to join this conversation on GitHub.
    Already have an account?
    Sign in to comment