Skip to content

Instantly share code, notes, and snippets.

@monadplus
Last active January 13, 2020 22:19
Show Gist options
  • Save monadplus/6a3b69c13c5c875c7585e95cff4d4edc to your computer and use it in GitHub Desktop.
Save monadplus/6a3b69c13c5c875c7585e95cff4d4edc to your computer and use it in GitHub Desktop.
GHC: rewrite rules
{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f.g) xs #-}
  • Use the debug flag -ddump-simpl-stats to see what rules fired.
  • Use -ddump-rule-firings to show individual rule firing.
  • Use -ddump-rule-rewrite to show the code before and after rewrite.
$ nix-shell --packages 'haskellPackages.ghcWithHoogle (pkgs: with pkgs; [ criterion deepseq])' haskellPackages.profiteur
$ ghc -O2 -ddump-rule-rewrites --force-recomp Example.hs

Rule fired
    Rule: unpack-list
    Module: (GHC.Base)
    Before: GHC.CString.unpackFoldrCString#
              TyArg [GHC.Types.Char]
              ValArg "mappedIterator"#
              ValArg GHC.Types.: @ GHC.Types.Char
              ValArg GHC.Types.[] @ GHC.Types.Char
    After:  (\ (a_alGL :: GHC.Prim.Addr#) ->
               GHC.CString.unpackCString# a_alGL)
              "mappedIterator"#
    Cont:   Stop[BoringCtxt] GHC.Base.String

Syntax

{-# RULES
      "map/map"    forall f g xs.  map f (map g xs) = map (f.g) xs
      "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys
  #-}
  • Name has no effect, only used for reporting.

  • A rule may optionally have a phase-control number (example below). The [2] means that the rule is active in Pashe 2 and subsequent phases. The inverse notation [~2] is also accepted.

{-# RULES
      "map/map" [2]  forall f g xs. map f (map g xs) = map (f.g) xs
  #-}
  • Each term mentioned in a rule must either be in scope (e.g. map) or bound by the forall (e.g. f, g, xs) called pattern variables.

  • A pattern variable may optionally have a type signature:

{-# RULES
      "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
              foldr k z (build g) = g k z
  #-}
  • If ExplicitForAll is enabled, type/kind variables can also be explicitly bound. For example:
{-# RULES "id" forall a. forall (x :: a). id @a x = x #-}

When a type-level explicit forall is present, each type/kind variable mentioned must now also be either in scope or bound by the forall. In particular, unlike some other places in Haskell, this means free kind variables will not be implicitly bound. For example:

"this_is_bad" forall (c :: k). forall (x :: Proxy c) ...
"this_is_ok"  forall k (c :: k). forall (x :: Proxy c) ...

When bound type/kind variables are needed, both foralls must always be included, though if no pattern variables are needed, the second can be left empty. For example:

{-# RULES "map/id" forall a. forall. map (id @a) = id @[a] #-}
  • The left hand side of a rule must consist of a top-level variable applied to arbitrary expressions. For example, this is not OK:
"wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
"wrong2"   forall f.      f True = True
"wrong3"   forall x.      Just x = Nothing

In "wrong1", the LHS is not an application; in "wrong2", the LHS has a pattern variable in the head. In "wrong3", the LHS consists of a constructor, rather than a variable, applied to an argument.

  • A rule does not need to be in the same module as (any of) the variables it mentions, though of course they need to be in scope.

  • All rules are implicitly exported from the module, and are therefore in force in any module that imports the module that defined the rule, directly or indirectly. (That is, if A imports B, which imports C, then C’s rules are in force when compiling A.) The situation is very similar to that for instance declarations.

  • Inside a RULEforall” is treated as a keyword, regardless of any other flag settings. Furthermore, inside a RULE, the language extension ScopedTypeVariables is automatically enabled; see Lexically scoped type variables.

  • Like other pragmas, RULE pragmas are always checked for scope errors, and are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, and must have the same type. However, rules are only enabled if the -fenable-rewrite-rules flag is on (see Semantics).

Semantics

  • Rules are enabled by (implicitly) -O flag or by -fenable-rewrite-rules flag.

  • Rules are regarded as left-to-right rewrite rules. When GHC finds and expression that is a substitution instance of the LHS of a rule, it replaces the expression by the RHS.

  • GHC doesn't verify LHS == RHS (semantically).

  • GHC doesn't verify termination

  • If more than one rule matches a call, GHC will choose one arbitrarily to apply.

  • GHC keeps trying to apply the rules as it optimises the program. The following matches the "map/map" rule:

let s = map f
    t = map g
in
s (t xs)

How rules interact with INLINE/NOINLINE

Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected results

Example:

f x = x
g y = f y
h z = g True

{-# RULES "f" f True = False #-}

If f is inlined, then g y = y. If g is inlined into h, the RULE has no chance to fire.

How rules interact with CONLIKE pragmas

GHC is very cautious about duplicating work. For example, consider

f k z xs = let xs = build g
           in ...(foldr k z xs)...sum xs...
{-# RULES "foldr/build" forall k z g. foldr k z (build g) = g k z #-}

Since xs is used twice, GHC does not fire the foldr/build rule.

How rules interact with class methods

Giving a RULE for a class method is a bad idea:

class C a where
  op :: a -> a -> a

instance C Bool where
  op x y = ...rhs for op at Bool...

{-# RULES "f" op True y = False #-}

In this example, op is not an ordinary top-level function; it is a class method. GHC rapidly rewrites any occurrences of op-used-at-type-Bool to a specialised function, say opBool, where

opBool :: Bool -> Bool -> Bool
opBool x y = ..rhs for op at Bool...

So the RULE never has a chance to fire, for just the same reasons as in How rules interact with INLINE/NOINLINE pragmas.

The solution is to define the instance-specific function yourself, with a pragma to prevent it being inlined too early, and give a RULE for it:

instance C Bool where
  op = opBool

opBool :: Bool -> Bool -> Bool
{-# NOINLINE [1] opBool #-}
opBool x y = ..rhs for op at Bool...

{-# RULES "f" opBool True y = False #-}

List fusion (deforestation)

The RULES mechanism is used to implement fusion (deforestation) of common list functions. If a “good consumer” consumes an intermediate list constructed by a “good producer”, the intermediate list should be eliminated entirely.

Good producers:

  • List comprehensions
  • Enumerations of Int, Integer and Char (e.g. ['a'..'z']).
  • Explicit lists (e.g. [True, False])
  • The cons constructor (e.g 3:4:[])
  • ++
  • map
  • take, filter
  • iterate, repeat
  • zip, zipWith

Good consumers:

  • List comprehensions
  • array (on its second argument)
  • ++ (on its first argument)
  • foldr
  • map
  • take, filter
  • concat
  • unzip, unzip2, unzip3, unzip4
  • zip, zipWith (but on one argument only; if both are good producers, zip will fuse with one but not the other)
  • partition
  • head
  • and, or, any, all
  • sequence_
  • msum

So, for example, the following should generate no intermediate lists:

array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]

Specialisation

Rewrite rules can be used to get the same effect as a feature present in earlier versions of GHC. For example, suppose that:

genericLookup :: Ord a => Table a b   -> a   -> b
intLookup     ::          Table Int b -> Int -> b
{-# RULES "genericLookup/Int" genericLookup = intLookup #-}

This slightly odd-looking rule instructs GHC to replace genericLookup by intLookup whenever the types match.

Data.List.Stream

data Stream a = forall s. Unlifted s =>
                          Stream !(s -> Step a s)  -- a stepper function
                                 !s                -- an initial state
data Step a s = Yield a !s
              | Skip    !s
              | Done

-- | Construct an abstract stream from a list.
stream :: [a] -> Stream a
stream xs0 = Stream next (L xs0)
  where
    {-# INLINE next #-}
    next (L [])     = Done
    next (L (x:xs)) = Yield x (L xs)
{-# INLINE [0] stream #-}

-- | Flatten a stream back into a list.
unstream :: Stream a -> [a]
unstream (Stream next s0) = unfold_unstream s0
  where
    unfold_unstream !s = case next s of
      Done       -> []
      Skip    s' -> expose s' $     unfold_unstream s'
      Yield x s' -> expose s' $ x : unfold_unstream s'
{-# INLINE [0] unstream #-}

--
-- /The/ stream fusion rule
--

{-# RULES
"STREAM stream/unstream fusion" forall s.
    stream (unstream s) = s
  #-}

append :: Stream a -> Stream a -> Stream a
append (Stream next0 s01) (Stream next1 s02) = Stream next (Left s01)
  where
    {-# INLINE next #-}
    next (Left s1)  = case next0 s1 of
                          Done        -> Skip    (Right s02)
                          Skip s1'    -> Skip    (Left s1')
                          Yield x s1' -> Yield x (Left s1')
    next (Right s2) = case next1 s2 of
                          Done        -> Done
                          Skip s2'    -> Skip    (Right s2')
                          Yield x s2' -> Yield x (Right s2')

https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-Stream.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment