{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f.g) xs #-}
- Use the debug flag
-ddump-simpl-stats
to see what rules fired. - Use
-ddump-rule-firings
to show individual rule firing. - Use
-ddump-rule-rewrite
to show the code before and after rewrite.
$ nix-shell --packages 'haskellPackages.ghcWithHoogle (pkgs: with pkgs; [ criterion deepseq])' haskellPackages.profiteur
$ ghc -O2 -ddump-rule-rewrites --force-recomp Example.hs
Rule fired
Rule: unpack-list
Module: (GHC.Base)
Before: GHC.CString.unpackFoldrCString#
TyArg [GHC.Types.Char]
ValArg "mappedIterator"#
ValArg GHC.Types.: @ GHC.Types.Char
ValArg GHC.Types.[] @ GHC.Types.Char
After: (\ (a_alGL :: GHC.Prim.Addr#) ->
GHC.CString.unpackCString# a_alGL)
"mappedIterator"#
Cont: Stop[BoringCtxt] GHC.Base.String
{-# RULES
"map/map" forall f g xs. map f (map g xs) = map (f.g) xs
"map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys
#-}
-
Name has no effect, only used for reporting.
-
A rule may optionally have a phase-control number (example below). The [2] means that the rule is active in Pashe 2 and subsequent phases. The inverse notation [~2] is also accepted.
{-# RULES
"map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs
#-}
-
Each term mentioned in a rule must either be in scope (e.g. map) or bound by the forall (e.g. f, g, xs) called pattern variables.
-
A pattern variable may optionally have a type signature:
{-# RULES
"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
foldr k z (build g) = g k z
#-}
- If
ExplicitForAll
is enabled, type/kind variables can also be explicitly bound. For example:
{-# RULES "id" forall a. forall (x :: a). id @a x = x #-}
When a type-level explicit forall
is present, each type/kind variable mentioned must now also be either in scope or bound by the forall. In particular, unlike some other places in Haskell, this means free kind variables will not be implicitly bound. For example:
"this_is_bad" forall (c :: k). forall (x :: Proxy c) ...
"this_is_ok" forall k (c :: k). forall (x :: Proxy c) ...
When bound type/kind variables are needed, both foralls must always be included, though if no pattern variables are needed, the second can be left empty. For example:
{-# RULES "map/id" forall a. forall. map (id @a) = id @[a] #-}
- The left hand side of a rule must consist of a top-level variable applied to arbitrary expressions. For example, this is not OK:
"wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
"wrong2" forall f. f True = True
"wrong3" forall x. Just x = Nothing
In "wrong1", the LHS is not an application; in "wrong2", the LHS has a pattern variable in the head. In "wrong3", the LHS consists of a constructor, rather than a variable, applied to an argument.
-
A rule does not need to be in the same module as (any of) the variables it mentions, though of course they need to be in scope.
-
All rules are implicitly exported from the module, and are therefore in force in any module that imports the module that defined the rule, directly or indirectly. (That is, if A imports B, which imports C, then C’s rules are in force when compiling A.) The situation is very similar to that for instance declarations.
-
Inside a RULE “
forall
” is treated as a keyword, regardless of any other flag settings. Furthermore, inside a RULE, the language extension ScopedTypeVariables is automatically enabled; see Lexically scoped type variables. -
Like other pragmas, RULE pragmas are always checked for scope errors, and are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, and must have the same type. However, rules are only enabled if the -fenable-rewrite-rules flag is on (see Semantics).
-
Rules are enabled by (implicitly)
-O
flag or by-fenable-rewrite-rules
flag. -
Rules are regarded as left-to-right rewrite rules. When GHC finds and expression that is a substitution instance of the LHS of a rule, it replaces the expression by the RHS.
-
GHC doesn't verify LHS == RHS (semantically).
-
GHC doesn't verify termination
-
If more than one rule matches a call, GHC will choose one arbitrarily to apply.
-
GHC keeps trying to apply the rules as it optimises the program. The following matches the
"map/map"
rule:
let s = map f
t = map g
in
s (t xs)
Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected results
Example:
f x = x
g y = f y
h z = g True
{-# RULES "f" f True = False #-}
If f
is inlined, then g y = y
. If g
is inlined into h
, the RULE has no chance to fire.
GHC is very cautious about duplicating work. For example, consider
f k z xs = let xs = build g
in ...(foldr k z xs)...sum xs...
{-# RULES "foldr/build" forall k z g. foldr k z (build g) = g k z #-}
Since xs is used twice, GHC does not fire the foldr/build rule.
Giving a RULE for a class method is a bad idea:
class C a where
op :: a -> a -> a
instance C Bool where
op x y = ...rhs for op at Bool...
{-# RULES "f" op True y = False #-}
In this example, op is not an ordinary top-level function; it is a class method. GHC rapidly rewrites any occurrences of op-used-at-type-Bool to a specialised function, say opBool, where
opBool :: Bool -> Bool -> Bool
opBool x y = ..rhs for op at Bool...
So the RULE never has a chance to fire, for just the same reasons as in How rules interact with INLINE/NOINLINE pragmas.
The solution is to define the instance-specific function yourself, with a pragma to prevent it being inlined too early, and give a RULE for it:
instance C Bool where
op = opBool
opBool :: Bool -> Bool -> Bool
{-# NOINLINE [1] opBool #-}
opBool x y = ..rhs for op at Bool...
{-# RULES "f" opBool True y = False #-}
The RULES mechanism is used to implement fusion (deforestation) of common list functions. If a “good consumer” consumes an intermediate list constructed by a “good producer”, the intermediate list should be eliminated entirely.
Good producers:
- List comprehensions
- Enumerations of Int, Integer and Char (e.g. ['a'..'z']).
- Explicit lists (e.g. [True, False])
- The cons constructor (e.g 3:4:[])
++
map
take, filter
iterate, repeat
zip, zipWith
Good consumers:
- List comprehensions
- array (on its second argument)
++
(on its first argument)foldr
map
take, filter
concat
unzip, unzip2, unzip3, unzip4
zip, zipWith
(but on one argument only; if both are good producers, zip will fuse with one but not the other)partition
head
and, or, any, all
sequence_
msum
So, for example, the following should generate no intermediate lists:
array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
Rewrite rules can be used to get the same effect as a feature present in earlier versions of GHC. For example, suppose that:
genericLookup :: Ord a => Table a b -> a -> b
intLookup :: Table Int b -> Int -> b
{-# RULES "genericLookup/Int" genericLookup = intLookup #-}
This slightly odd-looking rule instructs GHC to replace genericLookup
by intLookup
whenever the types match.