Unpacking strict fields is one of the most powerful techniques you can use to optimise data structures.
When a constructor field is marked strict and its type has a single constructor, you can ask GHC to unpack the contents of the field directly into its parent. For example, given this:
data T = T {-# UNPACK #-} !(Int,Float)
GHC will represent the type T like this:
data T = T Int Float -- no tuple
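The change is purely one of representation: source code still builds and matches on the tuple. The following is a minimal sketch of that, where mkT and secondOf are hypothetical helper names introduced only for illustration. It also shows that the strictness annotation forces only the tuple constructor, so the components themselves remain lazy.

```haskell
-- The tuple field is strict and unpacked, as in the declaration above.
data T = T {-# UNPACK #-} !(Int, Float)

-- Hypothetical helper: construction still takes an ordinary tuple.
mkT :: Int -> Float -> T
mkT i x = T (i, x)

-- Hypothetical helper: pattern matching still sees a tuple,
-- whatever layout GHC chooses at runtime.
secondOf :: T -> Float
secondOf (T (_, x)) = x

main :: IO ()
main = do
  print (secondOf (mkT 1 2.5))
  -- The bang forces only the tuple constructor; its components stay lazy,
  -- so an undefined first component is harmless as long as it is never used.
  print (secondOf (T (undefined, 2.5)))
```

Both lines print 2.5, with or without optimisation.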
Multi-level unpacking:
data T = T {-# UNPACK #-} !S
data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int
will store two unboxed Int#s directly in the T constructor. The unpacker can see through newtypes, too.
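As a sketch of both points, the example below extends T with a second field whose type is a newtype over Int. The Metres newtype and the total function are assumptions made for illustration and are not part of the original example.

```haskell
-- Hypothetical newtype wrapper, used only for illustration.
newtype Metres = Metres Int

-- Two strict, unpacked Int fields.
data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int

-- S is unpacked into T, and the unpacker looks through the Metres
-- newtype to reach the underlying Int.
data T = T {-# UNPACK #-} !S {-# UNPACK #-} !Metres

-- Pattern matching is written exactly as it would be without the pragmas.
total :: T -> Int
total (T (S a b) (Metres c)) = a + b + c

main :: IO ()
main = print (total (T (S 1 2) (Metres 3)))
```

Running this prints 6. When compiled with -O, the T constructor would be expected to hold three unboxed integers and no pointers.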
This is commonly used to put unboxed Ints directly in a constructor:
data T = T {-# UNPACK #-} !Int
will be represented as
data T = T Int#
where Int# is the unboxed integer type. You don't have to mention Int# in your program: just putting the {-# UNPACK #-} directive on the field is enough to tell GHC that this is the representation you want, and GHC will eliminate the boxing.
Note that {-# UNPACK #-} isn't the default, because it isn't always a good idea.
If there is a pattern match on a constructor with an unpacked field, and the value of that field is passed to a non-strict function, GHC has to re-box the value before passing it on. If this re-boxing is common, then unpacking can be slower than not unpacking. The effect can be more acute if the type being unpacked has a lot of components (e.g. a 17-tuple).
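The following sketch illustrates the situation described above. All names here (P, lazyConsumer, reboxes, noRebox) are hypothetical, and the NOINLINE pragma is used only to stand in for a function whose body GHC cannot see into.

```haskell
data P = P {-# UNPACK #-} !Int

-- Hypothetical non-strict consumer: it stores its argument without
-- evaluating it, so it expects an ordinary boxed Int. NOINLINE keeps
-- GHC from inlining the body at the call site.
lazyConsumer :: Int -> [Int]
lazyConsumer n = [n, n]
{-# NOINLINE lazyConsumer #-}

-- The field lives unboxed inside P, so handing it to lazyConsumer
-- means a fresh box has to be allocated first.
reboxes :: P -> [Int]
reboxes (P n) = lazyConsumer n

-- Here the field is consumed by strict arithmetic, so with -O GHC can
-- typically work on the unboxed value directly.
noRebox :: P -> Int
noRebox (P n) = n + 1

main :: IO ()
main = do
  print (reboxes (P 3))
  print (noRebox (P 3))
```

The program prints [3,3] and then 4. The results are the same either way; the difference lies only in how much allocation happens in reboxes.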
Unpacking constructor fields should only be used in conjunction with -O (i.e. -O1), in order to expose unfoldings to the compiler so the reboxing can be removed as often as possible. For example:
data T = T {-# UNPACK #-} !Float
           {-# UNPACK #-} !Float
f :: T -> Float
f (T f1 f2) = f1 + f2
The compiler will avoid reboxing f1 and f2 by inlining + on floats, but only when -O is on.
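To try this out, the example can be wrapped in a complete program, as in the sketch below. The file name Unpack.hs and the main function are assumptions added for illustration. One way to check the result is to compile with -O and inspect the generated Core using GHC's -ddump-simpl flag, where the addition should appear as a primitive operation on unboxed floats (such as plusFloat#) rather than as a call on boxed values.

```haskell
-- Build with optimisation, for example:  ghc -O Unpack.hs
-- (Unpack.hs is a hypothetical file name.)
data T = T {-# UNPACK #-} !Float
           {-# UNPACK #-} !Float

-- Both fields are stored unboxed; with -O the addition is inlined
-- so neither field needs to be re-boxed.
f :: T -> Float
f (T f1 f2) = f1 + f2

main :: IO ()
main = print (f (T 1.5 2.5))
```

Running the program prints 4.0.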