Overview of Vectorization support in GHC

With the completion of GSoC 2018, this project has taken an initial step toward introducing basic vectorization functions to the Glasgow Haskell Compiler. At the end of the project, my branch of GHC (https://github.com/Abhiroop/ghc-1/tree/wip/simd-ncg-support) supports the following.

Constructors and deconstructors:

broadcastFloatX4# :: Float# -> FloatX4#

broadcastDoubleX2# :: Double# -> DoubleX2#

packFloatX4# :: (# Float#, Float#, Float#, Float# #) -> FloatX4#

packDoubleX2# :: (# Double#, Double# #) -> DoubleX2#

unpackFloatX4# :: FloatX4# -> (# Float#, Float#, Float#, Float# #)

unpackDoubleX2# :: DoubleX2# -> (# Double#, Double# #)

It also supports arithmetic operations such as plus, minus, times, quot, rem, and negate for the vector types FloatX4# and DoubleX2#. These operations are the atomic units of vectorization; more complex algorithms can be expressed by composing them.
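As a rough illustration of how these primitives compose, the sketch below packs four floats into a FloatX4#, adds a broadcast constant to every lane, and unpacks the result. The helper name addLanes is hypothetical and used only for illustration. The primop plusFloatX4# is taken from GHC.Exts and is assumed to be the "plus" operation referred to above.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts

-- Hypothetical helper: add the scalar k to each of four lanes.
-- pack builds the vector, broadcast replicates k into all four lanes,
-- plusFloatX4# adds lane-wise, and unpack returns the four results.
addLanes :: (Float, Float, Float, Float) -> Float -> [Float]
addLanes (F# a, F# b, F# c, F# d) (F# k) =
  case unpackFloatX4# (plusFloatX4# (packFloatX4# (# a, b, c, d #))
                                    (broadcastFloatX4# k)) of
    (# x, y, z, w #) -> [F# x, F# y, F# z, F# w]

main :: IO ()
main = print (addLanes (1, 2, 3, 4) 10)
```

Building this requires a compiler whose backend can emit vector instructions: the branch above with the appropriate flags, or an upstream GHC with the LLVM backend (-fllvm), which was the only upstream route to these primops at the time of this project. With such a compiler, the program should print [11.0,12.0,13.0,14.0].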

The first two evaluation phases were spent designing and implementing AVX and SSE support for the above. These functions emit x86 vector instructions, depending on the flags provided.

The third evaluation phase involved redesigning part of that code to support -O2 optimizations. Initially, compiling with -O2 resulted in segmentation faults and other errors; these have been fixed.

Data types such as Int8, Int16, Int32, and Int64 are all wrappers over the machine-dependent Int# unlifted type. To provide deterministic and uniform SIMD support for integers, a major part of the third evaluation phase consisted of adding support for the Int8#, Word8#, Int16#, Word16#, Int32#, Word32#, Int64#, and Word64# types and their respective primops, such as plus, minus, times, quot, rem, and negate.

The last few days were spent designing a basic slice of SIMD support for 8-bit integers.

I have also added a very recent, work-in-progress library that abstracts over the unlifted types and provides general polymorphic SIMD functions. The library can be found here: https://github.com/Abhiroop/lift-vector. It contains examples such as polynomial evaluation and dot products using vectors.
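To give a flavour of what a dot product over vectors looks like, here is a minimal sketch written directly against the unlifted primops. It does not use the lift-vector API; the names dotProduct and sumLanes are hypothetical, and the list-based traversal is chosen for brevity rather than performance.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts

-- Hypothetical helper: horizontal sum of the four lanes of a vector.
sumLanes :: FloatX4# -> Float
sumLanes v = case unpackFloatX4# v of
  (# a, b, c, d #) -> F# (a `plusFloat#` b `plusFloat#` c `plusFloat#` d)

-- Hypothetical dot product: multiply and accumulate four elements per step,
-- then finish any leftover elements (fewer than four) with scalar code.
dotProduct :: [Float] -> [Float] -> Float
dotProduct xs0 ys0 = go xs0 ys0 (broadcastFloatX4# 0.0#)
  where
    go :: [Float] -> [Float] -> FloatX4# -> Float
    go (F# a : F# b : F# c : F# d : xs) (F# e : F# f : F# g : F# h : ys) acc =
      go xs ys (plusFloatX4# acc
                 (timesFloatX4# (packFloatX4# (# a, b, c, d #))
                                (packFloatX4# (# e, f, g, h #))))
    go xs ys acc = sumLanes acc + sum (zipWith (*) xs ys)

main :: IO ()
main = print (dotProduct [1 .. 8] [1 .. 8])
```

The accumulator stays in a vector register across iterations and is reduced to a scalar only once at the end, which is the usual shape of a SIMD reduction. As with any use of these primops, a backend with vector support is assumed (the branch above, or -fllvm on upstream GHC).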

Work Link

GHC uses Phabricator to track code changes and for code review. The links to the Phab diffs are:

The majority of the Float and Double SIMD work is available here: https://phabricator.haskell.org/D4813

Changes to some utility functions inside GHC: https://phabricator.haskell.org/D4922

Int16#, Word16# support: https://phabricator.haskell.org/D5006

Int32#, Word32# support: https://phabricator.haskell.org/D5032

Int64#, Word64# support (not yet pushed to Phab): https://github.com/Abhiroop/ghc-1/commit/8511bdf83496464903c3589a2f3b7d2f6d690ec7

[WIP]

8-bit integer SIMD support: https://github.com/Abhiroop/ghc-1/commit/f5ae40f82a4ed5e07b972ad6b303ce649067fc8a

https://github.com/Abhiroop/ghc-1/commit/87ba0edee90916d4f231242b45d3e442169e25d7

Example project on polynomial evaluation (built using https://github.com/Abhiroop/ghc-1/tree/wip/simd-ncg-support): https://github.com/Abhiroop/polynomial

Building

I have made sure that all of my branches build properly. The main branch is this one: https://github.com/Abhiroop/ghc-1/tree/wip/simd-ncg-support

The build instructions are the same as those for GHC itself (https://ghc.haskell.org/trac/ghc/wiki/Newcomers). Anyone can build my branch and gain access to the existing vectorization functions.

Future Directions

Full support for 8-bit, 16-bit, 32-bit, and 64-bit Int SIMD operations (AVX and SSE)
Introduce YMM register support in GHC, along with operations on those registers
Add support for {-# MicroArchitecture #-} pragma
Add support for shuffle operations
