Category:
Combo: Talk & Workshop
Target:
Intermediate - Expected knowledge of basic Scala or Haskell syntax
(examples will be contrasted in both).
Language:
Haskell & Scala
Author:
Mark Hibberd
NICTA
@markhibberd
Title:
The Art of Incremental Stream Processing
Talk Abstract:
Purely functional, elegant, correct, incremental and composable
stream processing that is CPU and memory efficient. This is our
(worthy) goal, but where do we start?
This problem space is being extensively explored across a variety
of languages and libraries, each with subtly different trade-offs
and not-so-subtly different APIs and terminology. However, these
libraries share common goals, and most share common ancestry in
Oleg Kiselyov's original Iteratee work or its Free Monad-based
derivatives.
This talk aims to build up an intuition for stream processing in
general by first establishing the core concepts and language of
stream processing, and then grounding those by carefully examining
the trade-offs and internals of several productionised
implementations. Of particular interest are the pipes and conduit
libraries from the Haskell community, and scalaz-stream from the
Scala community.
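As a rough illustration of the kind of vocabulary involved, a
stream processor can be pictured as a value that awaits input,
yields output, or stops. The sketch below is an assumption made
for illustration only: the names Process, Await, Yield, Done, run,
mapping, taking and the >-> operator defined here are hypothetical
and are not the types or API of pipes, conduit or scalaz-stream.

```haskell
-- A hypothetical, minimal stream processor (illustration only).
data Process i o
  = Await (Maybe i -> Process i o) -- ask for input; Nothing means end of input
  | Yield o (Process i o)          -- emit one output, then continue
  | Done                           -- stop; no further input is consumed

infixl 7 >->

-- Compose two processors. The downstream side drives: upstream is
-- only stepped when downstream is awaiting a value.
(>->) :: Process a b -> Process b c -> Process a c
_         >-> Done      = Done
p         >-> Yield c q = Yield c (p >-> q)
Done      >-> Await k   = flush (k Nothing)
Yield b p >-> Await k   = p >-> k (Just b)
Await f   >-> q         = Await (\ma -> f ma >-> q)

-- After upstream has finished, keep downstream's remaining outputs
-- and treat any further request for input as termination.
flush :: Process b c -> Process a c
flush (Yield c q) = Yield c (flush q)
flush _           = Done

-- Feed a (possibly infinite) list through a processor, lazily.
run :: Process i o -> [i] -> [o]
run Done        _        = []
run (Yield o p) xs       = o : run p xs
run (Await k)   (x : xs) = run (k (Just x)) xs
run (Await k)   []       = drain (k Nothing)
  where
    drain (Yield o p) = o : drain p
    drain _           = []

-- Apply a function to every element.
mapping :: (a -> b) -> Process a b
mapping f = Await step
  where
    step Nothing  = Done
    step (Just a) = Yield (f a) (mapping f)

-- Pass through at most n elements, then stop.
taking :: Int -> Process a a
taking n
  | n <= 0    = Done
  | otherwise = Await step
  where
    step Nothing  = Done
    step (Just a) = Yield a (taking (n - 1))
```

Loaded into GHCi, run (mapping (* 2) >-> taking 3) [1 ..] evaluates
to [2,4,6] even though the input is infinite, because taking stops
the whole pipeline once it reaches Done.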
Workshop Abstract:
Building on the concepts presented in "The Art of Incremental
Stream Processing", this workshop will step through a series of
challenges to build our own set of "pipe-able" unix-like tools
as stream processors.
These tools, including cat, head, tail, grep, nl, cksum, tee and
xargs, provide an excellent proving ground for stream processing
libraries, and will reinforce understanding of the underlying
concepts.
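To give a feel for the shape of the challenges, a few of these
tools can be sketched as plain functions over lazy lists of lines.
This is only a baseline under stated assumptions, not the skeleton
code from the repository: the names Tool, headN, |> and runTool
are hypothetical, grep here matches a fixed substring rather than
a regular expression, and nl omits the padding and blank-line
handling of the real tool.

```haskell
import Data.List (isInfixOf)

-- A hypothetical "tool": a transformation of a lazy list of lines.
type Tool = [String] -> [String]

-- cat: pass every line through unchanged.
cat :: Tool
cat = id

-- head -n: keep only the first n lines.
headN :: Int -> Tool
headN = take

-- grep (fixed-substring match only): keep lines containing the needle.
grep :: String -> Tool
grep needle = filter (needle `isInfixOf`)

-- nl (simplified): prefix each line with its 1-based number and a tab.
nl :: Tool
nl = zipWith (\n l -> show n ++ "\t" ++ l) [(1 :: Int) ..]

infixl 1 |>

-- Left-to-right composition, read like a unix pipe: a |> b.
(|>) :: Tool -> Tool -> Tool
f |> g = g . f

-- Lift a line-based tool to whole-text input and output.
runTool :: Tool -> String -> String
runTool t = unlines . t . lines
```

Used with interact, for example
main = interact (runTool (grep "a" |> nl |> headN 2)), this behaves
roughly like grep -F a | nl | head -n 2. Lazy lists give early
termination for free here, but offer no control over effects or
resources, which is the gap the libraries above aim to fill.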
Attendees will be provided with a template containing skeleton
code with a choice of one of four starting points for the
challenges:
1. pipes
2. conduit
3. scalaz-stream
4. free-form, just follow the specification with whatever
technology you want
Along with the skeleton, there will be a series of test cases for
each tool that can be used to validate the code, specifically
including edge cases for lazy evaluation & early termination.
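The kind of edge case meant here can be sketched with two checks
against a head-like tool. The checks below are illustrative
assumptions, not the test cases shipped in the repository, and
headLines is a hypothetical stand-in for the tool under test.

```haskell
-- Hypothetical stand-in for the tool under test: first n lines.
headLines :: Int -> [String] -> [String]
headLines = take

-- Early termination: the input is infinite (like `yes | head -n 3`),
-- so this only finishes if headLines stops pulling after 3 lines.
terminatesOnInfiniteInput :: Bool
terminatesOnInfiniteInput =
  headLines 3 (repeat "y") == ["y", "y", "y"]

-- Lazy evaluation: the input blows up if anything past the second
-- line is inspected, so this only passes if headLines never
-- reads beyond the lines it returns.
doesNotOverRead :: Bool
doesNotOverRead =
  headLines 2 ("a" : "b" : error "read past line 2") == ["a", "b"]
```

An implementation that first collects all of its input, for
instance by computing its length, would hang on the first check
and raise the error on the second.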
To get things started, you will need access to a laptop,
appropriate development tools for your chosen approach, and a clone
of the base repository from:
https://github.com/markhibberd/ylj14-the-art-of-incremental-streaming
Language / library specific instructions are available in the
repository.
Resources:
[1] Repository for workshop code:
- https://github.com/markhibberd/ylj14-the-art-of-incremental-streaming
[2] Oleg Kiselyov's collection of Iteratee related resources:
- http://okmij.org/ftp/Streams.html
[3] The "pipes" library in Haskell - emphasis on principled
abstractions and composition:
- http://beta.hackage.haskell.org/package/pipes
[4] The "conduit" library in Haskell - emphasis on speed
and resource management:
- http://beta.hackage.haskell.org/package/conduit
[5] The "scalaz-stream" library in Scala - emphasis on
pure (non-monadic) processors and clean API, based
on machines / Free Monad approaches:
- https://github.com/scalaz/scalaz-stream
[6] The "machines" library in Haskell - minimal Free Monad style
implementation in Haskell:
- http://beta.hackage.haskell.org/package/machines
[7] The "machines" library in Scala - Free Monad style
implementation in Scala:
- https://github.com/runarorama/scala-machines/
[8] The "iteratee" library in Haskell - Oleg Kiselyov's direct
implementation:
- http://beta.hackage.haskell.org/package/iteratee
[9] The "enumerator" library in Haskell - First attempt at
improvements and better library support based on
Oleg Kiselyov's original work:
- http://beta.hackage.haskell.org/package/enumerator