@markhibberd
Created April 8, 2014 03:00

Category:

Combo: Talk & Workshop

Target:

Intermediate - Expected knowledge of basic Scala or Haskell syntax
               (examples will be contrasted in both).

Language:

Haskell & Scala

Author:

Mark Hibberd
NICTA
[email protected]
@markhibberd

Title:

The Art of Incremental Stream Processing

Talk Abstract:

Purely functional, elegant, correct, incremental and composable
stream processing that is CPU and memory efficient. This is our
(worthy) goal, but where do we start?

This problem space is being extensively explored across a variety
of languages and libraries, each with subtly different trade-offs
and not-so-subtly different APIs and terminology. However, these
libraries share common goals, and most share common ancestry from
Oleg Kiselyov's original Iteratee work or its Free Monad based
derivatives.

This talk aims to build up an intuition for stream processing in
general by first building up the core concepts and language of
stream processing, and then grounding those by carefully examining
the trade-offs and internals of several productionised
implementations. Of particular interest are the pipes and conduit
libraries from the Haskell community, and scalaz-stream from the
Scala community.
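
As a rough illustration of the kind of core concept involved, here is
a deliberately simplified iteratee in plain Haskell. It is a sketch
only: the names Input, Iteratee, feed, run and sumN are made up for
this example and are not the API of Oleg Kiselyov's code or of any
library listed under Resources (real implementations also handle
effects, chunking and errors).

```haskell
-- One piece of input, or the end-of-input signal.
data Input a = El a | EOF

-- A consumer that is either finished with a result,
-- or waiting for the next piece of input.
data Iteratee a b
  = Done b
  | Cont (Input a -> Iteratee a b)

-- A list "enumerator": feeds elements one at a time and stops
-- as soon as the iteratee is Done, leaving the rest untouched.
feed :: [a] -> Iteratee a b -> Iteratee a b
feed _      (Done b) = Done b
feed []     it       = it
feed (x:xs) (Cont k) = feed xs (k (El x))

-- Signal end of input and extract the result, if there is one.
run :: Iteratee a b -> Maybe b
run (Done b) = Just b
run (Cont k) = case k EOF of
  Done b -> Just b
  Cont _ -> Nothing

-- Sum at most the first n elements of the input.
sumN :: Int -> Iteratee Int Int
sumN = go 0
  where
    go acc n
      | n <= 0    = Done acc
      | otherwise = Cont step
      where
        step (El x) = go (acc + x) (n - 1)
        step EOF    = Done acc

main :: IO ()
main = print (run (feed [1 ..] (sumN 3)))
```

Because feed stops once the consumer is Done, the example finishes
even though its input list is infinite.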

Workshop Abstract:

Building on the concepts presented in "The Art of Incremental
Stream Processing", this workshop will step through a series of
challenges to build our own set of "pipe-able" unix-like tools
as stream processors.

These tools (cat, head, tail, grep, nl, cksum, tee and xargs)
provide an excellent proving ground for stream processing
libraries, and will reinforce understanding of the underlying
concepts.
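
To give a feel for what "unix-like tools as stream processors" might
look like, here is a minimal sketch in the spirit of the free-form
option. It uses a hand-rolled Pipe type rather than pipes, conduit or
scalaz-stream; the type, the (>->) operator, runList and the tool
names (cat, headP, grep, nl) are all assumptions made for this
example, not the workshop's skeleton code or specification.

```haskell
-- A tiny pull-based stream processor from inputs i to outputs o.
data Pipe i o
  = Await (Maybe i -> Pipe i o) -- ask upstream; Nothing means end of input
  | Yield o (Pipe i o)          -- emit one value downstream
  | Done                        -- finished, needs no more input

infixl 7 >->

-- Composition is driven by the downstream pipe, so upstream only
-- runs when downstream actually awaits a value.
(>->) :: Pipe a b -> Pipe b c -> Pipe a c
_  >-> Done      = Done
up >-> Yield c k = Yield c (up >-> k)
up >-> Await f   = case up of
  Done      -> Done >-> f Nothing
  Yield b k -> k >-> f (Just b)
  Await g   -> Await (\ma -> g ma >-> Await f)

-- Run a pipe over a list, producing its output lazily.
runList :: Pipe i o -> [i] -> [o]
runList Done        _      = []
runList (Yield o k) xs     = o : runList k xs
runList (Await f)   []     = runList (f Nothing) []
runList (Await f)   (x:xs) = runList (f (Just x)) xs

-- cat: pass every input through unchanged.
cat :: Pipe a a
cat = Await (maybe Done (\x -> Yield x cat))

-- head -n: pass through the first n inputs, then stop.
headP :: Int -> Pipe a a
headP n
  | n <= 0    = Done
  | otherwise = Await (maybe Done (\x -> Yield x (headP (n - 1))))

-- grep: keep only inputs that satisfy the predicate.
grep :: (a -> Bool) -> Pipe a a
grep p = Await (maybe Done step)
  where
    step x
      | p x       = Yield x (grep p)
      | otherwise = grep p

-- nl: prefix each line with its line number and a tab.
nl :: Pipe String String
nl = go (1 :: Int)
  where
    go n = Await (maybe Done (\l -> Yield (show n ++ "\t" ++ l) (go (n + 1))))

main :: IO ()
main = mapM_ putStrLn (runList (grep (elem 'a') >-> nl >-> headP 2) input)
  where
    input = cycle ["apple", "kiwi", "banana"]
```

The pipeline in main is roughly `grep a | nl | head -n 2`. It
terminates on an endless input because headP becomes Done after two
lines and composition then stops pulling from upstream.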

Attendees will be provided with a template containing skeleton
code with a choice of one of four starting points for the
challenges:

  1. pipes
  2. conduit
  3. scalaz-stream
  4. free-form, just follow the specification with whatever
     technology you want

Along with the skeleton, there will be a series of test cases for
each tool that can be used to validate the code, specifically
including edge cases for lazy evaluation & early termination.
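
The repository's actual test cases are not reproduced here. As a
hedged sketch of what edge-case checks for lazy evaluation and early
termination could look like, the following uses a list-based stand-in
for head; headTool is a hypothetical name, not part of the workshop
template.

```haskell
-- Hypothetical tool under test: a list-based stand-in for `head -n`.
headTool :: Int -> [String] -> [String]
headTool = take

main :: IO ()
main = do
  -- Early termination: must finish even though the input never ends.
  print (headTool 2 (repeat "x") == ["x", "x"])
  -- Laziness: the third line is never demanded, so it must not be evaluated.
  print (headTool 2 ["a", "b", error "read too far"] == ["a", "b"])
  -- Empty input should produce empty output rather than fail.
  print (headTool 2 [] == [])
```

A tool that reads its whole input before producing output would hang
on the first check and crash on the second.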

To get things started, you will need access to a laptop,
appropriate development tools for your chosen approach, and a clone
of the base repository from:

   https://github.com/markhibberd/ylj14-the-art-of-incremental-streaming

Language / library specific instructions are available in the
repository.

Resources:

[1] Repository for workshop code:
   https://github.com/markhibberd/ylj14-the-art-of-incremental-streaming

[2] Oleg Kiselyov's collection of Iteratee related resources:
     - http://okmij.org/ftp/Streams.html

[3] The "pipes" library in Haskell - emphasis on principled
    abstractions and composition:
     - http://beta.hackage.haskell.org/package/pipes

[4] The "conduit" library in Haskell - emphasis on speed
    and resource management:
     - http://beta.hackage.haskell.org/package/conduit

[5] The "scalaz-stream" library in Scala - emphasis on
    pure (non-monadic) processors and clean API, based
    on machines / Free Monad approaches:
     - https://github.com/scalaz/scalaz-stream

[6] The "machines" library in Haskell - minimal Free Monad style
    implementation in Haskell:
     - http://beta.hackage.haskell.org/package/machines

[7] The "machines" library in Scala - Free Monad style
    implementation in Scala:
     - https://github.com/runarorama/scala-machines/

[8] The "iteratee" library in Haskell - Oleg Kiselyov's direct
    implementation:
     - http://beta.hackage.haskell.org/package/iteratee

[9] The "enumerator" library in Haskell - First attempt at
    improvements and better library support based on
    Oleg Kiselyov's original work:
     - http://beta.hackage.haskell.org/package/enumerator