Created August 25, 2018 03:45
Another example API (less complete) based on the Beam model, where everything is streams and tables.
// Following model based on the description of
// Apache Beam in the book Streaming Systems by Reuven Lax; Tyler Akidau; Slava Chernyak

/**
 * Tables are data at rest, and act as a container for data to accumulate and be observed over time.
 * K = Key
 * V = Value
 * W = Window
 * P = Partition
 */
type Table[K, V, W, P]

/**
 * Streams are data in motion, and encode a discretized view of the evolution of a table over time.
 * K = Key
 * V = Value
 * W = Window
 * P = Partition
 */
type Stream[K, V, W, P]

/**
 * Applying nongrouping operations to a stream alters the data in the stream while leaving them in motion,
 * yielding a new stream with possibly different cardinality.
 *
 * Stream -> Stream
 */
trait NonGroupingOperations {
  def map[K, V1, V2, W, P](stream: Stream[K, V1, W, P])(f: V1 => V2): Stream[K, V2, W, P]
  def filter[K, V, W, P](stream: Stream[K, V, W, P])(f: V => Boolean): Stream[K, V, W, P]
  def assignWindow[K, V, W1, W2, P](stream: Stream[K, V, W1, P])(window: W2): Stream[K, V, W2, P]
}
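
/**
 * A minimal in-memory sketch of how the nongrouping operations above might behave. Everything in this object is
 * a hypothetical illustration, not part of the Beam model or of the API above: `Element`, `ListStream`, and the
 * use of a finite `List` to stand in for data in motion are assumptions made so the example is self-contained.
 */
object NonGroupingSketch {
  // Hypothetical element: a value tagged with its key, window, and partition.
  final case class Element[K, V, W, P](key: K, value: V, window: W, partition: P)

  // Hypothetical stream: a finite, ordered list of elements.
  final case class ListStream[K, V, W, P](elements: List[Element[K, V, W, P]])

  // Transforms each value; key, window, and partition are left untouched.
  def map[K, V1, V2, W, P](stream: ListStream[K, V1, W, P])(f: V1 => V2): ListStream[K, V2, W, P] =
    ListStream(stream.elements.map(e => Element(e.key, f(e.value), e.window, e.partition)))

  // Keeps only the elements whose value satisfies `f`, so the cardinality may shrink.
  def filter[K, V, W, P](stream: ListStream[K, V, W, P])(f: V => Boolean): ListStream[K, V, W, P] =
    ListStream(stream.elements.filter(e => f(e.value)))

  // Replaces the window of every element with the single window given.
  def assignWindow[K, V, W1, W2, P](stream: ListStream[K, V, W1, P])(window: W2): ListStream[K, V, W2, P] =
    ListStream(stream.elements.map(e => Element(e.key, e.value, window, e.partition)))

  // Usage: the result of each call is still a stream, so the calls chain.
  val input = ListStream(List(Element("a", 1, "w0", 0), Element("b", 2, "w0", 0)))
  val doubledAndFiltered = filter(map(input)(_ * 2))(_ > 2) // keeps only Element("b", 4, "w0", 0)
}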

/**
 * Grouping data within a stream brings those data to rest, yielding a table that evolves over time.
 *
 * - Windowing incorporates the dimension of event time into such groupings.
 *
 * - Merging windows dynamically combine over time, allowing them to reshape themselves in response to the data
 *   observed and dictating that key remain the unit of atomicity/parallelization, with window being a child
 *   component of grouping within that key.
 *
 * Stream -> Table
 */
trait GroupingOperations {
}
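
/**
 * One possible shape for a grouping operation, sketched in memory. This is an assumption for illustration only:
 * the trait above declares no methods, so `groupByKeyAndWindow`, `Element`, and `MapTable` are hypothetical
 * names, and the partition parameter is dropped for brevity. Window merging is not modelled.
 */
object GroupingSketch {
  // Hypothetical element: a value tagged with its key and window.
  final case class Element[K, V, W](key: K, value: V, window: W)

  // Hypothetical table: the values accumulated so far for each (key, window) pair.
  final case class MapTable[K, V, W](rows: Map[(K, W), List[V]])

  // Brings the stream to rest by accumulating each value under its key, with window as a child of the key.
  def groupByKeyAndWindow[K, V, W](stream: List[Element[K, V, W]]): MapTable[K, V, W] =
    MapTable(stream.foldLeft(Map.empty[(K, W), List[V]]) { (rows, e) =>
      val rowKey = (e.key, e.window)
      rows.updated(rowKey, rows.getOrElse(rowKey, Nil) :+ e.value)
    })

  // Usage: two keys in the same window end up as two rows.
  val table = groupByKeyAndWindow(List(Element("a", 1, 0), Element("b", 2, 0), Element("a", 3, 0)))
  // table.rows == Map(("a", 0) -> List(1, 3), ("b", 0) -> List(2))
}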

/**
 * Triggering data within a table ungroups them into motion, yielding a stream that captures a view of the table’s
 * evolution over time.
 *
 * - Watermarks provide a notion of input completeness relative to event time, which is a useful reference point
 *   when triggering event-timestamped data, particularly data grouped into event-time windows from unbounded
 *   streams.
 *
 * - The accumulation mode for the trigger determines the nature of the stream, dictating whether it contains
 *   deltas or values, and whether retractions for previous deltas/values are provided.
 *
 * Table -> Stream
 */
trait UnGroupingOperations {
}
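
/**
 * A rough sketch of how the accumulation mode could shape the stream a trigger produces. All names here
 * (`AccumulationMode` and its cases, `Output`, `trigger`) are hypothetical, since the trait above declares no
 * methods. The sketch assumes a single key and window whose table value is an integer sum, and that each inner
 * list holds the inputs that arrived between two consecutive trigger firings. Watermarks are not modelled.
 */
object UnGroupingSketch {
  sealed trait AccumulationMode
  case object Discarding extends AccumulationMode                // emit deltas only
  case object Accumulating extends AccumulationMode              // emit the full value so far
  case object AccumulatingAndRetracting extends AccumulationMode // retract the previous value, then emit the new one

  sealed trait Output
  final case class Emit(value: Int) extends Output
  final case class Retract(value: Int) extends Output

  def trigger(panes: List[List[Int]], mode: AccumulationMode): List[Output] = {
    val deltas = panes.map(_.sum)                 // what each firing adds to the table
    val totals = deltas.scanLeft(0)(_ + _).tail   // the table value at each firing
    mode match {
      case Discarding   => deltas.map(Emit(_))
      case Accumulating => totals.map(Emit(_))
      case AccumulatingAndRetracting =>
        totals.zipWithIndex.flatMap { case (total, i) =>
          if (i == 0) List(Emit(total)) else List(Retract(totals(i - 1)), Emit(total))
        }
    }
  }

  // Usage: two firings, the first after input 3, the second after inputs 8 and 1.
  val panes = List(List(3), List(8, 1))
  // trigger(panes, Discarding)                == List(Emit(3), Emit(9))
  // trigger(panes, Accumulating)              == List(Emit(3), Emit(12))
  // trigger(panes, AccumulatingAndRetracting) == List(Emit(3), Retract(3), Emit(12))
}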