@replay
Created March 18, 2019 09:20
This example assumes we keep only 3 hours of data in Kafka, to illustrate the problem with 6h chunks.
Time --->
| 05:00 | 06:00 | 07:00 | 08:00 | 09:00 | 10:00 | 11:00 | 12:00 |
|---------------------------------------------------------------|
        ^                                               ^ Times when 6h of data gets flushed into the backend store
                                                    ^ MT restarts shortly before the time at which it was due to flush a 6h chunk
                            |-----------------------| Time range that MT can replay from Kafka
                            |---------------------------| Time range that MT will write as the 6h block from 06:00 - 12:00
        |-------------------| Time range that is lost because MT could no longer get it from Kafka
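
The arithmetic behind the timeline can be sketched in a few lines of Python. This is only an illustration, not MT code: the function name, the epoch-aligned chunk boundaries, and the 11:30 restart time are assumptions (the diagram only says "shortly before" the flush).

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def lost_range(restart, retention, chunk_span):
    """Return (chunk_start, chunk_end, replay_start, lost) for the chunk open at `restart`.

    Hypothetical helper: assumes chunks are aligned to multiples of
    `chunk_span` since the Unix epoch, and that Kafka holds exactly
    `retention` worth of data at restart time.
    """
    # Start of the chunk that was being filled when the restart happened.
    chunk_start = restart - (restart - EPOCH) % chunk_span
    chunk_end = chunk_start + chunk_span
    # Oldest point that can still be replayed from Kafka.
    replay_start = restart - retention
    # Anything between chunk start and replay start can no longer be recovered.
    lost = max(replay_start - chunk_start, timedelta(0))
    return chunk_start, chunk_end, replay_start, lost


# Assumed scenario: 6h chunks, 3h Kafka retention, restart at 11:30.
start, end, replay, lost = lost_range(
    datetime(2019, 3, 18, 11, 30), timedelta(hours=3), timedelta(hours=6)
)
print(f"chunk {start:%H:%M}-{end:%H:%M}, replay from {replay:%H:%M}, lost {lost}")
```

With these assumed values the chunk written at 12:00 is missing its first 2.5 hours. Whenever the Kafka retention is at least as long as the chunk span, the lost range is zero.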