InfoQ
- Article by Daniel Bryant https://www.infoq.com/articles/batch-etl-streams-kafka
- Based on the talk by Neha Narkhede https://www.infoq.com/presentations/etl-streams
- Article by Netflix https://www.infoq.com/articles/netflix-migrating-stream-processing
- Netflix Keystone Data pipeline https://www.infoq.com/news/2016/03/netflix-keystone-data-pipeline
- Netflix Mantis - Event Stream processing System https://www.infoq.com/presentations/mantis
Confluent
- https://www.confluent.io/blog/building-real-time-streaming-etl-pipeline-20-minutes/
- https://www.confluent.io/blog/how-to-build-a-scalable-etl-pipeline-with-kafka-connect/
Safari
- Kafka Connect - https://www.safaribooksonline.com/library/view/kafka-the-definitive/9781491936153/ch07.html
- Building ETL Pipelines Using Kafka - https://www.safaribooksonline.com/library/view/building-data-streaming/9781787283985/f0219d02-4468-4ff1-867a-ee19aafd99db.xhtml
Druid - used in the Netflix Keystone Data pipeline ... Apache Druid (incubating) is a high-performance analytics data store for event-driven data.
It provides fast analytical queries, at high concurrency, on both real-time and historical data. Druid is often used to power interactive UIs. It is a new class of data store that combines ideas from OLAP/analytic databases, timeseries databases, and search systems to enable new use cases.
CDC
- Oracle GoldenGate for Big Data https://www.oracle.com/middleware/data-integration/goldengate/big-data/
- Debezium https://debezium.io/
...