giri-sh · July 22, 2021 11:26
diff --git a/kafka_fundamentals_notes b/kafka_fundamentals_notes
 What is this gist about?
 -- Apache Kafka Fundamentals


 Why do we need Apache Kafka?
 -- The amount of data produced in the world every day is huge. A commonly cited estimate is about 2.5 quintillion bytes (2.5 exabytes) per day.
 -- Because of the huge volume of data generated every day, we need some kind of queuing system that can process this data for our systems.


 Types of Queuing Systems?
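 -- Point-to-Point (P2P): each message is consumed by exactly one receiver.
 -- Publisher-Subscriber: each message is delivered to every subscriber.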
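
The difference between the two queuing models can be sketched with a small in-memory example. This is only an illustration and has nothing Kafka-specific in it: the class names `PointToPointQueue` and `PubSubTopic`, and the subscriber names, are made up for this sketch.

```python
from collections import deque


class PointToPointQueue:
    """Each message is delivered to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message means no other receiver can see it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Each message is delivered to every subscriber."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        # Every subscriber gets its own copy of the message.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def inbox(self, name):
        return list(self._inboxes[name])


queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()   # one receiver gets the message
second = queue.receive()  # nothing is left for a second receiver

topic = PubSubTopic()
topic.subscribe("billing")
topic.subscribe("alerts")
topic.publish("order-1")  # both subscribers receive it
```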


 What is Apache Kafka?
 -- Kafka is a distributed, reliable and performant streaming platform.
 -- Kafka works on the Publisher-Subscriber model.
 -- Kafka is capable of handling continuous streams of data.
 -- Kafka supports the transfer of large volumes of data or requests between systems.
 -- Kafka stores published data durably, so it stays available after it has been consumed, until the retention period expires.


 What is Zookeeper?
 -- Cluster management system for Kafka.
 -- Also acts as an orchestrator for Kafka.
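 -- Runs as a Zookeeper ensemble, i.e. a group of Zookeeper servers.
 -- Zookeeper is needed to -
 ---- Elect the controller broker, which in turn assigns a leader to each partition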
 ---- Resolve deadlock issues


 Kafka Cluster - Collection of brokers.
 Broker - Independent instance of the Kafka service, typically run on its own machine or VM. A broker that a client uses for its initial connection to the cluster is called a bootstrap server.
 Topic - Named stream of messages, based on a commit log architecture. Multiple topics can be created in a Kafka cluster.
 Partitions - Topics are divided into partitions to allow parallel processing. Messages are stored in each partition with an incremental offset.
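
A minimal in-memory sketch of the topic / partition / offset relationship described above. It is not how Kafka is implemented (there is no disk storage or replication here), and the names `Partition`, `Topic` and the `orders` topic are hypothetical.

```python
class Partition:
    """Append-only log; a message's offset is its position in the log."""

    def __init__(self):
        self._log = []

    def append(self, message):
        offset = len(self._log)  # next incremental offset
        self._log.append(message)
        return offset

    def read(self, offset):
        # Return every message from the given offset onwards.
        return self._log[offset:]


class Topic:
    """A topic is a named collection of partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("orders", num_partitions=2)
first_offset = topic.partitions[0].append("a")
second_offset = topic.partitions[0].append("b")
other_offset = topic.partitions[1].append("c")
```

Offsets are counted per partition, so `other_offset` starts again from the beginning in the second partition.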


 What are the guarantees that Kafka provides?
 -- Ordering is guaranteed within a partition, not across the partitions of a topic.
 -- The default retention period for stored data is 7 days. This is configurable.
 -- 
 -- Partition-to-broker assignment is automatic.


 What is a Producer?
 -- System that generates the data.
 -- Uses the producer API to write data to a Kafka cluster.
 -- Can attach a key to each message. Messages with the same key are sent to the same partition.
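
A rough sketch of how a key can be mapped to a partition. Kafka's default partitioner hashes the key bytes with murmur2; the sketch below swaps in `zlib.crc32` from the Python standard library purely as a stand-in, so the partition numbers it produces will not match a real cluster. The function name `choose_partition` is hypothetical.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Assumption: crc32 stands in for Kafka's murmur2 hash. The idea is the
    same: hash the key, then take it modulo the number of partitions.
    """
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition, which is what keeps
# all messages for one key in order.
partition_a = choose_partition(b"user-42", 3)
partition_b = choose_partition(b"user-42", 3)
```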


 What are Acknowledgements?
 -- Response that the producer waits for to confirm that the data it produced has been safely stored in the Kafka cluster.
 -- 3 modes of acknowledgements.
 ---- 0 - Fire and forget. The producer does not wait for any acknowledgement.
 ---- 1 - Get acknowledgement from the partition leader only.
 ---- All - Get acknowledgement once the leader and all in-sync replicas (ISR) have the data.
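
The three acknowledgement modes can be summarised as "how many replicas must confirm the write before the producer treats it as successful". The helper below is a hypothetical illustration of that rule, not part of any Kafka client; `in_sync_replicas` is assumed to count the leader as well.

```python
def required_acks(acks: str, in_sync_replicas: int) -> int:
    """Return how many replica confirmations the producer waits for.

    acks is "0", "1" or "all"; in_sync_replicas includes the leader.
    """
    if acks == "0":
        return 0                 # fire and forget
    if acks == "1":
        return 1                 # leader only
    if acks == "all":
        return in_sync_replicas  # leader plus every in-sync follower
    raise ValueError(f"unknown acks mode: {acks!r}")


waits = {mode: required_acks(mode, in_sync_replicas=3) for mode in ("0", "1", "all")}
```

The trade-off runs in one direction: the more confirmations the producer waits for, the safer the write and the higher the latency.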


 What is a consumer?
 -- System that reads data from topics.
 -- Multiple consumer groups can consume the same topic or partition independently. Each group tracks its own offsets.


 What are consumer groups?
 -- Group of consumers that work together toward a common goal, sharing the partitions of a topic between them.
 -- The goal can be, for example, to save the data to a DB or to perform an alerting operation.
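
Within a group, each partition is read by only one consumer at a time, so the partitions have to be divided among the group members. The sketch below shows a simple round-robin split. It is a simplified illustration rather than the exact logic of Kafka's built-in assignors, and the function and consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by a group of two consumers.
balanced = assign_partitions([0, 1, 2, 3], ["c1", "c2"])

# More consumers than partitions: the extra consumer stays idle.
with_idle = assign_partitions([0, 1], ["c1", "c2", "c3"])
```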


 What are Delivery Semantics?
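 -- Guarantees about how many times a message is delivered. They depend on when consumers mark a message in a topic as consumed (commit its offset).
 -- 3 modes of delivery semantics -
 ---- At most once - Messages may be lost but are never redelivered.
 ---- At least once - Messages are never lost but may be redelivered.
 ---- Exactly once - Each message is delivered once and only once.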
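
Whether a consumer marks a message as consumed before or after processing it decides what happens on a crash. The simulation below is a sketch under simple assumptions (a single consumer, an in-memory list as the partition, a crash injected at a chosen offset); the function `run` and its parameters are hypothetical. Committing first gives at-most-once behaviour, committing last gives at-least-once.

```python
def run(messages, start, commit_first, crash_at=None):
    """Consume from `start`; return (processed messages, committed offset)."""
    processed = []
    committed = start
    for offset in range(start, len(messages)):
        crashed = offset == crash_at
        if commit_first:
            committed = offset + 1  # mark as consumed, then process
            if crashed:
                break               # crash before processing: message is lost
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])
            if crashed:
                break               # crash before commit: message is redone
            committed = offset + 1  # process, then mark as consumed
    return processed, committed


messages = ["m0", "m1", "m2"]

# At most once: crash at offset 1, then restart from the committed offset.
seen, committed = run(messages, 0, commit_first=True, crash_at=1)
at_most_once = seen + run(messages, committed, commit_first=True)[0]

# At least once: same crash, but the offset is committed after processing.
seen, committed = run(messages, 0, commit_first=False, crash_at=1)
at_least_once = seen + run(messages, committed, commit_first=False)[0]
```

In the first case `m1` is skipped after the restart; in the second it is processed twice.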


 Replication Factor -
 -- Number of copies of each partition of a topic that are kept on different brokers in the cluster.


 Offset - Incremental integer ID assigned to each message within a partition.