Building Kafka from the Hardware Up

Higher message retention? - Increase disk size
Higher message throughput? - Increase network capacity
Higher producer performance? - Increase disk I/O speed
Higher consumer performance? - Increase memory
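
To make the retention rule concrete, here is a back-of-the-envelope sketch of how much disk a given retention period implies. The function name, the example workload, the replication factor, and the 30% headroom are all assumptions for illustration; compression and index overhead are ignored.

```python
def retention_disk_bytes(msgs_per_sec, avg_msg_bytes, retention_hours,
                         replication_factor=3, headroom=0.3):
    """Estimate the total cluster disk needed to retain messages.

    Every parameter is an illustrative assumption, not a measured value.
    """
    # Raw bytes written during the retention window.
    raw = msgs_per_sec * avg_msg_bytes * retention_hours * 3600
    # Each replica stores a full copy; headroom leaves free space on disk.
    return int(raw * replication_factor * (1 + headroom))


if __name__ == "__main__":
    # Hypothetical workload: 10,000 msg/s of 1 KiB each, kept for 7 days.
    total = retention_disk_bytes(10_000, 1024, 7 * 24)
    print(f"{total / 1024**4:.1f} TiB across the cluster")
```

Doubling the retention period or the message rate doubles the estimate, which is why retention is mainly a disk-size question.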

Critical Configurations (Consumer)

queued.min.messages
fetch.wait.max.ms
socket.blocking.max.ms
fetch.error.backoff.ms

queued.min.messages

Minimum number of messages per topic+partition that librdkafka tries to keep in the local consumer queue.

fetch.wait.max.ms

Maximum time, in milliseconds, the broker may wait to fill the fetch response with at least fetch.min.bytes of data.

socket.blocking.max.ms

Maximum time a broker socket operation may block.
A lower value improves responsiveness at the expense of slightly higher CPU usage.

fetch.error.backoff.ms

How long to postpone the next fetch request for a topic+partition in case of a fetch error.
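
A minimal sketch of how the consumer settings above might be collected into one configuration. The property names are the ones described above; every value, the broker address, and the group id are assumptions for illustration, not recommendations.

```python
# Consumer settings as a plain dict of librdkafka property names.
# All values below are illustrative assumptions.
consumer_conf = {
    "bootstrap.servers": "localhost:9092",  # hypothetical broker address
    "group.id": "example-group",            # hypothetical consumer group
    "queued.min.messages": 100000,   # keep the local queue well stocked
    "fetch.wait.max.ms": 100,        # broker waits at most 100 ms to fill a fetch
    "socket.blocking.max.ms": 100,   # newer librdkafka releases list this as deprecated
    "fetch.error.backoff.ms": 500,   # pause 500 ms after a fetch error
}

if __name__ == "__main__":
    for key, value in sorted(consumer_conf.items()):
        print(f"{key}={value}")
```

With the confluent-kafka Python client (a wrapper around librdkafka), a dict like this could be passed to `confluent_kafka.Consumer(consumer_conf)`.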

Critical Configurations (Producer)

batch.num.messages
queue.buffering.max.ms
socket.blocking.max.ms
compression.codec
request.required.acks

batch.num.messages

Maximum number of messages batched in one MessageSet.

queue.buffering.max.ms

Maximum time, in milliseconds, for buffering data on the producer queue.

compression.codec

Compression codec to use for compressing message sets: none, gzip, snappy or lz4 (newer librdkafka releases also accept zstd).

socket.blocking.max.ms

Maximum time a broker socket operation may block.
A lower value improves responsiveness at the expense of slightly higher CPU usage.

request.required.acks

Number of acknowledgements the leader broker must receive from in-sync replica (ISR) brokers before responding to the request:

0 = the broker does not send any response
1 = the leader broker waits until the data is written to its local log before responding; only the leader needs to acknowledge the message
-1 = the broker blocks until the message is committed by all in-sync replicas (ISRs), subject to the broker's min.insync.replicas setting
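
A minimal sketch of a producer configuration built from the settings above. The helper name `build_producer_conf`, the broker address, and all values are assumptions for illustration; the two profiles only show the direction of the trade-off, not tuned numbers.

```python
def build_producer_conf(low_latency=False):
    """Return a dict of librdkafka producer properties.

    Hypothetical helper: the values are illustrative, not recommendations.
    """
    conf = {
        "bootstrap.servers": "localhost:9092",  # hypothetical broker address
        "batch.num.messages": 10000,     # large batches favour throughput
        "queue.buffering.max.ms": 50,    # wait up to 50 ms to fill a batch
        "compression.codec": "snappy",
        "request.required.acks": -1,     # wait for all in-sync replicas
    }
    if low_latency:
        # Smaller, more frequent batches and leader-only acknowledgement.
        conf["batch.num.messages"] = 100
        conf["queue.buffering.max.ms"] = 1
        conf["request.required.acks"] = 1
    return conf


if __name__ == "__main__":
    print(build_producer_conf())
    print(build_producer_conf(low_latency=True))
```

With the confluent-kafka Python client, the resulting dict could be passed to `confluent_kafka.Producer(...)`.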

Understanding the Kafka Producer

A batch is ready when one of the following is true:

batch.num.messages is reached (size based batching)
queue.buffering.max.ms is reached (time based batching)
Another batch to the same broker is ready (piggyback)
flush() or close() is called internally by the client
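
The four conditions above can be modelled as a single predicate. This is a simplified sketch of the rules as listed, not librdkafka's internal code; the `BatchState` type, the function name, and the default thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchState:
    message_count: int               # messages queued for this batch so far
    age_ms: float                    # time since the first message was queued
    other_batch_ready: bool = False  # another batch to the same broker is ready
    flush_requested: bool = False    # flush() or close() was called


def batch_ready(state, batch_num_messages=10000, queue_buffering_max_ms=5):
    """Return True if any of the listed batching conditions holds."""
    return (
        state.message_count >= batch_num_messages    # size-based batching
        or state.age_ms >= queue_buffering_max_ms    # time-based batching
        or state.other_batch_ready                   # piggyback
        or state.flush_requested                     # explicit flush/close
    )


if __name__ == "__main__":
    print(batch_ready(BatchState(message_count=10, age_ms=1.0)))
    print(batch_ready(BatchState(message_count=10, age_ms=7.5)))
```

Raising either threshold keeps messages in the queue longer, which is where the extra batching (and the extra latency) comes from.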

In general, more batching results in:

Better compression ratio => Higher throughput
Higher latency (not ideal, but it's a reasonable trade-off)
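
To see why larger batches compress better, the sketch below compares compressing messages one at a time with compressing them as a single batch. It uses Python's zlib (the DEFLATE algorithm that gzip also uses) as a stand-in for the producer's codec; the message shape and function names are made up for illustration.

```python
import json
import zlib


def make_messages(n):
    # Hypothetical JSON events that share most of their structure.
    return [
        json.dumps({"user_id": i, "event": "page_view",
                    "path": "/products/list", "status": 200}).encode()
        for i in range(n)
    ]


def compressed_size_individually(messages):
    # Each message is compressed on its own, so shared structure is never reused.
    return sum(len(zlib.compress(m)) for m in messages)


def compressed_size_batched(messages):
    # One compression pass over the whole batch can exploit repetition
    # across messages.
    return len(zlib.compress(b"".join(messages)))


if __name__ == "__main__":
    msgs = make_messages(1000)
    print("raw bytes:          ", sum(len(m) for m in msgs))
    print("compressed singly:  ", compressed_size_individually(msgs))
    print("compressed as batch:", compressed_size_batched(msgs))
```

Fewer bytes on the wire per message is what turns the better ratio into higher throughput.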

compression.codec

Compression is usually the dominant cost of producer.send().
The speed of different compression types differs a lot.
For now, snappy and lz4 seem to provide the best performance in terms of time to compress a batch.
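
One way to check this claim for your own payloads is to time each codec on a representative batch. snappy and lz4 are not in the Python standard library, so this sketch uses zlib, bz2 and lzma purely as stand-ins to show the measurement; the function name and the sample batch are assumptions, and the numbers say nothing about snappy or lz4 themselves.

```python
import bz2
import lzma
import time
import zlib

# Stand-in codecs from the standard library (not the ones librdkafka offers).
CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def time_codecs(batch, repeats=5):
    """Return {codec: (average seconds per call, compressed size in bytes)}."""
    results = {}
    for name, compress in CODECS.items():
        start = time.perf_counter()
        for _ in range(repeats):
            out = compress(batch)
        elapsed = time.perf_counter() - start
        results[name] = (elapsed / repeats, len(out))
    return results


if __name__ == "__main__":
    # Hypothetical batch: 2,000 similar log lines.
    batch = b"".join(b"2024-01-01 INFO request id=%d status=200\n" % i
                     for i in range(2000))
    for name, (seconds, size) in time_codecs(batch).items():
        print(f"{name}: {seconds * 1000:.2f} ms, {size} bytes")
```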

request.required.acks

Defines the durability level for produced messages.

acks	Throughput	Latency	Durability
0	high	low	No guarantee
1	medium	medium	Leader only
-1	low	high	All in-sync replicas (ISR)

marijus-ravickas/tuning librdkafka performance.md
