@arpan57
Last active April 29, 2021 00:05
Steps to get Kafka metrics into an ELK dashboard
For observability of a Kafka cluster through Elasticsearch and Kibana, install and configure Metricbeat.
The following was tested on a 3-node Kafka cluster installed on a single EC2 host, with ELK installed on another EC2 host. No authentication/authorization was configured, for ease of setup.
Steps for Metricbeat
==========================================
1. Download the Metricbeat tarball or RPM (see the Metricbeat documentation).
2. Extract the tarball or install the RPM on the Kafka host.
3. After installation, update metricbeat.yml (usually /etc/metricbeat/metricbeat.yml).
Update the host entry under setup.kibana to point to the Kibana endpoint:
setup.kibana:
  host: "mykibanahost:5601"
Update the hosts entry under output.elasticsearch to point to the Elasticsearch endpoint:
output.elasticsearch:
  hosts: ["myEShost:9200"]
4. Enable the Kafka module for Metricbeat on the Kafka host: ```metricbeat modules enable kafka```
5. Download the Jolokia JVM agent jar (jolokia-jvm-1.6.2-agent.jar) from: https://jolokia.org/download.html
6. Place it in a location that the Kafka startup script can access.
7. Update the <KAFKA_HOME>/bin/kafka-server-start.sh script to add KAFKA_JMX_OPTS, similar to the following:
```
export KAFKA_JMX_OPTS="
-javaagent:/location/of/jolokia-jvm-agent.jar=port=8778,host=localhost \
-Dcom.sun.management.jmxremote=true \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Djava.rmi.server.hostname=localhost \
-Dcom.sun.management.jmxremote.host=localhost \
-Dcom.sun.management.jmxremote.port=9888 \
-Dcom.sun.management.jmxremote.rmi.port=9888 \
-Djava.net.preferIPv4Stack=true"
```
Here 8778 is the port on which the Jolokia agent exposes the JMX metrics over HTTP.
8. Confirm that the agent responds using curl -s http://localhost:8778/jolokia/version | jq
9. Edit the Kafka module configuration for Metricbeat (/etc/metricbeat/modules.d/kafka.yml).
Documentation: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-kafka.html
10. In kafka.yml, set the hosts property to the Kafka broker IP or DNS name and port, then uncomment the sections for the broker/consumer/producer metrics and point them to the correct broker IP and Jolokia ports, as shown below.
```
# Kafka metrics collected using the Kafka protocol
- module: kafka
  metricsets:
    - partition
    - consumergroup
  period: 10s
  hosts: ["localhost:9092"]

# Metrics collected from a Kafka broker using Jolokia
- module: kafka
  metricsets:
    - broker
  period: 10s
  hosts: ["localhost:8778"]

# Metrics collected from a Java Kafka consumer using Jolokia
- module: kafka
  metricsets:
    - consumer
  period: 10s
  hosts: ["localhost:8778"]

# Metrics collected from a Java Kafka producer using Jolokia
- module: kafka
  metricsets:
    - producer
  period: 10s
  hosts: ["localhost:8778"]
```
11. Load the index template and Kibana dashboards using ```$ metricbeat setup -e```
12. Open the Kibana default home page -> Kibana Visualize & Analyze -> Add Data -> Metrics tab -> Kafka metrics -> click on Check Data. The expected message is 'Data successfully received from this module'. Then go to Discover.
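
Since the setup above runs three brokers on one host, each broker needs its own Jolokia and JMX ports, and the fixed 8778/9888 values from step 7 fit only one broker. A minimal sketch of one way to generate per-broker options is given below. The helper name `jmx_opts_for_broker`, the jar path in the usage comment, and the port scheme (8778+N for Jolokia, 9888+N for JMX) are assumptions for illustration and are not part of the tested setup.

```shell
#!/bin/sh
# Hypothetical helper: prints JMX/Jolokia JVM options for broker number N (0, 1, 2, ...).
# Assumed port scheme: Jolokia on 8778+N, JMX on 9888+N.
jmx_opts_for_broker() {
  broker_index=$1
  agent_jar=$2
  jolokia_port=$((8778 + broker_index))
  jmx_port=$((9888 + broker_index))
  # printf repeats the '%s ' format for every argument, giving one space-separated line.
  printf '%s ' \
    "-javaagent:${agent_jar}=port=${jolokia_port},host=localhost" \
    "-Dcom.sun.management.jmxremote=true" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Djava.rmi.server.hostname=localhost" \
    "-Dcom.sun.management.jmxremote.host=localhost" \
    "-Dcom.sun.management.jmxremote.port=${jmx_port}" \
    "-Dcom.sun.management.jmxremote.rmi.port=${jmx_port}" \
    "-Djava.net.preferIPv4Stack=true"
  printf '\n'
}

# Example usage before starting the second broker (index 1); the jar path is a placeholder:
#   export KAFKA_JMX_OPTS="$(jmx_opts_for_broker 1 /location/of/jolokia-jvm-agent.jar)"
#   <KAFKA_HOME>/bin/kafka-server-start.sh <server properties file for that broker>
```

With a scheme like this, the hosts list of the Jolokia-based metricsets in kafka.yml would need one entry per broker (for example localhost:8778, localhost:8779, and localhost:8780).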
arpan57 commented Apr 28, 2021

Some important metrics:

- Is the cluster up/healthy?
Whether the cluster is up can be determined from two sources (see the attached screenshots): the out-of-the-box Heartbeat/uptime beats, and the number of brokers shown on the Kafka dashboard.
Using Heartbeat
From the default Kafka dashboard

- Are brokers online/offline?
This can be found using a panel on the Kafka dashboard.

- Number of messages & avg. message size
These could not be extracted from the JMX metrics. However, the number of messages for a topic can be found using a custom script. The avg. message size would be the topic size on disk divided by the number of messages, which might be overkill.

- Number of failed messages (failed to read/write messages)
The following Kafka JMX metrics can be used:

kafka.broker.request.fetch.failed
kafka.broker.request.produce.failed

- Storage/memory
Metrics for the underlying host can be obtained from the Observability metrics.

- Number of clients using the cluster (producers & consumers)
The number of consumer groups can be found using the Kafka dashboard available in Kibana. For the number of producers, I don't see a direct way.
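
To spot-check the failed-request counters outside Kibana, the Jolokia agent can be queried directly. The sketch below only builds the read URLs. The helper name `jolokia_read_url` is hypothetical, and the two MBean names are my assumption about which broker MBeans back kafka.broker.request.fetch.failed and kafka.broker.request.produce.failed.

```shell
#!/bin/sh
# Hypothetical helper: builds a Jolokia "read" URL for one MBean attribute.
#   $1 = host:port of the Jolokia agent, $2 = MBean name, $3 = attribute name
jolokia_read_url() {
  printf 'http://%s/jolokia/read/%s/%s\n' "$1" "$2" "$3"
}

# Assumed broker MBeans behind the failed fetch/produce request metrics.
FAILED_FETCH_MBEAN='kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec'
FAILED_PRODUCE_MBEAN='kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec'

# Example usage (needs a running broker with the Jolokia agent on port 8778):
#   curl -s "$(jolokia_read_url localhost:8778 "$FAILED_FETCH_MBEAN" Count)" | jq .value
#   curl -s "$(jolokia_read_url localhost:8778 "$FAILED_PRODUCE_MBEAN" Count)" | jq .value
```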

arpan57 commented Apr 29, 2021

To get the number of messages in a topic:

<KAFKA_HOME>/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9093 --topic <TOPIC_NAME> --time -1 | tr ":" " " | awk '{ sum += $3 } END { print sum }'
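
The command above sums the latest offsets, which equals the message count only while every partition still starts at offset 0. Once retention has deleted older segments, the earliest offsets need to be subtracted. A rough sketch of that variant follows. The function names `sum_offsets` and `count_messages` are hypothetical, and the result is still an approximation (for example on compacted topics).

```shell
#!/bin/sh
# Sums the third field of "topic:partition:offset" lines, as printed by GetOffsetShell.
sum_offsets() {
  awk -F: '{ sum += $3 } END { print sum + 0 }'
}

# Hypothetical wrapper: latest offsets (--time -1) minus earliest offsets (--time -2).
#   $1 = Kafka home directory, $2 = broker list (host:port), $3 = topic name
count_messages() {
  latest=$("$1/bin/kafka-run-class.sh" kafka.tools.GetOffsetShell \
    --broker-list "$2" --topic "$3" --time -1 | sum_offsets)
  earliest=$("$1/bin/kafka-run-class.sh" kafka.tools.GetOffsetShell \
    --broker-list "$2" --topic "$3" --time -2 | sum_offsets)
  echo $((latest - earliest))
}

# Example usage (needs a running cluster):
#   count_messages <KAFKA_HOME> localhost:9093 <TOPIC_NAME>
```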
