@codingforentrepreneurs
Last active April 16, 2024 21:02
Kafka Installation Bootstrap Script to run on ubuntu machines for the Coding with Kafka Course

Kafka Install Script

This script is a reference for the Coding with Kafka ebook. Here's how to use it:

For Zookeeper:

ssh root@zookeeper-ip "sudo chmod +x /tmp/kafka-bootstrap-script.sh"
ssh root@zookeeper-ip "sudo /tmp/kafka-bootstrap-script.sh zookeeper1"
ssh root@zookeeper-ip "cat /data/zookeeper/myid"
ssh root@zookeeper-ip "ls /opt/kafka/"

For Kafka:

ssh root@kafka-ip "sudo chmod +x /tmp/kafka-bootstrap-script.sh"
ssh root@kafka-ip "sudo /tmp/kafka-bootstrap-script.sh kafka1"
ssh root@kafka-ip "ls /opt/kafka/"
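
Both sets of commands assume the script already exists at /tmp/kafka-bootstrap-script.sh on the remote machine. Here is a minimal sketch of one way to copy it over and run it. It assumes the script is saved locally as kafka-bootstrap-script.sh and that scp works with the same root login. The bootstrap_node function and the RUN variable are hypothetical names used only for this illustration:

```shell
#!/bin/bash
# Hypothetical helper: copy the bootstrap script to a machine and run it there.
# Assumes a local copy named kafka-bootstrap-script.sh (not stated in the gist).
SCRIPT="kafka-bootstrap-script.sh"

# Usage: bootstrap_node <ip-or-host> <new-hostname>
# Set RUN=echo to print the commands instead of executing them.
bootstrap_node() {
  local ip="$1"
  local name="$2"
  # Copy the script to /tmp on the remote machine
  ${RUN:-} scp "$SCRIPT" "root@${ip}:/tmp/${SCRIPT}"
  # Make it executable, then run it with the new hostname as its only argument
  ${RUN:-} ssh "root@${ip}" "chmod +x /tmp/${SCRIPT}"
  ${RUN:-} ssh "root@${ip}" "/tmp/${SCRIPT} ${name}"
}
```

With RUN=echo set, calling bootstrap_node zookeeper-ip zookeeper1 prints the three commands without contacting the machine, which is a cheap way to check them before running for real.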

Here's what the script is doing:

  • Set the hostname based on the keyword passed (e.g. zookeeper1 or kafka1 in the example above)
  • Create a user tars and add it to the sudo group. This user will run the Kafka and Zookeeper services in the background.
  • Create all necessary directories for Zookeeper and Kafka configuration
  • Install Java and other required packages to run Zookeeper and Kafka
  • Update the file limits to allow for 100,000 file descriptors; Kafka may open a lot of files so we want to increase the limit.
  • Set vm.swappiness to 1 so the kernel uses RAM as much as possible before falling back to swap space, which lives on the much slower disk. This setting helps us use a Linode 4GB instance but is another indicator as to why 8GB+ is needed.
  • Download Kafka from the official Apache Kafka website and extract it to /opt/kafka/. /opt/kafka/bin/ will hold all the Kafka scripts we'll need to run Zookeeper, run Kafka, use the Zookeeper shell, create topics, produce to and consume from topics, and more.
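
One step the list above does not spell out: when the hostname contains zookeeper, the script also writes the numeric part of the hostname to /data/zookeeper/myid, which is the file the cat command in the Zookeeper usage prints. Here is a minimal sketch of that derivation on its own, using the same grep and sed pipeline as the script. The function name zookeeper_id_from_hostname is hypothetical and exists only for this illustration:

```shell
#!/bin/bash
# Sketch of the myid derivation; the function name is hypothetical.
# Prints the numeric ID for hostnames containing "zookeeper", nothing otherwise.
zookeeper_id_from_hostname() {
  local name="$1"
  local lower
  # Lowercase the name so the match is case-insensitive
  lower="$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')"
  case "$lower" in
    *zookeeper*)
      # Keep only the digits, then drop leading zeros (e.g. "03" becomes "3")
      printf '%s\n' "$name" | grep -o '[0-9]*' | sed 's/^0*//'
      ;;
  esac
}

zookeeper_id_from_hostname zookeeper1    # prints 1
zookeeper_id_from_hostname Zookeeper03   # prints 3
zookeeper_id_from_hostname kafka1        # prints nothing
```

A hostname with no digits, such as plain zookeeper, yields an empty ID, so the names passed to the script should end in a number.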
#!/bin/bash
# Public gist available at:
# https://gist.github.com/codingforentrepreneurs/aef0968829883110e24b107f7278255f
# Check if an argument is provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 new_hostname"
    exit 1
fi
# Set the hostname
sudo hostnamectl set-hostname "$1"
# Confirmation message
echo "Hostname has been changed to $1"
# Set the Zookeeper instance ID from the hostname
# Check if 'zookeeper' is in the hostname (case-insensitive)
if [[ "${1,,}" == *"zookeeper"* ]]; then
    # Extract the numerical ID from the hostname, remove leading zeros
    host_id=$(echo "$1" | grep -o '[0-9]*' | sed 's/^0*//')
    # Ensure /data/zookeeper/ exists
    sudo mkdir -p /data/zookeeper
    # Write the ID to the Zookeeper myid file
    echo "$host_id" | sudo tee /data/zookeeper/myid
    # Display the contents of the myid file
    cat /data/zookeeper/myid
else
    echo "The new hostname does not contain 'zookeeper', no Zookeeper ID changes made."
fi
# Create user "tars"
sudo useradd -r -s /sbin/nologin tars
sudo usermod -aG sudo tars
echo "tars ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/tars
# Define all required directories
directories=(
    /data/my-config
    /var/log/zookeeper
    /var/log/kafka
    /opt/kafka
    /tmp/zookeeper
    /data/zookeeper
    /data/kafka
)
# Loop through each directory
for dir in "${directories[@]}"; do
    # Create the directory with sudo, avoiding errors if it already exists
    sudo mkdir -p "$dir"
    # Change the ownership to 'tars' user and group, recursively
    sudo chown -R tars:tars "$dir"
done
# Install Java and Required packages
sudo apt-get update && sudo apt-get -y install wget ca-certificates zip net-tools vim nano tar netcat openjdk-8-jdk
# Add file limits configs - allow to open 100,000 file descriptors
echo "* hard nofile 100000" | sudo tee --append /etc/security/limits.conf
echo "* soft nofile 100000" | sudo tee --append /etc/security/limits.conf
# Set swappiness to 1 so RAM is preferred over swap (applied now, then persisted for reboots)
sudo sysctl vm.swappiness=1
echo 'vm.swappiness=1' | sudo tee --append /etc/sysctl.conf
# Download Kafka (including Zookeeper) from
# https://kafka.apache.org/downloads
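# -f makes curl exit with an error on an HTTP failure instead of saving the error page; -L follows redirects
curl -fL https://dlcdn.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz -o kafka.tgz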
tar -xvzf kafka.tgz
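mv kafka_*/* /opt/kafka/
# The earlier chown ran before these files existed, so hand them to 'tars' as well
sudo chown -R tars:tars /opt/kafka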
rm kafka.tgz
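
Because the script appends to /etc/security/limits.conf and /etc/sysctl.conf with tee --append, running it a second time adds the same lines again. Here is a minimal sketch of one way to make those appends repeatable. It uses a hypothetical append_once helper (not part of the gist) that writes a line only when the file does not already contain it, and it assumes the caller is already root, as in the ssh root@... usage above:

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if that exact line is missing.
# Usage: append_once "<line>" <file>
append_once() {
  local line="$1"
  local file="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    # Plain redirection: assumes the caller can already write to the file
    printf '%s\n' "$line" >> "$file"
  fi
}

# How the three appends in the script could be written with it:
# append_once "* hard nofile 100000" /etc/security/limits.conf
# append_once "* soft nofile 100000" /etc/security/limits.conf
# append_once "vm.swappiness=1" /etc/sysctl.conf
```

The -F flag matters here because the limits lines start with *, which grep would otherwise treat as part of a regular expression.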