@vikasbajaj
Last active July 15, 2021 07:06
Lab-3: MSK with Glue Schema Registry
------------------------------------
1. Go to the Cloud9 console and open your environment IDE
2. In a Cloud9 terminal, use the following command to SSH into the Kafka EC2 instance
Note: replace the IP address with the private IP address of the Kafka EC2 instance running in your AWS account
ssh -i msk-workshop-pem.pem ec2-user@<kafka-ec2-private-ip>
--------run all the steps in Kafka EC2 Instance terminal/shell---------------
3. Set the following environment variables
Note: replace the cluster ARN with your MSK cluster ARN
export CLUSTER_ARN=<your msk cluster arn>
export BS=$(aws kafka get-bootstrap-brokers --cluster-arn $CLUSTER_ARN --output text)
export ZK=$(aws kafka describe-cluster --cluster-arn $CLUSTER_ARN --output json | jq ".ClusterInfo.ZookeeperConnectString" | tr -d \")
4. Run the following CLI commands from /home/ec2-user/kafka to create the required topics
./bin/kafka-topics.sh --bootstrap-server $BS \
--create --topic listings_topic \
--partitions 3 \
--replication-factor 2
./bin/kafka-topics.sh --bootstrap-server $BS \
--create --topic listings_topic_json \
--partitions 3 \
--replication-factor 2
5. Verify topic creation
./bin/kafka-topics.sh --bootstrap-server $BS --list
./bin/kafka-topics.sh --bootstrap-server $BS --topic listings_topic --describe
./bin/kafka-topics.sh --bootstrap-server $BS --topic listings_topic_json --describe
6. Copy the Kafka bootstrap server address by running the following command
echo $BS
# copy the first address, e.g.
b-6.mskworkshopcluste.xxxxx.c2.kafka.ap-southeast-2.amazonaws.com:9092
7. Clone the GitHub repo https://github.com/vikasbajaj/msk-kafka-workshop.git into the /home/ec2-user directory on your EC2 instance
cd ~
git clone https://github.com/vikasbajaj/msk-kafka-workshop.git
cd msk-kafka-workshop/uber-jars
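Step 6 asks you to copy the first broker address by hand. As a minimal sketch, assuming $BS holds a comma-separated broker list like the example in step 6, the first entry can also be extracted in the shell. The names first_broker and FIRST_BROKER are hypothetical helpers, not part of the workshop repo. If your cluster returns more than one broker string, check the value with echo before using it.

```shell
# first_broker LIST: print the first entry of a comma-separated broker list.
# (Hypothetical helper name; not part of the workshop repo.)
first_broker() {
  # ${1%%,*} strips everything from the first comma onwards
  printf '%s\n' "${1%%,*}"
}

# BS is the variable exported in step 3; empty if it has not been set yet.
FIRST_BROKER=$(first_broker "${BS:-}")
echo "$FIRST_BROKER"
```

The value in $FIRST_BROKER can then be passed to the java -jar commands in place of the pasted address.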
Glue Schema Registry
-------------------
1. Go to the Glue service console and click "Schema registries" in the left pane
2. Click "Add Registry"
3. Specify the registry name: demo-workshop-registry
(use exactly this name for this workshop)
4. Click "Add Registry" to create the registry
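The console steps above can also be done from the command line. This is a sketch, assuming the AWS CLI on the instance has Glue permissions and a default region configured; the workshop itself only describes the console route. The function name create_registry is a hypothetical wrapper around the aws glue create-registry command.

```shell
# Registry name from step 3 above.
REGISTRY_NAME="demo-workshop-registry"

# create_registry NAME: create a Glue Schema Registry with the given name.
# (Hypothetical wrapper name; the underlying CLI call is aws glue create-registry.)
create_registry() {
  aws glue create-registry --registry-name "$1"
}

# Uncomment to run against your account (assumes Glue permissions):
# create_registry "$REGISTRY_NAME"
```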
Producer
--------
1. Run the Kafka producer, which produces messages that conform to the AVRO schema
Go back to the Cloud9 terminal and run the following from the Kafka EC2 instance shell
pwd (should be msk-kafka-workshop/uber-jars)
// to start the producer, you need to pass at least one bootstrap server as an argument, as shown in the command below
java -jar producer-sr-avro-jar-with-dependencies.jar b-6.mskworkshopcluste.xxxxx.c2.kafka.ap-southeast-2.amazonaws.com:9092
2. The producer should start sending messages. Some messages will fail: I have intentionally added an extra field to some messages
Consumer
--------
1. Open a new terminal window in Cloud9 and SSH into the Kafka EC2 instance
Note: replace the IP address with the private IP address of the Kafka EC2 instance running in your AWS account
ssh -i msk-workshop-pem.pem ec2-user@<kafka-ec2-private-ip>
--------run all the steps in Kafka EC2 Instance terminal/shell---------------
2. Set the following environment variables
Note: replace the cluster ARN with your MSK cluster ARN
export CLUSTER_ARN=<your msk cluster arn>
export BS=$(aws kafka get-bootstrap-brokers --cluster-arn $CLUSTER_ARN --output text)
export ZK=$(aws kafka describe-cluster --cluster-arn $CLUSTER_ARN --output json | jq ".ClusterInfo.ZookeeperConnectString" | tr -d \")
3. From /home/ec2-user/msk-kafka-workshop/uber-jars, execute the following commands to start the consumer
cd /home/ec2-user/msk-kafka-workshop/uber-jars
echo $BS
# copy the first broker address
// to start the consumer, you need to pass at least one bootstrap server as an argument, as shown in the command below
java -jar consumer-sr-avro-jar-with-dependencies.jar b-6.mskworkshopcluste.xxxxx.c2.kafka.ap-southeast-2.amazonaws.com:9092
4. You should see the messages in plain text
----------JSON schema-----------------------
1. Stop the producer with CTRL + C
2. Stop the consumer with CTRL + C
3. From the producer terminal
pwd
(should be /home/ec2-user/msk-kafka-workshop/uber-jars)
echo $BS
# copy the first broker address
// to start the producer, you need to pass at least one bootstrap server as an argument, as shown in the command below
java -jar producer-sr-json-jar-with-dependencies.jar b-6.mskworkshopcluste.xxxxx.c2.kafka.ap-southeast-2.amazonaws.com:9092
// you should see "sent message 0"
4. From the consumer terminal
pwd
(should be /home/ec2-user/msk-kafka-workshop/uber-jars)
echo $BS
# copy the first broker address
// to start the consumer, you need to pass at least one bootstrap server as an argument, as shown in the command below
java -jar consumer-sr-json-jar-with-dependencies.jar b-6.mskworkshopcluste.xxxxx.c2.kafka.ap-southeast-2.amazonaws.com:9092
// you should see the message
5. Go to the Glue console and click on the schema registry. You should see two schemas registered (AVRO and JSON)
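The same check can be sketched from the EC2 shell, assuming the AWS CLI has Glue read permissions and a default region configured. The function name list_registry_schemas is a hypothetical wrapper around aws glue list-schemas; the schema names it prints depend on what the producers registered.

```shell
# list_registry_schemas NAME: print the names of all schemas in a Glue registry.
# (Hypothetical wrapper name; the underlying CLI call is aws glue list-schemas.)
list_registry_schemas() {
  aws glue list-schemas \
    --registry-id "RegistryName=$1" \
    --query 'Schemas[].SchemaName' \
    --output text
}

# Uncomment to run against your account (assumes Glue read permissions):
# list_registry_schemas demo-workshop-registry
```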