Vinod KC vinodkc

  • Databricks
  • Mountain View


Spark Listener Demo

This demonstrates Spark job, stage, and task listeners.

1) Start spark-shell

Welcome to
      ____              __
     / __/__  ___ _____/ /__

Spark Structured Streaming HWC integration

1) Setup Kafka topic

cd /usr/hdp/current/kafka-broker/bin/

./kafka-topics.sh --create --zookeeper c420-node2.coelab.cloudera.com:2181 --replication-factor 2 --partitions 3 --topic ss_input
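After creating the topic, it can help to confirm that the partition count and replication factor took effect. The following is a minimal sketch that reuses the `--describe` flag of `kafka-topics.sh`; `describe_topic_cmd` is a hypothetical helper, and it only prints the command so it can be reviewed before running it from the Kafka `bin` directory.

```shell
#!/bin/sh
# Sketch: build the describe command for the topic created above.
# ZK_QUORUM and TOPIC default to the values used in the create step;
# override them for a different cluster.
ZK_QUORUM="${ZK_QUORUM:-c420-node2.coelab.cloudera.com:2181}"
TOPIC="${TOPIC:-ss_input}"

# Hypothetical helper: print (not run) the describe command.
# $1 = ZooKeeper host:port, $2 = topic name
describe_topic_cmd() {
  printf './kafka-topics.sh --describe --zookeeper %s --topic %s\n' "$1" "$2"
}

describe_topic_cmd "$ZK_QUORUM" "$TOPIC"
```

Piping the printed line to `sh` from `/usr/hdp/current/kafka-broker/bin/` would execute it; the output should list three partitions, each with two replicas, if the create step succeeded.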

Log in to the LLAP host node

A) Test with Spark-shell

step 1:

cd /tmp
wget https://raw.githubusercontent.com/dbompart/hive_warehouse_connector/master/hwc_info_collect.sh
chmod +x hwc_info_collect.sh

Spark on Docker - HDP3 YARN

  1. Kerberize the cluster

  2. Enable cgroups in YARN and restart

To enable cgroups on an Ambari cluster, select YARN > Configs on the Ambari dashboard, then click CPU Isolation under CPU. Click Save, then restart all cluster components that require a restart.

I got a mount failure error for /sys/fs/cgroup/cpu/yarn. Solution: run the command below on all NodeManager hosts:

vinodkc / HDP3-Spark structured streaming Kafka integration.md
Last active September 19, 2020 15:45
HDP3 - Spark structured streaming Kafka integration
A) Spark structured streaming Kafka integration - SASL_PLAINTEXT
1) Prerequisites
[consumer-user@c220-node1 sslss]$ ll
-rw------- 1 consumer-user root 144 Apr 21 08:56 consumer-user.keytab
-rw-rw-r-- 1 consumer-user consumer-user 229 Apr 21 09:40 kafka_client_jaas.conf
[consumer-user@c220-node1 sslss]$ cat kafka_client_jaas.conf
KafkaClient {
vinodkc / kafka-cheat-sheet.md
Created March 1, 2019 15:06 — forked from ursuad/kafka-cheat-sheet.md
Quick command reference for Apache Kafka

Kafka Topics

List existing topics

bin/kafka-topics.sh --zookeeper localhost:2181 --list

Describe a topic

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic mytopic

Purge a topic

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic --config retention.ms=1000

... wait a minute ...
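The purge works by temporarily shrinking the topic's retention so the broker deletes old log segments, after which the override should be removed. Below is a minimal sketch of the whole sequence. It assumes the override is cleared with `--delete-config retention.ms` (an assumption, not shown above), and `purge_topic_plan` is a hypothetical helper that only prints the steps.

```shell
#!/bin/sh
# Sketch: purge a topic by temporarily lowering retention.ms.
# ZK, TOPIC and WAIT_SECS are illustrative defaults, not values from a real cluster.
ZK="${ZK:-localhost:2181}"
TOPIC="${TOPIC:-mytopic}"
WAIT_SECS="${WAIT_SECS:-60}"

# Hypothetical helper: print the three steps instead of running them.
# $1 = ZooKeeper host:port, $2 = topic name, $3 = seconds to wait
purge_topic_plan() {
  # 1. Lower retention so existing messages become eligible for deletion.
  echo "bin/kafka-topics.sh --zookeeper $1 --alter --topic $2 --config retention.ms=1000"
  # 2. Give the broker time to delete the expired segments.
  echo "sleep $3"
  # 3. Assumed cleanup: drop the override so the default retention applies again.
  echo "bin/kafka-topics.sh --zookeeper $1 --alter --topic $2 --delete-config retention.ms"
}

purge_topic_plan "$ZK" "$TOPIC" "$WAIT_SECS"
```

Printing the plan first makes it easy to check the topic name before anything destructive runs; piping the output to `sh` from the Kafka install directory would execute it.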