Last active
March 30, 2021 07:28
This gist explains how to connect jupyterhub with Spark2 on CDH 5.13 Cluster
### This gist explains how to connect JupyterHub with Spark2 on a CDH 5.13 cluster.
By following the instructions below, Spark can be configured with JupyterHub on any cluster, in standalone mode, or locally.
- Install JupyterHub by following the instructions in the official repo: https://github.com/jupyterhub/jupyterhub
- Once it is installed, and before configuring the Spark2 kernel, locate Jupyter's kernels directory.
  On CentOS 7, it is under /usr/share/jupyter/kernels/
- Assuming all JupyterHub kernels are in /usr/share/jupyter/kernels/, create a directory for the new kernel:
    mkdir /usr/share/jupyter/kernels/pyspark2
- Create the pyspark2 kernel spec file:
    touch /usr/share/jupyter/kernels/pyspark2/kernel.json
- Open the file and add the following content:
    vi /usr/share/jupyter/kernels/pyspark2/kernel.json
    {
      "argv": [
        "python3.6",
        "-m",
        "ipykernel_launcher",
        "-f",
        "{connection_file}"
      ],
      "display_name": "Python3.6 + Pyspark(Spark 2.2.0)",
      "language": "python",
      "env": {
        "PYSPARK_PYTHON": "/usr/bin/python3.6",
        "SPARK_HOME": "/opt/cloudera/parcels/SPARK2/lib/spark2",
        "HADOOP_CONF_DIR": "/etc/spark2/conf/yarn-conf",
        "HADOOP_CLIENT_OPTS": "-Xmx2147483648 -XX:MaxPermSize=512M -Djava.net.preferIPv4Stack=true",
        "PYTHONPATH": "/opt/cloudera/parcels/SPARK2/lib/spark2/python/lib/py4j-0.10.4-src.zip:/opt/cloudera/parcels/SPARK2/lib/spark2/python/",
        "PYTHONSTARTUP": "/opt/cloudera/parcels/SPARK2/lib/spark2/python/pyspark/shell.py",
        "PYSPARK_SUBMIT_ARGS": " --master yarn --deploy-mode client pyspark-shell"
      }
    }
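
Because paths such as SPARK_HOME and the py4j zip name vary between installs, the same kernel.json can also be generated instead of typed by hand. The following is only a minimal sketch under assumptions: the helper names `build_pyspark_kernel_spec` and `write_kernel_spec` are hypothetical, the same interpreter path is reused for `argv` and `PYSPARK_PYTHON`, and `HADOOP_CLIENT_OPTS` is left out for brevity.

```python
import glob
import json
import os


def build_pyspark_kernel_spec(spark_home, python_bin, hadoop_conf_dir, display_name):
    """Return a kernel spec dict shaped like the kernel.json above (hypothetical helper)."""
    # The py4j zip name changes between Spark releases, so look it up
    # under SPARK_HOME instead of hard-coding a version.
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError("no py4j-*-src.zip found under " + spark_home)
    py4j_zip = matches[-1]
    return {
        "argv": [python_bin, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
        "env": {
            "PYSPARK_PYTHON": python_bin,
            "SPARK_HOME": spark_home,
            "HADOOP_CONF_DIR": hadoop_conf_dir,
            "PYTHONPATH": py4j_zip + ":" + os.path.join(spark_home, "python"),
            "PYTHONSTARTUP": os.path.join(spark_home, "python", "pyspark", "shell.py"),
            "PYSPARK_SUBMIT_ARGS": "--master yarn --deploy-mode client pyspark-shell",
        },
    }


def write_kernel_spec(spec, kernel_dir):
    """Write spec as kernel.json inside kernel_dir and return the file path."""
    os.makedirs(kernel_dir, exist_ok=True)
    path = os.path.join(kernel_dir, "kernel.json")
    with open(path, "w") as fh:
        json.dump(spec, fh, indent=2)
    return path
```

With the values from this gist, a call might look like `write_kernel_spec(build_pyspark_kernel_spec("/opt/cloudera/parcels/SPARK2/lib/spark2", "/usr/bin/python3.6", "/etc/spark2/conf/yarn-conf", "Python3.6 + Pyspark(Spark 2.2.0)"), "/usr/share/jupyter/kernels/pyspark2")`, run as a user who can write to the kernels directory.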
- Update the env settings to match your setup if it differs.
- Start JupyterHub and test it by building Spark2 applications.
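
As a quick check after starting a notebook with the new kernel, a first cell along the following lines can confirm that the env block from kernel.json actually reached the kernel process. This is a sketch, not part of the original setup; `EXPECTED_VARS` and `missing_spark_env` are hypothetical names chosen for illustration.

```python
import os

# Variables the pyspark2 kernel.json is expected to export into the kernel.
EXPECTED_VARS = ("SPARK_HOME", "HADOOP_CONF_DIR", "PYSPARK_PYTHON", "PYSPARK_SUBMIT_ARGS")


def missing_spark_env(env=None):
    """Return the expected variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


missing = missing_spark_env()
if missing:
    print("Kernel env is missing:", ", ".join(missing))
else:
    print("Kernel env looks complete; SPARK_HOME =", os.environ["SPARK_HOME"])
```

If nothing is reported missing, the `PYTHONSTARTUP` script (pyspark's `shell.py`) should already have defined `spark` and `sc` in the notebook, so a small job such as `sc.parallelize(range(100)).sum()` is a reasonable end-to-end test against YARN.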