Running Spark job on local kubernetes (minikube)
# Starting minikube with 8 GB of memory and 3 CPUs
minikube start --memory 8192 --cpus 3
# Creating a separate namespace for the Spark driver and executor pods
kubectl create namespace spark
# Creating a ServiceAccount and ClusterRoleBinding for Spark
kubectl create serviceaccount spark-serviceaccount --namespace spark
kubectl create clusterrolebinding spark-rolebinding \
  --clusterrole=edit \
  --serviceaccount=spark:spark-serviceaccount \
  --namespace=spark
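# Sketch (illustrative, not part of the original gist): RBAC spells the full
# subject name for a ServiceAccount as system:serviceaccount:<namespace>:<name>,
# so the account created above is addressed as:
SA_SUBJECT="system:serviceaccount:spark:spark-serviceaccount"
# On a live cluster, the grant can then be verified with:
#   kubectl auth can-i create pods --as="$SA_SUBJECT" --namespace spark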
# Changing to the Spark home directory
cd $SPARK_HOME
# Pointing the local Docker CLI at the Docker daemon inside minikube
eval $(minikube docker-env)
# Building the Docker image from the Dockerfile shipped with Spark
docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .
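# Sketch (illustrative assumption, not part of the original gist): one way to
# populate $KUBERNETES_MASTER. On a live cluster the API server URL can be
# read from the active kubeconfig context:
#   KUBERNETES_MASTER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
# A typical minikube address, hard-coded here purely for illustration:
KUBERNETES_MASTER="https://192.168.99.100:8443"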
# Submitting the SparkPi example job
# $KUBERNETES_MASTER can be taken from the output of `kubectl cluster-info`
bin/spark-submit \
  --master k8s://$KUBERNETES_MASTER \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
  --conf spark.kubernetes.container.image=spark:latest \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-serviceaccount \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
# Printing the Spark driver's log
kubectl logs spark-pi-driver --namespace spark
# When the application completes, the executor pods terminate and are cleaned up, | |
# but the driver pod persists logs and remains in "completed" state. | |
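# Sketch (illustrative helper, not from the original gist): to wait for the
# application to complete before reading the final log, poll the driver pod's
# phase. A pod phase of Succeeded or Failed is terminal.
is_terminal() {
  case "$1" in
    Succeeded|Failed) return 0 ;;
    *) return 1 ;;
  esac
}
# On a live cluster the polling loop would look like:
#   until is_terminal "$(kubectl get pod spark-pi-driver --namespace spark -o jsonpath='{.status.phase}')"; do
#     sleep 5
#   done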
# Deleting the spark-pi-driver pod
kubectl delete pod spark-pi-driver --namespace spark