
@parj
Created January 19, 2024 06:58
Spark on Kubernetes
# (optional) if using minikube switch docker context
eval $(minikube docker-env)
# Build the Spark container images (run from the root of the Spark distribution)
bin/docker-image-tool.sh -r docker.io/myrepo -t v3.5 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
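# (optional) confirm the build produced the expected images
# (image names below assume the -r docker.io/myrepo -t v3.5 flags used above)
docker images | grep spark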
# Expose K8S API server (minikube)
kubectl proxy --port=8080
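# (optional) sanity-check that the proxied API server responds before submitting
curl -s http://localhost:8080/api/v1/namespaces/default/pods >/dev/null && echo "API reachable"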
# Create the Spark service account and grant it permissions
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-mgr
rules:
- apiGroups: [""] # "" indicates the core API group, which contains all of the resources below
  resources: ["pods", "configmaps", "persistentvolumeclaims", "services"]
  verbs: ["get", "watch", "list", "create", "delete", "deletecollection"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-mgr-spark
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: ClusterRole
  name: pod-mgr
  apiGroup: rbac.authorization.k8s.io
EOF
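# (optional) verify the binding took effect - the spark service account
# should be allowed to create pods (expected output: yes)
kubectl auth can-i create pods --as=system:serviceaccount:default:spark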
# Spark submit
./bin/spark-submit \
  --master k8s://http://localhost:8080 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=myrepo/spark-py:v3.5 \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar \
  1000
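# (optional) once the job finishes, the result is in the driver pod's logs
# (spark-role=driver is a label Spark sets on driver pods it creates)
kubectl get pods -l spark-role=driver
kubectl logs -l spark-role=driver --tail=20   # look for a line like "Pi is roughly 3.14..."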