We will use the manifests way of installing Kubeflow - https://github.com/kubeflow/manifests
Create a Kind cluster with a service account signing key configured for the API server; Kubeflow needs this (Istio depends on it). Like below:
cat <<EOF | kind create cluster --name=kubeflow --kubeconfig /home/alexpunnen/kindclusters/mycluster.yaml --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        "service-account-issuer": "kubernetes.default.svc"
        "service-account-signing-key-file": "/etc/kubernetes/pki/sa.key"
EOF
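If you prefer to keep the cluster spec under version control, the same config can be saved to a file and passed via --config instead of a heredoc (a sketch; the file name is arbitrary):

```shell
# Save the cluster spec above to a file (name is arbitrary) for reuse
cat > kind-kubeflow.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        "service-account-issuer": "kubernetes.default.svc"
        "service-account-signing-key-file": "/etc/kubernetes/pki/sa.key"
EOF
# Later: kind create cluster --name=kubeflow --config=kind-kubeflow.yaml

# Sanity check - both API-server flags should be present
grep -c service-account kind-kubeflow.yaml   # prints 2
```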
Save the kubeconfig so kubectl points at the new cluster:
manifests$ kind get kubeconfig --name kubeflow > ~/.kube/config
Clone the Kubeflow manifests repo and install Kubeflow via the advanced/manifests method:
git clone https://github.com/kubeflow/manifests
cd manifests
Use the "install everything as one" method, which retries until all resources apply cleanly (CRDs need time to register):
while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
You may get errors in the Istio pods (stuck in ContainerCreating) - delete the pods and they should start again:
MountVolume.SetUp failed for volume "istiod-ca-cert" : configmap "istio-ca-root-cert" not found
kubectl -n istio-system delete pod cluster-local-gateway-7bf6b98855-mxgft istio-ingressgateway-78bc678876-4bzgn
pod "cluster-local-gateway-7bf6b98855-mxgft" deleted
pod "istio-ingressgateway-78bc678876-4bzgn" deleted
alexpunnen@pop-os:~/manifests$ kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
authservice-0 1/1 Running 0 5m19s
cluster-local-gateway-7bf6b98855-ngqz8 1/1 Running 0 3s
istio-ingressgateway-78bc678876-glw9n 1/1 Running 0 3s
istiod-755f4cc457-ndlwp 1/1 Running 0 5m19s
There is a small bug in one of the manifest files for MySQL (kubeflow/manifests#2065).
Correct that so that MySQL and related pods come up:
alex@pop-os:~/kubeflow/manifests$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  namespace: kubeflow
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20G
EOF
Kind clusters do not expose a Docker socket, so the pipelines must use the Process Namespace Sharing (PNS) executor instead of the Docker one. Apply the PNS variant of the manifests, else you will get a docker socket error in your pipeline pods:
kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl delete -f -
kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user-pns | kubectl apply -f -
Now you can use an ingress controller, a LoadBalancer, or port forwarding to access the Kubeflow dashboard and the Jupyter notebooks.
The simplest way is port forwarding:
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
Open your browser and visit http://localhost:8080. You should get the Dex login screen. Log in with the default user's credentials: the default email address is [email protected] and the default password is 12341234.
However, the LoadBalancer way is also simple and better in the long run.
Follow these instructions to install and configure MetalLB:
https://kind.sigs.k8s.io/docs/user/loadbalancer/
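The kind docs above cover installing MetalLB; afterwards it needs a pool of addresses from the kind Docker network. A sketch of the (pre-v0.13) ConfigMap-based configuration, assuming the default kind subnet 172.18.0.0/16 - verify yours before applying:

```shell
# Check which subnet the kind docker network actually uses
docker network inspect -f '{{.IPAM.Config}}' kind

# Give MetalLB a layer2 address pool on that network.
# The 172.18.255.200-250 range assumes the default kind subnet;
# adjust it if your network differs.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.18.255.200-172.18.255.250
EOF
```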
After that, patch your istio-ingressgateway service to type LoadBalancer as below, and access your cluster via the assigned Docker-network IP, e.g. http://172.18.255.200/
~/manifests$ kubectl get svc/istio-ingressgateway -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway NodePort 10.96.140.228 <none> 15021:32149/TCP,80:31999/TCP,443:31895/TCP,31400:32732/TCP,15443:32015/TCP 28h
~/manifests$ kubectl patch svc/istio-ingressgateway -n istio-system -p '{"spec": {"type": "LoadBalancer"}}'
service/istio-ingressgateway patched
~/manifests$ kubectl get svc/istio-ingressgateway -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway LoadBalancer 10.96.140.228 172.18.255.200 15021:32149/TCP,80:31999/TCP,443:31895/TCP,31400:32732/TCP,15443:32015/TCP 28h
Create a Notebook called test2 (https://i.imgur.com/JdKHamv.png). Note: if you are running in GCP, leave the CPU request at 0.
For Jupyter access to the KFP pipelines API to work, you need to add the following EnvoyFilter (see kubeflow/pipelines#4976 (comment));
otherwise you will get an error when connecting with kfp.Client:
client = kfp.Client()
print(client.list_experiments())
Internal error: Unauthenticated: Request header error: there is no user identity header.
Use the below to let the Jupyter notebook talk to the pipelines API. Note the namespace and notebook name, and change them to match your context.
cat << EOF | kubectl apply -f -
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: bind-ml-pipeline-nb-kubeflow-user-example-com
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: ml-pipeline
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/kubeflow-user-example-com/sa/default-editor"]
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: kubeflow-user-example-com
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: [email protected]
  workloadSelector:
    labels:
      notebook-name: test2
EOF
You can upload the following notebook to Jupyter and test: https://colab.research.google.com/drive/1f_p4EVKReT57J4Maz4vRfhccJ_qVv03W?usp=sharing
Note that I faced multiple problems with the V2 beta version of the KFP pipelines SDK (kubeflow/pipelines#6390), but the V1 version works fine.
Note: a fully working instance will have the following pods in Running state:
alexpunnen@pop-os:~/kindclusters$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
auth dex-6f4f4fd769-s99tz 1/1 Running 1 20m
cert-manager cert-manager-7dd5854bb4-7nf5q 1/1 Running 0 20m
cert-manager cert-manager-cainjector-64c949654c-fln2v 1/1 Running 0 20m
cert-manager cert-manager-webhook-6bdffc7c9d-vm7t8 1/1 Running 0 20m
istio-system authservice-0 1/1 Running 0 20m
istio-system cluster-local-gateway-7bf6b98855-ngqz8 1/1 Running 0 15m
istio-system istio-ingressgateway-78bc678876-glw9n 1/1 Running 0 15m
istio-system istiod-755f4cc457-ndlwp 1/1 Running 0 20m
knative-eventing eventing-controller-6b4cc547b9-dt9xk 1/1 Running 0 20m
knative-eventing eventing-webhook-7497957865-lzb5x 1/1 Running 0 20m
knative-eventing imc-controller-c8d86c869-h6xt4 1/1 Running 0 20m
knative-eventing imc-dispatcher-7bf75b8999-dv8hx 1/1 Running 0 20m
knative-eventing mt-broker-controller-5596fd9c9-wd4cj 1/1 Running 0 20m
knative-eventing mt-broker-filter-8c699b678-4z9fb 1/1 Running 0 20m
knative-eventing mt-broker-ingress-f8b9b6cfc-nncw5 1/1 Running 0 20m
knative-serving activator-7d554f9d67-nz4j9 2/2 Running 1 18m
knative-serving autoscaler-549ccd665f-f5wcv 2/2 Running 1 18m
knative-serving controller-c548cfcff-xjv4p 2/2 Running 1 18m
knative-serving istio-webhook-68fddcc567-hbqrq 2/2 Running 1 18m
knative-serving networking-istio-5664b9fb9c-dr9pm 2/2 Running 1 18m
knative-serving webhook-6644fdc69-prnh2 2/2 Running 1 18m
kube-system coredns-f9fd979d6-4vwgf 1/1 Running 0 95m
kube-system coredns-f9fd979d6-r7b4k 1/1 Running 0 95m
kube-system etcd-kubeflow-control-plane 1/1 Running 0 95m
kube-system kindnet-nv7n8 1/1 Running 0 95m
kube-system kube-apiserver-kubeflow-control-plane 1/1 Running 0 95m
kube-system kube-controller-manager-kubeflow-control-plane 1/1 Running 0 95m
kube-system kube-proxy-g5m4w 1/1 Running 0 95m
kube-system kube-scheduler-kubeflow-control-plane 1/1 Running 0 95m
kubeflow-user-example-com ml-pipeline-ui-artifact-767659f9df-mcksg 2/2 Running 0 4m52s
kubeflow-user-example-com ml-pipeline-visualizationserver-6ff9f47c6b-54ktk 2/2 Running 0 4m52s
kubeflow admission-webhook-deployment-f5d8f47f8-hb9fm 1/1 Running 0 18m
kubeflow cache-deployer-deployment-6dbb64ddcd-6t9h6 2/2 Running 1 18m
kubeflow cache-server-f84f6bdcc-6qn6h 2/2 Running 0 18m
kubeflow centraldashboard-5fb844d56d-lhb2b 1/1 Running 0 18m
kubeflow jupyter-web-app-deployment-bdfb5d69f-c8df6 1/1 Running 0 18m
kubeflow katib-controller-7b98cd6865-7rvb9 1/1 Running 0 18m
kubeflow katib-db-manager-7689947dc5-m28bj 1/1 Running 2 18m
kubeflow katib-mysql-586f79b694-ssl98 1/1 Running 0 18m
kubeflow katib-ui-64fbdf4d94-m5lqr 1/1 Running 0 18m
kubeflow kfserving-controller-manager-0 2/2 Running 0 18m
kubeflow kubeflow-pipelines-profile-controller-6cfd6bf9bd-jj5sv 1/1 Running 0 18m
kubeflow metacontroller-0 1/1 Running 0 18m
kubeflow metadata-envoy-deployment-95b58bbbb-m9xkg 1/1 Running 0 18m
kubeflow metadata-grpc-deployment-7cb87744c7-75zw8 2/2 Running 3 18m
kubeflow metadata-writer-76b6b98985-ktbmd 2/2 Running 1 18m
kubeflow minio-5b65df66c9-w5skh 2/2 Running 0 18m
kubeflow ml-pipeline-84858dd97b-7l7ww 2/2 Running 1 18m
kubeflow ml-pipeline-persistenceagent-6ff46967ff-k5kwn 2/2 Running 1 18m
kubeflow ml-pipeline-scheduledworkflow-66bdf9948d-k2ccl 2/2 Running 0 18m
kubeflow ml-pipeline-ui-867664b965-gzm9b 2/2 Running 0 18m
kubeflow ml-pipeline-viewer-crd-64dddf4597-vnxgp 2/2 Running 1 18m
kubeflow ml-pipeline-visualizationserver-7f88f8b84b-qwzgq 2/2 Running 0 18m
kubeflow mpi-operator-d5bfb8489-dltng 1/1 Running 0 18m
kubeflow mxnet-operator-6cffc568b7-p9n2p 1/1 Running 0 18m
kubeflow mysql-f7b9b7dd4-5wl62 2/2 Running 0 18m
kubeflow notebook-controller-deployment-c88b44b79-g2vrl 1/1 Running 0 18m
kubeflow profiles-deployment-5c94fd8fbf-b9vlt 2/2 Running 0 18m
kubeflow pytorch-operator-56bffbbd86-rllxr 2/2 Running 0 18m
kubeflow tensorboard-controller-controller-manager-d7c68d6df-dvc5h 3/3 Running 1 18m
kubeflow tensorboards-web-app-deployment-59ff4c7bd8-852k9 1/1 Running 0 18m
kubeflow tf-job-operator-859885c8c4-hxbgn 1/1 Running 0 18m
kubeflow volumes-web-app-deployment-6457c9bcfc-hdf5k 1/1 Running 0 18m
kubeflow workflow-controller-7b44676dff-lqfxx 2/2 Running 1 18m
kubeflow xgboost-operator-deployment-c6ddb584-878t4 2/2 Running 1 18m
local-path-storage local-path-provisioner-78776bfc44-lqfvv 1/1 Running 0 95m
Update: tried with the latest commit and it is failing.
If you would like to help, you are welcome to.
See: https://kubeflow.slack.com/archives/C7REE0ETX/p1707764991199189
Thanks