Installing Kubeflow on a local machine is not a simple task, and the documentation on the official website may be outdated. At the time of writing, the suggested solutions include MiniKF and microk8s. The latter sets up GPU passthrough effortlessly.
- Install microk8s
- Enable microk8s features
- Update kube-apiserver flags
- Restart microk8s
- Create kubeconfig
- Manually install kubeflow
- Access kubeflow dashboard
- Pulling large images
- Kubeflow pipelines
The following gist highlights how to install Kubeflow on a single Linux (Ubuntu) workstation using microk8s.
$ sudo snap install microk8s --classic --channel=1.19/stable
$ microk8s enable dns dashboard storage gpu
Append the following to /var/snap/microk8s/current/args/kube-apiserver:
--service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key
--service-account-issuer=kubernetes.default.svc
Istio configuration is outside of Kubeflow. These flags allow the use of trustworthy JWTs (see Kubeflow's README). Not setting them will leave the Istio pods hanging with an error (see the git issue).
Newer versions of Kubeflow (>= 1.0) have replaced Ambassador, a Kubernetes API gateway, with Istio's ingress gateway (source).
Before: Ingress --> Envoy --> Ambassador --> Other services
After (with JWT validation): Ingress --> istio-ingressgateway --> Other services
$ microk8s stop
$ microk8s start
$ sudo microk8s.kubectl config view --raw > $HOME/.kube/config
$ export KUBECONFIG=$HOME/.kube/config
At the time of writing, microk8s enable kubeflow was not working for me, so Kubeflow has to be installed manually:
$ wget https://github.com/kubeflow/kfctl/releases/download/v1.2.0/kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
- Add the path to the kfctl binary to your $PATH
- Export the following environment variables:
export BASE_DIR=/opt/
export KF_NAME=<your_kubeflow_deployment_name>
# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export KF_DIR=${BASE_DIR}/${KF_NAME}
# Set the configuration file to use when deploying Kubeflow.
# The following configuration installs Istio by default. Comment out
# the Istio components in the config file to skip Istio installation.
# See https://github.com/kubeflow/kubeflow/pull/3663
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"
Change CONFIG_URI to https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_istio_dex.v1.2.0.yaml if you want to have admin access (kudos to @kosehy).
$ mkdir $HOME/kf_installation_temp && cd $HOME/kf_installation_temp
$ kfctl apply -V -f $CONFIG_URI
To access the Kubeflow dashboard, port-forward the Istio ingress gateway:
$ kubectl port-forward svc/istio-ingressgateway 8081:80 -n istio-system
If you're pulling large images into microk8s, you might want to extend the pull-progress deadline.
Add the following to /var/snap/microk8s/current/args/kubelet
--image-pull-progress-deadline="10m"
Restarting microk8s might not be enough; you'll have to reboot the entire machine.
Pods from runs created by Argo's workflow controller (the underlying engine powering Kubeflow Pipelines) cannot be created unless the container runtime is switched from remote (the microk8s default) to docker (see the github thread).
Edit the flags in /var/snap/microk8s/current/args/kubelet:
# --container-runtime=remote
# --container-runtime-endpoint=${SNAP_COMMON}/run/containerd.sock
--container-runtime=docker
After writing your pipeline, there are two ways to create a run from it, depending on whether the pipeline is going to be reused, i.e. for production or for experimentation (see the sketch after this list):
- Production: (1) submit the pipeline and (2) create a (recurring or one-off) run based on this pipeline.
- Experimentation: directly submit a run via the kfp Python SDK from a notebook.
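A minimal sketch of both options, assuming the kfp SDK is installed and the dashboard is port-forwarded to localhost:8081 as above; the single-step echo pipeline, the demo experiment, and the run names are all hypothetical:

import kfp
from kfp import dsl

@dsl.pipeline(name='echo-pipeline', description='Hypothetical single-step pipeline.')
def echo_pipeline():
    # One step that prints a message and exits.
    dsl.ContainerOp(
        name='echo',
        image='alpine:3.12',
        command=['sh', '-c', 'echo hello from kfp'],
    )

# Talk to the API through the port-forwarded istio-ingressgateway.
client = kfp.Client(host='http://localhost:8081/pipeline')

# Production: compile and upload the pipeline once, then create runs from it
# (recurring runs can also be scheduled from the uploaded pipeline).
kfp.compiler.Compiler().compile(echo_pipeline, 'echo_pipeline.tar.gz')
pipeline = client.upload_pipeline('echo_pipeline.tar.gz', pipeline_name='echo-pipeline')
experiment = client.create_experiment('demo')
client.run_pipeline(experiment.id, job_name='echo-once', pipeline_id=pipeline.id)

# Experimentation: submit a one-off run directly from the pipeline function.
client.create_run_from_pipeline_func(echo_pipeline, arguments={})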
To create runs on Kubeflow Pipelines directly from a notebook, one needs to authenticate as a user authorized to submit jobs.
The following binds notebook-server workloads launched in the wesley namespace (the namespace where the Jupyter notebooks are deployed) under the default-editor service account to the ServiceRole ml-pipeline-services, which lives in the kubeflow namespace.
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-wesley-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/wesley/sa/default-editor
To create this ServiceRoleBinding, run
kubectl apply -f <filename.yaml>
See the github thread; this is still a workaround until the contributors address this properly. The following EnvoyFilter injects the kubeflow-userid header into requests from the notebook to the pipelines API:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: wesley
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: [email protected]
  workloadSelector:
    labels:
      notebook-name: wesley-
- kubeflow-userid should be the email address of the owner of the notebook server's namespace:
$ kubectl get ns wesley -o yaml
- For notebook-name, check the labels of the notebook pod:
$ kubectl get pods --show-labels
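With the ServiceRoleBinding and the EnvoyFilter applied, a notebook in the wesley namespace can reach the pipelines API directly. A minimal sketch; the in-cluster address matches the vhost named in the EnvoyFilter above:

import kfp

# Inside the cluster, address the ml-pipeline service directly; the
# EnvoyFilter above injects the kubeflow-userid header into these requests.
client = kfp.Client(host='http://ml-pipeline.kubeflow.svc.cluster.local:8888')

# With the ServiceRoleBinding in place, this succeeds instead of failing
# with an RBAC "access denied" error.
print(client.list_experiments())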