Installing Kubeflow on a local machine is not a simple task, and the documentation on the official website may be outdated. At the time of writing, the suggested solutions include MiniKF and microk8s. The latter sets up GPU passthrough effortlessly.
- Install microk8s
- Enable microk8s features
- Update kube-apiserver flags
- Restart microk8s
- Create kubeconfig
- Manually install kubeflow
- Access kubeflow dashboard
- Pulling large images
- Kubeflow pipelines
The following gist highlights how to install Kubeflow on a single Linux (Ubuntu) workstation using microk8s.
$ sudo snap install microk8s --classic --channel=1.19/stable
$ microk8s enable dns dashboard storage gpu
Append the following to /var/snap/microk8s/current/args/kube-apiserver:
--service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key
--service-account-issuer=kubernetes.default.svc
Istio configuration is outside of Kubeflow. These flags allow the use of trustworthy JWTs (see Kubeflow's README). Not setting them will leave the Istio pods hanging with an error (see the git issue).
Newer versions of Kubeflow (>= 1.0) have replaced Ambassador, a Kubernetes API gateway, with Istio's ingress gateway (source).
Before: Ingress --> Envoy --> Ambassador --> Other services
After (with JWT validation): Ingress --> istio-ingressgateway --> Other services
$ microk8s stop
$ microk8s start
$ sudo microk8s.kubectl config view --raw > $HOME/.kube/config
$ export KUBECONFIG=$HOME/.kube/config
At the time of writing, microk8s enable kubeflow was not working for me, so Kubeflow has to be installed manually:
$ wget https://github.com/kubeflow/kfctl/releases/download/v1.2.0/kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
- Add the path to the kfctl binary to your $PATH
- Export the following environment variables:
export BASE_DIR=/opt/
export KF_NAME=<your_kubeflow_deployment_name>
# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export KF_DIR=${BASE_DIR}/${KF_NAME}
# Set the configuration file to use when deploying Kubeflow.
# The following configuration installs Istio by default. Comment out
# the Istio components in the config file to skip Istio installation.
# See https://github.com/kubeflow/kubeflow/pull/3663
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"
Change CONFIG_URI to https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_istio_dex.v1.2.0.yaml if you want to have admin access (kudos to @kosehy).
$ mkdir $HOME/kf_installation_temp && cd $HOME/kf_installation_temp
$ kfctl apply -V -f $CONFIG_URI
To access the Kubeflow dashboard, port-forward the Istio ingress gateway:
$ kubectl port-forward svc/istio-ingressgateway 8081:80 -n istio-system
If you're pulling large images into microk8s, you might want to extend the pull-progress deadline.
Add the following to /var/snap/microk8s/current/args/kubelet
--image-pull-progress-deadline="10m"
Restarting microk8s might not be enough; you'll have to reboot the entire machine.
Pods from runs created by Argo's workflow controller (the underlying engine powering Kubeflow Pipelines) cannot be created unless the container runtime is switched from remote (the microk8s default) to docker (see the github thread).
Edit the flags in /var/snap/microk8s/current/args/kubelet:
# --container-runtime=remote
# --container-runtime-endpoint=${SNAP_COMMON}/run/containerd.sock
--container-runtime=docker
After writing your pipeline, there are two ways to create a run from it, depending on whether the pipeline is going to be reused, i.e. for production or for experimentation (see the sketch after this list):
- Production: (1) submit the pipeline and (2) create a (recurring or one-off) run based on this pipeline.
- Experimentation: directly submit a run via the kfp Python SDK from a notebook.
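A minimal sketch of both options, assuming the kfp SDK is installed and the dashboard is port-forwarded to localhost:8081 as above; the single-step echo pipeline, the demo experiment, and the run names are all hypothetical:

import kfp
from kfp import dsl

@dsl.pipeline(name='echo-pipeline', description='Hypothetical single-step pipeline.')
def echo_pipeline():
    # One step that prints a message and exits.
    dsl.ContainerOp(
        name='echo',
        image='alpine:3.12',
        command=['sh', '-c', 'echo hello from kfp'],
    )

# Talk to the API through the port-forwarded istio-ingressgateway.
client = kfp.Client(host='http://localhost:8081/pipeline')

# Production: compile and upload the pipeline once, then create runs from it
# (recurring runs can also be scheduled from the uploaded pipeline).
kfp.compiler.Compiler().compile(echo_pipeline, 'echo_pipeline.tar.gz')
pipeline = client.upload_pipeline('echo_pipeline.tar.gz', pipeline_name='echo-pipeline')
experiment = client.create_experiment('demo')
client.run_pipeline(experiment.id, job_name='echo-once', pipeline_id=pipeline.id)

# Experimentation: submit a one-off run directly from the pipeline function.
client.create_run_from_pipeline_func(echo_pipeline, arguments={})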
To create runs on Kubeflow Pipelines directly from a notebook, one needs to authenticate as a user authorized to submit jobs.
The following binds notebook-server workloads launched in the wesley namespace (the namespace where the Jupyter notebooks are deployed) under the default-editor service account to the ServiceRole ml-pipeline-services, which lives in the kubeflow namespace.
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-wesley-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/wesley/sa/default-editor
To create this ServiceRoleBinding, run
kubectl apply -f <filename.yaml>
See the github thread; this is still a workaround until the contributors address this properly. The following EnvoyFilter injects the kubeflow-userid header into requests from the notebook to the pipelines API:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: wesley
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: [email protected]
  workloadSelector:
    labels:
      notebook-name: wesley-
- kubeflow-userid should be the email address of the owner of the notebook server's namespace:
$ kubectl get ns wesley -o yaml
- For notebook-name, check the labels of the notebook pod:
$ kubectl get pods --show-labels
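With the ServiceRoleBinding and the EnvoyFilter applied, a notebook in the wesley namespace can reach the pipelines API directly. A minimal sketch; the in-cluster address matches the vhost named in the EnvoyFilter above:

import kfp

# Inside the cluster, address the ml-pipeline service directly; the
# EnvoyFilter above injects the kubeflow-userid header into these requests.
client = kfp.Client(host='http://ml-pipeline.kubeflow.svc.cluster.local:8888')

# With the ServiceRoleBinding in place, this succeeds instead of failing
# with an RBAC "access denied" error.
print(client.list_experiments())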