@dmartinol
Last active December 1, 2024 11:55
Kubeflow installation on Mac M3 Pro

Install and Configure Kubeflow on Mac M3 Pro

Last update: Sep 10, 2024

Goal

This gist provides a validated procedure to install the complete Kubeflow platform on a Mac with an ARM chip (M3 Pro).

As of September 2024, no packaged distributions are available for this setup, so we'll be using Kubeflow Manifests to deploy the software.

We'll also install a default Model Registry instance and validate the setup.

TL;DR: The Kubeflow manifests work as-is, but the Docker container platform must be configured carefully to meet the system prerequisites.

Hardware specs

  • Chip: Apple M3 Pro
  • Memory: 36GB
  • Storage: 1TB

Solution overview

Kubeflow will be installed on a local Kubernetes cluster using Minikube running on a Docker engine with Colima as the container runtime.

Install and configure prerequisites

Docker via Colima

Install the Docker CLI and the Colima container runtime:

brew install docker
brew install colima

👇 Don't Miss these Key Launch and Setup Instructions

Start Colima with enough memory (in GiB) and CPUs to meet the Kubeflow requirements:

colima start --memory 32 --cpu 20

Once started, connect to the VM and raise the kernel's inotify limits:

colima ssh
$ sudo su -
root@colima:~# echo "fs.inotify.max_user_instances=2280" >> /etc/sysctl.conf
root@colima:~# echo "fs.inotify.max_user_watches=1255360" >> /etc/sysctl.conf
root@colima:~# exit
logout
$ exit
logout

Restart Colima to apply the changes:

colima restart
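
To confirm that the new inotify limits survived the restart, a check along these lines may help. This is a minimal sketch and not part of the original procedure: `check_inotify` is a hypothetical helper name, and it assumes that `colima ssh -- <command>` runs a single command inside the VM.

```shell
# Hypothetical helper: compare the live inotify limits inside the Colima VM
# with the values appended to /etc/sysctl.conf in the previous step.
check_inotify() {
  # sysctl -n prints only the value; tr drops any carriage return added by the SSH session.
  instances=$(colima ssh -- sysctl -n fs.inotify.max_user_instances | tr -d '\r')
  watches=$(colima ssh -- sysctl -n fs.inotify.max_user_watches | tr -d '\r')
  echo "max_user_instances=${instances} max_user_watches=${watches}"
  # Succeed only when both values match what was configured.
  [ "$instances" = "2280" ] && [ "$watches" = "1255360" ]
}
```

Calling `check_inotify` after the restart prints both values and returns a non-zero status if either one differs from the configured limit.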

Verify the engine configuration:

$ docker info | grep -E 'CPU|Memory'
 CPUs: 20
 Total Memory: 31.27GiB

Minikube

Install Minikube and start a cluster with matching resources:

brew install minikube
minikube start --memory=32g --cpus=20

Verify the cluster configuration:

$ minikube ssh -- "cat /proc/meminfo" | grep MemTotal
MemTotal:       32793224 kB
$ minikube ssh -- "nproc"
20

Additionally, enable the metrics server to collect and aggregate resource metrics:

minikube addons enable metrics-server

kubectl CLI

Install the kubectl and kustomize CLIs:

brew install kubectl
brew install kustomize

Install Kubeflow platform

Clone the Kubeflow manifests repository and check out the latest stable release (v1.9.0 as of this writing):

git clone https://github.com/kubeflow/manifests.git
cd manifests
git checkout v1.9.0

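Log in to the Docker registry, create an image pull secret from the local Docker credentials, then install everything with a single command using the example configuration: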

docker login

kubectl create secret generic regcred \
    --from-file=.dockerconfigjson=$HOME/.docker/config.json \
    --type=kubernetes.io/dockerconfigjson

kustomize build example | kubectl apply -f -

Validate the deployment using the provided instructions.
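
As a quick complement to those instructions, the pods that are still starting or failing can be listed in one pass. The sketch below is an illustration only: `unready_pods` is a hypothetical helper name, and it relies on the default column layout of `kubectl get pods -A --no-headers`.

```shell
# Hypothetical helper: list pods in all namespaces that are neither Running nor Completed.
# With -A and --no-headers the columns are NAMESPACE NAME READY STATUS RESTARTS AGE.
unready_pods() {
  kubectl get pods -A --no-headers |
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 ": " $4 }'
}
```

An empty result from `unready_pods` means every pod has reached the Running or Completed status. It does not check container readiness, so the READY column is still worth a look.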

Validate installation

Forward the Istio Ingress Gateway port, then open the Kubeflow console and log in as user@example.com with password 12341234.
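
The port forwarding step can be sketched as follows. The service name, namespace, and port mapping follow the Kubeflow manifests README; the `forward_gateway` wrapper and its optional local-port argument are hypothetical conveniences added here.

```shell
# Forward the Istio ingress gateway service to a local port (8080 unless another is given).
# The wrapper name is hypothetical; the kubectl command follows the Kubeflow manifests README.
forward_gateway() {
  kubectl port-forward svc/istio-ingressgateway -n istio-system "${1:-8080}:80"
}
```

`forward_gateway` keeps running while the port is forwarded. With the default port, the console should be reachable at http://localhost:8080.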

Install additional Model Registry instance

Follow the "Install Kubeflow Model Registry" guide to install the example Model Registry instance. In short:

cd apps/model-registry/upstream
kubectl apply -k overlays/db
kubectl apply -k options/istio

👇 Patch Instructions

The installation is expected to fail because the default image of the model-registry-db deployment is not available for the ARM64 architecture:

$ kubectl -n kubeflow get pods -l component=db
NAME                                 READY   STATUS             RESTARTS   AGE
model-registry-db-7c4bb9f76f-x7rw4   0/1     ImagePullBackOff   0          76m

$ minikube ssh
docker@minikube:~$ docker pull mysql:8.0.3
8.0.3: Pulling from library/mysql
no matching manifest for linux/arm64/v8 in the manifest list entries

Patch the deployment to use an image that other Kubeflow components already use and that works on the Mac M3, then restart the model-registry-deployment deployment:

kubectl patch deployment model-registry-db -n kubeflow \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/ml-pipeline/mysql:8.0.26"}]'
kubectl -n kubeflow rollout restart deployment model-registry-deployment

Finally, we can validate the deployment with:

kubectl wait --for=condition=available -n kubeflow deployment/model-registry-deployment --timeout=2m
kubectl logs -n kubeflow deployment/model-registry-deployment
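
To double-check that the patch took effect, the image set on the database deployment can be read back. This is a small sketch, with `db_image` as a hypothetical helper name; it assumes the deployment has a single container, as the patch path above implies.

```shell
# Hypothetical helper: print the container image currently set on the model-registry-db deployment.
db_image() {
  kubectl -n kubeflow get deployment model-registry-db \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

After the patch, `db_image` should print gcr.io/ml-pipeline/mysql:8.0.26.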

Troubleshooting

Failed to call webhook

Symptom: The output of kubectl get event -n kubeflow contains warnings like:

Error creating: Internal error occurred: failed calling webhook "namespace.sidecar-injector.istio.io": failed to call webhook: Post "https://istiod.istio-system.svc:443/inject?timeout=10s": dial tcp 10.96.188.113:443: connect: connection refused

Solution: The Istio sidecar injector was not reachable yet. Run the install command again:

kustomize build example | kubectl apply -f -
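
If the command has to be repeated several times, a bounded retry loop can save some typing. The following is a sketch under assumptions: `apply_with_retry`, the default of 10 attempts, and the 20-second pause are illustrative choices, not values from the Kubeflow documentation.

```shell
# Hypothetical helper: re-apply the example manifests until kubectl succeeds
# or the attempt limit (default 10) is reached.
apply_with_retry() {
  max_attempts="${1:-10}"
  attempt=1
  until kustomize build example | kubectl apply -f -; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "Apply failed (attempt ${attempt}), retrying in 20 seconds" >&2
    attempt=$((attempt + 1))
    sleep 20
  done
}
```

Run `apply_with_retry` from the root of the manifests checkout, optionally passing a different attempt limit as the first argument.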

No "Successfully pulled image" message

Symptom: The output of kubectl get event -n kubeflow contains messages like "Pulling image ABC" without the corresponding completion message like "Successfully pulled image ABC".

Solution: Pull the image manually from the minikube SSH console:

minikube ssh
$ docker pull ABC
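
To find out which images are affected, the event messages can be compared mechanically. This is a rough sketch: `stuck_pulls` is a hypothetical helper name, and it assumes the kubelet's usual message format, where the image name is wrapped in double quotes.

```shell
# Hypothetical helper: print images that have a "Pulling image" event in the
# kubeflow namespace but no matching "Successfully pulled image" event.
stuck_pulls() {
  kubectl get events -n kubeflow \
    -o jsonpath='{range .items[*]}{.message}{"\n"}{end}' |
    awk -F'"' '
      /^Pulling image/             { pulling[$2] = 1 }
      /^Successfully pulled image/ { pulled[$2] = 1 }
      END { for (img in pulling) if (!(img in pulled)) print img }
    '
}
```

Each image name printed by `stuck_pulls` is a candidate for a manual docker pull from the minikube SSH console.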