Created
May 31, 2020 06:18
// ref: https://minikube.sigs.k8s.io/docs/tutorials/nvidia_gpu/
// os: ubuntu 18.04 LTS
// hardware: GCE instance with GPU (nvidia-tesla-k80)
1. install nvidia driver
```
https://gist.github.com/hsinhoyeh/495752aaf252bebdd2f3b51011dc060f
```
2. install docker 19.03 or later
```
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) \
  stable"
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# log out and back in for the group change to take effect
sudo usermod -aG docker "$USER"
```
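The `--gpus` flag used later needs docker 19.03 or later, so it can be worth checking the installed version before going on. A minimal sketch, assuming a plain `major.minor.patch` version string; `docker_version_ok` is a hypothetical helper name, not part of docker:

```shell
# Hypothetical helper: succeeds when the given docker version is 19.03 or newer.
docker_version_ok() {
    version=$1
    major=${version%%.*}   # text before the first dot, e.g. "19"
    rest=${version#*.}     # text after the first dot, e.g. "03.8"
    minor=${rest%%.*}      # e.g. "03"
    # Drop one leading zero ("03" -> "3") so the value is a plain decimal integer.
    minor=${minor#0}
    [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }
}
```

Possible usage once docker is running: `docker_version_ok "$(docker version --format '{{.Server.Version}}')" && echo ok`.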
3. install nvidia docker plugin
```
# ref: https://github.com/NVIDIA/nvidia-docker
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
4. configure nvidia as the default docker runtime
sudo apt-get install -y nvidia-container-runtime
sudo vim /etc/docker/daemon.json
```
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```
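A typo in `daemon.json` is easy to miss, so a quick check of the file before restarting the daemon may help. A rough sketch, assuming the key and value sit on one line as in the file above; `nvidia_is_default_runtime` is a hypothetical helper name, and the check is a plain text match, not a JSON parse:

```shell
# Hypothetical helper: succeeds when the given file sets
# "default-runtime" to "nvidia" (text match only, no JSON validation).
nvidia_is_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}
```

Possible usage: `nvidia_is_default_runtime /etc/docker/daemon.json && echo ok`. After the daemon restarts, `docker info --format '{{.DefaultRuntime}}'` should also report the active default runtime.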
5. restart docker daemon
```
sudo systemctl restart docker
```
6. run test on gpu enabled container
```
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
```
7. start minikube
```
https://gist.github.com/hsinhoyeh/c5f60b4cbe41a1e6478ae5ea10f47497
```
8. install nvidia k8s plugins
```
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/master/nvidia-device-plugin.yml
```
9. check that the node reports gpu capacity
```
kubectl get nodes -ojson | jq '.items[].status.capacity'
```
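To turn the capacity listing into a single number, the per-node `nvidia.com/gpu` values can be summed. A small sketch, assuming the counts arrive on stdin as whitespace-separated integers; `sum_gpus` is a hypothetical helper name:

```shell
# Hypothetical helper: sums whitespace-separated integers read from stdin.
# Prints 0 when the input is empty, i.e. no node advertises a GPU.
sum_gpus() {
    awk '{ for (i = 1; i <= NF; i++) total += $i } END { print total + 0 }'
}
```

Possible usage, with the dots in the resource name escaped for kubectl's jsonpath: `kubectl get nodes -o jsonpath='{.items[*].status.capacity.nvidia\.com/gpu}' | sum_gpus`. A result of 0 suggests the device plugin has not registered the GPU yet.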
9-1. test by running gpu pod
```
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
```
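One way to run the pod above is to save the manifest to a file, apply it, and poll the pod phase until it finishes. A sketch under stated assumptions: the file name `cuda-vector-add.yaml` and the helper `wait_for_pod` are hypothetical, and the helper takes the phase-printing command as arguments so it is not tied to kubectl:

```shell
# Hypothetical helper: runs a phase-printing command repeatedly until it
# prints Succeeded (returns 0) or Failed (returns 1), or attempts run out.
# $1: max attempts, $2: seconds between attempts, rest: command to run.
wait_for_pod() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        phase=$("$@") || phase=""
        case $phase in
            Succeeded) return 0 ;;
            Failed) return 1 ;;
        esac
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

Possible usage: `kubectl apply -f cuda-vector-add.yaml`, then `wait_for_pod 30 5 kubectl get pod cuda-vector-add -o jsonpath='{.status.phase}'`, then `kubectl logs cuda-vector-add` to inspect the output.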