kubectl_cheatsheet
Set default namespace:
kubectl config set-context --current --namespace=service-ds-recs-model-inference-search-83int
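To verify which namespace is now active (quick sanity check):
kubectl config view --minify --output 'jsonpath={..namespace}'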
Set the cluster:
kubectl config set-cluster NAME
Check Helm history of releases:
helm history -n NAMESPACE RELEASE_NAME
Switch context:
kubectl config get-contexts
kubectl config use-context MY-CONTEXT
AKS get creds (to ~/.kube/config):
az aks get-credentials --resource-group service-np-eun-ds-aks-rg --name service-np-eun-ds-aks --admin
See metrics (CPU/Memory):
kubectl top pods
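Variants that may help when hunting heavy pods (standard kubectl top options):
kubectl top pods -A --sort-by=memory
kubectl top nodes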
See container logs from the previous run (useful when the container dies fast):
sudo kubectl logs addmodel-service-789b55d4f8-7l8mh --previous -n azureml-as-np-euw-ds-aml2
Get all env variables from pods in the default namespace:
for pod in $(sudo kubectl get po --output=jsonpath='{.items..metadata.name}'); do echo "$pod" && sudo kubectl exec "$pod" -- env; done
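A variant scoped to one namespace and filtered to a single variable (sketch; the namespace is reused from below and the MODEL_ prefix is illustrative):
```
for pod in $(sudo kubectl get po -n azureml-as-np-euw-ds-aml2 --output=jsonpath='{.items..metadata.name}'); do
  echo "$pod"
  sudo kubectl exec "$pod" -n azureml-as-np-euw-ds-aml2 -- env | grep MODEL_
done
```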
Get pods in namespace:
sudo kubectl get pods -n azureml-as-np-euw-ds-aml2
Check env variables:
sudo kubectl exec addmodel-service-8d9669966-nt8vh -n azureml-as-np-euw-ds-aml2 -- env
See all in namespace:
sudo kubectl get all -n azureml-as-np-euw-ds-aml2
Get deployments:
sudo kubectl get deployments -n azureml-as-np-euw-ds-aml2
Delete/kill deployment (+replicasets + pods):
sudo kubectl delete deployment addmodel-service -n azureml-as-np-euw-ds-aml2
Follow the logs in pod:
sudo kubectl logs -f pod/versiona-784f8c5f59-6hwft -n azureml-as-np-euw-ds-aml2
Get last events in namespace, sorted by time:
sudo kubectl get events -n azureml-as-np-euw-ds-aml2 --sort-by='.lastTimestamp'
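To narrow that down to warnings only (same namespace, standard field selector):
sudo kubectl get events -n azureml-as-np-euw-ds-aml2 --field-selector type=Warning --sort-by='.lastTimestamp'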
Export deployment configuration:
sudo kubectl get deployment versiona -n azureml-as-np-euw-ds-aml2 -o yaml > deployment_add_model.yaml
In case of a broken deployment, force-replace it (this will interrupt the service!):
sudo kubectl replace --force -f deployment_add_model.yaml
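A less disruptive alternative, if a previous revision exists, is rolling back instead of force-replacing (deployment name taken from the export example above):
sudo kubectl rollout undo deployment/versiona -n azureml-as-np-euw-ds-aml2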
To see the diff between the exported configuration and the live object:
sudo kubectl diff -f ./deployment_add_model.yaml
To patch deployment in place (without exporting it):
sudo kubectl patch deployment.apps/addmodel-service -p '{"spec": {"strategy": {"rollingUpdate": {"maxUnavailable": 0}}}}'
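To confirm the patch landed (reads back just the rollingUpdate settings):
sudo kubectl get deployment.apps/addmodel-service -o jsonpath='{.spec.strategy.rollingUpdate}'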
To patch HPA scaling:
kubectl patch hpa model-api-hpa-1324697 -n euw-02-perftest -p '{"spec":{"minReplicas": 30, "maxReplicas": 40}}' --kubeconfig /home/AzDevOps/.kubeconfig
Rollout status and history:
sudo kubectl rollout status deployment.apps/addmodel-service -n azureml-as-np-euw-ds-aml2
sudo kubectl rollout history deployment.apps/addmodel-service -n azureml-as-np-euw-ds-aml2
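To roll back to a specific revision from that history (the revision number here is illustrative):
sudo kubectl rollout undo deployment.apps/addmodel-service -n azureml-as-np-euw-ds-aml2 --to-revision=2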
Clean up the namespace (deletes all standard resources):
sudo kubectl delete all --all -n azureml-as-np-euw-ds-aml2
Remove pods matching a replicaset hash in the default namespace:
sudo kubectl get pods --no-headers=true | awk '/77c75b657f/{print $1}' | xargs sudo kubectl delete pod
1 pod per 1 node allocation check (prints nodes that host more than one pod):
sudo kubectl get pods -n azureml-as-np-euw-ds-aml2 -o wide | awk 'NR>1 {print $7}' | sort | uniq -d
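To see the full pods-per-node distribution instead of only the duplicated nodes (same idea, counting variant):
sudo kubectl get pods -n azureml-as-np-euw-ds-aml2 -o wide | awk 'NR>1 {print $7}' | sort | uniq -c | sort -rn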
----
AKS additional tweaks:
```yaml
deployment:
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  ...
  terminationGracePeriodSeconds: 60  # default value: 30
  ...
  lifecycle:
    preStop:
      exec:
        command:
          - sleep
          - "60"
```
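To try the terminationGracePeriodSeconds tweak on a live deployment without re-rendering the chart, a strategic-merge patch along these lines should work (sketch; deployment and namespace reused from the examples above):
```
kubectl patch deployment addmodel-service -n azureml-as-np-euw-ds-aml2 \
  -p '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":60}}}}'
```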
Docker image registry login from AZ CLI:
az acr login -n asnpd
Docker re-tag:
docker pull repo1cr.io/dependencies/tritonserver-search:22.05
docker image ls
docker image tag repo1cr.io/dependencies/tritonserver-search:22.05 repo2.io/dependencies/tritonserver-search:22.05-tf2-text
docker push repo2.io/dependencies/tritonserver-search:22.05-tf2-text
Fix docker compose on AML Compute instance:
sudo mv /etc/docker/daemon.json /etc/docker/daemon_previous.json
sudo systemctl daemon-reload
sudo service docker restart
docker compose up
If that doesn't help, reinstall docker per https://stackoverflow.com/a/77486729/2957102:
To re-install it (remove snap version and get moby engine):
```
sudo snap remove docker
sudo apt-get update
sudo apt-get install moby-engine -f
sudo apt-get install moby-runc
sudo apt-get install moby-containerd
sudo apt-get install moby-engine
```
To add docker compose (if you need one): `sudo apt-get install docker-compose-plugin`
To check/start the service:
```
sudo service --status-all # checks status of all services, find docker here
sudo dockerd # run in the foreground to check whether it starts and see logs immediately
sudo service docker restart # to start/restart the service
```
If it errors out about `daemon.json`:
```
sudo mv /etc/docker/daemon.json /etc/docker/daemon_old.json
```
Start your containers with the docker compose plugin (make sure the required Dockerfile/docker-compose files are present):
```
docker compose up --build
```
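To verify the reinstall worked end to end (standard checks):
```
docker --version           # engine installed?
docker compose version     # compose plugin wired up?
sudo docker run --rm hello-world   # full pull/run sanity check
```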
----
Helm charts (>= 3.7.0, native OCI support):
helm registry login asnpd.azurecr.io
OR az acr login -n asnpd
az acr repository show-tags -n asnpd --repository charts/inference-search-server-infra
az acr repository list --name asnpd
helm pull oci://asnpd.azurecr.io/dependencies/triton-server --version 0.1.41
Untar the chart: tar -xzvf triton-server-0.1.41.tgz
Make your changes to the chart.
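A lint pass before packaging can catch template errors early (optional step, assuming the chart untarred into ./triton-server):
helm lint ./triton-server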
Package and push it:
helm package triton-server
helm push triton-server-0.1.41-search.tgz oci://asnpd.azurecr.io/dependencies/triton-server
az acr repository list --name asnpd
az acr repository show --name asnpd --repository dependencies/triton-server/triton-server
az acr repository show-tags --name asnpd --repository dependencies/triton-server/triton-server
To delete a chart version:
az acr repository delete --name asnpd --image dependencies/triton-server/triton-server:0.1.41-search
----
Helm 3.6.3 (<3.7.0):
helm registry login asnpd.azurecr.io
pull:
helm chart pull asnpd.azurecr.io/dependencies/triton-server:0.1.41
helm chart export asnpd.azurecr.io/dependencies/triton-server:0.1.41 --destination triton-server
push:
helm dependencies update triton-server/triton-server
helm package --version=0.1.41-search-3.6.3-helm triton-server/triton-server
# output: Successfully packaged chart and saved it to: ...-core-aks/triton-server-0.1.41-search-3.6.3-helm.tgz
helm chart save triton-server-0.1.41-search-3.6.3-helm.tgz asnpd.azurecr.io/dependencies/triton-server
helm chart push asnpd.azurecr.io/dependencies/triton-server:0.1.41-search-3.6.3-helm
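Note: OCI support is experimental in Helm < 3.8, so the `helm chart` commands above need this set first:
export HELM_EXPERIMENTAL_OCI=1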
---
Istio / Kiali
Istio UI dashboard via Kiali:
kubectl port-forward svc/kiali 20001:20001 -n istio-system
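then open http://localhost:20001 in a browser to reach the Kiali UI.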
---
Fix for "UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress" (find the latest release secret and delete it):
kubectl get secret -A | grep recspoc | tail -n 1
kubectl delete secret sh.helm.release.v1.recspoc-v1-2297354.v1 -n ds-recs-model-inference-recspoc
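To confirm which release is actually stuck before deleting its secret (-a includes pending/failed releases):
helm ls -a -n ds-recs-model-inference-recspoc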
---
List Helm releases in a namespace:
helm ls --namespace asos-ds-recs-model-inference-category-83int
Restart a 'metrics' container in a pod:
kubectl exec -it -n <yournamespace> <podname> -c metrics -- /bin/sh -c "kill 1"
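Killing PID 1 makes the kubelet restart just that container; the restart count confirms it happened (sketch using the same placeholders):
kubectl get pod <podname> -n <yournamespace> -o jsonpath='{.status.containerStatuses[?(@.name=="metrics")].restartCount}'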