kubectl_cheatsheet
Set default namespace:
kubectl config set-context --current --namespace=service-ds-recs-model-inference-search-83int
Set the cluster:
kubectl config set-cluster NAME
Check Helm history of releases:
helm history -n NAMESPACE RELEASE_NAME
Switch context:
kubectl config get-contexts
kubectl config use-context MY-CONTEXT
AKS get creds (to ~/.kube/config):
az aks get-credentials --resource-group service-np-eun-ds-aks-rg --name service-np-eun-ds-aks --admin
See metrics (CPU/Memory):
kubectl top pods
See container logs (if it dies fast):
sudo kubectl logs addmodel-service-789b55d4f8-7l8mh --previous -n azureml-as-np-euw-ds-aml2
Get all env variables from pods in the default namespace:
for pod in $(sudo kubectl get po --output=jsonpath='{.items..metadata.name}'); do echo "$pod" && sudo kubectl exec "$pod" -- env; done
Get pods in namespace:
sudo kubectl get pods -n azureml-as-np-euw-ds-aml2
Check env variables:
sudo kubectl exec addmodel-service-8d9669966-nt8vh -n azureml-as-np-euw-ds-aml2 -- env
See all in namespace:
sudo kubectl get all -n azureml-as-np-euw-ds-aml2
Get deployments:
sudo kubectl get deployments -n azureml-as-np-euw-ds-aml2
Delete/kill deployment (+ replicasets + pods):
sudo kubectl delete deployment addmodel-service -n azureml-as-np-euw-ds-aml2
Follow the logs in a pod:
sudo kubectl logs -f pod/versiona-784f8c5f59-6hwft -n azureml-as-np-euw-ds-aml2
Get last events in namespace, sorted by time:
sudo kubectl get events -n azureml-as-np-euw-ds-aml2 --sort-by='.lastTimestamp'
Export deployment configuration:
sudo kubectl get deployment versiona -n azureml-as-np-euw-ds-aml2 -o yaml > deployment_add_model.yaml
To force-replace a broken deployment (this will interrupt the service!):
sudo kubectl replace --force -f deployment_add_model.yaml
To see the diff between the exported configuration and the live object:
sudo kubectl diff -f ./deployment_add_model.yaml
To patch deployment in place (without exporting it):
sudo kubectl patch deployment.apps/addmodel-service -p '{"spec": {"strategy": {"rollingUpdate": {"maxUnavailable": 0}}}}'
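The same strategic-merge patch can also live in a file, which is easier to review and reuse. A minimal sketch, assuming a hypothetical file name `rolling-update-patch.yaml`:

```yaml
# rolling-update-patch.yaml -- hypothetical file name; same strategic-merge
# patch as the inline -p JSON above, expressed as YAML
spec:
  strategy:
    rollingUpdate:
      maxUnavailable: 0
```

Apply it with `kubectl patch deployment.apps/addmodel-service --patch-file rolling-update-patch.yaml` (the `--patch-file` flag should be available in reasonably recent kubectl versions).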
To patch HPA scaling:
kubectl patch hpa model-api-hpa-1324697 -n euw-02-perftest -p '{"spec":{"minReplicas": 30, "maxReplicas": 40}}' --kubeconfig /home/AzDevOps/.kubeconfig
Rollout status and history:
sudo kubectl rollout status deployment.apps/addmodel-service -n azureml-as-np-euw-ds-aml2
sudo kubectl rollout history deployment.apps/addmodel-service -n azureml-as-np-euw-ds-aml2
Cleanup the namespace:
sudo kubectl delete all --all -n azureml-as-np-euw-ds-aml2
Remove pods (matching a replicaset hash) in the default namespace:
sudo kubectl get pods --no-headers=true | awk '/77c75b657f/{print $1}' | xargs sudo kubectl delete pod
Check 1-pod-per-1-node allocation (column 7 of `-o wide` is the node name; the output lists nodes hosting more than one pod):
sudo kubectl get pods -n azureml-as-np-euw-ds-aml2 -o wide | awk '{print $7}' | sort | uniq -d
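The `sort | uniq -d` tail of that pipeline is what flags violations; a standalone sketch with made-up node names:

```shell
# Sample NODE column values, as `kubectl get pods -o wide | awk '{print $7}'`
# might produce them (node names are invented for illustration):
nodes="aks-nodepool1-001
aks-nodepool1-002
aks-nodepool1-001"
# `uniq -d` prints only duplicated adjacent lines (hence the sort first),
# i.e. nodes hosting 2+ pods; empty output means 1 pod per node holds.
printf '%s\n' "$nodes" | sort | uniq -d
```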
----
AKS additional tweaks (smoother rollouts: no pods go unavailable during an update, and pods get time to drain connections before SIGTERM):
```yaml
deployment:
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
...
terminationGracePeriodSeconds: 60  # default value: 30
...
lifecycle:
  preStop:
    exec:
      command:
        - sleep
        - "60"
```
Docker image registry login from AZ CLI:
az acr login -n asnpd
Docker re-tag:
docker pull repo1cr.io/dependencies/tritonserver-search:22.05
docker image ls
docker image tag repo1cr.io/dependencies/tritonserver-search:22.05 repo2.io/dependencies/tritonserver-search:22.05-tf2-text
docker push repo2.io/dependencies/tritonserver-search:22.05-tf2-text
Fix docker compose on an AML Compute instance:
sudo mv /etc/docker/daemon.json /etc/docker/daemon_previous.json
sudo systemctl daemon-reload
sudo service docker restart
docker compose up
https://stackoverflow.com/a/77486729/2957102
To re-install it (remove the snap version and get moby-engine):
```
sudo snap remove docker
sudo apt-get update
sudo apt-get install moby-engine -f
sudo apt-get install moby-runc
sudo apt-get install moby-containerd
sudo apt-get install moby-engine
```
To add docker compose (if you need it): `sudo apt-get install docker-compose-plugin`
To check/start the service:
```
sudo service --status-all # checks status of all services, find docker here
sudo dockerd # just to check whether it works and see its logs immediately
sudo service docker restart # to start/restart the service
```
For cases when it reports errors about `daemon.json`:
```
sudo mv /etc/docker/daemon.json /etc/docker/daemon_old.json
```
Start your container with the docker compose plugin (make sure you have the required Dockerfile/docker-compose files):
```
docker compose up --build
```
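To sanity-check the plugin without a real project, a throwaway compose file is enough. A minimal sketch (service name and file contents are made up):

```yaml
# docker-compose.yml -- minimal smoke test for `docker compose up`
services:
  smoke-test:
    image: hello-world
```

Running `docker compose up` in the directory holding this file should pull the `hello-world` image, print its message, and exit, confirming both the daemon and the compose plugin work.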
----
Helm charts (>=3.7.0):
helm registry login asnpd.azurecr.io
OR: az acr login -n asnpd
az acr repository show-tags -n asnpd --repository charts/inference-search-server-infra
az acr repository list --name asnpd
helm pull oci://asnpd.azurecr.io/dependencies/triton-server --version 0.1.41
Untar the chart: tar -xzvf triton-server-0.1.41.tgz
Make your changes to the chart, then repackage and push:
helm package triton-server --version 0.1.41-search
helm push triton-server-0.1.41-search.tgz oci://asnpd.azurecr.io/dependencies/triton-server
az acr repository list --name asnpd
az acr repository show --name asnpd --repository dependencies/triton-server/triton-server
az acr repository show-tags --name asnpd --repository dependencies/triton-server/triton-server
To delete:
az acr repository delete --name asnpd --image dependencies/triton-server/triton-server:0.1.41-search
----
Helm 3.6.3 (<3.7.0):
helm registry login asnpd.azurecr.io
Pull:
helm chart pull asnpd.azurecr.io/dependencies/triton-server:0.1.41
helm chart export asnpd.azurecr.io/dependencies/triton-server:0.1.41 --destination triton-server
Push:
helm dependencies update triton-server/triton-server
helm package --version=0.1.41-search-3.6.3-helm triton-server/triton-server
(output: Successfully packaged chart and saved it to: ...-core-aks/triton-server-0.1.41-search-3.6.3-helm.tgz)
helm chart save triton-server-0.1.41-search-3.6.3-helm.tgz asnpd.azurecr.io/dependencies/triton-server
helm chart push asnpd.azurecr.io/dependencies/triton-server:0.1.41-search-3.6.3-helm
---
istio / kiali
Istio UI dashboard via Kiali:
kubectl port-forward svc/kiali 20001:20001 -n istio-system
--- | |
UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress | |
kubectl get secret -A | grep recspoc | tail -n 1 | |
kubectl delete secret sh.helm.release.v1.recspoc-v1-2297354.v1 -n ds-recs-model-inference-recspoc | |
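Helm stores release state in secrets named `sh.helm.release.v1.<release>.v<revision>`; a small sketch for recovering the release name from such a secret (sample name taken from the command above):

```shell
# Stuck releases show up as the highest-revision secret for a release.
secret="sh.helm.release.v1.recspoc-v1-2297354.v1"
release="${secret#sh.helm.release.v1.}"  # drop the fixed prefix
release="${release%.v*}"                 # drop the trailing .v<revision>
echo "$release"  # -> recspoc-v1-2297354
```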
--- helm:
helm ls --namespace asos-ds-recs-model-inference-category-83int
Restart the 'metrics' container in a pod (killing PID 1 makes the container exit, so the kubelet restarts it according to the pod's restartPolicy):
kubectl exec -it -n <yournamespace> <podname> -c metrics -- /bin/sh -c "kill 1"