@cpuguy83
Last active November 13, 2019 03:59

This script helps clean up VMSS nodes that have Azure disks attached to them when they should not, due to bugs in Kubernetes. The specific case is a disk that Kubernetes has re-attached to a node when it should not have.

This DOES NOT detect a bad node/disk; it only assists in cleaning one up.

Usage

$ ./vmssfix.sh NODE_NAME PV_NAME

This triggers a detach of every data disk whose managed disk ID contains PV_NAME from the VMSS instance in Azure. The script performs no destructive action without user confirmation.
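To illustrate how a node maps to a VMSS instance, here is a minimal sketch of splitting a node's providerID into resource group, scale set name, and instance ID. The helper name `parse_provider_id` and the sample ID are hypothetical, and the sketch assumes the providerID has the `azure:///subscriptions/.../resourceGroups/.../virtualMachineScaleSets/.../virtualMachines/N` shape with exactly that casing.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script): split a providerID into the
# three values needed to address a VMSS instance.
parse_provider_id() {
    local id="${1#azure://}"   # drop the azure:// scheme prefix
    local rg vmss instance
    # Take the path segment that follows /resourceGroups/
    rg="$(echo "${id}" | awk -F'/resourceGroups/' '{ print $2 }' | awk -F'/' '{ print $1 }')"
    # Take the path segment that follows /virtualMachineScaleSets/
    vmss="$(echo "${id}" | awk -F'/virtualMachineScaleSets/' '{ print $2 }' | awk -F'/' '{ print $1 }')"
    # The last path segment is the instance ID
    instance="$(basename "${id}")"
    echo "${rg} ${vmss} ${instance}"
}

# Made-up sample ID, for illustration only.
sample="azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg/providers/Microsoft.Compute/virtualMachineScaleSets/my-vmss/virtualMachines/3"
parse_provider_id "${sample}"
```

For the sample ID above this prints `my-rg my-vmss 3`. If a cluster reports the path segments with different casing, the awk separators would need adjusting.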

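#!/bin/bash
set -e -o pipefail

node="$1"   # Kubernetes node name (a VMSS instance)
pvc="$2"    # PV name; matched as a substring of the managed disk ID

usage() {
    echo "Usage: $0 NODE_NAME PV_NAME"
}

if [ -z "${node}" ]; then
    echo "node must not be empty"
    usage
    exit 1
fi

if [ -z "${pvc}" ]; then
    echo "pvc must not be empty"
    usage
    exit 1
fi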

# The node's providerID is the Azure resource ID of the VMSS instance, prefixed with azure://
provider_id="$(kubectl get node -o json "${node}" | jq -r .spec.providerID | awk -F'azure://' '{ print $2 }')"

# The last path segment of the resource ID is the VMSS instance ID.
instance_id="$(basename "${provider_id}")"
if [ -z "${instance_id}" ]; then
    echo "could not find vmss instance id, bailing out"
    exit 1
fi

rg="$(echo "${provider_id}" | awk -F'/resourceGroups/' '{ print $2 }' | awk -F'/' '{ print $1 }')"
if [ -z "${rg}" ]; then
    echo "could not find resource group name, bailing out"
    exit 1
fi

vmss_name="$(echo "${provider_id}" | awk -F'/virtualMachineScaleSets/' '{ print $2 }' | awk -F'/' '{ print $1 }')"
if [ -z "${vmss_name}" ]; then
    echo "could not find vmss name, bailing out"
    exit 1
fi
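
# List each attached data disk as "<managed disk ID> <LUN>", one per line.
disks="$(az vmss show -g "${rg}" -n "${vmss_name}" --instance-id "${instance_id}" -o json | jq -r '.storageProfile.dataDisks[] | "\(.managedDisk.id) \(.lun)"')"
echo

# Read the disk list on fd 3 so the confirmation prompt can still read from stdin.
while IFS= read -r i <&3; do
    # Skip disks whose ID does not contain the PV name (literal substring match).
    if [[ ! "${i}" =~ "${pvc}" ]]; then
        continue
    fi
    lun="$(echo "${i}" | awk '{ print $2 }')"
    # Show the exact command before asking for confirmation.
    echo az vmss disk detach --ids "${provider_id}" --lun "${lun}"
    read -r -p "Are you sure? [y/N] " response
    case "${response}" in
        [yY][eE][sS]|[yY])
            az vmss disk detach --ids "${provider_id}" --lun "${lun}"
            ;;
        *)
            echo skipping
            continue
            ;;
    esac
done 3<<< "${disks}"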
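
The disk selection in the loop can be tried without touching Azure. The following is a minimal sketch that runs the same substring match against canned "disk ID, LUN" lines. The helper name `find_luns` and the sample disk IDs are hypothetical and only imitate the shape of the `jq` output above.

```shell
#!/bin/bash
# Hypothetical helper: given a PV name and "disk-id lun" lines, print the LUN
# of every disk whose ID contains the PV name.
find_luns() {
    local pv="$1" disks="$2" line
    while IFS= read -r line; do
        # Quoting the right-hand side of =~ makes it a literal substring match.
        if [[ "${line}" =~ "${pv}" ]]; then
            echo "${line}" | awk '{ print $2 }'
        fi
    done <<< "${disks}"
}

# Made-up disk list, for illustration only.
sample_disks="/subscriptions/0000/resourceGroups/my-rg/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-aaaa 0
/subscriptions/0000/resourceGroups/my-rg/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-bbbb 1"

find_luns "pvc-bbbb" "${sample_disks}"
```

With the sample list this prints `1`. Because the match is a substring test, a short or partial PV name can select more than one disk, which is one reason the script asks for confirmation before each detach.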
sylus commented Nov 11, 2019

Thank you so much for this :D
