Last active: August 17, 2024 09:53
Resolve some issues while operating K8S on Bare-Metal
- Back up your ETCD data to a safe location.
- Open the `etcd.env` file on one of your ETCD cluster nodes and append the lines below.

      ETCD_FORCE_NEW_CLUSTER=true
      ETCD_INITIAL_CLUSTER=(remove the broken nodes)

- Restart the `etcd` service.
- Check whether the `etcd` service is running.
- Check whether the broken nodes have been removed from the member list.
- Remove the `ETCD_FORCE_NEW_CLUSTER` flag and restart the `etcd` service again.
- Wait a few minutes and check whether your Kubernetes cluster has recovered.
  - Restarting `kubelet` is recommended: it will recover broken core K8S services.
  - Restarting your provisioning services is recommended.
  - Rebooting the nodes will resolve most container-related issues.
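
The env-file edits in the recovery steps above can be sketched as two small shell helpers. This is a minimal sketch under stated assumptions, not the author's tooling: the function names `enable_force_new_cluster` and `disable_force_new_cluster` are hypothetical, the member name and IP in the usage comments are placeholders, and `/etc/etcd.env` is assumed to be the env file location (it may differ on your install). Unlike a plain append, the first helper drops any existing values of the two keys before writing the new ones, so the file never holds duplicates.

```shell
# enable_force_new_cluster ENV_FILE INITIAL_CLUSTER
# Keeps a copy of ENV_FILE as ENV_FILE.bak, removes any existing
# ETCD_FORCE_NEW_CLUSTER / ETCD_INITIAL_CLUSTER lines, then appends new ones.
# INITIAL_CLUSTER lists only the surviving members, e.g. "name=https://ip:2380".
enable_force_new_cluster() {
  cp -p "$1" "$1.bak"
  # "|| true": grep exits non-zero when every line was filtered out.
  grep -v -e '^ETCD_FORCE_NEW_CLUSTER=' -e '^ETCD_INITIAL_CLUSTER=' "$1.bak" > "$1" || true
  printf 'ETCD_FORCE_NEW_CLUSTER=true\nETCD_INITIAL_CLUSTER=%s\n' "$2" >> "$1"
}

# disable_force_new_cluster ENV_FILE
# Removes the ETCD_FORCE_NEW_CLUSTER line again once the cluster is healthy.
disable_force_new_cluster() {
  tmp=$(mktemp)
  grep -v '^ETCD_FORCE_NEW_CLUSTER=' "$1" > "$tmp" || true
  cat "$tmp" > "$1"   # write in place so owner and mode of ENV_FILE are kept
  rm -f "$tmp"
}

# Possible usage on the surviving node, as root (values are placeholders):
#   enable_force_new_cluster /etc/etcd.env 'etcd1=https://10.0.0.1:2380'
#   systemctl restart etcd.service
#   systemctl is-active etcd.service
#   etcdctl member list          # add the TLS flags your cluster requires
#   disable_force_new_cluster /etc/etcd.env
#   systemctl restart etcd.service
```

The match on `ETCD_INITIAL_CLUSTER=` includes the `=` so that related keys such as `ETCD_INITIAL_CLUSTER_STATE` are left untouched.
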
- Back up your data to a safe location.
  - ETCD: `/opt/etcd/`, `/etc/etcd`, `/etc/etcd.env`
  - Control Plane: `/etc/kubernetes`, `/var/lib/kubelet`
  - Rook Ceph: `/var/lib/rook`
- Drain the nodes.
- Reinstall the OS.
  - Rook Ceph: DO NOT WIPE THE DATA VOLUME.
- Restore the data and reinstall K8S.
- Uncordon ("undrain") the nodes.
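
The backup, drain, and uncordon steps above could look roughly like the following. This is a sketch under assumptions: `backup_paths` is a hypothetical helper, and the archive location `/mnt/safe` and the node name `node1` are placeholders. The helper archives only the paths that exist on the node, since not every node holds ETCD, control plane, and Rook Ceph data at once.

```shell
# backup_paths ARCHIVE PATH...
# Creates a gzip-compressed tar archive of every listed PATH that exists.
# Missing paths are skipped with a warning; fails if nothing is left to archive.
backup_paths() {
  archive=$1
  shift
  # Rebuild the argument list so that it contains existing paths only.
  for p in "$@"; do
    shift
    if [ -e "$p" ]; then
      set -- "$@" "$p"
    else
      echo "skip: $p not found" >&2
    fi
  done
  if [ "$#" -eq 0 ]; then
    echo "nothing to back up" >&2
    return 1
  fi
  tar -czf "$archive" "$@"
}

# Possible usage, as root (archive path and node name are placeholders):
#   backup_paths /mnt/safe/node1.tar.gz \
#     /opt/etcd /etc/etcd /etc/etcd.env \
#     /etc/kubernetes /var/lib/kubelet /var/lib/rook
#   kubectl drain node1 --ignore-daemonsets --delete-emptydir-data
#   ... reinstall the OS, restore the archive, reinstall K8S ...
#   kubectl uncordon node1
```

`tar` stores the members without the leading `/`, so restoring with `tar -xzf ARCHIVE -C /` puts them back in place. For a running ETCD member, `etcdctl snapshot save` is an alternative to copying the data directory.
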
- Add an ETCD node to the existing Kubernetes ETCD cluster.

      etcdctl member add [new-node-name] --peer-urls=https://[new-node-ip]:2380

  - You may need to pass the cert files to authorize the command, like below:

        --cacert /etc/etcd/ssl/ca.pem
        --cert /etc/etcd/ssl/admin-[old-node-k8s-name].pem
        --key /etc/etcd/ssl/admin-[old-node-k8s-name]-key.pem

- Update `/etc/kubernetes/manifests/kube-apiserver.yaml`.

      --etcd-servers=https://[new-node-ip]:2379

  - The Kubernetes manifest directory may differ (e.g. with Kubespray).
- Restart the `kubelet` service.

      systemctl restart kubelet.service

- Wait a few seconds and check that the K8S cluster is running.
- Remove the old ETCD node from your cluster.

      etcdctl member remove [old-node-id]
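
Two parts of the node replacement above are easy to get wrong by hand: finding `[old-node-id]` and editing the `--etcd-servers` line. A minimal sketch of both follows. The helper names `member_id_by_name` and `set_etcd_servers` are hypothetical, the parsing assumes the default comma-separated output of `etcdctl member list` (ID, status, name, peer URLs, client URLs, ...), and the manifest path is the one quoted in the steps, which may differ on your install.

```shell
# member_id_by_name NAME
# Reads `etcdctl member list` output on stdin and prints the ID of the
# member whose name (third comma-separated field) equals NAME.
member_id_by_name() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# set_etcd_servers MANIFEST SERVERS
# Rewrites the value of --etcd-servers in a kube-apiserver manifest.
# SERVERS may be a single URL or a comma-separated list of URLs.
# The temp file is created outside the manifest directory on purpose,
# so kubelet never sees a half-written extra file there.
set_etcd_servers() {
  tmp=$(mktemp)
  sed "s|\(--etcd-servers=\).*|\1$2|" "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Possible usage (bracketed values are the placeholders from the steps):
#   etcdctl --cacert /etc/etcd/ssl/ca.pem \
#           --cert /etc/etcd/ssl/admin-[old-node-k8s-name].pem \
#           --key /etc/etcd/ssl/admin-[old-node-k8s-name]-key.pem \
#           member add [new-node-name] --peer-urls=https://[new-node-ip]:2380
#   set_etcd_servers /etc/kubernetes/manifests/kube-apiserver.yaml \
#           'https://[new-node-ip]:2379'
#   systemctl restart kubelet.service
#   old_id=$(etcdctl member list | member_id_by_name [old-node-name])
#   etcdctl member remove "$old_id"
```

It is worth keeping a copy of the manifest somewhere outside the manifest directory before editing it, and checking that `old_id` is non-empty before running `member remove`.
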