On-premise Kubernetes installations cannot take advantage of cloud-native services like dynamic load balancers. To keep a cluster highly available, you must deploy something that keeps the Kubernetes API server reachable when a node fails. Traditionally this would be handled by an in-cluster load balancer such as MetalLB or an NGINX deployment, but those do not work in our case: the control plane has to be up before the scheduler can place such deployments... chicken and egg.
The kube-vip project provides high availability and load balancing both inside and outside a Kubernetes cluster.
In order to proceed with this guide, you will need the following:
- A DNS server, or modification of /etc/hosts on each node with the node hostnames and the rke2 master HA hostname
- firewalld turned off (a sample command follows this list)
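If firewalld is currently enabled on the nodes, it can be turned off like this (assuming a RHEL/CentOS-family distribution where firewalld is the default firewall):
# run on every node: stop firewalld immediately and keep it disabled across reboots
systemctl disable --now firewalld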
In this guide, I will be setting up a 3-node HA RKE2 cluster. I use the .lol domain, but swap it out for the domain of your choosing.
| Host | Type | IP | Notes |
|---|---|---|---|
| rke2a | VM | 10.0.1.2 | etcd |
| rke2b | VM | 10.0.1.3 | etcd |
| rke2c | VM | 10.0.1.4 | etcd |
| rke2master | Virtual IP | 10.0.1.5 | You will define this IP on your own. Make sure that it is not currently allocated to a node (and remove it from DHCP allocation) |
If you do not have a DNS server available/configured, the /etc/hosts file on each node will need to include the following.
10.0.1.2 rke2a rke2a.lol
10.0.1.3 rke2b rke2b.lol
10.0.1.4 rke2c rke2c.lol
10.0.1.5 rke2master rke2master.lol
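A quick sanity check that the names resolve correctly (getent queries the same sources the system resolver uses):
getent hosts rke2master   # should print 10.0.1.5
getent hosts rke2a rke2b rke2c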
The secret behind on-prem HA is kube-vip. We are going to take their recommended approach for k3s and adapt it for rke2.
export RKE2_VIP_IP=10.0.1.5 # IMPORTANT: Update this with the IP that you chose.
# create RKE2's self-installing manifest dir
mkdir -p /var/lib/rancher/rke2/server/manifests/
# Install the kube-vip deployment into rke2's self-installing manifest folder
curl -sL kube-vip.io/k3s | vipAddress=${RKE2_VIP_IP} vipInterface=eth0 sh | sudo tee /var/lib/rancher/rke2/server/manifests/vip.yaml
# Find/Replace all k3s entries to represent rke2
sed -i 's/k3s/rke2/g' /var/lib/rancher/rke2/server/manifests/vip.yaml
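# Note: vipInterface=eth0 above assumes the node's primary NIC is named eth0.
# If you are not sure, the interface carrying the default route can be found with:
ip -o -4 route show to default | awk '{print $5}'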
# create the rke2 config file
mkdir -p /etc/rancher/rke2
touch /etc/rancher/rke2/config.yaml
echo "tls-san:" >> /etc/rancher/rke2/config.yaml
echo " - ${HOSTNAME}.lol" >> /etc/rancher/rke2/config.yaml
echo " - ${HOSTNAME}" >> /etc/rancher/rke2/config.yaml
echo " - rke2master.lol" >> /etc/rancher/rke2/config.yaml
echo " - rke2master" >> /etc/rancher/rke2/config.yaml
## Optional but recommended
# k9s - ncurses-based k8s dashboard
wget https://github.com/derailed/k9s/releases/download/v0.24.2/k9s_Linux_x86_64.tar.gz -O /tmp/k9s.tgz
cd /tmp && tar zxvf k9s.tgz
chmod +x ./k9s && mv ./k9s /usr/local/bin
# update path with rke2-binaries
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
echo 'export PATH=${PATH}:/var/lib/rancher/rke2/bin' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
source ~/.bashrc
curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server.service
systemctl start rke2-server.service
sleep 90 #wait ~90 seconds for rke2 to be ready
kubectl get nodes -o wide # should show as ready
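You can also confirm that kube-vip has claimed the virtual IP on this first node (the interface name and the kube-system namespace are assumptions based on the manifest generated above):
ip addr show eth0 | grep 10.0.1.5            # the VIP should be bound to the interface
kubectl get pods -n kube-system | grep vip   # the kube-vip pod should be Running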
To verify that the virtual IP is serving the API server, run the following commands:
mkdir -p $HOME/.kube
export VIP=rke2master
sudo cat /etc/rancher/rke2/rke2.yaml | sed 's/127.0.0.1/'$VIP'/g' > $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
KUBECONFIG=~/.kube/config kubectl get nodes -o wide
This uses the virtual IP that kube-vip created for us at 10.0.1.5 and confirms that the rke2 API is successfully being served on that virtual IP.
For our cluster to be highly available, we are naturally going to need additional masters.
On each of the additional masters, run the following commands:
# IMPORTANT: replace the following with the value of /var/lib/rancher/rke2/server/token from rke2a
export TOKEN=""
mkdir -p /etc/rancher/rke2
touch /etc/rancher/rke2/config.yaml
echo "token: ${TOKEN}" >> /etc/rancher/rke2/config.yaml
echo "server: https://rke2master.lol:9345" >> /etc/rancher/rke2/config.yaml
echo "tls-san:" >> /etc/rancher/rke2/config.yaml
echo " - ${HOSTNAME}.lol" >> /etc/rancher/rke2/config.yaml
echo " - ${HOSTNAME}" >> /etc/rancher/rke2/config.yaml
echo " - rke2master.lol" >> /etc/rancher/rke2/config.yaml
echo " - rke2master" >> /etc/rancher/rke2/config.yaml
curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server.service
systemctl start rke2-server.service
That's it! As you can see in the config file above, we reference the virtual IP/hostname as the rke2 server rather than any single node's IP/hostname, because we want the reference to move with the availability of etcd.
I have 3 master nodes. With 3 masters, etcd keeps quorum as long as 2 of them are alive, so the cluster keeps functioning with no downtime if any single master fails.
If rke2a currently holds the virtual IP and goes down, a new leader is elected from the 2 remaining masters and the floating IP is assigned to it. When rke2a comes back, it rejoins the leader election as a candidate. As long as the control plane retains quorum, kube-vip will keep the virtual IP pointing at a live master and the cluster stays reachable.
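If you want to see the failover in action, here is a quick illustrative test (assuming rke2a currently holds the VIP and you run kubectl from another machine using the rke2master kubeconfig built earlier):
# reboot the node that currently holds the virtual IP
ssh rke2a 'sudo reboot'
# within a few seconds the VIP should move to another master and stay reachable
ping -c 3 rke2master
kubectl get nodes -o wide   # rke2a shows NotReady for a while, but the API keeps answering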