aojea/KIND_Networking.md

Networking scenarios [Linux Only]

KIND runs Kubernetes clusters in Docker and leverages Docker networking for all its network features: port mapping, IPv6, container connectivity, etc.

Docker Networking

KIND uses a Docker user-defined network.

It creates a bridge network named kind:

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
3a6a365daaaa        bridge              bridge              local
8405b115c9c1        host                host                local
df31e026ef86        kind                bridge              local
edf13138d1c3        none                null                local

This creates a new bridge on the host. You can get more details using docker network inspect:

$ docker network inspect kind
[
    {
        "Name": "kind",
        "Id": "df31e026ef8641d1a106aa4a0fc4f0c2129049afc69346a6293457d2564cb007",
        "Created": "2020-07-29T00:36:06.115488848+02:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"

Docker also creates iptables NAT rules on the Docker host that masquerade traffic from the containers connected to the kind bridge, so that they can reach the outside world.

Kubernetes Networking

The Kubernetes network model implies end to end connectivity without NAT between Pods.

By default, KIND uses its own CNI plugin, Kindnet, which installs the corresponding routes and iptables rules on the cluster nodes.

Services

Kubernetes Services are an abstract way to expose an application running on a set of Pods as a network service.

There are different types of Services:

Cluster IP
NodePort
LoadBalancer
Headless
ExternalName

On Linux hosts, you can access the Cluster IP addresses of the Services directly by adding a single route on the host to the configured serviceSubnet via any of the nodes that belong to the cluster, so there is no need to use NodePort or LoadBalancer Services.
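As a minimal sketch of that route, the helper below only prints the command to run on the Linux host. The function name service_route_cmd is hypothetical, 10.96.0.0/16 is an assumed serviceSubnet (use the value from your own cluster configuration), and kind-control-plane is assumed to be the name of one of the node containers.

```shell
#!/bin/sh
# service_route_cmd SERVICE_SUBNET NODE_IP
# Print (do not run) the host route that sends Service traffic to a node.
service_route_cmd() {
  printf 'sudo ip route add %s via %s\n' "$1" "$2"
}

# Example with assumed values; review the printed command before running it:
#   node_ip=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' kind-control-plane)
#   service_route_cmd 10.96.0.0/16 "${node_ip}"
```

Printing the command instead of executing it keeps the sketch safe to try, since the route changes the host routing table.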

Multiple clusters

All KIND clusters share the same Docker network, which means that the nodes of all the clusters have direct connectivity to each other.

If we want to spawn multiple clusters and provide Pod to Pod connectivity between them, we first have to configure the cluster networking parameters to avoid overlapping addresses.
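One way to check candidate ranges before creating the clusters is a small overlap test. This is a minimal sketch for IPv4 only, in POSIX shell; ip_to_int and cidrs_overlap are hypothetical helper names, not part of KIND.

```shell
#!/bin/sh
# ip_to_int A.B.C.D
# Convert a dotted-quad IPv4 address to its integer value.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# cidrs_overlap CIDR_A CIDR_B
# Return 0 (true) when the two IPv4 CIDRs share at least one address.
cidrs_overlap() {
  a_ip=${1%/*}; a_len=${1#*/}
  b_ip=${2%/*}; b_len=${2#*/}
  # Two CIDRs overlap when they agree on the bits of the shorter prefix.
  len=$a_len
  if [ "$b_len" -lt "$len" ]; then
    len=$b_len
  fi
  shift_bits=$(( 32 - len ))
  [ $(( $(ip_to_int "$a_ip") >> shift_bits )) -eq $(( $(ip_to_int "$b_ip") >> shift_bits )) ]
}

# Example:
#   cidrs_overlap 10.110.0.0/16 10.220.0.0/16 || echo "no overlap"
```

The same check can be applied to every pair of podSubnet and serviceSubnet values, and to the subnet of the kind Docker network itself.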

Example: Kubernetes multi-region

Let's take an example emulating 2 clusters: A and B.

For cluster A we are going to use the following network parameters:

cat <<EOF | kind create cluster --name clustera --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  podSubnet: "10.110.0.0/16"
  serviceSubnet: "10.115.0.0/16"
nodes:
- role: control-plane
- role: worker
EOF

And Cluster B:

cat <<EOF | kind create cluster --name clusterb --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  podSubnet: "10.220.0.0/16"
  serviceSubnet: "10.225.0.0/16"
nodes:
- role: control-plane
- role: worker
EOF

Each node in a cluster has routes to the Pod subnets (podCIDR) assigned to the other nodes of the same cluster. To provide Pod to Pod connectivity between different clusters, we just have to install the equivalent routes on the nodes of the other cluster.

We can obtain the routes using kubectl:

$ kubectl --context kind-clustera get nodes -o=jsonpath='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
ip route add 10.110.0.0/24 via 172.17.0.4
ip route add 10.110.1.0/24 via 172.17.0.3
ip route add 10.110.2.0/24 via 172.17.0.2

$ kubectl --context kind-clusterb get nodes -o=jsonpath='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
ip route add 10.220.0.0/24 via 172.17.0.7

Then we just need to install the routes obtained from clusterA on each node of clusterB, and vice versa. This can be automated with a script like this:

for c in clustera clusterb; do
  for n in $(kind get nodes --name ${c}); do
    # Add static routes to the pods in the other cluster
    docker exec ${n} ip route add <POD_SUBNET> via <NODE_IP>
    # Add static route to the service in the other cluster
    # We just need to add one route only for services
    docker exec ${n} ip route add <SVC_SUBNET> via <NODE_IP>
    ...
  done
done
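The placeholders in that loop can be filled from the kubectl output shown earlier. The following is a minimal sketch of one way to do it, not the only one: the hypothetical helper emit_route_cmds takes the node names of the target cluster and the route lines obtained from the other cluster, and prints one docker exec command per pair without executing anything.

```shell
#!/bin/sh
# emit_route_cmds NODES ROUTES
# NODES:  whitespace-separated node (container) names of the target cluster.
# ROUTES: newline-separated "ip route add ..." lines from the other cluster.
# Prints one "docker exec" command per (node, route) pair; nothing is executed.
emit_route_cmds() {
  for n in $1; do
    printf '%s\n' "$2" | while IFS= read -r route; do
      # Skip empty lines, e.g. a trailing newline in the kubectl output
      if [ -n "${route}" ]; then
        printf 'docker exec %s %s\n' "${n}" "${route}"
      fi
    done
  done
}

# Example: install the routes of clustera on the nodes of clusterb.
#   jp='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
#   routes_a=$(kubectl --context kind-clustera get nodes -o=jsonpath="${jp}")
#   emit_route_cmds "$(kind get nodes --name clusterb)" "${routes_a}"        # review
#   emit_route_cmds "$(kind get nodes --name clusterb)" "${routes_a}" | sh   # apply
```

Repeat the call with the clusters swapped for the other direction. For the Services, append a single extra line to ROUTES that routes the other cluster's serviceSubnet (10.115.0.0/16 for cluster A in this example) via any one of its nodes.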

Example: Emulate external VMs

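A plain Docker container can emulate an external VM. By default Docker attaches containers to the docker0 bridge, so we use --network kind to attach the container to the same network as the cluster nodes: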

$ docker run -d --network kind --name alpine alpine tail -f /dev/null
8b94e9dabea847c004ce9fd7a69cdbc82eb93e31857c25c0a8872706efb08a4d
$ docker exec -it alpine ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0

That means that Pods will be able to reach other Docker containers that do not belong to any KIND cluster. However, those containers will not be able to reply to the Pod IP addresses until we install the corresponding routes.

We can solve this by installing routes in the new containers to the Pod subnets of each node, as explained in the previous section.

Example: Multiple network interfaces and Multi-Home Nodes

Some scenarios require multiple interfaces in the KIND nodes to test multi-homing, VLANs, CNI plugins, etc.

Typically, you will want to use loopback addresses for communication. We can configure those loopback addresses after the cluster has been created, and then modify the Kubernetes components to use them.

When creating the cluster we must add the loopback IP address of the control plane to the certificate SAN (the apiserver binds to "all-interfaces" by default):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
# add the loopback to apiServer cert SANS
kubeadmConfigPatchesJSON6902:
- group: kubeadm.k8s.io
  kind: ClusterConfiguration
  patch: |
    - op: add
      path: /apiServer/certSANs/-
      value: my-loopback

To create the new network interfaces on the KIND nodes, you can use tools like koko. You can find several examples of creating complex topologies with containers in this repo: https://github.com/aojea/frr-lab.

Another alternative is to use Docker user-defined bridges:

LOOPBACK_PREFIX="1.1.1."
MY_BRIDGE="my_net2"
MY_ROUTE=10.0.0.0/24
MY_GW=172.16.17.1
# Create 2nd network
docker network create ${MY_BRIDGE}
# Counter used as the last octet of each node's loopback address
i=1
# Configure nodes to use the second network
for n in $(kind get nodes); do
  # Connect the node to the second network
  docker network connect ${MY_BRIDGE} ${n}
  # Configure a loopback address
  docker exec ${n} ip addr add ${LOOPBACK_PREFIX}${i}/32 dev lo
  # Add static routes
  docker exec ${n} ip route add ${MY_ROUTE} via ${MY_GW}
  i=$((i + 1))
done

After the cluster has been created, we have to modify the kube-apiserver --advertise-address flag in the static Pod manifest /etc/kubernetes/manifests/kube-apiserver.yaml on the control-plane node (once you write the file, the kubelet restarts the Pod with the new config):

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=172.17.0.4

and then change the node-ip flag for the kubelets on all the nodes:

root@kind-worker:/# more /var/lib/kubelet/kubeadm-flags.env 
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false --node-ip=172.17.0.4"

Finally, restart the kubelets with systemctl restart kubelet so that they use the new configuration.
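Both edits are the same kind of change: replacing the value of one command-line flag inside a file. A minimal sketch, meant to be run inside the node (for example after docker exec -it into it), could look like this. The helper name set_flag and the 1.1.1.1 address are hypothetical, and sed -i assumes GNU or BusyBox sed.

```shell
#!/bin/sh
# set_flag FILE FLAG VALUE
# Replace the value of --FLAG=... in FILE, in place.
set_flag() {
  file=$1; flag=$2; value=$3
  # Match the old value up to the next space or double quote, so that
  # neighbouring flags on the same line are left untouched.
  sed -i "s|--${flag}=[^\" ]*|--${flag}=${value}|" "${file}"
}

# Example, with an assumed loopback address:
#   set_flag /etc/kubernetes/manifests/kube-apiserver.yaml advertise-address 1.1.1.1
#   set_flag /var/lib/kubelet/kubeadm-flags.env node-ip 1.1.1.1
#   systemctl restart kubelet
```

Only the first example applies to the control-plane node; the node-ip change and the kubelet restart apply to every node, each with its own address.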

It's important to note that calling kubeadm init / join again on the node will overwrite /var/lib/kubelet/kubeadm-flags.env. An alternative is to use /etc/default/kubelet.
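A minimal sketch of that alternative, assuming the kubelet systemd unit on the node reads /etc/default/kubelet as an environment file and expands KUBELET_EXTRA_ARGS (the kubeadm convention on Debian-based images). The function name write_kubelet_extra_args and the address are hypothetical.

```shell
#!/bin/sh
# write_kubelet_extra_args FILE NODE_IP
# Write an environment file that passes --node-ip to the kubelet.
# Note: this replaces any previous content of FILE.
write_kubelet_extra_args() {
  printf 'KUBELET_EXTRA_ARGS=--node-ip=%s\n' "$2" > "$1"
}

# Example, run inside the node with an assumed loopback address:
#   write_kubelet_extra_args /etc/default/kubelet 1.1.1.1
#   systemctl restart kubelet
```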

jonassteinberg1 · 2021-05-09T22:51:30Z

I tested the "Multiple Clusters" section and spun up an nginx pod on each cluster. When trying to curl b from a, with b's nginx listening on 80, a 500 was being returned. Then, in the midst of troubleshooting, I changed b's listening port to 1234, curled b from a again, and boom: 200! Why is this?

aojea · 2021-05-10T07:44:08Z

It is weird to receive a 500; that means that there is connectivity. Is it possible you are reaching a different server?

Nevertheless, I've created a plugin to automate this process https://github.com/aojea/kind-networking-plugins/blob/main/multicluster/README.md

You can see a full demo in the kubecon presentation https://github.com/aojea/kind-networking-plugins/blob/main/README.md