flowchart LR
subgraph router0
router0:net0("net0 (unnumbered)")
router0:net1("net1 (unnumbered)")
end
router0:net0---tor0:net0
router0:net1---tor1:net0
subgraph rack0
subgraph tor0
tor0:net0("net0 (unnumbered)")
tor0:net1("net1 (10.0.1.1/24)")
tor0:net2("net2 (10.0.2.1/24)")
end
subgraph kind-node0
kind-node0:net0("net0 (10.0.1.2/24)")
kind-node0:lxc0("lxc0")
end
subgraph kind-node1
kind-node1:net0("net0 (10.0.2.2/24)")
kind-node1:lxc0("lxc0")
end
subgraph netshoot-pod0
netshoot-pod0:eth0("eth0")
end
subgraph netshoot-pod1
netshoot-pod1:eth0("eth0")
end
tor0:net0-.-tor0:net1
tor0:net0-.-tor0:net2
tor0:net1---kind-node0:net0
tor0:net2---kind-node1:net0
kind-node0:net0-.-kind-node0:lxc0
kind-node1:net0-.-kind-node1:lxc0
kind-node0:lxc0---netshoot-pod0:eth0
kind-node1:lxc0---netshoot-pod1:eth0
end
subgraph rack1
subgraph tor1
tor1:net0("net0 (unnumbered)")
tor1:net1("net1 (10.0.3.1/24)")
tor1:net2("net2 (10.0.4.1/24)")
end
subgraph kind-node2
kind-node2:net0("net0 (10.0.3.2/24)")
kind-node2:lxc0("lxc0")
end
subgraph kind-node3
kind-node3:net0("net0 (10.0.4.2/24)")
kind-node3:lxc0("lxc0")
end
subgraph netshoot-pod2
netshoot-pod2:eth0("eth0")
end
subgraph netshoot-pod3
netshoot-pod3:eth0("eth0")
end
tor1:net0-.-tor1:net1
tor1:net0-.-tor1:net2
tor1:net1---kind-node2:net0
tor1:net2---kind-node3:net0
kind-node2:net0-.-kind-node2:lxc0
kind-node3:net0-.-kind-node3:lxc0
kind-node2:lxc0---netshoot-pod2:eth0
kind-node3:lxc0---netshoot-pod3:eth0
end
- Create an environment
make deploy
- Confirm Cilium is running in native routing mode and that Pod-to-Pod communication across nodes is not working yet
# Show Cilium configuration
cat values.yaml
# Press 'd' in the mtr window to switch the display mode, and keep this mtr running in the window for later use.
SRC_POD=$(kubectl get pods -o wide | grep control-plane | awk '{ print($1); }')
DST_IP=$(kubectl get pods -o wide | grep worker3 | awk '{ print($6); }')
kubectl exec -it $SRC_POD -- mtr $DST_IP
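For reference, a Helm values file for this kind of setup typically looks roughly like the sketch below: native routing plus the BGP control plane. This is only an illustration of the relevant knobs, assuming a recent Cilium chart; the exact keys and values in this repo's values.yaml may differ.

```yaml
# Illustrative sketch only; see values.yaml in this repo for the real settings.
routingMode: native                  # no tunneling; Pod CIDRs must be routable
ipv4NativeRoutingCIDR: 10.0.0.0/8    # assumed CIDR covering the lab's Pod ranges
bgpControlPlane:
  enabled: true                      # required for CiliumBGPPeeringPolicy to take effect
ipam:
  mode: kubernetes                   # per-node PodCIDRs allocated by Kubernetes
```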
- Show the BGP neighbor status of the routers and confirm the Cilium nodes are not peering with them yet
make show-neighbors
- Show what the CiliumBGPPeeringPolicy looks like
# Important point: it is a per-rack configuration (not a per-node one).
cat cilium-bgp-peering-policies.yaml
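As a rough sketch of what a per-rack policy can look like (the ASN, peer address, and node label below are illustrative assumptions, not necessarily what this repo's cilium-bgp-peering-policies.yaml uses):

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack0
spec:
  # Per-rack: every node labeled rack=rack0 gets this BGP configuration.
  nodeSelector:
    matchLabels:
      rack: rack0
  virtualRouters:
  - localASN: 65010                  # assumed ASN shared with the ToR
    exportPodCIDR: true              # advertise each node's PodCIDR to the ToR
    neighbors:
    - peerAddress: "10.0.1.1/32"     # placeholder for the ToR's peering address
      peerASN: 65010
```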
- Apply the policy and see the mtr result change
# Expected behavior: mtr shows a route like Pod -> kind-control-plane -> tor0 -> router0 -> tor1 -> kind-worker3 -> Pod
make apply-policy
Create/Destroy/Reload the Lab
# Create
make deploy
# Destroy
make destroy
# Reload
make reload
Apply/Delete BGP policies
# Apply
make apply-policy
# Delete
make delete-policy
Show the BGP neighbor state of all nodes
make show-neighbors
Show RIB of all router nodes
make show-rib
Show FIB of all nodes
make show-fib
The essential part is the `network-mode` definition of the serverX nodes in `topo.yaml`. `container:<containerX>` is an option that makes a container share its network namespace with containerX. By using this, we can define containerlab nodes that share their network namespace with the Kind nodes, and by defining links to those containers, we can wire veth pairs into the Kind nodes.
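As a minimal sketch (not the actual contents of this repo's `topo.yaml`; node names, images, and the referenced container are illustrative assumptions), such a definition looks roughly like this:

```yaml
# Illustrative sketch only; see topo.yaml in this repo for the real topology.
name: bgp-lab                        # hypothetical lab name
topology:
  nodes:
    tor0:
      kind: linux
      image: frrouting/frr:latest    # assumed routing image
    server0:
      kind: linux
      image: nicolaka/netshoot:latest
      # Reuse another container's network namespace instead of creating one,
      # so the link below effectively plugs a veth into that namespace.
      # (See the naming-convention caveat at the end of this section.)
      network-mode: container:control-plane   # hypothetical reference to a Kind node
  links:
    - endpoints: ["tor0:net1", "server0:net0"]
```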
Now the Kind nodes are connected to the containerlab network, but their default route still points to the Docker network bridge. To route packets to the containerlab side, we need to modify the default route. In this case we also need to assign IP addresses to the `net0` interfaces.
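Conceptually, the adjustment for each Kind node boils down to something like the following sketch (the lab automates this; the container name and addresses are taken from the diagram above and may not match the real lab exactly):

```bash
# Illustrative only: give net0 an address on the ToR-facing subnet and
# send traffic through the ToR instead of the Docker bridge.
docker exec kind-node0 ip addr add 10.0.1.2/24 dev net0
docker exec kind-node0 ip route replace default via 10.0.1.1 dev net0
```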
Since we modify the default route of the Kind nodes, they lose connectivity to the outside world (normally they rely on Docker's NAT). To solve this, we need a node that performs NAT on the containerlab side. In this example, router0 is in charge of that (it also advertises the default route over BGP).
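For the NAT part, a masquerade rule on router0's external interface along these lines is enough in principle (illustrative; the interface name and source range are assumptions):

```bash
# Illustrative only: masquerade lab traffic leaving through router0's
# Docker-facing interface so the Kind nodes can still reach the Internet.
iptables -t nat -A POSTROUTING -s 10.0.0.0/8 -o eth0 -j MASQUERADE
```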
Currently, containerlab has a tricky requirement for `container:<containerX>`: containerX must be the name of a node inside the containerlab manifest (e.g. router0, tor0, etc.), so just specifying a Kind node's container name as containerX doesn't work. To overcome this limitation, we rely on a special naming convention for the Kind nodes. Please don't change the Kind cluster name or the containerlab lab name; if you want to change them, please ask me.