Provisioning to use Calico
1/20/2017
First, tearing down the cluster, by doing this on each node:
kubeadm reset
rm -rf .kube
systemctl start kubelet.service
Did yum -y update on all nodes as well. Downloaded http://docs.projectcalico.org/v1.6/getting-started/kubernetes/installation/hosted/kubeadm/calico.yaml and then ran "kubectl create -f calico.yaml".
Starting up cluster (with just the master node).
kubeadm init
Have:
kubeadm join --token=482742.64ff98647d01bf2c 10.87.49.77
kubectl taint nodes --all dedicated-
kubectl apply -f calico.yaml
kubectl get pods --all-namespaces
Issue: It is unable to download calico/ctl v0.23.1; it does not exist. Options are to use v1.0.0 (and use the same for calico/node, to keep them in sync?) or drop to v0.23.0 (and either keep calico/node at v0.23.1 or set it to v0.23.0 too). Trying v1.0.0. Stopped the cluster and then restarting, after updating calico.yaml.
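For reference, the image lines being edited in calico.yaml look roughly like this (just the image fields, names as referenced above; the surrounding container specs are omitted):
image: calico/node:v1.0.0
image: calico/ctl:v1.0.0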
Have:
kubeadm join --token=086198.58f8a53b0c2391be 10.87.49.77
Looks like calico/configure is not starting up (message about calicoctl unknown command pool). Also dns pod not coming up, says CNI config is uninitialized. Will try v0.23.0 for both node and ctl…
Have:
kubeadm join --token=b3be10.7356d594b9de79a2 10.87.49.77
Looks like everything is running…
[root@devstack-77 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   calico-etcd-4zkxh                     1/1     Running   0          10m
kube-system   calico-node-c8pkm                     2/2     Running   0          10m
kube-system   calico-policy-controller-2swdt        1/1     Running   0          10m
kube-system   dummy-2088944543-c5xlk                1/1     Running   0          12m
kube-system   etcd-devstack-77                      1/1     Running   0          11m
kube-system   kube-apiserver-devstack-77            1/1     Running   1          11m
kube-system   kube-controller-manager-devstack-77   1/1     Running   0          11m
kube-system   kube-discovery-1769846148-8fp6j       1/1     Running   0          12m
kube-system   kube-dns-2924299975-r04zb             4/4     Running   0          12m
kube-system   kube-proxy-3lmlh                      1/1     Running   0          12m
kube-system   kube-scheduler-devstack-77            1/1     Running   0          11m
No IPv6. Need to add IPAM setting to calico.yaml, with "assign_ipv6": "true". Doing that and then restarting the cluster (not sure if I can just reapply).
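That is, the ipam section of the cni_network_config in calico.yaml ends up looking like this (same shape as the fuller snippet from lrobson further below):
"ipam": {
    "type": "calico-ipam",
    "assign_ipv6": "true"
},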
Have:
kubeadm join --token=a68a48.184e49eb7ec5588e 10.87.49.77
Note: Issue 455 is closed, they pushed a tag v0.23.1 out for calico/ctl.
Looks like DNS pod is not coming up, and is showing this error:
14m 1s 372 {kubelet devstack-77} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2924299975-zxr50_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-2924299975-zxr50_kube-system(6fb8a3ae-df38-11e6-ae00-003a7d69f73c)\" using network plugins \"cni\": No configured Calico pools; Skipping pod"
01/23/2017
Issue is that IPv6 is not set up on the nodes. Added this to /etc/sysconfig/network:
NETWORKING_IPV6=yes
For the bridge interface (using br_mgmt), added:
IPV6INIT=yes
IPV6ADDR=2001:1::71/64
Used 71,77,78 for the three systems.
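That is, per node, something like this in the interface config (assuming the usual /etc/sysconfig/network-scripts/ifcfg-br_mgmt path; shown for the .77 node):
IPV6INIT=yes
IPV6ADDR=2001:1::77/64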
Next, brought up cloud and started Ubuntu (w/o IPv6) in two containers and checked ping.
kubeadm join --token=228e0d.1b18edb2e1fe7159 10.87.49.77
Did web.yml and redis-master-deployment.yml, then connected to the latter and pinged the former by IPv4 address. Now enabling IPv6 and trying again.
Note: installed net-tools (for the route command) and tcpdump too. Use "ip -6 addr add <address> dev <dev>" to add an IP address, and "ip -6 route" (or "route -A inet6", with net-tools installed) to view routes.
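For example, inside one of the test containers (the address and device here are made up for illustration):
ip -6 addr add 2001:2::10/64 dev eth0
ip -6 route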
kubeadm join --token=3b13ce.433093a70c2e47a0 10.87.49.77
The DNS pod is not running and we see the error:
5m 3s 28 {kubelet devstack-77} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2924299975-1m2c5_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-2924299975-1m2c5_kube-system(e76de7f4-e192-11e6-86d8-003a7d69f73c)\" using network plugins \"cni\": No configured Calico pools; Skipping pod"
Q: Do I need to use the br_mgmt IP addresses?
Q: Any other config?
Trying the following (can try IPv6 later and maybe other settings, like DNS?):
kubeadm init --api-advertise-addresses=172.18.96.26
Control plane never became ready. Instead, trying this:
kubeadm init --pod-network-cidr=2001:1::/64
Seems stuck creating the test deployment. Trying to add "subnet" with value "2001:2::/64" to the calico.yaml IPAM area and then creating the cluster. Still seeing DNS not coming up.
Trying Again 1/26/2017
Seeing notes on the reference page; will try these steps. First, set a global IP on each node, with 2001:1::#/64, where # is 71, 77, 78. Did this on the br_mgmt interface (can try br_api, if this doesn't work). Verified pings.
Bringing up cluster with calico yaml file, and only the assign_ipv6 flag set (no subnet).
kubeadm init
Have:
kubeadm join --token=0f1c24.963cd0640a394e46 10.87.49.77
kubectl taint nodes --all dedicated-
kubectl apply -f calico.yaml
kubectl get pods --all-namespaces
The DNS pod is showing this:
2m 2s 11 {kubelet devstack-77} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-2924299975-nqm5n_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-2924299975-nqm5n_kube-system(cff5cc0e-e3fd-11e6-a9ac-003a7d69f73c)\" using network plugins \"cni\": No configured Calico pools; Skipping pod"
Getting calicoctl using these instructions: https://github.com/projectcalico/calicoctl
git clone https://github.com/projectcalico/calicoctl.git $GOPATH/src/github.com/projectcalico/calicoctl
Install glide:
mkdir $GOPATH/bin
curl https://glide.sh/get | sh
cd /root/go/src/github.com/projectcalico/calicoctl
glide install -strip-vendor
make binary
cd $GOPATH
go build src/github.com/projectcalico/calicoctl/calicoctl/calicoctl.go
mv calicoctl bin/
cd
~/go/bin/calicoctl node --ip=10.87.49.77 --ipv6=2001:1::77 --libnetwork
Set the Calico datastore access information in the environment variables or supply details in a config file.
Already have the Calico node container running. How do I make the required changes? They mention creating pools, but there is no pool command for calicoctl.
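A sketch of pointing calicoctl at the cluster's etcd (the 10.87.49.77:6666 endpoint is per the 1/31 notes further below; in calicoctl v1.0 pools are handled as ippool resources rather than a pool command):
export ETCD_AUTHORITY=10.87.49.77:6666
~/go/bin/calicoctl get ippool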
Note: this page (http://docs.projectcalico.org/v1.6/getting-started/kubernetes/installation/) has different steps to install calicoctl.
1/27/2017
Trying setting assign_ipv6 in addition to ipv6 in config file.
kubeadm join --token=6d8305.5794a9835accd11a 10.87.49.77
Info from lrobson and epeterson…
Need newer calico.yaml file from http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/hosted/kubeadm/.
They use BIRD, a software routing daemon that does BGP peering for nodes. http://bird.network.cz/
BGP issues: BGP won't accept IPv6 router IDs. Need IPv4 config, even if doing IPv6 BGP.
BIRD caveats: IPv6 ECMP doesn't work right in BIRD. IPv6 loopbacks also don't work correctly.
BIRD6 templates in https://github.com/projectcalico/calicoctl/tree/master/calico_node/filesystem/etc/calico/confd/templates
epeterson [11:48 AM]
I think that configuring v6 would just work as long as you got the prefixes the node was supposed to have into calico’s etcd. Not 100% sure of the exact logic the template mechanism uses to generate templates but unless there’s something that’s parsing specific strings, it would work
The tl;dr on v6netes is that the large majority of k8s users are doing things on cloud platforms. AWS didn’t support v6 until very recently, azure and GCE still don’t. I’m looking for a PR where a bunch of the issues are being discussed.
[12:00]
One fundamental problem that is being worked for next release is that k8s doesn’t support multiple networks or address families.
[12:00]
Well, not next release, 1.6 is a stability release. But 1.7 multiple networks is on the agenda.
Lrobson says need calico-ipam set to assign IPv6 addresses (http://docs.projectcalico.org/v2.0/reference/cni-plugin/configuration):
cni_network_config: |-
    {
        "name": "k8s-pod-network",
        "type": "calico",
        "etcd_endpoints": "__ETCD_ENDPOINTS__",
        "log_level": "info",
        "ipam": {
            "type": "calico-ipam",
            "assign_ipv6": "true"
        },
        "policy": {
            "type": "k8s",
            "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
            "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
        },
        "kubernetes": {
            "kubeconfig": "/etc/cni/net.d/__KUBECONFIG_FILENAME__"
        }
    }
And… add pool here:
# The default IP Pool to be created for the cluster.
# Pod IP addresses will be assigned from this pool.
ippool.yaml: |
    apiVersion: v1
    kind: ipPool
    metadata:
      cidr: 2001:1::/64
    spec:
      ipip:
        enabled: true
      nat-outgoing: true
Trying yaml file w/o changes. All pods came up. Trying with changes above:
kubeadm join --token=849d9a.7a5aeb672e07a210 10.87.49.77
The DNS pod is not coming up. It fails with a message saying:
Failed to get IPv6 addresses for container veth
It keeps restarting. Went into the calico-node container (using docker exec -it <container-name> sh), and then looked at:
cat /etc/calico/confd/config/bird6_ipam.cfg
cat /etc/calico/confd/config/bird6_aggr.cfg
There are a bird.cfg and a bird6.cfg in the /etc/calico/confd/config/ area on this setup, and some interesting things in them: the router ID shows as 34.34.34.6, which is on the t interface and not br_api. I don't have a global IPv6 address on that I/F; it may have just picked that IP.
So, DNS container is not getting an IPv6 address. See:
[root@devstack-77 ~]# ~/go/bin/calicoctl node status
Calico process is running.
IPv4 BGP status
No IPv4 peers found.
IPv6 BGP status
No IPv6 peers found.
Eric Peterson gave me a bird.cfg file to use. Need to create a container/VM running BIRD so the cluster has something to BGP peer with, and so the Calico nodes receive pod routes from the other nodes.
Q: Do I create a VM or container with BIRD running to provide the BGP peer info?
Q: Could we use nexus9k as BGP peer?
Q: Who's versed in BGP?
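For reference, a rough sketch of what a minimal bird.cfg BGP peer stanza might look like (placeholders only, not Eric's actual file; the AS number is a private-range guess and devstack-77 is the assumed peer):
router id 10.87.49.78;
protocol bgp calico77 {
  local as 64511;
  neighbor 10.87.49.77 as 64511;
  import all;
  export all;
}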
1/30/2017 BIRD
Created a Dockerfile in ~/bird. Downloaded the Fedora binary for bird (http://bird.network.cz/?download&tdir=redhat/) to the Mac and copied it to ds-77. Added it to the Dockerfile. Issue: the install failed, as it needs ncurses-lib-6.0 and CentOS has 5.0 as the latest.
Could not add the bird repo to yum.repos.d/, as it uses FTP and I cannot seem to access the site from within the lab.
Trying Ubuntu (http://bird.network.cz/?download&tdir=debian/), and have this working, using this Dockerfile:
# base image assumed
FROM ubuntu
COPY ./apt.key /
RUN apt-key add /apt.key
RUN apt-get update -y && apt-get upgrade -y
RUN apt-get install -y bird
CMD /bin/bash
Did "docker build ." and then "docker run -it <IMAGE#> bash". Need to label the image, I guess.
Need to configure BIRD. Eric says that, because it is inside a container, I may need an external static NAT from the docker0 IP to the internal BIRD IP. His example had a static route, because the BIRD host was on a secondary subnet and needed a route to the primary VLAN IP to peer with it (BGP needs next-hop reachability; it doesn't use the route table to establish peering).
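If the static NAT route is taken, it would presumably be something along these lines (host and container IPs here are placeholders; BGP uses TCP port 179):
iptables -t nat -A PREROUTING -p tcp -d 10.87.49.78 --dport 179 -j DNAT --to-destination 172.17.0.2:179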
Q: Should I use VM instead? Could I use Bird running on host?
Q: With Bird in container, it is using docker0 I/F. Does it need IPv6 enabled? Does docker need IPv6 enabled? How?
In case a VM is needed, can use virsh:
yum install qemu-kvm qemu-img
yum install virt-manager libvirt libvirt-python libvirt-client
yum groupinstall virtualization-client virtualization-platform virtualization-tools
systemctl restart libvirtd
systemctl status libvirtd
Or can use VirtualBox. Info? Decide on a method, if doing this.
Issue with auto-detecting the IP for Calico with multiple nodes (https://github.com/projectcalico/calicoctl/issues/873).
Ref: http://docs.projectcalico.org/v2.0/reference/private-cloud/l3-interconnect-fabric
May be able to remove the other networks from the host, so Calico autodetects on the remaining network, instead of picking a network and getting 34.34.34.6, as it did in my case with the t interface.
ozdanborne [2:45 PM]
@pcm I think a few options are:
- setup your interfaces so calico autodetects the right one
- build a custom calicoctl.go / calico/node which autodetects the right one
- update the `BGPPeer`s after they come up.
a less "quick" solution would be for us to start putting together that `--can-reach` support (edited)
Going to try the first option tomorrow… setting up interfaces so that auto-detect works.
1/31/2017
Took the IP/mask off of the t and br_mgmt interfaces (how to shut them down too?), and added IPv6 address 2001:2::77/64 to the br_api interface. Did the same on the other nodes. Starting up kubeadm to see what happens.
kubeadm join --token=10f438.23104a16fdf4dd69 10.87.49.77
Did the taint and applied calico.yaml (after changing the pool to 2001:2::/64). The calico-node container still has the 34.34.34.6 IP and 2001:1::/64 for the filter. Will tear down, remove the container images from docker, and retry. Had to delete the stopped containers to get rid of the images.
kubeadm join --token=cdee1f.03c5873344ab69c1 10.87.49.77
Same issue with IP used. Copied some config files in ~ to ~/save, in case they are being used. Removed containers:
docker rm `docker ps --format "table {{.ID}}" -a | tail -n +2`
Removed images and retrying kubeadm init.
kubeadm join --token=125b8a.5858c1c63bd9f5a4 10.87.49.77
It is still using 34.34.34.6. Trying the IPv6 calico.yaml and rechecking.
kubeadm join --token=ff89b8.fa557582bb0ff77e 10.87.49.77
Looks like maybe the info is in etcd (my theory). Deleting all containers, deleting all images, and deleting the files in /var/lib/etcd/. Rebooted the system and restarted kubeadm. Instead of rebooting, may be able to do "ip route flush proto bird". Instead of deleting the dir, may be able to do "curl -sL http://<IP_OF_ETCD>:2379/v2/keys/calico?recursive=true -XDELETE". Try some time.
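Roughly, the full reset on a node looks like this (pieced together from the steps above; the docker rmi command is assumed, the notes just say the images were removed):
kubeadm reset
docker rm `docker ps --format "table {{.ID}}" -a | tail -n +2`
docker rmi `docker images -q`
rm -rf /var/lib/etcd/*
rm -rf ~/.kube
ip route flush proto bird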
Still see 34.34.34.6 for ID, and 2001:1:: in bird6.cfg files.
Lrobson/caseydavenport indicate that the IP can be set manually. Can either delete the DaemonSet that starts calico-node and manually run calico-node with the IP (http://docs.projectcalico.org/v2.0/reference/calicoctl/commands/node/run), or use calicoctl to edit the IP on the node (http://docs.projectcalico.org/v2.0/reference/calicoctl/resources/node). They suggested that I do:
calicoctl get nodes -o yaml > nodes.txt
calicoctl apply -f nodes.txt
Tried. No node info listed. Need to set ETCD_AUTHORITY env variable to 10.87.49.77:6666 and then run it. Or can set this in /etc/calico/calicoctl.cfg:
apiVersion: v1
kind: calicoApiConfig
metadata:
spec:
  datastoreType: "etcdv2"
  etcdEndpoints: "http://10.87.49.77:6666"
Ref: http://docs.projectcalico.org/v2.0/reference/calicoctl/setup/etcdv2
Applied the change, but the container still shows 34.34.34.6. It should be in the datastore, so will restart calico-node. Did a docker restart and the node has the correct IP now.
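For reference, the edited node entry in nodes.txt presumably ends up looking something like this (per the v2.0 node resource docs; the prefix lengths here are assumed):
apiVersion: v1
kind: node
metadata:
  name: devstack-77
spec:
  bgp:
    ipv4Address: 10.87.49.77/24
    ipv6Address: 2001:2::77/64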
Cleaned containers, reset kubeadm, cleared etcd, deleted the .kube dir, and flushed routes on all nodes; will restart with the IPv6 configuration. Doing init:
kubeadm join --token=b29e05.9d010309f5113284 10.87.49.77
Copied calico.yaml.v6 config file to calico.yaml and doing rest:
kubectl taint nodes --all dedicated-
kubectl apply -f calico.yaml
kubectl get pods --all-namespaces
Started up. See IPPools 192.168.0.0/16, 2001:1::/64, and 2001:2::/64. In calico-node, the router IP is 10.87.49.77 (correct). In bird6_aggr.cfg, it has blackhole, accept, and reject for 2001:1:…. In bird6_ipam.cfg, it has entries for both 2001:1::/64 and 2001:2::/64.
Did "calicoctl delete ippool 2001:1::/64" and now that is gone. The bird6_aggr.cfg has the 2001:2::… addresses now, and bird6_ipam.cfg now has the 2001:2::/64 network. So, it seems OK now.
Q: Is it OK on the other nodes?
Q: Why is DNS pod not able to get IPv6 address?