@soellman
Last active July 22, 2019 14:10
Easy Kubernetes on CoreOS

Easy Kubernetes Installation on CoreOS

At Timeline Labs, we are continuously looking at new technologies to see what fits our needs. We are especially excited about Kubernetes from Google to manage our services atop Docker and CoreOS.

This process for installing Kubernetes on CoreOS uses Flannel for Kubernetes networking and should be cloud provider agnostic. To deploy the Kubernetes master functionality into the cluster, it uses fleetctl.

Thanks to Kelsey Hightower and his blog posts! They served as a great starting point for this process.

How do I get this running?

Add the cloud config below to your own and bring up your cluster using a CoreOS version that ships Docker 1.3 (currently v472.0.0 in the alpha channel). During the initial boot, the download-kubernetes and download-flannel units download binaries from the latest project release and use those. Once fleetctl list-machines shows your machines, fleetctl start the three Kubernetes master units from the Fleet Units section below. You can then ssh to the machine where the Kubernetes master units were deployed and go to town with the kubecfg command.
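
The start-up step can be sketched as a small shell helper. This is a minimal sketch, assuming the three unit files from the Fleet Units section are saved in the current directory as apiserver.service, controller-manager.service and scheduler.service. The function name start_master_units and the FLEETCTL override are hypothetical, added only so the commands can be previewed in a dry run.

```shell
# Command used to talk to fleet. Override it (e.g. FLEETCTL="echo fleetctl")
# to preview the commands without touching the cluster. (Hypothetical knob.)
FLEETCTL="${FLEETCTL:-fleetctl}"

# Start the master units. apiserver goes first because the other two units
# are pinned to its machine via X-ConditionMachineOf=apiserver.service.
start_master_units() {
  for unit in apiserver controller-manager scheduler; do
    $FLEETCTL start "${unit}.service" || return 1
  done
}
```

Setting FLEETCTL="echo fleetctl" before calling start_master_units prints the three fleetctl start commands instead of running them.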

If you want to use private Docker repos, kubelet will read your credentials from /home/core/.dockercfg.

Rebuilding Flannel and Kubernetes

If you want a newer version than the one recorded in /etc/flannel-version and /etc/kubernetes-version, the build-flannel and build-kubernetes units will do the heavy lifting for you. They build from the master branch of their respective git repositories and write the commit SHA they built into /etc/flannel-version and /etc/kubernetes-version.

To build a new version of either Flannel or Kubernetes, touch /etc/flannel-build and/or /etc/kubernetes-build, then restart the matching build service with systemctl:

sudo touch /etc/flannel-build
sudo systemctl restart build-flannel
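
The same two steps apply to Kubernetes. As a hedged sketch, the helper below wraps them for either component and then prints the recorded version. rebuild_component and the RUN variable are hypothetical names introduced here, and the sketch assumes the build units are installed as shown in the cloud config. Because those units are Type=oneshot, systemctl restart should not return until the build has finished.

```shell
# Prefix for every command; set RUN=echo to print instead of execute.
# (RUN is a hypothetical dry-run switch, not part of the original units.)
RUN="${RUN:-}"

rebuild_component() {
  case "$1" in
    flannel|kubernetes) ;;
    *) echo "usage: rebuild_component flannel|kubernetes" >&2; return 2 ;;
  esac
  # Create the flag file checked by ConditionPathExists in the build unit.
  $RUN sudo touch "/etc/$1-build" || return 1
  # Run the build unit.
  $RUN sudo systemctl restart "build-$1" || return 1
  # Show the commit SHA the build unit recorded.
  $RUN cat "/etc/$1-version"
}
```

For example, rebuild_component kubernetes touches /etc/kubernetes-build, restarts build-kubernetes, and prints /etc/kubernetes-version.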

The build units will restart Flannel or the Kubernetes client units, but the Kubernetes master units will need to be stopped and started with fleetctl:

for service in apiserver controller-manager scheduler ; do
  fleetctl stop ${service}
  fleetctl start ${service}
done

Kubernetes and Cluster Changes

Kubernetes is still in the early stages of development and doesn't yet handle changes to the cluster seamlessly. The apiserver unit builds its --machines list from fleetctl list-machines only when it starts, so machines that join later are not picked up automatically. If you find that the master is not seeing all your cluster members as minions (kubecfg list minions), restart the apiserver service (fleetctl stop apiserver ; fleetctl start apiserver).
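
One way to spot the mismatch is to compare the machines fleet knows about with the minions Kubernetes reports. This is a minimal sketch, assuming both lists are available as newline-separated IP addresses (minions are identified by IP, as in the --machines list passed to the apiserver). The function name missing_minions is hypothetical, and extracting the minion names from kubecfg list minions output is left out because its exact format is not shown here.

```shell
# Print every machine IP from $1 that does not appear in the minion list $2.
# Both arguments are newline-separated lists.
missing_minions() {
  printf '%s\n' "$1" | while IFS= read -r ip; do
    [ -n "$ip" ] || continue
    # -x: whole-line match, -F: fixed string (dots are not wildcards).
    printf '%s\n' "$2" | grep -qxF -- "$ip" || printf '%s\n' "$ip"
  done
}
```

The machine list can be produced with fleetctl list-machines --no-legend | awk '{ print $2 }', the same pipeline the apiserver unit uses. Any output from missing_minions suggests the apiserver needs a restart.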

If a machine has rebooted itself for updates and some of your pods are now inexplicably in the "waiting" state, try restarting the scheduler (fleetctl stop scheduler ; fleetctl start scheduler). Expect instability! Kubernetes is still being developed.

And if one of your nodes doesn't seem to be operating correctly, restart the proxy and kubelet units on that node (sudo systemctl restart proxy; sudo systemctl restart kubelet).

Caveats

The download-kubernetes unit is brittle and may break if the Kubernetes releases offer a differently formatted file in the future. The download-flannel unit pulls from a hosted tarball that I've compiled from the latest release.
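
To make the brittleness concrete, here is the URL-extraction part of the download-kubernetes pipeline as a standalone filter. It is a sketch, not the unit's exact behavior: the function name is hypothetical, and the trailing-comma stripping and head -n 1 are additions that assume the newest release is listed first in the API response.

```shell
# Read a GitHub releases JSON document on stdin and print the first
# asset download URL found in it.
extract_first_download_url() {
  grep browser_download_url |   # keep only the asset URL lines
    awk '{ print $2 }' |        # second whitespace-separated field is the value
    sed 's/[",]//g' |           # drop JSON quotes and any trailing comma
    head -n 1                   # keep a single URL (addition, see lead-in)
}
```

The filter depends on the JSON being pretty-printed with one key per line, which is exactly the kind of formatting assumption that can break when the release layout changes.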

Additionally, since the build units pull from the master branch, the interfaces to Flannel and Kubernetes may change and break the units. For instance, a Kubernetes commit shortly after v0.4 broke the apiserver.service. BEWARE!

Cloud Config

Add these units and this file to your cloud config.

#cloud-config

coreos:
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: kubernetes.target
      command: start
      enable: true
      content: |
        [Unit]
        Description=Kubernetes

        [Install]
        WantedBy=multi-user.target
    - name: flannel.service
      enable: true
      content: |
        [Unit]
        Description=Kubernetes Mesh Networking
        After=etcd.service

        [Service]
        EnvironmentFile=-/etc/environment
        ExecStartPre=/bin/bash -c "/usr/bin/etcdctl ls / >/dev/null"
        ExecStartPre=-/usr/bin/etcdctl mk /coreos.com/network/config '{"Network":"10.244.0.0/16"}'
        ExecStart=/usr/bin/bash -c "exec /opt/bin/flanneld -ip-masq=true -iface=${COREOS_PRIVATE_IPV4}"
        Restart=always
        RestartSec=2

        [Install]
        WantedBy=kubernetes.target
    - name: cadvisor.service
      enable: true
      content: |
        [Unit]
        Description=cAdvisor Service
        Documentation=https://github.com/google/cadvisor
        After=docker.service
        Requires=docker.service

        [Service]
        TimeoutStartSec=20m
        Restart=always
        ExecStartPre=-/usr/bin/docker kill cadvisor
        ExecStartPre=-/usr/bin/docker rm -f cadvisor
        ExecStartPre=/usr/bin/docker pull google/cadvisor:latest
        ExecStart=/usr/bin/docker run --name cadvisor \
          --volume=/var/run:/var/run:rw \
          --volume=/sys:/sys:ro \
          --volume=/var/lib/docker/:/var/lib/docker:ro \
          --publish=4194:4194 \
          google/cadvisor:latest --logtostderr --port=4194
        ExecStop=/usr/bin/docker stop -t 2 cadvisor
        User=core

        [Install]
        WantedBy=kubernetes.target
    - name: proxy.service
      enable: true
      content: |
        [Unit]
        Description=Kubernetes Proxy
        Documentation=https://github.com/GoogleCloudPlatform/kubernetes

        [Service]
        ExecStart=/opt/bin/proxy --etcd_servers=http://127.0.0.1:4001 --logtostderr=true
        Restart=always
        RestartSec=2

        [Install]
        WantedBy=kubernetes.target
    - name: kubelet.service
      enable: true
      content: |
        [Unit]
        Description=Kubernetes Kubelet
        Documentation=https://github.com/GoogleCloudPlatform/kubernetes

        [Service]
        EnvironmentFile=/etc/environment
        WorkingDirectory=/home/core
        ExecStart=/bin/bash -c "exec /opt/bin/kubelet \
          --address=0.0.0.0 \
          --port=10250 \
          --hostname_override=$COREOS_PRIVATE_IPV4 \
          --etcd_servers=http://127.0.0.1:4001 \
          --logtostderr=true"
        Restart=always
        RestartSec=2

        [Install]
        WantedBy=kubernetes.target
    - name: download-flannel.service
      enable: true
      content: |
        [Unit]
        ConditionFileNotEmpty=!/etc/flannel-version
        Description=Download Flannel Binary
        Documentation=https://github.com/coreos/flannel
        Wants=network-online.target
        Before=flannel.service
        After=network-online.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=-/usr/bin/mkdir -p /opt/bin
        ExecStart=/bin/bash -c "curl -sL https://s3.amazonaws.com/soellman/flannel-binaries-latest.tgz | tar -C / -zxf -"

        [Install]
        WantedBy=kubernetes.target
    - name: build-flannel.service
      enable: true
      content: |
        [Unit]
        ConditionPathExists=/etc/flannel-build
        Description=Build Flannel
        Wants=network-online.target
        After=network-online.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/usr/bin/mkdir -p /opt/bin
        ExecStart=/usr/bin/touch /etc/flannel-version
        ExecStart=/usr/bin/docker run --name build-flannel --rm \
          -v /opt/bin:/opt/bin \
          -v /etc/flannel-version:/etc/flannel-version \
          google/golang:stable /bin/bash -c \
          "set -e ; git clone --depth=1 https://github.com/coreos/flannel ; cd flannel ; ./build ; cp -f bin/flanneld /opt/bin ; git rev-parse master > /etc/flannel-version"
        ExecStart=/bin/rm /etc/flannel-build
        ExecStart=/usr/bin/systemctl restart flannel

        [Install]
        WantedBy=kubernetes.target
    - name: download-kubernetes.service
      enable: true
      content: |
        [Unit]
        ConditionFileNotEmpty=!/etc/kubernetes-version
        Description=Download Kubernetes Binaries
        Documentation=https://github.com/GoogleCloudPlatform/kubernetes
        Wants=network-online.target
        After=network-online.target
        Before=kubelet.service
        Before=proxy.service

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=-/usr/bin/mkdir -p /opt/bin
        ExecStart=/usr/bin/mkdir -p /tmp/k8s-dl
        ExecStart=/bin/bash -c "curl -sL $(curl -sL https://api.github.com/repos/GoogleCloudPlatform/kubernetes/releases | grep browser_download_url | awk '{ print $2 }' | sed 's/\"//g') | tar -C /tmp/k8s-dl -zxf -"
        ExecStart=/bin/bash -c "cp -f /tmp/k8s-dl/kubernetes/platforms/linux/amd64/* /opt/bin"
        ExecStart=/usr/bin/rm -rf /tmp/k8s-dl
        ExecStart=/bin/bash -c "/opt/bin/kubecfg -version | awk '{ print $2 }' > /etc/kubernetes-version"

        [Install]
        WantedBy=kubernetes.target
    - name: build-kubernetes.service
      enable: true
      content: |
        [Unit]
        ConditionPathExists=/etc/kubernetes-build
        Description=Build Kubernetes Binaries
        Documentation=https://github.com/GoogleCloudPlatform/kubernetes
        Wants=network-online.target
        After=network-online.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=-/usr/bin/mkdir -p /opt/bin
        ExecStart=/usr/bin/touch /etc/kubernetes-version
        ExecStart=/usr/bin/docker run --name build-kubernetes --rm \
          -v /opt/bin:/opt/bin \
          -v /etc/kubernetes-version:/etc/kubernetes-version \
          google/golang:stable /bin/bash -c \
            "set -e ; git clone --depth=1 https://github.com/GoogleCloudPlatform/kubernetes ; cd kubernetes ; make ; cp -f _output/go/bin/* /opt/bin ; git rev-parse master > /etc/kubernetes-version"
        ExecStart=/bin/rm /etc/kubernetes-build
        ExecStart=/usr/bin/systemctl restart kubelet
        ExecStart=/usr/bin/systemctl restart proxy
        ExecStart=-/usr/bin/systemctl restart apiserver
        ExecStart=-/usr/bin/systemctl restart controller-manager
        ExecStart=-/usr/bin/systemctl restart scheduler

        [Install]
        WantedBy=kubernetes.target


write_files:
  - path: /etc/systemd/system/docker.service.d/10-flannel.conf
    owner: root:root
    permissions: 0644
    content: |
      [Unit]
      Wants=flannel.service
      After=flannel.service

      [Service]
      EnvironmentFile=/run/flannel/subnet.env
      ExecStartPre=-/usr/bin/ifconfig docker0 down
      ExecStartPre=-/usr/sbin/brctl delbr docker0
      ExecStart=
      ExecStart=/usr/bin/docker -d -s=btrfs -H fd:// --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} --iptables=false --ip-masq=false
      Restart=on-failure
      RestartSec=2
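
The Docker drop-in above reads FLANNEL_SUBNET and FLANNEL_MTU from /run/flannel/subnet.env, so Docker can only pick up its bridge settings after flanneld has written that file. The check below is a small, hedged sketch for debugging that dependency. check_subnet_env is a hypothetical helper, and it assumes the file contains plain KEY=value lines that a shell can source.

```shell
# Verify that the flannel environment file exists and defines the two
# variables the Docker drop-in expands. Defaults to the path used above.
check_subnet_env() {
  env_file="${1:-/run/flannel/subnet.env}"
  [ -r "$env_file" ] || { echo "missing: $env_file" >&2; return 1; }
  (
    # Source in a subshell so the variables do not leak into the caller.
    . "$env_file"
    [ -n "$FLANNEL_SUBNET" ] && [ -n "$FLANNEL_MTU" ] &&
      echo "docker would get --bip=$FLANNEL_SUBNET --mtu=$FLANNEL_MTU"
  )
}
```

If the function reports a missing file, checking the flannel unit with systemctl status flannel is a reasonable next step.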

Fleet Units

apiserver.service:

[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/bin/bash -c "/opt/bin/apiserver \
  --address=127.0.0.1 \
  --port=8080 \
  --etcd_servers=http://127.0.0.1:4001 \
  --machines=$(fleetctl list-machines --no-legend | awk -vORS=, '{ print $2 }' | sed 's/,$/\\n/') \
  --logtostderr=true"
Restart=always
RestartSec=2
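
The --machines value in the unit above is a comma-separated list of machine IPs taken from the second column of fleetctl list-machines. The sketch below shows the same transformation with paste, which avoids the trailing-comma cleanup. It is an alternative formulation rather than the unit's exact pipeline, and machines_csv plus the sample rows in the usage note are hypothetical.

```shell
# Read "fleetctl list-machines --no-legend" style rows on stdin and print
# the second column (the IP) of every row as one comma-separated line.
machines_csv() {
  awk '{ print $2 }' | paste -sd, -
}
```

For example, printf 'aaaa...\t10.0.0.1\t-\nbbbb...\t10.0.0.2\t-\n' | machines_csv prints 10.0.0.1,10.0.0.2. Because the list is computed once when the apiserver starts, it reflects cluster membership at that moment only.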

controller-manager.service:

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/bin/controller-manager \
  --master=127.0.0.1:8080 \
  --logtostderr=true
Restart=always
RestartSec=2

[X-Fleet]
X-ConditionMachineOf=apiserver.service

scheduler.service:

[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/bin/scheduler \
  --master=127.0.0.1:8080 \
  --logtostderr=true
Restart=always
RestartSec=2

[X-Fleet]
X-ConditionMachineOf=apiserver.service
@soellman

I've updated this to reflect new changes. Thanks!

@soellman

Updated again to download binaries on first boot but still giving the option to build new versions.

@soellman

Updated again. This now requires Docker 1.3, because the flannel drop-in for docker passes --ip-masq=false.
