- customer = a user other than a cluster operations agent
- cluster operations = the collection of agents used to provide a viable cluster
Hostpath volume mounts are rarely needed by a customer pod. At the same time, the kubelet needs to read a certificate in order to authenticate against the cluster. We do not want someone to craft a pod request that maps in /etc/kubernetes and thereby exposes the individual kubelet's certificate. If that occurred, the individual would be able to act as the kubelet, which includes reading any secret stored in the cluster.
It is not possible right now to pass certificates to the kubelet as environment variables, so more could be done later to better secure the kubelet's credentials.
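As an illustration of the concern, a minimal pod spec along the following lines would be enough to expose the kubelet's credentials to anyone allowed to create pods, assuming hostPath volumes are not restricted. All names and the image are placeholders for illustration only:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-snoop           # hypothetical name
spec:
  containers:
  - name: shell
    image: busybox               # any image with a shell would do
    command: ["sleep", "3600"]
    volumeMounts:
    - name: kube-dir
      mountPath: /host/etc/kubernetes
      readOnly: true
  volumes:
  - name: kube-dir
    hostPath:
      path: /etc/kubernetes      # maps the node's kubelet credentials into the pod
```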
A while back, flannel required access to an etcd server in order to coordinate node pod-network CIDRs across a cluster. It made sense at the time to use the same etcd cluster that stores Kubernetes state. These credentials are no longer needed, as kraken by default uses Kubernetes annotations to coordinate CIDRs. However, the credentials are still stored on worker nodes, which leaves an attack vector for gaining access to etcd.
Currently any node in the VPC has network access to etcd services. This is not needed.
On a similar topic, it may be wise to eliminate direct SSH access to the etcd nodes. We do have a bastion host solution; it is unclear what the defaults should be.
etcd uses TLS certificates for encryption as well as for authentication and authorization. Because kubelet and etcd client certificates are currently issued from the same CA chain, a certificate created for the kubelet has the same privileges as a client certificate for etcd. This should not happen. For example, a worker certificate can query etcd directly:
```
core@ip-10-0-71-114 /etc/kubernetes/ssl $ etcdctl -D etcd.cyklops-erie.internal --cert-file worker.pem --key-file worker-key.pem --ca-file ca.pem cluster-health
member fb047f448bf5302 is healthy: got healthy result from https://10.0.110.161:2379
member 3b02f67718b53a9e is healthy: got healthy result from https://10.0.101.39:2379
member 4b33e6190005caeb is healthy: got healthy result from https://10.0.177.107:2379
member 851a2c9699ffae83 is healthy: got healthy result from https://10.0.0.121:2379
member 9b7da484fd9af966 is healthy: got healthy result from https://10.0.207.147:2379
cluster is healthy
```
Privileged mode allows a container to run effectively as root on the node. This means raw block devices can be read, subverting much of a node's security. Privileged mode is needed for some containers, including the kubelet and, notably, Docker-in-Docker, but the vast majority of pods do not require it.
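For reference, opting into privileged mode is a single field in the pod spec, so nothing currently stops a customer pod from requesting it. A minimal sketch, with hypothetical names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: privileged-example       # hypothetical name
spec:
  containers:
  - name: shell
    image: busybox
    command: ["sleep", "3600"]
    securityContext:
      privileged: true           # grants access to the node's devices, e.g. raw block devices
```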
Currently any pod can specify an affinity for a specific node label. This means a pod can decide it prefers to run on a control plane node. That should not be possible for customer concerns. For that matter, it means "dedicated nodes" are not truly dedicated.
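For example, assuming control plane nodes carry a role label such as node-role.kubernetes.io/master (the exact label is cluster-specific), a customer pod could target them with nothing more than:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example         # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/master   # placeholder for the control plane label
            operator: Exists
  containers:
  - name: shell
    image: busybox
    command: ["sleep", "3600"]
```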
Currently there are no restrictions on which tolerations a pod may specify. A common practice is to taint control plane nodes with a specific taint indicating that they are control plane nodes. However, an individual could declare that their pod tolerates all taints, or more specifically the taint used to mark a control plane node. This is not acceptable.
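As a sketch of the problem: a toleration with operator: Exists and no key matches every taint, so a pod like the following (names hypothetical) would happily schedule onto tainted control plane nodes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tolerate-everything      # hypothetical name
spec:
  tolerations:
  - operator: Exists             # no key or effect given, so this tolerates all taints
  containers:
  - name: shell
    image: busybox
    command: ["sleep", "3600"]
```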
Only control plane nodes need access to etcd, so we should restrict access to control plane nodes only. A further review would be needed if and when self-hosted etcd occurs.
We should use separate CA chains for each concern. This way a certificate generated for kubernetes authentication cannot be swapped in as a certificate for etcd interaction and vice-versa.
The flannel etcd credentials on worker nodes are no longer needed; drop them.
Implementing this admission controller would allow us to manage which node labels a pod can specify on a per-namespace level (sketched below). There is a cluster-wide specification available, however it must be configured in a physical file and delivered to each master node. This would allow us to lock down node pools better.
We would need to become better at namespace management if we use this
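A rough sketch of how this could look, assuming a hypothetical customer namespace and a hypothetical nodepool=customer label on the worker pool; PodNodeSelector reads the scheduler.alpha.kubernetes.io/node-selector annotation on the namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: customer-team-a          # hypothetical namespace
  annotations:
    # every pod created in this namespace is forced onto nodes carrying this label
    scheduler.alpha.kubernetes.io/node-selector: "nodepool=customer"
```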
Implementing this admission controller would allow us to control which tolerations pods can specify on a per-namespace level (sketched below). There are both default tolerations and a whitelist. A cluster-wide option exists, however the documentation does not explain how to configure it; more research would be needed to validate it. This would allow us to lock down node pools better.
We would need to become better at namespace management if we use this
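A sketch under the same assumptions (hypothetical namespace and a hypothetical nodepool taint); PodTolerationRestriction reads the default and whitelist tolerations from namespace annotations:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: customer-team-a          # hypothetical namespace
  annotations:
    # tolerations merged into every pod created in this namespace
    scheduler.alpha.kubernetes.io/defaultTolerations: '[{"key": "nodepool", "operator": "Equal", "value": "customer", "effect": "NoSchedule"}]'
    # the only tolerations pods in this namespace are allowed to declare
    scheduler.alpha.kubernetes.io/tolerationsWhitelist: '[{"key": "nodepool", "operator": "Equal", "value": "customer", "effect": "NoSchedule"}]'
```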
This would require both an admission controller implementation and a firm understanding of how to deliver rights to the various concerns (customer and cluster operations); a sketch follows below. By default we should probably disallow privileged containers, and almost certainly hostPath volume types, but there may be other logical restrictions we should implement.
That being said, we do need to allow privileged containers for things like CI/CD processes that use Docker. Docker-in-Docker still requires privileged mode, although that may eventually be dropped now that runc can run rootless.
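A minimal sketch of what a default customer-facing policy might look like, assuming PodSecurityPolicy is the mechanism we choose (the API group and required fields vary by Kubernetes version, and the name is hypothetical):

```yaml
apiVersion: extensions/v1beta1   # policy/v1beta1 in newer releases
kind: PodSecurityPolicy
metadata:
  name: customer-default          # hypothetical name
spec:
  privileged: false               # customers do not get privileged containers by default
  volumes:                        # hostPath is deliberately absent from the allowed types
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim
  - downwardAPI
  - projected
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
```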
This looks great. I would make one small change to make something easier to read.