| layout | title | date | categories | tags | image | author | published |
|---|---|---|---|---|---|---|---|
| post | Kubernetes Honey Token | 2021-01-19 | blog | kubernetes security honey token cloud jwt serviceaccount | /img/previews/honeyjar.jpeg | Brad Geesaman | true |
While publicly disclosed Kubernetes-related security breaches have thankfully been infrequent, we've seen environments where an attacker who gained access to the cluster and persisted using non-destructive means would likely go unnoticed for a long time. Implementing defense-in-depth, thorough logging, and detailed metrics can be complex and time consuming. In the meantime, what if there was an easy, high-confidence way to be alerted if a malicious entity was present?
If you are familiar with the concept of Canary Tokens, they are "a free, quick, painless way to help defenders discover they've been breached (by having attackers announce themselves.)". Lenny Zeltser blogged about setting up and using honeytokens, and it really helped solidify the concept and approach. That got us thinking about how we might be able to apply this concept to Kubernetes clusters.
We came up with the following criteria to help guide our thinking about this problem:
- The "token" should be extremely quick and easy to implement and also to remove cleanly. Under 5 minutes is the goal.
- Introducing the "token" into an environment should not interact negatively with scale, availability, or operations.
- Its use should indicate with extremely high confidence that malicious activity just happened, so it should be configured to avoid false positives and accidental triggers.
- Its placement should catch the reconnaissance, discovery, and enumeration techniques most frequently used during post-exploitation efforts.
- The malicious entity should ideally never know that they triggered an alert by using the "token".
Therefore, we propose using an artisanally crafted Kubernetes Service Account token as a Honeytoken.
Referencing the Kubernetes Threat Matrix blog post from the Azure Security Center team, in combination with our own security testing and CTF experience working inside Kubernetes clusters, two tactics stand out as highly likely to be used or attempted during most post-exploitation activities:
- Credential Access > List K8s Secrets
- Credential Access > Access container service account

Our other blog post on the power of Kubernetes RBAC LIST highlighted the interaction between Kubernetes `secrets` access and Kubernetes `serviceaccount` token storage, so we can potentially cover both via a single, canary `serviceaccount` token.
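To make that pathway concrete, here is a rough sketch of the enumeration an attacker with secrets access might perform; the secret name below is a placeholder, since `kubernetes.io/service-account-token` secrets get generated name suffixes:

```shell
# Enumerate secrets in kube-system, looking for serviceaccount token secrets
kubectl -n kube-system get secrets

# Extract the JWT from an interesting-looking token secret
# (<token-secret-name> is a placeholder; the token lives under .data.token, base64-encoded)
kubectl -n kube-system get secret <token-secret-name> -o jsonpath='{.data.token}' | base64 -d
```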
Revisiting the `serviceaccount` token concept against our desired criteria:
- Quick and easy to implement: Creating a new `serviceaccount` takes seconds, and the controller will generate the never-expiring JWT token for us as a `secret` in the desired `namespace` (see the sketch just after this list).
- No negative effects: Creating a single `serviceaccount` incurs extremely low risk and requires very low overhead.
- High confidence of malicious activity: Assuming it lives in a `namespace` like `kube-system`, and administrators are not in the habit of randomly attaching `serviceaccounts` to workloads in `kube-system` without peer review, any in-cluster use would be highly suspect.
- Common techniques and pathways: The `kube-system` `namespace`, where many system workloads with higher privileges run, is a natural target for enumeration at a minimum and active exploitation in most cases, so placing the token there puts it directly in the attacker's path.
- Silent alerting: Shipping the Kubernetes Audit Logs from interactions with the API server to an off-cluster location happens silently in the background, so there would be no direct feedback loop to an attacker.
Which `serviceaccount` token would an attacker go for if they found it? Our first thoughts are common workloads that are typically granted `cluster-admin` permissions to do what they need to do and have a history of allowing a path for easy escalation. Helm v2 is at the top of that list because it leverages an in-cluster `deployment` named `Tiller` to install Helm charts. Most admins grant `cluster-admin` to its locally-mounted `serviceaccount` token so it can interact with the API server, and Tiller listens on `TCP/44134` without requiring authentication by default. So, if an attacker compromised a `pod` or a lower-privileged credential to a cluster, looking for and leveraging the `serviceaccount` token attached to that `Tiller` deployment would be an attractive approach.
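As a rough illustration of what that looks like from the attacker's side (the API server address and token secret name below are placeholders), any use of the stolen token authenticates as `system:serviceaccount:kube-system:tiller`, which is exactly what we will alert on later:

```shell
# Pull the tiller token out of its generated secret (the name suffix varies per cluster)
TOKEN=$(kubectl -n kube-system get secret <tiller-token-secret> -o jsonpath='{.data.token}' | base64 -d)

# Every successful request made with this token is attributed to
# system:serviceaccount:kube-system:tiller in the API server audit logs
kubectl --server=https://<api-server> --token="$TOKEN" --insecure-skip-tls-verify=true get secrets -A
```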
Note: If you are still running Helm v2 with Tiller, you should read this blog and migrate to Helm v3 ASAP.
Installing a realistic `Tiller` deployment with a dedicated `serviceaccount` mounted takes a few seconds:
$ kubectl create sa -n kube-system tiller
serviceaccount/tiller created
$ helm init --service-account tiller --tiller-namespace kube-system --stable-repo-url https://charts.helm.sh/stable
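As a quick sanity check (these are the default names `helm init` creates, but verify them in your own cluster), confirm the decoy deployment and the auto-generated token secret are in place:

```shell
# The Tiller deployment and its serviceaccount token secret should both exist in kube-system
kubectl -n kube-system get deployment tiller-deploy
kubectl -n kube-system get sa tiller -o jsonpath='{.secrets[0].name}'
```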
To remove `tiller` and the `serviceaccount`:
$ helm reset --force
$ kubectl delete sa -n kube-system tiller
Detecting the token's use requires that your cluster is configured to send the proper audit logs to a central location, and that the logging facility can parse the `subject` of any successful API server request. For GKE, this means enabling project data access logging and setting up metrics and alerts that fire when logs matching the `tiller` `serviceaccount` name arrive. For EKS, this means enabling the EKS Control Plane Audit Logs, sending them to CloudWatch Logs, and configuring similar filters and alerts.
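As a rough sketch of what those filters might look like (the log group name, field paths, and metric names here are illustrative and should be validated against your environment and the respective provider documentation):

```shell
# GKE: a log-based metric over the Kubernetes data access audit logs
# (assumes Data Access logging for Kubernetes Engine is enabled on the project)
gcloud logging metrics create tiller-honeytoken-use \
  --description="Tiller honeytoken serviceaccount was used against the API server" \
  --log-filter='resource.type="k8s_cluster" AND protoPayload.authenticationInfo.principalEmail="system:serviceaccount:kube-system:tiller"'

# EKS: a CloudWatch metric filter over the control plane audit log group
# (replace <cluster-name>; audit events carry the authenticating user in user.username)
aws logs put-metric-filter \
  --log-group-name "/aws/eks/<cluster-name>/cluster" \
  --filter-name "tiller-honeytoken-use" \
  --filter-pattern '{ $.user.username = "system:serviceaccount:kube-system:tiller" }' \
  --metric-transformations metricName=TillerHoneytokenUse,metricNamespace=Security,metricValue=1
```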
Now, any time the `tiller` `serviceaccount` token is used to authenticate successfully against the API server, it indicates that malicious access has already been obtained with sufficient permissions to read the contents of a `secret` in the `kube-system` `namespace`, and that we are capturing the correct audit logs and have filters in place to send a high-priority alert to the right team.
Depending on your threat model, exposing Tiller's unauthenticated gRPC port on `tcp/44134` inside the cluster may not be desirable, as it may give an unauthenticated attacker a valid credential that places them in the `system:authenticated` RBAC group. Implementing client certificate authentication or a `NetworkPolicy` preventing ingress to the `tiller` pod should mitigate this avenue, but it is a trade-off: it might prevent an attacker from gaining access to Tiller's `serviceaccount` altogether, meaning they never trip our custom detection mechanism.
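If you decide that trade-off is acceptable and want to close off the port, a minimal `NetworkPolicy` sketch might look like the following, assuming the default `app: helm` and `name: tiller` labels that `helm init` applies and a CNI that actually enforces `NetworkPolicy`:

```shell
# Deny all ingress to the tiller pod; adjust the labels to match your deployment
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-tiller-ingress
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      app: helm
      name: tiller
  policyTypes:
  - Ingress
EOF
```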
Finally, it's important to state that this approach is experimental and that you should fully validate it in a test environment before considering it to meet your definition of "production ready". That said, we hope this fosters discussion and other creative solutions along these lines. We'd love to hear what you think, so feel free to reach out to us.
Blog photo: Unsplash