.NET Core Engineering manages a lot of secrets, and it is difficult to both manage and reason about them.
What follows is an overview of why we need a process for secret management, followed by the scope of the secrets that .NET Core Engineering manages.
-
The experience associated with managing our secrets is error-prone because it requires manual developer steps.
-
Conventions for secrets are loosely defined, and there is no strict enforcement of them.
-
Metadata for a secret is not sufficient to identify the intent of the secret
-
Not every secret is accounted for; it is easy to create a secret that is not easily tracked
-
We don't know where our secrets are being used, how often, or even if they are still being used at all; this means that rotating secrets can have unknown consequences
-
We don't know if we have duplicated secrets
-
-
We need to be able to track our secrets:
-
who is using our secrets
-
where secrets are being used
-
how often secrets are being used
-
-
We need to provide a uniform way to create secrets and to attach metadata to each secret
-
Common metadata includes:
-
Intent
-
Last modified datetime
-
Last modified by
-
Secret type
-
Expiration datetime
-
-
-
We need a way to prevent, or alert when, our services access secrets we don't manage and/or bad secrets.
-
We should be able to detect duplication of a secret within the same key vault
-
We do not need to know every secret that a particular service uses, but we do need to be able to automatically rotate any/all secrets we manage
-
Secret consumers need to have live access to required secrets
-
We need to be able to validate, at any moment, that specific (or all) secrets we manage are valid
-
Do not design a system in which adding support for a new secret type requires significant overhead
-
We need to be able to manage (rotate, delete, expire) any secret we own using a common system
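As a minimal sketch of two of the requirements above (uniform metadata and duplicate detection within one vault), the following assumes an in-memory dictionary standing in for a key vault. `SecretMetadata`, `is_expired`, and `find_duplicates` are hypothetical names chosen for illustration, not part of any existing tool.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecretMetadata:
    """Hypothetical record for the common metadata fields listed above."""
    intent: str
    last_modified: datetime
    last_modified_by: str
    secret_type: str
    expires: datetime

    def is_expired(self, now: datetime) -> bool:
        # A secret counts as expired once its expiration time has been reached.
        return now >= self.expires


def find_duplicates(vault: dict[str, str]) -> list[list[str]]:
    """Return groups of secret names in one vault that share the same value.

    `vault` maps secret name -> secret value (an in-memory stand-in).
    Values are compared by SHA-256 digest so raw values are not kept around.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, value in vault.items():
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        by_digest[digest].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```

A real implementation would read names, values, and metadata from the vault itself; how that metadata is stored is left open here.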
There is a large variety of secret types that we manage in .NET Core Engineering. We do not intend to move every secret we own to the new model, but we will move the secrets related to the primary services we own (Helix services, Arcade services, etc.), and we will not preclude managing other secrets. Below are the primary types of secrets that we manage.
These secret types can be rotated by automation.
-
Azure Storage connection string or SAS token: Easily rotated using Azure APIs. Metadata needs to include the account/resource and required permissions.
-
Service Bus connection string: Easily rotated using Azure APIs. Metadata needs to include the namespace and required permissions.
-
Event Hub connection string: Easily rotated using Azure APIs. Metadata needs to include the namespace and required permissions.
-
SQL Database connection string: Can be rotated, assuming the rotation service can get permissions. Metadata needs to include the database server/name and required permissions.
-
Random base64 "key": The only instance of this is a key used to encode job cancellation tokens; it can be rotated very easily.
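For the random base64 key, generating a replacement value is straightforward. The sketch below uses Python's standard `secrets` and `base64` modules; the function name `new_base64_key` and the 32-byte default length are assumptions, since the source does not state the key size in use.

```python
import base64
import secrets


def new_base64_key(num_bytes: int = 32) -> str:
    """Return `num_bytes` of cryptographically secure randomness, base64-encoded.

    The default of 32 bytes is an illustrative choice, not a documented requirement.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Rotation would then consist of storing the new value and making sure consumers pick it up; that part depends on the rotation system and is not shown.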
These secrets cannot be rotated without human intervention.
-
Kusto connection string: This is just a service principal secret; we should probably use MSI for this.
-
Azure DevOps access token: Metadata should include the required organizations and scopes for the token.
-
GitHub access token: Metadata needs to include the required scopes and accessible repos.
-
Maestro/Helix access tokens: Standard token.
-
GitHub app secrets: These can't be rotated without disruption.
-
Domain account password: Metadata should say which account it is.
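One way to keep new secret types cheap to add is a small registry that records, per type, whether it can be rotated by automation and which metadata it requires. The sketch below encodes the types listed above. The type keys, the metadata field names, and the `missing_metadata` helper are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Mapping


@dataclass(frozen=True)
class SecretType:
    auto_rotatable: bool
    required_metadata: FrozenSet[str] = frozenset()


# Keys and metadata field names are illustrative, not an existing schema.
SECRET_TYPES: Dict[str, SecretType] = {
    "azure-storage": SecretType(True, frozenset({"account", "permissions"})),
    "service-bus": SecretType(True, frozenset({"namespace", "permissions"})),
    "event-hub": SecretType(True, frozenset({"namespace", "permissions"})),
    "sql": SecretType(True, frozenset({"database", "permissions"})),
    "random-base64-key": SecretType(True),
    "kusto": SecretType(False),
    "azdo-token": SecretType(False, frozenset({"organizations", "scopes"})),
    "github-token": SecretType(False, frozenset({"scopes", "repos"})),
    "maestro-helix-token": SecretType(False),
    "github-app": SecretType(False),
    "domain-account": SecretType(False, frozenset({"account"})),
}


def missing_metadata(type_name: str, metadata: Mapping[str, str]) -> List[str]:
    """Return the required metadata fields that `metadata` does not provide."""
    spec = SECRET_TYPES[type_name]
    return sorted(spec.required_metadata - set(metadata))
```

Under this design, supporting a new secret type means adding one registry entry rather than changing the validation logic.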
Who uses our secrets? For this document, the users of our secrets (actors) are defined as those entities that perform "actions" associated with our secrets.
-
First responders - responsible for rotating secrets
-
.NET Core Engineering team - create secrets for use in .NET Core Engineering services
-
.NET Core teams / partners - create secrets that are used in AzDo builds
-
.NET Core Engineering services - consume secrets created by other actors
-
AzDo builds - consume secrets created by other actors
-
Create - create a secret, store secret in Key Vault, add secret metadata
-
Manage - rotate a secret, expire a secret, delete a secret
- Rotation typically involves changing a secret's value in an application and then updating the value in Key Vault so that the correct value is associated with the secret.
-
Use - use a secret in an application, or via key vault access
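The two-step shape of rotation described above (change the value at its source, then update the vault) can be sketched as follows. The vault is an in-memory dictionary, and `rotate` and the `regenerate` callback are hypothetical names. A real system would call the owning service and the vault's API instead.

```python
from datetime import datetime
from typing import Callable, Dict, Tuple

# Stand-in for a key vault: secret name -> (value, last modified time).
Vault = Dict[str, Tuple[str, datetime]]


def rotate(vault: Vault, name: str, regenerate: Callable[[], str], now: datetime) -> None:
    """Rotate one secret: regenerate it at its source, then update the vault."""
    if name not in vault:
        raise KeyError(f"secret {name!r} is not managed in this vault")
    # Step 1: change the value in the owning application or resource.
    new_value = regenerate()
    # Step 2: store the new value so the vault matches what the application expects.
    vault[name] = (new_value, now)
```

If step 1 raises, the vault entry is left untouched, so consumers keep reading the old value.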
The model for when secrets will be rotated is TBD. That is, will we rotate secrets on demand using automation, on a regular schedule, or by some other model?