@jsturtevant
Last active August 25, 2022 21:10
spiffe-spire research

I've got my raw notes in this gist. I will summarize the outcomes here:

Pros and cons comparing SPIFFE to the current service identity mechanism in OSM.

The aspects of SPIFFE that we are primarily interested in are the X.509 SVIDs and the SPIFFE ID.

The SPIFFE specs don't specify how you use this identity, only how it is encoded in the X.509 certificate. We already have an abstraction around identity, so we don't need to change our identity mechanism; we only need to make it compatible with the SPIFFE specs. For more information on OSM identity, see the identity PR.
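To make the encoding concrete: an X.509 SVID carries the SPIFFE ID as the certificate's single URI SAN. The following is a minimal sketch using only the Go standard library. The function name `newSVIDLikeCert`, the example ID, and the self-signed setup are assumptions for illustration; in OSM the leaf would be signed by the configured certificate provider rather than by itself.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"fmt"
	"math/big"
	"net/url"
	"time"
)

// newSVIDLikeCert (hypothetical helper) creates a short-lived, self-signed
// certificate whose only URI SAN is the given SPIFFE ID, then parses it back.
func newSVIDLikeCert(spiffeID string) (*x509.Certificate, error) {
	id, err := url.Parse(spiffeID)
	if err != nil {
		return nil, err
	}
	if id.Scheme != "spiffe" || id.Host == "" {
		return nil, fmt.Errorf("not a SPIFFE ID: %q", spiffeID)
	}

	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}

	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
		// The SPIFFE ID lives here: exactly one URI SAN on the leaf.
		URIs: []*url.URL{id},
	}

	// Self-signed for brevity (assumption): template and parent are the same.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}
```

Calling `newSVIDLikeCert("spiffe://cluster.local/ns/bookbuyer/sa/bookbuyer")` and reading `cert.URIs[0]` returns the same ID, which is the value a peer would inspect during validation.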

Some of the benefits I see for this are:

  • Adhering to the specifications enables integrations with the rest of the ecosystem that uses SPIFFE (see "Who uses SPIFFE?" at https://spiffe.io/). Once we can work with the specification, things like OPA and OIDC integrations become options.
  • It opens the door for integration with SPIRE. SPIRE has a few advantages that are likely worth exploring further, such as workload attestation plugins, federation, and certificate rotation.
  • It offers a potential way to reason about identity across multiple types of workloads beyond Kubernetes, though the naming schema is still left up to the implementors. Our current schema might work but needs some additional thought.
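As a rough illustration of the naming-schema question, the sketch below builds a SPIFFE ID from a trust domain, namespace, and service account. The `workload` type, the `/ns/<namespace>/sa/<service-account>` path layout, and the example values are assumptions for illustration only; they are not OSM's schema, which is still an open question in these notes.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// workload is a hypothetical description of a Kubernetes workload identity.
type workload struct {
	TrustDomain    string
	Namespace      string
	ServiceAccount string
}

// spiffeID renders the workload as spiffe://<trust-domain>/ns/<ns>/sa/<sa>.
// The path layout is an assumed convention, not something SPIFFE mandates.
func (w workload) spiffeID() (string, error) {
	for _, part := range []string{w.TrustDomain, w.Namespace, w.ServiceAccount} {
		if part == "" || strings.ContainsAny(part, "/ ") {
			return "", fmt.Errorf("invalid identity component %q", part)
		}
	}
	id := url.URL{
		Scheme: "spiffe",
		Host:   w.TrustDomain,
		Path:   "/ns/" + w.Namespace + "/sa/" + w.ServiceAccount,
	}
	return id.String(), nil
}
```

A non-Kubernetes workload (a VM, for example) would need a different path layout under the same trust domain, which is the part of the schema that needs the additional thought mentioned above.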

Especially explore the scenario where they are used in a multi-cluster environment

At this point I believe SPIFFE isn't a requirement for multi-cluster. It could provide a way to simplify federation and allow for attestation across different platforms.

Rough task list if we migrate from current service identity component to SPIFFE.

  • Switch from MatchSubjectAltNames to MatchTypedSubjectAltNames in Envoy config
  • Add a feature flag that turns SPIFFE ID generation on
  • Add a feature flag that turns on SPIFFE ID validation (this can't be enabled at the same time as generation in a backwards-compatible way without re-issuing all the service certificates). It might be possible to automate this, but it might be just as easy to wait a given time frame and then flip the switch, or to start a new deployment with this enabled.
  • Implement SPIFFE ID generation in X.509 certificates
    • Clean up our IssueCertificate interface; right now it uses prefixes, and it could benefit from more structure to make parsing decisions easier later
  • Implement SPIFFE ID X.509 validation for RBAC controls
  • Implement using the SPIFFE ID to make RBAC decisions
  • Enable control plane certificates to be issued as X.509 SVIDs (optional)
  • Implement the bundle API for federation with other systems (optional)
  • Integrate the Envoy certificate validator (optional)
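The validation and RBAC steps in the list above can be sketched as follows. This is a minimal illustration, not OSM's implementation: `authorizePeer`, its parameters, and the allow-list shape are hypothetical. With a real peer certificate, the `uriSANs` values would come from the `URIs` field of Go's `x509.Certificate`.

```go
package main

import (
	"fmt"
	"net/url"
)

// authorizePeer (hypothetical) validates the URI SANs of a peer certificate
// as an X.509 SVID and then makes an allow/deny decision on the SPIFFE ID.
func authorizePeer(uriSANs []string, trustDomain string, allowed map[string]bool) error {
	// An X.509 SVID must carry exactly one URI SAN.
	if len(uriSANs) != 1 {
		return fmt.Errorf("expected exactly one URI SAN, got %d", len(uriSANs))
	}
	id, err := url.Parse(uriSANs[0])
	if err != nil {
		return fmt.Errorf("invalid URI SAN: %w", err)
	}
	// A workload ID needs the spiffe scheme, a trust domain, and a path.
	if id.Scheme != "spiffe" || id.Host == "" || id.Path == "" || id.Path == "/" {
		return fmt.Errorf("URI SAN %q is not a workload SPIFFE ID", uriSANs[0])
	}
	if id.Host != trustDomain {
		return fmt.Errorf("trust domain %q does not match %q", id.Host, trustDomain)
	}
	// RBAC decision: the full ID must be on the allow-list for this listener.
	if !allowed[id.String()] {
		return fmt.Errorf("SPIFFE ID %q is not allowed", id.String())
	}
	return nil
}
```

Gating a check like this behind the validation feature flag would let certificates issued before SPIFFE ID generation keep working until they are rotated.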

SPIRE integration

There are several stages that could be implemented for the SPIRE integration. This would need to be completed after

To be done for SPIRE integrations:

  • Prototype of SPIRE using the SDS server
  • Prototype of using SPIRE as the certificate provider

(optional) POC using SPIFFE in OSM

I've got a POC of using SPIFFE IDs and X.509 SVIDs at https://github.com/openservicemesh/osm/compare/main...jsturtevant:osm:spiffeid?expand=1. This could be used for the SPIRE prototype.

tresor new ca:

tresor new leaf:

golang SANS: https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/crypto/x509/x509.go;drc=c177d9d98a7bfb21346f6309c115d0a2bf3167e3;l=699

vault docs:

Identity creation via CN:

Proxy to CP identity validation:

Proxy identity:

Proxy to Proxy - LDS - RBAC setup:

SDS is configured with SAN verification (but not sure if it is used)

Identity matching for traffic policies happens through EDS (endpoint discovery service). Envoy says "give me all the endpoints for this cluster": https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/service_discovery#endpoint-discovery-service-eds https://github.com/openservicemesh/osm/blob/662268eb31c5777a235b99f1c529168ca052b22a/pkg/envoy/eds/response.go#L21

checking mTLS: https://github.com/clarenceb/osm-mtls-check

ksniff: https://github.com/eldadru/ksniff

https://spiffe.io/ https://github.com/spiffe/spire

Book: https://thebottomturtle.io/Solving-the-bottom-turtle-SPIFFE-SPIRE-Book.pdf

Existing Usage (from Who uses SPIFFE):

requests for other meshes:

Azure stuff:

Learning:

notes/questions:

  • SPIFFE doesn't do auth.
  • identify workloads (could be a pod or a whole deployment). How does OSM identify?
  • How does trust domain play into this? openservicemesh/osm#4754
  • running spire? implement something?
  • what is the use case?
    • opens door for using spire
    • integrates to ecosystem
    • common identity for k8s and VMs, etc.
    • dedicated infrastructure layer that controls service-to-service communication over a network.
  • current identity mechanism?
    • service account (SA) and custom naming schema
  • do we have all ids?
  • packages in envoy - what is lds/eds/cds/etc...
  • does a cert validate if SAN is present? the custom cert trust domain stuff... what is required there?
    • Yes, I believe it does, since we are just submitting the CSR with that additional information in it. As long as the upstream CA doesn't block it, it should be OK.
  • how to get to envoy ui?
    • k port-forward bookbuyer-6c759555b8-vlqnb 15000
  • what is the naming schema for our spiffe id?
    • for non-VM workloads, will it be user accounts, user identities, etc.?
  • how is this going to work for gateways and webhooks?

random questions:

  • do we have a central ui?
  • console integration?

Standards:
