Keylime currently operates on a pull basis, which means that the tenant or verifier connects to the agent to collect attestation data. They therefore need to know the IP address and port to connect to beforehand, and these currently cannot change during attestation. This works fine in most virtualized environments where all the devices are in the same network, but not for edge devices or in BYOD contexts. There are workarounds based on VPNs or overlay networks (OpenVPN, ZeroTier, Nebula, etc.), but none of them provides an ideal solution.
- Identity quote: The purpose of the identity quote is to prove to the tenant that the NK (also called transport key) belongs to the same TPM as the agent. The NK is used for encrypting the U and V keys during transport and is also the key of the agent's mTLS certificate. The tenant uses this feature. The identity quote is also used to ensure that the agent behind that IP is still the same one that registered, by validating the quote against the registered AK.
- Integrity quote: The purpose of the integrity quote is to get a TPM quote with all the necessary data for attestation (PCR values, UEFI log, IMA log).
- Sending the payload and U key: After the tenant has validated the identity quote, it can send a payload (like a small cloud-init) to the agent. The payload is encrypted with a key that is split into a U and a V part. The U part is sent by the tenant to signal "I want to bootstrap this agent", and the V part is sent by the verifier if the initial attestation was successful.
- Sending the V key: The V key is sent by the verifier if the initial attestation was successful.
- (Checking if UV decryption was successful)
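The U/V split described in the list above can be illustrated with a short sketch. It assumes a simple XOR split of the payload key; the proposal itself does not specify how the key is split, and the function names `split_key` and `combine_key` are hypothetical.

```python
import secrets
from typing import Tuple


def split_key(k: bytes) -> Tuple[bytes, bytes]:
    """Split key k into two shares U and V such that k == U XOR V.

    Assumption: an XOR split. Either share alone reveals nothing about k.
    """
    v = secrets.token_bytes(len(k))
    u = bytes(a ^ b for a, b in zip(k, v))
    return u, v


def combine_key(u: bytes, v: bytes) -> bytes:
    """Recombine the two shares; the agent can only do this once it holds both."""
    if len(u) != len(v):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(u, v))


if __name__ == "__main__":
    k = secrets.token_bytes(32)  # payload encryption key
    u, v = split_key(k)          # U goes via the tenant, V via the verifier
    assert combine_key(u, v) == k
```

With such a split, the agent cannot decrypt the payload until the verifier has released V, which is what ties payload delivery to a successful first attestation.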
First we remove all unnecessary interactions with the agent.
Motivation: This removes one interaction with the agent, and the registrar is already trusted for the make/activate credential process for the AK.
We want to replace the two functions of the identity quote with a new mechanism that does not require a separate connection to the agent.
1. Proving that the NK belongs to the same EK/AK
Instead of using a resettable PCR like PCR 16 and generating a quote, we can load the NK temporarily into the TPM and use TPM2_Certify to generate a signature with the AK, proving that the NK belongs to the same TPM.
2. Verifying that this is still the same agent
This is mostly relevant for the payload mechanism, where we do not want to send a payload to the wrong agent. In the current model, if the AK changes, the agent cannot decrypt the payload because it will never get the V key from the verifier. This still holds after eliminating the identity quote. The identity confirmation of the NK moves from the tenant to the registrar. This does not change the trust model, because we already trust the registrar to confirm that the AK belongs to the EK.
In the push model the registrar becomes the main contact point with the server components.
The registration is a three-way protocol.
First the agent sends the following information (new) over HTTPS using the mTLS certificate:
- Agent UUID
- EK certificate (normally provided by the manufacturer)
- AK: Used for signing the quotes
- (new) Public portion of the NK loaded on the TPM (pubkey, attributes, name etc.) and a TPM2_Certify signature for the NK.
- mTLS certificate (contains the public portion of the NK)
- Contact IP/port (only relevant for the pull model)
Then the registrar does the following:
- (new) Check if the mTLS certificate and the one used for authentication match
- (new optional) run user provided checks on UUID, EK, mTLS certificate. (For example only allow agents to register where the mTLS certificate is signed by a specific CA)
- Verify that the NK signature matches the public portion of the NK and the AK
- Generate make credential challenge for AK
- (new) Save the registration data in the DB with the challenge values as the primary key
  - This allows multiple agents to start the registration process for the same UUID, but only the one that completes the challenge gets it.
- Return the challenge in the response to the agent's initial request.
Next the agent does:
- Run activate credential for the AK with the challenge provided by the registrar
- Send the challenge values to the registrar (new) over HTTPS with the mTLS certificate
To complete the registration process, the registrar does the following:
- Check if the sent challenge values match any open registration
- Check if the mTLS certificate matches the agent that started the registration with the provided challenge values
- Mark agent as registered and allow the agent to use the registered UUID
Note that all state is stored in a DB, so that the registrar can be scaled easily.
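The registrar side of the three-way registration can be sketched roughly as follows. This is a simplified model under stated assumptions: an in-memory dict stands in for the DB, the certificate is represented by a fingerprint string, and the TPM make/activate credential step is not modelled (in the real protocol the agent would only learn the challenge value by activating the credential on its TPM). The names `Registrar`, `start` and `complete` are hypothetical.

```python
import hmac
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PendingRegistration:
    uuid: str
    cert_fingerprint: str  # mTLS certificate used for the initial request


class Registrar:
    def __init__(self) -> None:
        # Stand-in for the DB: pending registrations keyed by challenge value.
        self.pending: Dict[str, PendingRegistration] = {}
        self.registered: Dict[str, str] = {}  # UUID -> certificate fingerprint

    def start(self, uuid: str, cert_fingerprint: str) -> str:
        # Placeholder for make credential; a real registrar would wrap this
        # secret so that only the TPM holding the EK/AK can recover it.
        challenge = secrets.token_hex(32)
        self.pending[challenge] = PendingRegistration(uuid, cert_fingerprint)
        return challenge

    def complete(self, challenge: str, cert_fingerprint: str) -> Optional[str]:
        entry = self.pending.get(challenge)
        if entry is None:
            return None  # no open registration for these challenge values
        if not hmac.compare_digest(entry.cert_fingerprint.encode(),
                                   cert_fingerprint.encode()):
            return None  # different mTLS certificate than the initial request
        # Assumption: once one agent wins the UUID, other open attempts for
        # the same UUID are discarded.
        self.pending = {c: p for c, p in self.pending.items()
                        if p.uuid != entry.uuid}
        self.registered[entry.uuid] = entry.cert_fingerprint
        return entry.uuid
```

Because pending registrations are keyed by the challenge values rather than the UUID, two agents can start registering the same UUID at once without overwriting each other.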
Once the agent is registered it polls the registrar for the following information:
- Is there a verifier active where attestation data should be sent
- Is attestation stopped for a specific verifier
- Is there a payload to download and where can it be found
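As an illustration of the three questions above, the answer to a poll could be modelled as a small data structure. The field names, the URL-based addressing and the `plan_actions` helper are assumptions made for this sketch; the proposal does not define a wire format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VerifierRequest:
    verifier_url: str      # hypothetical: where attestation data should be sent
    push_interval_s: int   # hypothetical: expected seconds between pushes
    active: bool = True    # False means attestation was stopped for this verifier


@dataclass
class AgentInstructions:
    verifiers: List[VerifierRequest] = field(default_factory=list)
    payload_url: Optional[str] = None  # None means there is no payload to download


def plan_actions(instr: AgentInstructions) -> List[str]:
    """Translate one poll result into the actions the agent should take."""
    actions = []
    for v in instr.verifiers:
        if v.active:
            actions.append(f"push to {v.verifier_url} every {v.push_interval_s}s")
        else:
            actions.append(f"stop pushing to {v.verifier_url}")
    if instr.payload_url is not None:
        actions.append(f"download payload from {instr.payload_url}")
    return actions
```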
Once the agent learns that a verifier wants attestation data, it starts pushing to the verifier.
This is done in three steps:
- Agent connects to the verifier to learn what information should be sent
- Verifier responds with PCR selection, nonce, and starting points for incremental attestation. It may also include the V key if the first attestation was successful.
- Agent pushes quote and required data to the verifier
Only the event loop and REST API interface require major changes to support the push model.
Event Loop
The push event loop is very simple: check whether the agent has pushed data once the grace period has passed, and then check that subsequent pushes from the agent match the push interval.
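The check performed by such an event loop might look like the sketch below. The function name, the status values and the `tolerance` parameter are assumptions for illustration; all times are plain seconds so the logic can be tested without a real clock.

```python
from enum import Enum
from typing import Optional


class PushStatus(Enum):
    WAITING = "waiting"  # still inside the grace period, nothing pushed yet
    OK = "ok"            # last push arrived within the push interval
    MISSED = "missed"    # agent is overdue and should be treated as failed


def check_push(now: float, added_at: float, last_push: Optional[float],
               push_interval: float, grace_period: float,
               tolerance: float = 0.0) -> PushStatus:
    """Decide whether an agent is keeping up with its push schedule."""
    if last_push is None:
        # No data yet: only fail the agent after the grace period has passed.
        if now - added_at <= grace_period:
            return PushStatus.WAITING
        return PushStatus.MISSED
    # Data seen before: the next push must arrive within the interval.
    if now - last_push <= push_interval + tolerance:
        return PushStatus.OK
    return PushStatus.MISSED
```

A small tolerance is likely useful in practice so that network jitter does not fail an otherwise healthy agent.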
REST API
The agent connects to an endpoint like: /agent/{UUID}
This only works if the agent uses mTLS with the same certificate provided during registration. If authentication is successful, the verifier responds with:
- Nonce
- PCR selection
- Next entry for IMA incremental attestation
Then the agent collects the necessary data and posts it to the same endpoint. Here the verifier needs to check that the time between handing out this data to the agent and receiving the attestation data is not too long.
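The two-step exchange on this endpoint can be sketched as follows. It models only the nonce handling and the timing check; mTLS authentication and validation of the quote itself are left out. The class and method names and the `max_age` parameter are hypothetical, and a dict stands in for the DB.

```python
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Challenge:
    nonce: str
    pcr_selection: List[int]
    ima_next_entry: int
    issued_at: float


class PushVerifier:
    def __init__(self, max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_age = max_age  # longest allowed gap between GET and POST
        self.clock = clock
        self.open: Dict[str, Challenge] = {}  # one open challenge per agent UUID

    def get_challenge(self, uuid: str, pcr_selection: List[int],
                      ima_next_entry: int) -> Challenge:
        """Step 1: tell the agent what to send and remember when we did so."""
        ch = Challenge(secrets.token_hex(20), list(pcr_selection),
                       ima_next_entry, self.clock())
        self.open[uuid] = ch
        return ch

    def accept_attestation(self, uuid: str, nonce: str) -> bool:
        """Step 2: accept pushed data only for a fresh, matching challenge."""
        ch = self.open.pop(uuid, None)  # each challenge is single use
        if ch is None:
            return False
        if not hmac.compare_digest(ch.nonce.encode(), nonce.encode()):
            return False
        return self.clock() - ch.issued_at <= self.max_age
```

Making the challenge single use means a replayed POST is rejected even if it arrives inside the allowed time window.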
Persisting Agent State
In the pull model, not all of the agent state is committed to the DB because there was no need to do so. To make the push model easier to scale, the entire agent state must be committed to the DB.
The user interface for managing an agent only has minor changes, but the steps done by the tenant change.
Adding an agent to a verifier with a payload
Input: agent UUID, unencrypted data for the payload, which verifier should be used, policies (IMA, measured boot, static PCRs)
Steps:
1. Connect to the registrar and retrieve agent information
2. Check if there were two registrations with the same UUID but different EKs *TODO: check if this is still necessary or if we fully move that feature into the registrar*
3. *(Optional)* Validate EK against cert store
4. *(New optional)* Validate registrar data using custom scripts (was only possible for the EK before)
5. Generate the U and V keys for the payload and encrypt both of them with the NK
6. Add the agent to the verifier with the following data: UUID, mTLS certificate, AK, V key, policies, push interval, grace period. The grace period gives the agent a chance to notice that the verifier wants attestation data, instead of failing it automatically.
7. Now notify the agent by adding the necessary data to the registrar:
   1. Add to the entry of the agent that the verifier wants attestation data with the given push interval
   2. Upload the payload to the registrar for the agent to download
Note that all the steps for revocations are just part of the payload generation and therefore ignored in the above steps.
When the agent now finds this new information it does the following:
- Starts pushing attestation data to the verifier
- Downloads the payload
- Decrypts the payload once it also has the V key and marks decryption successful in the registrar
Remove agent from attestation
Input: agent UUID, verifier
Steps:
- Mark the verifier as no longer interested in attestation data at the registrar
- Remove the agent from the verifier (should not require changes to the current API)
Verifying that payload decryption was successful
This can be done with a lookup at the registrar.
We already have a CA for Keylime that can be used by the agent to verify connections from and to the verifier/tenant/registrar. This can be reused for the agent to verify if it actually trusts those components.
On the server side we want the agent to authenticate itself with the mTLS certificate provided during the registration process. In practice we noticed that doing this with the web server frameworks written in Python is not really a good idea. Instead, authentication and validation of the client certificate should be done by a reverse proxy like nginx and passed on as an HTTP header. This also makes load balancing simpler, as well as putting the registrar and verifier on the Internet.
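On the application side, the check against the proxy-supplied header could look like the following sketch. The header name `X-Client-Cert-Fingerprint` is a hypothetical choice, as is passing a fingerprint rather than the full certificate; the proposal does not fix either.

```python
import hmac
from typing import Mapping

# Hypothetical header name; the reverse proxy would set it after validating
# the client certificate.
FINGERPRINT_HEADER = "X-Client-Cert-Fingerprint"


def normalize(fingerprint: str) -> str:
    """Make 'AB:CD:01' and 'abcd01' compare equal."""
    return fingerprint.replace(":", "").strip().lower()


def is_authenticated(headers: Mapping[str, str],
                     registered_fingerprint: str) -> bool:
    """Compare the proxy-supplied fingerprint with the one from registration."""
    presented = headers.get(FINGERPRINT_HEADER)
    if not presented:
        return False  # the proxy did not see a valid client certificate
    return hmac.compare_digest(normalize(presented).encode(),
                               normalize(registered_fingerprint).encode())
```

This is only safe if the application is reachable solely through the proxy and the proxy overwrites any copy of the header sent by the client; otherwise a client could forge its own identity.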
- Encrypting U/V key with the NK is still useful so that other server components cannot decrypt the payload
- Follow the IETF standard for remote attestation where possible: https://www.ietf.org/archive/id/draft-ietf-rats-architecture-15.html
Issue: I would say that the bigger issue is security, not connection information. The attesting device does not want to run a web server, and it does not want to open a port through its firewall to the internet. A secondary issue is power consumption in a battery powered device. The attestor may be powered down at times.
I still wonder about the whole U,V design. While I saw it in the original MIT paper, it appears to have nothing to do with attestation and therefore with the core keylime application. Is it being removed (Design section)?
I understand that NK is being used for the TLS session. A simple way to prove that it comes from the same TPM is TPM2_Certify. It's easier than make/activate.
Considering that the registrar is very security sensitive, what does the statement "In the push model the registrar becomes the main contact point with the server components." mean?