A TPM contains multiple PCRs and can generate a signed quote over a hash of the concatenated values of a selection of PCRs. The quote itself does not contain the PCR values. If you want a quote with matching PCR values, most implementations (including Keylime) use the following trick:
1. Read PCR values (8 at a time)
2. Generate quote
3. Read PCR values (8 at a time)
4. Check if the PCR values from step 1. and 3. match; if not, start again with 1.
This works fine if the PCR values are essentially static, which is the case for all the PCRs used during UEFI Secure Boot, but not when IMA is enabled and extends PCR 10 quite frequently.
In cases where IMA is enabled, this might cause unintentional attestation failures because an atomic quote might never be generated. Also, quote signing is a computationally expensive operation that might block the TPM from performing other actions.
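The read, quote, read loop described above can be sketched as follows. This is only an illustration: `read_pcrs` and `make_quote` are hypothetical callables standing in for the real TPM calls, and the retry limit is an assumption, not something Keylime defines.

```python
from typing import Callable, Dict, Tuple

PcrValues = Dict[int, bytes]


def atomic_quote(
    read_pcrs: Callable[[], PcrValues],  # hypothetical: reads the selected PCRs
    make_quote: Callable[[], bytes],     # hypothetical: asks the TPM for a quote
    max_attempts: int = 5,               # assumed limit, not a Keylime setting
) -> Tuple[bytes, PcrValues]:
    """Return a quote plus PCR values that did not change around it."""
    for _ in range(max_attempts):
        before = read_pcrs()
        quote = make_quote()
        after = read_pcrs()
        if before == after:
            # Nothing was extended while the quote was generated.
            return quote, after
    # With IMA extending PCR 10 frequently, this branch can be hit repeatedly.
    raise RuntimeError("PCR values kept changing; no consistent quote obtained")
```

Every failed attempt costs one additional quote operation, which is why a frequently extended PCR 10 makes this approach expensive.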
For the verifier to attest the agent, the agent sends the following information:
- quote
- signature for the quote
- PCR values checked to be the same as the hash in the quote with the method described above
    - We currently use a binary data structure from tpm2-tools for that
- IMA log (optional)
- UEFI log (optional)
- NK transport key measured into PCR 16 (might not be always sent)
Then the verifier does the following steps:
1. Check if the hash in the quote matches the hash of the concatenated PCR values
2. Check that the signature, quote and AK of the agent match
3. (Validate data quote for NK against PCR 16 sent by the agent)
4. (optional) Validate UEFI log
    1. Walk the UEFI log and get the computed PCR values
    2. Validate the UEFI log against a measured boot policy
    3. Check the computed PCR values against the PCR values sent by the agent
5. (optional) IMA validation
    1. Validate an entry of the IMA log
    2. Compute running hash for PCR 10
    3. Check if running hash matches PCR 10 sent by the agent
    4. If yes, stop
    5. If no, goto i. or fail if there are no more entries
6. Validate static PCRs (all PCRs that were not covered by UEFI or IMA log validation)
    1. Check PCR value against static allow list
7. Check if all PCRs that should be validated are now actually validated
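The first verifier step, comparing the hash in the quote with the PCR values sent by the agent, could look roughly like this. It is a minimal sketch assuming a SHA-256 quote digest and PCR values concatenated in ascending index order; the function names are illustrative and not part of Keylime.

```python
import hashlib
import hmac
from typing import Dict


def pcr_digest(pcrs: Dict[int, bytes], alg: str = "sha256") -> bytes:
    """Hash the selected PCR values, concatenated in ascending index order."""
    h = hashlib.new(alg)
    for index in sorted(pcrs):
        h.update(pcrs[index])
    return h.digest()


def pcrs_match_quote(pcrs: Dict[int, bytes], quote_digest: bytes, alg: str = "sha256") -> bool:
    """Return True if the PCR values sent by the agent hash to the quote digest."""
    return hmac.compare_digest(pcr_digest(pcrs, alg), quote_digest)
```

Because the quote only carries this digest, a single PCR that changed between reading the values and generating the quote makes the whole comparison fail.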
This model has several disadvantages:
- It requires that the quote and sent PCR values match exactly
- More complex validators (e.g. ImaBuf validator or measured boot policies) run before the integrity of the data is fully validated against the quote. This is not directly a security issue, but it increases the attack surface of Keylime.
We do not actually require the PCR values and the quote to be atomic to implement any of the functionality above (assuming that no PCR other than PCR 10 changes frequently).
The agent still sends the same data as above, with the difference that the PCR values might not match the quote.
The verification steps are now:
1. Check if the signature, quote and AK match
2. (skip if no UEFI log validation enabled) Walk the UEFI log and only save the computed PCRs
3. Build a list from PCR 16 sent by the agent and the computed PCRs from the UEFI log; for PCRs that are not present, use the selected PCR values sent by the agent
4. (skip if no IMA log validation enabled) IMA entry structure validation. In this step we iterate over the log until we find a running hash that matches the quote. If there are no external failures, this should always work because entries are first added to the IMA log and then measured into the TPM. In this step we only validate the structure (hash of the entire struct) of the entry, not its content. For the first iteration start with 4.ii, because the quote might already match before we have validated even one entry (this happens often with incremental attestation).
    1. Compute the hash over the running hash for PCR 10 and the list from above, and stop if it matches the quote; fail if there are no more entries and no match was found. We also save what the last entry of the IMA log was and only validate the content up to that point later.
    2. Parse and compute hash of IMA entry and update running hash for PCR 10
    3. goto i.
5. If the running hash now matches the quote, we can assume that all the PCR values and data are valid.
6. (optional) Validate UEFI log using a measured boot policy
    1. Parse UEFI log into JSON format
    2. Run measured boot policy and produce a failure if the policy fails
7. (optional) Validate content of IMA entries
    1. Run complex validator on IMA entry and keep track of any failures
    2. goto i. if not last entry
    3. Output all the failures
8. Validate static PCRs (all PCRs that were not covered by UEFI or IMA log validation)
    1. Check PCR value against static allow list
9. Check if all PCRs that should be validated are now actually validated
Up to step 5. we return early if a failure occurs. After that we collect them and handle them according to their severity level. More information on that can be found here: https://github.com/keylime/enhancements/blob/master/46_revocation_severity_and_context.md
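The search in the IMA structure validation step can be sketched as below. This is an illustration under stated assumptions, not the proposed implementation: it assumes a SHA-256 PCR bank, that the per-entry template hashes have already been extracted from the log, and that PCR 10 starts at all zeros unless a running hash from a previous attestation is supplied. All names are hypothetical. The comparison happens before any entry is consumed, since the quote may already match the starting value.

```python
import hashlib
from typing import Dict, List, Optional


def pcr_digest(pcrs: Dict[int, bytes], alg: str = "sha256") -> bytes:
    """Hash the PCR values, concatenated in ascending index order."""
    h = hashlib.new(alg)
    for index in sorted(pcrs):
        h.update(pcrs[index])
    return h.digest()


def find_quoted_ima_prefix(
    quote_digest: bytes,
    other_pcrs: Dict[int, bytes],        # PCR 16, UEFI-computed and agent-sent PCRs
    template_hashes: List[bytes],        # one hash per IMA log entry, in log order
    start_pcr10: Optional[bytes] = None,  # running hash from a previous attestation
    ima_pcr: int = 10,
    alg: str = "sha256",
) -> int:
    """Return how many IMA entries had been measured when the quote was taken."""
    running = start_pcr10
    if running is None:
        running = bytes(hashlib.new(alg).digest_size)  # PCR 10 starts as zeros
    consumed = 0
    while True:
        candidate = dict(other_pcrs)
        candidate[ima_pcr] = running
        if pcr_digest(candidate, alg) == quote_digest:
            return consumed  # content validation only needs entries [0, consumed)
        if consumed == len(template_hashes):
            raise ValueError("no IMA log prefix matches the quote")
        # PCR extend: new value = H(old value || measurement)
        running = hashlib.new(alg, running + template_hashes[consumed]).digest()
        consumed += 1
```

The returned count marks the last entry covered by the quote, so later content validation can stop at that point.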
Currently, Keylime puts most of the PCR validation into an abstract TPM which does the necessary calls to tpm2-tools for validation. This made sense for supporting TPM 1.2 and TPM 2.0 and for sharing the code with the agent. We no longer support TPM 1.2, and in the long term the Python agent will be deprecated and removed. Therefore the new validation code should have the following properties:
- Content validation of logs (UEFI, IMA) should be fully separate from testing that the quote and data are valid
    - This should allow us to add new data for validation easily
    - If there is a new TPM or a similar (Pluton??) protocol, we should be able to easily add support for it without changing our data validation
- Quote validation is abstracted in a way that the current dependency on tpm2-tools can be swapped out for, for example, tpm2-pytss
- Easily unit testable. The current code is only covered through end-to-end testing.
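One way to picture these properties is a small interface split, sketched below. The class and function names are hypothetical and do not exist in Keylime; the point is only that a quote backend (tpm2-tools, tpm2-pytss, or something else) and the content validators can be swapped and unit tested independently, and that content validators never run on unauthenticated data.

```python
from typing import Dict, List, Protocol


class QuoteBackend(Protocol):
    """Hypothetical interface hiding how the quote signature is checked."""

    def verified_pcr_digest(
        self, ak_public: bytes, quote: bytes, signature: bytes, nonce: bytes
    ) -> bytes:
        """Return the PCR digest from the quote; raise ValueError if invalid."""


class ContentValidator(Protocol):
    """Hypothetical interface for log content checks (UEFI, IMA, ...)."""

    def validate(self, data: bytes) -> List[str]:
        """Return a list of failure descriptions (empty if the data is fine)."""


def attest(
    backend: QuoteBackend,
    validators: Dict[str, ContentValidator],
    logs: Dict[str, bytes],
    ak_public: bytes,
    quote: bytes,
    signature: bytes,
    nonce: bytes,
    expected_digest: bytes,
) -> List[str]:
    """Run content validators only after the data is bound to a valid quote."""
    digest = backend.verified_pcr_digest(ak_public, quote, signature, nonce)
    if digest != expected_digest:
        return ["quote does not match the provided data"]
    failures: List[str] = []
    for name, validator in validators.items():
        failures.extend(validator.validate(logs[name]))
    return failures
```

In a unit test, both the backend and the validators can be replaced with small fakes, so no TPM or end-to-end setup is needed.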
With that in mind the proposed steps from above can be implemented without changing how users currently use Keylime.
The agent needs only one major change, which should simplify the code in most cases. Instead of checking that the PCR values and the quote are atomic, the agent first reads the PCR values, then generates the quote and sends the data to the verifier.
With this change we want to reduce the dependency on tpm2-tools. To send the PCR values, we currently use a tpm2-tools specific data structure (tpm2_pcrs) and a custom format that encodes it together with the quote and signature. This will be replaced by a JSON structure of the following form:
{"pcrs" :
{
"0": "HEX_ENCODED_VALUE_OF_PCR_0",
"1": "HEX_ENCODED_VALUE_OF_PCR_1",
...
},
"quote": "BASE64_ENCODED_VALUE_OF_TPM_QUOTE",
"signature": "BASE64_ENCODED_VALUE_OF_TPM_QUOTE_SIGNATURE"
}
Only the PCRs that were requested by the verifier are present.
Note that the old 2.0 API still provides all the necessary information; only the data structures are changed to make implementations simpler, so the verifier can easily support both APIs.
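Encoding and decoding the proposed JSON structure needs only the Python standard library. The sketch below assumes the field layout shown above (hex-encoded PCR values keyed by index, base64-encoded quote and signature); the function names are illustrative.

```python
import base64
import json
from typing import Dict, Tuple


def encode_quote_response(pcrs: Dict[int, bytes], quote: bytes, signature: bytes) -> str:
    """Serialize PCR values, quote and signature into the proposed JSON form."""
    return json.dumps(
        {
            "pcrs": {str(index): value.hex() for index, value in sorted(pcrs.items())},
            "quote": base64.b64encode(quote).decode("ascii"),
            "signature": base64.b64encode(signature).decode("ascii"),
        }
    )


def decode_quote_response(payload: str) -> Tuple[Dict[int, bytes], bytes, bytes]:
    """Parse the JSON form back into raw bytes; raises on malformed input."""
    data = json.loads(payload)
    pcrs = {int(index): bytes.fromhex(value) for index, value in data["pcrs"].items()}
    quote = base64.b64decode(data["quote"], validate=True)
    signature = base64.b64decode(data["signature"], validate=True)
    return pcrs, quote, signature
```

A verifier would typically also check that the decoded PCR indices are exactly the ones it requested before using them.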
The TPM quote also contains clock and firmware information besides the quote hash. Keylime currently does not use this data. The firmware string can be just another data point that can be validated like the logs. With the clock, two checks can be implemented:
- Checking that there were no changes to the clock (the safe flag is set to true)
- Detecting whether the system was rebooted between two quotes by checking if the clock advances at the right pace and by checking the reset and restart counters. Note that the two counters are obfuscated to make fingerprinting harder, so they can only be checked for equality.
The second point will allow Keylime to easily detect scenarios where a device left the trusted state for a short period of time and then rebooted to get into a trusted state again.
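The two clock checks could be combined as in the sketch below. It assumes the clock information from two consecutive quotes has already been parsed into a small structure mirroring the TPM fields (clock in milliseconds, reset count, restart count, safe flag). The type, the function name and the tolerance value are hypothetical, and the counters are only compared for equality.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClockInfo:
    clock_ms: int       # TPM clock value in milliseconds
    reset_count: int    # only compared for equality
    restart_count: int  # only compared for equality
    safe: bool


def compare_clock_info(
    previous: ClockInfo,
    current: ClockInfo,
    elapsed_ms: int,           # time the verifier measured between both quotes
    tolerance_ms: int = 5000,  # assumed slack for network and processing delay
) -> List[str]:
    """Return findings that hint at clock tampering or a reboot between quotes."""
    findings: List[str] = []
    if not current.safe:
        findings.append("clock safe flag not set")
    if (previous.reset_count, previous.restart_count) != (
        current.reset_count,
        current.restart_count,
    ):
        findings.append("reset or restart counter changed")
    advanced_ms = current.clock_ms - previous.clock_ms
    if advanced_ms < 0:
        findings.append("clock went backwards")
    elif abs(advanced_ms - elapsed_ms) > tolerance_ms:
        findings.append("clock did not advance at the expected pace")
    return findings
```

A suitable tolerance depends on the polling interval and network latency, so it would likely need to be configurable.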
There are now Python bindings for the TPM with tpm2-pytss, which implements parsing of TPM specific data structures and makes it possible to implement the quote signature validation fully in Python. Moving the verifier to pytss would allow us to remove external calls to tpm2-tools. It might make sense to put more generic validation code into pytss first before using it in Keylime.
@kgold2 thanks for the comments.
I think it can be only one PCR, right? So someone could build a kernel with PCR 11 for IMA, but not PCR 10 and 11. If we are already changing the API of Keylime we could add a flag for what PCR is the IMA PCR.
I completely agree and that is why in the other proposal I eliminate the need for binding data to a quote using resettable PCRs. Moving to PCR 23 is probably as bad as using PCR 16. Just in general is there a good way to bind a checksum of arbitrary data to a quote?
The idea of using PCR 16 comes probably from here: https://opensecuritytraining.info/IntroToTrustedComputing_files/Day2-1-auth-and-att.pdf
They are optional if no attestation of the IMA or UEFI log is specified, so if you are only checking against static PCR values. If verification for IMA and/or UEFI event log is enabled those steps are not optional. I will change the wording to make this more clear.
GitHub renders numbered bullet points differently than my local editor. Should be fixed.
Yes, it is fixed.
You are right that the PCR values are not required to generate the quote, but we still want to read and send them to the verifier so that they can still be checked against predefined values. I know that this is brittle, but we currently support that in Keylime, so we cannot break this functionality. In cases where tboot is enabled you will lose the UEFI event log, so you need the PCR values.
See answer to 5.
It adds more traffic (roughly 65% more), but I don't know if that is an issue in practice. I would like to enable transport compression, but it has been shown to be a great attack vector for DoS attacks.
Yes, is all data obfuscated? I know that reset_count and restart_count are but is the actual time also obfuscated?
tpm2-tools provides tpm2_checkquote and tpm2_makecredential which we use for quote validation and make credential. As you already pointed out we would use tpm2-pytss for unmarshalling TPM data structures and also for having a make credential implementation written in Python.