Practical Cryptographic Release Branch Validation

One of the problems I've been thinking about recently is how to get reasonable cryptographic validation of release sources and artifacts without destroying usability. There are several randomly-assorted problems here:

SHA-1 is relatively easy to collide, and thus signed Git commits and tags are insufficient
Maintaining an auditable and relatively tamper-proof list of trusted signatures is hard
"Strong crypto" is generally (and accurately) equated with "not human usable"

Things to Sign

After some consideration, I think that the ideal set of things to sign in a release process is the following:

The published artifacts

This is, in fact, rigorously enforced by many Nexus configurations, including Sonatype's Nexus for Maven Central. It's a good practice to have, because it makes releases attributable and verifiable, at least within the bounds of cryptographic key trust (more on this in a bit). If you're using SBT, the publish-signed task contributed by sbt-pgp accomplishes this task quite nicely.
The release Git tag

As mentioned, SHA-1 collisions are relatively easy to generate, and so a signed tag should not in and of itself be taken as a guarantee that the history is tamper-free. However, tag signatures are a best practice, they are very easy to verify (e.g. by running git tag -v, since a plain git show only displays the signature without checking it) and they do raise the bar for history tampering by quite a bit. Thus, while insufficient for truly comprehensive guarantees, they do provide an easy and convenient baseline.
The source directory

Keybase provides a fantastically easy way of doing directory signing. It's literally as simple as running keybase dir sign. A SIGNED.md file is written to whatever directory you're rooted in. This file contains a recursive set of all files in the directory with their SHA-256 hashes. Additionally contained in this text block is a set of files that are ignored by the recursive signer (and the keybase command does understand .gitignore). This text block, containing the file names, hashes and ignored files, is signed as a giant chunk using GPG. Directory validation can easily be accomplished by running keybase dir verify, or by running a lot of find, awk and gpg witchcraft by hand (if you don't trust the tool).
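To make the "find, awk and gpg witchcraft" a little more concrete, here is a minimal Python sketch of the hashing half of that check. It is not the SIGNED.md format and it does not read .gitignore; the function names and the hard-coded ignore set are assumptions made purely for illustration. The idea is just to build a path-to-SHA-256 map of a checkout and compare it against a manifest you already trust (for example, one whose GPG signature you verified separately).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path,
                   ignored: frozenset = frozenset({".git", "SIGNED.md"})) -> dict:
    """Map each file's relative POSIX path to its SHA-256 hex digest.

    The ignore set is a hypothetical stand-in for real .gitignore handling.
    """
    manifest = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if path.is_file() and not ignored.intersection(relative.parts):
            manifest[relative.as_posix()] = sha256_of(path)
    return manifest


def changed_files(expected: dict, actual: dict) -> list:
    """Return paths that were added, removed, or whose contents differ."""
    return sorted(p for p in expected.keys() | actual.keys()
                  if expected.get(p) != actual.get(p))
```

An empty result from changed_files(trusted_manifest, build_manifest(Path("."))) would mean the checkout matches the manifest; any path in the result deserves a closer look.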

This SIGNED.md file should be in the "release commit", which is in turn tagged (with a signed tag) and then compiled into published artifacts which are themselves signed. Despite the ease of validating a directory signature with Keybase, it is still not as easy to verify at a glance when looking through the Git history. It is, however, as a practical matter, impossible to forge. So, if you're suspicious that someone may have tampered with a release, and you don't trust SHA-1 (as you shouldn't!), you can always check out the release tag and validate the directory signature.

All of these signatures should be generated with the same key pair. This provides the ability to cross-validate. So long as control over the private key is maintained (social issues with private informational security not addressed here!), we have a strong cryptographic proof of who pushed a release and exactly what its contents are.

People to Trust

Of course, you don't want to trust just anyone to push a release. Nexus authentication is far from sufficient, and it obviously doesn't validate any of the above with respect to the source repository. This is where Git commit hooks come into play.

Ideally, we would have a hook on our server which intercepts tag pushes, checks to see if they follow the release naming convention (e.g. v0.2.4, or release/3.14), and then validates that all of the signatures are in place, consistent, and performed by a trusted signer. Nexus is capable of holding a publish in escrow until released by an external system (this is precisely what Sonatype does for Maven Central releases). Thus, published artifacts should be held and not made available until a release tag has been pushed and validated. Release tags which fail to validate should be rejected on push, and the failure noted in some sort of auditable log.
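As a rough sketch of the first two steps of such a hook, the Python below checks whether a pushed ref looks like a release tag and pulls the signer's fingerprint out of GnuPG's status output. The regular expression is only a guess at a naming convention based on the two examples above, and the function names are hypothetical. The one external call, git verify-tag --raw, writes GnuPG's machine-readable status lines to standard error; a VALIDSIG line appears only when the signature checks out.

```python
import re
import subprocess
from typing import Optional

# Assumed naming convention, inferred from the examples v0.2.4 and release/3.14.
RELEASE_REF = re.compile(r"^refs/tags/(v\d+(\.\d+)*|release/\d+(\.\d+)*)$")


def is_release_tag(ref: str) -> bool:
    """True for refs such as refs/tags/v0.2.4 or refs/tags/release/3.14."""
    return RELEASE_REF.match(ref) is not None


def signer_fingerprint(gpg_status: str) -> Optional[str]:
    """Extract the signing key fingerprint from GnuPG status output.

    GnuPG emits a "[GNUPG:] VALIDSIG <fingerprint> ..." line only when the
    signature is cryptographically valid.
    """
    for line in gpg_status.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[:2] == ["[GNUPG:]", "VALIDSIG"]:
            return fields[2].upper()
    return None


def verify_tag(tag: str) -> Optional[str]:
    """Run git verify-tag; return the signer fingerprint, or None on failure."""
    result = subprocess.run(
        ["git", "verify-tag", "--raw", tag],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    # With --raw, git writes the GnuPG status lines to standard error.
    return signer_fingerprint(result.stderr)
```

A real hook would go on to compare the returned fingerprint against the trusted set and reject the push (and log the failure) when verify_tag returns None. Note that the fingerprint reported here is that of the key which actually made the signature, which may be a subkey rather than the developer's primary key.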

All of this is obviously very mechanically feasible. It's not hard to implement a Git hook, and the Nexus configuration for escrowing publications is trivial. The bigger problem is a social one: how do we determine who to trust? We need a way of automatically and without user intervention grabbing a set of trusted signatures. It needs to be very easy for developers to add new signatures to this set (e.g. when a new release-capable developer is added to the team), remove old signatures (e.g. when someone leaves the team) and trivially, passively audit that no unauthorized signatures have been added in error.

Obviously, a text file sitting on the build (or Git) server is a terrible idea. It is very easy for a malicious entity to modify, relatively inconvenient for authorized parties to modify, and impossible to passively audit. A much better idea is to take this list of signatures and throw it in your face every time you look at the repository. Everyone should see this list a hundred times a day, such that even the slightest change sticks out like a sore thumb or an altered editor font.

The Idea

My proposal is to store this set of signatures in the README.md file at the root of the repository. This is a file which is visible to everyone all the time. Changes are very visible, and it's obviously trivial to audit (just load the project page!). One could use a very simple format, such as:

## Authorized Developers

The following developers are authorized to sign release artifacts and branches:

- [Daniel Spiewak](https://keybase.io/djspiewak) <[email protected]> - `3587 7FB3 2BAE 5960`
- ...etc

Very easy to consume and human-friendly. After seeing this list dozens of times a day for months and months, I have very little doubt that even the least observant junior intern would catch even a single character out of place in any one of the signatures.

When validating a release, a script (not contained in the repository!) should pull these signatures directly from the README.md file in master (very important!) and verify that the key which signed the release matches the signature listed for one of these authorized developers. A malicious entity attempting to perform a release would not only need to attach a very automatically-auditable signature to their handiwork, but they would also need to modify the project README.md, a very visible and easily-checked document.
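A minimal sketch of what that script's lookup might do, assuming the exact README layout shown above (a "## Authorized Developers" heading followed by list items that end in a backtick-quoted key ID). The function names are hypothetical. The matching relies on one fact about OpenPGP: for v4 keys, the 16-hex-digit long key ID is the low-order 64 bits of the full fingerprint. This is a deliberately naive line-based reader, so the caveat in "A Note on Comments" applies to it in full.

```python
import re
from typing import List

HEADING = "## Authorized Developers"
# A key ID in backticks at the end of a list item, e.g. `3587 7FB3 2BAE 5960`.
KEY_ID = re.compile(r"`([0-9A-Fa-f ]+)`[ \t]*$")


def authorized_key_ids(readme: str) -> List[str]:
    """Collect key IDs from list items under the Authorized Developers heading."""
    key_ids = []
    in_section = False
    for line in readme.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any other heading ends the section.
            in_section = stripped == HEADING
        elif in_section and stripped.startswith("- "):
            match = KEY_ID.search(stripped)
            if match:
                key_id = match.group(1).replace(" ", "").upper()
                # Ignore anything shorter than a 64-bit long key ID.
                if len(key_id) >= 16:
                    key_ids.append(key_id)
    return key_ids


def is_authorized(fingerprint: str, key_ids: List[str]) -> bool:
    """Check whether a signer's fingerprint ends with one of the listed key IDs."""
    normalized = fingerprint.replace(" ", "").upper()
    return any(normalized.endswith(key_id) for key_id in key_ids)
```

Matching on a 64-bit key ID is weaker than matching on a full fingerprint; listing full fingerprints in the README would tighten this at the cost of a longer, harder-to-eyeball line.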

Of course, files in master change all the time, but this is where the pull request process steps in to save us. There are many, many reasons why pull requests are good, and this is just another one. If you forbid pushing to master, all changes going into master must flow through a pull request at some point, presumably causing a lot of stir and notifying everyone on the team in a permanently-auditable fashion. Thus, even a disgruntled developer who has commit (but not release) permissions would be forced to make a very public and visible move in order to add themselves to the authorized release list (specifically, they would need to make a pull request that includes the change). Since adding new names to the list of release-capable developers is a rare event by definition, such a pull request should be very attention-grabbing when it happens. Additionally, the pull request process slows down any sort of malicious action here.

One could even enhance the security of this process further by using Git's commit signing (different from its tag signing) and auditing these signatures with git blame. A secondary commit hook (using git blame) could validate that any commit which changes this list of signatures is signed by a previously-trusted signature (i.e. a signature that was on the list prior to the change). Thus, only a trusted developer could add another developer to the trusted set, and this trusted set is open and visible to everyone and seen constantly.
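The core rule of that secondary hook is small enough to sketch directly. The Python below assumes the hook has already extracted the trusted set before and after the commit, plus the fingerprint of whoever signed the commit (for instance from git verify-commit --raw, with None standing for an unsigned or invalid commit). The function name and the set-of-strings representation are illustrative assumptions.

```python
from typing import Optional, Set


def list_change_allowed(before: Set[str], after: Set[str],
                        commit_signer: Optional[str]) -> bool:
    """Allow a change to the trusted set only if a previously trusted key signed it."""
    if before == after:
        return True  # The commit does not touch the trusted set.
    # Membership is checked against the list as it stood BEFORE the commit,
    # so nobody can authorize themselves by adding their own key.
    return commit_signer in before
```

Checking against the pre-change set is the whole point: a commit that adds key C and is signed by key C fails, while the same commit signed by an existing member passes.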

A Note on Comments

It sounds stupid, but the fact that Markdown allows HTML comment blocks actually makes this slightly harder to pull off. It is very, very important that the trusted set of signatures comes entirely and exclusively from a rendered list of names, ideally at the top of the file. The only way to truly validate this is to use a full-fledged Markdown parser which directly renders the README.md file into a set of signatures, rather than the traditional HTML. This prevents malicious entities from fooling the signature parser by injecting an identically-formatted list block, commented out such that it doesn't render.
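To illustrate the attack, here is a small Python sketch contrasting a naive regex reader with one that strips HTML comments first. Both functions are hypothetical, and the second is only a partial mitigation: it handles the one trick described above and nothing else (for example, it would wrongly treat comment delimiters inside a fenced code block as real comments). It is meant to show why the naive approach fails, not to replace the full Markdown parser recommended above.

```python
import re

# HTML comments may span several lines, hence DOTALL.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# A list item ending in a backtick-quoted key, matched one line at a time.
LIST_KEY = re.compile(r"^- .*`([0-9A-Fa-f ]+)`[ \t]*$", re.MULTILINE)


def naive_keys(readme: str) -> list:
    """Return every key that looks like a list entry, rendered or not."""
    return [key.replace(" ", "") for key in LIST_KEY.findall(readme)]


def keys_ignoring_comments(readme: str) -> list:
    """Drop HTML comments before matching, so hidden entries are not trusted."""
    return naive_keys(HTML_COMMENT.sub("", readme))
```

Given a README in which a second, identically formatted entry sits inside an HTML comment, naive_keys returns both keys, while keys_ignoring_comments returns only the one a human would actually see on the project page.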

Summary

I'm relatively certain that this produces a self-contained, very human-usable, cryptographically sound scheme for safely releasing sensitive projects. It's certainly not fool-proof, but the tampering required to bypass these checks would be quite substantial. One would need to compromise several systems (e.g. Nexus, the Git repository, probably the build server) in order to push a malicious release.

Critically, there is no reliance on the "PGP Web of Trust" in this scheme, nor are there any naive assumptions about the collision-resilience of SHA-1 or the cryptographic savvy of the average developer. The system is easy to use, clear and straightforward, and also very strongly verifiable.