tl;dr Generate a GPG key pair (exercising appropriate paranoia). Send it to key servers. Create a Keybase account with the public part of that key. Use your keypair to sign git tags and SBT artifacts.
GPG is probably one of the least understood day-to-day pieces of software in the modern developer's toolshed. It's certainly the least understood of the important pieces of software (literally no one cares that you can't remember grep's regex variant), and this is a testament to the mightily terrible user interface it exposes to its otherwise extremely simple functionality. It's almost like cryptographers think that part of the security comes from the fact that bad guys can't figure it out any more than the good guys can.
Anyway, GPG is important for open source in particular because of one specific feature of public/private key cryptography: signing. Any published software should be signed by the developer (or company) who published it. Ideally, consumers of this software would verify these signatures before trusting their laptops and datacenters with its randomly-sourced code, though in practice this rarely happens. For the few people who do care about this though, it's important to get our ducks in a row when it comes to signing distributables and ensuring that the chain of provenance is publicly auditable and trustworthy.
As a sidebar, I'm not going to take the time to explain the ins and outs of asymmetric cryptography. This is partially because I believe that anyone in tech should already have a pretty good conceptual picture of how this stuff works, and partially because there are already far better articles floating around which can do the explaining for me. Cmd-T, www.google.com, Return, public key cryptography, Return. Enjoy.
If you're on Linux, you're already set. If you're on Windows, you're completely in the woods by yourself; in the distance, a wolf howls.
If you're on Mac, you actually have to install GPG yourself. You want GPG Tools. It installs all the gpg and gpg-agent command line tools, as well as a bunch of other things which generally work to make GPG slightly less user-hostile.
I'm going to assume you don't already have a pre-existing public/private keypair. If you do, you can skip this step.
The exact steps you take to generate your public/private keypair depend on how paranoid you want to be. Most of us aren't maintaining Emacs, and thus are unlikely to be targeted by VIM-using infidels seeking to publish fraudulent versions of Emacs signed with our fraudulently-obtained key. With that said, a little paranoia is healthy here. Control of your primary private key is the literal definition of proof that you are you on the internet. Proof in a cryptographic sense, that is. So protect your "you-ness".
Generating the keypair itself is quite straightforward:
$ gpg --gen-key
Follow the prompts. You want an RSA/RSA keypair (the default) with a 4096 bit size (not the default). If the prompts never ask about key type or size, you are probably on GnuPG 2.1 or later, where gpg --full-generate-key is the variant that does. If you're being exceptionally paranoid and air-gapping your primary key (as I have), then you'll have to go through some extra steps here to split your subkeys. You should also think carefully about physical security as you do this: maintaining some background noise (randomized white noise is good), blocking sight-lines, removing network and storage hardware, and turning off phones. Like I said, exceptionally paranoid. If you aren't as worried about being hacked by the Cult of VIM, just generate the keypair and don't worry about the rest of it. Just… maybe don't do it in a coffee shop.
Use your real name for the user id, and for the email address use an address that you expect to control for the next few decades. This pair is you. You can add more names and emails later, if you want.
The expiration date is something of a contentious question. Frankly, I don't see the point in setting an expiration date for a primary key. It's just going to expire unexpectedly out from under your feet, you'll be embarrassed for a few minutes while you reset the expiration date, and then continue with your life. You're going to hold onto this key for a while (ideally the rest of your career); no point in making it expire prematurely.
Set a strong password. Ideally one you can't remember. (you are using a password manager, aren't you?) This password is the last line of defense if someone steals your private key. It acts as a second factor (something you know) to the primary defense factor: possession of the file which contains the encrypted private key (something you have).
If your machine prompts you to create entropy, bang on the keyboard, wiggle the mouse, and check Facebook. A 4096 bit key takes a little while to generate on some architectures, so be patient (my air-gapped keypair on an older CPU took about 10 minutes to generate).
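If you'd rather script the generation step than click through prompts, GnuPG also has a batch mode that reads the same choices from a parameter file. What follows is only a sketch, assuming GnuPG 2.1 or later: key_params is a hypothetical helper, and the name and email are placeholders rather than anything from this article. Since the file contains no passphrase line, gpg should still ask for your password through its usual prompt.

```shell
# Hypothetical helper: print a GnuPG batch-mode parameter file describing a
# 4096-bit RSA/RSA keypair with no expiration date.
#   $1 = real name, $2 = email address
key_params() {
  cat <<EOF
Key-Type: RSA
Key-Length: 4096
Subkey-Type: RSA
Subkey-Length: 4096
Name-Real: $1
Name-Email: $2
Expire-Date: 0
%commit
EOF
}

# Show the parameters for a placeholder identity (substitute your own).
key_params "Jane Doe" "jane@example.com"

# To actually generate the key, save that output to a file and run:
#   gpg --batch --generate-key keyparams.txt
```

Keeping the parameters in a function makes it easy to eyeball exactly what you are about to ask gpg for before any key material exists.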
I'm assuming you're doing this on a laptop. Enable your full disk encryption! For the love of god, enable encryption. This is like, an incredibly easy step to data security that is an order of magnitude better than what many people do, and modern hardware is designed such that the performance penalty is (surprisingly!) almost non-existent. Enable encryption. Also make sure your backups are encrypted, too.
…enable automated backups if you haven't previously done so.
In the category of protecting your private key, you should strongly consider getting a YubiKey and using it to store the private part of your GPG keypair. This is considerably more secure than just leaving it on your hard drive (even encrypted!), since the private key itself will no longer be extractable. The YubiKey is basically a removable tamper-resistant hardware unit which understands how to use your private key to sign data that is passed in, returning the results to the caller without ever exposing the private key itself. Since the private key never leaves the YubiKey, and the YubiKey is tamper-resistant, we have a strong assurance that the key cannot be exfiltrated (encrypted or otherwise) by malicious parties.
Note that if you do something like this, you should also keep an (encrypted) offline backup of your private key, just in case your YubiKey is lost or damaged. I have a few old-school CD-Rs, though other solutions such as USB flash drives and even printed paper are also reasonable.
Anyway, this step is worth taking for two reasons. First, it's very cheap and easy (YubiKey runs about $50 USD and is very well documented and reliable). Second, it addresses the most realistic of the paranoid concerns about key security: namely, the fact that you probably run a bunch of random software on your laptop, all of which has access to your user data, which includes your encrypted private key, and all of which can access the internet at (almost) any time. With YubiKey, this is no longer a question, since no one can read your private key, only use it, and even using it requires the physical token.
A public/private keypair does literally no one any good if you're the only one who knows it. The public key that is. There are two ways that we're going to publish our new public key: GPG's Web of Trust and Keybase.
GPG was designed by some incredibly paranoid people. I have a strong appreciation for these sorts of people, since they think through some absolutely terrible and arcane possibilities, and then they design software which is impervious to these eventualities that 99% of the world never has to deal with. I'm glad they're thinking about these things so that I don't have to.
A central, guiding principle of GPG is that you should only trust the parties (in the general sense) that you have explicitly verified as trustworthy. So this probably means yourself and also the person you're communicating with. No one else should be trusted. Not your wifi. Not your ISP. Not some random server. This poses a bit of a problem though: if two people (who know each other, but are not in physical proximity) wish to securely communicate without previously obtaining each other's public keys, how do they do it? Once they have each other's public keys, the math takes over and we're all secure, but how do we get each party's public keys to each other without either side trusting any intermediary? After all, if I know that Edward is trying to send a message to Glenn, one of the easiest ways for me to intercept the contents of that message is to fool Edward into thinking that my public key is actually Glenn's. And if I'm the NSA, I have a lot of resources for covertly achieving this goal!
GPG's answer to this question involves something called a web of trust. The idea is that interested parties will run key servers which host public keys. A published public key contains not only the 4096 bit (or in most cases, smaller) integer representing the key itself and the metadata we entered during generation (name, email, expiration), but also a set of signatures which have been applied to that key. The idea is that if Bob is really sure that the key with fingerprint CAFE BABE belongs to Alice, then Bob can sign the contents of that public key (including the metadata) and publish that signature to a key server. This is literally a public attestation that Alice's key corresponds to the real Alice, and anyone who wants to talk to Alice can use that key.
Of course, this public attestation is only as trustworthy as Bob. Or rather, as trustworthy as Bob's public key. So we've taken the problem of trusting Alice's key and made it a problem of trusting Bob's key. But the idea is that maybe Bob is trusted by someone else, Carl. And maybe Carl is trusted by Diana. And maybe Diana is a friend of ours, and we trust them.
GPG allows us to express this notion of transitive trust by simply signing keys, just like Bob did for Alice, and then publishing the signatures. This publication doesn't require trusting the key server (they can't falsify the data, only corrupt it), but it allows us to establish a chain of intermediaries through which we can obtain some notion that Alice's key is that key right there.
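To make the "Bob signs Alice's key" step concrete, here is a rough sketch of what Bob might run. It assumes Alice's key is already imported into Bob's keyring and that Alice gave Bob her fingerprint in person. The helper names (normalize_fpr, same_fpr, certify_if_match) are made up for illustration. The gpg calls are only reached when certify_if_match is invoked explicitly.

```shell
# Strip whitespace and upper-case the hex digits, so that differences in
# spacing or letter case don't cause a false mismatch.
normalize_fpr() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Succeed only if two fingerprints are identical after normalization.
same_fpr() {
  [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Hypothetical workflow: certify a key only if the fingerprint gpg reports
# matches the one its owner handed over out of band.
#   $1 = email or key ID already in the keyring, $2 = expected fingerprint
certify_if_match() {
  actual=$(gpg --with-colons --fingerprint "$1" |
    awk -F: '$1 == "fpr" { print $10; exit }')
  if same_fpr "$actual" "$2"; then
    # Interactive: gpg asks for confirmation and your own password.
    gpg --sign-key "$actual" &&
      gpg --keyserver pgp.mit.edu --send-keys "$actual"
  else
    echo "fingerprint mismatch for $1; not signing" >&2
    return 1
  fi
}
```

The point of the comparison is that the signature is only worth anything if the check actually happened; the rest is just publishing the result.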
We can get this whole process rolling by publishing our public key to one of the standard key servers. pgp.mit.edu is a very commonly-used server, and it syncs with almost everything. You can publish your key to this server with the following command:
$ gpg --keyserver pgp.mit.edu --send-key [email protected]
Replace the email address with the one you entered earlier during key generation.
In practice, very very few people actually rely on the web of trust property of GPG's published signatures. It's just too awkward to use, and also quite difficult to verify. If there are six links between my key and yours (well above average, but still possible), how much do I really trust that your key is valid? Did everyone along the line verify their signatories to the extent that I would want, or did they just find someone's random key and attach their signature because they thought it was cool?
We have no way of knowing. So Keybase takes a different approach. Rather than trying to provide a mechanism by which Bob can verify that Alice is the real Alice – i.e. the living, breathing human being who has a public key – Keybase instead provides a mechanism by which Bob can verify that the Alice they're sending a message to is the same Alice they've been following on Twitter, Facebook, Github, and maybe more. (Bob has taken a sudden turn for the creepy in this metaphor) Literally, the idea is that we are our online presence, nothing more or less. This is a pretty cool concept, and I think in practice it more closely matches our notion of online identity.
From a more rigorous threat modeling standpoint, Keybase invites users to invest their trust in the extreme unlikelihood that all of Twitter, Github, Facebook, Reddit, the DNS infrastructure, and more are compromised simultaneously. As an example, take a look at my profile: https://keybase.io/djspiewak. There are several "proofs" listed: one for Twitter, one for Github, one for Reddit, and one for codecommit.com. Each of these proofs is a simple statement, signed by my key: I am djspiewak on Keybase. That's it.
In order for someone else to fool you into thinking that their key is in fact my key, they would need to have changed my Twitter proof and my Github proof and my Reddit proof and compromised my DNS server to edit my domain's TXT records. All at the same time. Ideally without anyone at any of these companies (or elsewhere!) noticing anything was amiss. Is that possible? Yes. Is it likely? Absolutely not. Even a nation state adversary would have to work very, very hard to either compromise or coerce all of these disparate parties into presenting incorrect information. To my knowledge, I am not the target of a nation state adversary, so I'm pretty comfortable with the system Keybase provides.
Critically though, we don't have to trust Keybase in any of this. The proofs are in the services (i.e. I actually tweeted, gisted, reddited, and edited my DNS records); Keybase only hosts the metadata which links these services together and displays it in a nice form, but that metadata is trivial. You don't have to take Keybase's word for it. You can visit the individual proofs, verify you're on Twitter/Github/Reddit/etc, and check the validity of the signature.
Keybase also has a ton of other features that we're basically not interested in for the purposes of this document. That doesn't matter though: their identity linking feature alone is worth signing up for.
I think sign-ups are open now. If they aren't, ping me and I'll shoot you an invite. Once you've created your account, upload your GPG key. DO NOT UPLOAD YOUR PRIVATE KEY! This is incredibly important. Don't do it. Only upload your public key. You'll be presented with a setup dialog in which you can choose public + private, or two different ways of uploading the public key. Do one of those ways. Do not upload your private key. To do so would be to trust Keybase's servers with your identity, and I don't trust anyone with that.
For the record, you can export your public key in ASCII form using the following command:
$ gpg -a --export [email protected]
That exported public key is just that: public. Publish it wherever you want. Send it to Keybase, post it to a gist, print it on a business card, whatever.
Anyway, once your key is uploaded, you can follow their instructions to publish a Twitter and Github proof, as well as any others you care to publish. I recommend at least those two, and ideally a few more.
While not providing any cryptographic features of note, Github does have the ability to associate a public key with an account. There is no cryptographic proof here – you're just taking Github's word for things – but that's still worth something. You can upload the public key exported using the gpg -a --export incantation in your Github Settings (under "SSH and GPG Keys").
As a sidebar, it's worth taking a moment to discuss key fingerprints. Every public key has a fingerprint, which is a 160 bit hash of the key material, conventionally written as 40 hexadecimal digits. As an example, mine is 27DD C030 0B8C 55E0 FF6A 14E8 4663 0499 1E23 B3D7. You can see your fingerprint by running the following command:
$ gpg --fingerprint [email protected]
Running this command will also show you a shorter, 8 hex digit descriptor (the short key ID). In the case of my key, this is 1E23B3D7. It is always the last 8 hex digits (the 32 least significant bits) of your 40 digit fingerprint. These descriptors are often used for concision, and since a substring of a hash is itself a hash, it is a valid thing to do. However, note that 32 bit IDs are, by definition, prone to collision. It's not at all improbable for two keys to have the same short ID, and this has been used by malicious parties as an avenue to attack certain people.
When in doubt, use more digits. It is effectively impossible (as of 2017) to present a fraudulent key which has the same 40 digit fingerprint as your bona-fide key. To do so would require a successful preimage attack, to which not even the lowly MD5 is vulnerable (I believe key fingerprints are actually SHA1 hashes).
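Since the shorter descriptors are nothing more than the tail end of the fingerprint, you can derive them yourself with plain string handling. This is a small illustration using my fingerprint from above. The function names are invented for the example, and the 16 character "long" form is shown on the assumption that you want something between the two extremes.

```shell
# Strip whitespace and upper-case the hex digits of a fingerprint as printed
# by `gpg --fingerprint`.
normalize_fpr() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Print the last $2 characters of the string $1.
last_chars() {
  printf '%s' "$1" | tail -c "$2"
}

# Short and long key IDs are the trailing 8 and 16 characters respectively.
short_id() { last_chars "$(normalize_fpr "$1")" 8; }
long_id() { last_chars "$(normalize_fpr "$1")" 16; }

fpr='27DD C030 0B8C 55E0 FF6A 14E8 4663 0499 1E23 B3D7'
echo "fingerprint: $(normalize_fpr "$fpr")"
echo "long id:     $(long_id "$fpr")"
echo "short id:    $(short_id "$fpr")"
```

The short ID printed here is 1E23B3D7, matching what gpg reports for my key. Every character you drop from the front makes a collision that much cheaper to manufacture.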
Any time you publish Scala artifacts, you should be signing them using your key. This signature cryptographically proves (which is to say, proves to at least the hardness of prime factoring 4096 bit integers) that you and you alone could have published those artifacts. If you publish artifacts to Sonatype, this is an actual hard requirement and they will reject unsigned releases. If you publish to Bintray, it is not a hard requirement, mostly because they don't understand PGP. You should do it anyway.
Publishing signed artifacts from SBT is nearly trivial, thanks to the sbt-pgp plugin. Once that plugin is in your build, just use the publishSigned task instead of publish and you're good to go! It is also relatively easy to configure sbt-release to use publishSigned instead of publish, if you're already using that plugin.
As a sidebar, you should probably configure useGpg := true in your build.sbt. This setting is important because it causes SBT to delegate to the gpg executable on your system, rather than directly accessing your keychain using the Bouncycastle Java crypto APIs. This is good for two reasons. First, Bouncycastle has several bugs which actually make it impossible to optimize the security of your GPG keys (e.g. my keys cannot be properly read by Bouncycastle, due to the way that I've air-gapped the primary key). Second, using gpg means that you will be entering the all-important primary key password (the one you set back when you generated the key) into gpg-agent rather than into some anonymous SBT readline. If you're on Mac and using GPG Tools (you should!), gpg-agent will have a nice UI and even integrate with the system Keychain if you want. More importantly, it is immune to several classes of attacks which could steal your private key, whereas a random SBT build into which anyone can inject arbitrary code for you to execute is… not.
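For the impatient, the two build changes described above can be applied from the command line. Treat this as a sketch rather than gospel: setup_sbt_pgp is a made-up helper, and the plugin coordinates and version are my assumption about the 1.x line of sbt-pgp, so check the plugin's README for the release that matches your sbt version.

```shell
# Hypothetical scaffolding helper.
#   $1 = root directory of an sbt project
setup_sbt_pgp() {
  mkdir -p "$1/project"
  # Add the sbt-pgp plugin to the build definition.
  # ASSUMPTION: these coordinates and this version may not match your setup.
  printf '\n%s\n' 'addSbtPlugin("com.jsuereth" % "sbt-pgp" % "1.1.0")' \
    >> "$1/project/plugins.sbt"
  # Delegate signing to the system gpg executable (and therefore gpg-agent).
  printf '\n%s\n' 'useGpg := true' >> "$1/build.sbt"
}

# Typical use, from the project root:
#   setup_sbt_pgp . && sbt publishSigned
```

The helper only appends, so running it against an existing build leaves whatever is already in those files alone.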
Anyway, with that out of the way, there's still a problem remaining. Even with all of the steps we've taken – uploading our key to MIT, creating Keybase proofs, and signing our published artifacts – it still isn't 100% clear that a truly paranoid consumer of our project would be able to trust the artifacts we pushed to Sonatype/Bintray. This is where signed tags come into play.
Basically all software releases correspond to tags in Git. For example, the 0.3 release of cats-effect corresponds to this tag in the git repo. These are the sources which were used to generate the artifacts which I then signed and published as cats-effect 0.3, and this is the proper place to "close the loop" on trust.
We've already proven that these random artifacts on Sonatype were published by some guy who controls both @djspiewak and djspiewak and a few more things. Now we need to prove that this person, whoever they are, is also the person who created the 0.3 release of cats-effect according to the source code. And we do this by signing the v0.3 tag with the same key that we use to sign the published artifacts:
$ git tag -s -m 'Tagging 0.3' v0.3
You'll be prompted to unlock your private key (by gpg-agent). The result will be a tag with some extra metadata: namely, a signature derived from the SHA1 hash of the commit and your private key. This signature is a cryptographic proof that you and you alone could have tagged the release in the repository, just as the signed artifacts are a cryptographic proof that you and you alone could have made the release on Sonatype/Bintray, and the fact that those two keys match means that a properly paranoid consumer of your software can rest easy knowing that no outside parties have tampered with the dependency they are about to bring in.
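From the consumer's side, checking both halves of that claim might look roughly like this. It's a sketch under a few assumptions: the signer's public key is already in the verifier's keyring, the artifact's detached signature sits next to it with an .asc suffix, and the run and verify_release helpers (plus the jar file name) are invented for the example. Setting DRY_RUN prints the commands without executing them.

```shell
# Hypothetical helper: print a command, then run it unless DRY_RUN is set.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

# Check a signed tag and a detached artifact signature.
#   $1 = tag name, e.g. v0.3
#   $2 = path to a downloaded artifact; signature expected at "$2.asc"
verify_release() {
  # Verify the tag's signature against the keys in your keyring.
  run git tag -v "$1" &&
    # Verify the artifact against its detached signature.
    run gpg --verify "$2.asc" "$2"
}

# Dry run with a placeholder file name: only prints the two commands.
( DRY_RUN=1; verify_release v0.3 example.jar )
```

Both commands report which key made the signature, so the last step is to compare that against the fingerprint you trust.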
If you do all this right, Github will reward you with a nifty little "Verified" badge on your releases page.
Take the time to get this stuff right. Please. As a community, we're usually pretty lazy about verifying that the dependencies we're about to import into our data center are in fact what we think they are, published by people we expect to be publishing those dependencies, and not a thinly-veiled attempt to steal our corporate data (or worse). As a maintainer, you have a responsibility not only to those who are casually adding your library to their toy project, but also to those serious security folks who actually understand sbt checkPgpSignatures and even use it on a regular basis.
And who knows? With the way the world is headed, you're probably going to want to beef up your personal cryptography story sooner rather than later.
Comments, criticisms, corrections, etc are all very welcome, but Gist doesn't forward them on to me (I'm assuming because Scott decided a long time ago that he really, really didn't want to get email from Gist and no one else did either). You're better off replying to this content on Twitter.