Tahoe LAFS is a distributed file system with an interesting permissions model. (whitepaper)
Both immutable and mutable files are supported (mutable files are the more complex and interesting case).
There are three levels of permissions: Write, Read, and Verify. Each permission is granted by giving a user a special key called a "capability". If you have the Write capability you can update the file; if you have the Read capability you can retrieve the plain text; but if you only have the Verify capability you can only validate the file integrity, not read the contents.
The lower level capabilities are generated deterministically from the higher level capabilities. So, someone who has the Write capability can generate the Read capability and give it to someone, who will then be able to read the plain text of that file.
The Tahoe LAFS paper describes two methods of implementing this 3-layer capability model; I'll just discuss the first one.
Here are some methods I'll use to describe the system. The specific cryptographic algorithms used for each are given in the paper.
// hash a text (or binary blob)
sum = hash(text)
// encrypt a text/blob
ciphertext = encrypt(key, plaintext)
// sign a text/blob.
signature = sign(private_key, text)
// generate a random salt
salt = random()
// generate the public key to a given private key
public_key = public(private_key)
// verify that a blob is signed
signed = verify(sig, public_key, blob)
// upload a file + metadata
upload(id, tuple)
Each file is associated with a private key (called the "signing key" in the paper), and each update to that file must be signed with that private key. (Note: the private key represents write access to the file, not the user.) The private key is encrypted and stored with the file metadata.
Given a public-private key pair, the various capabilities are generated like this:
write_cap = hash(private_key)
verify_cap = hash(public_key)
//the read cap is made up of hash(write_cap) and the verify_cap
read_cap = {hash(write_cap), verify_cap}
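The derivation chain above can be sketched in Python. This is only an illustration: SHA-256 is an assumed stand-in for the paper's hash, random bytes stand in for a real key pair, and the helper names (`h`, `write_cap_from`, `verify_cap_from`, `read_cap_from`) are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple


def h(data: bytes) -> bytes:
    """Stand-in for the paper's hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def write_cap_from(private_key: bytes) -> bytes:
    return h(private_key)


def verify_cap_from(public_key: bytes) -> bytes:
    return h(public_key)


def read_cap_from(write_cap: bytes, verify_cap: bytes) -> Tuple[bytes, bytes]:
    # The read cap bundles hash(write_cap) with the verify cap.
    return (h(write_cap), verify_cap)


# Random bytes stand in for a real signing key pair in this sketch.
private_key = secrets.token_bytes(32)
public_key = secrets.token_bytes(32)

write_cap = write_cap_from(private_key)
verify_cap = verify_cap_from(public_key)
read_cap = read_cap_from(write_cap, verify_cap)
```

Because the hash is one-way, someone holding only `read_cap` cannot work backwards to `write_cap`, and someone holding only `verify_cap` learns nothing about either.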
To write a new file, the writer generates a key pair, derives the capabilities, encrypts the file contents and the private key, signs the encrypted file + metadata, and then uploads everything.
{public_key, private_key} = generate_pair()
write_cap = hash(private_key)
verify_cap = hash(public_key)
read_cap = {hash(write_cap), verify_cap}
salt = random()
cipherkey = encrypt(hash(write_cap + salt), private_key)
ciphertext = encrypt(hash(read_cap + salt), plaintext)
sig = sign(private_key, hash(ciphertext + salt))
cipher_public_key = encrypt(hash(verify_cap + salt), public_key)
upload(file_id, {cipherkey, ciphertext, salt, sig, cipher_public_key})
The purpose of encrypting with hash(read_cap + salt) is so that each version of the file is encrypted with a different key, which protects against certain attacks. (This isn't mentioned specifically in the paper; I'm reading between the lines.)
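To make the per-version key idea concrete, here is a small Python sketch. It assumes SHA-256 as the hash and a 16-byte salt, neither of which comes from the paper; `version_key` is a hypothetical helper, and `read_key` stands in for the secret part of the read capability.

```python
import hashlib
import secrets


def version_key(read_key: bytes, salt: bytes) -> bytes:
    """Derive a per-version encryption key as hash(read_key + salt)."""
    return hashlib.sha256(read_key + salt).digest()


read_key = secrets.token_bytes(32)  # stand-in for the secret part of read_cap
salt_v1 = secrets.token_bytes(16)   # salt stored alongside version 1
salt_v2 = secrets.token_bytes(16)   # fresh salt chosen for version 2

key_v1 = version_key(read_key, salt_v1)
key_v2 = version_key(read_key, salt_v2)
```

The same capability yields a different key for each version, yet any reader can recompute the key because the salt is stored in the clear with the file.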
In the paper, the ciphertext is hashed with a Merkle tree, so that it is possible to spread the file across multiple servers. However, it's not necessary to consider this in order to understand the capability/permissions model.
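For readers who haven't met Merkle trees, the following Python sketch shows the general idea of hashing chunks of ciphertext into a single root. It is not Tahoe's actual tree layout: the chunk size, the `leaf:`/`node:` prefixes, the handling of odd nodes, and the `merkle_root` name are all assumptions made for illustration.

```python
import hashlib
from typing import List


def merkle_root(chunks: List[bytes]) -> bytes:
    """Compute a binary Merkle root; an unpaired node is carried up unchanged."""
    if not chunks:
        raise ValueError("need at least one chunk")
    # Hash every chunk into a leaf, then pair nodes level by level.
    level = [hashlib.sha256(b"leaf:" + c).digest() for c in chunks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                next_level.append(hashlib.sha256(b"node:" + pair).digest())
            else:
                next_level.append(level[i])
        level = next_level
    return level[0]


ciphertext = b"example ciphertext spread across several servers"
chunks = [ciphertext[i:i + 16] for i in range(0, len(ciphertext), 16)]
root = merkle_root(chunks)
```

Signing the root rather than a flat hash means each chunk can be checked on its own, which is what makes it practical to fetch pieces from different servers.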
Verification is also the first step in reading, so I'll explain it first.
The verifier downloads the encrypted file + metadata and then verifies the public key and the signature.
{cipherkey, ciphertext, salt, sig, cipher_public_key} = download(file_id)
//reconstruct the public key
public_key = decrypt(hash(verify_cap + salt), cipher_public_key)
if(verify_cap != hash(public_key))
  throw INVALID
if(!verify(sig, public_key, hash(ciphertext + salt)))
  throw INVALID
If no exceptions were thrown, the file is valid.
To read, continue from the end of the verify step: reconstruct the key and then decrypt the file.
plaintext = decrypt(hash(read_cap + salt), ciphertext)
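The read step can be tried end to end with a toy cipher. In this Python sketch, `keystream_xor` is a hypothetical stand-in for the paper's real cipher (it XORs the data with a SHA-256 counter keystream and is for illustration only), and `read_key` stands in for the secret part of the read capability.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def content_key(read_key: bytes, salt: bytes) -> bytes:
    """Key for one version of the file: hash(read_key + salt)."""
    return hashlib.sha256(read_key + salt).digest()


read_key = secrets.token_bytes(32)  # stand-in for the secret part of read_cap
salt = secrets.token_bytes(16)      # stored in the clear with the file

# Writer side: encrypt under the key derived from the capability and the salt.
ciphertext = keystream_xor(content_key(read_key, salt), b"hello tahoe")

# Reader side: the same capability plus the downloaded salt recovers the text.
plaintext = keystream_xor(content_key(read_key, salt), ciphertext)
```

XOR with the same keystream is its own inverse, which is why one function serves as both `encrypt` and `decrypt` here.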
Finally, to update a file, the updater only needs to know the write_cap, as they can use it to decrypt the private_key.
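//we only care about the cipher key and the salt in this case.
{cipherkey, _, salt, _, _} = download(file_id)
//reconstruct the private_key, and from it the public key
private_key = decrypt(hash(write_cap + salt), cipherkey)
public_key = public(private_key)
//the salt changes, so everything keyed on the old salt must be re-encrypted
salt2 = random()
cipherkey2 = encrypt(hash(write_cap + salt2), private_key)
ciphertext2 = encrypt(hash(read_cap + salt2), plaintext2)
sig2 = sign(private_key, hash(ciphertext2 + salt2))
cipher_public_key2 = encrypt(hash(verify_cap + salt2), public_key)
upload(file_id, {cipherkey2, ciphertext2, salt2, sig2, cipher_public_key2})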
The paper describes how Tahoe LAFS splits each file across multiple servers. This improves robustness, and it also means that a server can't get away with returning a stale version of a mutable file, because it won't agree with the other servers (assuming most of the servers are not in collusion).
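The "won't agree with the other servers" argument can be illustrated with a simple agreement check. This Python sketch is not Tahoe's actual protocol: the `agreed_version` function, the server names, and the quorum rule are all assumptions, and the version identifiers are placeholder bytes.

```python
from collections import Counter
from typing import Dict


def agreed_version(responses: Dict[str, bytes], quorum: int) -> bytes:
    """Return the version identifier reported by at least `quorum` servers."""
    if not responses:
        raise ValueError("no responses")
    version, count = Counter(responses.values()).most_common(1)[0]
    if count < quorum:
        raise ValueError("no version reached quorum")
    return version


# Hypothetical responses; server-c is serving a stale copy.
responses = {
    "server-a": b"version-2",
    "server-b": b"version-2",
    "server-c": b"version-1",
    "server-d": b"version-2",
}
current = agreed_version(responses, quorum=3)
```

A single stale or dishonest server is outvoted, which matches the caveat that most servers must not be colluding.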
Tahoe has a clever model for implementing permissions in which the server is not actually in a position of authority. Clients can always verify that what the server gave them is correct, and clients can delegate permissions to other clients without ever having to rely on the servers as referee, other than to run the protocol correctly!