-
2024-05-15: I took a quick look at the Matrix source code. I identified two issues and emailed them to their security@ email address.In my email, I specify that I plan to disclose my findings publicly in 90 days (i.e. on August 14), in adherence with industry best practices for coordinated disclosure, unless they request an extension in writing.
-
2024-05-16: I checked something else on a whim and find a third issue, which I also email to their security@ email address.
-
2024-05-17: Matrix security team confirms receipt of my reports.
-
2024-05-17: I follow up with a suspected fourth finding–the most critical of them all. They point out that it is not actually an issue, because I overlooked an important detail in how the code is architected. Mea culpa!
-
2024-05-18: A friend discloses a separate finding with Matrix’s protocol. (Will update this to link to their write-up once it’s public.)They instructed the Matrix developers to consult with me if they needed cryptography guidance. I never heard from them on this externally reported issue.
-
2024-07-12: I shared this blog post draft with the Matrix security team while reminding them of the public disclosure date.
-
2024-07-31: Matrix pushes a commit that announces that libolm is deprecated.
-
2024-07-31: I email the Matrix security team asking if they plan to fix the reported issues (and if not, if there’s any other reason I should withhold publication).
-
2024-07-31: Matrix confirms they will not fix these issues (due to its now deprecated status), but ask that I withhold publication until the 14th as originally discussed.
-
2024-08-14: This blog post is publicly disclosed to the Internet.
-
2024-08-14: The lead Matrix dev claims they already knew about these issues, and, in fact, knowingly shipped cryptography code that was vulnerable to side-channel attacks for years. See Addendum.
-
-
Save soatok/0a3d24710b2ccac2aac14820008b06ab to your computer and use it in GitHub Desktop.
I identified the following issues with Olm through a quick skim of their source code on Gitlab:
This is sorted by the order in which they were discovered, rather than severity.
Olm ships a pure-software implementation of AES, rather than leveraging hardware acceleration.
// Substitutes a word using the AES S-Box.
WORD SubWord(WORD word)
{
unsigned int result;
result = (int)aes_sbox[(word >> 4) & 0x0000000F][word & 0x0000000F];
result += (int)aes_sbox[(word >> 12) & 0x0000000F][(word >> 8) & 0x0000000F] << 8;
result += (int)aes_sbox[(word >> 20) & 0x0000000F][(word >> 16) & 0x0000000F] << 16;
result += (int)aes_sbox[(word >> 28) & 0x0000000F][(word >> 24) & 0x0000000F] << 24;
return(result);
}
The code in question is called from this code, which is in turn used to actually encrypt messages.
Software implementations of AES that use a look-up table for the SubWord step of the algorithm are famously susceptible to cache-timing attacks.
This kind of vulnerability in software AES was previously used to extract a secret key from OpenSSL and dm-crypt in about 65 milliseconds. Both papers were published in 2005.
A general rule in cryptography is, “attacks only get better; they never get worse“.
As of 2009, you could remotely detect a timing difference of about 15 microseconds over the Internet with under 50,000 samples. Side-channel exploits are generally statistical in nature, so such a sample size is generally not a significant mitigation.
In the above code snippet, the vulnerability occurs in aes\_sbox\[/\* ... \*/\]\[/\* ... \*/\]
.
Due to the details of how the AES block cipher works, the input variable (word) is a sensitive value.
Software written this way allows attackers to detect whether or not a specific value was present in one of the processor’s caches.
To state the obvious: Cache hits are faster than cache misses. This creates an observable timing difference.
Such a timing leak allows the attacker to learn the value that was actually stored in said cache. You can directly learn this from other processes on the same hardware, but it’s also observable over the Internet (with some jitter) through the normal operation of vulnerable software.
See also: cryptocoding’s description for table look-ups indexed by secret data.
The correct way to solve this problem is to use hardware accelerated AES, which uses distinct processor features to implement the AES round function and side-steps any cache-timing shenanigans with the S-box.
Not only is this more secure, but it’s faster and uses less energy too!
If you’re also targeting devices that don’t have hardware acceleration available, you should first use hardware acceleration where possible, but then fallback to a bitsliced implementation such as the one in Thomas Pornin’s BearSSL.
See also: the BearSSL documentation for constant-time AES.
Ed25519 libraries come in various levels of quality regarding signature validation criteria; much to the chagrin of cryptography engineers everywhere. One of those validation criteria involves signature malleability.
Signature malleability usually isn’t a big deal for most protocols, until suddenly you discover a use case where it is. If it matters, that usually that means you’re doing something with cryptocurrency.
Briefly, if your signatures are malleable, that means you can take an existing valid signature for a given message and public key, and generate a second valid signature for the same message. The utility of this flexibility is limited, and the impact depends a lot on how you’re using signatures and what properties you hope to get out of them.
For ECDSA, this means that for a given signature , a second signature is also possible (where is the order of the elliptic curve group you’re working with).
Matrix uses Ed25519, whose malleability is demonstrated between and .
This is trivially possible because S is implicitly reduced modulo the order of the curve, , which is a 253-bit number (0x1000000000000000000000000000000014def9dea2f79cd65812631a5cf5d3ed) and S is encoded as a 256-bit number.
The Ed25519 library used within Olm does not ensure that , thus signatures are malleable. You can verify this yourself by looking at the Ed25519 verification code.
int ed25519_verify(const unsigned char *signature, const unsigned char *message, size_t message_len, const unsigned char *public_key) {
unsigned char h[64];
unsigned char checker[32];
sha512_context hash;
ge_p3 A;
ge_p2 R;
if (signature[63] & 224) {
return 0;
}
if (ge_frombytes_negate_vartime(&A, public_key) != 0) {
return 0;
}
sha512_init(&hash);
sha512_update(&hash, signature, 32);
sha512_update(&hash, public_key, 32);
sha512_update(&hash, message, message_len);
sha512_final(&hash, h);
sc_reduce(h);
ge_double_scalarmult_vartime(&R, h, &A, signature + 32);
ge_tobytes(checker, &R);
if (!consttime_equal(checker, signature)) {
return 0;
}
return 1;
}
This is almost certainly a no-impact finding (or low-impact at worst), but still an annoying one to see in 2024.
If you’d like to learn more, this page is a fun demo of Ed25519 malleability.
To mitigate this, I recommend implementing these checks from libsodium.
If you weren’t already tired of cache-timing attacks based on table look-ups from AES, the Matrix base64 code is also susceptible to the same implementation flaw.
while (pos != end) {
unsigned value = DECODE_BASE64[pos[0] & 0x7F];
value <<= 6; value |= DECODE_BASE64[pos[1] & 0x7F];
value <<= 6; value |= DECODE_BASE64[pos[2] & 0x7F];
value <<= 6; value |= DECODE_BASE64[pos[3] & 0x7F];
pos += 4;
output[2] = value;
value >>= 8; output[1] = value;
value >>= 8; output[0] = value;
output += 3;
}
The base64 decoding function in question is used to load the group session key, which means the attack published in this paper almost certainly applies.
Steve Thomas (one of the judges of the Password Hashing Competition, among other noteworthy contributions) wrote some open source code a while back that implements base64 encoding routines in constant-time.
The real interesting part is how it avoids a table look-up by using arithmetic (from this file):
// Base64 character set:
// [A-Z] [a-z] [0-9] + /
// 0x41-0x5a, 0x61-0x7a, 0x30-0x39, 0x2b, 0x2f
inline int base64Decode6Bits(char src)
{
int ch = (unsigned char) src;
int ret = -1;
// if (ch > 0x40 && ch < 0x5b) ret += ch - 0x41 + 1; // -64
ret += (((0x40 - ch) & (ch - 0x5b)) >> 8) & (ch - 64);
// if (ch > 0x60 && ch < 0x7b) ret += ch - 0x61 + 26 + 1; // -70
ret += (((0x60 - ch) & (ch - 0x7b)) >> 8) & (ch - 70);
// if (ch > 0x2f && ch < 0x3a) ret += ch - 0x30 + 52 + 1; // 5
ret += (((0x2f - ch) & (ch - 0x3a)) >> 8) & (ch + 5);
// if (ch == 0x2b) ret += 62 + 1;
ret += (((0x2a - ch) & (ch - 0x2c)) >> 8) & 63;
// if (ch == 0x2f) ret += 63 + 1;
ret += (((0x2e - ch) & (ch - 0x30)) >> 8) & 64;
return ret;
}
Any C library that handles base64 codecs for private key material should use a similar implementation. It’s fine to have a faster base64 implementation for non-secret data.
Worth noting: Libsodium also provides a reasonable Base64 codec.
It’s been more than 2 years since they released vodozemac. What does the ecosystem penetration for this new library look like, in practice?
A quick survey of the various Matrix clients on GitHub says that libolm is still the most widely used cryptography implementation in the Matrix ecosystem (as of this writing):
Matrix Client | Cryptography Backend |
---|---|
https://github.com/tulir/gomuks | libolm (1, 2) |
https://github.com/niochat/nio | libolm (1, 2) |
https://github.com/ulyssa/iamb | vodozemac (1, 2) |
https://github.com/mirukana/mirage | libolm (1) |
https://github.com/Pony-House/Client | libolm (1) |
https://github.com/MTRNord/cetirizine | vodozemac (1) |
https://github.com/nadams/go-matrixcli | none |
https://github.com/mustang-im/mustang | libolm (1)) |
https://github.com/marekvospel/libretrix | libolm (1) |
https://github.com/yusdacra/icy_matrix | none |
https://github.com/ierho/element | libolm (through the python SDK) |
https://github.com/mtorials/cordless | none |
https://github.com/hwipl/nuqql-matrixd | libolm (through the python SDK |
https://github.com/maxkratz/element-web | vodozemac (1, 2, 3, 4) |
https://github.com/asozialesnetzwerk/riot | libolm (wasm file) |
https://github.com/NotAlexNoyle/Versi | libolm (1, 2) |
3 of the 16 clients surveyed use the new vodozemac library. 10 still use libolm, and 3 don’t appear to implement end-to-end encryption at all.
If we only focus on clients that support E2EE, vodozemac has successfully been adopted by 19% of the open source Matrix clients on GitHub.
I deliberately excluded any repositories that were archived or clearly marked as “old” or “legacy” software, because including those would artificially inflate the representation of libolm. It would make for a more compelling narrative to do so, but I’m not trying to be persuasive here.