Nicolas IOOSS, 2024-12-10
Like previous puzzles, the day before the last puzzle of ZK Hack V a tweet proposed a warm-up challenge:
Ready for ๐๐ ๐๐๐๐ค ๐ Puzzle #3 tomorrow?
It has now become a habit โ let's warm up! Can you figure out the name of our last puzzle?
๐๐ช๐น๐ซ๐โ๏ธ๐ฅ
The tweet includes an extract from series Pretty Little Liars, Season 4, Episode 19: Shadow Play (extract on YouTube, at 1:22) where Alison tells Spencer: "You have all the pieces. So why can't you put them together?"
This does not help much finding that the puzzle title will be Shadow (as revealed in a later Tweet).
How are the emojis related to the title?
Python provides a module to decode Unicode characters, unicodedata
.
It can be used in a simple loop to get the official names of the emojis:
import unicodedata
for c in "๐๐ช๐น๐ซ๐โ๏ธ๐ฅ":
print(f"{c} (U+{ord(c):5X}): {unicodedata.name(c)}")
๐ (U+1F344): MUSHROOM
๐ช (U+1F1EA): REGIONAL INDICATOR SYMBOL LETTER E
๐น (U+1F1F9): REGIONAL INDICATOR SYMBOL LETTER T
๐ซ (U+1FAD6): TEAPOT
๐ (U+1F41E): LADY BEETLE
โ (U+ 26C4): SNOWMAN WITHOUT SNOW
(U+ FE0F): VARIATION SELECTOR-16
๐ฅ (U+1F95D): KIWIFRUIT
The 6 emojis are actually represented by 8 Unicode characters because:
- The flag of Ethiopia is encoded with two regional indicator symbols associated to its ISO 3166-1 alpha-2 country code ET.
VARIATION SELECTOR-16
is used to make the snowman displayed as an emoji (instead of text). This behavior is specified in Unicodeยฎ Technical Standard #51 Unicode Emoji, ED-9. emoji presentation selector.
Anyway, the emojis are:
Mushroom
Ethiopia Flag
Teapot
Lady Beetle
Snowman Without Snow (Emoji)
Kiwifruit
Extracting the third letters gives "shadow", the title of the puzzle.
The puzzle was released on https://zkhack.dev/zkhackV/puzzleV3.html:
Itโs a wonderful era of ZK where we can use it for flexible and safe authentication mechanisms. Or can we? This puzzle was inspired by the zkEmail and Aptos Keyless circuit codebases.
It contains a link to a GitHub repository: https://github.com/ZK-Hack/puzzle-shadow. After some commands to install Noir and the Barretenberg proving backend, the README file presents the subject of the puzzle:
A cutting-edge crypto company unveiled http://JWT.pk, a revolutionary identity infrastructure platform designed to simplify private key management. By allowing users to register seamlessly with existing keys, the service promised to redefine convenience and security in the digital world. It allows users to send amounts up to $100 without having access to their private keys.
The registration process goes as follows: Users sign up with their existing public keys (pk) http://JWT.pk then samples a secret value called โpepperโ and computes an identifier of the form: โ{pk}_{pepper}โ. SHA256 of the identifier is then added to the set of registered identifiers.
Users can later authenticate a transaction by simply providing a proof of knowledge of the whitelisted identifier formed by the (pk, pepper) pair without needing to use their private key at any step of the process.
Alice found out that Bob registered to http://JWT.pk with a public key โBOB_pkโ. She then registered using โALICE_pkโ and obtained โALICE_pepperโ from http://JWT.pk.
Bob woke up one morning to see his account drained. How did Alice do it?
The repository is a Noir project with few files.
The implementation of the transaction authentication schema happens in src/main.nr
, with a function taking 3 parameters:
fn main(identifier: BoundedVec<u8, 128>, pub_key: pub [u8; 32], whitelist: pub [[u8; 32]; 10]) {
Another file, Prover.toml
, defines some values for these parameters.
Here is an simplified copy of this file:
# BEGIN DON'T CHANGE THIS SECTION
pub_key = [157, 133, 167, 136, ...]
whitelist = [...]
# END DON'T CHANGE THIS SECTION
# BEGIN HACK
[identifier]
len = "128"
storage = [...]
# END HACK
This makes it clear the puzzle aims to craft a valid identifier
for the provided public key and whitelist.
What does src/main.nr
do with the parameters?
It starts by computing the SHA256 digest of identifier
:
let digest = std::hash::sha256_var(identifier.storage(), identifier.len() as u64);
Then it verifies whitelist
contains this digest:
let mut present = false;
for i in 0..whitelist.len() {
if whitelist[i] == digest {
present = true;
}
}
assert(present);
Finally, it verifies identifier
contains pub_key
:
// the specified public key is in the identifier
let id_haystack: StringBody128 = StringBody::new(identifier.storage(), 128);
let pk_needle: SubString32 = SubString::new(pub_key, 32);
let (result, _): (bool, u32) = id_haystack.substring_match(pk_needle);
assert(result);
The puzzle does not provide a valid value for identifier
.
The README nonetheless indicates it is {pk}_{pepper}
with a public key pk
and a secret pepper
.
Alice's public key and pepper are provided in src/main.nr
:
// alice_pk = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187, 35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62]
// alice_pepper = [213, 231, 76, 105, 105, 96, 199, 183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254, 205, 79, 21, 90, 13, 39, 172, 114, 59, 131, 15, 78, 118]
We can create an identifier by concatenating these two lists and inserting an underscore between them. In Python:
alice_pk = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62]
alice_pepper = [213, 231, 76, 105, 105, 96, 199, 183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254,
205, 79, 21, 90, 13, 39, 172, 114, 59, 131, 15, 78, 118]
alice_id = alice_pk + [ord("_")] + alice_pepper
print(f"Alice ID = {alice_id}")
import hashlib
digest = hashlib.sha256(bytes(alice_id)).digest()
print(f"SHA256(Alice ID) = {list(digest)}")
Alice ID = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62, 95, 213, 231, 76, 105, 105, 96, 199,
183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254, 205, 79, 21, 90, 13, 39, 172, 114, 59, 131,
15, 78, 118]
SHA256(Alice ID) = [190, 109, 123, 228, 174, 32, 60, 171, 12, 164, 196, 218, 12, 200, 191, 101, 28,
15, 130, 203, 4, 165, 3, 157, 68, 159, 122, 209, 184, 103, 215, 149]
The digest matches the last entry of the whitelist (in Prover.toml
line 14)!
If we call the Noir program with Alice's public key and her identifier, it should work.
Let's modify Prover.toml
with:
pub_key = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62]
whitelist = [
# ... # the whitelist is not modified
]
[identifier]
len = "65"
storage = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62, 95, 213, 231, 76, 105, 105, 96, 199,
183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254, 205, 79, 21, 90, 13, 39, 172, 114, 59, 131,
15, 78, 118]
Running nargo execute
fails:
The value passed for parameter `identifier.storage` does not match the specified type:
Type Array { length: 128, typ: Integer { sign: Unsigned, width: 8 } } is expected to have length
128 but value Vec([Field(155), Field(143), [...], Field(78), Field(118)]) has length 65
Alice's identifier is 65-byte long but the Noir program is expecting 128 bytes in identifier.storage
.
Let's add enough zeros to fill 128 bytes, in Prover.toml
:
[identifier]
len = "65"
storage = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62, 95, 213, 231, 76, 105, 105, 96, 199,
183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254, 205, 79, 21, 90, 13, 39, 172, 114, 59, 131,
15, 78, 118, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0]
Now, nargo execute
succeeds and displays [puzzle3] Circuit witness successfully solved
.
We therefore managed to use Alice's public key and secret pepper to craft a valid identifier which authenticate her.
This is nice, but this is not a valid solution of the puzzle.
Indeed, the pub_key
in Prover.toml
was modified (to Alice's public key), whereas the file indicates "DON'T CHANGE THIS SECTION".
Putting the initial public key back makes nargo execute
fail:
error: Assertion failed: 'utils::search could not find needle in haystack'
โโ /root/nargo/github.com/noir-lang/noir_string_search.gitv0.3.0/src/utils.nr:27:12
โ
27 โ assert(found == true, "utils::search could not find needle in haystack");
โ -------------
โ
= Call stack:
1. /puzzle-shadow/src/main.nr:22:36
2. /home/user/nargo/github.com/noir-lang/noir_string_search.gitv0.3.0/src/lib.nr:238:13
3. /home/user/nargo/github.com/noir-lang/noir_string_search.gitv0.3.0/src/utils.nr:27:12
Taking a step back, this failure is normal: we are trying to authenticate someone else using Alice's identifier. If this worked, it would be an authentication bypass vulnerability. But actually the subject indicates there is such a vulnerability:
Bob woke up one morning to see his account drained. How did Alice do it?
So, how did Alice manage to authenticate as Bob?
The previous section constructed Alice's identifier and successfully used it with her public key.
But Prover.toml
contains another public key.
With the subject in mind, it is likely to be Bob's key.
Now, the puzzle becomes clear: it consists in crafting a valid identifier knowing Alice's secret credentials to successfully authenticate as Bob.
The identifier actually consists of two fields: a length (len
) and a storage of 128 bytes (storage
).
The Noir program starts by computing the SHA256 digest of the len
first bytes of storage
.
It verifies this digest is in the whitelist.
And it finally verifies the provided public key is anywhere in storage
... whatever len
is!
If we replace the zeros we added in storage
with Bob's public key, would it work?
[identifier]
len = "65"
storage = [155, 143, 27, 66, 87, 125, 33, 110, 57, 153, 93, 228, 167, 76, 120, 220, 178, 200, 187,
35, 211, 175, 104, 63, 140, 208, 36, 184, 88, 1, 203, 62, 95, 213, 231, 76, 105, 105, 96, 199,
183, 106, 26, 29, 7, 28, 234, 145, 69, 48, 9, 254, 205, 79, 21, 90, 13, 39, 172, 114, 59, 131,
15, 78, 118,
157, 133, 167, 136, 43, 161, 75, 166, 33, 14, 35, 106, 238, 18, 60, 56, 93, 209, 205, 52, 247,
110, 174, 192, 20, 58, 70, 42, 215, 98, 195, 150,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
$ nargo execute
...
[puzzle3] Circuit witness successfully solved
[puzzle3] Witness saved to /puzzle-shadow/target/puzzle3.gz
It does work! This is a valid solution of the puzzle.
Alice is able to impersonate any other user by putting their public key next to her identifier in storage
(with keeping len = "65"
).
To fix this vulnerability, several options exist:
- The program could verify
identifier.storage
only contains zeros afteridentifier.len
bytes. - The program could ensure
id_haystack.substring_match(pk_needle)
returns a position belowidentifier.len
. - The program could make
identifier
a fixed-size type[u8; 65]
, removing the need to pad data. - The program could instead verify
identifier.storage
starts withpub_key
. This also prevents an attacker from registering using the public key of a victim user as their pepper (doing so would make thesubstring_match
check pass).
The last option is the stronger one and here is a possible way to implement it:
use noir_string_search::{StringBody, StringBody128, SubString, SubString32};
fn main(identifier: BoundedVec<u8, 128>, pub_key: pub [u8; 32], whitelist: pub [[u8; 32]; 10]) {
// the identifier hashes to a digest that is in the public whitelist
let digest = std::hash::sha256_var(identifier.storage(), identifier.len() as u64);
let mut present = false;
for i in 0..whitelist.len() {
if whitelist[i] == digest {
present = true;
}
}
assert(present);
// the identifier starts with the specified public key
for i in 0..32 {
assert_eq(identifier.storage()[i], pub_key[i]);
}
}
Using padding to make variable-length data fit into fixed-length arrays is dangerous.
This puzzle illustrated what can happen when functions ignoring the variable-length nature of the data are used to verify sensitive properties such as "the data contains the provided public key". Here, this enabled Alice to impersonate Bob and perform unauthorized actions on his behalf.