- Author: Nicolas IOOSS
- Date: 2024-01-18
- Puzzle: https://zkhack.dev/zkhackIV/puzzleF1.html, https://github.com/ZK-Hack/puzzle-gamma-ray
Bob was deeply inspired by the Zcash design [1] for private transactions and had
some pretty cool ideas on how to adapt it for his requirements. He was also
inspired by the Mina design for the lightest blockchain and wanted to combine
the two. In order to achieve that, Bob used the MNT6753 cycle of curves to
enable efficient infinite recursion, and used elliptic curve public keys to
authorize spends. He released a first version of the system to the world and
Alice soon announced she was able to double spend by creating two different
nullifiers for the same key...
[1] https://zips.z.cash/protocol/protocol.pdf
Let's find out what the implementation contains and which bugs it could have!
The puzzle takes place in a GitHub repository containing a Rust project and some other files.
The main code is in src/main.rs
.
Function main
loads some files, computes some things and then generate a proof:
let c = SpendCircuit {
leaf_params: leaf_crh_params.clone(),
two_to_one_params: two_to_one_crh_params.clone(),
root: root.clone(),
proof: tree_proof.clone(),
nullifier: nullifier.clone(),
secret: leaked_secret.clone(),
};
let proof = Groth16::<MNT4_753>::prove(&pk, c.clone(), rng).unwrap();
assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());
This code involves some words which are unfamiliar for newcomers to the Zero-Knowledge world.
First, it crafts a SpendCircuit
structure.
The ZK Jargon Decoder defines "Circuit" as:
In ZK literature this usually refers to an arithmetic circuit: an ordered collection of additions and multiplications (and sometimes custom operations) that are applied to a set of inputs to yield an output.
In the same file, there is:
impl ConstraintSynthesizer<ConstraintF> for SpendCircuit {
fn generate_constraints(
self,
cs: ConstraintSystemRef<ConstraintF>,
) -> Result<(), SynthesisError> {
This function crafts operations between the variables present in structure SpendCircuit
(leaf_params
, two_to_one_params
, root
...) and ensures the result of these operations matches some properties called constraints.
This is used in the project to generate many constraints used in a Groth16 proving system. This system enables building proofs about properties such as "the prover knows some values which match all constraints". These values are called witnesses or private inputs. And the proving system also enables verifying these proofs without actually knowing what the witnesses are.
What is Groth16? When asking this on https://zkpod.ai/, the chatbot replies:
Groth16 is a proving system used in verifying polynomial relationships. It is commonly deployed in systems such as PLONK or Groth16's own structure. It's a cost-effective solution but requires a trusted setup ceremony. The Groth16 proof systems were introduced via the Groth 16 paper. Groth16 works more efficiently with pairing friendly curves. In the zkEVM project, a STARK proof is created and then verified using either PLONK or Groth16, combining the advantages of both systems.
Sources: Episode 194, Episode 242, Episode 289
There are also some blog articles about Groth16, such as https://blog.lambdaclass.com/groth16/. This article explains that the name Groth16 actually comes from the paper "On the Size of Pairing-based Non-interactive Arguments" published by Jens Groth in 2016.
Groth16 relies on an elliptic curve to perform operations.
Here, the code uses Groth16::<MNT4_753>
which uses a type imported with this line at the top of src/main.rs
:
use ark_mnt4_753::{Fr as MNT4BigFr, MNT4_753};
This uses the Rust crate ark-mnt4-753
, which documents:
This library implements the MNT4_753 curve generated in [BCTV14]. The name denotes that it is a Miyaji-Nakabayashi-Takano curve of embedding degree 4, defined over a 753-bit (prime) field. The main feature of this curve is that its scalar field and base field respectively equal the base field and scalar field of MNT6_753.
In short, the program generates a proof using Groth16 proving system on MNT4-753.
Now, what is actually being proved?
In the project, src/main.rs
builds a circuit and generates a proof with Groth16.
The circuit uses six parameters defined in a structure named SpendCircuit
in lines 63-71:
struct SpendCircuit {
pub leaf_params: <LeafH as CRHScheme>::Parameters,
pub two_to_one_params: <LeafH as CRHScheme>::Parameters,
pub root: <CompressH as TwoToOneCRHScheme>::Output,
pub proof: Path<MntMerkleTreeParams>,
pub secret: ConstraintF,
pub nullifier: ConstraintF,
}
Right now, it is unclear what these fields mean. The types can give some pointers to the documentation.
CRHScheme
is defined in ark_crypto_primitives
and documented as "Interface to CRH". But what does CRH mean?
A paper on Cryptology ePrint Archive provides the meaning of this acronym:
Collision-resistant hash functions (CRH) are a fundamental and ubiquitous cryptographic primitive.
So it seems similar to Cryptographic Hash Functions (CHF).
MntMerkleTreeParams
refers to a Merkle Tree, defined in Wikipedia as:
In cryptography and computer science, a hash tree or Merkle tree is a tree in which every "leaf" (node) is labelled with the cryptographic hash of a data block, and every node that is not a leaf (called a branch, inner node, or inode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure.
The Wikipedia article also gives a helpful graph:
This makes the purpose of root
and TwoToOneCRHScheme
clearer: root
is the top hash of the Merkle tree and TwoToOneCRHScheme
is the collision-resistant hash function (CRH) used to combine two hashes into one.
By the way, this is called a tree because it is a special kind of graph.
nullifier
is a term used in the Zcash Protocol Specification given in the references to the puzzle.
In section 3.2.3:
The nullifier for a note [...] is a value unique to the note that is used to prevent double-spends.
In the project, line 155 of src/main.rs
contains:
let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();
This code actually computes the hash of a value named leaked_secret
, which is used in field secret
.
The nullifier
is therefore the hash of the secret
.
Why is the secret leaked?
This will surely become clearer later.
Now, let's analyze function generate_constraints
in line 74 to better understand how secret
, root
, nullifier
... are tight together in the Groth16 proof.
As the code is quite verbose, it will not be copied here.
Instead, here is an English translation of the code:
- Define a variable
root
as a public input. - Define
leaf_crh_params_var
andtwo_to_one_crh_params_var
as constants. - Define
secret
as a witness (so a private input). - Extract the bits of
secret
insecret_bits
(in Little Endian order). - Enforce that
secrets
is smaller or equal the modulus of MNT6-753 scalar field (MNT6BigFr::MODULUS
). - Define
nullifier
as an output. - Compute
nullifier_in_circuit
as the hash of[secret]
with parametersleaf_crh_params_var
and enforce that it equals tonullifier
. - Define
base
as the generator of MNT6-753's G1 curve (defined inark-mnt6-753
's documentation). - Define
pk
as the scalar product ofsecret_bits
withbase
, in affine representation. This operation is usually done in elliptic-curve cryptography to compute the public key (pk
) associated with a private key (secret
). - Define
leaf_g
as[pk.x]
, a one-element vector with thex
coordinate of the public key. - Define
cw
as a witness associated with fieldproof
of structureSpendCircuit
. Its type isPathVar
, which represents a Merkle tree path. - Verify that
cw
is a valid proof thatleaf_g
is a member of the Merkle tree with rootroot
, and parametersleaf_crh_params_var
andtwo_to_one_crh_params_var
.
As there are many variables, let's simplify this a bit.
The previous sections explained how the project is crafting a Groth16 proof on curve MNT4-753 with constraints which seem quite complex at first glance. This complexity can be overcome with a few notations.
First, function main
contains (line 153:
let two_to_one_crh_params = leaf_crh_params.clone();
So in practice, there is a single hash function, defined with parameters given by poseidon_parameters
in src/poseidon_parameters.rs
.
Let's call this function
Second, MNT4-753 and MNT6-753 curves use several parameters which can be confusing, because the base field of one curve is the scalar field of the other.
Let's use the scalar prime numbers and define MNT4BigFr::MODULUS
and MNT6BigFr::MODULUS
.
Third, proof
in structure SpendCircuit
is computed by generating a proof of membership of a Merkle tree on line 166.
The circuit verifies this proof, linking pk
and root
together.
Finally, a question remains: in which finite field(s) do values like secret
, nullifier
... lie?
All these variables use type ConstraintF
, defined as MNT4BigFr
on line 22.
So secret
and nullifier
are numbers in MNT4-753's scalar field.
Moreover, the coordinates of points on MNT6-753's G1 curve are also defined on MNT4-753's scalar field (which is why both curves are said to form a cycle).
So pk.x
is also a number in MNT4-753's scalar field.
With these notations, the constraints become:
- Ensure
secret
$\le r_6$ - Compute
nullifier
=$H$ (secret
) and ensure it is equals to thenullifier
given as public input (in Zero-Knowledge circuits, it is common for output variables to be considered as public inputs). - Compute
pk
as the public key on MNT6-753's G1 curve associated withsecret
. It is a point on an elliptic curve and its affine representation consists in two coordinates(pk.x, pk.y)
. - Ensure
[pk.x]
is a leaf of the Merkle tree built with hash$H$ and defined with rootroot
(given as a public input).
And this is what is verified on line 187 when the Groth16 verification is invoked with root
and nullifier
:
assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());
In a nutshell, the project seems to craft a proof for the following statement:
"The prover knows secret
nullifier
and whose public key is a leaf of the Merkle tree with root root
".
Actually, this is similar to the Semaphore circuit.
secret
is used as a private key of a member of a group, which is probably why it was named leaked_secret
: the project is able to generate a proof because it knows one private key.
The previous sections described that the circuit implemented in the project verified some constraints linking secret
, nullifier
and root
.
Is it secure?
The subject says that it is not:
Alice soon announced she was able to double spend by creating two different nullifiers for the same key...
As the nullifier is directly computed from the secret key (with a hash function), having two different values for nullifier
while using the same secret
seems difficult.
What about the public key?
secret
is an element of MNT4-753's scalar field, which can be seen as an integer such that: secret
pk
is computed on MNT6-753's G1 curve, which means that secret
is considered modulo secret
nullifier
.
However the circuit blocks this attack, by verifying secret
But in fact, the public key used in the Merkle tree is not the point pk
but only its x
coordinate.
The reason why the circuit does this is given by the first hint of the puzzle:
Hint 1. See lemma 5.4.7 in the Zcash specs for Bob's reasoning on why to only use the x coordinate when storing the public keys.
Indeed, lemma 5.4.7 page 102 of section "5.4.9.4 Coordinate Extractor for Jubjub" justifies that if a point
However the project is using here MNT6-753's G1 curve instead of Jubjub. This curve is very different:
- The equation of MNT6-753's G1 curve is
$y^2 = x^3 + ax + b$ (with$a$ and$b$ being some constants). This is usually called the Weierstrass form of the curve. - The equation of Jubjub is
$au^2 + v^2 = 1 + d u^2 v^2$ . This is usually called a twisted Edwards curve.
With a curve in Weierstrass form, when -secret
is a private key for the point -pk
whose x
coordinate is the same as pk.x
.
This seems promising!
But secret
is represented as a non-negative integer below secret_hack
= secret
(this works because secret
is also below
// Compute the private key of -pk and the associated nullifier
let secret_hack = MNT4BigFr::from(MNT6BigFr::MODULUS) - leaked_secret;
let nullifier_hack =
<LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![secret_hack]).unwrap();
// Ensure the new nullifier is different
assert_ne!(nullifier, nullifier_hack);
// Generate a proof with this new secret and nullifier, using the same public key
let c2 = SpendCircuit {
leaf_params: leaf_crh_params.clone(),
two_to_one_params: two_to_one_crh_params.clone(),
root: root.clone(),
proof: tree_proof.clone(),
nullifier: nullifier_hack.clone(),
secret: secret_hack.clone(),
};
let proof = Groth16::<MNT4_753>::prove(&pk, c2.clone(), rng).unwrap();
// Verify the proof is correct
assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier_hack], &proof).unwrap());
This works fine: the attack generated a different nullifier for the same public key!
The project implements a Semaphore-like circuit where each member of a Merkle tree holds a secret and can publish some nullifier associated with them. It reused ideas from the Zcash protocol specification to use only the first coordinate of the public key in the Merkle tree. But it used a curve in Weierstrass form instead contrary to Zcash. This caused a vulnerability where members are able to craft a second nullifier from their secret.
The project loads data from three binary files:
leaves.bin
contains the leaves of the Merkle tree.leaked_secret.bin
contains thesecret
of a member of the Merkle tree.proof_keys.bin
contains the Groth16 proving key and verifying key used by the proving system.
The leaves can be displayed by inserting in function main
:
for (i, leave) in leaves.iter().enumerate() {
for (j, item) in leave.iter().enumerate() {
println!("leaves[{i}][{j}] = {item}");
}
}
When executed with cargo run -r
, this displays:
leaves[0][0] = 39453539927994817197765579308813928157713497906804571172812392407301200842720302569364713338155904681857308292513537321122003926745548105057267811518930892533431272741854608293778101386134980475696342437084239813479128774968644
leaves[1][0] = 33254115487803193787467741044441012720225989305267212137687962176807969668034363565356747232971449755943420422841908954891611894712029140840387178281636867610914501870223357373267720354319774975690575899689396890107253424770714
leaves[2][0] = 34132631296389675482710216196855384730588026618643388196381197455165096474322420842000474392388910211605440962886889520537257509916782391325714095547727483867859845499350620212129517189288791002573129189316776191236321369792874
leaves[3][0] = 5778575329528145851468015897113673238556086946311219717817423197294182566563330618169530892520688041650035126157465821707448313825994298787611366263021675473157427814051271926395567527456736782324997432867019260585338926662533
The public key associated with the leaked secret
can be displayed with:
let base = G1Affine::generator();
let pk = base.mul_bigint(leaked_secret.into_bigint()).into_affine();
println!("pk.x = {}", pk.x);
pk.x = 34132631296389675482710216196855384730588026618643388196381197455165096474322420842000474392388910211605440962886889520537257509916782391325714095547727483867859845499350620212129517189288791002573129189316776191236321369792874
Therefore leaves[2] == [pk.x]
, meaning the leaked secret belong to the member of the quorum with index 2 (which is also why there is let i = 2
on line 152).
The root of the Merkle tree (computed on line 163 is:
root = 36599113137087969930840997920162741481243089880760623122221627324493044885888902307937689027223976731946291368601574061500023827182334658903539183702082327695077702000596999787788687607224158883109451285952662417426402885087313
For the secret and the attack, here are the values:
leaked_secret = 39846838468154423503928570543886746237397699869367684446977339808678485551027766891422021947516399629698996987371709141437423308118294917956395865033556420042349607780370689414551732365417821124712925443064651277726630250956495
nullifier = 11299723492798861325473231888351413863098247005209121305093211498445081436515006186642409996808912085756554418514162237593128468013985768925807559260959504690965586740735752967636401896633695685701092305548913938349088716371493
secret_hack = 2051652499764529898415644247353890890773010050586264624806163112346867261543339881636871816273939291719073984516544644676930418411289467245195740688456706426581796567579151128456253962325641729007702608627489987576484470733106
nullifier_hack = 6881215798018018599531910386444885098302033279235277464646929488957043263036777273191609551816397881089283753501421676811673218332777161277972031115981705870699000891633061494006576100324261080415411526636434624843944169263053
This document talks about the Collision-Resistant Hash function (CRH) which was used without giving much details.
Nevertheless, as it uses custom parameters in src/poseidon_parameters.rs
, is this function secure?
First, which hash function is being used?
On line 151 there is the computation of nullifier
from leaked_secret
:
let leaf_crh_params = poseidon_parameters::poseidon_parameters();
let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();
This uses the type LeafH
, which was defined as poseidon::CRH<ConstraintF>
with type ConstraintF = MNT4BigFr;
and poseidon
coming from ark_crypto_primitives::crh::poseidon
.
This module implements the Poseidon hash function defined in an ePrint paper titled "Poseidon: A New Hash Function for Zero-Knowledge Proof Systems
".
Function evaluate
and the functions it calls are quite abstract, because they need to support many use-cases.
In the project, the CRH is only called to hash one or two elements of MNT4-753's scalar field into a single element of the same field.
This actually enables simplifying the code a lot, as the permutation layer is only called once and there is no bitwise conversion.
Here are two functions reimplementing this CRH:
/// CRH with one element as input, one element as output
fn crh_1(input: MNT4BigFr) -> MNT4BigFr {
crh_2(input, MNT4BigFr::zero())
}
/// CRH with two elements as input, one element as output
fn crh_2(input0: MNT4BigFr, input1: MNT4BigFr) -> MNT4BigFr {
// Load the parameters
let params = poseidon_parameters::poseidon_parameters();
// Initialize the state after absorbing 2 elements
// (capacity = 1 and rate = 2 so the state contains 1+2 = 3 elements)
let mut s0 = MNT4BigFr::zero();
let mut s1 = input0;
let mut s2 = input1;
for i in 0..37 {
// Retrieve the Additive Round Keys (ARK) for round i
let ark = ¶ms.ark[i];
let mut new_s0 = s0 + ark[0];
let mut new_s1 = s1 + ark[1];
let mut new_s2 = s2 + ark[2];
// S-boxes with exponent alpha = 17
new_s0 = new_s0.pow(&[17]);
if i < 4 || i >= 33 {
// 4 full rounds, 29 partial rounds and 4 full rounds
new_s1 = new_s1.pow(&[17]);
new_s2 = new_s2.pow(&[17]);
}
// Apply a near Maximally Distance Separating (MDS) Matrix
// [[1, 0, 1],
// [1, 1, 0],
// [0, 1, 1]]
s0 = new_s0 + new_s2;
s1 = new_s0 + new_s1;
s2 = new_s1 + new_s2;
}
s1
}
To test these functions, the following assertions can be added to function main
:
// Compute the nullifier
assert_eq!(crh_1(leaked_secret), nullifier);
// Compute the root of the Merkle tree
assert_eq!(
crh_2(
crh_2(crh_1(leaves[0][0]), crh_1(leaves[1][0])),
crh_2(crh_1(leaves[2][0]), crh_1(leaves[3][0]))
),
root
);
The Additive Round Keys (ARK) are hard-coded to some undocumented values.
Searching them online gave some results:
-
Rust crate
ark_crypto_primitives
tests its Merkle tree implementation with these parameters: https://github.com/arkworks-rs/crypto-primitives/blob/v0.4.0/src/merkle_tree/tests/test_utils.rs. However the way there were generated is not documented and the parameters actually do not match the curve which is used, as they are above the order of its scalar field. I reported this issue in arkworks-rs/crypto-primitives#131. -
https://github.com/scipr-lab/libiop is a C++ project which defines the exact same parameters in https://github.com/scipr-lab/libiop/blob/a2ed2ec2f3e85f29b6035951553b02cb737c817a/libiop/bcs/hashing/poseidon.tcc#L402-L463:
alpha = 17
,capacity = 1
,rate = state_size - capacity = 3 - 1 = 2
,full_rounds = 8
,partial_rounds = 29
, near MDS matrix, and parameters starting with9478896780421655835758496955063136571251874317427585180076394551808670301829
. These parameters are justified from a file next to the C++ code:github.com/scipr-lab/libiop:libiop/bcs/hashing/poseidon.md
. There were added in a large git commit from August 2020. Moreover, this project uses the Alt-BN128 curve, with scalar field order$r = 21888242871839275222246405745257275088548364400416034343698204186575808495617$ . All values are below this number. -
The same parameters are also used in https://github.com/ValarDragon/temp_hashing/blob/f9a94baec7317cf2a636c7e3e1e1256f6b60234b/src/hashing/poseidon/mod.rs#L103, which dates back from June 2020. This file reuses the same round constants in a larger setting in function
libiop_alpha_5_state_17_poseidon
withstate_size = 17
(instead of 3) andpartial_rounds = 58
(instead of 29). This last function gives a sequence of 1122 numbers which begins with the same numbers as the ones used in the puzzle. This cannot be a coincidence.
Generating Poseidon Additive Round Keys can be done with some scripts available on https://extgit.iaik.tugraz.at/krypto/hadeshash. Here is how it is possible to use them
# Generate parameters for a field over Alt-BN128's scalar field with
# state_size = 17, full_rounds = 8, partial_rounds = 58
sage generate_parameters_grain.sage 1 0 254 17 8 58 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Number of round constants: 1122" and 1122 numbers starting with
# 0x0851ed0fd5e0aee13ce60c79c538512b509f59d97f0f7e27b6a475b4d8347086
# Generate parameters for a the same field and
# state_size = 3, full_rounds = 8, partial_rounds = 29
sage generate_parameters_grain.sage 1 0 254 3 8 29 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Number of round constants: 111" and numbers starting with
# 0x1b673c72211267578875c48f8a1a35ad0a6ca50a53897bb46669bbd506691f26
# Generate parameters with the "Poseidon" script, for the same field and
# state_size = 3, alpha = 17, security_level = 128
sage generate_params_poseidon.sage 1 0 254 3 17 128 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Params: n=254, t=3, alpha=17, M=128, R_F=8, R_P=31",
# (so 8 full rounds and 31 partial rounds), "Number of round constants: 117" and
# numbers starting with
# 0x195c895306d9937da3bd2c4ca106749f9ff9636fa7af7f6c300fb253e9ad820a
# Generate parameters with the "Poseidon" script, for the same field and
# state_size = 3, alpha = 5, security_level = 128
sage generate_params_poseidon.sage 1 0 254 3 5 128 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Params: n=254, t=3, alpha=5, M=128, R_F=8, R_P=56",
# (so 8 full rounds and 56 partial rounds), "Number of round constants: 192" and
# numbers starting with
# 0x1d066a255517b7fd8bddd3a93f7804ef7f8fcde48bb4c37a59a09a1a97052816
The computed round constants never matched the Additive Round Keys used in the puzzle.
Anyway, the puzzle did not use curve Alt-BN128 but MNT4-753 with src/poseidon_parameters.rs
because every number in this file is 254 bits or less.
This raises another question. If the Poseidon Additive Round Keys were generated for other parameters than the ones used in the puzzle, is the hash actually a collision-resistant hash (CRH)? It should be possible to answer this question by verifying the effective security level.
The script which generates Poseidon parameters provide a helpful function named sat_inequiv_alpha
.
By adding at the end of the file:
print(sat_inequiv_alpha(p=PRIME_NUMBER, t=NUM_CELLS, R_F=8, R_P=29, alpha=17, M=128))
The Sage script displays:
$ sage generate_params_poseidon.sage 1 0 753 3 17 128 0x1c4c62d92c41110229022eee2cdadb7f997505b8fafed5eb7e8f96c97d87307fdb925e8a0ed8d99d124d9a15af79db26c5c28c859a99b3eebca9429212636b9dff97634993aa4d6c381bc3f0057974ea099170fa13a4fd90776e240000001
...
True
If I correctly understand what function sat_inequiv_alpha
does, this means the parameters used in the puzzle are acceptable (8 full rounds and 29 partial rounds with alpha=17 on MNT4-753 scalar field).
The fact that the Additive Round Keys are small compared to
To sum up, I do not believe there exists a practical attack on the hash function used in the Merkle tree in the puzzle.
So once Bob fixes the vulnerability with the way the public keys are used in the Merkle tree (for example by using the full coordinates instead of only the x
one), his system seems to be secure.