Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save sebastiandanconia/9f46ddcc41973f96623f16016ada5771 to your computer and use it in GitHub Desktop.
Save sebastiandanconia/9f46ddcc41973f96623f16016ada5771 to your computer and use it in GitHub Desktop.
Write-up for ZK Hack IV puzzle F1: Gamma Ray

Write-up for ZK Hack IV puzzle F1: Gamma Ray

1. Subject

Bob was deeply inspired by the Zcash design [1] for private transactions and had
some pretty cool ideas on how to adapt it for his requirements. He was also
inspired by the Mina design for the lightest blockchain and wanted to combine
the two. In order to achieve that, Bob used the MNT6753 cycle of curves to
enable efficient infinite recursion, and used elliptic curve public keys to
authorize spends. He released a first version of the system to the world and
Alice soon announced she was able to double spend by creating two different
nullifiers for the same key...

[1] https://zips.z.cash/protocol/protocol.pdf

Let's find out what the implementation contains and which bugs it could have!

2. From Rust code to Zero Knowledge proof

The puzzle takes place in a GitHub repository containing a Rust project and some other files. The main code is in src/main.rs. Function main loads some files, computes some things and then generate a proof:

let c = SpendCircuit {
    leaf_params: leaf_crh_params.clone(),
    two_to_one_params: two_to_one_crh_params.clone(),
    root: root.clone(),
    proof: tree_proof.clone(),
    nullifier: nullifier.clone(),
    secret: leaked_secret.clone(),
};

let proof = Groth16::<MNT4_753>::prove(&pk, c.clone(), rng).unwrap();

assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());

This code involves some words which are unfamiliar for newcomers to the Zero-Knowledge world. First, it crafts a SpendCircuit structure. The ZK Jargon Decoder defines "Circuit" as:

In ZK literature this usually refers to an arithmetic circuit: an ordered collection of additions and multiplications (and sometimes custom operations) that are applied to a set of inputs to yield an output.

In the same file, there is:

impl ConstraintSynthesizer<ConstraintF> for SpendCircuit {
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ConstraintF>,
    ) -> Result<(), SynthesisError> {

This function crafts operations between the variables present in structure SpendCircuit (leaf_params, two_to_one_params, root...) and ensures the result of these operations matches some properties called constraints.

This is used in the project to generate many constraints used in a Groth16 proving system. This system enables building proofs about properties such as "the prover knows some values which match all constraints". These values are called witnesses or private inputs. And the proving system also enables verifying these proofs without actually knowing what the witnesses are.

What is Groth16? When asking this on https://zkpod.ai/, the chatbot replies:

Groth16 is a proving system used in verifying polynomial relationships. It is commonly deployed in systems such as PLONK or Groth16's own structure. It's a cost-effective solution but requires a trusted setup ceremony. The Groth16 proof systems were introduced via the Groth 16 paper. Groth16 works more efficiently with pairing friendly curves. In the zkEVM project, a STARK proof is created and then verified using either PLONK or Groth16, combining the advantages of both systems.

Sources: Episode 194, Episode 242, Episode 289

There are also some blog articles about Groth16, such as https://blog.lambdaclass.com/groth16/. This article explains that the name Groth16 actually comes from the paper "On the Size of Pairing-based Non-interactive Arguments" published by Jens Groth in 2016.

Groth16 relies on an elliptic curve to perform operations. Here, the code uses Groth16::<MNT4_753> which uses a type imported with this line at the top of src/main.rs:

use ark_mnt4_753::{Fr as MNT4BigFr, MNT4_753};

This uses the Rust crate ark-mnt4-753, which documents:

This library implements the MNT4_753 curve generated in [BCTV14]. The name denotes that it is a Miyaji-Nakabayashi-Takano curve of embedding degree 4, defined over a 753-bit (prime) field. The main feature of this curve is that its scalar field and base field respectively equal the base field and scalar field of MNT6_753.

In short, the program generates a proof using Groth16 proving system on MNT4-753.

Now, what is actually being proved?

3. Decoding the constraints

In the project, src/main.rs builds a circuit and generates a proof with Groth16. The circuit uses six parameters defined in a structure named SpendCircuit in lines 63-71:

struct SpendCircuit {
    pub leaf_params: <LeafH as CRHScheme>::Parameters,
    pub two_to_one_params: <LeafH as CRHScheme>::Parameters,
    pub root: <CompressH as TwoToOneCRHScheme>::Output,
    pub proof: Path<MntMerkleTreeParams>,
    pub secret: ConstraintF,
    pub nullifier: ConstraintF,
}

Right now, it is unclear what these fields mean. The types can give some pointers to the documentation.

CRHScheme is defined in ark_crypto_primitives and documented as "Interface to CRH". But what does CRH mean? A paper on Cryptology ePrint Archive provides the meaning of this acronym:

Collision-resistant hash functions (CRH) are a fundamental and ubiquitous cryptographic primitive.

So it seems similar to Cryptographic Hash Functions (CHF).

MntMerkleTreeParams refers to a Merkle Tree, defined in Wikipedia as:

In cryptography and computer science, a hash tree or Merkle tree is a tree in which every "leaf" (node) is labelled with the cryptographic hash of a data block, and every node that is not a leaf (called a branch, inner node, or inode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure.

The Wikipedia article also gives a helpful graph:

An example of a binary hash tree, from Wikipedia

This makes the purpose of root and TwoToOneCRHScheme clearer: root is the top hash of the Merkle tree and TwoToOneCRHScheme is the collision-resistant hash function (CRH) used to combine two hashes into one. By the way, this is called a tree because it is a special kind of graph.

nullifier is a term used in the Zcash Protocol Specification given in the references to the puzzle. In section 3.2.3:

The nullifier for a note [...] is a value unique to the note that is used to prevent double-spends.

In the project, line 155 of src/main.rs contains:

let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();

This code actually computes the hash of a value named leaked_secret, which is used in field secret. The nullifier is therefore the hash of the secret. Why is the secret leaked? This will surely become clearer later.

Now, let's analyze function generate_constraints in line 74 to better understand how secret, root, nullifier... are tight together in the Groth16 proof. As the code is quite verbose, it will not be copied here. Instead, here is an English translation of the code:

  • Define a variable root as a public input.
  • Define leaf_crh_params_var and two_to_one_crh_params_var as constants.
  • Define secret as a witness (so a private input).
  • Extract the bits of secret in secret_bits (in Little Endian order).
  • Enforce that secrets is smaller or equal the modulus of MNT6-753 scalar field (MNT6BigFr::MODULUS).
  • Define nullifier as an output.
  • Compute nullifier_in_circuit as the hash of [secret] with parameters leaf_crh_params_var and enforce that it equals to nullifier.
  • Define base as the generator of MNT6-753's G1 curve (defined in ark-mnt6-753's documentation).
  • Define pk as the scalar product of secret_bits with base, in affine representation. This operation is usually done in elliptic-curve cryptography to compute the public key (pk) associated with a private key (secret).
  • Define leaf_g as [pk.x], a one-element vector with the x coordinate of the public key.
  • Define cw as a witness associated with field proof of structure SpendCircuit. Its type is PathVar, which represents a Merkle tree path.
  • Verify that cw is a valid proof that leaf_g is a member of the Merkle tree with root root, and parameters leaf_crh_params_var and two_to_one_crh_params_var.

As there are many variables, let's simplify this a bit.

3. Understanding the constraints

The previous sections explained how the project is crafting a Groth16 proof on curve MNT4-753 with constraints which seem quite complex at first glance. This complexity can be overcome with a few notations.

First, function main contains (line 153:

    let two_to_one_crh_params = leaf_crh_params.clone();

So in practice, there is a single hash function, defined with parameters given by poseidon_parameters in src/poseidon_parameters.rs. Let's call this function $H$.

Second, MNT4-753 and MNT6-753 curves use several parameters which can be confusing, because the base field of one curve is the scalar field of the other. Let's use the scalar prime numbers and define $r_4$ as the order (meaning "the number of elements) of MNT4-753's scalar field and $r_6$ as the order of MNT6-753. In Rust code, $r_4$ is MNT4BigFr::MODULUS and $r_6$ MNT6BigFr::MODULUS.

Third, proof in structure SpendCircuit is computed by generating a proof of membership of a Merkle tree on line 166. The circuit verifies this proof, linking pk and root together.

Finally, a question remains: in which finite field(s) do values like secret, nullifier... lie? All these variables use type ConstraintF, defined as MNT4BigFr on line 22. So secret and nullifier are numbers in MNT4-753's scalar field. Moreover, the coordinates of points on MNT6-753's G1 curve are also defined on MNT4-753's scalar field (which is why both curves are said to form a cycle). So pk.x is also a number in MNT4-753's scalar field.

With these notations, the constraints become:

  • Ensure secret $\le r_6$
  • Compute nullifier = $H$(secret) and ensure it is equals to the nullifier given as public input (in Zero-Knowledge circuits, it is common for output variables to be considered as public inputs).
  • Compute pk as the public key on MNT6-753's G1 curve associated with secret. It is a point on an elliptic curve and its affine representation consists in two coordinates (pk.x, pk.y).
  • Ensure [pk.x] is a leaf of the Merkle tree built with hash $H$ and defined with root root (given as a public input).

And this is what is verified on line 187 when the Groth16 verification is invoked with root and nullifier:

    assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier], &proof).unwrap());

In a nutshell, the project seems to craft a proof for the following statement:

"The prover knows secret $\le r_6$ whose hash is the given nullifier and whose public key is a leaf of the Merkle tree with root root".

Actually, this is similar to the Semaphore circuit. secret is used as a private key of a member of a group, which is probably why it was named leaked_secret: the project is able to generate a proof because it knows one private key.

4. Attacking the circuit

The previous sections described that the circuit implemented in the project verified some constraints linking secret, nullifier and root. Is it secure?

The subject says that it is not:

Alice soon announced she was able to double spend by creating two different nullifiers for the same key...

As the nullifier is directly computed from the secret key (with a hash function), having two different values for nullifier while using the same secret seems difficult. What about the public key?

secret is an element of MNT4-753's scalar field, which can be seen as an integer such that: $0 \le$ secret $&lt; r_4$. The public key pk is computed on MNT6-753's G1 curve, which means that secret is considered modulo $r_6$. And $r_6 &lt; r_4$. So secret $+ r_6$ would also be a valid private key for the same public key, and would give a different nullifier. However the circuit blocks this attack, by verifying secret $\le r_6$. Too bad, it did not work.

But in fact, the public key used in the Merkle tree is not the point pk but only its x coordinate. The reason why the circuit does this is given by the first hint of the puzzle:

Hint 1. See lemma 5.4.7 in the Zcash specs for Bob's reasoning on why to only use the x coordinate when storing the public keys.

Indeed, lemma 5.4.7 page 102 of section "5.4.9.4 Coordinate Extractor for Jubjub" justifies that if a point $(u, v)$ is in the subgroup, the point $(u, -v)$ cannot be in it. Then theorem 5.4.8 on the same page uses this lemma to prove that the function $(u, v) \mapsto u$ is injective, meaning that for any $u$, there exists at most one $v$ such that the point $(u, v)$ is in the subgroup. Therefore, when using Jubjub curve, using the first coordinate of points is enough to ensure there is no collision between two keys.

However the project is using here MNT6-753's G1 curve instead of Jubjub. This curve is very different:

  • The equation of MNT6-753's G1 curve is $y^2 = x^3 + ax + b$ (with $a$ and $b$ being some constants). This is usually called the Weierstrass form of the curve.
  • The equation of Jubjub is $au^2 + v^2 = 1 + d u^2 v^2$. This is usually called a twisted Edwards curve.

With a curve in Weierstrass form, when $P = (x, y)$ is a point in a subgroup, the opposite $-P$ is always in the subgroup and its coordinates are $(x, -y)$. This means that -secret is a private key for the point -pk whose x coordinate is the same as pk.x. This seems promising!

But secret is represented as a non-negative integer below $r_4$. A possible way to take its opposite in MNT6-753's scalar field consists in computing: secret_hack = $r_6$ - secret (this works because secret is also below $r_6$). The following code tests this attack, when put on line 191:

// Compute the private key of -pk and the associated nullifier
let secret_hack = MNT4BigFr::from(MNT6BigFr::MODULUS) - leaked_secret;
let nullifier_hack =
    <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![secret_hack]).unwrap();

// Ensure the new nullifier is different
assert_ne!(nullifier, nullifier_hack);

// Generate a proof with this new secret and nullifier, using the same public key
let c2 = SpendCircuit {
    leaf_params: leaf_crh_params.clone(),
    two_to_one_params: two_to_one_crh_params.clone(),
    root: root.clone(),
    proof: tree_proof.clone(),
    nullifier: nullifier_hack.clone(),
    secret: secret_hack.clone(),
};
let proof = Groth16::<MNT4_753>::prove(&pk, c2.clone(), rng).unwrap();

// Verify the proof is correct
assert!(Groth16::<MNT4_753>::verify(&vk, &vec![root, nullifier_hack], &proof).unwrap());

This works fine: the attack generated a different nullifier for the same public key!

Conclusion

The project implements a Semaphore-like circuit where each member of a Merkle tree holds a secret and can publish some nullifier associated with them. It reused ideas from the Zcash protocol specification to use only the first coordinate of the public key in the Merkle tree. But it used a curve in Weierstrass form instead contrary to Zcash. This caused a vulnerability where members are able to craft a second nullifier from their secret.

Appendix A: show me the values!

The project loads data from three binary files:

  • leaves.bin contains the leaves of the Merkle tree.
  • leaked_secret.bin contains the secret of a member of the Merkle tree.
  • proof_keys.bin contains the Groth16 proving key and verifying key used by the proving system.

The leaves can be displayed by inserting in function main:

for (i, leave) in leaves.iter().enumerate() {
    for (j, item) in leave.iter().enumerate() {
        println!("leaves[{i}][{j}] = {item}");
    }
}

When executed with cargo run -r, this displays:

leaves[0][0] = 39453539927994817197765579308813928157713497906804571172812392407301200842720302569364713338155904681857308292513537321122003926745548105057267811518930892533431272741854608293778101386134980475696342437084239813479128774968644
leaves[1][0] = 33254115487803193787467741044441012720225989305267212137687962176807969668034363565356747232971449755943420422841908954891611894712029140840387178281636867610914501870223357373267720354319774975690575899689396890107253424770714
leaves[2][0] = 34132631296389675482710216196855384730588026618643388196381197455165096474322420842000474392388910211605440962886889520537257509916782391325714095547727483867859845499350620212129517189288791002573129189316776191236321369792874
leaves[3][0] = 5778575329528145851468015897113673238556086946311219717817423197294182566563330618169530892520688041650035126157465821707448313825994298787611366263021675473157427814051271926395567527456736782324997432867019260585338926662533

The public key associated with the leaked secret can be displayed with:

let base = G1Affine::generator();
let pk = base.mul_bigint(leaked_secret.into_bigint()).into_affine();
println!("pk.x = {}", pk.x);
pk.x = 34132631296389675482710216196855384730588026618643388196381197455165096474322420842000474392388910211605440962886889520537257509916782391325714095547727483867859845499350620212129517189288791002573129189316776191236321369792874

Therefore leaves[2] == [pk.x], meaning the leaked secret belong to the member of the quorum with index 2 (which is also why there is let i = 2 on line 152).

The root of the Merkle tree (computed on line 163 is:

root = 36599113137087969930840997920162741481243089880760623122221627324493044885888902307937689027223976731946291368601574061500023827182334658903539183702082327695077702000596999787788687607224158883109451285952662417426402885087313

For the secret and the attack, here are the values:

leaked_secret = 39846838468154423503928570543886746237397699869367684446977339808678485551027766891422021947516399629698996987371709141437423308118294917956395865033556420042349607780370689414551732365417821124712925443064651277726630250956495
nullifier = 11299723492798861325473231888351413863098247005209121305093211498445081436515006186642409996808912085756554418514162237593128468013985768925807559260959504690965586740735752967636401896633695685701092305548913938349088716371493

secret_hack = 2051652499764529898415644247353890890773010050586264624806163112346867261543339881636871816273939291719073984516544644676930418411289467245195740688456706426581796567579151128456253962325641729007702608627489987576484470733106
nullifier_hack = 6881215798018018599531910386444885098302033279235277464646929488957043263036777273191609551816397881089283753501421676811673218332777161277972031115981705870699000891633061494006576100324261080415411526636434624843944169263053

Appendix B: Poseidon hash

This document talks about the Collision-Resistant Hash function (CRH) which was used without giving much details. Nevertheless, as it uses custom parameters in src/poseidon_parameters.rs, is this function secure?

First, which hash function is being used? On line 151 there is the computation of nullifier from leaked_secret:

let leaf_crh_params = poseidon_parameters::poseidon_parameters();
let nullifier = <LeafH as CRHScheme>::evaluate(&leaf_crh_params, vec![leaked_secret]).unwrap();

This uses the type LeafH, which was defined as poseidon::CRH<ConstraintF> with type ConstraintF = MNT4BigFr; and poseidon coming from ark_crypto_primitives::crh::poseidon. This module implements the Poseidon hash function defined in an ePrint paper titled "Poseidon: A New Hash Function for Zero-Knowledge Proof Systems ".

Function evaluate and the functions it calls are quite abstract, because they need to support many use-cases. In the project, the CRH is only called to hash one or two elements of MNT4-753's scalar field into a single element of the same field. This actually enables simplifying the code a lot, as the permutation layer is only called once and there is no bitwise conversion.

Here are two functions reimplementing this CRH:

/// CRH with one element as input, one element as output
fn crh_1(input: MNT4BigFr) -> MNT4BigFr {
    crh_2(input, MNT4BigFr::zero())
}

/// CRH with two elements as input, one element as output
fn crh_2(input0: MNT4BigFr, input1: MNT4BigFr) -> MNT4BigFr {
    // Load the parameters
    let params = poseidon_parameters::poseidon_parameters();

    // Initialize the state after absorbing 2 elements
    // (capacity = 1 and rate = 2 so the state contains 1+2 = 3 elements)
    let mut s0 = MNT4BigFr::zero();
    let mut s1 = input0;
    let mut s2 = input1;
    for i in 0..37 {
        // Retrieve the Additive Round Keys (ARK) for round i
        let ark = &params.ark[i];
        let mut new_s0 = s0 + ark[0];
        let mut new_s1 = s1 + ark[1];
        let mut new_s2 = s2 + ark[2];

        // S-boxes with exponent alpha = 17
        new_s0 = new_s0.pow(&[17]);
        if i < 4 || i >= 33 {
            // 4 full rounds, 29 partial rounds and 4 full rounds
            new_s1 = new_s1.pow(&[17]);
            new_s2 = new_s2.pow(&[17]);
        }

        // Apply a near Maximally Distance Separating (MDS) Matrix
        // [[1, 0, 1],
        //  [1, 1, 0],
        //  [0, 1, 1]]
        s0 = new_s0 + new_s2;
        s1 = new_s0 + new_s1;
        s2 = new_s1 + new_s2;
    }
    s1
}

To test these functions, the following assertions can be added to function main:

// Compute the nullifier
assert_eq!(crh_1(leaked_secret), nullifier);

// Compute the root of the Merkle tree
assert_eq!(
    crh_2(
        crh_2(crh_1(leaves[0][0]), crh_1(leaves[1][0])),
        crh_2(crh_1(leaves[2][0]), crh_1(leaves[3][0]))
    ),
    root
);

The Additive Round Keys (ARK) are hard-coded to some undocumented values.

Searching them online gave some results:

Generating Poseidon Additive Round Keys can be done with some scripts available on https://extgit.iaik.tugraz.at/krypto/hadeshash. Here is how it is possible to use them

# Generate parameters for a field over Alt-BN128's scalar field with
# state_size = 17, full_rounds = 8, partial_rounds = 58
sage generate_parameters_grain.sage 1 0 254 17 8 58 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Number of round constants: 1122" and 1122 numbers starting with
# 0x0851ed0fd5e0aee13ce60c79c538512b509f59d97f0f7e27b6a475b4d8347086

# Generate parameters for a the same field and
# state_size = 3, full_rounds = 8, partial_rounds = 29
sage generate_parameters_grain.sage 1 0 254 3 8 29 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Number of round constants: 111" and numbers starting with
# 0x1b673c72211267578875c48f8a1a35ad0a6ca50a53897bb46669bbd506691f26

# Generate parameters with the "Poseidon" script, for the same field and
# state_size = 3, alpha = 17, security_level = 128
sage generate_params_poseidon.sage 1 0 254 3 17 128 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Params: n=254, t=3, alpha=17, M=128, R_F=8, R_P=31",
# (so 8 full rounds and 31 partial rounds), "Number of round constants: 117" and
# numbers starting with
# 0x195c895306d9937da3bd2c4ca106749f9ff9636fa7af7f6c300fb253e9ad820a

# Generate parameters with the "Poseidon" script, for the same field and
# state_size = 3, alpha = 5, security_level = 128
sage generate_params_poseidon.sage 1 0 254 3 5 128 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001
# This displays "Params: n=254, t=3, alpha=5, M=128, R_F=8, R_P=56",
# (so 8 full rounds and 56 partial rounds), "Number of round constants: 192" and
# numbers starting with
# 0x1d066a255517b7fd8bddd3a93f7804ef7f8fcde48bb4c37a59a09a1a97052816

The computed round constants never matched the Additive Round Keys used in the puzzle.

Anyway, the puzzle did not use curve Alt-BN128 but MNT4-753 with $r_4 = 41898490967918953402344214791240637128170709919953949071783502921025352812571106773058893763790338921418070971888458477323173057491593855069696241854796396165721416325350064441470418137846398469611935719059908164220784476160001$. Generating Poseidon Additive Round Keys with this field gives much larger numbers, which are inconsistent with the ones in src/poseidon_parameters.rs because every number in this file is 254 bits or less.

This raises another question. If the Poseidon Additive Round Keys were generated for other parameters than the ones used in the puzzle, is the hash actually a collision-resistant hash (CRH)? It should be possible to answer this question by verifying the effective security level.

The script which generates Poseidon parameters provide a helpful function named sat_inequiv_alpha. By adding at the end of the file:

print(sat_inequiv_alpha(p=PRIME_NUMBER, t=NUM_CELLS, R_F=8, R_P=29, alpha=17, M=128))

The Sage script displays:

$ sage generate_params_poseidon.sage 1 0 753 3 17 128 0x1c4c62d92c41110229022eee2cdadb7f997505b8fafed5eb7e8f96c97d87307fdb925e8a0ed8d99d124d9a15af79db26c5c28c859a99b3eebca9429212636b9dff97634993aa4d6c381bc3f0057974ea099170fa13a4fd90776e240000001
...
True

If I correctly understand what function sat_inequiv_alpha does, this means the parameters used in the puzzle are acceptable (8 full rounds and 29 partial rounds with alpha=17 on MNT4-753 scalar field). The fact that the Additive Round Keys are small compared to $r_4$ (254 bits instead of 753) does not seem to impact the security of the CRH.

To sum up, I do not believe there exists a practical attack on the hash function used in the Merkle tree in the puzzle. So once Bob fixes the vulnerability with the way the public keys are used in the Merkle tree (for example by using the full coordinates instead of only the x one), his system seems to be secure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment