Skip to content

Instantly share code, notes, and snippets.

@numtel
Created June 10, 2025 23:07
Show Gist options
  • Save numtel/94fddb1cd6e5ea72049a1dcb1db74678 to your computer and use it in GitHub Desktop.
Save numtel/94fddb1cd6e5ea72049a1dcb1db74678 to your computer and use it in GitHub Desktop.

That is precisely the right question to ask, and it delves deep into the field of cryptanalysis.

The answer is: Yes, the security of Keccak degrades gradually as you reduce the number of rounds, and there is no single "magic number" of rounds that is suddenly secure. Security is a spectrum, not a binary switch.

The official Keccak permutation (Keccak-f[1600]) used in SHA-3 and Ethereum has 24 rounds. This number was not chosen arbitrarily. It was selected to provide a very large security margin against all known cryptanalytic attacks.

Let's break down what "security" means here and how it degrades.

The Landscape of Cryptanalytic Attacks

Cryptographers have developed various techniques to "break" hash functions. For a permutation like Keccak's, "breaking" could mean:

  1. Preimage Attack: Given a hash output H, find an input message m such that Hash(m) = H. (This is what our "insecure circuit" attacker was trying to do on a reduced-round version).
  2. Collision Attack: Find two different messages, m1 and m2, such that Hash(m1) = Hash(m2).
  3. Distinguishing Attack: Differentiate the permutation from a truly random permutation. If you can tell it's not random, you might be able to exploit that non-randomness.

Round-Reduced Attacks on Keccak

The designers of Keccak and other cryptographers have published extensive analyses showing how many rounds they can successfully attack. Here's a simplified overview:

  • 1-3 Rounds: Trivial to analyze and reverse. An attacker could find preimages with a pen and paper.
  • 4-6 Rounds: Still very weak. Practical collision and preimage attacks can be mounted on a standard laptop in seconds or minutes.
  • 7-9 Rounds: The difficulty increases significantly. Attacks become theoretical or require massive computational resources (e.g., a state-level actor). The best published collision attack on Keccak is for 7 rounds.
  • 10-12 Rounds: At this point, no practical attacks are known. Attacks are purely theoretical, requiring more computation than exists on Earth. However, the security margin is still considered too low for a cryptographic standard.
  • 12-23 Rounds: There are no known attacks, practical or theoretical, that are better than a brute-force search.
  • 24 Rounds: This is the full specification. The number of rounds was chosen to be roughly double the number of rounds that cryptanalysts could "break" at the time of standardization. This large buffer is the security margin. It provides confidence that even if new, more powerful attack techniques are discovered, the full-round function will remain secure.

How Many Rounds are "Sufficient" for your PartialKeccak?

This brings us back to your question. If you are computing the final N rounds inside the circuit, how large does N need to be?

  • If N = 2: As we established, this is dangerously insecure. Finding a preimage for 2 rounds is feasible.
  • If N = 4: Still highly insecure.
  • If N = 6: On the edge of being practically attackable. You would not want to build a system on this.
  • If N = 8: We are now entering the territory where practical attacks are not known. However, this is considered a very low security margin. A new cryptanalytic breakthrough could put it at risk.
  • If N = 12 (i.e., startRound = 12): This is often cited as a reasonable "floor" for security. You are asking an attacker to find a preimage for a 12-round permutation, for which no attack better than brute force is known. This is a common choice in ZK applications that need to balance security with performance. For example, the original Semaphore protocol used 12 rounds for its Poseidon hash.
  • If N = 24 (i.e., startRound = 0): This is the full Keccak hash, offering maximum security but no constraint savings.

The Trade-off

You are making a direct trade-off:

startRound (Prover's work) 24 - startRound (Circuit's work) Constraint Cost Security Level
22 2 Lowest Insecure. Feasible to find preimages for the last 2 rounds.
20 4 Very Low Insecure. Feasible to find preimages for the last 4 rounds.
18 6 Low Highly Risky. Near the edge of practical attacks. Do not use.
12 12 Medium Good. Considered a strong security floor. No known practical/theoretical attacks.
0 24 Highest Maximum. Full Keccak security.

Recommendation:

If you need to use this technique, a startRound of 20 or 22 is not secure. You are slicing off too few rounds for the in-circuit portion.

A much safer and more standard choice would be to set startRound = 12. This splits the work roughly in half, providing significant constraint savings while still requiring the attacker to break a 12-round Keccak permutation inside the circuit, which is believed to be computationally infeasible. You save constraints without making a dangerous compromise on security.

// Keccak256 hash function (ethereum version).
// For LICENSE check https://github.com/vocdoni/keccak256-circom/blob/master/LICENSE
pragma circom 2.0.0;
include "./utils.circom";
include "./permutations.circom";
template Pad(nBits) {
signal input in[nBits];
var blockSize=136*8;
signal output out[blockSize];
signal out2[blockSize];
var i;
for (i=0; i<nBits; i++) {
out2[i] <== in[i];
}
var domain = 0x01;
for (i=0; i<8; i++) {
out2[nBits+i] <== (domain >> i) & 1;
}
for (i=nBits+8; i<blockSize; i++) {
out2[i] <== 0;
}
component aux = OrArray(8);
for (i=0; i<8; i++) {
aux.a[i] <== out2[blockSize-8+i];
aux.b[i] <== (0x80 >> i) & 1;
}
for (i=0; i<8; i++) {
out[blockSize-8+i] <== aux.out[i];
}
for (i=0; i<blockSize-8; i++) {
out[i]<==out2[i];
}
}
template KeccakfRound(r) {
signal input in[25*64];
signal output out[25*64];
var i;
component theta = Theta();
component rhopi = RhoPi();
component chi = Chi();
component iota = Iota(r);
for (i=0; i<25*64; i++) {
theta.in[i] <== in[i];
}
for (i=0; i<25*64; i++) {
rhopi.in[i] <== theta.out[i];
}
for (i=0; i<25*64; i++) {
chi.in[i] <== rhopi.out[i];
}
for (i=0; i<25*64; i++) {
iota.in[i] <== chi.out[i];
}
for (i=0; i<25*64; i++) {
out[i] <== iota.out[i];
}
}
template Absorb() {
var blockSizeBytes=136;
signal input s[25*64];
signal input block[blockSizeBytes*8];
signal output out[25*64];
var i;
var j;
component aux[blockSizeBytes/8];
component newS = Keccakf();
for (i=0; i<blockSizeBytes/8; i++) {
aux[i] = XorArray(64);
for (j=0; j<64; j++) {
aux[i].a[j] <== s[i*64+j];
aux[i].b[j] <== block[i*64+j];
}
for (j=0; j<64; j++) {
newS.in[i*64+j] <== aux[i].out[j];
}
}
// fill the missing s that was not covered by the loop over
// blockSizeBytes/8
for (i=(blockSizeBytes/8)*64; i<25*64; i++) {
newS.in[i] <== s[i];
}
for (i=0; i<25*64; i++) {
out[i] <== newS.out[i];
}
}
template Final(nBits) {
signal input in[nBits];
signal output out[25*64];
var blockSize=136*8;
var i;
// pad
component pad = Pad(nBits);
for (i=0; i<nBits; i++) {
pad.in[i] <== in[i];
}
// absorb
component abs = Absorb();
for (i=0; i<blockSize; i++) {
abs.block[i] <== pad.out[i];
}
for (i=0; i<25*64; i++) {
abs.s[i] <== 0;
}
for (i=0; i<25*64; i++) {
out[i] <== abs.out[i];
}
}
template Squeeze(nBits) {
signal input s[25*64];
signal output out[nBits];
var i;
var j;
for (i=0; i<25; i++) {
for (j=0; j<64; j++) {
if (i*64+j<nBits) {
out[i*64+j] <== s[i*64+j];
}
}
}
}
template Keccakf() {
signal input in[25*64];
signal output out[25*64];
var i;
var j;
// 24 rounds
component round[24];
signal midRound[24*25*64];
for (i=0; i<24; i++) {
round[i] = KeccakfRound(i);
if (i==0) {
for (j=0; j<25*64; j++) {
midRound[j] <== in[j];
}
}
for (j=0; j<25*64; j++) {
round[i].in[j] <== midRound[i*25*64+j];
}
if (i<23) {
for (j=0; j<25*64; j++) {
midRound[(i+1)*25*64+j] <== round[i].out[j];
}
}
}
for (i=0; i<25*64; i++) {
out[i] <== round[23].out[i];
}
}
// PartialKeccak template (written by gemini)
// This template verifies the full Keccak computation while exposing an intermediate state.
// It accepts the original message `in` and a claimed intermediate state `s_intermediate`
// after `startRound` rounds.
// The circuit re-computes the first `startRound` rounds from `in` and verifies that the result
// matches `s_intermediate`. It then computes the remaining `24 - startRound` rounds starting
// from `s_intermediate` to produce the final hash.
// This is useful in protocols where the intermediate state needs to be public or used
// in other constraints, potentially allowing for cross-component optimizations.
template PartialKeccak(nBitsIn, nBitsOut, startRound) {
// Compile-time checks for the startRound parameter
assert(startRound >= 0);
assert(startRound <= 24);
// --- Inputs ---
// The original message to be hashed
signal input in[nBitsIn];
// The pre-computed state after `startRound` rounds of permutation
signal input s_intermediate[25*64];
// --- Output ---
// The final keccak hash
signal output out[nBitsOut];
// --- Constants and variables ---
var i;
var j;
var r;
var blockSizeBytes = 136;
var stateSize = 25*64;
var num_post_rounds = 24 - startRound;
// --- Part 1: Verify the pre-computation by running the first `startRound` rounds ---
// 1.1: Pad the input message (for a single-block message, like the Keccak() template).
component pad = Pad(nBitsIn);
for (i = 0; i < nBitsIn; i++) {
pad.in[i] <== in[i];
}
// 1.2: Define the chain of states for the first part of the permutation.
// s_pre_chain[0] is the initial state, s_pre_chain[r+1] is the state after round r.
signal s_pre_chain[startRound + 1][stateSize];
component pre_rounds[startRound];
// 1.3: Set the initial state (s_pre_chain[0]) by absorbing the padded block into a zero state.
// The first 1088 bits (rate) are the padded block.
for (i = 0; i < blockSizeBytes * 8; i++) {
s_pre_chain[0][i] <== pad.out[i];
}
// The remaining 512 bits (capacity) are zero.
for (i = blockSizeBytes * 8; i < stateSize; i++) {
s_pre_chain[0][i] <== 0;
}
// 1.4: Compute the first `startRound` rounds sequentially.
for (r = 0; r < startRound; r++) {
pre_rounds[r] = KeccakfRound(r);
for (j = 0; j < stateSize; j++) {
pre_rounds[r].in[j] <== s_pre_chain[r][j];
s_pre_chain[r+1][j] <== pre_rounds[r].out[j];
}
}
// 1.5: Verify that the computed state after `startRound` rounds matches the provided `s_intermediate`.
// This is the core check that links the original message `in` to the intermediate state.
for (i = 0; i < stateSize; i++) {
s_pre_chain[startRound][i] === s_intermediate[i];
}
// --- Part 2: Compute the remaining rounds starting from the verified `s_intermediate` ---
// 2.1: Define the chain of states for the second part of the permutation.
// s_post_chain[0] is `s_intermediate`, s_post_chain[r+1] is the state after round `startRound + r`.
signal s_post_chain[num_post_rounds + 1][stateSize];
component post_rounds[num_post_rounds];
// 2.2: Set the initial state for this part to be `s_intermediate`.
for (j = 0; j < stateSize; j++) {
s_post_chain[0][j] <== s_intermediate[j];
}
// 2.3: Compute the remaining `num_post_rounds` rounds.
for (r = 0; r < num_post_rounds; r++) {
post_rounds[r] = KeccakfRound(startRound + r);
for (j = 0; j < stateSize; j++) {
post_rounds[r].in[j] <== s_post_chain[r][j];
s_post_chain[r+1][j] <== post_rounds[r].out[j];
}
}
// --- Part 3: Squeeze the final hash from the final state ---
// 3.1: The final state is the output of the last round (round 23).
signal s_final[stateSize];
for(i=0; i<stateSize; i++) {
s_final[i] <== s_post_chain[num_post_rounds][i];
}
// 3.2: Squeeze the first `nBitsOut` from the final state.
component squeeze = Squeeze(nBitsOut);
for (i = 0; i < stateSize; i++) {
squeeze.s[i] <== s_final[i];
}
for (i = 0; i < nBitsOut; i++) {
out[i] <== squeeze.out[i];
}
}
template Keccak(nBitsIn, nBitsOut) {
signal input in[nBitsIn];
signal output out[nBitsOut];
var i;
component f = Final(nBitsIn);
for (i=0; i<nBitsIn; i++) {
f.in[i] <== in[i];
}
component squeeze = Squeeze(nBitsOut);
for (i=0; i<25*64; i++) {
squeeze.s[i] <== f.out[i];
}
for (i=0; i<nBitsOut; i++) {
out[i] <== squeeze.out[i];
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment