Skip to content

Instantly share code, notes, and snippets.

@kevin-he-01
Last active February 9, 2023 14:12
Show Gist options
  • Save kevin-he-01/b6dec757724170cf0096f02240c312b5 to your computer and use it in GitHub Desktop.
Save kevin-he-01/b6dec757724170cf0096f02240c312b5 to your computer and use it in GitHub Desktop.
Chunk Norris - Google CTF 2020 Writeup

Chunk Norris — Google CTF 2020

Author: Kevin He
Team: 3PAC
Challenge Category: Crypto
Points: 98
Attachments: See challenge.py and output.txt in this Gist.

Chunk Norris is black belt in fast random number generation.

Start off by examining the code for challenge.py. It uses RSA — a modern public key cryptography algorithm — to encrypt the flag. The idea behind public key cryptography is that given the public key, everyone can encrypt plaintext messages, but only the party who possesses the secret private key can decrypt the ciphertext back into plaintext. In this challenge, only the public key is given, and it consists of the modulo n — a product of 2 large prime numbers — and the exponent e. However, the security of RSA heavily depends on the fact that its public key modulo n cannot be factored into its constituent primes p and q. Once p and q are known, the private key d can be computed easily (d = (p - 1) * (q - 1)) and decrypting is trivial.

Therefore, in a secure RSA implementation, n must be large enough to make factoring infeasible, and the challenge uses a 2048-bit n (product of two 1024-bit primes) to meet this security criterion. However, I exploited the fact that the code uses an insecure prime number generator to generate p and q, which allowed me to obtain p and q without factoring the gigantic n directly, by deriving p and q from 2 significantly smaller integers called P, and Q.

Finding the weakness

The code responsible for generating primes is in the gen_prime function of challenge.py. To generate each prime, the code uses a naïve LCG to generate 16 pseudorandom 64-bit chunks. And in each iteration of the while loop, it pieces the chunks together to form a 1024-bit integer. It then checks whether the generated number is prime, and if not, it repeats the whole process again. The process ends when the generated number is prime and that prime number is returned.

The code actually uses a much more simplified version of LCG, having a multiplier a but no increment in each step:

for _ in range(bits // chunk_size):
  p = (p << chunk_size) + s
  s = a * s % 2**chunk_size

This LCG works by performing the last line repeatedly, and in each step s is the generated pseudorandom number. The last line multiplies s by a and take the result modulo 2^chunk_size (from now on ^ denotes exponentiation, equivalent to Python's **). Pseudorandom number generators like LCG are not truly random, it is deterministic once we know the initial value of s. Moreover, since the code includes all 64 bits of s into the generated prime, actually knowing any 64-bit chunk in the prime would allow full recovery of the prime number. It is easy to see that once we know one chunk, we immediately know what the next chunks will be by simulating the LCG steps (multiply by a and mod) ourselves. The following diagram demonstrates this idea:

MSB                                  LSB
p = |chunk|chunk|chunk|chunk|...|chunk |
  = |  R  | R*a |R*a^2|R*a^3|...|R*a^15|

Definition of Variables

  • R: the leftmost (most significant) chunk of p
  • a: a from challenge.py, the multiplier in LCG
  • p: the prime generated, represents p or q
  • From now on including the diagram above, operations in every chunk is computed modulo 2^64 (|R*a| means (R*a)%(2^64)) unless stated otherwise.

If there's multiplication in modulo arithmetic, then should there be "division", an operation modulo 2^64 that effectively undoes the multiplication by a, too? The answer is to multiply by the modular multiplicative inverse of a mod 2^64, which I define as b. That is, R*a*b = R mod 2^64

b can be computed efficiently using the Extended Euclidean Algorithm, and it exists as an already implemented function in PyCryptodome:

from Crypto.Util.number import inverse
chunk_size = 64
a = 0xe64a5f84e2762be5
print('b =', hex(inverse(a, 1 << chunk_size)))

and here is the value of b:

b = 0xc14c0c90d01019ed

Since multiplying by b is the inverse operation of multiplying by a, we can compute p and q backwards from their least significant (rightmost) 64-bit chunks, called uppercase P and Q respectively:

MSB                                    LSB
p = |chunk |chunk |...|chunk|chunk|chunk |
  = |P*b^15|P*b^14|...|P*b^2|P*b  |P     |
q = |Q*b^15|Q*b^14|...|Q*b^2|Q*b  |Q     |

This establishes a deterministic formula for p and q (both 1024-bit) in terms of P and Q (both 64-bit). Therefore, the effective randomness of the primes is 64-bit (2^64 possible values) rather than 1024-bit (2^1024 possible values), but 64 bits is enough to make brute-forcing combinations of p and q infeasible. However, factoring a 128-bit integer (product of two 64-bit integers) is practical. This is a key piece of information that leads me to the next step, where I managed to extract the 128-bit P*Q, which is significantly easier to factor than the 2048-bit public key n.

Key insight

When we perform multiplication on numbers (Let's say 1111 * 1111) by hand, we perform this procedure:

   1111
*  1111
-------
   1111
  1111
 1111
1111
-------
1234321

Think of multiplying p and q as multiplying base 2^64 arithmetic, or that 64-bit chunks of p and q are digits, then:

               P*b^15 ...   P*b^2    P*b   P
 *             Q*b^15 ...   Q*b^2    Q*b   Q
--------------------------------------------
             P*Q*b^15 ... P*Q*b^2  P*Q*b  PQ
...P*Q*b^16  P*Q*b^15 ... P*Q*b^2  P*Q*b    
... ... ... ... ... ... ... ...
--------------------------------------------
.................................2*P*Q*b  PQ

Define:

  • n0 as the the lowest 64-bit (rightmost chunk) of n (bit 0 to 63)
  • n1 as the the chunk to the left of n0 (bit 64 to 127)

And we know that the product of p and q is n, so

  • n0 = P*Q mod 2^64 and
  • n1 = 2*P*Q*b + floor(PQ/(2^64)) mod 2^64.

Note that since both P and Q are 64-bit, their product is 128-bit. The lower 64 bits of P*Q is P*Q mod 2^64 while the upper 64 bits of P*Q are floor(PQ/(2^64)). The upper 64 bits are carried over to the chunk to its left.

So immediately we know the lower 64 bits of P*Q, which is just the lower 64 bits of n, to find the upper bits, I applied some modular arithmetic tricks:

2*P*Q*b + floor(PQ/(2^64)) % 2^64 == n1
(2*P*Q*b % 2^64) + (floor(PQ/(2^64)) % 2^64) == n1
2*b*(P*Q % 2^64) + (floor(PQ/(2^64)) % 2^64) == n1
since floor(PQ/(2^64) is the upper 64 bits of P*Q and thus cannot exceed 2^64, we can remove the modulo: 2*b*(P*Q % 2^64) + floor(PQ/(2^64)) == n1
floor(PQ/(2^64)) == n1 - 2*b*(P*Q % 2^64)

So now we have a formula for the entire P*Q. I wrote a script to solve it: getPQ.py which is in this gist It turns out that it can easily be factored (which is natural given that they are randomly generated numbers and only lowercase p and q are supposed to be primes, P and Q can be composite) using the GNU coreutil factor:

$ factor 3463373886635289660353622154000952089
3463373886635289660353622154000952089: 11 13 109 223 1290533 4608287 167541865434116759

Then we just have to try all possible combinations of P and Q using these factors, which is trivial: there are only 2^7=128 combinations to try, and we can easily compute the lowercase letters (p, q) from the uppercase ones: p = ... (P*(b^2) % (2^64)) << 128 + (P*(b^1) % (2^64)) << 64 + P

to check whether our guess is correct we just multiply p and q and see if it is equal to n, an implementation of this is in proc_pq_factors.py

and finally we have the flag:

$ python3 proc_pq_factors.py 
Flag: b'CTF{__donald_knuths_lcg_would_be_better_well_i_dont_think_s0__}\n'

End Note

It is a big no-no to use pseudorandom number generators (PRNGs) in cryptography, instead, use random devices like /dev/urandom or even /dev/random to get better sources of randomness. The deterministic property of PRNGs sometimes make cryptanalysis possible, especially with old and low quality PRNGs. I managed to get the flag thanks to the simple theory behind LCG.

#!/usr/bin/python3 -u
import random
from Crypto.Util.number import *
import gmpy2
a = 0xe64a5f84e2762be5
chunk_size = 64
def gen_prime(bits):
s = random.getrandbits(chunk_size)
while True:
s |= 0xc000000000000001
p = 0
for _ in range(bits // chunk_size):
p = (p << chunk_size) + s
s = a * s % 2**chunk_size
if gmpy2.is_prime(p):
return p
n = gen_prime(1024) * gen_prime(1024)
e = 65537
flag = open("flag.txt", "rb").read()
print('n =', hex(n))
print('e =', hex(e))
print('c =', hex(pow(bytes_to_long(flag), e, n)))
# ****** Step 1: Get (uppercase) P*Q (or p0*q0) ******
# Step 1: YOU ARE HERE!
# Step 2: factor P*Q, the output of proc_pq_factors.py (Ex. using the `factor` command from GNU coreutils)
# Step 3: run `python3 ./proc_pq_factors.py`: the final stage of the attack, get the flag!
# For a definition of terms (variable names) see ./solve.txt, note uppercase P and Q is not the same as lowercase p, q
## for the flag
n = 0xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19
# P*Q= 3463373886635289660353622154000952089
a = 0xe64a5f84e2762be5 # a = 17204219 * 964541360543 (prime factors)
b = 0xc14c0c90d01019ed # inverse of a mod (1 << chunk_size)
chunk_size = 64
cmod = 1 << chunk_size
def get_chunks(n, bits):
narr = []
for i in range(bits // chunk_size):
# print('{}: {}'.format(i, (n >> (i * chunk_size)) % (1 << chunk_size)))
narr.append((n >> (i * chunk_size)) % (1 << chunk_size))
return narr
narr = get_chunks(n, 2048)
pqlower = narr[0]
pqupper = (narr[1] - 2 * narr[0] * b) % cmod
print('P*Q=', (pqupper << chunk_size) + pqlower)
n = 0xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19
e = 0x10001
c = 0x6a12d56e26e460f456102c83c68b5cf355b2e57d5b176b32658d07619ce8e542d927bbea12fb8f90d7a1922fe68077af0f3794bfd26e7d560031c7c9238198685ad9ef1ac1966da39936b33c7bb00bdb13bec27b23f87028e99fdea0fbee4df721fd487d491e9d3087e986a79106f9d6f5431522270200c5d545d19df446dee6baa3051be6332ad7e4e6f44260b1594ec8a588c0450bcc8f23abb0121bcabf7551fd0ec11cd61c55ea89ae5d9bcc91f46b39d84f808562a42bb87a8854373b234e71fe6688021672c271c22aad0887304f7dd2b5f77136271a571591c48f438e6f1c08ed65d0088da562e0d8ae2dadd1234e72a40141429f5746d2d41452d916
# ****** Step 3: the final stage of the attack, get the flag! ******
# Step 1: run python3 getPQ.py
# Step 2: factor P*Q, the output of getPQ.py (Ex. using the `factor` command from GNU coreutils)
# Step 3: YOU ARE HERE!
# For a definition of terms (variable names) see ./solve.txt, note uppercase P and Q is not the same as lowercase p, q
from Crypto.Util.number import *
chunk_size = 64
a = 0xe64a5f84e2762be5 # a = 17204219 * 964541360543 (prime factors)
b = 0xc14c0c90d01019ed # b = modinv(a, (1 << chunk_size))
n = 0xab802dca026b18251449baece42ba2162bf1f8f5dda60da5f8baef3e5dd49d155c1701a21c2bd5dfee142fd3a240f429878c8d4402f5c4c7f4bc630c74a4d263db3674669a18c9a7f5018c2f32cb4732acf448c95de86fcd6f312287cebff378125f12458932722ca2f1a891f319ec672da65ea03d0e74e7b601a04435598e2994423362ec605ef5968456970cb367f6b6e55f9d713d82f89aca0b633e7643ddb0ec263dc29f0946cfc28ccbf8e65c2da1b67b18a3fbc8cee3305a25841dfa31990f9aab219c85a2149e51dff2ab7e0989a50d988ca9ccdce34892eb27686fa985f96061620e6902e42bdd00d2768b14a9eb39b3feee51e80273d3d4255f6b19
e = 0x10001
c = 0x6a12d56e26e460f456102c83c68b5cf355b2e57d5b176b32658d07619ce8e542d927bbea12fb8f90d7a1922fe68077af0f3794bfd26e7d560031c7c9238198685ad9ef1ac1966da39936b33c7bb00bdb13bec27b23f87028e99fdea0fbee4df721fd487d491e9d3087e986a79106f9d6f5431522270200c5d545d19df446dee6baa3051be6332ad7e4e6f44260b1594ec8a588c0450bcc8f23abb0121bcabf7551fd0ec11cd61c55ea89ae5d9bcc91f46b39d84f808562a42bb87a8854373b234e71fe6688021672c271c22aad0887304f7dd2b5f77136271a571591c48f438e6f1c08ed65d0088da562e0d8ae2dadd1234e72a40141429f5746d2d41452d916
## factorization of P*Q
pqfact = [11, 13, 109, 223, 1290533, 4608287, 167541865434116759]
possible_pqs = set()
for i in range(2 ** len(pqfact)):
# use = bin(i)[2:]
use = '{{:0{}b}}'.format(len(pqfact)).format(i)
assert len(use) == len(pqfact), len(use)
pq = [1, 1]
for j in range(len(pqfact)):
pq[int(use[j])] *= pqfact[j]
p, q = pq
possible_pqs.add(tuple(sorted(pq)))
# print('factors found: {}'.format(count))
# print('possible (P,Q) pairs: {}'.format(possible_pqs))
def getlowercase(U):
u = 0
mult = U
for i in range(16):
u += (mult << (i * chunk_size))
mult = (mult * b) % (1 << chunk_size)
return u
for P, Q in possible_pqs:
# print('p: {}'.format(getlowercase(P)))
# print('q: {}'.format(getlowercase(Q)))
p = getlowercase(P)
q = getlowercase(Q)
if p * q == n:
d = inverse(e, (p - 1) * (q - 1))
print('Flag: {}'.format(long_to_bytes(pow(c, d, n))))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment