Crypto Notes Week 2

Introduction to Block Ciphers

A block cipher is a cipher (E,D) which (given a key) takes n bits of PT as input and outputs n bits of CT.

Examples include:

3DES: n = 64 bits, k = 168 bits
AES: n = 128 bits, k = 128, 192, 256 bits

Block cipher are typically built by iteration.

The first thing that happens in a BC is key expansion. Key expansion uses the cipher's key to generate keys for each iteration, or round, of the cipher.

The input PT is then fed into a round function, along with the first key from the expansion. The output of the round function is then used as the input to the second round function, which uses the second expanded key, and so forth, until the k-th round function outputs the final ciphertext.

Before we continue our study of block ciphers, lets look at their abstractions and generalizations.

PRPs and PRFs

A pseudorandom function (PRF) defined over (K, X, Y) is a function F: K x X -> Y, such that we can evaluate F is poly time.

A pseudorandom permutation (PRP) defined over (K, X) is a function E: K x X -> X, s.t.:

exists a poly time det. alg. to evaluate E
the f'n E(k,.) is injective
There exists an efficient (left) inversion algorithm D(k, y)

Thus a PRP captures everything that a block cipher does, and they can be used interchangably.

Notice that a PRP is a PRF with more structure, in that X = Y and and the PRP is invertible.

What does it mean for a PRF or PRP to be secure?

Let F: K,X -> Y be a PRF. Let Funs[X,Y] be the set of all functions from X to Y, and let S_F = { F(k,.) s.t. k \in K }.

Intuitively, a PRF is secure if a random f'n from X to Y in Funs[X,Y] is indistinguishable from a function in S_F.

Now, similar to a stream cipher, we consider an adversary who is trying to distinguish between a truly random f'n and a PRF. Then the adversary will query the challenger and receive either the result of a truly random f'n or a PRF. If the adversary can distinguish the two, she breaks the PRF. If not, the PRF is said to be secure.

Note that even if a function has a non-random like behaviour at one point, it can be broken.

Note this the security of a block cipher will follow this very closely, because block ciphers are based on PRPs and PRFs.

PRGs from PRFs

Let F: K, {0,1}^n -> {0,1}^n be a secure PRF. Then the following G: K -> {0,1}^nt is a secure PRG:

G(k) = F(k,0) || F(k,1) || ... || F(k,t)

A key property of this generator is that it is parallelizable. Thus, there is a linear speed up with many cores.

Intuitively, the security of the PRG follows directly from the security of the PRF. If the PRF is indist. from truly random, and we are concatenating its output, then the resultant string must also be indist. from random. Thus, the PRG is secure.

This also implies that stream ciphers built on this primative will be parrallelizable!

Now lets look at different block cipher constructions.

DES: Data Encryption Standard

Let's talk about DES.

Fiestel Network

One of the core ideas behind DES was the Fiestel network. The Feistel network allows one to construct arbitrary invertible functions from non-invertible functions. In other words, turning a PRF intro a PRP. The network works as follows:

Given 2n bits, separate them in half. Call the first half the right half and the second the left half. To generate each subsequent block, perform the following operation:

R_i = f_i(R_{i-1}) XOR L_i L_i = R_i-1

Where f is some family of functions (invertible or not).

Algebraically, it is clear that this is indeed invertible.

Thrm: A fiestel network is always invertible.

Proof: To do later.

Thrm (Luby-Rackoff): Given a secure PRF f: K x {0,1}^n -> {0,1}, the 3 round Fiestel network F: K^3 x {0,1}^2n -> {0,1}^2n is a secure PRP.

This theorem implies we can construct a secure block cipher from a PRF using only 3 rounds of a Fiestel network!

Note that the keys in this instance must be truly independent for this to work.

DES: Inner workings

DES is essentially a 16 round Feistel network. Each round function is derived from an initial function f(k,.) which changes with the rounds keys.

The first that happens in DES is the input data is permuted under an intial permutation. Then, the data is sent through 16 rounds of a Fiestel network. Note that the 56 bit DES key is expanded into the round keys, where each round key is 48 bits. Finally the IP inverse permuation is applied, and the ciphertext is output.

The function f takes as arguments a 48 bit round key and 32 bits of data. The first thing that happens is the data goes through an expansion box, which maps the 32 bits to 48 bits. Then this data is XORed with the key. Next, the data is split into 8, 6 bit blocks, and these 6 bit blocks are sent through different substitution boxes, or S-boxes. These S-boxes randomly map their input 6-bits to 4 bit outputs. Finally, these bits are collected and permuted, giving us an 32 bit output.

The question is: how are these S-boxes chosen?

Lets explore some different choises to gain some intuition.

Suppose that the S-boxes were linear, i.e, they output a linear combination of their input bits. This can be viwed as multiplying the input bit vector by a matrix A over F_2:

S(x) = Ax (mod 2)

Now, if DES was truly like this, then the entire cipher would be linear. Therefore, we could represent encryption as:

DES(k, m) = B \dot vec(m, k_1, ..., k_16) = CT

Now, if this were the case, then DES would be a linear function, which would endow it with properties random functions do not have. Therefore, it can be dist. from a random function. E.g:

DES(k,a) XOR DES(k,b) XOR DES(k,c) = DES(k, a XOR b XOR c)

Clearly, a random function does not satisfy this condition in general.

In turns out, even if some S-boxes were linear, or close to linear, DES can be broken quiet easily.

Therefore, the designers of DES had certain rules which dictated how they chose the S-box mappings:

No output bit should be close to a linear f'n of the input bits.
S-boxes are 4-to-1 maps

Next, let's examine the security of DES.

Exhaustive Search Attacks

Let's look at a few attacks on DES.

Our goal when breaking the cipher is, given a few pairs of messages and ciphertexts, we want to find the key.

Lem: Suppose DES is an ideal cipher, that is, DES is composed of truly random invertible functions. Then for all, m and c, there is at most one key, k, s.t. c = DES(k, m), with prob. >= 1 - 1/256

Proof:

Pr[ there exists K' \neq K: C = DES(K,m) = DES(K', m) ] <= \sum_{K' \in {0,1}^56} Pr[ DES(K, m) = DES(K', m) ] <= 2^56 * 1/2^64 = 1/2^8

Therefore, the probability that there does not exists a key s.t. DES(K, m) = DES(K', m) = 1 - 1/2^8 = 99.5 %

Now, given two PT,CT pairs encrypted under the same key, the same analysis tells us that it is almost impossible that the key is not unique. Therefore, two input/output pairs are enough for exhaustive key search.

Exhaustive search works by trying each and every possible key combination on the PT, CT pairs, until you get a match.

DES Challenge

The DES challenge was issued by a company called RSA. The company published a message and cipher text pairs, which they said they encrypted using one secret key. They gave 3 cipher text plaintext pairs, and challeged anyone to decrypt the other cipher texts they released, by finding the key for the first three pairs.

In 1997, the challege was solved in 3 months
In 1998, the EFF machine (deep crack) solved in 3 days (250K)
In 1999, combined search took 22 hours
In 2006, COPACOBANA (120 FPGAs) tooks 7 days (10K)

How can we strenghten DES against these attacks?

One technique was the artificially increase the key space by iterating the cipher. This method was called 3DES.

Now, we get a key size of 168 bits, which is infeasible to crack via exhaustive search. However, it is 3 times slower than DES.

Well, if this works, why not implement 2D? That is:

2E(k,k',m) = E(k',E(k,m)).

Unfortunately, this strategy can be broken using a meet-in-the middle attack.

In the attack, we are given several PT, CT pairs. Our goal is to use these to pairs to find a pair of keys k, k' s.t the encryption under these keys gives a ciphertext we have.

Now, E(k', E(k,m)) = C is the same thing as E(k,m) = D(k',C). It turns out, we can use this fact to break 2DES.

First, we construct a table of length 2^56, which consists of the encryption of m under all possible keys and the resulting ciphertexts. Then, we use each key in the table to try and decrypt C. If C = the value in the CT col of the table, we have a match and we're done.

Now, we had to build the table which took approx. 2^56log(2^56) time, and then check it against every entry, which once again took about 2^56log(2^26) time.

Therefore, our total time is around 2*(2^26log(2^56)). Note that this is far less than 2^112. Therefore, this attack beats exhaustive search by several orders of magnitude.

Now, this same attack on 3DES also works, however it takes time 2^118, and thus is not viable.

Another method of strengthening DES against exhaustive search is using DESX.

Let E be a block cipher on n bits. Define EX as:

EX((k,k',k''), m) = k XOR E(k', m XOR k'')

This gives a theoretical security of 184 bits, but can be broken using an attack in time 2^120 (homework).

Note that XORing only on the inside/outside does nothing!

More Attacks on Block Ciphers

Let's examine more attacks on blockciphers.

One possible attack is a side channel attack, which takes advantage of the way the cipher is implemented. For example, if a smartcard perform encryption of some data, a hacker may be able to examine and measure the performance (time, current) of the card on a very accurate scale and deduce what the cipher is doing, eventually finding the key.

Another type of attack could be a fault attack, which causes the hardware to malfunction and corrupt the data. These computation error in the last rounds can completely expose the key.

Now let let at more sophisticated attacks on blockciphers.

Linear Cryptanalysis

The goal is, given many in/out pairs, can we recover the key in better that exhaustive time?

Linear cryptanalysis works as follows: Suppose we take random linear combinations of the message and ciphertext bits. What's the likely hood they are equal to some linear combination of the key bits? In other words, in:

Pr[ m[i_1] XOR ... XOR m[i_r] XOR c[j_1] XOR ... XOR c[j_v] = k[l_1] XOR ... XOR k[l_u] ] = 1/2 + e

is e neg?

In DES, e = 1/2^21 = 0.000000477. Now, even when this is small, this allows us to break DES.

An eq'n like this can be used to find specific key bits as follows. First, assume there is some linearity as described above. Then, if we test many PT/CT pairs, the relation will hold for the small epsilon. Therefore, the majority of these pairs will have a certain value, and this will be the XOR of some key bits.

By repeating this with different linear attacks, we can derive many bits of the key, either making it feasible to perform an exhaustive search, or breaking the cipher entirely.

In DES, this occurs due to a fault in the design of the 5th Sbox. In DES this can be acheived in time 2^43. Note that this is much smaller than 2^56.

The takeaway is that any linearity whatsover will lead to efficient attacks.

Quantum Attacks

Let us examine a generic search problem, which generalizes the idea of exhaustive search. Let f be a function from a large domain X to {0,1}. Our goal is to find the few inputs on X which yield 1.

On a classical computer, the best we could hope for is time O(|X|), as the function f is given as black box.

However, on a quantum computer this can be acheived in time O(|X|^(1/2))! This is amazing!

What does this have to do with block ciphers? We can define a function f(k) = 1 if E(k,m) = c and 0 otherwise. By the power of quantum computers, we could find a key for DES in time 2^28 and keys for AES in time 2^64 (breakable).

Next, lets examine the AES blockcipher.

The AES Block Cipher

Overtime, it became clear that DES and 3DES were not optimal for modern hardware and had become too slow. Therefore, a new campaign was started to find a new encryption standard, which would be called the Advanced Encryption Standard, or AES.

In 1997, NIST published requests for new cipher proposals. NIST subsequently received 15 submissions and chose 5 finalists. Finally, in 2000, NIST chose the Rijndael cipher as AES, from Belgium.

Rijndael takes keys of 128, 196 and 256 bits, and outputs blocks of 128 bits.

The assumption is that as the key size increases, the cipher becomes more secure.

AES, unlike DES, is built as a substitution-permutation network, rather than a Fiestel network. Unlike a Fiestel network, wherein half the bits remain the same through the next round, an SP network changes all of these bits at every iteration.

The SP network works as follows: First the current state is XORed with a key. Then the state goes through a substitution layer, wherein S boxes are used to subsitute data. Then, the state is passed through a permutation layer, where P boxes permute the data.

Note, that decryption works by applying the network backwards, therefore S boxes and P boxes are invertible.

Specifically, AES operates on a 128 bit block, which is 16 bytes. In AES, we write this data as a 4 by 4 block and manipulate this block throughout the working of the cipher. First, the block is XORed with the round key. Next the block's bytes are substituted, then the rows of the block are shifted, and finaly the columns of the block are mixed.

This is repeated 10 times, and finally XORed with the last round key, giving us the output.

The round keys come from a key expansion, which expands 16 bytes of the key into 11, 16 bytes round keys.

The advantage of AES lies in the fact that every step of the round function (byte sub, row mix, col mix) can be implemented using lookup tables and XORs. Therefore, there is an inherent size/speed tradeoff when using AES. In implementations where space is constrained, tables can be computed on the fly at the cost of performance. However, in high end systems, tables can be precomputed, reducing AES to simple XORs and table lookups, which are incredibly fast.

Furthermore, AES has considerable hardware support. Due to it's ubiquity, chip manufactureres such as Intel and AMD have built custom AES instructions which run implement one round of AES on the processor. They claim that this can result in an over 14 times better performance than C or C++ implementations (such as those found in OpenSSL).

All of this begs the question. What attacks does AES suffer from?

The best key recovery attack is only 4 times faster than exhaustive search. This means that AES keys can be recovered using this attack in time 2^126.

There is also another attack called a related key attack which breaks AES-256 in time 2^99. This takes advantage of a flaw in AES's key expansion mechanism.

Block Cipher From PRGs

Can we build block cipher from simple primatives from PRGs?

Before that, lets ask the question: Can we build a PRF from a PRG?

Suppose we have a PRG G: K -> K^2 which is secure.

Now, we can construct a 1-bit PRF, F: K x {0,1} -> K, as follows:

F(k, x) = G(k)[x]

Where [x] is the index of the first of second element of K in G(k).

Clearly, the PRF F is secure, as the output of G is indist. from random.

This method can be extended to construct arbitrarily large secure PRFs in the following way:

Let G be a PRG s.t. G: K -> K^2. Define F: K x {0,1}^n -> K as follows:

F(k, {0,1}^n) = k -> G(k)[x_0] -> k_1 -> G(k_1)[x_1] -> ... -> k_{n-1} -> G(k_{n-1})[x_{n-1}] -> k_n

That is, the i-th bit of the input is used to index the recursive application of G on itself n-1 times, where each application was performed by applying G to each K block element of it's output.

This construction is called the GGM PRG, after Goldreich, Goldwasser and Macali, who invented it.

Using this idea, as well as the Luby-Rackoff theorem, we can generate secure PRPs using only a secure PRG.

siddMahen/crypto-week-two.md