Cryptography

FSA March 25, 2017

Foundational ideas in Cryptography

Crypto

Cryptography: code-making / secret keeping
Cryptanalysis: code-breaking / secret revealing

Cryptography is the masonry of software security. It is not the same AS software security.

Randomness

not 'predictable/patterned'
uniformly distributed output
next is unrelated to previous
next is unrelated to previous previous

Random Number Generator

We need random numbers for some algos, including quicksort
Avoiding deadlocks and 'starvation' in parallel programming
simulations/games

Don't be fooled by a seeming non-random pattern in a string of iterations.

Random Number Generators:

quantum mechanics
thermal noise
10th char of most recent tweet?
randomness is subjective in some way, because we may be able to increase the information that we have to predict something

Pseudo-Random Number Generators:

Not truly random, but easier to work with
Easier and deterministic (repeatable)
'die hard' tests – a suite of tests that look at the quality of a pseudo-random number generator
PRNG is less crypto-secure than a CSPRNG

PRNGS use a seed and typically input one iteration's seed as the seed of the next iteration. Check out Mersenne twister and middle squares method.

CSPRNG Pass statistical tests. Satisfies the 'next-bit test' Withstands 'state compromise extensions'

Hashing

Takes some input string ('message') and outputs a string ('digest'). The same message is always going to get the same output. Digest is usually a fixed size. The hashing algorithm is irreversible.

Collisions are a problem (remember our hash table workshop?)

Irreversibility is less strange than it seems. What if I tell you a sum is 15? What were my inputs? You don't know.

'+' <-- addition
sha1
md5
neurons

Possible collisions are infinite.

Why do we hash?

Identity:

Useful for identifying something quickly (think hash table)
bloom filter
git commit ids
equality testing

Storing hashed passwords in your database protects somewhat from compromise. You can hash the password and see if it matches the hash in the DB, without knowing the user's password.

Cryptography:

password verification
integrity verification
some PRNGs
proof of work (in cryptography)

How

pad – we make sure it is of a certain size (our target size)
partition – split the string into chunks of our target size
combine – smoosh all of our chunks into a

AKA Merkle-Damgard construction

For example: Simple Hash

pad string to at least twice the desired length
partition into chunks of desired length
combine (reduce) this partitions using XOR

XOR

XOR TRUTH TABLE
______
00 | 0
01 | 1
10 | 1
11 | 0

XOR produces an evenly distributed output.

Cryptographic Hashing

Pre-image resistance: a hacker can't find collision given hash result – What if I can create something that produces the same results?
Second pre-image resistance: a hacker can't find collision given input message – If you know a file that was used to produce a hash, then you can't come up with another string of text that will produce the same hash. I receive a hashed output. Someone in the middle comes up with a different file that has the same hashed output. This is bad!
Collision resistance: a hacker can't find some arbitrary collision – Impossible to find some arbitrary collision. I can't figure out two messages in general that hash to the same output.

When people say an algo is 'collision resistant', they usually mean all three.

HMAC

Hash-message authentication code

key + message => digest key is a secret

Scenario:

When I buy something I'm asked to make a secret
Secret hashed together with purchase
Creates an HMAC which you can verify again using the 'receipt' (purchase) and the secret

Can be used to confirm message receipt without exposing identity or message.

PBKDF2

Password based key derivation function!

We use this on passwords. It repeatedly calls a hashing algorithm on 'salted' text.

My password cats, adds some secret salt to the front, jazz82cats (an HMAC). The salted password is hashed repeatedly and stored in the database.

Why should we hash passwords before storing in our DB? Avoid having them easily stolen.
Why salt? Common passwords could be guessed via collisions by testing a large quantity of passwords. Helps us avoid rainbow table hacks – because the hacker also needs the salt.
Why repeatedly hash? Slows down algorithms. It slows us down a bit, sure. But for a hacker we significantly slow their brute force attacks by increasing the how many times they have to hash every possible password.

Encryption

TBD

Asymmetric-key (RSA)

TBD

zcaceres/Cryptography-cs-saturday.md