FSA March 25, 2017
Foundational ideas in Cryptography
- Cryptography: code-making / secret keeping
- Cryptanalysis: code-breaking / secret revealing
Cryptography is the masonry of software security. It is not the same AS software security.
- not 'predictable/patterned'
- uniformly distributed output
- next is unrelated to previous
- next is unrelated to previous previous
Random Number Generator
- We need random numbers for some algos, including quicksort
- Avoiding deadlocks and 'starvation' in parallel programming
- simulations/games
Don't be fooled by a seeming non-random pattern in a string of iterations.
Random Number Generators:
- quantum mechanics
- thermal noise
- 10th char of most recent tweet?
- randomness is
subjective
in some way, because we may be able to increase the information that we have to predict something
Pseudo-Random Number Generators:
- Not truly random, but easier to work with
- Easier and deterministic (repeatable)
- 'die hard' tests – a suite of tests that look at the quality of a pseudo-random number generator
- PRNG is less crypto-secure than a CSPRNG
PRNGS use a seed and typically input one iteration's seed as the seed of the next iteration. Check out Mersenne twister and middle squares method.
CSPRNG Pass statistical tests. Satisfies the 'next-bit test' Withstands 'state compromise extensions'
Takes some input string ('message') and outputs a string ('digest'). The same message is always going to get the same output. Digest is usually a fixed size. The hashing algorithm is irreversible.
Collisions are a problem (remember our hash table workshop?)
Irreversibility is less strange than it seems. What if I tell you a sum is 15? What were my inputs? You don't know.
- '+' <-- addition
- sha1
- md5
- neurons
Possible collisions are infinite.
Why do we hash?
Identity:
- Useful for identifying something quickly (think hash table)
- bloom filter
- git commit ids
- equality testing
Storing hashed passwords in your database protects somewhat from compromise. You can hash the password and see if it matches the hash in the DB, without knowing the user's password.
Cryptography:
- password verification
- integrity verification
- some PRNGs
- proof of work (in cryptography)
How
- pad – we make sure it is of a certain size (our target size)
- partition – split the string into chunks of our target size
- combine – smoosh all of our chunks into a
AKA Merkle-Damgard construction
For example: Simple Hash
- pad string to at least twice the desired length
- partition into chunks of desired length
- combine (reduce) this partitions using XOR
XOR TRUTH TABLE
______
00 | 0
01 | 1
10 | 1
11 | 0
XOR produces an evenly distributed output.
-
Pre-image resistance: a hacker can't find collision given hash result – What if I can create something that produces the same results?
-
Second pre-image resistance: a hacker can't find collision given input message – If you know a file that was used to produce a hash, then you can't come up with another string of text that will produce the same hash. I receive a hashed output. Someone in the middle comes up with a different file that has the same hashed output. This is bad!
-
Collision resistance: a hacker can't find some arbitrary collision – Impossible to find some arbitrary collision. I can't figure out two messages in general that hash to the same output.
When people say an algo is 'collision resistant', they usually mean all three.
Hash-message authentication code
key + message => digest key is a secret
Scenario:
- When I buy something I'm asked to make a secret
- Secret hashed together with purchase
- Creates an HMAC which you can verify again using the 'receipt' (purchase) and the secret
Can be used to confirm message receipt without exposing identity or message.
Password based key derivation function!
We use this on passwords. It repeatedly calls a hashing algorithm on 'salted' text.
My password cats
, adds some secret salt to the front, jazz82cats
(an HMAC). The salted password is hashed repeatedly and stored in the database.
-
Why should we hash passwords before storing in our DB? Avoid having them easily stolen.
-
Why salt? Common passwords could be guessed via collisions by testing a large quantity of passwords. Helps us avoid rainbow table hacks – because the hacker also needs the salt.
-
Why repeatedly hash? Slows down algorithms. It slows us down a bit, sure. But for a hacker we significantly slow their brute force attacks by increasing the how many times they have to hash every possible password.
TBD
TBD