Skip to content

Instantly share code, notes, and snippets.

@mgalgs
Last active December 15, 2015 16:09
Show Gist options
  • Save mgalgs/5286726 to your computer and use it in GitHub Desktop.
Save mgalgs/5286726 to your computer and use it in GitHub Desktop.
Brute force http://xkcd.com/1193/
#!/usr/bin/env python3
# REQUIRES:
# pyskein http://pythonhosted.org/pyskein/
# python > 3.0 (due to pyskein)
#
# INSTALLATION:
# (install virtualenv)
# (install python3)
# virtualenv -p /path/to/python3 skein
# source skein/bin/activate
# pip install pyskein
import sys
import random
import struct
import string
from skein import skein1024
def hash2bytes(thehash):
return struct.unpack('128B', bytearray.fromhex(thehash))
THRESHOLD=421 # from http://almamater.xkcd.com/best.csv (search for usu.edu)
target = '5b4da95f5fa08280fc9879df44f418c8f9f12ba424b7757de02bbdfbae0d4c4fdf9317c80cc5fe04c6429073466cf29706b8c25999ddd2f6540d4475cc977b87f4757be023f19b8f4035d7722886b78869826de916a79cf9c94cc79cd4347d24b567aa3e2390a573a373a48a5e676640c79cc70197e1c5e7f902fb53ca1858b6'
target_bytes = hash2bytes(target)
def sendit(what, offby):
print ()
print ('New best hash found!')
print ('---BEGIN STRING:---')
print ('%s' % what)
print ('---END STRING---')
print ('(should be off by %d)' % offby)
print ()
def hashdiff(hash_Y_bytes,hash_Z_bytes):
nbits = 0
# see how many bits are different
for Y, Z in zip(hash_Y_bytes, hash_Z_bytes):
X = Y ^ Z
while(X):
X &= X - 1
nbits += 1
return nbits
def newstring(strlen,candidate_chars):
theString = ''.join(random.choice(candidate_chars) for x in range(strlen))
return theString
print()
best_hash = 1024
min_hash = ''
nattempts = 0
n = 0
candidate_chars = string.printable[:-6]
strlen = 2
attempt = newstring(strlen,candidate_chars)
while True:
attempt_hash = skein1024(bytes(attempt, 'ascii'), digest_bits=1024).hexdigest()
attempt_bytes = hash2bytes(attempt_hash)
d = hashdiff(attempt_bytes,target_bytes)
if d < best_hash:
best_hash = d
min_hash = attempt
if d < THRESHOLD:
sendit(attempt,d)
THRESHOLD = d
nattempts += 1
n += 1
if (n >> 10):
sys.stdout.write('\r%d hashes compared. Best so far is %d' % (nattempts, best_hash))
attempt = newstring(strlen-1,candidate_chars)
n = 0
attempt = attempt + random.choice(candidate_chars)
while :; do
rand=$(echo $RANDOM | sha1sum | cut -f1 -d' ')
len=$(shuf -i 3-31 -n 1)
try=${rand:$len}
output=$(curl -s -X POST --data "hashable=$try" http://almamater.xkcd.com/?edu=usu.edu | grep -o 'It is off by .* bits.')
echo "$try $output"
done | tee -a tries.txt
@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

Update: the script now tries $RANDOM stuff. No more dictionary, no more manual range editing. Just let 'er fly!

@nispio
Copy link

nispio commented Apr 1, 2013

I'm not really talking about submitting pre-hashed inputs to the server, I'm thinking about hashing the input locally, and comparing it to the given hash. If it is not within X bits of the given hash, then don't bother sending it to the server at all. (where X is the score of your current best hash)

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

Holy crap you're a freaking genius. I'll update it if I can find a Skein implementation...

@nispio
Copy link

nispio commented Apr 1, 2013

There are a bunch listed in the "Links" section of http://www.schneier.com/skein.html. I have only tried the bash one so far, and it was actually slower than just submitting to the xkcd server.

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

Just uploaded a python version... The hash is correct but counting the number of bits is not working correctly... Not sure where my bug is...

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

UPDATE: uploaded a working local python version!

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

The local version is cranking out about 1000/second... I've been running for a few minutes and have gotten down to 433...

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

We need someone with a login on the USU HPC cluster to submit a job with this script...

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

Just got 428! New record for USU...

@mgalgs
Copy link
Author

mgalgs commented Apr 1, 2013

421... We're now top 125...

@nispio
Copy link

nispio commented Apr 2, 2013

I think the only thing that you can do with this is throw more compute power at it. If you were trying to find the exact input used to generate the hash, then you would want to limit your attempts to things that were most likely to be included, like printable characters for example.

However, based on what I know about hash algorithms, unless you think there is a possibility that you will actually guess the original input, then a random guess is every bit as likely to have a "close" hash value as a more intelligent guess would be. Since I would expect that having the correct input except with one character wrong may give me a hash with a difference of 500 bits, I see no motivation to make intelligent guesses.

The only approach that makes more sense than brute force is if you could find a way to tweak your input to achieve a hash that is closer than the previous one. My guess is that this nearly impossible as well, because otherwise you could find an iterative solution that should eventually converge on the right answer, thus making the hash insecure.

@mgalgs
Copy link
Author

mgalgs commented Apr 2, 2013

Yeah, I thought about getting rid of the printable ascii requirement... I just need to see if I can enter non-ascii data on the submit form (if not through the web interface maybe I can make it work with wget)...

@mgalgs
Copy link
Author

mgalgs commented Apr 2, 2013

UPDATE: All future development will happen here: https://github.com/mgalgs/xkcd-1193

Native C implementation added (at the above URL)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment