-
-
Save mgalgs/5286726 to your computer and use it in GitHub Desktop.
#!/usr/bin/env python3 | |
# REQUIRES: | |
# pyskein http://pythonhosted.org/pyskein/ | |
# python > 3.0 (due to pyskein) | |
# | |
# INSTALLATION: | |
# (install virtualenv) | |
# (install python3) | |
# virtualenv -p /path/to/python3 skein | |
# source skein/bin/activate | |
# pip install pyskein | |
import sys | |
import random | |
import struct | |
import string | |
from skein import skein1024 | |
def hash2bytes(thehash): | |
return struct.unpack('128B', bytearray.fromhex(thehash)) | |
THRESHOLD=421 # from http://almamater.xkcd.com/best.csv (search for usu.edu) | |
target = '5b4da95f5fa08280fc9879df44f418c8f9f12ba424b7757de02bbdfbae0d4c4fdf9317c80cc5fe04c6429073466cf29706b8c25999ddd2f6540d4475cc977b87f4757be023f19b8f4035d7722886b78869826de916a79cf9c94cc79cd4347d24b567aa3e2390a573a373a48a5e676640c79cc70197e1c5e7f902fb53ca1858b6' | |
target_bytes = hash2bytes(target) | |
def sendit(what, offby): | |
print () | |
print ('New best hash found!') | |
print ('---BEGIN STRING:---') | |
print ('%s' % what) | |
print ('---END STRING---') | |
print ('(should be off by %d)' % offby) | |
print () | |
def hashdiff(hash_Y_bytes,hash_Z_bytes): | |
nbits = 0 | |
# see how many bits are different | |
for Y, Z in zip(hash_Y_bytes, hash_Z_bytes): | |
X = Y ^ Z | |
while(X): | |
X &= X - 1 | |
nbits += 1 | |
return nbits | |
def newstring(strlen,candidate_chars): | |
theString = ''.join(random.choice(candidate_chars) for x in range(strlen)) | |
return theString | |
print() | |
best_hash = 1024 | |
min_hash = '' | |
nattempts = 0 | |
n = 0 | |
candidate_chars = string.printable[:-6] | |
strlen = 2 | |
attempt = newstring(strlen,candidate_chars) | |
while True: | |
attempt_hash = skein1024(bytes(attempt, 'ascii'), digest_bits=1024).hexdigest() | |
attempt_bytes = hash2bytes(attempt_hash) | |
d = hashdiff(attempt_bytes,target_bytes) | |
if d < best_hash: | |
best_hash = d | |
min_hash = attempt | |
if d < THRESHOLD: | |
sendit(attempt,d) | |
THRESHOLD = d | |
nattempts += 1 | |
n += 1 | |
if (n >> 10): | |
sys.stdout.write('\r%d hashes compared. Best so far is %d' % (nattempts, best_hash)) | |
attempt = newstring(strlen-1,candidate_chars) | |
n = 0 | |
attempt = attempt + random.choice(candidate_chars) | |
while :; do | |
rand=$(echo $RANDOM | sha1sum | cut -f1 -d' ') | |
len=$(shuf -i 3-31 -n 1) | |
try=${rand:$len} | |
output=$(curl -s -X POST --data "hashable=$try" http://almamater.xkcd.com/?edu=usu.edu | grep -o 'It is off by .* bits.') | |
echo "$try $output" | |
done | tee -a tries.txt |
I'm not really talking about submitting pre-hashed inputs to the server, I'm thinking about hashing the input locally, and comparing it to the given hash. If it is not within X bits of the given hash, then don't bother sending it to the server at all. (where X is the score of your current best hash)
Holy crap you're a freaking genius. I'll update it if I can find a Skein implementation...
There are a bunch listed in the "Links" section of http://www.schneier.com/skein.html. I have only tried the bash one so far, and it was actually slower than just submitting to the xkcd server.
Just uploaded a python version... The hash is correct but counting the number of bits is not working correctly... Not sure where my bug is...
UPDATE: uploaded a working local python version!
The local version is cranking out about 1000/second... I've been running for a few minutes and have gotten down to 433...
We need someone with a login on the USU HPC cluster to submit a job with this script...
Just got 428
! New record for USU...
421
... We're now top 125...
I think the only thing that you can do with this is throw more compute power at it. If you were trying to find the exact input used to generate the hash, then you would want to limit your attempts to things that were most likely to be included, like printable characters for example.
However, based on what I know about hash algorithms, unless you think there is a possibility that you will actually guess the original input, then a random guess is every bit as likely to have a "close" hash value as a more intelligent guess would be. Since I would expect that having the correct input except with one character wrong may give me a hash with a difference of 500 bits, I see no motivation to make intelligent guesses.
The only approach that makes more sense than brute force is if you could find a way to tweak your input to achieve a hash that is closer than the previous one. My guess is that this nearly impossible as well, because otherwise you could find an iterative solution that should eventually converge on the right answer, thus making the hash insecure.
Yeah, I thought about getting rid of the printable ascii requirement... I just need to see if I can enter non-ascii data on the submit form (if not through the web interface maybe I can make it work with wget
)...
UPDATE: All future development will happen here: https://github.com/mgalgs/xkcd-1193
Native C implementation added (at the above URL)
Update: the script now tries
$RANDOM
stuff. No more dictionary, no more manual range editing. Just let 'er fly!