Created May 26, 2011 23:02
Python MD5 of remote file (URL)
import hashlib, urllib2, optparse

def get_remote_md5_sum(url, max_file_size=100*1024*1024):
    # Stream the response in 4 KiB chunks so the whole file never sits in memory.
    remote = urllib2.urlopen(url)
    hash = hashlib.md5()

    total_read = 0
    while True:
        data = remote.read(4096)
        # Count the bytes actually returned; a chunk may be shorter than 4096.
        total_read += len(data)

        # Stop at end of stream, or once the size cap is exceeded.
        # In the second case the digest covers only the data read so far.
        if not data or total_read > max_file_size:
            break

        hash.update(data)

    return hash.hexdigest()

if __name__ == '__main__':
    opt = optparse.OptionParser()
    opt.add_option('--url', '-u', default='http://www.google.com')

    options, args = opt.parse_args()
    print get_remote_md5_sum(options.url)
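The script above targets Python 2 (`urllib2`, `print` statement). For readers on Python 3, here is a minimal sketch of the same idea, assuming `urllib.request` as the replacement for `urllib2`. The `chunk_size` parameter is an addition for illustration, and unlike the original this version hashes at most `max_file_size` bytes exactly instead of stopping after the chunk that crosses the limit.

```python
import hashlib
import urllib.request


def get_remote_md5_sum(url, max_file_size=100 * 1024 * 1024, chunk_size=4096):
    """Return the hex MD5 of at most max_file_size bytes served at url."""
    digest = hashlib.md5()
    total_read = 0
    with urllib.request.urlopen(url) as remote:
        while total_read < max_file_size:
            # Never request more than the remaining budget.
            data = remote.read(min(chunk_size, max_file_size - total_read))
            if not data:
                break  # end of stream
            total_read += len(data)
            digest.update(data)
    return digest.hexdigest()
```

Calling `get_remote_md5_sum('http://www.google.com')` would mirror the original default, but any URL scheme that `urlopen` accepts should work, including `file://`.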
If I run it on the same URL twice, it always produces a different hash, even though the file did not change. I can't figure out how to fix that. Any help would be appreciated. Or does it work as designed?
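This is probably working as designed. MD5 is deterministic, so two different digests mean the server returned two different bodies. The script's default URL, http://www.google.com, is a dynamically generated page, so its bytes likely differ between requests (an assumption, not verified against the actual responses). The sketch below, in Python 3, uses in-memory streams in place of a network response to show that identical bytes give an identical digest regardless of chunk size; `md5_of_stream` is a hypothetical helper, not part of the gist.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=4096):
    """Hash a file-like object chunk by chunk and return the hex digest."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


body = b"same bytes every time" * 1000

first = md5_of_stream(io.BytesIO(body))
second = md5_of_stream(io.BytesIO(body), chunk_size=7)
changed = md5_of_stream(io.BytesIO(body + b"x"))

print(first == second)   # True: same bytes, same digest, chunking is irrelevant
print(first == changed)  # False: a single extra byte changes the digest
```

To check a real URL, save the response body to disk twice and diff the two files; if they differ, the hashes are correct and the content is what changes.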
Note to myself: urlopen doesn't work with an umlaut in the URL (use request instead).
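If switching libraries is not an option, one possible workaround is to percent-encode the non-ASCII characters before passing the URL to `urlopen`. This is a minimal Python 3 sketch; `to_ascii_url` is a hypothetical helper, and it assumes the host name is already ASCII and that the server expects UTF-8 in the path and query.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path and query of url."""
    parts = urlsplit(url)
    # '%' is kept safe so existing escapes are not encoded a second time.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


print(to_ascii_url("http://example.com/münchen.txt"))
# http://example.com/m%C3%BCnchen.txt
```

The encoded result can then be handed to `urlopen` as usual.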