@townie
Created November 18, 2015 06:26
python_file_dupe.py
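import hashlib
import os

# Hashes of file contents seen so far.
hash_set = set()

for dirpath, dirnames, filenames in os.walk("."):
    for filename in filenames:
        # os.walk yields bare names, so join with the directory to get a usable path.
        path = os.path.join(dirpath, filename)
        # Use a fresh hasher per file; a shared one would accumulate earlier contents.
        hasher = hashlib.md5()
        with open(path, 'rb') as af:
            hasher.update(af.read())
        filehash = hasher.hexdigest()
        if filehash in hash_set:
            # Same content as an earlier file: report it as a duplicate.
            print(path)
        else:
            hash_set.add(filehash)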
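
The script above reads each file fully into memory with `af.read()`, which may be a problem for very large files. A minimal sketch of a chunked alternative follows. The helper name `file_md5` and the `chunk_size` default are assumptions for illustration and are not part of the original gist.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path`, reading it in chunks.

    Hypothetical helper: the name and the default chunk size are illustrative.
    """
    hasher = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Inside the loop, `filehash = file_md5(path)` could stand in for the open/read/hexdigest steps, where `path` is the directory joined with the file name. The digest is the same as hashing the whole file at once, since MD5 updates are cumulative.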