Last active
December 17, 2015 17:09
To clean up duplicated files. For Mac OS X.
# For Mac OS X
# To clean up duplicated files.
# isDuplicate(fileName)
# It returns true if the file is a duplicate.
# It tests that:
# - the file name (before the extension) ends with a space and a digit [0-9],
# - a file of the same size exists without the trailing digit in its name, and
# - the contents of the two files are identical.
# For example: "Word Power Made Easy Norman Lewis 528p_067174190X 2.pdf"
# original: "Word Power Made Easy Norman Lewis 528p_067174190X.pdf"
require 'fileutils'

def isDuplicate(fileName)
  return false unless fileName =~ /^(.*) [0-9](\.\w+)$/
  #return false unless fileName =~ /^(.*).[0-9](\.\w+)$/
  baseName = Regexp.last_match[1]
  extension = Regexp.last_match[2]
  originalFileName = baseName + extension
  # File.exists? was removed in Ruby 3.2; File.exist? works on all versions.
  return false unless File.exist?(originalFileName)
  return false unless File.size(originalFileName) == File.size(fileName)
  # Equal size is not enough, so compare the contents as well.
  FileUtils.compare_file(originalFileName, fileName)
end

def findDuplicateFiles(directoryName)
  foundFileNames = []
  Dir.glob(directoryName + "/*") do |fileName|
    if isDuplicate(fileName)
      foundFileNames << fileName
    end
  end
  return foundFileNames
end

def cleanUp(foundFileNames)
  print "Deleting #{foundFileNames.size} files:\n"
  foundFileNames.each do |fileName|
    File.unlink fileName
    print fileName, "\n"
  end
  print "Finished deleting #{foundFileNames.size} files.\n"
end

# main:
directoryName = ARGV.shift || '.'
foundFileNames = findDuplicateFiles(directoryName)
cleanUp(foundFileNames)
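
The naming rule described in the header comment can be checked on its own, without touching the file system. The following is a minimal sketch, not part of the original script: `original_name_for` is a hypothetical helper, and it anchors the pattern with `\A` and `\z` (whole string) rather than the script's `^` and `$` (line boundaries), which is an assumption about the intended behavior.

```ruby
# Hypothetical helper (not in the original script): returns the name the
# original file would have, or nil when the name has no " <digit>" suffix
# directly before the extension.
def original_name_for(file_name)
  match = file_name.match(/\A(.*) [0-9](\.\w+)\z/)
  return nil unless match

  match[1] + match[2]
end

p original_name_for("Word Power Made Easy Norman Lewis 528p_067174190X 2.pdf")
# => "Word Power Made Easy Norman Lewis 528p_067174190X.pdf"
p original_name_for("notes.txt")      # => nil (no numeric suffix)
p original_name_for("report 12.pdf")  # => nil (only a single digit is matched)
```

As the last call suggests, a copy numbered 10 or higher would not be recognized by this pattern, so such files would be left in place.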