Save rbriank/464697 to your computer and use it in GitHub Desktop.
require 'digest/md5'

# usage: run this in the root directory of your iTunes Music folder (or any other folder)
# and redirect the output to a file
# next, run that file through `sort` and save the result to a new file
# now, use the next script on the sorted file
ls = Dir['**/*']
ls.each_with_index do |f, i|
  # progress indicator: print the number of remaining entries to STDERR every 100 entries
  STDERR.puts ls.length - i if (i % 100 == 0)
  next if File.directory?(f)
  # hash the file in chunks instead of reading the whole file into memory
  md5 = Digest::MD5.file(f).hexdigest
  # one line per file: the digest, a single space, then the path
  puts "#{md5} #{f}"
end
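
The usage notes above send the listing through `sort` between the two scripts. On a system without a `sort` command, that middle step could be sketched in Ruby roughly as follows. `sort_listing` and the file names passed on the command line are hypothetical and not part of the original scripts.

```ruby
# Hypothetical helper (not part of the original gist): sort the "md5 path"
# listing so that lines with identical hashes end up next to each other.
def sort_listing(in_path, out_path)
  lines = File.readlines(in_path, chomp: true)
  # String comparison is bytewise, so identical 32-character digests group together.
  File.write(out_path, lines.sort.map { |l| "#{l}\n" }.join)
end

# Assumed invocation: ruby sortlisting.rb hashes.txt filesort.txt
sort_listing(ARGV[0], ARGV[1]) if ARGV.length == 2
```

The sorted file can then be fed to the next script in the same way as the output of `sort`.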
# use at your own risk!
# read the code
# enjoy!
# usage: cat filesort.txt | ruby removedupes.rb
# where filesort.txt is the output of the `sort` command from the last script

# process a group of identical files to delete all but the one with the shortest pathname
def process_group(group)
  # sort longest-first so the shortest pathname ends up last and can be popped off
  group.sort! do |a, b|
    b.length - a.length
  end
  keep = group.pop
  group.each do |f|
    begin
      File.delete(f)
    rescue
      # when I used this, there were 40-ish files I had to delete by hand
      # by grepping the output of this script for ^error and using shell-completion to
      # get the proper filename to delete
      puts "error #{f}"
    end
  end
  puts "kept #{keep}"
end

lasthash = "x"
lastname = "z"
doput = false
group = []
while line = gets
  line = line.chomp
  # split on the first space only: the earlier split(' ') / join(' ') round trip
  # collapsed things like double-spaces in filenames to single spaces, which is the
  # likely cause of the failed deletes mentioned above
  hash, _, name = line.partition(' ')
  if hash == lasthash
    group << lastname
    doput = true
  else
    if doput
      group << lastname
      process_group(group)
    end
    group = []
    doput = false
  end
  lasthash = hash
  lastname = name
end
# the loop only processes a group when a line with a different hash follows it,
# so a group of duplicates at the very end of the input has to be handled here
if doput
  group << lastname
  process_group(group)
end
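
The script above deletes files as it reads each group, so a mistake in the listing is hard to undo. A dry run that only reports what would be kept and deleted could look something like the sketch below. It is a minimal sketch, not the original author's method: it groups files with a Hash in a single pass instead of relying on `sort`, and the helper names `duplicate_groups` and `plan` are hypothetical.

```ruby
require 'digest/md5'

# Hypothetical helper: collect the paths of regular files under root by MD5 digest
# and return only the groups that contain more than one file.
def duplicate_groups(root)
  groups = Hash.new { |h, k| h[k] = [] }
  Dir.glob(File.join(root, '**', '*')).each do |path|
    next unless File.file?(path)
    groups[Digest::MD5.file(path).hexdigest] << path
  end
  groups.values.select { |paths| paths.length > 1 }
end

# Hypothetical helper: pick the shortest pathname to keep, as the script above does,
# and return it together with the remaining paths.
def plan(group)
  keep = group.min_by(&:length)
  [keep, group - [keep]]
end

# Assumed invocation: ruby dryrun.rb /path/to/music
if ARGV.length == 1
  duplicate_groups(ARGV[0]).each do |group|
    keep, extras = plan(group)
    puts "keep   #{keep}"
    extras.each { |f| puts "delete #{f}" }
  end
end
```

Nothing is deleted here. The printed plan can be reviewed first, and because paths never pass through a text file, spacing inside filenames is preserved.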