Removing duplicate files

Removes duplicate files from a directory tree by comparing cksum checksums and sizes.

Find duplicates and write to duplicates-report.txt

find . -type f -exec cksum {} \; | sort | tee /tmp/f.tmp | cut -f 1,2 -d ' ' | uniq -d | sed 's/^/^/; s/$/ /' | grep -f /dev/stdin /tmp/f.tmp > duplicates-report.txt
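
To see what the report contains, the same idea can be tried on a scratch directory. This is a minimal sketch, not the gist's exact pipeline: the file names and the `demo`, `sums` and `report` variables are made up for illustration, and the grep step is swapped for a two-pass awk that keeps every line whose checksum and size pair occurs more than once.

```shell
#!/usr/bin/env bash
# Sketch: build a throwaway directory and list its duplicate files.
set -euo pipefail

demo=$(mktemp -d)     # scratch tree (hypothetical, not part of the gist)
sums=$(mktemp)        # checksum listing, kept outside the scanned tree
report=$(mktemp)      # plays the role of duplicates-report.txt

printf 'same content\n' > "$demo/a.txt"
printf 'same content\n' > "$demo/b copy.txt"   # duplicate of a.txt, name contains a space
printf 'different\n'    > "$demo/c.txt"

# cksum prints "<crc> <size> <path>"; sorting puts identical crc/size pairs next to each other.
(cd "$demo" && find . -type f -exec cksum {} + | sort > "$sums")

# First pass counts each "<crc> <size>" key, second pass prints lines whose key repeats.
awk 'NR == FNR { seen[$1 " " $2]++; next } seen[$1 " " $2] > 1' "$sums" "$sums" > "$report"

cat "$report"
```

Each line of the report has the form `<crc> <size> <path>`, so files with identical content sit on consecutive lines.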

Delete duplicates based on duplicates-report.txt, keeping the first file of each group and listing the removed paths in duplicates-only.txt

sort -s -k1,2 duplicates-report.txt | gawk '{ key = $1 " " $2; if (key == old) { sub(/^[^ ]+ [^ ]+ /, ""); print }; old = key }' | tee duplicates-only.txt | xargs -r -d '\n' rm --
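
Because deletion is irreversible, it can help to preview the list before removing anything. The following is a rough sketch under stated assumptions: `list_extra_copies` is a hypothetical helper, the scratch files are invented for the demonstration, and the report is assumed to hold `<crc> <size> <path>` lines with one path per line (paths containing newlines are not handled).

```shell
#!/usr/bin/env bash
# Sketch: print every copy after the first in each checksum group, then delete them.
set -euo pipefail

# list_extra_copies REPORT (hypothetical helper): prints the path of each repeated entry.
list_extra_copies() {
  sort -s -k1,2 "$1" |
    awk '{ key = $1 " " $2
           if (key == old) { sub(/^[^ ]+ [^ ]+ /, ""); print }   # strip crc and size, keep path
           old = key }'
}

demo=$(mktemp -d)     # scratch tree with two identical files (illustrative only)
report=$(mktemp)
printf 'x\n' > "$demo/a original.txt"
printf 'x\n' > "$demo/b copy.txt"
(cd "$demo" && find . -type f -exec cksum {} + | sort > "$report")

# Dry run: show what would be removed.
(cd "$demo" && list_extra_copies "$report")

# Real run: remove the listed paths one by one, so spaces in names are safe.
(cd "$demo" && list_extra_copies "$report" | while IFS= read -r path; do rm -- "$path"; done)
```

Which copy survives depends only on sort order: the first path in each group is kept, so review the dry-run output if a particular copy matters.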
