Let's say you have a model with files attached using Paperclip. You have a couple of million such files, and you're not sure that every one of them (and all its thumbnails) is still used by a database record.
You could use this rake task to recursively scan all the directories and check if the files need to be kept or destroyed.
In this example, the model is called Picture, the attachment is image, and the path is partitioned like images/001/412/497/actual_file.jpg.
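To make the partitioning concrete, here is a minimal sketch of the mapping between a record ID and its directory, assuming the usual zero-padded, nine-digit layout. The helper names `id_to_partition` and `partition_to_id` are hypothetical and only illustrate the idea; they are not taken from the rake task or from Paperclip.

```ruby
# Hypothetical helpers for illustration only.
# Assumes IDs fit in nine digits, as in images/001/412/497.

# Turn a numeric record ID into its partitioned directory.
def id_to_partition(id)
  format("%09d", id).scan(/\d{3}/).join("/")
end

# Recover the record ID from a path ending in three triplets of digits.
# Returns nil when the path does not end that way (e.g. it points at a file).
def partition_to_id(path)
  match = path.match(%r{(?:\A|/)(\d{3})/(\d{3})/(\d{3})/?\z})
  match && match.captures.join.to_i
end

puts id_to_partition(1412497)               # prints 001/412/497
puts partition_to_id("images/001/412/497")  # prints 1412497
```

The leading zeros are dropped when the nine digits are converted back to an integer, which is why "001/412/497" corresponds to ID 1412497.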
The task walks down the directory tree. Each time a path ends with three triplets of digits ("001/412/497", for example), it looks for a record with the ID 1412497. If no such record exists, the whole directory is moved to a parallel images_deleted directory. At the end you can delete those files if you like, or move them to an archive location.
You can use the "dry run" mode to print which files would be removed.
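The scan and the dry run could be sketched roughly as follows. This is not the rake task itself: `sweep_orphans`, its parameters, and the `record_exists` callable are hypothetical names chosen for illustration. In a Rails rake task, the callable would typically wrap something like `Picture.exists?(id)`; here it is passed in so the sketch runs without a database.

```ruby
require "fileutils"
require "find"

# Hypothetical sketch. Walks `root`; for every directory whose path ends in
# three triplets of digits, asks `record_exists` whether that ID is in use.
# Unused directories are moved under `deleted_root`, or only printed when
# dry_run is true. Returns the list of orphaned directories.
def sweep_orphans(root, deleted_root, dry_run: true, record_exists:)
  orphans = []

  # Collect first, move afterwards, so the tree is not modified mid-walk.
  Find.find(root) do |path|
    next unless File.directory?(path)
    match = path.match(%r{/(\d{3})/(\d{3})/(\d{3})\z})
    next unless match

    id = match.captures.join.to_i
    orphans << path unless record_exists.call(id)
    Find.prune # nothing below a record directory needs scanning
  end

  orphans.each do |path|
    relative = path.delete_prefix(root).sub(%r{\A/}, "")
    target = File.join(deleted_root, relative)
    if dry_run
      puts "Would move #{path} -> #{target}"
    else
      FileUtils.mkdir_p(File.dirname(target))
      FileUtils.mv(path, target)
    end
  end

  orphans
end
```

A call might look like `sweep_orphans("images", "images_deleted", dry_run: true, record_exists: ->(id) { Picture.exists?(id) })`. Defaulting `dry_run` to true is a design choice of this sketch, so that nothing moves unless it is asked for explicitly.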