Last active
January 26, 2022 11:06
Roboflow exports image datasets whose file names have been suffixed with hex-encoded (or worse) GUIDs.
# Windows PowerShell script for cleaning up terrible Roboflow dataset file names.
# One dataset I downloaded from Roboflow contained duplicate images, some with one kind of mangled name and their duplicates
# with another kind of mangled name.
# Removing all the name mangling is an easy way to clean up the duplicates.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File | Rename-Item -NewName { $_.Name -replace '_jpg\.rf\.[0-9a-zA-Z]+\.jpg', '.jpg' }
# Any items that could not be renamed must be duplicates, so just remove them.
# Note: this deletes every remaining *.rf.* file, including any non-.jpg files that the rename above did not match.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File | Remove-Item
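
Because the second command deletes files, it may be worth previewing the renames before running anything. The following is a minimal sketch, not part of the original script: `Get-CleanRoboflowName` is a hypothetical helper name introduced here for illustration, and it assumes the same `_jpg.rf.<suffix>.jpg` naming pattern that the rename step above targets.

```powershell
# Hypothetical helper (not in the original script): strips the Roboflow
# suffix from a single file name using the same pattern as the rename step.
function Get-CleanRoboflowName {
    param([Parameter(Mandatory)][string]$Name)
    $Name -replace '_jpg\.rf\.[0-9a-zA-Z]+\.jpg', '.jpg'
}

# Preview only: list each mangled name next to the name it would receive.
# Nothing on disk is renamed or deleted here.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File |
    Select-Object Name, @{ Name = 'NewName'; Expression = { Get-CleanRoboflowName $_.Name } }
```

Rows where `Name` and `NewName` are identical are files the pattern does not match; those are the ones the removal step would delete. Appending `-WhatIf` to the `Rename-Item` and `Remove-Item` commands gives a similar dry run using the cmdlets themselves.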