@gb96
Last active January 26, 2022 11:06
Roboflow exports image datasets whose file names have been suffixed with hex-encoded (or worse) GUIDs.
# Windows PowerShell script for cleaning up terrible Roboflow dataset file names.
# One dataset I downloaded from Roboflow contained duplicate images, some with one kind of mangled name and
# others with another kind of mangled name.
# Removing all the name mangling is an easy way to clean up the duplicates.
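# Strip the "_jpg.rf.<guid>" part of each name, so that name_jpg.rf.<guid>.jpg becomes name.jpg.
# -File skips directories. A rename fails with an error when the clean name already exists.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File | Rename-Item -NewName { $_.Name -replace '_jpg\.rf\.[0-9a-zA-Z]+\.jpg', '.jpg' }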
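# Any items that could not be renamed must be duplicates, so just remove them.
# Caution: the filter matches every file whose name contains ".rf.", not only .jpg images, so check what is left first.
# Remove-Item takes the files straight from the pipeline; "$_" is not defined outside a script block.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File | Remove-Item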
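
# Optional dry run (a sketch added for illustration, not part of the original script).
# -WhatIf is a standard switch on Rename-Item and Remove-Item: it reports what would
# happen without touching any files. Running this before the real commands shows
# whether the pattern matches the file names in your export.
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File |
    Rename-Item -NewName { $_.Name -replace '_jpg\.rf\.[0-9a-zA-Z]+\.jpg', '.jpg' } -WhatIf
Get-ChildItem -Path . -Filter "*.rf.*" -Recurse -File | Remove-Item -WhatIf

# The regex can also be checked on its own. The file name below is made up.
'cat_001_jpg.rf.0a1b2c3d4e5f.jpg' -replace '_jpg\.rf\.[0-9a-zA-Z]+\.jpg', '.jpg'
# -> cat_001.jpg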