Skip to content

Instantly share code, notes, and snippets.

@jweyrich
Created November 21, 2017 13:01
Show Gist options
  • Save jweyrich/901c903044062954acba2b88ea4ad8f7 to your computer and use it in GitHub Desktop.
Save jweyrich/901c903044062954acba2b88ea4ad8f7 to your computer and use it in GitHub Desktop.
Attempt to detect the charset of files recursively
#!/usr/bin/python
import chardet
import os
# traverse root directory, and list directories as dirs and files as files
for root, dirs, files in os.walk("."):
path = root.split('/')
for file in files:
print '%s/%s => %s (%s)' % (root, file, chardet.detect(file)['encoding'], chardet.detect(file)['confidence'])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment