@alegomes
alegomes / iconv
Created May 27, 2012 00:28
Converting unknown charset file to UTF-8
I had a dataset that was not UTF-8, so I had to find out which charset it was using. The 'file' command didn't help me out.
$ file file_name.csv
file_name.csv: Non-ISO extended-ASCII C++ program text, with very long lines, with CRLF line terminators
So, I made this bash script to figure out its encoding:
First, I converted the file from every single encoding that 'iconv' supports:
$ for f in $(iconv -l); do echo "Converting $f ..."; iconv -f "$f" -t UTF-8 < file_name.csv > "file_name.$f.csv"; done
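The one-liner above leaves an output file behind even when a conversion fails, and on some systems (glibc, as far as I know) 'iconv -l' prints names with a trailing "//" or with commas, which breaks the output path. Below is a rough sketch of the same idea as a function that keeps only the conversions that succeed. The name try_encodings, its arguments and the output directory layout are my own assumptions for illustration, not part of the original script.

```shell
# Hypothetical helper (not from the original gist).
# Usage: try_encodings INPUT OUTDIR [ENCODING...]
# Prints the name of every encoding iconv could decode INPUT from,
# and leaves the UTF-8 result in OUTDIR/<encoding>.csv.
try_encodings() {
    input=$1
    outdir=$2
    shift 2

    # With no explicit candidates, fall back to everything iconv lists.
    # tr turns the comma-separated form some iconv builds print into words.
    [ "$#" -gt 0 ] || set -- $(iconv -l | tr ',' ' ')

    mkdir -p "$outdir"
    for enc in "$@"; do
        enc=${enc%//}                            # drop glibc's trailing "//"
        [ -n "$enc" ] || continue
        safe=$(printf '%s' "$enc" | tr '/' '_')  # no slashes in file names
        out="$outdir/$safe.csv"
        if iconv -f "$enc" -t UTF-8 < "$input" > "$out" 2>/dev/null; then
            echo "$enc"                          # conversion succeeded
        else
            rm -f "$out"                         # discard partial output
        fi
    done
}
```

For example, 'try_encodings file_name.csv converted' tries every listed encoding and writes the survivors under 'converted/'. A successful conversion only means the bytes were valid in that encoding, not that the text is right, so the remaining files still have to be inspected.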