@nassimhaddad
Created October 20, 2014 09:55
How to determine the encoding of a CSV file?

Source: http://pandaproject.net/docs/determining-the-encoding-of-a-csv-file.html

If you have no way of finding out the correct encoding of the file, then try the following encodings, in this order:

  • utf-8
  • iso-8859-1 (also known as latin-1) (This is the encoding of all census data and much other data produced by government entities.)
  • utf-16

If none of these work, you are very unlikely to determine the encoding without additional information from the source. In theory you may be able to guess the encoding based on the language of the author; however, this is not a recommended practice.
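The trial order described here can be sketched in Python as follows. This is a minimal illustration, not part of the original PANDA documentation: the function name `decodable_encodings` and the sample bytes are hypothetical. One caveat it makes visible: iso-8859-1 maps every possible byte to a character, so decoding with it never raises an error. A clean decode is therefore only a hint, and "works" still means the decoded text looks right when you read it.

```python
# Candidate encodings, in the order suggested above.
CANDIDATE_ENCODINGS = ("utf-8", "iso-8859-1", "utf-16")


def decodable_encodings(data: bytes, candidates=CANDIDATE_ENCODINGS):
    """Return (encoding, decoded_text) pairs for each candidate that
    decodes `data` without raising, preserving the candidate order.

    Hypothetical helper for illustration only. A clean decode does not
    prove the encoding is correct; inspect the text yourself.
    """
    results = []
    for encoding in candidates:
        try:
            results.append((encoding, data.decode(encoding)))
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    return results


if __name__ == "__main__":
    # Made-up sample: a CSV row containing a non-ASCII character,
    # stored as iso-8859-1. Strict utf-8 decoding rejects it.
    sample = "café,1\n".encode("iso-8859-1")
    for encoding, text in decodable_encodings(sample):
        print(encoding, repr(text))
```

For a real file, read the raw bytes first (for example `open(path, "rb").read()`, where `path` is your own file) and pass them to the function, then check the first candidate whose text reads correctly.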
