@joongh
Created April 23, 2018 18:25
Detect a file's encoding from the byte order mark (BOM) in its first 4 bytes
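def detect_encoding(file_path):
    """Guess a file's encoding from its byte order mark (BOM)."""
    with open(file_path, 'rb') as f:
        b = f.read(4)
    # Check UTF-32 first: its little-endian BOM (ff fe 00 00) starts with
    # the UTF-16 little-endian BOM (ff fe).
    if b[:4] in (b'\x00\x00\xfe\xff', b'\xff\xfe\x00\x00'):
        return 'utf-32'
    if b[:3] == b'\xef\xbb\xbf':
        return 'utf-8'  # 'utf-8-sig' would also strip the BOM when decoding
    if b[:2] in (b'\xfe\xff', b'\xff\xfe'):
        return 'utf-16'
    return 'euc-kr'  # default when no BOM is found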
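
The same BOM checks can be written with the constants in Python's standard codecs module, which avoids hand-typed byte strings. This is only a minimal sketch: the helper name detect_bom_encoding and the _BOMS table are hypothetical, it takes the already-read leading bytes instead of a path, and it returns 'utf-8-sig' (an assumption that differs from the gist's 'utf-8') so that decoding drops the BOM. UTF-32 is tested before UTF-16 because the UTF-32 little-endian BOM begins with the UTF-16 little-endian BOM bytes.

```python
import codecs

# Longest BOMs first: BOM_UTF32_LE (ff fe 00 00) starts with BOM_UTF16_LE (ff fe).
# The table and the helper name are illustrative, not part of the original gist.
_BOMS = (
    (codecs.BOM_UTF32_BE, 'utf-32'),
    (codecs.BOM_UTF32_LE, 'utf-32'),
    (codecs.BOM_UTF8, 'utf-8-sig'),  # assumption: 'utf-8-sig' instead of 'utf-8'
    (codecs.BOM_UTF16_BE, 'utf-16'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
)


def detect_bom_encoding(head, default='euc-kr'):
    """Map the leading bytes of a file to a codec name, or return the default."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

To use it on a file, read the first 4 bytes in binary mode and pass them in. Bytes without a BOM, including an empty file, fall through to the default.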