@zh4n7wm
Last active March 27, 2019 11:11
decode double-encoded data

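Latin-1 maps every code point from U+0000 to U+00FF to the single byte with the same value, so encoding a string to Latin-1 turns its characters back into the raw bytes they were wrongly decoded from. Those bytes can then be decoded with the correct codec.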

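Rule of thumb: whenever you have double-encoded data, undo the extra 'layer' by encoding the garbled text with the codec that was wrongly used to decode it (Latin-1 here), which gives back the original bytes, then decode those bytes with the codec they were really written in (GB2312 in the example below).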

"»Æ¹ûÊ÷".encode("latin1").decode("gb2312")
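The one-liner above can be wrapped in a small helper and checked with a round trip. This is a minimal sketch, not part of the original answer: the function name `repair_mojibake` and its parameters `wrong` and `right` are hypothetical, and it assumes the text was garbled by exactly one wrong decode step.

```python
def repair_mojibake(text: str, wrong: str = "latin1", right: str = "gb2312") -> str:
    """Undo one wrong decode: re-encode with the codec that was wrongly
    applied (recovering the original bytes), then decode correctly.

    Hypothetical helper; assumes a single layer of mis-decoding.
    """
    return text.encode(wrong).decode(right)


# Reproduce the problem: GB2312 bytes that were mistakenly decoded as Latin-1.
original = "黄果树"
garbled = original.encode("gb2312").decode("latin1")

# Reverse it with the helper.
repaired = repair_mojibake(garbled)

# The same call applied to the garbled string from the snippet above.
gist_repaired = repair_mojibake("»Æ¹ûÊ÷")

print(repr(garbled))
print(repaired)
print(gist_repaired)
```

If the text contains characters outside the Latin-1 range, `encode` raises `UnicodeEncodeError`, which usually means a different codec (for example Windows-1252) was the one wrongly applied.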

from: https://stackoverflow.com/questions/20922024/how-to-convert-encoding-in-python
