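Encoding to Latin-1 maps each character U+0000 to U+00FF back to the single byte with the same value, so it recovers the original bytes, which can then be decoded with the correct codec.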
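Rule of thumb: whenever you have double-encoded data, undo the extra 'layer' of encoding by decoding to Unicode using the codec of that extra layer (a Python 3 str is already at this stage), then encoding again with Latin-1 to get the original bytes back. Those bytes can then be decoded with the codec they were really written in, here GB2312: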
"»Æ¹ûÊ÷".encode("latin1").decode("gb2312")
from: https://stackoverflow.com/questions/20922024/how-to-convert-encoding-in-python
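The one-liner above can be wrapped in a small helper and checked with a round trip. This is only a sketch: the function name `fix_mojibake` and its parameter names are made up for illustration, and the default codecs simply mirror the Latin-1/GB2312 pair used in this note.

```python
def fix_mojibake(text, wrong_codec="latin1", real_codec="gb2312"):
    """Undo one layer of mis-decoding (hypothetical helper, not from the linked answer)."""
    # Step 1: re-encode with the codec that was wrongly used to decode,
    # which recovers the original byte sequence.
    raw = text.encode(wrong_codec)
    # Step 2: decode those bytes with the codec they were really written in.
    return raw.decode(real_codec)


original = "黄果树"
# Simulate the damage: GB2312 bytes misread as Latin-1.
garbled = original.encode("gb2312").decode("latin1")
# The helper should restore the original text.
assert fix_mojibake(garbled) == original
```

If the garbled text contains characters outside the Latin-1 range (for example because the bytes were actually misread as Windows-1252), the encode step raises UnicodeEncodeError, and a different `wrong_codec` is probably needed.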