@kimdwkimdw
Last active July 25, 2017 14:11
Python 2.x Encoding cheatsheet

Reading a file whose encoding is unknown?

Try chardet (https://pypi.python.org/pypi/chardet) to guess the encoding.
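
When chardet is not installed, a rough stdlib-only alternative is to try a short list of candidate encodings in order. This is a different technique from chardet's statistical detection, and it is only a sketch: the function name `decode_with_fallback` and the candidate list are assumptions chosen for illustration.

```python
# -*- coding: utf-8 -*-

def decode_with_fallback(raw, candidates=("utf-8", "cp949")):
    """Decode raw bytes with the first candidate encoding that succeeds.

    Returns a (text, encoding) tuple; raises ValueError if none fit.
    The candidate list is a hypothetical default, not a chardet feature.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


# Bytes as they might come from a file saved on a Korean Windows system.
text, used = decode_with_fallback(u"한글".encode("cp949"))
```

UTF-8 is tried first because its strict validation rarely accepts text written in another encoding, so a successful UTF-8 decode is a fairly strong signal.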

Change the Windows command-line code page to UTF-8

(You also need to change the console font.)

chcp 65001

To switch back to the Korean code page:

chcp 949

A caveat when comparing CJK text that is not in Unicode form (plain byte strings).

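# from http://acuros.pe.kr/?p=249
# Byte strings (str) in a cp949 environment. Parentheses are required: `a in b == True` chains as `(a in b) and (b == True)`.
import random
(random.choice(["효","쨉","홰"]) in "너는 비도 안오는데 우산을 가지고 다니네") == True
# '\xc2\xb5' is 쨉 in cp949; it matches the last byte of 는 ('\xb4\xc2') followed by the first byte of 데 ('\xb5\xa5').
'\xc2\xb5' in '\xb3\xca\xb4\xc2 \xba\xf1\xb5\xb5 \xbe\xc8\xbf\xc0\xb4\xc2\xb5\xa5 \xbf\xed \xb4\xd9\xb4\xcf\xb3\xd7'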

Be careful: this kind of false match is especially likely when searching for a single character.
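
The false match above can be reproduced, and avoided, by comparing the same data first as cp949 bytes and then as Unicode text. The following is a minimal sketch written with explicit `u""` literals so that it runs on both Python 2 and Python 3; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-

sentence = u"너는 비도 안오는데 우산을 가지고 다니네"
needle = u"쨉"  # does not appear anywhere in the sentence

# Byte-level search: the two bytes of the needle straddle the
# boundary between two neighbouring characters in the sentence.
raw_sentence = sentence.encode("cp949")
raw_needle = needle.encode("cp949")
byte_match = raw_needle in raw_sentence   # false positive

# Character-level search: compare Unicode strings instead.
text_match = needle in sentence           # correct answer
```

Decoding to Unicode before searching costs a little time but compares whole characters, so a match can never start in the middle of one.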

Environment variables

PYTHONIOENCODING = utf-8 
PATH = PYTHON_INSTALL_DIR
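
One way to confirm that `PYTHONIOENCODING` takes effect is to start a child interpreter with the variable set and ask it which encoding it uses for standard output. This is only a sketch; the helper name `child_stdout_encoding` is hypothetical, and it relies on the standard `subprocess.check_output` call.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and return the
    value it reports for sys.stdout.encoding."""
    # Copy the current environment and override only this one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return out.decode("ascii").strip()


reported = child_stdout_encoding("utf-8")
```

Setting the variable per process like this leaves the system-wide environment untouched, which is convenient for testing before making the setting permanent.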
# -*- coding: utf-8 -*-
'''
Python Encoding cheatsheet
Mostly for Korean.
Assumes a UTF-8 source file. All expressions below evaluate to True.
'''
"한글" != u"한글"
"한글" == '\xed\x95\x9c\xea\xb8\x80'
u"한글" == u'\ud55c\uae00'
"한글".decode("utf-8") == u"한글"
"한글" == u"한글".encode("utf-8")
"\ud55c\uae00".decode('unicode_escape') == u"한글"
# #2 is faster than #1. Byte-level matching (#2) is reliable for UTF-8, but not for multi-byte encodings such as cp949.
"한글".decode("utf-8") in u"asdf한글asdf" #1
"한글" in u"asdf한글asdf".encode("utf-8") #2