@amundo
Created June 11, 2012 03:16
look up names of characters in text
#!/usr/bin/env python3
"""
pathall@gmail.com
Do Whatever the Fuck You Want To License
http://sam.zoy.org/wtfpl/
letters - pipe some UTF-8 text to this script, and
it will print the Unicode name of each character in the text
that has one.
"""
import sys
from unicodedata import name

# Read and write UTF-8 regardless of the locale (needs Python 3.7+).
sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')

text = sys.stdin.read().strip()
for letter in text:
    try:
        uniname = name(letter)
    except ValueError:
        # Characters with no Unicode name (e.g. newlines) are skipped.
        continue
    code = ord(letter)
    # \u takes exactly four hex digits; code points above U+FFFF need \U and eight.
    escape = "\\u%.4X" % code if code <= 0xFFFF else "\\U%.8X" % code
    print(letter, uniname, "(U+%.4X)" % code, 'u"%s"' % escape)
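
The lookup loop above can also be packaged as a reusable function. This is a minimal sketch assuming Python 3; `describe` is a hypothetical name and not part of the original script. It yields one tuple per named character instead of printing.

```python
from unicodedata import name


def describe(text):
    """Yield (char, unicode_name, codepoint) for each named character in text.

    `describe` is an illustrative helper, not part of the original script.
    """
    for ch in text:
        try:
            uniname = name(ch)
        except ValueError:
            # No entry in the Unicode database (e.g. "\n"); skip it.
            continue
        yield ch, uniname, "U+%.4X" % ord(ch)


# The newline has no Unicode name, so only two rows come back.
rows = list(describe("\u00e9\n\u20ac"))
```

Returning tuples rather than printing keeps the lookup separate from any stdin/stdout encoding concerns, so it can be called from other code.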