@amundo
Created March 2, 2012 18:53
letters.py - command line tool to look up the name of Unicode characters in a text
#!/usr/bin/env python
"""
pathall@gmail.com
Do What The Fuck You Want To Public License (WTFPL)
http://sam.zoy.org/wtfpl/
letters - pipe some UTF-8 text to this script, and
it will output the Unicode name of each character in the text, if
there is one.
"""
import sys
from unicodedata import name
import codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
content = sys.stdin.read().strip()
text = content.decode('utf-8')
for letter in text:
    try:
        uniname = name(letter)
    except ValueError:
        continue
    print letter, uniname, "(U+%.4X)" % ord(letter), 'u"\\u' + "%.4X" % ord(letter) + '"'
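
The script above is written for Python 2 (`print` statement, `str.decode`). As a rough sketch only, here is how the same lookup might be written for Python 3.7 or later. The function names `describe` and `main` are made up for this example and are not part of the original gist. The sketch also assumes that code points above U+FFFF should be shown with the 8-digit `\U` escape, which the original does not handle, and it drops the `u` string prefix because Python 3 does not need it.

```python
#!/usr/bin/env python3
import sys
from unicodedata import name


def describe(text):
    """Yield one description line per character that has a Unicode name.

    `describe` is a hypothetical helper name, not part of the original script.
    """
    for letter in text:
        try:
            uniname = name(letter)
        except ValueError:
            # Characters without a name (e.g. control characters like "\n")
            # are skipped, as in the original.
            continue
        codepoint = ord(letter)
        # Assumption: use \uXXXX up to U+FFFF and \UXXXXXXXX above it.
        if codepoint <= 0xFFFF:
            escape = '"\\u%04X"' % codepoint
        else:
            escape = '"\\U%08X"' % codepoint
        yield "%s %s (U+%04X) %s" % (letter, uniname, codepoint, escape)


def main():
    # Assumption: input and output are UTF-8, as in the original script.
    # TextIOWrapper.reconfigure() needs Python 3.7+.
    sys.stdin.reconfigure(encoding="utf-8")
    sys.stdout.reconfigure(encoding="utf-8")
    for line in describe(sys.stdin.read().strip()):
        print(line)

# To use this as a filter like the original, call main() here and pipe
# UTF-8 text into the script.
```

Keeping the lookup in a generator separate from the stdin/stdout handling makes it possible to call `describe("some text")` directly from other code or from a test, without touching the terminal encoding.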