Unicode values can be written either as decimal integers (“A” is 65, “B” is 66, etc.) or as hexadecimal numbers (“A” is 0x41, “B” is 0x42, etc.).
When scripting with RoboFont or FontTools, one stumbling block at first is that the two notations come up in different contexts. For example, integers are often used in scripts, but hex values are shown in UIs and in the TTX output of cmap (the table that maps Unicode values to glyphs). So it's helpful to know how to convert between them for different types of work.
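As a rough illustration of that mismatch, here is a minimal sketch that prints an integer-keyed mapping in a TTX-like hex form. The sample dict and the helper name ttx_style_lines are made up for this example; in a real script the mapping would typically come from something like fontTools' TTFont.getBestCmap(), which returns integers as keys.

```python
# Hypothetical stand-in for a cmap: integer code points mapped to glyph names.
sample_cmap = {65: "A", 66: "B", 0x20AC: "Euro"}


def ttx_style_lines(mapping):
    """Return one TTX-like <map> line per entry, with the code shown in hex."""
    return [
        f'<map code="{hex(code)}" name="{name}"/>'
        for code, name in sorted(mapping.items())
    ]


for line in ttx_style_lines(sample_cmap):
    print(line)
```

The keys stay plain integers in the script; only the display changes.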
To go from a one-character string to a Unicode integer, you can use ord(), like:
>>> ord("A")
65
To go from an integer to a hex string, you can use hex(), like:
>>> hex(65)
'0x41'
To go from an integer (written in decimal or in hex) to a string, you can use chr(), like:
>>> chr(0x41)
'A'
>>> chr(65)
'A'
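To go from a hex string to an integer, use int() with base 16 (a hex literal such as 0x41 is already an integer, so it needs no conversion), like:
>>> int("0083", 16)
131
>>> int("0x41", 16)
65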
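The conversions above can be combined into a small round trip. This is only a sketch: the helper names to_u_notation and from_u_notation are invented for the example, and the "U+XXXX" form is the common way of writing code points in Unicode charts rather than something RoboFont or FontTools requires.

```python
def to_u_notation(char):
    """Format a single character as 'U+XXXX' (uppercase hex, at least 4 digits)."""
    return f"U+{ord(char):04X}"


def from_u_notation(text):
    """Parse 'U+0041', '0x41' or '41' as hex and return the character."""
    if text.upper().startswith("U+"):
        text = text[2:]
    # int() with base 16 accepts hex strings with or without a '0x' prefix.
    return chr(int(text, 16))


print(0x41 == 65)                 # True: both literals are the same integer
print(to_u_notation("A"))         # U+0041
print(from_u_notation("U+0042"))  # B
print(from_u_notation("0x41"))    # A
```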
I recommend unicodedata2 (https://github.com/mikekap/unicodedata2) instead of the standard library module unicodedata, as the latter often lags behind the latest Unicode version.
Also, fontTools.unicodedata (https://github.com/fonttools/fonttools/blob/master/Lib/fontTools/unicodedata/__init__.py) is my favorite wrapper around unicodedata. It prefers unicodedata2 under the hood and provides some useful additional tools, such as .script(char: str) -> str for the Unicode character property Script (https://www.unicode.org/reports/tr24/), and the conversion between Unicode script codes and OTL script tags: .ot_tags_from_script(script_code: str) -> List[str] ↔ .ot_tag_to_script(tag: str) -> str.
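Here is a minimal sketch of how these modules might be used together, assuming that unicodedata2 and fontTools may or may not be installed. The describe helper and the ft_unicodedata alias are made up for this example; the script-related keys are only filled in when fontTools can be imported.

```python
try:
    import unicodedata2 as unicodedata  # third-party, tracks newer Unicode releases
except ImportError:
    import unicodedata  # standard library fallback

try:
    from fontTools import unicodedata as ft_unicodedata
except ImportError:
    ft_unicodedata = None  # fontTools is optional in this sketch


def describe(char):
    """Collect a few Unicode facts about a single character."""
    info = {
        "codepoint": f"U+{ord(char):04X}",
        "name": unicodedata.name(char, "<unnamed>"),
        "unidata_version": unicodedata.unidata_version,
    }
    if ft_unicodedata is not None:
        script = ft_unicodedata.script(char)
        info["script"] = script
        info["ot_tags"] = ft_unicodedata.ot_tags_from_script(script)
    return info


print(describe("A"))
```

The unidata_version entry shows which Unicode version the imported module was built against, which is a quick way to check whether the standard library copy is behind.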