@user202729
Last active May 12, 2023 15:51
Things that are included in commonunicode package but excluded from unicode-math-input package
#!/bin/python3
"""List characters that commonunicode defines but unicode-math-input does not."""
from pathlib import Path
import re
import unicodedata

# Characters that unicode-math-input.sty handles through a special code path.
specially_handled = {
    match[1] for match in
    re.finditer(r'\\__umi_special_handle\{(.)\}',
                Path("unicode-math-input.sty").read_text(encoding="utf-8"))
}
# a: character -> replacement, as defined by unicode-math-input.
a = {match[1]: match[2] for line in
     Path("unicode-math-input-table.tex").read_text(encoding="utf-8").splitlines()
     for match in [re.fullmatch(
         r'\\__umi_define_char(?:_maybe_delimiter)?\{(.*?)\}\{(.*)\}', line)] if match}
# b: character -> replacement, as defined by commonunicode (keyed there by hex code point).
b = {chr(int(match[1], 16)): match[2] for line in
     Path("commonunicode.sty").read_text(encoding="utf-8").splitlines()
     for match in [re.fullmatch(
         r'\\DeclareUnicodeCharacter\{(.*?)\}\{(.*)\}', line)] if match}
better = {*a} - {*b}  # stuff we get better (inspect interactively)
# stuff we miss...?
for u in {*b} - {*a, *specially_handled}:
    print(f"[{u}] U+{ord(u):04X} {unicodedata.name(u, '???')} → {b[u]}")
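
To see what the comparison does without having the three package files at hand, here is a minimal sketch that runs the same matching on a few inline sample lines. The sample lines and the `parse` helper are made up for illustration; only the two line formats (`\__umi_define_char{...}{...}` and `\DeclareUnicodeCharacter{...}{...}`) are taken from the script above.

```python
import re
import unicodedata

# Made-up sample lines in the two formats the script expects.
umi_table = r"""
\__umi_define_char{α}{\alpha }
\__umi_define_char_maybe_delimiter{⟨}{\langle }
"""
common_sty = r"""
\DeclareUnicodeCharacter{03B1}{\ensuremath{\alpha}}
\DeclareUnicodeCharacter{00B5}{\textmu}
% a comment line that matches neither pattern
"""


def parse(text, pattern, key=lambda s: s):
    """Map each character to its replacement, for lines that fully match pattern."""
    result = {}
    for line in text.splitlines():
        match = re.fullmatch(pattern, line)
        if match:
            result[key(match[1])] = match[2]
    return result


# unicode-math-input stores the character itself as the key.
a = parse(umi_table, r'\\__umi_define_char(?:_maybe_delimiter)?\{(.*?)\}\{(.*)\}')
# commonunicode stores a hex code point, so convert it to a character.
b = parse(common_sty, r'\\DeclareUnicodeCharacter\{(.*?)\}\{(.*)\}',
          key=lambda code: chr(int(code, 16)))

# Characters only commonunicode knows about, printed in the gist's format.
missing = sorted(set(b) - set(a))
for u in missing:
    print(f"[{u}] U+{ord(u):04X} {unicodedata.name(u, '???')} → {b[u]}")
```

With these samples, only the micro sign ends up in `missing`, because α appears in both tables and ⟨ appears only on the unicode-math-input side.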

Things not supported by unicode-math:

  • things that are not math: ignore
  • circled digits: can search for "circled numerals" in symbols, but overall there is no specific control sequence, I think
  • stigma: LaTeX has \textstigma? Some package has this symbol, but $Ϛ$ Ϛ $ϛ$ ϛ does not work by default in unicode-math (character not in font)
  • tried \setmathfont{DejaVu Math TeX Gyre}; it doesn't have it either
  • is non-math version of
  • µ is non-math version of μ
  • is deprecated version of
  • ⇰ is... not sure? Similar to \mapsto but not quite; the whole block seems to be unavailable in math fonts: https://en.wikipedia.org/wiki/Arrows_%28Unicode_block%29
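
The "non-math version" and "deprecated version" relationships in the notes above can be cross-checked against Unicode's own normalization data. The following is a minimal sketch using Python's `unicodedata`; the choice of characters is an assumption made for illustration (all four appear in the listing below), not a statement of which characters the notes originally named.

```python
import unicodedata

# (character, normalization form that folds it onto its preferred form)
cases = [
    ("\u00B5", "NFKC"),  # MICRO SIGN: compatibility mapping to Greek mu
    ("\u2126", "NFC"),   # OHM SIGN: canonical singleton mapping to Greek Omega
    ("\u212A", "NFC"),   # KELVIN SIGN: canonical singleton mapping to Latin K
    ("\u2329", "NFC"),   # LEFT-POINTING ANGLE BRACKET: canonical mapping to U+3008
]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


folded = {ch: unicodedata.normalize(form, ch) for ch, form in cases}
for (ch, form), target in zip(cases, folded.values()):
    print(f"{describe(ch)} --{form}--> {describe(target)}")
```

Note that the micro sign only folds under the compatibility forms (NFKC/NFKD), while the ohm and kelvin signs already fold under plain NFC. The angle bracket U+2329 normalizes to U+3008 rather than to the mathematical ⟨ (U+27E8), so normalization alone would not turn it into a math delimiter.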
[’] U+2019 RIGHT SINGLE QUOTATION MARK → \textquoteright
[µ] U+00B5 MICRO SIGN → \textmu
[¤] U+00A4 CURRENCY SIGN → \textcurrency
[“] U+201C LEFT DOUBLE QUOTATION MARK → \textquotedblleft
[̲] U+0332 COMBINING LOW LINE → \ensuremath{\underline{\phantom{x}}}
[‹] U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK → \guilsinglleft
[‣] U+2023 TRIANGULAR BULLET → \ensuremath{\RHD}
[⚠] U+26A0 WARNING SIGN → \ensuremath{\lower .25ex\hbox{\Large $\triangle$\hskip -1.25ex}!\;\,}
[Ϟ] U+03DE GREEK LETTER KOPPA → \ensuremath{K}
[ᇹ] U+11F9 HANGUL JONGSEONG YEORINHIEUH → \COMMONUNICODE@LOCALunknownchar
[Ϛ] U+03DA GREEK LETTER STIGMA → \ensuremath{S}
[‚] U+201A SINGLE LOW-9 QUOTATION MARK → \quotesinglbase
[›] U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK → \guilsinglright
[⑤] U+2464 CIRCLED DIGIT FIVE → \ensuremath{\text{5}}
[㏒] U+33D2 SQUARE LOG → \ensuremath{\log}
[–] U+2013 EN DASH → --
[¢] U+00A2 CENT SIGN → \textcent
[ϡ] U+03E1 GREEK SMALL LETTER SAMPI → \ensuremath{s}
[⁒] U+2052 COMMERCIAL MINUS SIGN → \textdiscount
[̂] U+0302 COMBINING CIRCUMFLEX ACCENT → \ensuremath{\hat{\phantom{x}}}
[‰] U+2030 PER MILLE SIGN → \textperthousand
[©] U+00A9 COPYRIGHT SIGN → \copyright
[»] U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK → \guillemotright
[ϙ] U+03D9 GREEK SMALL LETTER ARCHAIC KOPPA → \ensuremath{q}
[ ] U+202F NARROW NO-BREAK SPACE → \,
[м] U+043C CYRILLIC SMALL LETTER EM → \COMMONUNICODE@LOCALunknownchar
[\x0b] U+000B ??? → ~
[⁅] U+2045 LEFT SQUARE BRACKET WITH QUILL → \textlquill
[П] U+041F CYRILLIC CAPITAL LETTER PE → \COMMONUNICODE@LOCALunknownchar
[¨] U+00A8 DIAERESIS → \"{ }
[☺] U+263A WHITE SMILING FACE → \ensuremath{\smiley}
[р] U+0440 CYRILLIC SMALL LETTER ER → \COMMONUNICODE@LOCALunknownchar
[K] U+212A KELVIN SIGN → \ensuremath{\mathrm K}
[⁈] U+2048 QUESTION EXCLAMATION MARK → {?\kern -.5ex!}
[Ϡ] U+03E0 GREEK LETTER SAMPI → \ensuremath{S}
[ϐ] U+03D0 GREEK BETA SYMBOL → \ensuremath{\beta}
[ ] U+00A0 NO-BREAK SPACE → ~
[‘] U+2018 LEFT SINGLE QUOTATION MARK → \textquoteleft
[и] U+0438 CYRILLIC SMALL LETTER I → \COMMONUNICODE@LOCALunknownchar
[※] U+203B REFERENCE MARK → \textreferencemark
[⁉] U+2049 EXCLAMATION QUESTION MARK → {!\kern -.5ex?}
[‾] U+203E OVERLINE → \ensuremath{\overline{0}}
[⁂] U+2042 ASTERISM → \COMMONUNICODE@LOCALunknownchar
[③] U+2462 CIRCLED DIGIT THREE → \ensuremath{\text{3}}
[—] U+2014 EM DASH → ---
[⑨] U+2468 CIRCLED DIGIT NINE → \ensuremath{\text{9}}
[⑦] U+2466 CIRCLED DIGIT SEVEN → \ensuremath{\text{7}}
[«] U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK → \guillemotleft
[ힰ] U+D7B0 HANGUL JUNGSEONG O-YEO → \COMMONUNICODE@LOCALunknownchar
[®] U+00AE REGISTERED SIGN → \textregistered
[Ϙ] U+03D8 GREEK LETTER ARCHAIC KOPPA → \ensuremath{Q}
[☑] U+2611 BALLOT BOX WITH CHECK → \fbox{\ensuremath{\checkmark}}
[ϟ] U+03DF GREEK SMALL LETTER KOPPA → \ensuremath{k}
[ϛ] U+03DB GREEK SMALL LETTER STIGMA → \ensuremath{s}
[¦] U+00A6 BROKEN BAR → \textbrokenbar
[〚] U+301A LEFT WHITE SQUARE BRACKET → \ensuremath{[}
[〈] U+2329 LEFT-POINTING ANGLE BRACKET → \ensuremath{\langle}
[⑧] U+2467 CIRCLED DIGIT EIGHT → \ensuremath{\text{8}}
[ᄀ] U+1100 HANGUL CHOSEONG KIYEOK → \COMMONUNICODE@LOCALunknownchar
[①] U+2460 CIRCLED DIGIT ONE → \ensuremath{\text{1}}
[㏑] U+33D1 SQUARE LN → \ensuremath{\ln}
[̈] U+0308 COMBINING DIAERESIS → \ensuremath{\ddot{\phantom{x}}}
[℮] U+212E ESTIMATED SYMBOL → \textestimated
[☕] U+2615 HOT BEVERAGE → \COMMONUNICODE@LOCALunknownchar
[②] U+2461 CIRCLED DIGIT TWO → \ensuremath{\text{2}}
[¡] U+00A1 INVERTED EXCLAMATION MARK → \textexclamdown
[⁢] U+2062 INVISIBLE TIMES → {}
[⑥] U+2465 CIRCLED DIGIT SIX → \ensuremath{\text{6}}
[🌩] U+1F329 CLOUD WITH LIGHTNING → \ensuremath{\lightning}
[в] U+0432 CYRILLIC SMALL LETTER VE → \COMMONUNICODE@LOCALunknownchar
[°] U+00B0 DEGREE SIGN → \textsuperscript{o}
[”] U+201D RIGHT DOUBLE QUOTATION MARK → \textquotedblright
[⸘] U+2E18 INVERTED INTERROBANG → \textinterrobangdown
[„] U+201E DOUBLE LOW-9 QUOTATION MARK → \quotedblbase
[е] U+0435 CYRILLIC SMALL LETTER IE → \COMMONUNICODE@LOCALunknownchar
[☐] U+2610 BALLOT BOX → \fbox{\ensuremath{\phantom{{\checkmark}}}}
[ந] U+0BA8 TAMIL LETTER NA → \COMMONUNICODE@LOCALunknownchar
[☹] U+2639 WHITE FROWNING FACE → \ensuremath{\frownie}
[º] U+00BA MASCULINE ORDINAL INDICATOR → \textordmasculine
[☧] U+2627 CHI RHO → \COMMONUNICODE@LOCALunknownchar
[т] U+0442 CYRILLIC SMALL LETTER TE → \COMMONUNICODE@LOCALunknownchar
[〉] U+232A RIGHT-POINTING ANGLE BRACKET → \ensuremath{\rangle}
[™] U+2122 TRADE MARK SIGN → \texttrademark
[‽] U+203D INTERROBANG → \textinterrobang
[¯] U+00AF MACRON → \textasciimacron
[⇰] U+21F0 RIGHTWARDS WHITE ARROW FROM WALL → \ensuremath{\mapsto}
[⁆] U+2046 RIGHT SQUARE BRACKET WITH QUILL → \textrquill
[¿] U+00BF INVERTED QUESTION MARK → \textquestiondown
[ி] U+0BBF TAMIL VOWEL SIGN I → \COMMONUNICODE@LOCALunknownchar
[‱] U+2031 PER TEN THOUSAND SIGN → \textpertenthousand
[Ω] U+2126 OHM SIGN → \ensuremath{\Omega}
[④] U+2463 CIRCLED DIGIT FOUR → \ensuremath{\text{4}}
[ª] U+00AA FEMININE ORDINAL INDICATOR → \textordfeminine
[〛] U+301B RIGHT WHITE SQUARE BRACKET → \ensuremath{]}