@Aldaviva
Last active October 8, 2022 13:02
Examples of Unicode codepoints with different UTF-8 and UTF-16 byte counts. Try pasting these into your program to see if it can handle multi-byte characters.
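| Glyph | Unicode codepoint | UTF-8 code units | UTF-8 bytes | UTF-16 code units | UTF-16LE bytes |
|-------|-------------------|------------------|-------------|-------------------|----------------|
| B | U+0042 | 1 | 0x42 | 1 | 0x42 0x00 |
| ÿ | U+00FF | 2 | 0xC3 0xBF | 1 | 0xFF 0x00 |
| ☃ | U+2603 | 3 | 0xE2 0x98 0x83 | 1 | 0x03 0x26 |
| 💩 | U+1F4A9 | 4 | 0xF0 0x9F 0x92 0xA9 | 2 | 0x3D 0xD8 0xA9 0xDC |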
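
The rows above can be reproduced with a short script, which is handy for checking other characters. This is a minimal sketch in Python 3 using only the built-in `str.encode`; the `SAMPLES` list and the `describe` helper are names made up for this illustration and are not part of the gist.

```python
# Characters from the table, written as escapes so the source file's own
# encoding cannot affect the result.
SAMPLES = ["B", "\u00ff", "\u2603", "\U0001f4a9"]


def describe(ch: str) -> dict:
    """Return code unit counts and byte dumps for a single character.

    Hypothetical helper for illustration only.
    """
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "utf8_units": len(utf8),  # a UTF-8 code unit is one byte
        "utf8_bytes": " ".join(f"0x{b:02X}" for b in utf8),
        "utf16_units": len(utf16) // 2,  # a UTF-16 code unit is two bytes
        "utf16le_bytes": " ".join(f"0x{b:02X}" for b in utf16),
    }


if __name__ == "__main__":
    for ch in SAMPLES:
        print(describe(ch))
```

The last sample is the interesting one: U+1F4A9 lies outside the Basic Multilingual Plane, so UTF-16 encodes it as a surrogate pair (two code units) even though it is a single code point.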