@tomthorogood
Last active December 16, 2015 00:29
Playing with Unicode characters for data-packing purposes:
#include <iostream>
#include <cstring>
using namespace std;
int main() {
const char* foo = "✓";
const char* foo2 = "\xe2\x9c\x93";
const char* bar = "é";
const char* baz = "a";
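// '✓' does not fit in one char; GCC treats it as a multi-character literal of type int.
cout << "✓ char literal: " << sizeof('✓') << " bytes" << endl;
// The string literal is a char array: 3 UTF-8 bytes plus the terminating NUL.
cout << "✓ str literal: " << sizeof("✓") << " bytes" << endl;
// Indexing a const char* always yields one char, so these all report 1.
cout << "✓: " << sizeof(foo[0]) << " bytes" << endl;
cout << "\\x notation (✓): " << sizeof(foo2[0]) << " bytes" << endl;
cout << "é: " << sizeof(bar[0]) << " bytes" << endl;
cout << "a: " << sizeof(baz[0]) << " bytes" << endl;
// U'\u2713' is the code point U+2713. The UTF-8 bytes \xe2\x9c\x93 are three
// characters and do not form a valid single-character literal.
char32_t check = U'\u2713';
long checkint = static_cast<long>(check);
char32_t check2 = static_cast<char32_t>(checkint);
// Cast before printing: operator<< for char32_t is deleted as of C++20.
cout << "✓ to char32_t output: " << static_cast<long>(check) << ", integral output: " << checkint << endl;
cout << "long to char32_t output: " << static_cast<long>(check2) << endl;
return 0;
/* OUTPUT:
✓ char literal: 4 bytes
✓ str literal: 4 bytes
✓: 1 bytes
\x notation (✓): 1 bytes
é: 1 bytes
a: 1 bytes
✓ to char32_t output: 10003, integral output: 10003
long to char32_t output: 10003
*/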
}
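
Since sizeof(foo[0]) is always 1 for a const char*, it says nothing about how many bytes a character occupies. For packing purposes, the encoded length and the code point are probably the more useful numbers. The following is a minimal sketch, assuming the input is valid, NUL-terminated UTF-8; utf8_sequence_length and first_code_point are hypothetical helper names, not part of the gist or of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with lead byte `lead`. Assumes `lead` is a valid lead byte.
inline int utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    return 4;                             // 11110xxx
}

// Hypothetical helper: decode the first UTF-8 sequence of `s` into a code
// point. Assumes valid input; performs no validation of continuation bytes.
inline std::uint32_t first_code_point(const char* s) {
    assert(s != nullptr);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    switch (utf8_sequence_length(p[0])) {
        case 1:
            return p[0];
        case 2:
            return ((p[0] & 0x1Fu) << 6) | (p[1] & 0x3Fu);
        case 3:
            return ((p[0] & 0x0Fu) << 12) | ((p[1] & 0x3Fu) << 6) | (p[2] & 0x3Fu);
        default:
            return ((p[0] & 0x07u) << 18) | ((p[1] & 0x3Fu) << 12)
                 | ((p[2] & 0x3Fu) << 6) | (p[3] & 0x3Fu);
    }
}
```

With these helpers, std::strlen("\xe2\x9c\x93") is 3 and first_code_point("\xe2\x9c\x93") is 0x2713 (10003), which fits in a char32_t or any 32-bit integer.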