A simple UTF8 encoder.
UTF8 encoder
function(
  a, // the text to encode
  b, // String.fromCharCode
  c, // placeholder: index into the text
  d, // placeholder: code unit of the current character
  e  // placeholder: the encoded result
){
  for (c=e=''; d=a.charCodeAt(c++); ) // get the UTF-16 code unit of the current character
    e += d < 128 ? // U+0000-U+007F
      b(d) : // 0xxxxxxx
      (d < 2048 ? // U+0080-U+07FF
        b(d >> 6 | 192) : // 110xxxxx
        b(d >> 12 | 224, d >> 6 & 63 | 128) // U+0800-U+FFFF 1110xxxx 10xxxxxx
      ) + b(d & 63 | 128); // 10xxxxxx
  return e;
}
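As a rough illustration of what the bit arithmetic above does, here is a longer, readable sketch of the same three cases. The name `utf8Encode` is made up for this example and is not part of the gist. It also iterates by `text.length`, so unlike the golfed loop it does not stop at a U+0000 character. Like the original, it works on individual UTF-16 code units, so characters outside the BMP (surrogate pairs) are not combined into 4-byte sequences.

```javascript
// Hypothetical readable equivalent of the golfed encoder (illustration only).
function utf8Encode(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i); // UTF-16 code unit, 0..0xFFFF
    if (code < 0x80) {
      // U+0000-U+007F: one byte, 0xxxxxxx
      out += String.fromCharCode(code);
    } else if (code < 0x800) {
      // U+0080-U+07FF: two bytes, 110xxxxx 10xxxxxx
      out += String.fromCharCode(0xC0 | (code >> 6), 0x80 | (code & 0x3F));
    } else {
      // U+0800-U+FFFF: three bytes, 1110xxxx 10xxxxxx 10xxxxxx
      out += String.fromCharCode(
        0xE0 | (code >> 12),
        0x80 | ((code >> 6) & 0x3F),
        0x80 | (code & 0x3F)
      );
    }
  }
  return out;
}

var twoBytes = utf8Encode('\u00e9');   // "é" becomes the bytes C3 A9
var threeBytes = utf8Encode('\u20ac'); // "€" becomes the bytes E2 82 AC
```

Each character of the returned string holds one byte value (0-255), which is the same "byte string" convention the golfed version uses.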
function(a,b,c,d,e){for(c=0,e="";d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224,d>>6&63|128))+b(d&63|128);return e}
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
{
  "name": "utf8encoder",
  "description": "A simple UTF8 encoder.",
  "keywords": [
    "utf8",
    "utf-8",
    "encode",
    "encoder",
    "unicode"
  ]
}
<!DOCTYPE html>
<title>UTF8 encode</title>
<div>Expected value: <b>Normal text</b></div>
<div>Actual value: <b id="ret"></b></div>
<script>
var myFunction = function(a,b,c,d,e){for(c=0,e="";d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224)+b(d>>6&63|128))+b(d&63|128);return e};
document.getElementById( "ret" ).innerHTML = myFunction('Normal text', String.fromCharCode);
</script>
Yes, especially String.fromCharCode. I'm wondering whether we can save 20 bytes so that String.fromCharCode fits inside the function...
It seems there is a more powerful approach:
http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html
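For comparison, one widely used built-in route to a UTF-8 byte string is `unescape(encodeURIComponent(text))`. The sketch below assumes that is the kind of approach meant here. The names `encodeUtf8Builtin`, `golfed`, and `sample` are made up for this example, and `unescape` is a legacy function that is still available in browsers and Node.

```javascript
// Built-in route: encodeURIComponent percent-encodes the UTF-8 bytes,
// and unescape turns each %XX back into one character with that byte value.
function encodeUtf8Builtin(text) {
  return unescape(encodeURIComponent(text));
}

// The golfed encoder from this gist, copied here for a side-by-side check.
var golfed = function(a,b,c,d,e){for(c=0,e="";d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224,d>>6&63|128))+b(d&63|128);return e};

var sample = 'h\u00e9llo \u20ac'; // "héllo €", BMP characters only
var same = encodeUtf8Builtin(sample) === golfed(sample, String.fromCharCode);
```

The two should agree for text made of BMP characters. They can differ for surrogate pairs, which the built-in route encodes as a single 4-byte sequence while the golfed version encodes each half separately.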
c=0,e="" => c=e=""
- perhaps this is ripe for some sort of eval? d>>6 could appear 3 times.
also, can you exploit the fact that String.fromCharCode(a,null) === String.fromCharCode(a)?
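A quick check of two of the tips above, as a sketch with made-up variable names. The first confirms that an empty string works as a loop counter. The second suggests the `null` equality does not hold strictly: `null` is coerced to 0, so the two-argument call appends a U+0000 character, which is usually invisible when written to a page.

```javascript
// Tip 1: c = e = '' still works as a counter.
var counter = '';
var firstIndex = counter++; // postfix ++ converts '' to the number 0 first
// firstIndex is 0 and counter is now 1

// Tip 2: null becomes code unit 0, so an extra U+0000 is appended.
var withNull = String.fromCharCode(65, null);
var withoutNull = String.fromCharCode(65);
var strictlyEqual = withNull === withoutNull; // false: lengths are 2 and 1
```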
Thanks for your tips, @jed! I think eval is a good idea, but it doesn't seem to save any bytes with d>>6.
Anyway, that fact is really awesome. I'm still thinking about how to exploit it...
this is really amazing! - pity that charCodeAt and String.fromCharCode are such byte hoggers.