Created February 25, 2013 08:53
Languages that lean heavily on multi-byte UTF-8 characters, like Japanese, manage to pack more meaning into 140 characters than English does. Here is one farcical attempt to keep them honest: truncate the tweet after 140 bytes, not 140 characters.
javascript:(function(d){
  var tbox, text, len = 140, ii = 0;
  tbox = d.getElementById('tweet-box-global');
  if (!tbox) return;
  if (!(tbox = tbox.firstChild)) return;
  if (!(text = tbox.innerHTML)) return;
  /* Spend the 140-byte budget one character at a time; stop at the end of the text. */
  while (ii < text.length) {
    /* encodeURI writes each non-ASCII UTF-8 byte as a %XX escape; collapsing each escape to one character makes .length a byte count. */
    len -= encodeURI(text[ii]).replace(/%[A-F\d]{2}/g, 'U').length;
    if (len < 0) break;
    ii++;
  }
  tbox.innerHTML = text.slice(0, ii);
})(document);void(0);
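To see why the loop above counts bytes rather than characters, it may help to isolate the `encodeURI` trick. The sketch below wraps it in a helper called `utf8ByteLength`, a hypothetical name that is not part of the bookmarklet. It assumes the input contains no unpaired surrogates, since `encodeURI` throws a `URIError` on those.

```javascript
// Hypothetical helper (not part of the bookmarklet): counts UTF-8 bytes
// with the same encodeURI trick the loop uses.
// Assumption: `s` contains no lone surrogates, which make encodeURI throw.
function utf8ByteLength(s) {
  // encodeURI emits one %XX escape per escaped byte; collapse each escape
  // to a single character so that .length equals the number of bytes.
  return encodeURI(s).replace(/%[A-F\d]{2}/g, 'U').length;
}

console.log(utf8ByteLength('abc'));    // 3
console.log(utf8ByteLength('日本語'));  // 9
```

Each of the three Japanese characters takes three bytes in UTF-8, so a 140-byte budget leaves room for at most 46 such characters.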
Supposedly there are a lot of subtleties around UTF-8. This `encodeURI` approach is probably not perfect, but it seems to work with Japanese characters (Japanese being the only language I know besides English).
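One such subtlety: characters outside the Basic Multilingual Plane, such as most emoji, occupy two UTF-16 code units in a JavaScript string, and calling `encodeURI` on only one half of such a pair throws a `URIError`. A minimal sketch of an alternative follows, assuming an ES2015 environment with `for...of` and `codePointAt`. The name `truncateToBytes` is hypothetical and is not what the gist uses; the byte width is computed from the code point instead of through `encodeURI`.

```javascript
// Hypothetical helper: truncates `text` to at most `maxBytes` UTF-8 bytes
// without splitting a character in the middle.
function truncateToBytes(text, maxBytes) {
  var used = 0, out = '';
  // for...of iterates by code point, so a surrogate pair stays together.
  for (var ch of text) {
    var cp = ch.codePointAt(0);
    // UTF-8 widths: 1 byte up to U+007F, 2 up to U+07FF, 3 up to U+FFFF, 4 beyond.
    var size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    if (used + size > maxBytes) break; // the next character would exceed the budget
    used += size;
    out += ch;
  }
  return out;
}

console.log(truncateToBytes('日本語', 7)); // 日本
```

Like the bookmarklet, this stops at the first character that does not fit instead of skipping ahead to look for a shorter one.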