@kiyoto
Created February 25, 2013 08:53
Languages whose characters take several bytes each in UTF-8, such as Japanese, pack more meaning into 140 characters than English does. Here is one farcical attempt to keep them honest: a bookmarklet that truncates the tweet after 140 bytes, not characters.
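javascript:(function(d){
/* Truncate the tweet box contents to at most 140 UTF-8 bytes. */
var tbox, text, ch, len = 140, ii = 0;
tbox = d.getElementById('tweet-box-global');
if (!tbox) return;
if (!(tbox = tbox.firstChild)) return;
if (!(text = tbox.innerHTML)) return;
/* Stop at the end of the text instead of reading past it. */
while (ii < text.length) {
/* Keep a surrogate pair together: encodeURI throws on a lone half. */
ch = /[\uD800-\uDBFF]/.test(text[ii]) ? text.slice(ii, ii + 2) : text[ii];
/* Each %XX escape is one UTF-8 byte, so collapse it to one character and count. */
len -= encodeURI(ch).replace(/%[A-F\d]{2}/g, 'U').length;
if (len < 0) break;
ii += ch.length;
}
tbox.innerHTML = text.slice(0, ii);
})(document);void(0);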
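
The same byte-budget loop can be pulled out of the bookmarklet as a plain function, which makes it easier to try outside Twitter's page. This is only a minimal sketch and not part of the gist: the names `utf8Bytes` and `truncateToBytes` are made up for illustration, and it assumes an environment with `Array.from`, which it uses to walk the string by code point so that a surrogate pair is never split.

```javascript
// Hypothetical helpers, not part of the gist.

// Count the UTF-8 bytes of one character using the encodeURI trick:
// every %XX escape stands for one byte, and every character that
// encodeURI leaves unescaped is a single-byte ASCII character.
function utf8Bytes(ch) {
  return encodeURI(ch).replace(/%[A-F\d]{2}/g, 'U').length;
}

// Return the longest prefix of `text` that fits in `maxBytes` UTF-8 bytes.
function truncateToBytes(text, maxBytes) {
  var out = '';
  var budget = maxBytes;
  // Array.from splits by code point, so surrogate pairs stay together.
  var chars = Array.from(text);
  for (var i = 0; i < chars.length; i++) {
    budget -= utf8Bytes(chars[i]);
    if (budget < 0) break;
    out += chars[i];
  }
  return out;
}

console.log(truncateToBytes('日本語のツイート', 10)); // 日本語
```

Each of these Japanese characters takes three bytes in UTF-8, so a 10-byte budget keeps three of them and drops the rest.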

kiyoto commented Feb 25, 2013

Supposedly there are a lot of subtleties around UTF encodings. This "encodeURI" approach is probably not perfect, but it seems to work with Japanese characters (Japanese being the only language that I know besides English).
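
One way to probe where the "encodeURI" approach holds is to compare it against a direct UTF-8 encoding. The sketch below is not part of the gist and rests on some assumptions: `bytesViaEncodeURI` and `bytesViaTextEncoder` are hypothetical names, and `TextEncoder` is assumed to be available as a global, as it is in current browsers and Node.js.

```javascript
// Hypothetical helper mirroring the gist's byte count, applied to a whole string.
function bytesViaEncodeURI(text) {
  return encodeURI(text).replace(/%[A-F\d]{2}/g, 'U').length;
}

// Reference count: encode to UTF-8 and measure the resulting byte array.
function bytesViaTextEncoder(text) {
  return new TextEncoder().encode(text).length;
}

// Print both counts side by side for a few sample strings.
var samples = ['hello', 'こんにちは', 'a b%c', '😀'];
samples.forEach(function (s) {
  console.log(JSON.stringify(s), bytesViaEncodeURI(s), bytesViaTextEncoder(s));
});

// A lone surrogate is one place where the two diverge: encodeURI throws
// a URIError, while TextEncoder substitutes U+FFFD (3 bytes).
try {
  bytesViaEncodeURI('\uD83D');
} catch (e) {
  console.log(e.name); // URIError
}
```

Indexing a string one UTF-16 unit at a time can produce exactly such a lone surrogate for characters outside the Basic Multilingual Plane, such as emoji, so any per-character loop built on this trick needs to keep surrogate pairs together.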
