Created September 26, 2011 19:50
Escape all characters in a string using both Unicode and hexadecimal escape sequences
// Ever needed to escape '\n' as '\\n'? This function does that for any character,
// using hex and/or Unicode escape sequences (whichever are shortest).
// Demo: http://mothereff.in/js-escapes
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function(character) {
    var escape = character.charCodeAt().toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}
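To see how the choice between `\x` and `\u` plays out, here is a small usage sketch. The function is repeated so the snippet runs on its own; the sample inputs are illustrative picks, not taken from the gist.

```javascript
// Same function as above, repeated so this snippet is self-contained.
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function(character) {
    var escape = character.charCodeAt().toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}

// Code units up to 0xFF fit in two hex digits, so they get the short \x form.
console.log(unicodeEscape('\n'));           // \x0a
console.log(unicodeEscape('A'));            // \x41
// Anything larger needs the four-digit \u form.
console.log(unicodeEscape('\u20ac'));       // \u20ac
// An astral symbol is two UTF-16 code units, so it yields two \u escapes.
console.log(unicodeEscape('\ud83d\udca9')); // \ud83d\udca9
```

Because the regular expression has no `u` flag, `[\s\S]` matches one UTF-16 code unit at a time, which is why a surrogate pair comes out as two separate escapes.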
Check out jsesc, which solves this problem in a more robust manner.
@mathiasbynens It looks great! I did try to use it, but unfortunately I'm not up to date with all the browserify/bundling stuff and just need a vanilla JS script (e.g. no use of `Buffer`) to include in a module import, and I wasn't able to work out how to do that with `jsesc` (though I admit I only poked around for a few minutes before deciding to write the function above). Also, out of pure curiosity, I'd be interested in cases where the above function fails; I couldn't find any failing cases in my tests.
@josephrocca See https://github.com/mathiasbynens/jsesc#support. TL;DR: use v1.3.0.
@mervick @rafaelvanat If I use that function like this:
Then I get:
The following function fixes this by matching all non-ASCII characters after splitting the string in a "Unicode-safe" way (using `[...str]`). It then splits each Unicode character up into its UTF-16 code units and gets the escape code for each (rather than just grabbing the first char code of each Unicode character):
This gives the correct result:
This seems to work fine in all my tests so far, but if I find any bugs I'll add fixes in this gist. Performance doesn't matter for my use-case, so I haven't benchmarked or optimised it at all.
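A minimal sketch of the approach described in this comment, for illustration only: it is not the commenter's original code, and the name `escapeNonAscii` is hypothetical. It assumes that ASCII characters should pass through unchanged and that every other character should be written as one `\uXXXX` escape per UTF-16 code unit.

```javascript
// Hypothetical helper illustrating the described approach; not the original code.
function escapeNonAscii(str) {
  // Spreading a string iterates by code point, so surrogate pairs stay together.
  return [...str]
    .map(function (ch) {
      // Leave ASCII characters untouched.
      if (ch.codePointAt(0) <= 0x7f) return ch;
      // Escape every UTF-16 code unit of the character, not just the first one.
      var out = '';
      for (var i = 0; i < ch.length; i++) {
        out += '\\u' + ('0000' + ch.charCodeAt(i).toString(16)).slice(-4);
      }
      return out;
    })
    .join('');
}

console.log(escapeNonAscii('a\u20ac\ud83d\udca9')); // a\u20ac\ud83d\udca9
```

Looping over `ch.length` is what handles astral symbols: a character outside the Basic Multilingual Plane has length 2, so it produces both halves of its surrogate pair.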