A naive regex to match supplementary characters in the range U+10000–U+EFFFF produces nonsensical results:

    jshell> Pattern.matches("[\u10000-\uEFFFF]+", "abc")
        ==> true

Likewise the version using regex escapes rather than character escapes:

    jshell> Pattern.matches("[\\u10000-\\uEFFFF]+", "abc")
        ==> true

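This happens because both the Java compiler (for `\u`) and the regex engine (for `\\u`) read exactly four hex digits as the character code and treat the fifth digit as a separate literal character. The class therefore contains U+1000, the range `0`–U+EFFF (which covers `a`, `b`, and `c`), and `F`: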

    jshell> "\u10000".toCharArray()
        ==> char[2] { 'က', '0' } // '\u1000', '0'

    jshell> "\uEFFFF".toCharArray()
        ==> char[2] { '', 'F' } // '\uEFFF', 'F'

According to [_Supplementary Characters in the Java Platform_](http://www.oracle.com/us/technologies/java/supplementary-142654.html), the proper way to escape a supplementary character is as a surrogate pair: two consecutive `\u` escapes, one per UTF-16 code unit.

In UTF-16, [U+10000](http://www.fileformat.info/info/unicode/char/10000/index.htm) is 0xD800 0xDC00, and [U+EFFFF](http://www.fileformat.info/info/unicode/char/EFFFF/index.htm) is 0xDB7F 0xDFFF. This gives us the regex `"[\uD800\uDC00-\uDB7F\uDFFF]"`:
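The surrogate pairs above need not be looked up by hand. The following is a minimal sketch that derives them with the standard `Character.toChars` method; the class name `SurrogatePairs` and the helper `toEscapes` are illustrative names, not part of any library.

```java
public class SurrogatePairs {
    // Hypothetical helper: formats a code point as one escape per UTF-16 code unit.
    // Character.toChars returns one char for BMP code points and a
    // high/low surrogate pair for supplementary code points.
    static String toEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toEscapes(0x10000)); // code units D800 DC00
        System.out.println(toEscapes(0xEFFFF)); // code units DB7F DFFF
    }
}
```

Running `main` should print the two escape sequences used in the regex, `\uD800\uDC00` and `\uDB7F\uDFFF`.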

    jshell> Pattern.matches("[\uD800\uDC00-\uDB7F\uDFFF]", "1")
        ==> false

    jshell> Pattern.matches("[\uD800\uDC00-\uDB7F\uDFFF]", "\uD9BF\uDFFF") // U+7FFFF
        ==> true
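As a cross-check, `java.util.regex.Pattern` (Java 7 and later) also accepts the `\x{h...h}` notation for a full code point, which avoids writing surrogates at all. The sketch below assumes that notation is available; `SupplementaryRange` and `inRange` are illustrative names.

```java
import java.util.regex.Pattern;

public class SupplementaryRange {
    // Same range as the surrogate-pair regex, written with code point escapes.
    // The doubled backslashes pass the escape through to the regex engine.
    static final Pattern RANGE = Pattern.compile("[\\x{10000}-\\x{EFFFF}]+");

    // Hypothetical helper: true if every code point of s lies in U+10000..U+EFFFF.
    static boolean inRange(String s) {
        return RANGE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String u7ffff = new String(Character.toChars(0x7FFFF));
        System.out.println(inRange("abc"));  // BMP letters only
        System.out.println(inRange(u7ffff)); // a supplementary character inside the range
    }
}
```

Unlike the naive `\u10000` version, this pattern should reject `"abc"` and accept U+7FFFF, matching the behaviour of the surrogate-pair regex.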