Created
March 23, 2014 02:12
-
-
Save abdullahbutt/9717602 to your computer and use it in GitHub Desktop.
regular expressions character classes and other special escapes
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Sequence Note Description | |
| [...] [1] Match a character according to the rules of the | |
| bracketed character class defined by the "...". | |
| Example: [a-z] matches "a" or "b" or "c" ... or "z" | |
| [[:...:]] [2] Match a character according to the rules of the POSIX | |
| character class "..." within the outer bracketed | |
| character class. Example: [[:upper:]] matches any | |
| uppercase character. | |
| (?[...]) [8] Extended bracketed character class | |
| \w [3] Match a "word" character (alphanumeric plus "_", plus | |
| other connector punctuation chars plus Unicode | |
| marks) | |
| \W [3] Match a non-"word" character | |
| \s [3] Match a whitespace character | |
| \S [3] Match a non-whitespace character | |
| \d [3] Match a decimal digit character | |
| \D [3] Match a non-digit character | |
| \pP [3] Match P, named property. Use \p{Prop} for longer names | |
| \PP [3] Match non-P | |
| \X [4] Match Unicode "eXtended grapheme cluster" | |
| \C Match a single C-language char (octet) even if that is | |
| part of a larger UTF-8 character. Thus it breaks up | |
| characters into their UTF-8 bytes, so you may end up | |
| with malformed pieces of UTF-8. Unsupported in | |
| lookbehind. | |
| \1 [5] Backreference to a specific capture group or buffer. | |
| '1' may actually be any positive integer. | |
| \g1 [5] Backreference to a specific or previous group, | |
| \g{-1} [5] The number may be negative indicating a relative | |
| previous group and may optionally be wrapped in | |
| curly brackets for safer parsing. | |
| \g{name} [5] Named backreference | |
| \k<name> [5] Named backreference | |
| \K [6] Keep the stuff left of the \K, don't include it in $& | |
| \N [7] Any character but \n. Not affected by /s modifier | |
| \v [3] Vertical whitespace | |
| \V [3] Not vertical whitespace | |
| \h [3] Horizontal whitespace | |
| \H [3] Not horizontal whitespace | |
| \R [4] Linebreak |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment