Created
September 18, 2017 11:14
POSIX RegEx
# Info 'man 7 regex'
POSIX Basic Regular Expressions (BRE)
+ - ordinary char
? - ordinary char
| - ordinary char
\{\} - special chars
{} - regular chars
\(\) - special
() - regular
Obsolete ("basic") regular expressions differ in several respects. '|', '+', and '?' are ordinary characters and there is no equivalent for their functionality. The delimiters for bounds are "\{"
and "\}", with '{' and '}' by themselves ordinary characters. The parentheses for nested subexpressions are "\(" and "\)", with '(' and ')' by themselves ordinary characters. '^' is an ordinary
character except at the beginning of the RE or(!) the beginning of a parenthesized subexpression, '$' is an ordinary character except at the end of the RE or(!) the end of a parenthesized
subexpression, and '*' is an ordinary character if it appears at the beginning of the RE or the beginning of a parenthesized subexpression (after a possible leading '^').
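The BRE rules above can be checked against the C library's regcomp(3). The following is a minimal sketch, assuming a POSIX system with <regex.h>; the helper name re_match is hypothetical and not part of any standard API. Passing 0 as cflags compiles the pattern as a BRE, passing REG_EXTENDED compiles it as an ERE.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not a POSIX function): returns 1 if `text` contains
 * a match for `pattern`, 0 if it does not, and -1 if the pattern cannot be
 * compiled. cflags is 0 for BRE or REG_EXTENDED for ERE. */
static int re_match(const char *pattern, const char *text, int cflags)
{
    regex_t re;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, re_match("a+b", "a+b", 0) returns 1 because '+' is an ordinary character in a BRE, while re_match("a+b", "aab", REG_EXTENDED) returns 1 because '+' is a quantifier in an ERE. Likewise the BRE "a\{2\}" (written "a\\{2\\}" in C source) matches "aa", whereas the BRE "a{2}" only matches the literal text "a{2}".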
POSIX Extended Regular Expressions (ERE)
Special characters
* - 0 or more times
+ - 1 or more times
? - 0 or 1 time
. - any char
| - alternation (match either side)
{1} - exactly one time
{1,2} - one or two times
{1,} - at least one time
(grouping) - grouping expressions
[] - character class
A bracket expression is a list of characters enclosed in "[]". It normally matches any single character from the list (but see below). If the list begins with '^', it matches any single character
(but see below) not from the rest of the list. If two characters in the list are separated by '-', this is shorthand for the full range of characters between those two (inclusive) in the collating
sequence, for example, "[0-9]" in ASCII matches any decimal digit. It is illegal(!) for two ranges to share an endpoint, for example, "a-c-e". Ranges are very collating-sequence-dependent, and
portable programs should avoid relying on them.
\ - escape character
^ - start of the line
$ - end of the line
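To see what the ERE special characters listed above actually match, it helps to look at the offsets of the match rather than a plain yes/no. This is a minimal sketch, assuming a POSIX <regex.h> and the default "C" locale; the helper name ere_find is hypothetical.

```c
#include <regex.h>

/* Hypothetical helper (not a POSIX function): finds the first ERE match of
 * `pattern` in `text`. On a match it stores the byte offsets in *start and
 * *end (end is exclusive) and returns 1. Returns 0 when nothing matches and
 * -1 when the pattern cannot be compiled. */
static int ere_find(const char *pattern, const char *text, int *start, int *end)
{
    regex_t re;
    regmatch_t m[1];   /* m[0] describes the whole match */

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    *start = (int)m[0].rm_so;
    *end = (int)m[0].rm_eo;
    return 1;
}
```

For example, ere_find("ab+", "xabbbc", &s, &e) reports offsets 1 and 5 (the text "abbb"), ere_find("cat|dog", "hotdog", &s, &e) reports 3 and 6, and ere_find("[^0-9]+", "42abc7", &s, &e) reports 2 and 5. The anchored pattern "^b" finds nothing in "ab", while "b$" matches at offsets 1 and 2.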
Within a bracket expression, the name of a character class enclosed in "[:" and ":]" stands for the list of all characters belonging to that class. Standard character class names are:
alnum digit punct
alpha graph space
blank lower upper
cntrl print xdigit
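Note that a class name is only valid inside a bracket expression, so the digit class is written "[[:digit:]]", not "[:digit:]". The sketch below, assuming a POSIX <regex.h> and the default "C" locale, builds such a pattern from a class name; the helper name all_in_class is hypothetical.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a POSIX function): returns 1 if `text` is
 * non-empty and consists only of characters in the named class (for
 * example "digit" or "xdigit"), 0 if it does not, and -1 if the pattern
 * cannot be built or compiled. */
static int all_in_class(const char *class_name, const char *text)
{
    char pattern[64];
    regex_t re;

    /* Builds e.g. "^[[:digit:]]+$": outer [] is the bracket expression,
     * inner [: :] names the class. */
    int n = snprintf(pattern, sizeof pattern, "^[[:%s:]]+$", class_name);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return -1;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, all_in_class("digit", "2017") returns 1, all_in_class("digit", "20a7") returns 0, and all_in_class("xdigit", "7fA0") returns 1.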