@aduzsardi
Created September 18, 2017 11:14
POSIX RegEx
# Info 'man 7 regex'
POSIX Basic Regular Expressions (BRE)
+ - ordinary char
? - ordinary char
| - ordinary char
\{ \} - special (bound delimiters)
{ } - ordinary chars
\( \) - special (subexpression delimiters)
( ) - ordinary chars
Obsolete ("basic") regular expressions differ in several respects. '|', '+', and '?' are ordinary characters and there is no equivalent for their functionality. The delimiters for bounds are "\{"
and "\}", with '{' and '}' by themselves ordinary characters. The parentheses for nested subexpressions are "\(" and "\)", with '(' and ')' by themselves ordinary characters. '^' is an ordinary
character except at the beginning of the RE or(!) the beginning of a parenthesized subexpression, '$' is an ordinary character except at the end of the RE or(!) the end of a parenthesized
subexpression, and '*' is an ordinary character if it appears at the beginning of the RE or the beginning of a parenthesized subexpression (after a possible leading '^').
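The BRE rules above can be checked with the POSIX regcomp()/regexec() interface from <regex.h>. The following is a minimal sketch, not part of the man page; the helper name posix_match and its return convention are assumptions made for illustration. Passing 0 as cflags compiles the pattern as a BRE, and passing REG_EXTENDED compiles it as an ERE.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the man page): returns 1 if `pattern`
 * matches somewhere in `text`, 0 if it does not, and -1 if the pattern
 * fails to compile. Pass 0 for BRE or REG_EXTENDED for ERE. */
int posix_match(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this helper, posix_match("^a+$", "a+", 0) should return 1 because '+' is an ordinary character in a BRE, while posix_match("^a+$", "aaa", REG_EXTENDED) should return 1 because '+' is a repetition operator in an ERE. Likewise, the BRE "^\(ab\)\{2\}$" and the ERE "^(ab){2}$" should both match "abab" (remember to double each backslash inside a C string literal).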
POSIX Extended Regular Expressions (ERE)
Special characters
* - 0 or more times
+ - 1 or more times
? - 0 or 1 time
. - any single char
| - alternation (match either side)
{1} - exactly one time
{1,2} - one or two times
{1,} - at least one time
(grouping) - grouping expressions
[] - bracket expression (list of characters)
A bracket expression is a list of characters enclosed in "[]". It normally matches any single character from the list (but see below). If the list begins with '^', it matches any single character
(but see below) not from the rest of the list. If two characters in the list are separated by '-', this is shorthand for the full range of characters between those two (inclusive) in the collating
sequence, for example, "[0-9]" in ASCII matches any decimal digit. It is illegal(!) for two ranges to share an endpoint, for example, "a-c-e". Ranges are very collating-sequence-dependent, and
portable programs should avoid relying on them.
\ - escape character
^ - start of the line
$ - end of the line
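As a rough illustration of the ERE special characters listed above, the sketch below compiles a few patterns with REG_EXTENDED. The names ere_match and ere_demo_ok, and all sample strings, are made up for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if the ERE `pattern` matches somewhere in `text`,
 * 0 if it does not, -1 if the pattern does not compile. */
int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* Illustrative checks only; returns 1 when every line behaves as commented. */
int ere_demo_ok(void)
{
    return ere_match("^ab*c$", "ac") == 1          /* *     : zero 'b' is fine       */
        && ere_match("^ab+c$", "ac") == 0          /* +     : needs at least one 'b' */
        && ere_match("^colou?r$", "color") == 1    /* ?     : 'u' is optional        */
        && ere_match("^a.c$", "abc") == 1          /* .     : any single char        */
        && ere_match("^(cat|dog)s?$", "dogs") == 1 /* ( | ) : grouped alternation    */
        && ere_match("^a{1,2}$", "aaa") == 0       /* {1,2} : at most two 'a'        */
        && ere_match("^a{2,}$", "aaaa") == 1       /* {2,}  : two or more 'a'        */
        && ere_match("^1\\+1$", "1+1") == 1;       /* \+    : escaped, literal '+'   */
}
```

The '^' and '$' anchors are used throughout so that each pattern has to cover the whole string; without them regexec() reports a match as soon as any substring matches.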
Within a bracket expression, the name of a character class enclosed in "[:" and ":]" stands for the list of all characters belonging to that class. Standard character class names are:
alnum digit punct
alpha graph space
blank lower upper
cntrl print xdigit
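Bracket expressions and the class names above can be tried out in the same way. The sketch below, again only an illustration, uses regmatch_t offsets to copy out the first (leftmost) match; first_match is a hypothetical helper name, and the results assume the default "C" locale that a C program starts in.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first substring of `text` matched by the
 * ERE `pattern` into `out` (at most outsz - 1 bytes, always terminated).
 * Returns 1 on a match, 0 on no match, -1 on a bad pattern or empty buffer. */
int first_match(const char *pattern, const char *text, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m;
    size_t len;
    int rc;

    if (outsz == 0)
        return -1;
    out[0] = '\0';
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 1, &m, 0);   /* m receives the overall match */
    regfree(&re);
    if (rc != 0)
        return 0;

    len = (size_t)(m.rm_eo - m.rm_so);   /* byte offsets into `text` */
    if (len >= outsz)
        len = outsz - 1;                 /* truncate to fit the buffer */
    memcpy(out, text + m.rm_so, len);
    out[len] = '\0';
    return 1;
}
```

For example, first_match("[[:digit:]]+", "September 18, 2017", buf, sizeof buf) should leave "18" in buf, "[^[:space:]]+" applied to "  two words" should yield "two", and "[^abc]+" applied to "abcxyzabc" should yield "xyz". Named classes such as [[:digit:]] are used here instead of ranges like [0-9] because, as noted above, ranges depend on the collating sequence.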