@AlexAtkinson
Last active November 25, 2024 15:25
REGEX Cheat

REGEX

Reference

Character Sets/Ranges   []                  : Match any one character listed within. Must contain an expression.
Quantifiers             {}                  : Quantity or range of the preceding expression. IE: {3}, {2,200}
Groups                  ()                  : Group expressions. Subgroups supported.
Non-Capturing Group     (?:foo)             : Does not "remember" matches. Lower overhead.
Kleene Star             *                   : May occur 0 or more times.
Kleene Plus             +                   : May occur 1 or more times.
Wildcard                .                   : Matches exactly 1 character (by default, any except a newline).
Optional                ?                   : Makes the preceding expression optional (0 or 1 times).
Alternation             |                   : Match two or more subexpressions. IE: (foo|bar)
Negate                  ^                   : NOT, when placed first inside a character set. IE: [^aAeEiIoOuU]
Literal                 string              : Exact string match.
Anchors                 ^$                  : Start and end of string.
Shorthand Class         \d                  : [0-9]
Shorthand Class         \D                  : [^0-9]
Shorthand Class         \w                  : [a-zA-Z0-9_]
Shorthand Class         \W                  : [^a-zA-Z0-9_]
Boundary                \b                  : Word boundary (between \w and \W, or at the start/end of the string).
Newlines                (\r\n|\r|\n)        : Newlines
Lookbehind              (?<!x)y             : y not preceded by x (Note: lookbehind and lookahead are collectively 'lookarounds')
                        (?<=x)y             : y is preceded by x (Tip: Use with negate.)
Lookahead               x(?!y)              : x not followed by y
                        x(?=y)              : x is followed by y
                        x(?!y)z             : BAD - lookaheads are zero-width, so y and z are tested at the same position.
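
The lookaround rows can be checked with a short sketch. Python's re module is used here as an assumption (the reference itself is engine-agnostic), and the sample text and variable names are made up for illustration.

```python
import re

text = "price: $42, cost: 17 EUR, id 99"

# Positive lookbehind: digits that are preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", text)      # ['42']

# Negative lookahead: whole numbers that are not followed by " EUR".
# The \b anchors stop the engine from settling for a shorter digit run.
not_eur = re.findall(r"\b\d+\b(?! EUR)", text)      # ['42', '99']

# Lookaheads consume nothing, so in x(?!y)z both y and z are tested
# against the same next character.
bad = re.search(r"a(?!b)b", "ab")   # None: "not b" and "b" at one position
ok = re.search(r"a(?!b)c", "ac")    # matches "ac"

print(after_dollar, not_eur, bad, ok.group(0))
```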

Tips

Grep

GNU grep has three regex flavors built in: Basic [-G] (the default), Extended [-E], and Perl-compatible [-P] (PCRE). Note that advanced regex features such as lookarounds require PCRE. Don't forget [-o], which prints only the matching part of each line. IE:

user@host:~# echo "foo, 1.2.3,v4.5.6 " | grep -oP "(?<=[\b\s,;])([vV]?([0-9]|[1-9][0-9]*)\.([0-9]|[1-9][0-9]*)\.([1-9]+|[1-9][0-9]*))((?=[\b\s,;]|\.(?=\s)))"
1.2.3
v4.5.6
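
For readers without a PCRE-enabled grep, the same "only matching" behaviour can be approximated in Python. This is a minimal sketch, not a grep replacement: the helper name grep_o is hypothetical, and the pattern is a simplified stand-in for the semver regex above.

```python
import re

def grep_o(pattern, lines):
    """Return every non-overlapping match, line by line, roughly like `grep -oP`."""
    compiled = re.compile(pattern)
    found = []
    for line in lines:
        # group(0) is the whole match, which is what -o prints.
        found.extend(m.group(0) for m in compiled.finditer(line))
    return found

# Simplified pattern (assumption): optional v, three dot-separated numbers,
# with a separator required on both sides.
matches = grep_o(r"(?<=[\s,;])[vV]?\d+\.\d+\.\d+(?=[\s,;])", ["foo, 1.2.3,v4.5.6 "])
print(matches)  # ['1.2.3', 'v4.5.6']
```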

Boundary Defense

\b has its uses, but on many occasions it is desirable to be defensive against partial matches. For example, a semver regex that matches 1.2.3 may also match parts of 1.2.3.1.2.3. Use lookarounds to defend against such scenarios. The following, taken from the semver examples, only matches when there is a preceding space, comma, or semicolon AND a trailing space, comma, or semicolon, or a period followed by a space. Note that inside a character set \b means backspace, not a word boundary, so it adds no boundary check there. A match at the very start or end of the string is also rejected, because both lookarounds require an actual character.

(?<=[\b\s,;])<YOUR MATCH REGEX>((?=[\b\s,;]|\.(?=\s)))
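
The effect of the defense can be seen by running the same text through an undefended and a defended pattern. This is a sketch under assumptions: it uses Python's re module, omits \b from the sets (inside a set it means backspace), escapes the period in the lookahead so that only a literal period followed by whitespace is accepted, and the sample text is invented.

```python
import re

# Semver pattern from the Versioning section, with groups made non-capturing.
SEMVER = r"[vV]?(?:[0-9]|[1-9][0-9]*)\.(?:[0-9]|[1-9][0-9]*)\.(?:[1-9]+|[1-9][0-9]*)"

# Boundary defense: a separator must precede the match, and a separator
# (or a period followed by whitespace) must follow it.
DEFENDED = r"(?<=[\s,;])(?:" + SEMVER + r")(?=[\s,;]|\.(?=\s))"

text = "see 1.2.3.1.2.3 and 4.5.6, then 7.8.9. "

plain = [m.group(0) for m in re.finditer(SEMVER, text)]
defended = [m.group(0) for m in re.finditer(DEFENDED, text)]

print(plain)     # ['1.2.3', '1.2.3', '4.5.6', '7.8.9']
print(defended)  # ['4.5.6', '7.8.9']
```

The undefended pattern reports two partial hits inside 1.2.3.1.2.3, while the defended one skips that token entirely.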

Password Complexity

RDS Password

(?=.*[a-z]){1,}(?=.*[A-Z]){1,}(?=.*[0-9]){1,}(?=.*[!#$%^&*()<>;?]){1,}[a-zA-Z0-9!#$%^&*()<>;?]{18,64}

Try it at DebuggexBeta

With Non Capturing Groups

(?:(?=.*[a-z]){1,})(?:(?=.*[A-Z]){1,})(?:(?=.*[0-9]){1,})(?:(?=.*[!#$%^&*()<>;?]){1,})(?:[a-zA-Z0-9!#$%^&*()<>;?]{18,64})

Try it at DebuggexBeta
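
A sketch of how the password pattern might be applied as a validator in Python. Two liberties are taken and should be read as assumptions: the {1,} quantifiers on the lookaheads are dropped (a lookahead either holds or it does not, so repeating it adds nothing), and re.fullmatch stands in for explicit ^...$ anchors. The function name is hypothetical.

```python
import re

PASSWORD = re.compile(
    r"(?=.*[a-z])"             # at least one lowercase letter
    r"(?=.*[A-Z])"             # at least one uppercase letter
    r"(?=.*[0-9])"             # at least one digit
    r"(?=.*[!#$%^&*()<>;?])"   # at least one listed special character
    r"[a-zA-Z0-9!#$%^&*()<>;?]{18,64}"  # allowed characters and length
)

def is_acceptable(password):
    """True when the whole string satisfies every rule in PASSWORD."""
    return PASSWORD.fullmatch(password) is not None

print(is_acceptable("CorrectHorse9!BatteryStaple"))   # True
print(is_acceptable("correcthorse9!batterystaple"))   # False: no uppercase
print(is_acceptable("Abc1!"))                         # False: too short
```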

IP

IPv4/6 Private IPs

(((127\.\d{1,3})|(192\.168)|(10\.\d{1,3})|(172\.1[6-9])|(172\.2[0-9])|(172\.3[0-1]))\.\d{1,3}\.\d{1,3})|(([fF][cCdD][0-9a-fA-F]{2}:([0-9a-fA-F]{4}:?){7})|::1)

Try it at DebuggexBeta.
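
A minimal Python sketch of the IPv4 half of this pattern, rewritten with non-capturing groups and no ^ anchors inside the alternation. The function name is hypothetical, and re.fullmatch is used so that the whole string must be an address.

```python
import re

PRIVATE_V4 = re.compile(
    r"(?:127\.\d{1,3}|192\.168|10\.\d{1,3}|172\.1[6-9]|172\.2[0-9]|172\.3[0-1])"
    r"\.\d{1,3}\.\d{1,3}"
)

def looks_private_v4(address):
    """True when the string has the shape of a loopback or private IPv4 address."""
    return PRIVATE_V4.fullmatch(address) is not None

for candidate in ("10.0.0.1", "192.168.1.20", "172.31.255.1", "172.32.0.1", "8.8.8.8"):
    print(candidate, looks_private_v4(candidate))

# The pattern checks shape only: octets are not range-checked,
# so "10.999.0.1" is accepted as well.
print(looks_private_v4("10.999.0.1"))
```

Where strict validation matters, Python's standard ipaddress module is an alternative to a regex.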

CIDR

Amazon's regex (backslashes are doubled, as they would be when the pattern is embedded in a JSON string):

(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})
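
A sketch of the CIDR pattern in Python, written with single backslashes (as in a raw string) and \d in all four octet groups. The pattern only checks shape, so the range checks below are an addition for illustration, and parse_cidr is a hypothetical helper name.

```python
import re

CIDR = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})")

def parse_cidr(text):
    """Return ((a, b, c, d), prefix) or None when the text is not a valid CIDR."""
    match = CIDR.fullmatch(text)
    if match is None:
        return None
    *octets, prefix = (int(group) for group in match.groups())
    # The regex accepts 999.1.1.1/99, so numeric ranges are checked here.
    if any(octet > 255 for octet in octets) or prefix > 32:
        return None
    return tuple(octets), prefix

print(parse_cidr("10.0.0.0/16"))   # ((10, 0, 0, 0), 16)
print(parse_cidr("300.0.0.0/16"))  # None
print(parse_cidr("10.0.0.0"))      # None
```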

Domain

(https?://)?([a-zA-Z0-9]+[.])*[a-zA-Z0-9]+[.][a-zA-Z]{2,63}(.(?=\.))?

Try it at DebuggexBeta.
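
A sketch of pulling a host name out of free text with the core of this pattern. Assumptions: Python's re module, a capture group added around the host part, and the trailing optional group left out because its purpose is not clear from the pattern alone. The sample sentence is invented.

```python
import re

DOMAIN = re.compile(
    r"(https?://)?"                                          # optional scheme
    r"((?:[a-zA-Z0-9]+[.])*[a-zA-Z0-9]+[.][a-zA-Z]{2,63})"   # labels + TLD
)

match = DOMAIN.search("Docs live at https://docs.example.com/start")
scheme, host = match.group(1), match.group(2)
print(scheme, host)  # https:// docs.example.com

# A bare word without a dot-separated TLD is not treated as a domain.
print(DOMAIN.fullmatch("localhost"))  # None
```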

Versioning

Semver - Trunk (Everything is potentially releasable)

[vV]?([0-9]|[1-9][0-9]*)\.([0-9]|[1-9][0-9]*)\.([1-9]+|[1-9][0-9]*)

Try it at DebuggexBeta
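
The three capture groups make it straightforward to turn a version string into a comparable tuple. A minimal Python sketch, with parse_semver as a hypothetical helper name:

```python
import re

SEMVER = re.compile(r"[vV]?([0-9]|[1-9][0-9]*)\.([0-9]|[1-9][0-9]*)\.([1-9]+|[1-9][0-9]*)")

def parse_semver(version):
    """Return (major, minor, patch) as integers, or None if the string does not match."""
    match = SEMVER.fullmatch(version)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())

print(parse_semver("v1.20.3"))   # (1, 20, 3)
print(parse_semver("01.2.3"))    # None: leading zero in major
print(parse_semver("1.2"))       # None: patch is required
# As written, the patch group cannot match a bare 0.
print(parse_semver("1.2.0"))     # None

# Tuples compare numerically, unlike the raw strings.
print(parse_semver("1.10.1") > parse_semver("1.9.9"))  # True
```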

With boundary defense

(?<=[\b\s,;])([vV]?([0-9]|[1-9][0-9]*)\.([0-9]|[1-9][0-9]*)\.([1-9]+|[1-9][0-9]*))((?=[\b\s,;]|\.(?=\s)))

Try it at DebuggexBeta

Maven Madness

(?<=[\b\s,;])([vV]?([0-9]|[1-9][0-9]*)\.([0-9]|[1-9][0-9]*)(\.([1-9]+|[1-9][0-9]*))?)(([-\.][\w]+)+)?((?=[\b\s,;]|\.(?=\s)))

Try it at DebuggexBeta
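
A sketch of the Maven-style pattern in Python, splitting each hit into its numeric core and its qualifier. The named groups core and qualifier are an addition for readability, \b is left out of the sets (inside a set it means backspace), the period in the lookahead is escaped, and the sample text is invented.

```python
import re

MAVEN = re.compile(
    r"(?<=[\s,;])"                                   # separator before
    r"(?P<core>[vV]?(?:[0-9]|[1-9][0-9]*)\.(?:[0-9]|[1-9][0-9]*)"
    r"(?:\.(?:[1-9]+|[1-9][0-9]*))?)"                # optional patch
    r"(?P<qualifier>(?:[-.]\w+)+)?"                  # e.g. -SNAPSHOT, -rc.1
    r"(?=[\s,;]|\.(?=\s))"                           # separator after
)

text = "deps: 1.2.3-SNAPSHOT, v2.0-rc.1 done"

versions = [(m.group("core"), m.group("qualifier")) for m in MAVEN.finditer(text)]
print(versions)  # [('1.2.3', '-SNAPSHOT'), ('v2.0', '-rc.1')]
```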
