Here are some handy regex patterns and portions of patterns. I seem to keep coming back to these so it made sense to document them for the sake of memory.
Signed integers or floats. Skipping the zero before the decimal, for example .1, +.1 or -.1, is valid.
Must Match [-+]?\d*\.?\d+
or Require sign [-+]\d*\.?\d+
Optionally [-+]?\d*\.?\d*
or Require sign [-+]\d*\.?\d*
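To see how these variants differ, here is a minimal sketch in Python. The names MUST_MATCH, REQUIRE_SIGN, OPTIONAL and matches_fully are made up for illustration, and the check uses a full-string match, which is an assumption about how you would apply the patterns.

```python
import re

# Hypothetical names for three of the patterns listed above.
MUST_MATCH = re.compile(r"[-+]?\d*\.?\d+")
REQUIRE_SIGN = re.compile(r"[-+]\d*\.?\d+")
OPTIONAL = re.compile(r"[-+]?\d*\.?\d*")


def matches_fully(pattern, text):
    """Return True when the entire text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    # Print one row per sample: must match, require sign, optional.
    for sample in [".1", "+.1", "-.1", "42", "1.", ""]:
        print(repr(sample),
              matches_fully(MUST_MATCH, sample),
              matches_fully(REQUIRE_SIGN, sample),
              matches_fully(OPTIONAL, sample))
```

Because the "must match" form ends in \d+, a trailing dot such as 1. is rejected, while the optional form accepts it along with the empty string.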
Starts with Numeric. This is useful for PHP, where math between strings or casting of strings to float or int works if the strings start with a number.
Must Match ^[-+]?\d*\.?\d+.*$
w/captures ^([-+]?\d*\.?\d+)(.*)$
cap. space ^([-+]?\d*\.?\d+)(\s*)(.*)$
Using * instead of + for the space and the ending means the pattern won't fail just because they are missing; you will simply get an empty string for them.
Here is that last one in PHP:
if (preg_match('/^([-+]?\d*\.?\d+)(\s*)(.*)$/', '-1.5 the ending', $c)) {
    // $c[0] holds the whole match; the capture groups start at index 1.
    $num = $c[1];
    $space = $c[2];
    $end = $c[3];
} else {
    $num = $space = $end = '';
}
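You could skip the need to initialize the empty variables by accepting any string, even one that does not start with a number, by changing the pattern to ^([-+]?\d*\.?\d*)(\s*)(.*)$. Then if $c[1]==='' you know the string didn't start with a number. A bare sign or dot, as in '-abc', is still captured by the first group, so a check such as is_numeric($c[1]) is stricter.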
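The same split can be sketched in Python to compare the strict and lenient patterns side by side. STRICT, LENIENT and split_leading_number are hypothetical names, and unlike PHP's match array, Python's groups() leaves out the whole match and returns only the three captures.

```python
import re

# Strict form: the string must start with a number.
STRICT = re.compile(r"^([-+]?\d*\.?\d+)(\s*)(.*)$")
# Lenient form: every part of the number is optional.
LENIENT = re.compile(r"^([-+]?\d*\.?\d*)(\s*)(.*)$")


def split_leading_number(text, pattern=STRICT):
    """Return (number, space, rest); all empty when the pattern does not match."""
    m = pattern.match(text)
    if m is None:
        return ("", "", "")
    return m.groups()


if __name__ == "__main__":
    print(split_leading_number("-1.5 the ending"))  # -> ('-1.5', ' ', 'the ending')
    print(split_leading_number("abc"))              # -> ('', '', '')
    print(split_leading_number("abc", LENIENT))     # -> ('', '', 'abc')
```

With the lenient pattern the fallback branch is rarely needed for single-line input, since the remainder simply lands in the last group.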
If you know Bash programming, you'll know that wrapping a portion of a string in backticks or in $( ) causes it to be executed in a subshell. This is like string interpolation but with execution.
You might want to accept a string in which one portion gets special processing, indicated by wrapping that portion in backticks.
Capture All (`.*`)
Just Inner (?:`(.*)`)
That last one works because you can nest a capture group ( ) inside a non-capture group (?: ) and put the backticks in the non-capture group but just outside the capture.
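A short Python sketch of the difference between the two captures follows. The helper names capture_all and just_inner are invented for this example, and searching anywhere in the string (rather than anchoring) is an assumption.

```python
import re

CAPTURE_ALL = re.compile(r"(`.*`)")
JUST_INNER = re.compile(r"(?:`(.*)`)")


def capture_all(text):
    """Return the backtick-wrapped portion including the backticks, or None."""
    m = CAPTURE_ALL.search(text)
    return m.group(1) if m else None


def just_inner(text):
    """Return only what sits between the backticks, or None."""
    m = JUST_INNER.search(text)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(capture_all("run `ls -la` now"))  # -> `ls -la`
    print(just_inner("run `ls -la` now"))   # -> ls -la
```

Since .* is greedy, a string with more than one backticked portion is captured from the first backtick through the last one.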
Get After ^(?:`(.*)`)?(.+)?$
The above would split '`cmd`after' into 'cmd' and 'after'. If you simply had 'after', it would be grabbed by the second capture. In other words, the default group when there are no backticks would be the last. If you want the default to be the first group instead, so that 'key' has the same result as '`key`', use:
Default to 1st Group ^`?([^`]*)`?(.*)$
Keep in mind you lose any backtick in the captures, so if you feed it '`key' you would get back 'key'. In other words, nothing makes sure the text is wrapped in two backticks before they are treated as delimiters.
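The two splitting patterns can be compared with a small Python sketch. The names GET_AFTER, DEFAULT_FIRST and split_with are hypothetical. Python reports a group that did not take part in the match as None, so the sketch asks for an empty string instead.

```python
import re

GET_AFTER = re.compile(r"^(?:`(.*)`)?(.+)?$")
DEFAULT_FIRST = re.compile(r"^`?([^`]*)`?(.*)$")


def split_with(pattern, text):
    """Return the two captures, using '' for a group that did not match."""
    m = pattern.match(text)
    if m is None:
        return ("", "")
    return m.groups(default="")


if __name__ == "__main__":
    print(split_with(GET_AFTER, "`cmd`after"))  # -> ('cmd', 'after')
    print(split_with(GET_AFTER, "after"))       # -> ('', 'after')
    print(split_with(DEFAULT_FIRST, "key"))     # -> ('key', '')
    print(split_with(DEFAULT_FIRST, "`key`"))   # -> ('key', '')
    print(split_with(DEFAULT_FIRST, "`key"))    # -> ('key', '')
```

The last line shows the caveat in action: a single unpaired backtick is silently dropped.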