The newer versions of bash include a regex operator =~
Simple example
$ re='t(es)t'
$ [[ "test" =~ $re ]]
$ echo $?
0
$ echo ${BASH_REMATCH[1]}
es
Note
- operator returns true if it's able to match
- nested groups are possible (example below shows ordering)
- 0 is the whole match
- optional groups are counted even if not present and will be indexed, but be empty/null
- global match isn't suported, so it only matches once
- the regex must be provided as an unquoted variable reference to the re var
Match fails if re specified directly as string. Seems to want to be unquoted...
[[ "test" =~ 't(es)t' ]]; echo $?;
1
More complex example to parse the http_proxy env var
http_proxy_re='^https?://(([^:]{1,128}):([^@]{1,256})@)?([^:/]{1,255})(:([0-9]{1,5}))?/?'
if [[ "$http_proxy" =~ $http_proxy_re ]]; then
# care taken to skip parent nesting groups 1 and 5
user=${BASH_REMATCH[2]}
pass=${BASH_REMATCH[3]}
host=${BASH_REMATCH[4]}
port=${BASH_REMATCH[6]}
fi
Examples of tricky issues with limitations: