@johnbartholomew
Last active January 2, 2016 23:49
bad regex! no cookie!
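#!/bin/bash
# bash rather than sh: `time` has to be the shell keyword so that it times the
# whole pipeline below, not only the printf on the left of the pipe.
a5=aaaaa
a27="$a5$a5$a5$a5$a5"aa            # 27 literal a's
aq5='a?a?a?a?a?'
aq27="$aq5$aq5$aq5$aq5$aq5"'a?a?'  # 27 optional a's
# 27 optional a's followed by 27 required a's, matched against exactly 27 a's
pattern="$aq27$a27"
input="$a27"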
printf 'Pattern: "%s"\nInput: "%s"\n' "$pattern" "$input"
printf 'Testing grep:\n'
time printf '%s\n' "$input" | grep -oE -e "^$pattern"
printf 'Testing perl:\n'
time perl -E 'say "match" if $ARGV[1] =~ m/^$ARGV[0]/' "$pattern" "$input"
printf 'Testing python:\n'
time python3 -c 'import sys; import re; print(re.match(sys.argv[1], sys.argv[2]))' "$pattern" "$input"
@Sei-Lisa

In PHP: time php -r "echo preg_match('/^$pattern/','$input')?'match':'no';"

or if preferred: time php -r 'echo preg_match("/^$argv[1]/", $argv[2])?"match":"no";' "$pattern" "$input"

PHP uses PCRE, so I guess this applies to PCRE in general rather than to PHP in particular. It also seems the behaviour can vary between installations depending on compilation flags, as per http://www.php.net/manual/en/pcre.installation.php

Interestingly, it says no match. Not sure if it's a bug or a limitation.

@Sei-Lisa

It returns 'match' for up to 18 a's after the a?'s. Probably a limitation to avoid RE exploits.
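To look for a threshold like this one, it helps to build the pattern for an arbitrary count instead of the fixed 27. A minimal sketch, assuming hypothetical helper names (`repeat`, `make_pattern`, `make_input`, `probe_grep`) that are not part of the gist; it probes with grep only, and the PHP one-liner above can be substituted to test PCRE:

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original gist.

# repeat STR N: print STR concatenated N times.
repeat() {
    _r_out=
    _r_i=0
    while [ "$_r_i" -lt "$2" ]; do
        _r_out="$_r_out$1"
        _r_i=$((_r_i + 1))
    done
    printf '%s' "$_r_out"
}

# make_pattern N: N optional a's followed by N required a's.
make_pattern() {
    printf '%s%s' "$(repeat 'a?' "$1")" "$(repeat a "$1")"
}

# make_input N: a string of exactly N a's.
make_input() {
    repeat a "$1"
}

# probe_grep N: report whether grep matches the size-N pattern.
probe_grep() {
    if printf '%s\n' "$(make_input "$1")" | grep -qE -e "^$(make_pattern "$1")"; then
        echo match
    else
        echo 'no match'
    fi
}
```

For example, `make_pattern 27` rebuilds the gist's pattern, and calling `probe_grep` in a loop over increasing sizes shows where an engine starts to slow down or fail.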

@Sei-Lisa

Yes, my bad. It's not returning 0 (no match), but false (error). The error is PREG_BACKTRACK_LIMIT_ERROR: http://www.php.net/manual/en/function.preg-last-error.php
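The difference between 0 and false is easy to miss when the result is only tested with `?:`. A rough sketch of a check that prints both the raw return value and whether the backtrack limit was hit; the function name `check_backtrack` is hypothetical, and it assumes a `php` binary on the PATH (it reports when there is none):

```shell
#!/bin/sh
# Hypothetical helper, not part of the original gist.
# check_backtrack PATTERN INPUT: show preg_match's raw result and error state.
check_backtrack() {
    if ! command -v php >/dev/null 2>&1; then
        echo 'php not found'
        return 0
    fi
    php -r '
        $r = preg_match("/^" . $argv[1] . "/", $argv[2]);
        // preg_match returns 1 (match), 0 (no match) or false (error).
        $hit = preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR ? "yes" : "no";
        printf("result=%s backtrack_limit_hit=%s\n", var_export($r, true), $hit);
    ' "$1" "$2"
}
```

Run as `check_backtrack "$pattern" "$input"` with the variables from the script above. Whether the limit is actually reached depends on the installation, for instance on the `pcre.backtrack_limit` setting.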