Created June 2, 2010 18:27
# Experimental extensions to Python's re module.

import re

def int_class_expansion(match):
    # Turn a matched [m..n] into (n|...|m). The largest value comes first
    # so that the alternation prefers the longest number (greedy).
    return "(%s)" % \
        ('|'.join(reversed([str(x) for x
                            in range(int(match.group(1)),
                                     int(match.group(2)) + 1)])))

# A rule is a (regex, expansion function) pair.
int_class_rule = (r"\[(\d+)\.\.(\d+)\]", int_class_expansion)
rules = [int_class_rule,]

def expand(pattern):
    # Pre-processing stage: apply every rule until it no longer matches.
    result = pattern
    for rule in rules:
        match = re.search(rule[0], result)
        while match:
            result = result[:match.start()] + \
                     rule[1](match) + \
                     result[match.end():]
            match = re.search(rule[0], result)

    return result

def match(pattern, string, flags=0):
    return re.match(expand(pattern), string, flags)

def search(pattern, string, flags=0):
    return re.search(expand(pattern), string, flags)
I re-read Staffan's post and realized that he actually specifies that it should be greedy so my example above is clearly wrong. I've updated the code to make it behave as it should:
>>> m = extre.match(r'([0..255].){3}[0..255]', '123.125.126.107')
>>> m.group()
'123.125.126.107'
This is a rewrite (from memory) of functionality that I toyed around with a few years ago. I cannot find the original code, but this shows the gist of the idea I was working on, applied to Staffan Nöteberg's blog post on the Integer class extension for regex (see http://bit.ly/aN2tk1).
The idea is basically to wrap the re module and add a pre-processing stage for regex patterns (the expand function). I've chosen the absolutely simplest way of implementing integer matching as discussed in Staffan's post. It can be made better.
To use it:
~ $ cd Projects/extre/
~/Projects/extre $ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import extre
You can use expand to see what the regex turns into before it is passed to the re module.
As expected, expand doesn't touch plain regexes.
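A minimal, self-contained sketch of what expand does to a pattern. The expand below is a stand-in that restates the gist's rule with re.sub (an assumption made so the snippet runs on its own under Python 2 or 3); it is not the extre module itself.

```python
import re

# Stand-in for extre.expand: the gist's [m..n] rule, restated with re.sub.
INT_CLASS = re.compile(r"\[(\d+)\.\.(\d+)\]")

def expand(pattern):
    def alternation(match):
        lo, hi = int(match.group(1)), int(match.group(2))
        # Largest value first, as in the reversed(...) call in the gist.
        return "(%s)" % "|".join(str(x) for x in range(hi, lo - 1, -1))
    return INT_CLASS.sub(alternation, pattern)

expanded = expand(r"[0..3]")            # integer class becomes an alternation
untouched = expand(r"[0-9]+\s[a-z]*")   # ordinary character classes pass through

print(expanded)    # (3|2|1|0)
print(untouched)   # [0-9]+\s[a-z]*
```

Only the [m..n] form with two dots is rewritten; everything else reaches the re module unchanged.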
Let's try matching an IP address:
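The session output did not survive here, so the following is a reconstruction under an assumption: that the first version of int_class_expansion joined the integers in ascending order (the later comment about making it greedy suggests as much). ascending_int_class is a hypothetical helper, not part of the gist.

```python
import re

def ascending_int_class(lo, hi):
    # Hypothetical helper: alternation with the smallest integer first.
    return "(%s)" % "|".join(str(x) for x in range(lo, hi + 1))

octet = ascending_int_class(0, 255)
# Hand-built expansion of ([0..255].){3}[0..255]
pattern = "(%s.){3}%s" % (octet, octet)

m = re.match(pattern, "123.125.126.107")
print(m.group())   # 123.125
```

Each alternation stops at the first alternative that matches, so 1 wins over 12 and 123, and the unescaped dot then consumes the next digit. The match therefore ends after '123.125' instead of covering the whole address.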
Hm. I'm not sure what to make of this. It's not what Staffan intended but I'm not sure it's "wrong" either. Anyway, it's "easily" solved but that is left as an exercise for the reader.
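One possible solution to the exercise, sketched on its own: list the alternatives largest first. A number is always larger than any of its proper prefixes, so descending order tries 123 before 12 and 1. descending_int_class is a hypothetical name; the updated gist code gets the same ordering with reversed(...).

```python
import re

def descending_int_class(lo, hi):
    # Hypothetical helper: alternation with the largest integer first.
    return "(%s)" % "|".join(str(x) for x in range(hi, lo - 1, -1))

octet = descending_int_class(0, 255)
# Hand-built expansion of ([0..255].){3}[0..255]; the dot is left
# unescaped, as in the gist's example pattern.
pattern = "(%s.){3}%s" % (octet, octet)

m = re.match(pattern, "123.125.126.107")
print(m.group())   # 123.125.126.107
```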
We get the error message because m is None.
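The failing call is not preserved, so this sketch uses a made-up input ('abc') and a hand-written expansion of [0..3] to show the same effect: re.match returns None when nothing matches, and calling group() on None raises AttributeError.

```python
import re

pattern = "(3|2|1|0)"          # what [0..3] expands to in the current code
m = re.match(pattern, "abc")   # no integer at the start, so no match

try:
    m.group()
    message = None
except AttributeError as err:
    message = str(err)

print(m)         # None
print(message)

# Checking for None before calling group() avoids the error.
result = m.group() if m is not None else None
```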