Created
March 30, 2011 20:15
import difflib

# initialize
matcher = difflib.SequenceMatcher()
ret = ['this', 'is', 'sparta']
rel = ['this', 'here', 'is', 'not', 'actually', 'sparta']

# calculate matching blocks
matcher.set_seqs(ret, rel)
# get_matching_blocks returns a list of triples (named tuples),
# each triple (i, j, n) denoting that ret[i:i+n] == rel[j:j+n];
# the last 'dummy' triple is discarded since it is always (len(ret), len(rel), 0)
matches = matcher.get_matching_blocks()[:-1]

# total number of elements covered by the matching blocks
rel_intersect_ret = sum(block.size for block in matches)
print(rel_intersect_ret)  # prints 3
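
The overlap count above can be wrapped in a small helper and turned into ratios. This is only a sketch: reading `ret` as "retrieved" and `rel` as "relevant" is an assumption based on the variable names, and `overlap_size`, `precision` and `recall` are hypothetical names that do not appear in the original snippet.

```python
import difflib


def overlap_size(a, b):
    """Return the total length of the matching blocks between a and b."""
    matcher = difflib.SequenceMatcher(None, a, b)
    # Drop the trailing dummy block (len(a), len(b), 0); its size is 0 anyway.
    blocks = matcher.get_matching_blocks()[:-1]
    return sum(block.size for block in blocks)


def precision(ret, rel):
    """Share of the retrieved tokens that also occur, in order, in rel (assumed meaning)."""
    return overlap_size(ret, rel) / float(len(ret)) if ret else 0.0


def recall(ret, rel):
    """Share of the relevant tokens that also occur, in order, in ret (assumed meaning)."""
    return overlap_size(ret, rel) / float(len(rel)) if rel else 0.0


ret = ['this', 'is', 'sparta']
rel = ['this', 'here', 'is', 'not', 'actually', 'sparta']

print(overlap_size(ret, rel))  # prints 3
print(precision(ret, rel))     # prints 1.0
print(recall(ret, rel))        # prints 0.5
```

Note that `SequenceMatcher` only counts elements that match in the same relative order, so this is an ordered overlap rather than a plain set intersection.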