Created February 17, 2015 20:37
Python difflib test to see the difference between the different ratios.
#!/usr/bin/env python
import difflib
import random
import string
import time

repetitions = 100000

# Pre-generate strings between 10 and 30 characters in length.
strings = [''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(10, 30))) for _ in range(repetitions * 2)]
print(strings[0])

# Test with real_quick, quick and normal in turn.
start = time.time()
for i in range(repetitions):
    s1 = strings[i]
    s2 = strings[-i - 1]
    m = difflib.SequenceMatcher(a=s1, b=s2)
    m.real_quick_ratio()
end = time.time()
print('real quick: ' + str(end - start))

start = time.time()
for i in range(repetitions):
    s1 = strings[i]
    s2 = strings[-i - 1]
    m = difflib.SequenceMatcher(a=s1, b=s2)
    m.quick_ratio()
end = time.time()
print('quick: ' + str(end - start))

start = time.time()
for i in range(repetitions):
    s1 = strings[i]
    s2 = strings[-i - 1]
    m = difflib.SequenceMatcher(a=s1, b=s2)
    m.ratio()
end = time.time()
print('normal: ' + str(end - start))

# Test with quick and normal together, calling ratio() only when quick_ratio() is above 0.6.
start = time.time()
cnt = 0
for i in range(repetitions):
    s1 = strings[i]
    s2 = strings[-i - 1]
    m = difflib.SequenceMatcher(a=s1, b=s2)
    r = m.quick_ratio()
    if r > 0.6:
        m.ratio()
        cnt += 1
end = time.time()
print('quick and normal: ' + str(end - start))
# Number of pairs that passed the quick_ratio() filter.
print(cnt)
Q0E78IKLK8PP
real quick: 1.51300001144
quick: 3.02600002289
normal: 7.86299991608
quick and normal: 3.05700016022
295
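The timings above suggest why filtering with quick_ratio() before calling ratio() pays off: only 295 of the 100000 pairs needed the full computation. This is safe because both quick_ratio() and real_quick_ratio() return upper bounds on ratio(). A minimal sketch of the same idea with all three checks chained; the helper name `similar` and its `threshold` parameter are illustrative assumptions, not part of the gist:

```python
import difflib


def similar(a, b, threshold=0.6):
    """Hypothetical helper: True if ratio() exceeds threshold.

    The cheaper checks run first. Each one is an upper bound on
    ratio(), so a pair that fails an early check cannot pass later.
    """
    m = difflib.SequenceMatcher(a=a, b=b)
    return (m.real_quick_ratio() > threshold
            and m.quick_ratio() > threshold
            and m.ratio() > threshold)


m = difflib.SequenceMatcher(a="abcd", b="bcde")
bounds = (m.real_quick_ratio(), m.quick_ratio(), m.ratio())
print(bounds)
print(similar("abcd", "bcde"), similar("abcd", "wxyz"))
```

Because of the short-circuiting `and`, the expensive ratio() call only runs for pairs that survive both cheaper bounds.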
m = difflib.SequenceMatcher(a=s1,b=s2)
m.ratio()
explain this part ASAP
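As a rough sketch of what those two lines do: SequenceMatcher compares the sequences a and b, and ratio() returns 2.0*M/T, where M is the number of matched elements and T is the combined length of both sequences. The strings below are illustrative, not taken from the gist's random data:

```python
import difflib

s1, s2 = "abcd", "bcde"
m = difflib.SequenceMatcher(a=s1, b=s2)

# M: total characters covered by the matching blocks
# (the final block is a zero-size sentinel, so it adds nothing).
matched = sum(block.size for block in m.get_matching_blocks())
# T: combined length of both strings.
total = len(s1) + len(s2)

manual = 2.0 * matched / total
print(manual, m.ratio())
```

A result of 1.0 means the sequences are identical and 0.0 means they have nothing in common.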