Last active
September 18, 2019 13:03
-
-
Save Syncrossus/b7dbb063fa227b410563be123cb93549 to your computer and use it in GitHub Desktop.
Levenshtein distance in Python for lists of objects comparable with the `!=` operator. Uses the Wagner-Fischer algorithm.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import numpy as np | |
def levenshtein(l1, l2): | |
""" Computes Levenshtein distance between two list-like objects | |
Args: | |
l1, l2: subscriptable objects compatible with len() | |
containing objects comparable with the `!=` operator. | |
""" | |
# initializing the distance matrix | |
m = np.zeros(shape=(len(l1), len(l2))) | |
m[0, :] = range(0, len(l2)) | |
m[:, 0] = range(0, len(l1)) | |
for i in range(1, len(l1)): | |
for j in range(1, len(l2)): | |
cost = 0 | |
if l1[i] != l2[j]: | |
cost = 1 | |
# distance at i,j is the minimal distance | |
# of the previous steps + current cost | |
m[i, j] = min(m[i - 1, j], m[i, j - 1], m[i - 1, j - 1]) + cost | |
return m[len(l1) - 1, len(l2) - 1] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This code is released under the .