Skip to content

Instantly share code, notes, and snippets.

@Badgerati
Created August 5, 2012 02:09
Show Gist options
  • Select an option

  • Save Badgerati/3261142 to your computer and use it in GitHub Desktop.

Select an option

Save Badgerati/3261142 to your computer and use it in GitHub Desktop.
An implementation of the Levenshtein distance algorithm in LUA.
-- Returns the Levenshtein distance between the two given strings
function string.levenshtein(str1, str2)
local len1 = string.len(str1)
local len2 = string.len(str2)
local matrix = {}
local cost = 0
-- quick cut-offs to save time
if (len1 == 0) then
return len2
elseif (len2 == 0) then
return len1
elseif (str1 == str2) then
return 0
end
-- initialise the base matrix values
for i = 0, len1, 1 do
matrix[i] = {}
matrix[i][0] = i
end
for j = 0, len2, 1 do
matrix[0][j] = j
end
-- actual Levenshtein algorithm
for i = 1, len1, 1 do
for j = 1, len2, 1 do
if (str1:byte(i) == str2:byte(j)) then
cost = 0
else
cost = 1
end
matrix[i][j] = math.min(matrix[i-1][j] + 1, matrix[i][j-1] + 1, matrix[i-1][j-1] + cost)
end
end
-- return the last value - this is the Levenshtein distance
return matrix[len1][len2]
end
@lost-RD
Copy link
Copy Markdown

lost-RD commented Nov 1, 2016

Thanks for this!

@DrkSoulk
Copy link
Copy Markdown

Oustanding!

@ThePC007
Copy link
Copy Markdown

ThePC007 commented Jul 6, 2021

This is great, thanks. :)

@SoapHeadedSoap
Copy link
Copy Markdown

who came here from that one devforum

@gram-signal
Copy link
Copy Markdown

quick note: because in line 35 you only ever refer to matrix[i] and matrix[i-1], you can get away with just storing the two most recent lines of the matrix, rather than computing the whole thing. This gets you down to O(2n) space (while still taking O(n^2) time)

@jarble
Copy link
Copy Markdown

jarble commented Apr 23, 2022

You can use a modified version of this algorithm to search for substrings that closely match another string.

@Chase-Nolan
Copy link
Copy Markdown

Thanks!

@hrsantiago
Copy link
Copy Markdown

Thanks!
matrix declaration in line 5 could appear after the quick cut-offs to prevent Lua from creating the table and later collecting it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment