@MachinesAreUs
Last active November 11, 2015 20:19
Detect similar strings in a list read from a file, using the fuzzy_match gem.
#!/usr/bin/env ruby
require 'ostruct'
require 'fuzzy_match'
file_name = ARGV[0] || 'names.txt'
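# Apply the default before converting: nil.to_i is 0, so `|| 0.7` never fired,
# and to_i would truncate a value such as "0.7" to 0.
threshold = (ARGV[1] || 0.7).to_f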
strs = File.readlines(file_name).collect {|s| s.strip }
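# For each string, find its best fuzzy match among the other strings.
matches = strs.collect { |s|
  OpenStruct.new(
    :str => s,
    :match => FuzzyMatch.new(strs - [s]).find_with_score(s)
  )
}
# find_with_score may return nil when nothing matches; otherwise index 0 is
# the matched string and index 1 is the score compared against the threshold.
interesting = matches.select { |m| m.match && m.match[1] > threshold }
interesting.each { |m| puts "#{m.str},#{m.match[0]}" }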
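
The threshold argument is easy to get wrong because `nil.to_i` returns 0 rather than nil, so an `||` fallback placed after the conversion never runs. A minimal sketch of the parsing, using a hypothetical helper `parse_threshold` that is not part of the script and relies only on core Ruby:

```ruby
# Hypothetical helper (not in the original script): read the similarity
# threshold from a command-line style argument list, defaulting to 0.7.
def parse_threshold(argv, default = 0.7)
  # Apply the fallback before converting, then convert to Float so
  # fractional values such as "0.85" are preserved.
  (argv[1] || default).to_f
end

puts parse_threshold(['names.txt'])          # no threshold given
puts parse_threshold(['names.txt', '0.85'])  # explicit threshold
```

Converting with `to_f` rather than `to_i` matters here because the threshold is a fraction; an integer conversion would turn any value below 1 into 0.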