@yuki24
Last active August 29, 2015 14:22
How accurate is the did_you_mean gem?
# gem 'did_you_mean', '0.9.7'
require 'yaml'
require 'set'
require 'did_you_mean'
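# Minimal finder that suggests corrections from a fixed collection of words.
# Defined only if the installed gem version does not already provide it.
class DidYouMean::WordCollection
  include DidYouMean::BaseFinder

  def initialize(words)
    @words = words
  end

  # Reset @suggestions so a result from a previous input is not reused.
  def similar_to(input, filter = EMPTY)
    @suggestions, @input = nil, input
    suggestions
  end

  # Maps the current input to the candidate words it is compared against.
  def searches
    { @input => @words }
  end
end if !defined?(DidYouMean::WordCollection)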
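# Build the dictionary from the titles, normalized to lowercase with underscores.
# File.read closes the file after reading, unlike open(...).read.
titles = YAML.load(File.read("simple_english_titles.yml"))
words  = titles.map { |word| word.downcase.tr(" ".freeze, "_".freeze) }

DICTIONARY = Set.new(words)
COLLECTION = DidYouMean::WordCollection.new(DICTIONARY)

# Test data: pairs of (correct word, misspelling).
INCORRECT_WORDS = YAML.load(File.read("incorrect_words.yaml"))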
total_count = 0
correct_count = 0
words_not_corrected = {}
puts "total number of test data: #{INCORRECT_WORDS.size}"
puts " did_you_mean version: #{DidYouMean::VERSION}\n\n"
index = 0
INCORRECT_WORDS.each do |correct, incorrect|
  # Only score words that the dictionary could actually suggest.
  if DICTIONARY.include?(correct)
    total_count += 1

    # A hit means the top suggestion is exactly the correct word.
    if COLLECTION.similar_to(incorrect).first == correct
      correct_count += 1
    else
      words_not_corrected[correct] = incorrect
    end
  end

  index += 1
  puts "processed #{index} items" if index % 10 == 0
end
filename = "words_not_corrected_#{Time.now.to_i}.yml"
File.open(filename, 'w') {|file| file.write(words_not_corrected.to_yaml) }
puts "\n\n"
puts " total count: #{total_count}"
puts "correct count: #{correct_count}"
puts " accuracy: #{correct_count.to_f / total_count}\n\n"
puts "incorrect suggestions were logged to #{filename}."