@knowtheory
Created September 30, 2013 20:58
Very basic well-formedness test for English.
documents = Document.all(:limit => 10) # get some documents

# Use map to iterate over the documents and return the fraction
# of each document's words that are in the spell check dictionary.
results = documents.map do |doc|
  checked = Spellchecker.check(doc.combined_page_text) # check the text
  correct = checked.select { |entry| entry[:correct] } # keep the correctly spelled words
  # fraction between 0 and 1; guard against empty documents, since 0.0 / 0 is NaN
  percentage = checked.empty? ? 0.0 : correct.size.to_f / checked.size
  puts "#{doc.id}: #{percentage} (#{correct.size} of #{checked.size})" # print some status info
  # return the document's id and its stats
  [doc.id, { :percent => percentage, :correct => correct.size, :checked => checked.size }]
end
percentages = Hash[results] # map each document id to its stats
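
One possible follow-up, sketched here rather than taken from the original gist, is to use the `percentages` hash to flag documents that look poorly formed. The sample data, the `THRESHOLD` value, and the `poorly_formed_ids` helper are all hypothetical. The hash only mirrors the shape built above (document id mapped to `:percent`, `:correct`, and `:checked`), so the snippet runs without `Document` or `Spellchecker`.

```ruby
# Hypothetical sample data shaped like the `percentages` hash built above.
percentages = {
  1 => { :percent => 0.95, :correct => 190, :checked => 200 },
  2 => { :percent => 0.40, :correct => 80,  :checked => 200 },
  3 => { :percent => 0.0,  :correct => 0,   :checked => 0 }
}

# Assumed cutoff, not a value from the gist; tune it against real documents.
THRESHOLD = 0.8

# Hypothetical helper: ids of documents whose share of dictionary words
# falls below the threshold.
def poorly_formed_ids(percentages, threshold)
  percentages.select { |_id, stats| stats[:percent] < threshold }.keys
end

suspect_ids = poorly_formed_ids(percentages, THRESHOLD)
puts "Documents below #{THRESHOLD}: #{suspect_ids.inspect}"
```

A single fixed cutoff is a blunt instrument: documents full of proper nouns or technical terms can score low even when the text is fine, so the flagged ids are best treated as candidates for manual review.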