Last active
April 3, 2025 15:00
Use Google's undocumented spellcheck API to make spelling correction suggestions.
# Unfortunately, Google doesn't return the actual string it is suggesting
# replacements for, so this got a little complicated. Fork and improve!
require 'httparty'
require 'nokogiri'

# > Spellcheck.correct("Ths is a tst")
# => "This is a test"
# > Spellcheck.correct("My nme is Robrt Whtney")
# => "My name is Robert Whitney"
# > Spellcheck.correct("Chnky Bacn")
# => "Chunky Bacon"
module Spellcheck
  # Replaces each word the service flags with its first suggestion.
  def self.correct(text)
    resp = get_corrections(text)
    result = Nokogiri.XML(resp.body).xpath('spellresult').first
    # No <spellresult> element (e.g. an error page): return the input unchanged
    return text if result.nil?

    corrections = result.xpath('c')
    text_array = text.split(" ")
    corrections.each do |correction|
      offset = correction['o'].to_i
      length = correction['l'].to_i
      # String#[start, length] takes exactly `length` characters; the
      # inclusive range offset..(offset+length) took one character too many.
      misspelled_word = text[offset, length].strip
      corrected_word = correction.text.split(" ").first
      index = text_array.index(misspelled_word)
      # Skip flagged words that do not match a whitespace-delimited token
      text_array[index] = corrected_word if index && corrected_word
    end
    text_array.join(" ")
  end

  # Returns possible spelling corrections as XML
  #
  #   <spellresult error="0" clipped="0" charschecked="12">
  #     <c o="0" l="3" s="1">This Th's Thus Th HS</c>
  #     <c o="9" l="3" s="1">test tat ST St st</c>
  #   </spellresult>
  #
  # o    - The offset of the word from the start of the text
  # l    - Length of the misspelled word
  # s    - Confidence of the suggestion
  # text - Tab-delimited list of suggestions
  #
  def self.get_corrections(text)
    HTTParty.post(
      'https://www.google.com/tbproxy/spell?lang=en&hl=en',
      # Escape &, < and > so the input text cannot break the XML request
      body: "<spellrequest textalreadyclipped='0' ignoredups='0' ignoredigits='1' ignoreallcaps='1'>
        <text>#{text.encode(xml: :text)}</text>
        </spellrequest>"
    )
  end
end
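
The header comment notes that the response gives only offsets and lengths, not the flagged strings, which is why the code above searches a token array. One possible simplification is to splice each replacement into the string at its offset, working from the end so earlier offsets stay valid. This is a minimal sketch, not the gist author's method: `apply_corrections` and the hash keys `:offset`, `:length`, and `:suggestions` are hypothetical names that mirror the `o` and `l` attributes and the text of each `<c>` element.

```ruby
# Hypothetical helper: applies the first suggestion for each correction
# directly at its character offset instead of searching a token array.
def apply_corrections(text, corrections)
  result = text.dup
  # Process the highest offset first so earlier offsets are not shifted
  corrections.sort_by { |c| -c[:offset] }.each do |c|
    replacement = c[:suggestions].split.first
    next if replacement.nil?
    result[c[:offset], c[:length]] = replacement
  end
  result
end

# Values taken from the sample <spellresult> in the doc comment
corrections = [
  { offset: 0, length: 3, suggestions: "This Th's Thus Th HS" },
  { offset: 9, length: 3, suggestions: "test tat ST St st" }
]
puts apply_corrections("Ths is a tst", corrections)
# prints "This is a test"
```

Because the replacement is positional, punctuation next to a flagged word (for example `"Ths, ok"`) is left in place, and repeated misspellings are each handled at their own offset.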
It now renders a 404 😞
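
Given that the endpoint now responds with a 404, a caller might prefer to fall back to the original text rather than try to parse an error page. The following is a rough sketch under that assumption: `FakeResponse`, `spellresult_body`, and `correct_or_original` are hypothetical names, and the only thing assumed about the real response object is that it exposes `code` and `body`, as HTTParty responses do.

```ruby
# Hypothetical stand-in for an HTTP response exposing #code and #body
FakeResponse = Struct.new(:code, :body)

# Returns the body only when the request succeeded and the body looks like
# a spellresult document; returns nil otherwise so callers can fall back.
def spellresult_body(resp)
  return nil unless resp.code == 200
  return nil unless resp.body.to_s.include?('<spellresult')
  resp.body
end

# Yields the XML body to the block when it is usable; otherwise returns
# the original text unchanged.
def correct_or_original(text, resp)
  body = spellresult_body(resp)
  return text if body.nil?
  yield body
end

not_found = FakeResponse.new(404, "<html>Not Found</html>")
puts correct_or_original("Ths is a tst", not_found) { |xml| xml }
# prints "Ths is a tst"
```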