
@tommorris
Created February 22, 2009 21:55
Ruby en subtype detector
# A simple, regex-based Ruby detector for English variants (British vs. American).
# Backstory: I spend a ridiculous amount of time editing text that's written in a
# mix of American and British English, to the point where it ends up screwing with my
# head and I get funny looks from my fellow Brits for writing American words.
# I wrote this so that I can pipe the text I'm working on in Vim to it and quickly see
# which variant it's in.
def en_gb_or_us(text)
  # compiled from https://wiki.ubuntu.com/EnglishTranslation/WordSubstitution
  # and http://en.wikipedia.org/wiki/American_and_British_English_spelling_differences
  brit_tests = [/\bcolour\b/, /\bfavourite\b/, /\bhonour\b/, /\barmour\b/, /\brumour\b/, /\benrolment\b/, /\bfulfil\b/, /\bskilful\b/, /\bcheque\b/, /\barse\b/, /\bmum\b/, /\bmam\b/, /\btitbit\b/, /\bpernickety\b/, /\baeroplane\b/, /\btheatre\b/, /\bgoitre\b/, /\blitre\b/, /\blustre\b/, /\bmitre\b/, /\bnitre\b/, /\breconnoitre\b/, /\bsaltpetre\b/, /\bspectre\b/, /\bcentre\b/, /\btitre\b/, /\bfibre\b/, /\bsabre\b/, /\bsombre\b/, /\bconnexion\b/, /\binflexion\b/, /\bdeflexion\b/, /\breflexion\b/, /\bgenuflexion\b/, /\bfoetal\b/, /\bfoetus\b/, /\banaemic\b/, /\banaemia\b/, /\banaesthesia\b/, /\banaesthetic\b/, /\bcaesium\b/, /\bdiarrhoea\b/, /\bdiarrhoeic\b/, /\bgynaecology\b/, /\bgynaecologist\b/, /\bhaemophilia\b/, /\bleukaemia\b/, /\boesophagus\b/, /\boestrogen\b/, /\bartefact\b/, /\bkerb\b/, /\bcypher\b/, /\bchequer\b/, /\bgaol\b/, /\bgaoler\b/, /\byoghurt\b/, /\bagendum\b/, /\badrenaline\b/, /\badaptor\b/, /\baluminium\b/, /\bdraught\b/, /\boenology\b/, /\bhomoeopathic\b/, /\bhomoeopathy\b/, /\bhomoeopath\b/, /\bcentimetre\b/, /\bnanometre\b/, /\btrouser[s]?\b/, /\bprise\b/, /\bjumper\b/, /\bpolo neck\b/, /\bdinner jacket\b/, /\bvapour\b/, /\bcourgette\b/, /\bwindscreen\b/]
  amer_tests = [/\bcolor\b/, /\bfavorite\b/, /\bhonor\b/, /\barmor\b/, /\brumor\b/, /\benrollment\b/, /\bmom\b/, /\btidbit\b/, /\bpersnickety\b/, /\bairplane\b/, /\btheater\b/, /\bgoiter\b/, /\bliter\b/, /\bluster\b/, /\bmiter\b/, /\bniter\b/, /\breconnoiter\b/, /\bsaltpeter\b/, /\bspecter\b/, /\bcenter\b/, /\btiter\b/, /\bfiber\b/, /\bsaber\b/, /\bsomber\b/, /\bpederast\b/, /\bfetal\b/, /\bfetus\b/, /\banesthesia\b/, /\banesthetic\b/, /\bcesium\b/, /\bdiarrhea\b/, /\bdiarrheic\b/, /\bgynecology\b/, /\bgynecologist\b/, /\bleukemia\b/, /\besophagus\b/, /\bestrogen\b/, /\bartifact\b/, /\bgray\b/, /\bgantlet\b/, /\bdonut\b/, /\bomelet\b/, /\bmollusk\b/, /\benology\b/, /\bcentimeter\b/, /\bnanometer\b/, /\bdiaper[s]?\b/, /\bresume\b/, /\brésumé\b/, /\bsneakers\b/, /\bT\-boned\b/, /\btuxedo\b/, /\bturnpike\b/, /\bturtleneck\b/, /\bvapor\b/, /\bzucchini\b/, /\bZIP code\b/, /\bwindshield\b/]
  # each pattern counts at most once: +1 per British match, -1 per American match
  rating = 0
  brit_tests.each do |pattern|
    rating += 1 if text =~ pattern
  end
  amer_tests.each do |pattern|
    rating -= 1 if text =~ pattern
  end
  return "British English" if rating > 0
  return "American English" if rating < 0
  # rating == 0: no pattern matched, or British and American matches cancel out
  "Indistinguishable"
end
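
The detector above matches case-sensitively and counts each pattern at most once, so a sentence-initial "Colour" is missed and ten uses of "theatre" weigh the same as one. A minimal sketch of one possible alternative follows, assuming case-insensitive matching and per-occurrence counting are wanted. The names `BRIT_WORDS`, `AMER_WORDS`, `word_pattern` and `en_variant_score` are hypothetical, and the word lists are shortened stand-ins for the full lists, not part of the original gist.

```ruby
# Hypothetical, shortened word lists; the full lists from the detector would go here.
BRIT_WORDS = %w[colour favourite theatre centre aluminium].freeze
AMER_WORDS = %w[color favorite theater center donut diaper armor].freeze

# Build a single case-insensitive, whole-word pattern from a list of words.
# Regexp.union escapes each word and joins them with "|".
def word_pattern(words)
  Regexp.new("\\b(?:#{Regexp.union(words).source})\\b", Regexp::IGNORECASE)
end

BRIT_PATTERN = word_pattern(BRIT_WORDS)
AMER_PATTERN = word_pattern(AMER_WORDS)

# Positive score leans British, negative leans American, zero is undecided.
# Every occurrence counts, not just the first match of each word.
def en_variant_score(text)
  text.scan(BRIT_PATTERN).length - text.scan(AMER_PATTERN).length
end
```

With these stand-in lists, `en_variant_score("Colour and colour at the Theatre")` is positive, because all three occurrences count regardless of case. Whether per-occurrence counting is preferable depends on the text: one repeated word can then dominate the score.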
puts en_gb_or_us("This diaper armor is strong, so buy me a donut") # prints "American English"
puts en_gb_or_us("I'm going to visit the theatre tonight") # prints "British English"
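
The header comment mentions piping text from Vim, but the script as posted only classifies two hard-coded strings. A small sketch of a stdin entry point is given below, assuming the detector is defined earlier in the same file. The helper name `classify_io` is hypothetical and not part of the original gist.

```ruby
# Read everything from `io` and hand it to `detector`,
# which can be any callable that takes a String.
def classify_io(io, detector)
  detector.call(io.read)
end

# Only act as a filter when run directly and when en_gb_or_us is defined
# (assumed to come from the detector earlier in this file).
if __FILE__ == $PROGRAM_NAME && respond_to?(:en_gb_or_us, true)
  puts classify_io($stdin, method(:en_gb_or_us))
end
```

In Vim, `:w !ruby detector.rb` sends the current buffer to the script's standard input without modifying the buffer. The filename `detector.rb` is a placeholder for wherever the script is saved.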