@sriranggd
Last active January 23, 2022 06:14
ವರ್ಡಲ್ಲಾ ಸಹಾಯಕ (Wordalla helper)
# encoding: utf-8
#
# The Kannada dictionary file this script needs is available at https://github.com/alar-dict/data
# Download the YAML file and place it in the same directory as this script.
#
# Using this script:
# The script is meant to be used interactively from an IRB console.
#
# 1. Launch irb from the directory that contains this script.
# 2. Load the script with require './wordalla.rb'
# 3. Parsing the large Kannada dictionary file takes a few seconds.
# 4. The script then selects all the five-letter words and keeps them ready for further filtering.
# 5. Based on what the Wordalla website shows you, narrow the list by calling the method filter_with_includes_and_excludes.
# 6. The inputs for this method are:
#    a. words : the list of words to filter. You can pass `fives` here; that is where the script stores the five-letter words.
#    b. must_include : array of letters that must be present in the word
#    c. must_exclude : array of letters that must not be present in the word
#
require 'yaml'
ANUSVARA = "\u0C82".freeze # ಂ
VISARGA = "\u0C83".freeze # ಃ
VOTTU = "\u0CCD".freeze # virama; joins consonants into a conjunct (vattakshara)
VOWEL_SIGNS = %w( ಾ ಿ ೀ ು ೂ ೆ ೇ ೈ ೊ ೋ ೌ ೃ).freeze
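# Counts the letters in a Kannada word. Anusvara, visarga, and vowel signs
# attach to the preceding letter, and a consonant that follows a vottu is part
# of a conjunct, so none of these add to the length.
def kannada_word_length(word)
  length = 0
  is_vattakshara = false
  word.each_char do |letter|
    next if letter == ANUSVARA || letter == VISARGA
    next if VOWEL_SIGNS.include?(letter)

    if letter == VOTTU
      is_vattakshara = true
      next
    end

    if is_vattakshara
      # This consonant completes a conjunct; it was already counted.
      is_vattakshara = false
    else
      length += 1
    end
  end
  length
end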
# Returns the words whose 'entry' contains every letter in must_include
# and none of the letters in must_exclude.
def filter_with_includes_and_excludes(words, must_include = [], must_exclude = [])
  words.select do |w|
    entry = w['entry']
    must_include.all? { |letter| entry.include?(letter) } &&
      must_exclude.none? { |letter| entry.include?(letter) }
  end
end
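# Top-level local variables in a required file are not visible from IRB,
# so the data is kept in constants and the five-letter list is exposed as `fives`.
DICT = YAML.load_file(File.join(__dir__, 'alar.yml'))
FIVES = DICT.select { |w| kannada_word_length(w['entry']) == 5 }

def fives
  FIVES
end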
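
The counting and filtering steps can be tried without downloading the dictionary. The following is a minimal sketch under stated assumptions: the `WordallaDemo` module and the five sample words are hypothetical and chosen only for illustration, the logic is a compact re-implementation rather than the script's own definitions, and each record is assumed to be a hash with an 'entry' key, as the script assumes.

```ruby
# Hypothetical demo module; these names are illustrative and not part of the script.
module WordallaDemo
  VOTTU = "\u0CCD"
  # Anusvara, visarga, and the vowel signs: marks that never start a new letter.
  SKIPPED = ["\u0C82", "\u0C83",
             "\u0CBE", "\u0CBF", "\u0CC0", "\u0CC1", "\u0CC2", "\u0CC3",
             "\u0CC6", "\u0CC7", "\u0CC8", "\u0CCA", "\u0CCB", "\u0CCC"].freeze

  # Same counting rule as kannada_word_length in the script.
  def self.word_length(word)
    length = 0
    after_vottu = false
    word.each_char do |ch|
      next if SKIPPED.include?(ch)

      if ch == VOTTU
        after_vottu = true
      elsif after_vottu
        after_vottu = false # second half of a conjunct, already counted
      else
        length += 1
      end
    end
    length
  end

  # Same include/exclude rule as filter_with_includes_and_excludes.
  def self.filter(words, must_include = [], must_exclude = [])
    words.select do |w|
      entry = w['entry']
      must_include.all? { |l| entry.include?(l) } &&
        must_exclude.none? { |l| entry.include?(l) }
    end
  end
end

# Hand-picked sample records standing in for the dictionary.
words = [
  { 'entry' => 'ಕನ್ನಡ' },    # 3 letters: the conjunct counts once
  { 'entry' => 'ಸಹಾಯಕ' },   # 4 letters: the vowel sign is not counted
  { 'entry' => 'ಉಪಕರಣ' },   # 5 letters
  { 'entry' => 'ಉದಾಹರಣೆ' }, # 5 letters
  { 'entry' => 'ಅನುಕರಣ' }   # 5 letters
]

fives = words.select { |w| WordallaDemo.word_length(w['entry']) == 5 }

# Suppose the site reports that ರ and ಣ are present and ಪ is absent.
narrowed = WordallaDemo.filter(fives, ['ರ', 'ಣ'], ['ಪ'])
narrowed.each { |w| puts w['entry'] }
```

In a real session the same narrowing would be applied to `fives` from the script, feeding the result of one call back in as `words` for the next guess.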