Created
August 4, 2014 14:58
ElasticSearch synonym tokenization with Searchkick gem
require 'searchkick'

module SearchkickSyn
  def searchkick_index_options
    # fetch index options
    options = super

    # inject Synonym filter for default index and searchkick search
    options[:settings][:analysis][:analyzer][:default_index][:filter].push "synonym"
    options[:settings][:analysis][:analyzer][:searchkick_search][:filter].push "synonym"

    # inject WordNet synonym filter
    options[:settings][:analysis][:filter][:synonym] = {
      :type => 'synonym',
      :synonyms_path => '/var/lib/wn_s.pl'
    }

    options
  end
end

# This is the 'footprint' definition (including all containing
# modules, etc.) of the class we want to 'prepend' with our module
module Searchkick
  module Reindex
    # This is the line where all the magic happens - we 'prepend'
    # the module we created above into the class
    prepend SearchkickSyn
  end
end
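The prepend-and-super pattern used above can be tried without Searchkick or a running cluster. The following is a minimal sketch only: `BaseIndexer`, `SynonymPatch`, `index_options`, the `search` analyzer and the inline synonym list are hypothetical stand-ins invented for illustration, not Searchkick's real classes or option layout.

```ruby
# Hypothetical stand-in for the library class being patched.
# This is NOT Searchkick; it only mimics the nested options hash.
class BaseIndexer
  def index_options
    {
      settings: {
        analysis: {
          analyzer: {
            default_index: { filter: ["lowercase"] },
            search: { filter: ["lowercase"] }
          },
          filter: {}
        }
      }
    }
  end
end

# Hypothetical patch module. Because it is prepended, it sits in
# front of BaseIndexer in the ancestor chain, so `super` reaches
# the original method.
module SynonymPatch
  def index_options
    options = super
    analysis = options[:settings][:analysis]

    # Append the synonym filter to every analyzer's filter chain.
    analysis[:analyzer].each_value { |analyzer| analyzer[:filter].push("synonym") }

    # Register the filter itself (inline synonyms, for illustration only).
    analysis[:filter][:synonym] = { type: "synonym", synonyms: ["tv, television"] }

    options
  end
end

# Reopen the class and prepend the patch (requires Ruby 2.0+).
class BaseIndexer
  prepend SynonymPatch
end

patched = BaseIndexer.new.index_options
p patched[:settings][:analysis][:analyzer][:default_index][:filter]
```

Since the base method builds a fresh hash on every call, the filter is appended exactly once per call rather than accumulating across calls.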
This only works on Ruby 2.0 or later, since Module#prepend was introduced in Ruby 2.0.
Is it possible to put the WordNet file inside the Rails project itself? I'm not sure Heroku gives access to the /var/lib folder.
The Prolog-formatted synset file (wn_s.pl) must exist on the server at the path given by the synonyms_path setting (/var/lib/wn_s.pl in the code above). The file must be identical on all ElasticSearch servers.
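One way to check that the copies really are identical is to compare checksums. This is a rough sketch, not part of the original setup: `synonyms_checksum` is a hypothetical helper, and two temporary files stand in for the copies that would live on separate servers.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: SHA-256 hex digest of the synonym file at `path`.
def synonyms_checksum(path)
  Digest::SHA256.file(path).hexdigest
end

# Two temp files stand in for the copies on two servers.
# The content is a made-up line, not real WordNet data.
copies = 2.times.map do
  file = Tempfile.new("wn_s")
  file.write("s(100000001,1,'car',n,1,0).\n")
  file.flush
  file
end

digests = copies.map { |file| synonyms_checksum(file.path) }
puts digests.uniq.size == 1 ? "copies match" : "copies differ"
```

In practice the helper would be run against /var/lib/wn_s.pl on each server and the resulting digests compared by hand or by a deploy script.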
The WordNet 3.1 database files can be downloaded from here: http://wordnetcode.princeton.edu/wn3.1.dict.tar.gz
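Before pointing synonyms_path at a file, it can help to sanity-check its contents. The sketch below assumes the usual wn_s.pl layout of one fact per line, `s(synset_id,w_num,'word',ss_type,sense_number,tag_count).`, with apostrophes inside a word doubled as in Prolog. The sample lines, their IDs, and the `synsets` helper are made up for illustration and are not taken from the real file.

```ruby
# Illustrative lines in the assumed wn_s.pl layout (IDs and counts are made up).
SAMPLE = <<~PROLOG
  s(100000001,1,'car',n,1,0).
  s(100000001,2,'automobile',n,1,0).
  s(100000002,1,'rock ''n'' roll',n,1,0).
PROLOG

# synset_id, w_num, 'word', ss_type, sense_number, tag_count
FACT = /\As\((\d+),(\d+),'(.*)',([a-z]),(\d+),(\d+)\)\.\z/

# Hypothetical helper: group words by synset ID.
def synsets(text)
  groups = Hash.new { |hash, key| hash[key] = [] }
  text.each_line do |line|
    match = FACT.match(line.strip)
    next unless match # skip anything that is not an s(...) fact
    # Undo Prolog-style doubled apostrophes inside the quoted word.
    groups[match[1]] << match[3].gsub("''", "'")
  end
  groups
end

groups = synsets(SAMPLE)
groups.each { |id, words| puts "#{id}: #{words.join(', ')}" }
```

Words that share a synset ID are the ones that would be treated as synonyms of each other.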