@ghilead
Last active January 18, 2019 12:08
Rake task for collocating multiple models as types in a single index.
# A Rake task to facilitate importing data from your models into a common Elasticsearch index.
#
# All models should declare a common index_name, and a document_type:
#
#   class Article
#     include Elasticsearch::Model
#
#     index_name 'app_scoped_index'
#     document_type 'articles'
#
#     mappings do
#       ...
#     end
#   end
#
#
STDOUT.sync = true
STDERR.sync = true

# Optional dependency; silently skipped when the gem is not installed
begin; require 'ansi/progressbar'; rescue LoadError; end

namespace :elasticsearch do

  task :import => 'import:model'

  namespace :import do
    desc <<-DESC.gsub(/^ {6}/, '')
      Import all mappings from `app/models` (or use DIR environment variable) into a single index.

      All classes should declare a common `index_name`:

        class Article
          include Elasticsearch::Model

          index_name 'app_scoped_index'

          mappings do
            ...
          end
        end

      Usage:
        $ rake environment elasticsearch:import:combined DIR=app/models
    DESC
    task :combined do
      dir = ENV['DIR'].to_s != '' ? ENV['DIR'] : Rails.root.join("app/models")

      puts "[IMPORT] Loading models from: #{dir} into a single index."

      all_mappings = {}
      index_klass  = nil

      Dir.glob(File.join("#{dir}/**/*.rb")).each do |path|
        model_filename = path[/#{Regexp.escape(dir.to_s)}\/([^\.]+)\.rb/, 1]

        next if model_filename.match(/^concerns\//i) # Skip concerns/ folder

        begin
          klass = model_filename.camelize.constantize
        rescue NameError
          # `klass` is still nil here, so report the name derived from the file instead
          require(path) ? retry : raise(RuntimeError, "Cannot load class '#{model_filename.camelize}'")
        end

        # Skip if the class doesn't have Elasticsearch integration
        next unless klass.respond_to?(:__elasticsearch__)
        next unless klass.respond_to?(:mappings)

        puts "[IMPORT] Processing mappings for: #{klass}..."
        index_klass = klass
        all_mappings.merge! klass.mappings.to_hash
      end

      # Fail with a clear message instead of a NoMethodError on nil
      abort "[IMPORT] No Elasticsearch models found in: #{dir}" if index_klass.nil?

      ## Create the combined index
      index_klass.__elasticsearch__.client.indices.create(
        index: index_klass.index_name,
        body: {
          mappings: all_mappings
        }
      )

      ## Import data into the newly created index
      Rake::Task["elasticsearch:import:all"].invoke

      puts
    end
  end
end
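
The key step in the task above is `all_mappings.merge!`: each model contributes a hash keyed by its document type, and the merged hash becomes the `mappings` body of one create-index request. Below is a minimal sketch of that merge using plain hashes. It assumes `klass.mappings.to_hash` returns something shaped like `{ type => { properties: ... } }`, which holds for elasticsearch-model versions that still support `document_type`. The `merge_mappings` helper and the sample fields are hypothetical, not part of the gist.

```ruby
# Merge per-model mapping hashes into one hash keyed by document type.
# Raises instead of silently overwriting when two models share a type.
# (Hypothetical helper for illustration; the task itself uses a bare merge!.)
def merge_mappings(*per_model)
  per_model.each_with_object({}) do |mappings, combined|
    duplicates = combined.keys & mappings.keys
    unless duplicates.empty?
      raise ArgumentError, "Duplicate document types: #{duplicates.join(', ')}"
    end
    combined.merge!(mappings)
  end
end

# Stand-ins for what each model's `mappings.to_hash` is assumed to return.
article_mappings = { articles: { properties: { title: { type: 'string' } } } }
comment_mappings = { comments: { properties: { body:  { type: 'string' } } } }

# Same shape as the `body:` passed to `indices.create` in the task.
create_body = { mappings: merge_mappings(article_mappings, comment_mappings) }
puts create_body[:mappings].keys.inspect
```

A plain `merge!` keeps only the last mapping when two models declare the same `document_type`, which is why the sketch checks for duplicates first.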
@johndobrien

I must be missing something, I can't get this to work

@charlotte-miller

Thanks for posting this! I made some 'improvements' and wanted to share. This version doesn't care whether the classes have the same index_name or whether the index already exists. Using put_mapping allows each class to update the (shared) index instead of creating it:

    task :combined => 'environment' do
      dir = ENV['DIR'].to_s != '' ? ENV['DIR'] : Rails.root.join("app/models")

      searchable_classes = Dir.glob(File.join("#{dir}/**/*.rb")).map do |path|
        model_filename = path[/#{Regexp.escape(dir.to_s)}\/([^\.]+).rb/, 1]

        next if model_filename.match(/^concerns\//i) # Skip concerns/ folder

        begin
          klass = model_filename.camelize.constantize
        rescue NameError
          require(path) ? retry : raise(RuntimeError, "Cannot load class '#{model_filename.camelize}'")
        end

        # Skip if the class doesn't have Elasticsearch integration
        next unless klass.respond_to?(:__elasticsearch__) && klass.respond_to?(:mappings)
        klass
      end.compact


      ## Update Each Class
      searchable_classes.each do |klass|
        puts "[IMPORT] Processing mappings for: #{klass}..."

        es_indices = klass.__elasticsearch__.client.indices
        options = {index: klass.index_name}

        # Find or create index
        es_indices.create(options) unless es_indices.exists(options)
        es_indices.put_mapping(options.merge({
          type: klass.document_type,
          body: klass.mappings.to_hash,
          # ignore_conflicts:true,
        }))

      end


      ## Import data into the newly created index
      Rake::Task["elasticsearch:import:all"].invoke

      puts
    end
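
Both versions discover models the same way: take each file path under `dir`, strip the directory prefix and the `.rb` extension, skip `concerns/`, and turn the remainder into a constant name. Here is a rough sketch of that derivation without Rails. `camelize_path` is a simplified, hypothetical stand-in for ActiveSupport's `String#camelize` (it handles only snake_case segments and `/`), and `class_name_for` is likewise made up for illustration.

```ruby
# Simplified stand-in for ActiveSupport's String#camelize:
# "admin/user_profile" becomes "Admin::UserProfile".
def camelize_path(relative)
  relative.split('/').map { |part| part.split('_').map(&:capitalize).join }.join('::')
end

# Derive a class name from a model file path, or nil if the file should be skipped.
def class_name_for(path, dir)
  relative = path[/#{Regexp.escape(dir.to_s)}\/([^\.]+)\.rb/, 1]
  return nil if relative.nil? || relative.match(/^concerns\//i)
  camelize_path(relative)
end

puts class_name_for('app/models/article.rb', 'app/models')             # Article
puts class_name_for('app/models/admin/user_profile.rb', 'app/models')  # Admin::UserProfile
p    class_name_for('app/models/concerns/searchable.rb', 'app/models') # nil
```

In the task itself the resulting name is passed to `constantize`, with a `require(path)` and `retry` as the fallback when the constant is not loaded yet.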

@jkeam

jkeam commented Aug 16, 2018

I like @Chip-Miller's additions. What I particularly like is:

    es_indices.put_mapping(options.merge({
      type: klass.document_type,
      body: klass.mappings.to_hash
    }))

updating the shared index for each type is super sweet 👍 💯 🥇

Thanks @ghilead and @Chip-Miller!!!
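
For readers following along, the find-or-create plus `put_mapping` flow discussed in this thread can be sketched without a running cluster. `FakeIndices` below is a hypothetical in-memory stand-in that imitates only the three calls used in the thread (`exists`, `create`, `put_mapping`); it is not the Elasticsearch client and performs no validation. The real flow also assumes an Elasticsearch version that still allows several mapping types in one index (5.x or earlier).

```ruby
# Hypothetical in-memory stand-in for `client.indices`.
class FakeIndices
  attr_reader :store

  def initialize
    @store = {} # index name => { type => mapping body }
  end

  def exists(index:)
    @store.key?(index)
  end

  def create(index:)
    raise ArgumentError, "index already exists: #{index}" if exists(index: index)
    @store[index] = {}
  end

  def put_mapping(index:, type:, body:)
    @store.fetch(index)[type] = body
  end
end

# Same flow as the task: create the index once, then add one mapping per type.
def ensure_mapping(indices, index:, type:, body:)
  indices.create(index: index) unless indices.exists(index: index)
  indices.put_mapping(index: index, type: type, body: body)
end

indices = FakeIndices.new
ensure_mapping(indices, index: 'app_scoped_index', type: 'articles', body: { properties: {} })
ensure_mapping(indices, index: 'app_scoped_index', type: 'comments', body: { properties: {} })
puts indices.store['app_scoped_index'].keys.inspect
```

Because the index is created only when missing, the second call reuses it and simply adds another type, so the task can be re-run safely.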
