#Elasticsearch with highlight
../Gemfile
#
gem 'elasticsearch'
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
#
gem 'simple_search_filter'
We use the gem 'simple_search_filter' (https://github.com/maxivak/simple_search_filter) to take the search values entered in the form and pass them to Elasticsearch.
##Elasticsearch configuration
../config/initializers/elasticsearch.rb
Elasticsearch::Model.client = Elasticsearch::Client.new host: Rails.configuration.gex_config[:elasticsearch_host]
../config/gex/gex_config.development.yml
...some code
# elasticsearch
elasticsearch_host: '51.1.0.12'
elasticsearch_prefix: 'gex.'
##Model
../app/models/log_debug.rb
class LogDebug < ActiveRecord::Base
  self.table_name = "log_debug"
  ...some code
  ### search elasticsearch
  include LogDebugElasticsearchSearchable
  ### search
  paginates_per 10
  searchable_by_simple_filter
  ...some code
end
##Concerns in models
../app/models/concerns/log_debug_elasticsearch_searchable.rb
module LogDebugElasticsearchSearchable
  extend ActiveSupport::Concern
  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks
    index_name "#{Rails.configuration.gex_config[:elasticsearch_prefix]}log_debug"
    settings index: { number_of_shards: 1 } do
      mappings dynamic: 'false' do
        indexes :id, :index => :not_analyzed, :type => 'integer'
        indexes :source_id, :index => :not_analyzed
        indexes :type_id, :index => :not_analyzed
        indexes :user_id, :index => :not_analyzed
        indexes :team_id, :index => :not_analyzed
        indexes :cluster_id, :index => :not_analyzed
        indexes :node_id, :index => :not_analyzed
        indexes :message, :analyzer => 'standard', :boost => 100
        indexes :data, :analyzer => 'standard', :boost => 50
        indexes :ip, :index => :not_analyzed
        indexes :level, :index => :not_analyzed, :type => 'integer'
        indexes :created_at, :index => :not_analyzed
      end
    end
    def self.search(filter)
      # escape Lucene special characters in the text entered by the user
      q = Gexcore::ElasticSearchHelpers.sanitize_string(filter.v('q'))
      # full-text query on message/data, restricted by the exact-match filters
      __elasticsearch__.search(
        {
          min_score: 0.5,
          query: {
            filtered: {
              query: {
                query_string: {
                  query: '*' + q + '*',
                  fields: ['message', 'data']
                }
              },
              filter: {
                bool: {
                  must: get_terms(filter)
                }
              }
            }
          },
          highlight: {
            pre_tags: ['<em>'],
            post_tags: ['</em>'],
            fields: {
              message: {},
              data: {fragment_size: 80, number_of_fragments: 3}
            }
          },
          sort: get_order(filter)
        }
      )
    end
    def self.get_terms(filter)
      a = []
      elastic_fields = [:source_id, :type_id, :user_id, :team_id, :cluster_id, :node_id]
      elastic_fields.each do |name|
        v = filter.v(name)
        # to_i guards against nil or string values coming from the form
        a << {term: {name => v}} if v.to_i > 0
      end
      # level
      level = filter.v(:level)
      a << {
        range: {
          level: {
            gte: level
          }
        }
      } if level.to_i > 0
      # ip
      ip = filter.v(:ip)
      a << {term: {ip: ip}} if ip.present?
      # output
      a
    end
    def self.get_order(filter)
      h = filter.order.to_h
      # score
      return [ '_score' ] if h['score'].present?
      # basic
      h.map {|colname, dir| {colname => {:order => dir}}}
    end
  end
end
where
- `filter`: `#<SimpleSearchFilter::Filter:0x0000000c1aa138>`, an object of the SimpleSearchFilter::Filter class
- `:analyzer => 'standard'`: set only on columns that contain text data
- `:type => 'integer'`: set explicitly on integer columns; the default for all columns is `:type => 'string'`
- `min_score: 0.5`: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-min-score.html
- `:boost => 100`: search priority. The higher the value, the higher the priority. For example, `username` has `:boost => 100`, `firstname` has `:boost => 99`, and the page (the view) shows 10 found objects. If the search engine finds 8 `username` records and 15 `firstname` records, the page shows 8 `username` records and 2 `firstname` records
- `query_string`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
- `filtered`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html, https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html
- `query: '*' + q + '*'`: match parts of words. For example, `q = el` finds `elvis`, `electrostation`, etc.
- `fields: ['message', 'data']`: the fields in which `q` is searched
- `filter`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html
- `bool`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
- `must`: https://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html
- `term`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html, https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
- `range`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
- `highlight`: http://www.sitepoint.com/full-text-search-rails-elasticsearch/
- `fragment_size: 80`: the found word is shown together with its surrounding text, in fragments of about 80 characters
- `number_of_fragments: 3`: at most 3 fragments are returned for a long text, even if it contains more matches
- `sort`: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html
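
The `filtered` query used above was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer clusters the same search can probably be expressed with a `bool` query whose `filter` clause carries the term and range conditions. Below is a minimal sketch under stated assumptions: `build_log_query` and `TERM_FIELDS` are hypothetical names, the filter values arrive as a plain hash instead of a `SimpleSearchFilter` object, and `q` is assumed to be sanitized already.

```ruby
# Hypothetical helper (not part of the project): builds the search body
# with a bool query instead of the filtered query.
TERM_FIELDS = [:source_id, :type_id, :user_id, :team_id, :cluster_id, :node_id].freeze

def build_log_query(q, params = {})
  # exact-match conditions: one term clause per non-zero id
  filters = TERM_FIELDS.each_with_object([]) do |name, acc|
    v = params[name].to_i
    acc << { term: { name => v } } if v > 0
  end

  # minimum log level
  level = params[:level].to_i
  filters << { range: { level: { gte: level } } } if level > 0

  # exact IP match
  ip = params[:ip].to_s
  filters << { term: { ip: ip } } unless ip.empty?

  {
    min_score: 0.5,
    query: {
      bool: {
        # scored full-text part
        must: { query_string: { query: "*#{q}*", fields: ['message', 'data'] } },
        # non-scored restrictions
        filter: filters
      }
    },
    highlight: {
      pre_tags: ['<em>'],
      post_tags: ['</em>'],
      fields: { message: {}, data: { fragment_size: 80, number_of_fragments: 3 } }
    }
  }
end

body = build_log_query('el', user_id: 7, level: 3, ip: '10.0.0.5')
```

The resulting hash could be passed to `__elasticsearch__.search` in place of the body shown in the concern; the `sort` key is left out here for brevity.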
##Lib
../lib/gexcore/elastic_search_helpers.rb
module Gexcore
  class ElasticSearchHelpers
    # sanitize a search query for Lucene. Useful if the original
    # query raises an exception, due to bad adherence to DSL.
    # Taken from here:
    #
    # http://stackoverflow.com/questions/16205341/symbols-in-query-string-for-elasticsearch
    #
    def self.sanitize_string(str)
      # Escape special characters
      # http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
      escaped_characters = Regexp.escape('\\+-&|!(){}[]^~*?:\/')
      str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')
      # AND, OR and NOT are used by lucene as logical operators. We need
      # to escape them
      ['AND', 'OR', 'NOT'].each do |word|
        escaped_word = word.split('').map {|char| "\\#{char}" }.join('')
        str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
      end
      # Escape odd quotes (the regexp has two groups, so the text after
      # the quote is \2)
      quote_count = str.count '"'
      str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1
      str
    end
  end
end
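
To make the effect of the first step visible, here is a small standalone sketch of the special-character escaping only. `escape_lucene` is a hypothetical name used for illustration; it is not the project's helper and it skips the AND/OR/NOT and odd-quote handling.

```ruby
# Standalone sketch of the escaping step only (hypothetical helper).
def escape_lucene(str)
  # prefix every Lucene special character with a backslash
  str.gsub(/([\\+\-&|!(){}\[\]^~*?:\/])/) { "\\#{$1}" }
end

puts escape_lucene('node-1 (failed)')   # prints: node\-1 \(failed\)
puts escape_lucene('10:30')             # prints: 10\:30
```

Without this step, input such as `10:30` would be read by `query_string` as a search for `30` in a field named `10`.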
##Controller
../app/controllers/admin/log_debug_controller.rb
class Admin::LogDebugController < Admin::MyAdminBaseController
  # search
  search_filter :index, {save_session: true, search_method: :post_and_redirect, url: :admin_log_debug_index_url, search_url: :search_admin_log_debug_index_url, search_action: :search} do
    default_order "id", 'desc'
    # fields
    field :q, :string, :text, {label: 'Search all', default_value: '', ignore_value: '', condition: :empty, input_html: {style: "width: 130px"}}
    field :level, :int, :select, {
      label: 'Level',
      default_value: 0, ignore_value: 0,
      collection: Gexcore::LogLevel.get_all_with_blank, label_method: :name, value_method: :id,
      condition: :custom, condition_where: 'level >= ?'
    }
    field :source, :string, :autocomplete, {label: 'Source', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_log_source_name_admin_log_sources_path, input_html: {style: "width: 150px"}}
    field :type, :string, :autocomplete, {label: 'Type', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_log_type_name_admin_log_types_path, input_html: {style: "width: 150px"}}
    field :user, :string, :autocomplete, {label: 'User', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_user_username_admin_users_path, input_html: {style: "width: 150px"}}
    field :team, :string, :autocomplete, {label: 'Team', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_team_name_admin_teams_path, input_html: {style: "width: 150px"}}
    field :cluster, :string, :autocomplete, {label: 'Cluster', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_cluster_name_admin_clusters_path, input_html: {style: "width: 150px"}}
    field :node, :string, :autocomplete, {label: 'Node', default_value: '', ignore_value: '', search_by: :id, :source_query => :autocomplete_node_name_admin_nodes_path, input_html: {style: "width: 180px"}}
    field :ip, :string, :text, {label: 'IP', default_value: '', ignore_value: '', input_html: {style: "width: 80px"}}
  end
  def index
    @records, @total = Gexcore::LogDebugSearchService.search_by_filter(@filter)
  end
  ...some code
end
where
- `@records`: `#<Elasticsearch::Model::Response::Records:0x0000000371add8>`, an Elasticsearch::Model::Response::Records object - https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#search-results-as-database-records
##Search service
../lib/gexcore/log_debug_search_service.rb
module Gexcore
  class LogDebugSearchService < BaseSearchService
    def self.search_prefix
      'log_debug_search_'
    end
    def self.model
      LogDebug
    end
    def self.search_by_filter(filter)
      # run the Elasticsearch query for the requested page
      res_es = model.search(filter).page(filter.page)
      items = res_es.records
      total = res_es.results.total
      [items, total]
    end
  end
end
where
- `filter`: `#<SimpleSearchFilter::Filter:0x0000000c1aa138>`, an object of the SimpleSearchFilter::Filter class
- `items`: `#<Elasticsearch::Model::Response::Records:0x0000000371add8>`, an Elasticsearch::Model::Response::Records object - https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#search-results-as-database-records
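
The `.page(filter.page)` call relies on the Kaminari integration of elasticsearch-model, which, to my understanding, turns the page number into the `from` and `size` parameters of the search request. The arithmetic can be sketched as follows; `page_window` is a hypothetical helper written for illustration, and the default of 10 mirrors `paginates_per 10` in the model.

```ruby
# Hypothetical helper mirroring the page -> from/size arithmetic.
def page_window(page, per_page = 10)
  page = [page.to_i, 1].max   # nil or 0 falls back to the first page
  { from: per_page * (page - 1), size: per_page }
end

page_window(1)   # => {from: 0, size: 10}
page_window(3)   # => {from: 20, size: 10}
```

This is why `total` is read from `res_es.results.total` rather than from the size of `items`: the page holds at most `per_page` records, while the total counts every hit.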
##View
../app/views/admin/log_debug/index.html.haml
= stylesheet_link_tag "tpl_admin", media: "all"
...
.filter
  = inline_filter_form_for(@filter)
%br
Found <b>#{@total}</b> records.
.center-block
  = paginate @records
%table.table.table-striped.table-bordered.table-hover
  %tr
    %th= link_to_sortable_column :id, '#'
    %th= link_to_sortable_column :created_at, 'Date'
    ...
    %th Message
    ...
    %th= link_to_sortable_column :score, 'Search score'
    %th Data
  - @records.each_with_hit do |item, hit|
    %tr
      %td= item.id
      %td= item.created_at
      ...
      -# for highlight ElasticSearch
      %td
        - if hit.try(:highlight).try(:message)
          - hit.highlight.message.each do |snippet|
            %p
              = snippet.html_safe
        - else
          = item.message
      ...
      %td
        = hit._score
      %td
        = item.data.truncate(240)
        %br
        = link_to 'More', '#', :data=>{target: '#modLogData', id: item.id}
        -# for highlight ElasticSearch
        %br
        - if hit.try(:highlight).try(:data)
          %b Found:<br>
          - hit.highlight.data.each do |snippet|
            = snippet.html_safe
            %br
= paginate @records
...
where
- `hit`: `#<Elasticsearch::Model::Response::Result:0x00000002cc5298>`, an Elasticsearch::Model::Response::Result object - https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#search-results-as-database-records
- `item`: `#<LogDebug:0x0000000423a880>`, an object of the LogDebug class
- `hit.try(:highlight).try(:data)`: an array - http://www.sitepoint.com/full-text-search-rails-elasticsearch/
- `hit.highlight.data`: an array
- `snippet`: about 80 characters of text with the found word wrapped in `<em></em>` tags, for example `the great man <em>elvis</em>`
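
Because `snippet.html_safe` outputs the log text without escaping, any HTML stored in `message` or `data` would be rendered as markup. One possible precaution, sketched here under the assumption that the highlight tags are exactly `<em>` and `</em>`, is to escape the whole snippet and then restore only those tags. `safe_highlight` is a hypothetical helper name.

```ruby
require 'erb'

# Hypothetical helper: escape everything in a highlight snippet,
# then restore only the <em> tags added by Elasticsearch.
def safe_highlight(snippet)
  ERB::Util.html_escape(snippet)
           .gsub('&lt;em&gt;', '<em>')
           .gsub('&lt;/em&gt;', '</em>')
end

puts safe_highlight('the great man <em>elvis</em> <script>alert(1)</script>')
# prints: the great man <em>elvis</em> &lt;script&gt;alert(1)&lt;/script&gt;
```

In the view, the result of such a helper would then be marked with `.html_safe` instead of the raw snippet.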
##Stylesheets
../app/assets/stylesheets/tpl_admin.css.scss
...
em {
  background-color: yellow;
}
###As a result, we have a search with a filter and highlighting of the found words