Updated: | 2013-09-28 |
---|---|
Version: | 0.0.9 |
Author: | @voluntas |
URL: | http://voluntas.github.io/ |
This is a supplementary article to "Django + Elasticsearch コトハジメ" (Getting Started with Django + Elasticsearch).
https://gist.github.com/voluntas/21759d5c45aacc0e6656/
- Make Japanese full-text search easy to use from Haystack
- Build a Kuromoji-enabled Elasticsearch backend for Haystack
Python: | 2.7.5 |
---|---|
Elasticsearch: | 0.90.5 |
redis: | 2.6.16 |
This assumes Elasticsearch 0.90.5 is already installed.
github: | https://github.com/elasticsearch/elasticsearch-analysis-kuromoji |
---|
Installing the plugin takes a single command.
$ cd elasticsearch-0.90.5
$ bin/plugin -install elasticsearch/elasticsearch-analysis-kuromoji/1.5.0
-> Installing elasticsearch/elasticsearch-analysis-kuromoji/1.5.0...
Trying http://download.elasticsearch.org/elasticsearch/elasticsearch-analysis-kuromoji/elasticsearch-analysis-kuromoji-1.5.0.zip...
Downloading .......................................DONE
Installed elasticsearch/elasticsearch-analysis-kuromoji/1.5.0 into /Users/nakai/src/other/elasticsearch-0.90.5/plugins/analysis-kuromoji
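The install log above reports that the plugin ends up in `plugins/analysis-kuromoji` under the Elasticsearch directory. As a quick sanity check, a minimal sketch that looks for that directory could be written as follows; the helper name `kuromoji_plugin_installed` and its `es_home` argument are made up for illustration and are not part of Elasticsearch.

```python
import os


def kuromoji_plugin_installed(es_home):
    """Return True if plugins/analysis-kuromoji exists under es_home.

    The directory name is taken from the install log above; the helper
    itself is hypothetical.
    """
    plugin_dir = os.path.join(es_home, "plugins", "analysis-kuromoji")
    return os.path.isdir(plugin_dir)
```

Calling `kuromoji_plugin_installed("elasticsearch-0.90.5")` from the directory that contains the Elasticsearch checkout should return `True` once the install command has finished.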
Edit elasticsearch-0.90.5/config/elasticsearch.yml so that kuromoji is used as the default tokenizer:
index.analysis.analyzer.default.type: custom
index.analysis.analyzer.default.tokenizer: kuromoji_tokenizer
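One way to check that an index picks up the kuromoji tokenizer is to send some Japanese text to the index's `_analyze` endpoint and look at the tokens that come back. The sketch below assumes an index named `haystack` already exists on `http://127.0.0.1:9200` and was created after the configuration change; the index name and the helper names are illustrative only.

```python
import json

try:  # Python 3
    from urllib.request import Request, urlopen
except ImportError:  # Python 2.7, the version used in this article
    from urllib2 import Request, urlopen


def build_analyze_request(text, index="haystack",
                          base_url="http://127.0.0.1:9200"):
    """Build a POST request to /<index>/_analyze with the text as the body.

    No analyzer is named in the request, so the index's default analyzer
    is the one expected to run. The index name is an assumption.
    """
    url = "%s/%s/_analyze" % (base_url.rstrip("/"), index)
    return Request(url, data=text.encode("utf-8"))


def analyze(text, index="haystack", base_url="http://127.0.0.1:9200"):
    """Send the request and return the token strings from the response."""
    response = urlopen(build_analyze_request(text, index, base_url))
    body = json.loads(response.read().decode("utf-8"))
    return [token["token"] for token in body["tokens"]]
```

With a node running, `analyze(u"関西国際空港")` returns the list of tokens produced for that text, which makes it easy to see whether the text was split into Japanese words.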
For the available settings, refer to the plugin source. Here is a URL pinned to a commit hash, just in case.
url: | https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/blob/fc23bfd8f2fc66b32bec0ab292c2cb9a50ef1783/src/test/java/org/elasticsearch/index/analysis/kuromoji_analysis.json |
---|
{
    "index":{
        "analysis":{
            "filter":{
                "kuromoji_rf":{
                    "type":"kuromoji_readingform",
                    "use_romaji" : "true"
                },
                "kuromoji_pos" : {
                    "type": "kuromoji_part_of_speech",
                    "enable_position_increment" : "false",
                    "stoptags" : ["# verb-main:", "動詞-自立"]
                },
                "kuromoji_ks" : {
                    "type": "kuromoji_stemmer",
                    "minimum_length" : 6
                }
            },
            "tokenizer" : {
                "kuromoji" : {
                    "type":"kuromoji_tokenizer"
                }
            },
            "analyzer" : {
                "kuromoji_analyzer" : {
                    "type" : "custom",
                    "tokenizer" : "kuromoji_tokenizer"
                }
            }
        }
    }
}
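Settings of this shape can be passed when an index is created. The following is a minimal sketch under stated assumptions: the index name given to `create_index` is up to the caller, the helper names are hypothetical, and only the analyzer part of the JSON above is copied to keep the example short.

```python
import json

try:  # Python 3
    from urllib.request import Request, urlopen
except ImportError:  # Python 2.7, the version used in this article
    from urllib2 import Request, urlopen

# Trimmed copy of the analysis settings shown above (analyzer only).
KUROMOJI_SETTINGS = {
    "index": {
        "analysis": {
            "analyzer": {
                "kuromoji_analyzer": {
                    "type": "custom",
                    "tokenizer": "kuromoji_tokenizer",
                }
            }
        }
    }
}


def build_create_index_body(settings):
    """Wrap index settings in a "settings" key and serialize to JSON."""
    return json.dumps({"settings": settings}, sort_keys=True)


def create_index(index, settings, base_url="http://127.0.0.1:9200"):
    """PUT /<index> with the settings body and return the parsed response."""
    url = "%s/%s" % (base_url.rstrip("/"), index)
    request = Request(url, data=build_create_index_body(settings).encode("utf-8"))
    # urllib sends POST when data is set, so force PUT for index creation.
    request.get_method = lambda: "PUT"
    return json.loads(urlopen(request).read().decode("utf-8"))
```

`create_index` needs a running node and fails if the index already exists, so it is best tried against a throwaway index name.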
Add a SETTINGS dictionary extended with the Kuromoji definitions:
from haystack.backends.elasticsearch_backend import (
    ElasticsearchSearchBackend,
    ElasticsearchSearchEngine,
)


class KuromojiElasticBackend(ElasticsearchSearchBackend):

    def __init__(self, connection_alias, **connection_options):
        super(KuromojiElasticBackend, self).__init__(
            connection_alias, **connection_options)
        # Index settings: the ngram/edgengram entries plus the Kuromoji
        # analyzer, tokenizer, and filters.
        SETTINGS = {
            'settings': {
                "analysis": {
                    "analyzer": {
                        "ngram_analyzer": {
                            "type": "custom",
                            "tokenizer": "lowercase",
                            "filter": ["haystack_ngram"]
                        },
                        "edgengram_analyzer": {
                            "type": "custom",
                            "tokenizer": "lowercase",
                            "filter": ["haystack_edgengram"]
                        },
                        "kuromoji_analyzer" : {
                            "type" : "custom",
                            "tokenizer" : "kuromoji_tokenizer"
                        },
                    },
                    "tokenizer": {
                        "haystack_ngram_tokenizer": {
                            "type": "nGram",
                            "min_gram": 3,
                            "max_gram": 15,
                        },
                        "haystack_edgengram_tokenizer": {
                            "type": "edgeNGram",
                            "min_gram": 2,
                            "max_gram": 15,
                            "side": "front"
                        },
                        "kuromoji" : {
                            "type":"kuromoji_tokenizer"
                        },
                    },
                    "filter": {
                        "haystack_ngram": {
                            "type": "nGram",
                            "min_gram": 3,
                            "max_gram": 15
                        },
                        "haystack_edgengram": {
                            "type": "edgeNGram",
                            "min_gram": 5,
                            "max_gram": 15
                        },
                        "kuromoji_rf":{
                            "type":"kuromoji_readingform",
                            "use_romaji" : "true"
                        },
                        "kuromoji_pos" : {
                            "type": "kuromoji_part_of_speech",
                            "enable_position_increment" : "false",
                            "stoptags" : ["# verb-main:", "動詞-自立"]
                        },
                        "kuromoji_ks" : {
                            "type": "kuromoji_stemmer",
                            "minimum_length" : 6
                        },
                    }
                }
            }
        }
        # Replace the backend's DEFAULT_SETTINGS with the dictionary above.
        setattr(self, 'DEFAULT_SETTINGS', SETTINGS)


class KuromojiElasticSearchEngine(ElasticsearchSearchEngine):
    backend = KuromojiElasticBackend


ELASTICSEARCH_DEFAULT_ANALYZER = "snowball"
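To have Haystack use the custom engine, the connection's `ENGINE` entry in the Django settings has to point at `KuromojiElasticSearchEngine`. The fragment below is a sketch only: the dotted path `myproject.search_backends` is a hypothetical module name, and the URL and index name are assumptions that should match your own setup.

```python
# settings.py (sketch)
HAYSTACK_CONNECTIONS = {
    "default": {
        # Hypothetical dotted path: adjust to the module that actually
        # contains KuromojiElasticSearchEngine in your project.
        "ENGINE": "myproject.search_backends.KuromojiElasticSearchEngine",
        # Assumed local Elasticsearch node and index name.
        "URL": "http://127.0.0.1:9200/",
        "INDEX_NAME": "haystack",
    },
}
```

Index settings only take effect when the index is created, so after switching engines the index most likely needs to be rebuilt, for example with Haystack's `rebuild_index` management command.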
github: | https://github.com/mobz/elasticsearch-head |
---|---|
url: | http://mobz.github.io/elasticsearch-head/ |
A plugin that lets you inspect an Elasticsearch cluster from a web UI. It can be installed as an Elasticsearch plugin.
$ bin/plugin -install mobz/elasticsearch-head
$ open http://127.0.0.1:9200/_plugin/head/
- elasticsearch/elasticsearch-py
- https://github.com/elasticsearch/elasticsearch-py
- Python Elasticsearch Client — Elasticsearch 0.4.1 documentation
- http://elasticsearch-py.readthedocs.org/en/latest/
- Stretching Haystack's ElasticSearch Backend — The Wellfire Blog
- http://www.wellfireinteractive.com/blog/custom-haystack-elasticsearch-backend/
- ElasticSearch で kuromoji を使う (ES 0.90.Beta1 + kuromoji 1.2.0篇) - Qiita [キータ]
- http://qiita.com/hotchpotch/items/134b049a59fe396c9475
- elasticsearch での Kuromoji の使い方 - akishin999の日記
- http://d.hatena.ne.jp/akishin999/20130307/1362611100
- elasticsearchとkuromojiプラグインで日本語の全文検索 - yuhei.kagaya
- http://yuheikagaya.hatenablog.jp/entry/2013/08/06/012150
- elasticsearchのGUI「elasticsearch-head」がとても便利 - yuhei.kagaya
- http://yuheikagaya.hatenablog.jp/entry/2013/07/14/185752
- elasticsearch - EdgeNgramField min and max letters in django haystack - Stack Overflow
- http://stackoverflow.com/questions/18908131/edgengramfield-min-and-max-letters-in-django-haystack