These are field notes gathered while installing the website search facility for the ElasticSearch website.
You may reuse them to put a similar system in place.
The following assumes:
| # Run me with: | |
| # | |
| # $ nginx -p /path/to/this/file/ -c nginx.conf | |
| # | |
| # All requests are then routed to the authenticated user's index, so | |
| # | |
| # GET http://user:password@localhost:8080/_search?q=* | |
| # | |
| # is rewritten to: | |
| # |
| #!/bin/bash | |
| # Here we back up our indexes! This script should run at around 6pm or so, after logstash | |
| # rotates to a new ES index and there's no new data coming in to the old one. We grab the metadata, | |
| # compress the data files, create a restore script, and push it all up to S3. | |
| TODAY=`date +"%Y.%m.%d"` | |
| INDEXNAME="logstash-$TODAY" # this had better match the index name in ES | |
| INDEXDIR="/usr/local/elasticsearch/data/logstash/nodes/0/indices/" | |
| BACKUPCMD="/usr/local/backupTools/s3cmd --config=/usr/local/backupTools/s3cfg put" | |
| BACKUPDIR="/mnt/es-backups/" | |
| YEARMONTH=`date +"%Y-%m"` |
| Why is there no DataImportHandler equivalent in ElasticSearch? Well, because: | |
| 1. You should really consider writing your own scripts | |
| (be it JVM-based, Perl, Ruby, PHP, Node.js/JavaScript) | |
| to feed ElasticSearch via bulk indexing: | |
| http://www.elasticsearch.org/guide/reference/java-api/bulk.html | |
| 2. There are two projects doing it already: | |
| * http://code.google.com/p/sql-to-nosql-importer/ | |
| * https://github.com/Aconex/scrutineer (keeps the DB in sync with ES or Solr!) |
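As a rough illustration of the "own script" route in point 1, here is a minimal Python sketch that turns rows (for example, fetched from a database) into the newline-delimited JSON body the `_bulk` endpoint accepts. The index name `blog`, the type `post`, the `id` field and the helper name `build_bulk_body` are all assumptions made for this example; they do not come from the notes above.

```python
import json


def build_bulk_body(rows, index, doc_type, id_field="id"):
    """Serialize rows into newline-delimited JSON for the _bulk endpoint.

    `index`, `doc_type` and `id_field` are illustrative parameters,
    not names taken from the original notes.
    """
    lines = []
    for row in rows:
        # Action line: says where the document on the next line should be indexed.
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(row))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical rows standing in for the result of a database query.
rows = [
    {"id": 1, "title": "first post"},
    {"id": 2, "title": "second post"},
]
body = build_bulk_body(rows, index="blog", doc_type="post")
print(body, end="")
```

The resulting body can be written to a file and sent with something like `curl -XPOST "http://localhost:9200/_bulk" --data-binary @bulk.json`. Note that `--data-binary` is used rather than `-d`, because `-d` strips the newlines that the bulk format depends on.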
Yesterday I upgraded our running elasticsearch cluster, on a site which serves a few million search requests a day, with zero downtime. I've been asked to describe the process, hence this blog post.
To make it more complicated, the cluster was running elasticsearch version 0.17.8 (released 6 Oct 2011) and I upgraded it to the latest 0.19.10. There have been 21 releases between those two versions, with a lot of functional changes, so I needed to be ready to roll back if necessary.
We run elasticsearch on two biggish boxes, each with 16 cores and 32GB of RAM. All indices have 1 replica, so all data is stored on both boxes (about 45GB of data). The primary data for our main indices is also stored in our database. We have a few other indices whose data is stored only in elasticsearch, but these are updated only once a day. Finally, we store our sessions in elasticsearch, but active sessions are cached in memcached.
| cd ~ | |
| sudo apt-get update | |
| sudo apt-get install openjdk-7-jre-headless -y | |
| # Download the compiled elasticsearch rather than the source. | |
| wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.2.tar.gz -O elasticsearch.tar.gz | |
| tar -xf elasticsearch.tar.gz | |
| rm elasticsearch.tar.gz | |
| sudo mv elasticsearch-* elasticsearch | |
| sudo mv elasticsearch /usr/local/share |
| #!/bin/bash | |
| set -e | |
| if [ "x$1" == "x-h" ] ; then | |
| echo "Usage: $0 version destdir plugins" | |
| exit | |
| fi | |
| CURRENT="0.90.0.RC1" |
| VERSION=0.20.6 | |
| sudo apt-get update | |
| sudo apt-get install openjdk-6-jdk | |
| wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-$VERSION.deb | |
| sudo dpkg -i elasticsearch-$VERSION.deb | |
| # be sure you add "action.disable_delete_all_indices" : true to the config!! |
| curl -XDELETE "http://localhost:9200/test?pretty" | |
| curl -XPOST "http://localhost:9200/test?pretty" -d '{ | |
| "settings": { | |
| "index": { | |
| "number_of_shards": 1, | |
| "number_of_replicas": 0, | |
| "analysis":{ | |
| "analyzer":{ | |
| "suggest":{ | |
| "type": "custom", |
| // Set codec, dir and segmentName according to the segment you are trying to restore | |
| Codec codec = new Lucene42Codec(); | |
| Directory dir = FSDirectory.open(new File("/tmp/test")); | |
| String segmentName = "_0"; | |
| IOContext ioContext = new IOContext(); | |
| SegmentInfo segmentInfos = codec.segmentInfoFormat().getSegmentInfoReader().read(dir, segmentName, ioContext); | |
| Directory segmentDir; | |
| if (segmentInfos.getUseCompoundFile()) { | |
| segmentDir = new CompoundFileDirectory(dir, IndexFileNames.segmentFileName(segmentName, "", IndexFileNames.COMPOUND_FILE_EXTENSION), ioContext, false); |