Installing and Configuring ElasticSearch for Website Search: Field Notes

These are field notes gathered while installing the website search facility for the ElasticSearch website.

You may reuse them to put a similar system in place.

The following assumes:

  • You are on an Ubuntu Linux system, or a compatible/similar one
  • You have sudo permissions on the system

Update System

sudo apt-get update
sudo apt-get upgrade

Install Tools

sudo apt-get install build-essential curl vim nmap
sudo apt-get install ruby ruby-dev libopenssl-ruby

Install Git

We cannot install Git from packages: at the moment, Ubuntu ships a year-old version. Unbelievable.

sudo apt-get install libz-dev tk
cd ~
./configure --prefix=/usr/local
sudo make install clean
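
After the build, it may be worth confirming that the git found on the PATH is the freshly built one and not an older packaged copy. A minimal sketch follows; the helper names and the minimum version used in the usage note are assumptions made for illustration, not part of the original notes.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on the -V (version sort) option of GNU sort, as shipped with Ubuntu.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# installed_git_version: print the bare version number.
# `git --version` prints a line such as "git version 1.7.4.1".
installed_git_version() {
  git --version | awk '{ print $3 }'
}
```

For example, `version_ge "$(installed_git_version)" 1.7.0 || echo "git is older than expected"` would flag a stale binary, where 1.7.0 is an arbitrary threshold.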

Install Java

Install Sun Java.

Anecdotal evidence suggests that any “open Java” will break stuff. Any evidence to the contrary is sought and welcome.

sudo vim /etc/apt/sources.list
deb lucid partner
deb-src lucid partner

sudo apt-get install sun-java6-jdk
java -version
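
Since the notes prefer Sun Java over any “open Java”, the output of `java -version` can be checked mechanically. This is only a sketch: it assumes that OpenJDK builds identify themselves with the string "OpenJDK" in that output, and the function names are hypothetical.

```shell
#!/bin/sh
# is_openjdk TEXT: succeed when the given `java -version` output names OpenJDK.
is_openjdk() {
  printf '%s\n' "$1" | grep -q 'OpenJDK'
}

# check_java: inspect the java found on the PATH.
# `java -version` writes to stderr, hence the 2>&1 redirect.
check_java() {
  command -v java >/dev/null 2>&1 || { echo "java not found" >&2; return 1; }
  if is_openjdk "$(java -version 2>&1)"; then
    echo "OpenJDK detected; these notes assume Sun Java" >&2
    return 1
  fi
}
```

Running `check_java` right after the install gives a quick yes/no answer before ElasticSearch is started.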

Install ElasticSearch a.k.a. “Let's define easy”

cd /usr/local/lib

sudo curl -k -L -o elasticsearch-0.15.2.tar.gz
sudo tar -zxvf elasticsearch-0.15.2.tar.gz
sudo rm elasticsearch-0.15.2.tar.gz
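
The version number appears in the tarball name, in the unpacked directory, and later in the start commands, so keeping it in one variable helps avoid mismatches. A small sketch, with hypothetical helper names:

```shell
#!/bin/sh
ES_VERSION="0.15.2"

# es_tarball VERSION: print the tarball file name for a release.
es_tarball() {
  printf 'elasticsearch-%s.tar.gz\n' "$1"
}

# es_home VERSION: print the unpacked directory under /usr/local/lib.
es_home() {
  printf '/usr/local/lib/elasticsearch-%s\n' "$1"
}
```

With these, the unpack step becomes `sudo tar -zxvf "$(es_tarball "$ES_VERSION")"`, and later steps can `cd "$(es_home "$ES_VERSION")"`.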

Configure ElasticSearch

Add user for ElasticSearch and other associated services:

sudo adduser --home /home/elasticsearch --disabled-password --system --group elasticsearch

Important! Increase the open files limit for the elasticsearch user:

sudo vim /etc/security/limits.conf
elasticsearch     -    nofile    32000
elasticsearch     -    memlock    unlimited

sudo vim /etc/pam.d/su
session    required
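
To see whether the new limit is actually in effect, the open-files limit of a shell can be compared against the expected value. A minimal sketch; the function name is an assumption made for illustration.

```shell
#!/bin/sh
# nofile_at_least N: succeed when the current shell's open-files limit is
# "unlimited" or at least N.
nofile_at_least() {
  limit=$(ulimit -n)
  [ "$limit" = "unlimited" ] || [ "$limit" -ge "$1" ]
}
```

The check only means something when run as the elasticsearch user. Something like `sudo su - elasticsearch -c 'ulimit -n'` should report 32000 once the limits and PAM changes apply, assuming that account is allowed to run a shell.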

Set the cluster name, the paths for logs and data, and other options for ElasticSearch:

cd /usr/local/lib/elasticsearch-0.15.2

sudo vim config/elasticsearch.yml
# Cluster Settings
cluster:
  name: elasticsearch_website

path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch

bootstrap:
  mlockall: true

Make sure proper permissions are set:

sudo mkdir -p /var/log/elasticsearch
sudo chown -R elasticsearch:admin /var/log/elasticsearch
sudo chmod -R ug+rw /var/log/elasticsearch/

sudo mkdir -p /var/data/elasticsearch
sudo chown -R elasticsearch:admin /var/data/elasticsearch
sudo chmod -R ug+rw /var/data/elasticsearch

sudo mkdir -p /var/run/elasticsearch
sudo chown -R elasticsearch:admin /var/run/elasticsearch
sudo chmod -R ug+rw /var/run/elasticsearch
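
The same three commands repeat for every directory, so they can be folded into a loop. The sketch below adds a dry-run switch (`DRY_RUN=1`, an assumption of this example) that prints the commands instead of running them.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# prepare_dirs DIR...: create each directory and hand it to the
# elasticsearch user and the admin group, as in the commands above.
prepare_dirs() {
  for dir in "$@"; do
    run sudo mkdir -p "$dir"
    run sudo chown -R elasticsearch:admin "$dir"
    run sudo chmod -R ug+rw "$dir"
  done
}
```

Usage: `prepare_dirs /var/log/elasticsearch /var/data/elasticsearch /var/run/elasticsearch`.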

Start ElasticSearch

sudo -H -u elasticsearch /usr/local/lib/elasticsearch-0.15.2/bin/elasticsearch -p /var/run/elasticsearch/
curl http://localhost:9200
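
ElasticSearch may need a few seconds before it answers on port 9200, so the first curl can fail even when the start went fine. A generic retry helper, sketched under the assumption that polling is acceptable here, is:

```shell
#!/bin/sh
# wait_for TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds after each failed attempt. CMD's output is discarded.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for 30 2 curl -sf http://localhost:9200` polls for up to about a minute; the numbers are arbitrary.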

Set Up Website

cd /var/data
sudo git clone git:// elasticsearch_website

sudo chown -R elasticsearch:admin /var/data/elasticsearch_website
sudo chmod -R ug+rw /var/data/elasticsearch_website

sudo gem install jekyll

Set Up Hide

Hide is a tiny application that imports the Jekyll website data into ElasticSearch and receives GitHub HTTP post-receive notifications.

sudo mkdir -p /var/applications
cd /var/applications/
sudo git clone git://

sudo chown -R elasticsearch:admin /var/applications
sudo chmod -R ug+rw /var/applications

cd /var/applications/hide
sudo cp config.example.rb config.rb
sudo vim config.rb
:path        => '/var/data/elasticsearch_website'
sudo chown -R elasticsearch:admin /var/applications/hide/config.rb

sudo gem install bundler -v 1.0.10
sudo -H -u elasticsearch bundle install

Import website data into ElasticSearch:

sudo -H -u elasticsearch bundle exec rake index:destroy index:setup index:import

Start the post-receive hook server:

sudo -H -u elasticsearch /usr/bin/env BUNDLE_GEMFILE=/var/applications/hide/Gemfile /usr/bin/bundle exec thin --chdir /var/applications/hide --rackup /var/applications/hide/ --port 5000 --log /var/applications/hide/log/thin.log --pid /var/applications/hide/tmp/ --environment production --tag hide --daemonize start

Test the post-receive hook via GitHub. You can just click it.

Install Varnish

We will use Varnish to serve as a restricting proxy for ElasticSearch. (Of course, we could also use Nginx, Apache, etc. as a proxy.)

We will allow only GET requests to the _search endpoint. In the future, we may do more interesting tricks.
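
To reason about which requests pass, the rule can be mirrored in plain shell. This is only an illustration of the logic, not something Varnish runs. It assumes, as in the VCL further down, that the URL test is an unanchored match on "/_search".

```shell
#!/bin/sh
# allowed METHOD URL: mirror the proxy rule in plain shell.
# Succeeds only for GET requests whose URL contains "/_search".
allowed() {
  [ "$1" = "GET" ] || return 1
  case "$2" in
    */_search*) return 0 ;;
    *)          return 1 ;;
  esac
}
```

Because the match is unanchored, `allowed GET /website/_search?q=foo` passes as well, while `allowed POST /_search` and `allowed GET /_status` do not.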


curl | sudo apt-key add -
sudo vim /etc/apt/sources.list
deb lucid varnish-2.1

sudo apt-get update
sudo apt-get install varnish


sudo chown -R elasticsearch:admin /etc/varnish
sudo chmod -R ug+rw /etc/varnish

sudo chown -R elasticsearch:admin /var/lib/varnish/
sudo chmod -R ug+rw /var/lib/varnish/

sudo vim /etc/varnish/default.vcl

backend default {
    .host = "";
    .port = "9200";
}

sub vcl_recv {
  if (req.request != "GET" || req.url !~ "/_search") {
    error 403;
  }
}

sub vcl_fetch {
    set beresp.grace = 30m;
}

sub vcl_error {
    set obj.http.Content-Type = "text/html; charset=utf-8";
    synthetic {"
<!DOCTYPE html>
    <title>"} obj.status " " obj.response {"</title>
    <h1>Error "} obj.status " " obj.response {"</h1>
    <p>Use the <a href='/_search?pretty=true&q=*'>/<code>_search</code></a> API.</p>
    <p><a href=''></a></p>
"};
    return (deliver);
}


sudo mkdir -p /var/run/varnish/
sudo chown -R elasticsearch:admin /var/run/varnish
sudo chmod -R ug+rw /var/run/varnish

sudo su - elasticsearch -c "/usr/sbin/varnishd -f /etc/varnish/default.vcl -a -P /var/run/varnish/"

Set Up Monit

We will put the system under surveillance with Monit.

Install and enable:

sudo apt-get install monit

sudo vim /etc/default/monit
# You must set this variable to 1 for monit to start
startup=1

sudo /etc/init.d/monit start


sudo vim /etc/monit/monitrc

# ###################
# Monit Configuration
# ###################

set daemon 120
  with start delay 240

set alert [email protected]
set mailserver localhost

set httpd port 2812 and
   use address localhost
   allow localhost

check system
  if loadavg (5min) > 10 then alert
  if memory usage > 80% then alert
  if cpu usage (user) > 90% then alert

check filesystem data with path /var
  if space usage > 80% for 5 times within 15 cycles then alert
  if inode usage > 90% then alert
  if space usage > 99% then stop
  if inode usage > 99% then stop
  group filesystem

check host elasticsearch with address
  if failed url with timeout 15 seconds then alert
  group elasticsearch

check process elasticsearch1 with pidfile /var/run/elasticsearch/
  start program = "/usr/bin/sudo -H -u elasticsearch /usr/local/lib/elasticsearch-0.15.2/bin/elasticsearch -p /var/run/elasticsearch/" with timeout 60 seconds
  stop program  = "/bin/kill $(/bin/cat /var/run/elasticsearch/"
  if cpu > 90% for 5 cycles then restart
  if totalmem > 2 GB for 5 cycles then restart
  if loadavg(5min) greater than 10 for 8 cycles then stop
  if 3 restarts within 5 cycles then timeout
  group elasticsearch

check process varnishd with pidfile /var/run/varnish/
  start program = "/usr/sbin/varnishd -f /etc/varnish/default.vcl -a -P /var/run/varnish/" with timeout 60 seconds
  stop program  = "/bin/kill $(/bin/cat /var/run/varnish/"
  if cpu > 90% for 5 cycles then restart
  if totalmem > 500 MB for 5 cycles then restart
  if loadavg(5min) greater than 10 for 8 cycles then stop
  if 3 restarts within 5 cycles then timeout
  group elasticsearch

check process post_receive_server with pidfile   /var/applications/hide/tmp/
  start program = "/usr/bin/sudo -H -u elasticsearch /usr/bin/env BUNDLE_GEMFILE=/var/applications/hide/Gemfile /usr/bin/bundle exec thin --chdir /var/applications/hide --rackup /var/applications/hide/ --port 5000 --log /var/applications/hide/log/thin.log --pid /var/applications/hide/tmp/ --environment production --tag hide --daemonize start" with timeout 60 seconds
  stop program  = "/bin/kill $(/bin/cat /var/applications/hide/tmp/"
  if cpu > 90% for 5 cycles then restart
  if totalmem > 2 GB for 5 cycles then restart
  if loadavg(5min) greater than 10 for 8 cycles then stop
  if 3 restarts within 5 cycles then timeout
  group git

Use an SSH tunnel to connect to the Monit GUI:

ssh elasticsearch -L 2812:localhost:2812
open http://localhost:2812

Otherwise, just check it on the CLI:

sudo monit status

To reload Monit configuration, use:

sudo monit reload
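
A typo in monitrc is easier to catch before the reload than after it. Monit's `-t` option runs a syntax check of the control file, so the two steps can be chained. In this sketch the `MONIT` override is an assumption added so the function can be tried without root.

```shell
#!/bin/sh
# safe_reload: reload Monit only when the control file passes the syntax check.
# MONIT may be set to another command (for a dry run); default is "sudo monit".
safe_reload() {
  monit_cmd=${MONIT:-sudo monit}
  # Left unquoted on purpose so "sudo monit" splits into two words.
  $monit_cmd -t && $monit_cmd reload
}
```

Calling `safe_reload` after editing /etc/monit/monitrc leaves the running configuration untouched when the check fails.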

To start all services, use:

sudo monit start all

Wrap Up

Congratulations! You now have a “continuous indexing” system set up for searching your Jekyll website with ElasticSearch.

Author: Karel Minarik

emgiezet commented May 8, 2012

Nice work!

jboren commented Sep 25, 2013

Fantastic! Thanks for posting this, very helpful.

