Setting up an external Zookeeper Solr Cluster


This is a step-by-step guide to creating a cluster of three Solr nodes running in cloud mode, coordinated by an external ZooKeeper ensemble. The instructions should work both on a single host (for testing) and on a remote cluster where each server runs on its own physical machine. They were tested with Solr 5.4.1 and ZooKeeper 3.4.6.

Installing Solr and Zookeeper

  • Download and extract Solr:
    • curl -O http://apache.arvixe.com/lucene/solr/5.4.1/solr-5.4.1.tgz
    • mkdir /opt/solr
    • tar -zxvf solr-5.4.1.tgz -C /opt/solr --strip-components=1
  • Download and extract ZooKeeper:
    • curl -O http://apache.arvixe.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
    • mkdir /opt/zookeeper
    • tar -zxvf zookeeper-3.4.6.tar.gz -C /opt/zookeeper --strip-components=1
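
The `--strip-components=1` flag in the tar commands above drops the archive's top-level directory, so the files land directly in /opt/solr and /opt/zookeeper rather than in a versioned subdirectory. A minimal offline sketch of that behavior follows; the `demo-1.0` archive and the scratch directory are made up for illustration and are not part of the installation:

```shell
set -eu

# Scratch area so the demo does not touch /opt (hypothetical paths).
work="$(mktemp -d)"
mkdir -p "$work/src/demo-1.0/bin" "$work/dest"
echo 'placeholder' > "$work/src/demo-1.0/bin/tool"

# Build an archive whose only top-level entry is demo-1.0/,
# mirroring how solr-5.4.1.tgz wraps everything in solr-5.4.1/.
tar -czf "$work/demo-1.0.tgz" -C "$work/src" demo-1.0

# Extract while dropping the first path component.
tar -zxf "$work/demo-1.0.tgz" -C "$work/dest" --strip-components=1

# bin/ now sits directly under dest/, with no demo-1.0/ wrapper.
ls "$work/dest"
```

Without the flag, the same extraction would produce `dest/demo-1.0/bin/tool`, and the later `cd /opt/solr` and `./bin/...` commands would not find their scripts.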

Configuring a ZooKeeper quorum

  • cd /opt/zookeeper

  • Create our data directories:

    • mkdir -p /data/zookeeper/z1/data
    • mkdir -p /data/zookeeper/z2/data
    • mkdir -p /data/zookeeper/z3/data
  • Add our server ids at each data directory:

    echo 1 > /data/zookeeper/z1/data/myid
    echo 2 > /data/zookeeper/z2/data/myid
    echo 3 > /data/zookeeper/z3/data/myid
  • Copy the sample config file into the first data directory:

    cp ./conf/zoo_sample.cfg /data/zookeeper/z1/zoo.cfg

  • Open /data/zookeeper/z1/zoo.cfg in your text editor and update/add the following values:

    dataDir=/data/zookeeper/z1/data
    clientPort=2181
    # Our ZooKeeper quorum:
    # Note: To run each server on its own machine, change the
    # IP address in each entry. In that case the servers can
    # all use the same port numbers.
    server.1=127.0.0.1:2222:2223
    server.2=127.0.0.1:3333:3334
    server.3=127.0.0.1:4444:4445
  • Copy the config file into the data directories of the other two servers:

    cp /data/zookeeper/z1/zoo.cfg /data/zookeeper/z2/
    cp /data/zookeeper/z1/zoo.cfg /data/zookeeper/z3/
  • [This step is ONLY required when running the servers on the same host] Change both the dataDir and clientPort for the last two servers:

    • Server 2:
      $ vi /data/zookeeper/z2/zoo.cfg
      dataDir=/data/zookeeper/z2/data
      clientPort=2182
    • Server 3:
      $ vi /data/zookeeper/z3/zoo.cfg
      dataDir=/data/zookeeper/z3/data
      clientPort=2183
  • Start the servers:

     $ ./bin/zkServer.sh start /data/zookeeper/z1/zoo.cfg
     $ ./bin/zkServer.sh start /data/zookeeper/z2/zoo.cfg
     $ ./bin/zkServer.sh start /data/zookeeper/z3/zoo.cfg
  • Ensure that there are no major errors in the log file (zkServer.sh writes zookeeper.out to the directory it was started from, here /opt/zookeeper):

    • $ cat zookeeper.out
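
For a single-host test setup, the configuration steps above (data directories, myid files, and one zoo.cfg per server) can also be scripted. The sketch below is one possible way to do it, not the only one. It writes each zoo.cfg from scratch instead of copying zoo_sample.cfg, using the tickTime, initLimit and syncLimit values that ship in the sample file. `ZK_BASE` is a hypothetical variable introduced here; it defaults to a scratch directory, so set it to /data/zookeeper to reproduce the layout used in this guide.

```shell
set -eu

# Assumption: ZK_BASE is not part of ZooKeeper; it is a helper variable
# for this sketch and defaults to a throwaway directory.
ZK_BASE="${ZK_BASE:-$(mktemp -d)}"

for id in 1 2 3; do
  dir="$ZK_BASE/z$id"
  mkdir -p "$dir/data"

  # The myid file must contain the N of this server's server.N entry.
  echo "$id" > "$dir/data/myid"

  # Same-host setup: each server needs its own client port (2181-2183).
  client_port=$((2180 + id))

  cat > "$dir/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$dir/data
clientPort=$client_port
server.1=127.0.0.1:2222:2223
server.2=127.0.0.1:3333:3334
server.3=127.0.0.1:4444:4445
EOF
done

echo "Wrote ZooKeeper configs under $ZK_BASE"
```

Each generated file can then be passed to `./bin/zkServer.sh start` exactly as in the start step above. Deriving dataDir from the same variable as the myid path keeps the two from drifting apart.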

Configuring Solr

  • cd /opt/solr
  • Start three Solr instances and have them point at our Zookeeper instances:
    # If you are running the ZooKeeper servers on remote machines, use
    # the IP address of each server instead of localhost.
    $ ./bin/solr start -c -p 8983 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    $ ./bin/solr start -c -p 8984 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    $ ./bin/solr start -c -p 8985 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
  • Upload our collection configuration to ZooKeeper:
    $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
        -zkhost 127.0.0.1:2181 \
        -confdir ./server/solr/configsets/data_driven_schema_configs/conf/ \
        -confname my-config
  • Create a Solr collection using the uploaded configuration.

    curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-collection&numShards=2&replicationFactor=1&collection.configName=my-config'
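
The three `solr start` commands above repeat the same ZooKeeper connection string, so it can help to build that string once and reuse it. The following is a dry-run sketch that only prints the commands; the variable names are illustrative, and the hosts and ports are the single-host values used in this guide.

```shell
set -eu

# Build the comma-separated ZooKeeper connection string once.
zk_connect=""
for zk_port in 2181 2182 2183; do
  # ${var:+...} adds the separating comma only when zk_connect is non-empty.
  zk_connect="${zk_connect:+$zk_connect,}127.0.0.1:$zk_port"
done

# Dry run: print one start command per Solr node instead of running it.
for solr_port in 8983 8984 8985; do
  echo "./bin/solr start -c -p $solr_port -z $zk_connect"
done
```

Removing the `echo` (and running from /opt/solr) would execute the commands. For remote ZooKeeper servers, replace 127.0.0.1 with each server's address.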

Note: To create multiple collections with different schemas, repeat the last two steps for each schema, uploading each configuration under its own name. Collections created with the same collection.configName share one configuration in ZooKeeper, so they all end up with the same schema.
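
As a rough illustration of the note above, the loop below pairs each collection with its own configuration name and prints the upload and create commands for each pair. The collection names, config names and the `./configs/...` directories are all hypothetical; substitute your own. It is a dry run, so nothing is sent to ZooKeeper or Solr.

```shell
set -eu

# Build the Collections API CREATE URL for a collection ($1) and config ($2).
create_url() {
  echo "http://localhost:8983/solr/admin/collections?action=CREATE&name=$1&numShards=2&replicationFactor=1&collection.configName=$2"
}

# Hypothetical "collection:config" pairs, one config per schema.
for pair in "products:products-config" "articles:articles-config"; do
  name="${pair%%:*}"    # text before the first colon
  config="${pair#*:}"   # text after the first colon

  # Dry run: print the two steps for this pair instead of executing them.
  # The -confdir path is an assumed location for each schema's conf directory.
  echo "./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:2181 -confdir ./configs/$config/conf -confname $config"
  echo "curl '$(create_url "$name" "$config")'"
done
```

Because each collection is created against a distinct config name, changing one schema later does not affect the other collection.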
