Revisions

  1. @kalharbi kalharbi revised this gist Oct 6, 2016. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion zookeeper-solr-cloud.md
    Original file line number Diff line number Diff line change
    @@ -46,7 +46,8 @@ This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`
    server.2=127.0.0.1:3333:3334
    server.3=127.0.0.1:4444:4445
    ```
    `dataDir` tells Zookeeper where to store its data, and `clientPort` is for the port on which a client (Solr in this case) should connect to Zookeeper. Lastly, we defined our zookeeper quorum. Each entry is defined as **server.id=host:port:port**, where id is the server id number that we previously defined in the `myid` file and the host is for the IP address of the Zookeeper instance to communicate with other Zookeeper instances. The port has two parts. The first is the port that followers use to connect to the leader, and the second is for leader election. Refer to Zookeeper's [configuration parameters documentation](http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_configuration)
    We configured `dataDir`, which tells ZooKeeper where to store its data, and `clientPort`, the port that clients (Solr in this case) use to connect to ZooKeeper.
    Lastly, we defined the entries of our ZooKeeper quorum. Each entry has the form **server.id=host:port:port**, where `id` is the server id number that we previously wrote to the `myid` file and `host` is the IP address that the other ZooKeeper instances use to reach this server. Two ports follow the host: the first is the port that followers use to connect to the leader, and the second is used for leader election. For more information, refer to ZooKeeper's [configuration parameters documentation](http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_configuration).
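
    As an illustration of the **server.id=host:port:port** layout, here is a minimal sketch that splits one quorum entry into its fields using plain shell parameter expansion. `parse_server_entry` is a hypothetical helper written for this example; it is not part of ZooKeeper.

    ```shell
    # Split one quorum entry of the form server.<id>=<host>:<port>:<port>
    # into its four fields. Hypothetical helper, for illustration only.
    parse_server_entry() (
        entry=$1
        key=${entry%%=*}            # e.g. "server.2"
        value=${entry#*=}           # e.g. "127.0.0.1:3333:3334"
        id=${key#server.}           # must match the number in that server's myid file
        host=${value%%:*}
        ports=${value#*:}           # e.g. "3333:3334"
        peer_port=${ports%%:*}      # followers connect to the leader on this port
        election_port=${ports#*:}   # used for leader election
        printf 'id=%s host=%s peer_port=%s election_port=%s\n' \
            "$id" "$host" "$peer_port" "$election_port"
    )

    parse_server_entry 'server.2=127.0.0.1:3333:3334'
    # prints: id=2 host=127.0.0.1 peer_port=3333 election_port=3334
    ```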

    - Copy the config file into the data directories of the other two servers:

  2. @kalharbi kalharbi revised this gist Oct 6, 2016. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions zookeeper-solr-cloud.md
    @@ -40,10 +40,13 @@ This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`
    # Note: If you want to run each server on its own machine,
    # change the IP address in each entry; in that case you
    # do not have to use a different port number for each server.
    # Each entry is defined as follows:
    # server.<value_of_myid_file_in_data_dir>=<host>:<port>:<port>
    server.1=127.0.0.1:2222:2223
    server.2=127.0.0.1:3333:3334
    server.3=127.0.0.1:4444:4445
    ```
    `dataDir` tells Zookeeper where to store its data, and `clientPort` is for the port on which a client (Solr in this case) should connect to Zookeeper. Lastly, we defined our zookeeper quorum. Each entry is defined as **server.id=host:port:port**, where id is the server id number that we previously defined in the `myid` file and the host is for the IP address of the Zookeeper instance to communicate with other Zookeeper instances. The port has two parts. The first is the port that followers use to connect to the leader, and the second is for leader election. Refer to Zookeeper's [configuration parameters documentation](http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_configuration)

    - Copy the config file into the data directories of the other two servers:

  3. @kalharbi kalharbi revised this gist Oct 6, 2016. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions zookeeper-solr-cloud.md
    @@ -6,11 +6,11 @@ This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`
    ## Installing Solr and Zookeeper

    - Download and extract Solr:
    - `curl -O http://apache.arvixe.com/lucene/solr/5.4.1/solr-5.4.1.tgz`
    - `curl -O http://archive.apache.org/dist/lucene/solr/5.5.3/solr-5.5.3.tgz`
    - `mkdir /opt/solr`
    - `tar -zxvf solr-5.5.3.tgz -C /opt/solr --strip-components=1`
    - Download and extract ZooKeeper:
    - `curl -O http://apache.arvixe.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz`
    - `curl -O http://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz`
    - `mkdir /opt/zookeeper`
    - `tar -zxvf zookeeper-3.4.6.tar.gz -C /opt/zookeeper --strip-components=1`

  4. @kalharbi kalharbi revised this gist Sep 1, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zookeeper-solr-cloud.md
    @@ -103,4 +103,4 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    **Notes:**
    - If you want to create multiple collections with different schemas, repeat the last two steps for each collection that uses a different schema, uploading its configuration under a different config name. Collections created with the same `collection.configName` share the configuration stored in Zookeeper, so they all end up with a single schema.
    - In Solr, the default maxShardsPerNode is one shard per node. In this setup, we had 3 nodes, so should not attempt to add more replicas to a collection (e.g., numShards=2 & replicationFactor=2 will result in four shards in total spreaded across three nodes). This would cause a series of errors and crashes since two replicas of the same shard will never be allowed to exist on the same node as per the maxShardsPerNode config setting.
    - In Solr, the default maxShardsPerNode is 1, which means each node hosts at most one replica of a collection. In this setup we have 3 nodes, so we should not request more than three replicas in total (e.g., numShards=2 & replicationFactor=2 results in four replicas spread across three nodes). Such a request causes a series of errors, because the whole collection cannot fit on the live nodes under the maxShardsPerNode setting.
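
    The arithmetic behind this note can be sketched as follows. The helper below is hypothetical (it is not a Solr command) and assumes replicas are spread as evenly as possible across the live nodes; it prints the smallest `maxShardsPerNode` value that would let the collection fit.

    ```shell
    # Smallest maxShardsPerNode that lets numShards * replicationFactor
    # replicas fit on the given number of live nodes (rounded up).
    # Hypothetical helper, for illustration only.
    required_max_shards_per_node() (
        num_shards=$1
        replication_factor=$2
        nodes=$3
        total=$((num_shards * replication_factor))
        echo $(( (total + nodes - 1) / nodes ))
    )

    required_max_shards_per_node 2 2 3   # 4 replicas on 3 nodes, prints 2
    required_max_shards_per_node 2 1 3   # 2 replicas on 3 nodes, prints 1
    ```

    If the printed value is larger than 1, the CREATE request needs an explicit `maxShardsPerNode` parameter with at least that value, or fewer replicas.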
  5. @kalharbi kalharbi revised this gist Sep 1, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zookeeper-solr-cloud.md
    @@ -103,4 +103,4 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    **Notes:**
    - If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
    - The default maxShardsPerNode is 1 shard per node. In this setup, we have 3 nodes; thus, when creating a collection and passing **numShards=2&replicationFactor=2** (that's 4 shards in total, and we have 3 nodes) will result in a series of errors and crashes since two replicas of the same shard will never be allowed to exist on the same node as per the maxShardsPerNode config setting.
    - In Solr, the default maxShardsPerNode is one shard per node. In this setup, we had 3 nodes, so should not attempt to add more replicas to a collection (e.g., numShards=2 & replicationFactor=2 will result in four shards in total spreaded across three nodes). This would cause a series of errors and crashes since two replicas of the same shard will never be allowed to exist on the same node as per the maxShardsPerNode config setting.
  6. @kalharbi kalharbi revised this gist Aug 31, 2016. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion zookeeper-solr-cloud.md
    @@ -101,4 +101,6 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `
    **Note:** If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
    **Notes:**
    - If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
    - The default maxShardsPerNode is 1 shard per node. In this setup, we have 3 nodes; thus, when creating a collection and passing **numShards=2&replicationFactor=2** (that's 4 shards in total, and we have 3 nodes) will result in a series of errors and crashes since two replicas of the same shard will never be allowed to exist on the same node as per the maxShardsPerNode config setting.
  7. @kalharbi kalharbi revised this gist Aug 31, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zookeeper-solr-cloud.md
    @@ -98,7 +98,7 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    ```
    - Create a Solr collection using the uploaded configuration.
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=2&collection.configName=my-config'
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `
    **Note:** If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
  8. @kalharbi kalharbi revised this gist Aug 28, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zookeeper-solr-cloud.md
    @@ -98,7 +98,7 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    ```
    - Create a Solr collection using the uploaded configuration.
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=2&collection.configName=my-config'
    `
    **Note:** If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
  9. @kalharbi kalharbi revised this gist Apr 1, 2016. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion zookeeper-solr-cloud.md
    @@ -100,4 +100,5 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `
    *Note:* If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
    **Note:** If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
  10. @kalharbi kalharbi revised this gist Apr 1, 2016. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion zookeeper-solr-cloud.md
    @@ -99,4 +99,5 @@ $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    - Create a Solr collection using the uploaded configuration.
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `
    `
    *Note:* If you want to create multiple collections with different schemas, then repeat the last two steps for each collection that uses a different schema. Otherwise, Zookeeper will sync the schema for all collections and you will end up with a single schema for all collections.
  11. @kalharbi kalharbi revised this gist Mar 31, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion zookeeper-solr-cloud.md
    @@ -93,7 +93,7 @@ $ ./bin/solr start -c -p 8985 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    ```
    $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    -zkhost 127.0.0.1:2181 \
    -confdir ./server/solr/configsets/data_driven_schema_config./s/conf/ \
    -confdir ./server/solr/configsets/data_driven_schema_configs/conf/ \
    -confname my-config
    ```
    - Create a Solr collection using the uploaded configuration.
  12. @kalharbi kalharbi revised this gist Feb 13, 2016. 1 changed file with 8 additions and 7 deletions.
    15 changes: 8 additions & 7 deletions zookeeper-solr-cloud.md
    @@ -28,7 +28,8 @@ This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`
    echo 2 > /data/zookeeper/z2/data/myid
    echo 3 > /data/zookeeper/z3/data/myid
    ```
    - copy the sample config file into the first data directory:
    - Copy the sample config file into the first data directory:

    `cp ./conf/zoo_sample.cfg /data/zookeeper/z1/zoo.cfg`
    - Open /data/zookeeper/z1/zoo.cfg in your text editor and update/add the following values:

    @@ -37,25 +38,25 @@ This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`
    clientPort=2181
    # Our zookeeper quorum:
    # Note: If you want to run each server in its own machine,
    # change the ip address in each entry and obviously you
    # do not to use a different port number for each server.
    # change the ip address in each entry; Obviously, you
    # do not have to use a different port number for each server.
    server.1=127.0.0.1:2222:2223
    server.2=127.0.0.1:3333:3334
    server.3=127.0.0.1:4444:4445
    ```

    - Copy the config file to the data directory of the other two servers:
    - Copy the config file into the data directories of the other two servers:

    ```shell
    cp /data/zookeeper/z1/zoo.cfg /data/zookeeper/z2/
    cp /data/zookeeper/z3/zoo.cfg /data/zookeeper/z3/
    cp /data/zookeeper/z1/zoo.cfg /data/zookeeper/z3/
    ```

    - This step is ONLY required when running the servers on the same host. Change both the dataDir and clientPort for the last two servers:
    - [This step is ONLY required when running the servers on the same host] Change both the dataDir and clientPort for the last two servers:
    - Server 2:

    ```shell
    $ vi /data/zookeeper/z1/zoo.cfg
    $ vi /data/zookeeper/z2/zoo.cfg
    dataDir=/data/zookeeper/z2/data
    clientPort=2182
    ```
  13. @kalharbi kalharbi created this gist Feb 13, 2016.
    101 changes: 101 additions & 0 deletions zookeeper-solr-cloud.md
    @@ -0,0 +1,101 @@
    ## Setting up an external Zookeeper Solr Cluster

    These are step-by-step instructions for creating a cluster of three Solr nodes running in cloud mode. They should work on both a local cluster (for testing) and a remote cluster where each server runs on its own physical machine.
    This was tested on `Solr version 5.4.1` and `Zookeeper version 3.4.6`.

    ## Installing Solr and Zookeeper

    - Download and extract Solr:
    - `curl -O http://apache.arvixe.com/lucene/solr/5.4.1/solr-5.4.1.tgz`
    - `mkdir /opt/solr`
    - `tar -zxvf solr-5.4.1.tgz -C /opt/solr --strip-components=1`
    - Download and extract ZooKeeper:
    - `curl -O http://apache.arvixe.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz`
    - `mkdir /opt/zookeeper`
    - `tar -zxvf zookeeper-3.4.6.tar.gz -C /opt/zookeeper --strip-components=1`

    ## Configuring a ZooKeeper quorum

    - `cd /opt/zookeeper`
    - Create our data directories:
    - `mkdir -p /data/zookeeper/z1/data`
    - `mkdir -p /data/zookeeper/z2/data`
    - `mkdir -p /data/zookeeper/z3/data`
    - Write each server's id to a `myid` file in its data directory:

    ```shell
    echo 1 > /data/zookeeper/z1/data/myid
    echo 2 > /data/zookeeper/z2/data/myid
    echo 3 > /data/zookeeper/z3/data/myid
    ```
    - copy the sample config file into the first data directory:
    `cp ./conf/zoo_sample.cfg /data/zookeeper/z1/zoo.cfg`
    - Open /data/zookeeper/z1/zoo.cfg in your text editor and update/add the following values:

    ```shell
    dataDir=/data/zookeeper/z1/data
    clientPort=2181
    # Our zookeeper quorum:
    # Note: If you want to run each server in its own machine,
    # change the ip address in each entry and obviously you
    # do not to use a different port number for each server.
    server.1=127.0.0.1:2222:2223
    server.2=127.0.0.1:3333:3334
    server.3=127.0.0.1:4444:4445
    ```

    - Copy the config file to the data directory of the other two servers:

    ```shell
    cp /data/zookeeper/z1/zoo.cfg /data/zookeeper/z2/
    cp /data/zookeeper/z3/zoo.cfg /data/zookeeper/z3/
    ```

    - This step is ONLY required when running the servers on the same host. Change both the dataDir and clientPort for the last two servers:
    - Server 2:

    ```shell
    $ vi /data/zookeeper/z1/zoo.cfg
    dataDir=/data/zookeeper/z2/data
    clientPort=2182
    ```
    - Server 3:

    ```shell
    $ vi /data/zookeeper/z3/zoo.cfg
    dataDir=/data/zookeeper/z3/data
    clientPort=2183
    ```
    - Start the servers:

    ```shell
    $ ./bin/zkServer.sh start /data/zookeeper/z1/zoo.cfg
    $ ./bin/zkServer.sh start /data/zookeeper/z2/zoo.cfg
    $ ./bin/zkServer.sh start /data/zookeeper/z3/zoo.cfg
    ```
    - Ensure that there are no major errors in the log file:
    - `$ cat zookeeper.out`
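
    The manual steps in this section (create the data directories, write `myid`, then copy and adjust `zoo.cfg`) could also be scripted. The following is a minimal sketch for the single-host layout used here, not a definitive setup script. `write_ensemble_configs` is a hypothetical helper, and the `tickTime`, `initLimit` and `syncLimit` values are assumed to be the defaults from the stock `zoo_sample.cfg`.

    ```shell
    # write_ensemble_configs BASE_DIR
    # Creates BASE_DIR/z{1,2,3}/data/myid and BASE_DIR/z{1,2,3}/zoo.cfg
    # for three ZooKeeper servers running on one host.
    write_ensemble_configs() (
        base=$1
        for id in 1 2 3; do
            mkdir -p "$base/z$id/data"
            echo "$id" > "$base/z$id/data/myid"
            {
                echo "tickTime=2000"    # assumed zoo_sample.cfg defaults
                echo "initLimit=10"
                echo "syncLimit=5"
                echo "dataDir=$base/z$id/data"
                echo "clientPort=$((2180 + id))"   # 2181, 2182, 2183
                echo "server.1=127.0.0.1:2222:2223"
                echo "server.2=127.0.0.1:3333:3334"
                echo "server.3=127.0.0.1:4444:4445"
            } > "$base/z$id/zoo.cfg"
        done
    )

    # Try it in a throwaway directory and inspect one generated file.
    demo=$(mktemp -d)
    write_ensemble_configs "$demo"
    cat "$demo/z2/zoo.cfg"
    rm -rf "$demo"
    ```

    For the layout in this guide, the call would be `write_ensemble_configs /data/zookeeper`, followed by the same `zkServer.sh start` commands as above.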

    ## Configuring Solr
    - `cd /opt/solr`
    - Start three Solr instances and have them point at our Zookeeper instances:

    ```shell
    # If you are running the Zookeeper servers on remote machines, use
    # the IP address of each server instead of 127.0.0.1.
    $ ./bin/solr start -c -p 8983 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    $ ./bin/solr start -c -p 8984 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    $ ./bin/solr start -c -p 8985 -z 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    ```
    - Upload our collection configuration to ZooKeeper:

    ```
    $ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
    -zkhost 127.0.0.1:2181 \
    -confdir ./server/solr/configsets/data_driven_schema_config./s/conf/ \
    -confname my-config
    ```
    - Create a Solr collection using the uploaded configuration.
    `curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-colection&numShards=2&replicationFactor=1&collection.configName=my-config'
    `
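
    The ZooKeeper connection string and the CREATE request above repeat the same values several times. As a sketch of how they could be assembled from parameters, the two helpers below (both hypothetical, not part of Solr) only build and print the strings; they do not contact Solr or ZooKeeper.

    ```shell
    # Join a host and a list of client ports into the comma-separated
    # connection string passed to `bin/solr start -z`.
    zk_hosts() (
        host=$1; shift
        list=
        for port in "$@"; do
            list="${list:+$list,}$host:$port"
        done
        echo "$list"
    )

    # Build the Collections API CREATE URL from its parts.
    create_collection_url() (
        solr=$1 name=$2 shards=$3 replicas=$4 config=$5
        echo "$solr/admin/collections?action=CREATE&name=$name&numShards=$shards&replicationFactor=$replicas&collection.configName=$config"
    )

    zk_hosts 127.0.0.1 2181 2182 2183
    # prints: 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183
    create_collection_url http://localhost:8983/solr my-colection 2 1 my-config
    ```

    When passing the URL to `curl`, keep it quoted, for example `curl "$(create_collection_url ...)"`, so that the shell does not interpret each `&` as a background operator.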