Configuring Solr for HADR

The i2 Analyze configuration specifies the structure of your Solr cluster. To provide high availability, configure a Solr cluster over at least two Solr servers.

About this task

In the i2 Analyze configuration, provide details of the Solr servers in your environment. To keep the content of the Solr indexes highly available, specify the replication factor (the number of replicas), the minimum replication factor, and the location of the replicas.

For more information about SolrCloud, see How SolrCloud Works.

Procedure

  1. Specify the Solr nodes.
    To deploy Solr for high availability, you must have at least two Solr nodes on separate hosts.

    1. Add the Solr nodes to your topology.xml file.
      For example:

      <solr-nodes>
        <solr-node
          memory="2g"
          data-dir="C:/i2/i2analyze/data/solr"
          host-name="solr_server1_host_name"
          id="node1"
          port-number="8983"
        />
        <solr-node
          memory="2g"
          data-dir="C:/i2/i2analyze/data/solr"
          host-name="solr_server2_host_name"
          id="node2"
          port-number="8983"
        />
      </solr-nodes>

      Where solr_server1_host_name and solr_server2_host_name are the host names of your Solr servers.
      For more information about the possible values for each attribute, see Solr and ZooKeeper.
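
      If your environment has more than two Solr servers, each extra server gets its own <solr-node> element inside the existing <solr-nodes> element. The following is a minimal sketch of a third node, not a required configuration. The placeholder solr_server3_host_name and the id node3 are hypothetical; the other attribute values are copied from the nodes in the example.

      ```xml
      <!-- Hypothetical third node: replace the placeholder with your own host name -->
      <solr-node
        memory="2g"
        data-dir="C:/i2/i2analyze/data/solr"
        host-name="solr_server3_host_name"
        id="node3"
        port-number="8983"
      />
      ```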

  2. Configure Solr replicas.
    In Solr, the data is stored as documents in shards. Every shard consists of at least one replica. For a highly available solution, you must have more than one replica of each shard and these replicas must be distributed across the servers that host the Solr nodes.

    1. Specify the replication factor in your topology.xml file.
      The replication factor is the number of replicas to create for each shard. For high availability, this value must be 2 or more. You specify the replication factor in the num-replicas attribute of the <solr-collection> element.

    2. Specify the minimum replication factor in your topology.xml file.
      The minimum replication factor is the number of replicas that must receive a write before Solr treats the data as successfully replicated. For example, if you have three replicas for a shard and a minimum replication factor of 2, a write operation is deemed successful if the data is written to at least two replicas. You specify the minimum replication factor in the min-replication-factor attribute of the <solr-collection> element.
      The following extract from a topology.xml file shows an example of the num-replicas and min-replication-factor attributes:

      <solr-collections>
        <solr-collection
          num-replicas="2"
          min-replication-factor="1"
          id="main_index"
          type="main"
          num-shards="1"
        />
      ...
      </solr-collections>
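
      The three-replica case described above can be sketched in the same form. This fragment assumes the same collection id, type, and shard count as the extract; the values 3 and 2 come from the worked example in the text and are not a recommended configuration. With these settings, a write is deemed successful when at least two of the three replicas of the shard have it.

      ```xml
      <solr-collections>
        <!-- Three replicas of the single shard; a write must reach at least two of them -->
        <solr-collection
          num-replicas="3"
          min-replication-factor="2"
          id="main_index"
          type="main"
          num-shards="1"
        />
      </solr-collections>
      ```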

  3. Configure replica placement by modifying the configuration/solr/solr.replica.placement.plugin.json file.

    By default, i2 Analyze uses Solr's Affinity Placement plugin to place the Solr replicas. This plugin attempts to avoid placing all the replicas of a shard on a single node. To aid the placement of replicas across your Solr servers, the host name of each Solr node is used as its availability zone.
    For more information about the Solr plugins, see Replica Placement Plugins.

    The replica placement is updated when you create the Solr cluster as part of the deployment steps. Additionally, you can update the replica placement by running the updateSolrReplicaPlacementPlugin toolkit task.

What to do next

Continue with the rest of the i2 Analyze configuration. For more information, see Deploying i2 Analyze with high availability.