Solr and ZooKeeper
i2 Analyze uses Solr for text indexing and search capabilities. The ZooKeeper service maintains configuration information and provides distributed synchronization across Solr and Liberty.
The topology.xml file for a deployment that includes the opal-server application also includes the <zookeepers> and <solr-clusters> elements. These elements define the Solr cluster that the deployment uses and the ZooKeeper instance that manages it.
<solr-clusters>
In the supplied topology.xml file that includes the opal-server application with the opal-services-is WAR, the <solr-clusters> definition is:
<solr-clusters>
  <solr-cluster id="is_cluster" zookeeper-id="zoo">
    <solr-collections>
      <solr-collection
        id="main_index" type="main"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="match_index1" type="match"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="match_index2" type="match"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="highlight_index" type="highlight"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="chart_index" type="chart"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="vq_index" type="vq"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
      <solr-collection
        id="recordshare_index" type="recordshare"
        lucene-match-version=""
        max-shards-per-node="4" num-shards="4" num-replicas="1"
      />
    </solr-collections>
    <solr-nodes>
      <solr-node
        memory="2g"
        id="node1"
        host-name=""
        data-dir=""
        port-number="8983"
      />
    </solr-nodes>
  </solr-cluster>
</solr-clusters>
The <solr-clusters> element includes a child <solr-cluster> element. The id attribute of the <solr-cluster> element is a unique identifier for the Solr cluster. To associate the Solr cluster with the ZooKeeper instance, the value of the zookeeper-id attribute must match the value of the id attribute of the <zookeeper> element.
<solr-collections>
The <solr-collections> element is a child of the <solr-cluster> element. The <solr-collections> element has child <solr-collection> elements.
The number and type of required child <solr-collection> elements depend on the WARs that are included in the application.
opal-services-is
For the opal-services-is WAR, you must have <solr-collection> elements with each of the following values for the type attribute:
main - you must have either one or two collections of type main
match - you must have two collections of type match
highlight - you must have one collection of type highlight
chart - you must have one collection of type chart
vq - you must have one collection of type vq
recordshare - you must have one collection of type recordshare
opal-services-daod
For the opal-services-daod WAR, you must have one <solr-collection> element with a value of daod for the type attribute.
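As an illustration, the <solr-collections> element for a deployment that includes only the opal-services-daod WAR might look like the following sketch. The id value daod_index is a hypothetical name chosen for this example. The shard and replica values copy the ones in the supplied topology.xml file, and lucene-match-version is left empty because it is populated at deployment.

```xml
<solr-collections>
  <!-- Assumption: "daod_index" is an illustrative id, not a required name -->
  <solr-collection
    id="daod_index" type="daod"
    lucene-match-version=""
    max-shards-per-node="4" num-shards="4" num-replicas="1"
  />
</solr-collections>
```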
opal-services-is-daod
For the opal-services-is-daod WAR, you must have <solr-collection> elements with each of the following values for the type attribute:
main - you must have either one or two collections of type main
daod - you must have one collection of type daod
match - you must have two collections of type match
highlight - you must have one collection of type highlight
chart - you must have one collection of type chart
vq - you must have one collection of type vq
recordshare - you must have one collection of type recordshare
The <solr-collection> element has the following attributes:
Attribute | Description |
---|---|
id | An identifier for the Solr collection. |
type | The type of the collection. The possible values are: main, daod, match, highlight, chart, vq, recordshare. |
lucene-match-version | The Lucene version that is used for the collection. At this release, the value is populated when you deploy i2 Analyze. |
num-shards | The number of logical shards that are created as part of the Solr collection. |
num-replicas | The number of physical replicas that are created for each logical shard in the Solr collection. |
max-shards-per-node | The maximum number of shards that are allowed on each Solr node. This value is num-shards multiplied by num-replicas. For example, 4 shards with 1 replica each give a value of 4. |
min-replication-factor | The minimum number of replicas that an update must be replicated to for the operation to succeed. This optional value must be greater than 0 and less than or equal to the value of num-replicas. |
num-csv-write-threads | The number of threads that are used to read from the database and write to the temporary CSV file when indexing data in the Information Store. This attribute is optional, and applies to Solr collections of type main and match only. The total of num-csv-write-threads and num-csv-read-threads must be less than the number of cores available on the Liberty server. |
num-csv-read-threads | The number of threads that are used to read from the temporary CSV file and write to the index when indexing data in the Information Store. This attribute is optional, and applies to Solr collections of type main and match only. This value must be less than the value of num-shards. The total of num-csv-write-threads and num-csv-read-threads must be less than the number of cores available on the Liberty server. |
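The constraints in the table can be illustrated with a sketch of a main collection that uses replicas and the optional attributes. All values here are assumptions chosen for the example, not recommendations: 4 shards with 2 replicas each give a max-shards-per-node value of 8, min-replication-factor lies between 1 and num-replicas, num-csv-read-threads is less than num-shards, and the total of 5 CSV threads assumes a Liberty server with more than 5 cores.

```xml
<!-- Hypothetical values for illustration only -->
<!-- max-shards-per-node = num-shards (4) x num-replicas (2) = 8 -->
<!-- 0 < min-replication-factor (1) <= num-replicas (2) -->
<!-- num-csv-read-threads (3) < num-shards (4) -->
<!-- write threads (2) + read threads (3) = 5, assumed to be less than the Liberty core count -->
<solr-collection
  id="main_index" type="main"
  lucene-match-version=""
  num-shards="4" num-replicas="2"
  max-shards-per-node="8"
  min-replication-factor="1"
  num-csv-write-threads="2"
  num-csv-read-threads="3"
/>
```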
<solr-nodes>
The <solr-nodes> element is a child of the <solr-cluster> element. The <solr-nodes> element can have one or more child <solr-node> elements. Each <solr-node> element has the following attributes:
Attribute | Description |
---|---|
id | A unique identifier for the Solr node. |
memory | The amount of memory that can be used by the Solr node. |
host-name | The hostname of the Solr node. |
data-dir | The location where Solr stores the index. |
port-number | The port number of the Solr node. |
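Because <solr-nodes> can contain more than one <solr-node> element, a cluster that is spread across two servers could be sketched as follows. The host names and data directories are placeholders invented for this example, and the memory and port values are copied from the supplied topology.xml file. Each node needs its own unique id.

```xml
<solr-nodes>
  <!-- Assumption: host names and data directories are placeholders -->
  <solr-node
    memory="2g"
    id="node1"
    host-name="solr1.example.com"
    data-dir="/data/solr/node1"
    port-number="8983"
  />
  <solr-node
    memory="2g"
    id="node2"
    host-name="solr2.example.com"
    data-dir="/data/solr/node2"
    port-number="8983"
  />
</solr-nodes>
```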
<zookeepers>
In the supplied topology.xml file that includes the opal-server application, the <zookeepers> definition is:
<zookeepers>
  <zookeeper id="zoo">
    <zkhosts>
      <zkhost
        id="1"
        host-name=""
        data-dir=""
        port-number="9983"
        quorum-port-number=""
        leader-port-number=""
      />
    </zkhosts>
  </zookeeper>
</zookeepers>
The <zookeepers> element includes a child <zookeeper> element. The id attribute of the <zookeeper> element is a unique identifier for the ZooKeeper instance. To associate the ZooKeeper instance with the Solr cluster, the value of the id attribute must match the value of the zookeeper-id attribute of the <solr-cluster> element.
The <zkhosts> element is a child of the <zookeeper> element. The <zkhosts> element can have one or more child <zkhost> elements. Each <zkhost> element has the following attributes:
Attribute | Description |
---|---|
id | A unique identifier for the ZooKeeper host. This value must be an integer in the range 1 to 255. |
host-name | The hostname of the ZooKeeper host. |
data-dir | The location that ZooKeeper uses to store data. |
port-number | The port number of the ZooKeeper host. |
quorum-port-number | The port number that is used for ZooKeeper quorum communication. By default, the value is 10483. |
leader-port-number | The port number that is used by ZooKeeper for leader election communication. By default, the value is 10983. |
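Because <zkhosts> can contain more than one <zkhost> element, a ZooKeeper instance that runs on three hosts could be sketched as follows. The host names and data directories are placeholders invented for this example. The port-number value is copied from the supplied topology.xml file, and the quorum and leader ports are set explicitly to the default values that the table lists. Each id is a distinct integer in the permitted range, and the id of the <zookeeper> element still matches the zookeeper-id attribute of the <solr-cluster> element.

```xml
<zookeepers>
  <zookeeper id="zoo">
    <zkhosts>
      <!-- Assumption: host names and data directories are placeholders -->
      <zkhost
        id="1"
        host-name="zk1.example.com"
        data-dir="/data/zookeeper/1"
        port-number="9983"
        quorum-port-number="10483"
        leader-port-number="10983"
      />
      <zkhost
        id="2"
        host-name="zk2.example.com"
        data-dir="/data/zookeeper/2"
        port-number="9983"
        quorum-port-number="10483"
        leader-port-number="10983"
      />
      <zkhost
        id="3"
        host-name="zk3.example.com"
        data-dir="/data/zookeeper/3"
        port-number="9983"
        quorum-port-number="10483"
        leader-port-number="10983"
      />
    </zkhosts>
  </zookeeper>
</zookeepers>
```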