Monitor the system availability
i2 Analyze reports on the availability of each component in the environment so that the state can be monitored. The availability is logged on each Liberty server in the environment.
To monitor the status of your deployment, use the i2_Component_Availability.log file. This log file is located in the wlp/servers/opal-server/opal-services/logs directory on each Liberty server in your deployment. Because each Liberty server writes its own log file, you must monitor the file on every Liberty server in your deployment.
The log contains messages about the Liberty leadership status, Solr cluster status, and the availability of each component.
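As a minimal sketch of how this log might be read programmatically, the following Python splits each entry into level, logger, and message. It assumes that every entry contains the "LEVEL LoggerName - message" shape used in the examples in this topic, possibly preceded by a timestamp or other prefix. The names LogEvent, parse_events, and warnings_in are hypothetical and are not part of i2 Analyze.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple


class LogEvent(NamedTuple):
    level: str
    logger: str
    message: str


# Assumed entry shape, based on the examples in this topic:
#   [optional prefix] LEVEL LoggerName - message
_ENTRY = re.compile(r"\b(INFO|WARN)\s+(\w+)\s+-\s+(.*)$")


def parse_events(lines: Iterable[str]) -> Iterator[LogEvent]:
    """Yield one LogEvent per line that matches the assumed shape."""
    for line in lines:
        match = _ENTRY.search(line.rstrip("\r\n"))
        if match:
            yield LogEvent(*match.groups())


def warnings_in(lines: Iterable[str]) -> List[LogEvent]:
    """Return only the WARN entries, which signal an availability problem."""
    return [event for event in parse_events(lines) if event.level == "WARN"]
```

In practice, you might pass an open file handle for i2_Component_Availability.log from each Liberty server to warnings_in. Lines that do not match the assumed shape, such as exception detail, are skipped.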
Liberty leadership messages
If the Liberty server starts the election process, the following message is displayed:
INFO ServerLeadershipMonitor - We are running for Liberty leader
If the Liberty server is elected leader, the following messages are displayed:
INFO ServerLeadershipMonitor - We are the Liberty leader
INFO ApplicationStateHandler - I2ANALYZE_STATUS:0065 - Application is entering leader mode
If the Liberty server is not elected leader, the following messages are displayed:
INFO ServerLeadershipMonitor - We are not the Liberty leader
INFO ApplicationStateHandler - I2ANALYZE_STATUS:0063 - Application is entering non-leader mode to serve requests
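The leadership messages above can be reduced to a single current state for a server. The following Python is a sketch under the assumption that the message text appears in the log exactly as in the examples; the function name and the state labels "running", "leader", and "non-leader" are hypothetical.

```python
from typing import Iterable, Optional

# Message fragments copied from the examples above, mapped to
# hypothetical state labels chosen for this sketch.
_LEADERSHIP_MESSAGES = {
    "We are running for Liberty leader": "running",
    "We are the Liberty leader": "leader",
    "We are not the Liberty leader": "non-leader",
}


def leadership_state(lines: Iterable[str]) -> Optional[str]:
    """Return the most recent leadership state seen, or None if none was logged."""
    state = None
    for line in lines:
        if "ServerLeadershipMonitor" not in line:
            continue
        for fragment, label in _LEADERSHIP_MESSAGES.items():
            if fragment in line:
                state = label
    return state
```

Because later entries overwrite earlier ones, the result reflects the last leadership message in the lines that were read.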
Component messages
If all of the components are available to the Liberty server, the following message is displayed:
INFO ComponentAvailabilityCheck - All components are available
If at least one of the components is not available to the Liberty server, the following message is displayed:
WARN ComponentAvailabilityCheck - Not all components are available
While the Liberty server continues to check the availability of each component, the following message is displayed:
INFO ComponentAvailabilityCheck - I2ANALYZE_STATUS:0068 - The application is waiting for all components to be available
Solr messages
If the Liberty server cannot connect to the Solr cluster, the following messages are displayed:
WARN ComponentAvailabilityCheck - Unable to connect to Solr cluster
WARN ComponentAvailabilityCheck - The Solr cluster is unavailable
The following messages describe the state of the Solr cluster in the deployment:
ALL_REPLICAS_ACTIVE - The named Solr collection is healthy. All replicas in the collection are active.
For example:
INFO SolrHealthStatusLogger - 'main_index','ALL_REPLICAS_ACTIVE'
DEGRADED - The named Solr collection is degraded. The minimum replication factor can be achieved, but at least one replica is down or failed to recover.
For example:
WARN SolrHealthStatusLogger - 'main_index','DEGRADED'
When the status is DEGRADED, data can still be written to the Solr index. When the status returns to ALL_REPLICAS_ACTIVE, the data is synchronized in the Solr index as described in Data synchronization in i2 Analyze.
The deployment can still be used while the collection is in a degraded state, but it can sustain fewer Solr server failures. If a degraded state is common or lasts for an extended time, investigate the Solr logs to improve the stability of the system.
RECOVERING - The named Solr collection is recovering. The minimum replication factor cannot be achieved, because too many replicas are currently in recovery mode.
For example:
INFO SolrHealthStatusLogger - 'main_index','RECOVERING'
If all of the replicas recover, the status changes to ALL_REPLICAS_ACTIVE. If the replicas fail to recover, the status changes to DEGRADED or DOWN.
DOWN - The named Solr collection is down. The minimum replication factor cannot be achieved because too many replicas are down or have failed to recover.
For example:
WARN SolrHealthStatusLogger - 'main_index','DOWN'
When the status is DOWN, data cannot be written to the Solr index. You should attempt to resolve the issue. For more information, see Solr.
UNAVAILABLE - The named Solr collection is unavailable. The application cannot connect to the collection.
For example:
WARN SolrHealthStatusLogger - 'main_index','UNAVAILABLE'
When the status is UNAVAILABLE, data cannot be written to the Solr index. You should attempt to resolve the issue. For more information, see Solr.
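The status entries above can be tracked per collection. The following Python is a minimal sketch that assumes the entries follow the 'collection','STATUS' format shown in the examples. It keeps the last status logged for each collection and lists the collections in a DOWN or UNAVAILABLE state, which are the two states in which data cannot be written. The function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Matches entries such as: WARN SolrHealthStatusLogger - 'main_index','DOWN'
_SOLR_STATUS = re.compile(r"SolrHealthStatusLogger\s+-\s+'([^']+)','([A-Z_]+)'")

# Per the descriptions above, data cannot be written in these states.
_WRITE_BLOCKING = {"DOWN", "UNAVAILABLE"}


def latest_solr_status(lines: Iterable[str]) -> Dict[str, str]:
    """Map each collection name to the last status logged for it."""
    latest: Dict[str, str] = {}
    for line in lines:
        match = _SOLR_STATUS.search(line)
        if match:
            collection, status = match.groups()
            latest[collection] = status
    return latest


def collections_needing_action(lines: Iterable[str]) -> List[str]:
    """Return, sorted by name, collections whose latest status blocks writes."""
    return sorted(
        name
        for name, status in latest_solr_status(lines).items()
        if status in _WRITE_BLOCKING
    )
```

A DEGRADED or RECOVERING collection is not reported by collections_needing_action, although a degraded state that persists might still warrant investigation as described above.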
ZooKeeper status messages
If the connection to ZooKeeper is lost, the following messages are displayed:
WARN ComponentAvailabilityCheck - Unable to connect to ZooKeeper
WARN ComponentAvailabilityCheck - The ZooKeeper quorum is unavailable
When the connection to ZooKeeper is restored, the "All components are available" message is displayed.
Database management system messages
If the connection to the database is lost, the following messages are displayed:
WARN ComponentAvailabilityCheck - Information Store database appears to be offline, retrying...
WARN ComponentAvailabilityCheck - Unable to connect to the Information Store database.
java.sql.SQLException: The TCP/IP connection to the host has failed.
Error: "Socket operation on nonsocket: configureBlocking. Verify the connection properties.
Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port.
Make sure that TCP connections to the port are not blocked by a firewall.". DSRA0010E: SQL State = 08S01, Error Code = 0
WARN ComponentAvailabilityCheck - The Information Store database is unavailable
When the connection to the database is restored, the "All components are available" message is displayed.
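The Solr, ZooKeeper, and database messages in this topic share the form "The <component> is unavailable". As a sketch, the following Python tracks which components are currently reported as unavailable. It assumes that recovery is signaled by the "All components are available" entry shown under Component messages, and that the message text matches the examples exactly. The function name is hypothetical.

```python
import re
from typing import Iterable, Set

# Matches entries such as:
#   WARN ComponentAvailabilityCheck - The ZooKeeper quorum is unavailable
_UNAVAILABLE = re.compile(r"ComponentAvailabilityCheck\s+-\s+The (.+) is unavailable")

# Entry that reports that every component is available again.
_ALL_AVAILABLE = "ComponentAvailabilityCheck - All components are available"


def unavailable_components(lines: Iterable[str]) -> Set[str]:
    """Return the components reported unavailable since the last all-clear."""
    current: Set[str] = set()
    for line in lines:
        if _ALL_AVAILABLE in line:
            current.clear()
            continue
        match = _UNAVAILABLE.search(line)
        if match:
            current.add(match.group(1))
    return current
```

An empty result means that either no component was reported as unavailable or the all-clear entry was the most recent relevant message.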