Database management system
In a deployment that provides high availability, the Information Store database is deployed in an active/passive pattern. In the active/passive pattern, there is a primary database instance and a number of standbys.
Detecting server failure
There are several ways to detect a database failure.
- The i2_Component_Availability.log contains messages that report the connection status for the Information Store database. For more information about the messages that are specific to the database, see Database management system messages.
- Db2
- Db2 fault monitor on Linux
- Heartbeat monitoring in clustered environments
- Monitoring Db2 High Availability Disaster Recovery (HADR) databases
- SQL Server
- To understand when SQL Server initiates failover, see How Automatic Failover Works.
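In addition to the log file and the vendor tools listed above, a basic reachability probe can give an early signal that a database server is down. The following is a minimal sketch, not part of i2 Analyze or of either database product. The function name `database_port_reachable` and the host and port in the usage comment are hypothetical. A successful TCP connection shows only that something is listening on the port, not that the database is healthy.

```python
import socket


def database_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is only a coarse liveness check. It does not authenticate or run
    a query, so it cannot tell a healthy database from a hung one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical host and port):
#   if not database_port_reachable("db-primary.example.com", 50000):
#       print("Primary database port is not reachable")
```

A probe like this is best treated as a trigger for checking the i2_Component_Availability.log and the database server's own monitoring, rather than as a failover decision in itself.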
Manual and automatic failover
When the primary server fails, the system must fail over and use the remaining servers to complete operations.
- Db2
- If you are using an automated-cluster controller, failover to a standby instance is automatic. For more information, see .
- If you are using client-side automatic client rerouting, then you must manually force a standby instance to become the new primary. For more information about initiating a takeover, see Performing an HADR failover operation.
- SQL Server
- When SQL Server is configured for high availability in an availability group with three servers, failover is automatic. For more information about failover, see Automatic Failover.
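For the manual Db2 case described above, the takeover is typically issued on the standby with the Db2 `TAKEOVER HADR` command. The following is a minimal sketch that only assembles the command line; the function name `build_hadr_takeover_command` and the database alias `ISTORE` are assumptions for illustration. Whether `BY FORCE` is appropriate depends on the state of the HADR pair, so consult the Db2 documentation before running it.

```python
from typing import List


def build_hadr_takeover_command(database_alias: str, force: bool = False) -> List[str]:
    """Build the Db2 CLP arguments for an HADR takeover on a standby.

    Without force this requests a graceful role switch. With force it adds
    BY FORCE, which is used when the primary is no longer available.
    """
    command = ["db2", "takeover", "hadr", "on", "database", database_alias]
    if force:
        command += ["by", "force"]
    return command


# Example usage on the standby server, as the Db2 instance owner
# (ISTORE is a hypothetical database alias):
#   import subprocess
#   subprocess.run(build_hadr_takeover_command("ISTORE", force=True), check=True)
```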
Recovering failed servers
There are a number of reasons why a server might fail. You can use the logs from the database server to diagnose and solve the issue. For example, you might need to restart the server, increase the hardware specification, or replace hardware components that caused the issue.
When the server is back online and functional, you can recover it to become the primary server again. Alternatively, you can recover it to become the standby for the new primary that you failed over to.
Recovery might include restoring a backup of the Information Store database that was taken before the server failed.
Reinstating high availability
- Db2
- Some toolkit tasks only work on the original primary database server when you are not using an automated-cluster controller such as TSAMP. To use those toolkit tasks, you must revert to using the original primary database server after a failure or redeploy your system to use the new primary database server.
- For more information about the process to make the recovered database the primary again, see Reintegrating a database after a takeover operation.
- To redeploy with the new primary, update the topology.xml on each Liberty server to reference the new server in the host-name and port-number attributes, and then redeploy each Liberty server.
- SQL Server
- On SQL Server, you are not required to return to the previous primary, but you might choose to do so. You can initiate a planned manual failover to return to the initial primary server. For more information, see Planned Manual Failover (Without Data Loss).
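The Db2 redeployment step, in which topology.xml is updated to reference the new primary, can be scripted when there are several Liberty servers. The following is a minimal sketch under stated assumptions: that the database connection is described by elements named `database` that carry `host-name` and `port-number` attributes, and that every such element should point at the new primary. The function names are hypothetical. Python's ElementTree discards XML comments and may rewrite namespace prefixes when it saves, so try it on a copy of the file first.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def point_topology_at_new_primary(topology_path: str, new_host: str, new_port: int) -> int:
    """Set host-name and port-number on every <database> element.

    Returns the number of elements updated. Other elements that happen to
    have host-name or port-number attributes are left untouched.
    """
    tree = ET.parse(topology_path)
    updated = 0
    for element in tree.getroot().iter():
        # Assumption: database connections are <database> elements.
        if isinstance(element.tag, str) and local_name(element.tag) == "database":
            element.set("host-name", new_host)
            element.set("port-number", str(new_port))
            updated += 1
    # Overwrites the file in place; XML comments are not preserved.
    tree.write(topology_path, encoding="utf-8", xml_declaration=True)
    return updated


# Example usage (hypothetical path, host, and port):
#   point_topology_at_new_primary("topology.xml", "db-standby.example.com", 50000)
```

After the file is updated on each Liberty server, the servers still need to be redeployed as described above; the script changes only the configuration file.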