In a deployment that provides high availability, multiple Liberty servers are used to provide active/active availability. In the active/active pattern, multiple Liberty servers accept connections from clients, and the deployment continues to function when some of the Liberty servers fail.

Detecting server failure

To detect whether a Liberty server has failed, you can use the logs in your load balancer. The load balancer uses the health/live endpoint to determine which Liberty servers to route clients to, so monitor the responses from that endpoint. If the response from the health/live endpoint for a particular Liberty server is 503, that Liberty server might have failed.

For more information about the endpoint and its responses, see The health/live endpoint.
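
The check that the load balancer performs can also be reproduced by hand when you investigate a suspect server. The following Python sketch is illustrative only and is not part of i2 Analyze: the function names are hypothetical, and the URL that you pass to probe must be the full health/live URL for your own deployment. It maps a 200 response to "live" and a 503 response to "possibly failed", as described above.

```python
import urllib.error
import urllib.request


def classify_liveness(status_code):
    """Map an HTTP status from the health/live endpoint to a coarse state."""
    if status_code == 200:
        return "live"
    if status_code == 503:
        # A 503 does not prove failure: the server might still be starting.
        return "possibly failed"
    return "unknown"


def probe(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived.

    The url argument is an assumption: supply the health/live URL that
    your load balancer is configured to use.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the status is on the error.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return None
```

For example, classify_liveness(probe(url)) returns "unknown" when the server cannot be reached at all, which is a separate case from a server that responds with 503.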

The Liberty server can return a 503 if it is not started, if it is in the startup process, or if it temporarily loses the connection to another component. Liberty can restart as a non-leader if it temporarily loses the connection to a component and loses leadership. If this is the reason for the 503, the following message is displayed in the IBM_i2_Component_Availability.log file:
INFO  ApplicationStateHandler        - I2ANALYZE_STATUS:0066 - Application is entering non-leader mode to serve requests
For more information about the log file, see Monitor the system availability.
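
To check for this message without reading the whole file, you can search the log for the I2ANALYZE_STATUS:0066 identifier. The following Python sketch shows one way to do that. The function names are hypothetical, and the location of the IBM_i2_Component_Availability.log file is not assumed here: pass the path that applies to your deployment.

```python
# Identifier taken from the message shown above.
NON_LEADER_CODE = "I2ANALYZE_STATUS:0066"


def find_non_leader_entries(lines):
    """Return the log lines that report entry into non-leader mode."""
    return [line.rstrip("\n") for line in lines if NON_LEADER_CODE in line]


def scan_log(path):
    """Scan the availability log at path (supplied by the caller)."""
    # errors="replace" avoids a crash on unexpected bytes in the log.
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_non_leader_entries(log_file)
```

If scan_log returns an empty list for a server that is returning 503, the cause is likely to be something other than a loss of leadership.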

Automatic failover

If a Liberty server fails, the other Liberty servers continue to function as usual. If a client was connected to the failed Liberty server, the analyst might have to log out from the client and log in again so that the load balancer routes their requests to one of the live Liberty servers.

If it was the leader Liberty server that failed, one of the remaining Liberty servers is elected leader. For more information about the leadership process, see Liberty leadership configuration.

Recovering failed servers

There are a number of reasons why a server might fail. Use the logs from the failed server to diagnose and solve the issue. For example, you might need to restart the server, increase the hardware specification, or replace hardware components.
  • The Liberty logs are in the deploy\wlp\usr\servers\opal-server\logs directory.
  • The deployment toolkit logs are in the toolkit\configuration\logs directory.

For more information about the different log files and their contents, see Deployment log files.

Reinstating high availability

On the recovered Liberty server, run setup -t startLiberty to restart the server and the i2 Analyze application.

You can use the load balancer logs to confirm that the Liberty server is now returning 200 from the health/live endpoint.
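
Because a Liberty server can return 503 while it is still in the startup process, the first check after a restart might not return 200. The following Python sketch repeats the check until it sees a 200 or runs out of attempts. It is an illustration under stated assumptions, not an i2 Analyze tool: wait_until_live is a hypothetical name, get_status is any function that you supply to return the current HTTP status from the health/live endpoint, and the default attempt count and delay are arbitrary.

```python
import time


def wait_until_live(get_status, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Call get_status until it returns 200.

    Returns the number of the attempt that succeeded, or None if every
    attempt returned something other than 200.
    """
    for attempt in range(1, attempts + 1):
        if get_status() == 200:
            return attempt
        if attempt < attempts:
            # Pause before the next check; no pause after the last one.
            sleep(delay_seconds)
    return None
```

A result of None suggests that the server did not finish starting in the time allowed, in which case the Liberty logs are the next place to look.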