Caucho maker of Resin Server | Application Server (Java EE Certified) and Web Server


 

Resin Documentation


health checking


Resin Professional includes a powerful and configurable system for monitoring application server health. The system is intentionally similar to Resin's "URL Rewrite" rules: it is based on a configurable set of checks, conditions, and actions. The health checking system runs inside Resin on a periodic basis. Checks are generally performed on the local Resin node, and any resulting actions are performed against the local Resin node as well.

Configuration

Health configuration is an extension of the standard Resin configuration file resin.xml. Because Resin uses CanDI to create and update Java objects, each XML tag exactly matches either a Java class or a Java property. As a result, the HealthSystem JavaDoc and the JavaDoc of the various checks, actions, and predicates are a useful supplement to this reference.

health.xml

Resin version 4.0.16 and later includes health.xml as a standard Resin configuration file alongside resin.xml and app-default.xml.

health.xml is imported into resin.xml as a child of <cluster> or <cluster-default>.

Example: importing health.xml into resin.xml
<resin xmlns="http://caucho.com/ns/resin"
       xmlns:resin="urn:java:com.caucho.resin">
  <cluster-default>  
    ...
    <!--
       - Admin services
      -->
    <resin:DeployService/>
    
    <resin:if test="${resin.professional}">
      <resin:AdminServices/>
    </resin:if>

    <!--
       - Configuration for the health monitoring system
      -->
    <resin:if test="${resin.professional}">
      <resin:import path="${__DIR__}/health.xml" optional="true"/>
    </resin:if>
    ...
  </cluster-default>
</resin>
Example: simple health.xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:Restart>
    <health:IfHealthFatal/>
  </health:Restart>
  
</cluster>

health: namespace

health.xml introduces a new XML namespace, health:, defined by xmlns:health="urn:java:com.caucho.health". health: separates health objects from standard resin: elements for clarity and performance. The packages referenced by health: are:

Health check naming

ee: namespace

The ee: namespace is used for naming objects, for example ee:Named="someName", so that they may be referred to by name later in the configuration. This is sometimes necessary as some health conditions permit referencing a specific health check, as demonstrated in the following example.

Example: referencing named objects
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:HttpStatusHealthCheck ee:Named="pingJspCheck">
    <url>http://localhost:8080/test-ping.jsp</url>
  </health:HttpStatusHealthCheck>
  
  <health:Restart>
    <health:IfHealthCritical healthCheck="${pingJspCheck}"/>
    <health:IfRechecked/>
  </health:Restart>
  
</cluster>

In this example, an instance of HttpStatusHealthCheck is named 'pingJspCheck' and referred to by name in the IfHealthCritical criteria using an EL expression. The Restart action will only trigger if the health status is CRITICAL for this specific health check and no others.

Default names

All health check classes are annotated with @Named, and therefore have a default name that corresponds to their bean name. For example, <health:CpuHealthCheck/> can be referred to by ${cpuHealthCheck} without the use of ee:Named.

Example: default health check name
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:CpuHealthCheck>
    <warning-threshold>95</warning-threshold>
  </health:CpuHealthCheck>
  
  <health:DumpThreads>
    <health:IfHealthWarning healthCheck="${cpuHealthCheck}"/>
  </health:DumpThreads>  
  
</cluster>

Duplicate names

Duplicate health check names are not permitted. In this case Resin will fail to start up due to invalid configuration. This can be caused by configuring duplicate checks without using ee:Named, or by configuring more than one check with the same name. The following examples demonstrate both illegal cases.

Example: illegal unnamed duplicate checks
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:HttpStatusHealthCheck>
    <url>http://localhost:8080/test1.jsp</url>
  </health:HttpStatusHealthCheck>
  
  <health:HttpStatusHealthCheck>
    <url>http://localhost:8080/test2.jsp</url>
  </health:HttpStatusHealthCheck>
  
</cluster>

In the preceding example, use of ee:Named is required.
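A corrected version of the preceding configuration might look like the following sketch. The names test1Check and test2Check are hypothetical and chosen only for illustration; any two distinct names should work.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <!-- Each check of the same type gets its own name (names are illustrative) -->
  <health:HttpStatusHealthCheck ee:Named="test1Check">
    <url>http://localhost:8080/test1.jsp</url>
  </health:HttpStatusHealthCheck>

  <health:HttpStatusHealthCheck ee:Named="test2Check">
    <url>http://localhost:8080/test2.jsp</url>
  </health:HttpStatusHealthCheck>

</cluster>
```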

Example: illegal duplicate names
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:HttpStatusHealthCheck ee:Named="healthCheck">
    <url>http://localhost:8080/test1.jsp</url>
  </health:HttpStatusHealthCheck>
  
  <health:CpuHealthCheck ee:Named="healthCheck">
    <warning-threshold>95</warning-threshold>
  </health:CpuHealthCheck>
  
</cluster>

In the preceding example, the health check names must be different, regardless of the type of check.

Default health configuration

If for any reason you are missing health.xml, for example because you are upgrading from an older version of Resin and don't have the health.xml import in resin.xml, there's no need to worry. Resin creates some checks by default regardless of the presence of health.xml. Furthermore, Resin will detect if no checks are configured and set up default actions and conditions.

Standard health checks

The following health checks are considered critical to standard operation and thus will be created by Resin regardless of the presence of health.xml. If you wish to disable any of these standard health checks, configure the check in health.xml and set the attribute enabled="false".
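As a minimal sketch of disabling a check this way, the fragment below assumes that CpuHealthCheck is one of the standard checks; substitute whichever standard check you actually want to turn off.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <!-- Assumption: CpuHealthCheck is among the standard checks.
       enabled="false" turns the check off. -->
  <health:CpuHealthCheck enabled="false"/>

</cluster>
```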

Default actions

If any health checks are configured besides the standard checks mentioned above, Resin will assume the user is using health.xml and will not set up any health actions. If, however, health.xml is missing or empty, the following basic actions will be created.

  <health:Restart>
    <health:IfHealthFatal/>
  </health:Restart>

Health checks

Health checks are status monitors that the health system executes on a periodic basis to determine an individual health status. Health checks are designed to be simple: each one repeatedly evaluates the same data. The health system determines an overall Resin health status by aggregating the results of all the configured health checks.

Health status

Every time a health check executes it produces a HealthStatus and a message. The following is a list of all health statuses and their generally implied meaning.

HealthStatus

NAME      ORDINAL VALUE  DESCRIPTION
UNKNOWN   0              Health check has not yet executed or failed to execute properly; status is inconclusive.
OK        1              Health check reported healthy status. This does not imply recovery.
WARNING   2              Health check reported warning threshold reached or critical is possible.
CRITICAL  3              Health check reported critical status; action should be taken.
FATAL     4              Health check reported fatal; restart expected.

The descriptions above should be understood to be entirely dependent on health action and predicate configuration. For example, a FATAL status does not imply a restart will occur unless health:Restart is configured with the health:IfHealthFatal predicate, as it is in the default health.xml.

System checks

System checks are health checks that can only exist once per JVM due to the nature of the data they sample. Most system checks are pre-configured in the default health.xml.

Note: System checks are singletons. Configuring duplicate system checks with different names will not result in the creation of duplicate system checks. The following is technically valid configuration, but results in configuring the same system check twice.

Example: duplicate system checks
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:CpuHealthCheck ee:Named="cpuCheck1">
    <warning-threshold>95</warning-threshold>
  </health:CpuHealthCheck>
  
  <health:CpuHealthCheck ee:Named="cpuCheck2">
    <warning-threshold>99</warning-threshold>
  </health:CpuHealthCheck>

</cluster>

In this example, warning-threshold will be set to 95 and then overridden with 99.

User checks

User checks are not pre-defined in health.xml; an administrator must configure them in health.xml as appropriate for an application. User checks are not singletons; the same check type can be configured in health.xml more than once provided they have different names.

Example: duplicate user checks
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <!-- Http status check 1 for database with email to database admin -->
  
  <health:HttpStatusHealthCheck ee:Named="databaseCheck">
    <url>http://localhost:8080/databaseCheck.jsp</url>
  </health:HttpStatusHealthCheck>
  
  <health:SendMail>
    <to>database_team@yourdomain.com</to>
    <health:IfHealthCritical healthCheck="${databaseCheck}"/>
    <health:IfRechecked/>
  </health:SendMail>
  
  <!-- Http status check 2 for application with email to application admin -->

  <health:HttpStatusHealthCheck ee:Named="appCheck">
    <url>http://localhost:8080/applicationTest.jsp</url>
  </health:HttpStatusHealthCheck>
  
  <health:SendMail>
    <to>app_team@yourdomain.com</to>
    <health:IfHealthCritical healthCheck="${appCheck}"/>
    <health:IfRechecked/>
  </health:SendMail>

</cluster>

Health actions

Health actions perform a task, usually in response to specific conditions, or as remediation for a health check status. Like health checks, health actions are configured in health.xml and executed by the health system on a periodic basis. Health actions are usually accompanied by one or more conditions, or predicates, but this is not required. All actions have the potential to be executed once per period, determined by evaluation of associated conditions. A health action with no conditions will execute once per period.
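To illustrate the difference between conditional and unconditional actions, the following sketch pairs a DumpThreads action that has no conditions with a Restart action guarded by IfHealthFatal. It is illustrative only; an unconditional thread dump every period is unlikely to be what you want in production.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <!-- No conditions: this action executes once per period -->
  <health:DumpThreads/>

  <!-- One condition: this action executes only when health is FATAL -->
  <health:Restart>
    <health:IfHealthFatal/>
  </health:Restart>

</cluster>
```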

Health conditions

Health conditions, or predicates, qualify an action to execute based on a set of criteria. The action/condition pattern is intentionally similar to Resin's rewrite dispatch/condition pattern, so it should be familiar to some users. Health actions are evaluated every period. Conditions prevent the execution of an action unless all conditions evaluate to true. A health action with no conditions will execute once per period. When more than one condition is present for an action, the default combining condition is <health:And>.
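The default combining behavior can be made explicit. The sketch below assumes that <health:And> accepts other conditions as child elements; under that assumption it should behave the same as listing IfHealthCritical and IfRechecked directly inside the action.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <!-- Explicit And (assumed nesting): restart only when health is CRITICAL
       and the status has been rechecked. Omitting the And wrapper and
       listing both conditions directly should give the same result. -->
  <health:Restart>
    <health:And>
      <health:IfHealthCritical/>
      <health:IfRechecked/>
    </health:And>
  </health:Restart>

</cluster>
```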

Basic conditions

Basic conditions evaluate some general criteria and return true if the condition matches. Basic conditions do not evaluate the status of a health check. Instead they evaluate some general criteria like the time of day.

Combining conditions

Basic conditions and health check conditions can be combined or negated using these conditions.
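As a hedged sketch of combining and negating, the fragment below assumes that combining conditions named <health:Or> and <health:Not> exist and accept nested conditions, by analogy with <health:And>; this section does not list them, so verify the names against the JavaDoc. The address admin@yourdomain.com and the name pingJspCheck are placeholders.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:HttpStatusHealthCheck ee:Named="pingJspCheck">
    <url>http://localhost:8080/test-ping.jsp</url>
  </health:HttpStatusHealthCheck>

  <!-- Assumed element health:Or: send mail if either condition is true,
       but only after a recheck (the outer conditions are combined with And) -->
  <health:SendMail>
    <to>admin@yourdomain.com</to>
    <health:Or>
      <health:IfHealthCritical healthCheck="${pingJspCheck}"/>
      <health:IfHealthWarning healthCheck="${cpuHealthCheck}"/>
    </health:Or>
    <health:IfRechecked/>
  </health:SendMail>

  <!-- Assumed element health:Not: dump threads on a CPU warning
       unless the ping check is critical -->
  <health:DumpThreads>
    <health:IfHealthWarning healthCheck="${cpuHealthCheck}"/>
    <health:Not>
      <health:IfHealthCritical healthCheck="${pingJspCheck}"/>
    </health:Not>
  </health:DumpThreads>

</cluster>
```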

Health check conditions

All health check conditions evaluate some aspect of the results of a health check. All optionally accept the parameter health-check, which can reference a specific named health check. In absence of this parameter, overall aggregated Resin health will be used.
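The following sketch contrasts the two forms, using only elements that appear elsewhere on this page. The first action evaluates overall Resin health; the second evaluates only the check named pingJspCheck.

```xml
<cluster xmlns="http://caucho.com/ns/resin"
         xmlns:resin="urn:java:com.caucho.resin"
         xmlns:health="urn:java:com.caucho.health"
         xmlns:ee="urn:java:ee">

  <health:HttpStatusHealthCheck ee:Named="pingJspCheck">
    <url>http://localhost:8080/test-ping.jsp</url>
  </health:HttpStatusHealthCheck>

  <!-- No health check referenced: uses the aggregated Resin health status -->
  <health:Restart>
    <health:IfHealthFatal/>
  </health:Restart>

  <!-- Health check referenced: uses the status of pingJspCheck only -->
  <health:DumpThreads>
    <health:IfHealthCritical healthCheck="${pingJspCheck}"/>
  </health:DumpThreads>

</cluster>
```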

Lifecycle conditions

Lifecycle conditions evaluate the current state of Resin, qualifying actions to execute only during a Resin lifecycle state change.


Copyright © 1998-2015 Caucho Technology, Inc. All rights reserved. Resin® is a registered trademark. Quercus™ and Hessian™ are trademarks of Caucho Technology.