
Chapter 7: Planning for Cluster Events


This chapter describes the HACMP cluster events.

Prerequisites

Before reading this chapter, read the overview information about cluster events in the Concepts and Facilities Guide.

By now, you should have completed the planning steps in the previous chapters:

  • Chapter 2: Initial Cluster Planning
  • Chapter 3: Planning Cluster Network Connectivity
  • Chapter 4: Planning Shared Disk and Tape Devices
  • Chapter 5: Planning Shared LVM Components
  • Chapter 6: Planning Resource Groups.

    Overview

    The following sections describe the main types of HACMP events and the logic for recovery actions.

    This chapter contains the following sections:

  • Planning Site and Node Events
  • Planning node_up and node_down Events
  • Network Events
  • Network Interface Events
  • Cluster-Wide Status Events
  • Resource Group Event Handling and Recovery
  • Customizing Cluster Event Processing
  • Custom Remote Notification of Events
  • Customizing Event Duration Time Until Warning
  • User-Defined Events
  • Event Summaries and Preamble
  • Completing the Cluster Event Worksheet

    HACMP provides two ways to manage event processing:

  • Customize predefined events
  • Define new events.

    In HACMP 5.4, resource groups are processed in parallel by default, if possible, unless you specify a customized serial processing order for all or some of the resource groups in the cluster.

    The logic and sequence of events as described in examples may not list all events. For additional information on event processing, see Appendix B: Resource Group Behavior during Cluster Events in the Administration Guide and Chapter 2: Using Cluster Log Files in the Troubleshooting Guide.

    See Chapter 8: Starting and Stopping Cluster Services in the Administration Guide for information on:

  • Steps you take to start and stop cluster services
  • Interaction with the AIX 5L shutdown command and interaction of HACMP cluster services with RSCT.

    Planning Site and Node Events

    Defining a site is optional unless you have installed HACMP/XD or are using cross-site mirroring. HACMP/XD includes HACMP/XD for Metro Mirror, GLVM or HAGEO; all of these make use of sites for replicated resources.

    Site event scripts are included in the HACMP software. If sites are not defined, no site events are generated. The HACMP site_event scripts run as follows if sites are defined:

  • The first node in a site runs site_up before it completes node_up event processing. The site_up_complete event runs after node_up_complete.
  • When the last node in a site goes down, the site_down event runs before node_down, and site_down_complete runs after node_down_complete.

    Without installing HACMP/XD, you can define pre- and post-events to run when a site changes state. In this case, you define all site-related processes.

    Site events (including check_for_site_up and check_for_site_down) are logged in the hacmp.out log file. Other site events, such as site_isolation and site_merge, may occur in clusters with sites; HACMP does not run event scripts for them and initiates no additional events. These are places where you may want to customize actions for your site.

    If sites are defined, then site_up runs when the first node in the site comes up and site_down runs when the last node in the site goes down. The event script sequence for handling resource groups in general is:

    site_up

    site_up_remote

    node_up

    rg_move events to process resource group actions

    node_up_complete

    site_up_complete

    site_up_remote_complete

    --------------------------------------

    site_down

    site_down_remote

    node_down

    rg_move events to process resource group actions

    node_down_complete

    site_down_complete

    site_down_remote_complete

    Planning node_up and node_down Events

    A node_up event is initiated by a node joining the cluster at cluster startup, or by rejoining the cluster at a later time.

    Establishing Initial Cluster Membership

    This section describes the steps taken by the Cluster Manager on each node when the cluster starts and the initial membership of the cluster is established. It shows how the Cluster Managers establish communication among the member nodes, and how the cluster resources are distributed as the cluster membership grows.

    First Node Joins the Cluster

      1. HACMP cluster services are started on Node A. The RSCT subsystem examines the state of the network interfaces and begins communicating with the RSCT subsystem on the other cluster nodes. The Cluster Manager on Node A accumulates the initial state information and then broadcasts a message indicating that it is ready to join the cluster on all configured networks to which it is attached (node_up).
      2. Node A interprets the lack of a response to mean that it is the first node in the cluster.
      3. Node A then initiates a process_resources script, which processes the node’s resource configuration information.
      4. When the event processing has completed, Node A becomes a member of the cluster. HACMP runs node_up_complete.
    All resource groups defined for Node A are available for clients at this point.
    If Online on First Available Node is specified as the startup behavior for resource groups, then Node A takes control of all these resource groups.
    If Node A is defined as part of a non-concurrent resource group that has the Online Using Node Distribution Policy startup, then this node takes control of the first resource group listed in the node environment.
    If Node A is defined as part of a concurrent access resource configuration, it makes those concurrent resources available.
    For resource groups with Online on First Available Node startup policy and the settling time configured, Node A waits for the settling time interval before acquiring such resource groups. The settling time facilitates waiting for a higher priority node to join the cluster.

    Second Node Joins the Cluster

      5. HACMP Cluster Services are started on Node B. Node B broadcasts a message indicating that it is ready to join the cluster on all configured networks to which it is attached (node_up).
      6. Node A receives the message and sends an acknowledgment.
      7. Node A adds Node B to its list of active nodes and starts keepalive communications with Node B.
      8. Node B receives the acknowledgment from Node A. The message includes information identifying Node A as the only other member of the cluster. (If there were other members, Node B would receive the list of members.)
      9. Node B processes the process_resources script and sends a message to let other nodes know when it is finished.
    Processing the process_resources script may include Node A releasing resources it currently holds, if both nodes are in the resource group nodelist for one or more resource groups and Node B has a higher priority for one or more of those resources. This is true only for resource groups that support fallback.
    Note that if the delayed fallback timer is configured, any resource group that is online on node A and for which Node B is a higher priority node will fall back to node B at the time specified by the delayed fallback timer.
      10. Meanwhile, Node B has been monitoring and sending keepalives, and waiting to receive messages about changes in the cluster membership. When Node B finishes its own process_resources script, it notifies Node A.
    During its node_up processing, Node B claims all resource groups configured for it (see step 3). Note that if the delayed fallback timer is configured, the resource group will fall back to a higher priority node at the time specified by the timer.
      11. Both nodes process a node_up_complete event simultaneously.
    At this point, Node B includes Node A in its list of member nodes and in its keepalive list.
      12. Node B sends a “new member” message to all possible nodes in the cluster.
      13. When Node A gets the message, it moves Node B from its list of active nodes to its list of member nodes.
    At this point, all resource groups configured for Node A and Node B are available to cluster clients.

    Remaining Nodes Join the Cluster

      14. As HACMP Cluster Services start on each remaining cluster node, steps 4 through 9 repeat, with each member node sending and receiving control messages, and processing events in the order outlined. Note especially that all nodes must confirm the node_up_complete event before completing the processing of the event and moving the new node to the cluster member list.
    As new nodes join, the RSCT subsystem on each node establishes communications and begins sending heartbeats. Nodes and adapters join RSCT heartbeat rings that are formed according to the definition in the HACMP configuration. When the status of a NIC or node changes, the Cluster Manager receives the state change and generates the appropriate event.

    Rejoining the Cluster

    When a node rejoins the cluster, the Cluster Managers running on the existing nodes initiate a node_up event to acknowledge that the returning node is up. When these nodes have completed processing their process_resources script, the new node then processes a node_up event so that it can resume providing cluster services.

    This processing is necessary to ensure the proper balance of cluster resources. As long as the existing Cluster Managers first acknowledge a node rejoining the cluster, they can release any resource groups belonging to that node if necessary. Whether or not the resource groups are actually released in this situation depends on how the resource groups are configured for takeover (or dependencies). The new node can then start its operations.

    Sequence of node_up Events

    The following list describes the sequence of node_up events:

    node_up

    This event occurs when a node joins or rejoins the cluster.

    process_resources
    This script calls the sub_events needed for the node to acquire the service address (or shared address), to get all of its owned (or shared) resources, and to take the resources online. This includes making disks available, varying on volume groups, mounting filesystems, exporting filesystems, NFS-mounting filesystems, and varying on concurrent access volume groups. (An illustrative sketch of these actions follows this list.)
    It may take other actions depending on the resource type configuration, such as getting Fast Connect, and so forth.
    process_resources_complete
    Each node runs this script when resources have been processed.
    node_up_complete
    This event occurs after the resources are processed and the node_up event has successfully completed. Depending on whether the node is local or remote, this event calls the start_server script to start application servers on the local node, or allows the local node to do an NFS mount only after the remote node is completely up.
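
    To make the resource acquisition actions above more concrete, the following sketch shows the kinds of AIX 5L commands involved. This is an illustration only, not the actual HACMP script; the volume group and filesystem names are hypothetical.

    #!/bin/ksh
    # Illustration of typical resource acquisition steps (hypothetical names).
    varyonvg sharedvg                  # vary on the shared volume group
    mount /sharedfs                    # mount a shared filesystem
    exportfs -i /sharedfs              # export the filesystem for NFS clients
    exit 0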

    node_up Events with Dependent Resource Groups or Sites

    If either sites or dependencies between any resource groups are configured in the cluster, HACMP processes all events related to resource groups in the cluster with the use of rg_move events that are launched for all resource groups when node_up events occur.

    The Cluster Manager then takes into account all node and site policies, especially the configuration of dependencies and sites for resource groups, and the current distribution and state of resource groups on all nodes in order to properly handle any acquiring, releasing, bringing online or taking offline of resource groups before node_up_complete can run.

    Parent and child or location dependencies between resource groups offer a predictable and reliable way of building clusters with multi-tiered applications. However, node_up processing in clusters with dependencies could take more time than the parallel processing in clusters without resource groups’ dependencies. You may need to adjust the config_too_long warning timer for node_up.

    For more information on processing in clusters where dependencies are specified between some of the resource groups, see the Administration Guide.

    node_down Events

    Cluster nodes exchange keepalives through the RSCT subsystem with peer nodes so that the Cluster Manager can track the status of the nodes in the cluster. A node that fails or is stopped purposefully no longer sends keepalives. When RSCT indicates all network interfaces are down, or a node does not respond to heartbeats, the Cluster Managers then run a node_down event. Depending on the cluster configuration, the peer nodes then take the necessary actions to get critical applications up and running and to ensure data remains available.

    A node_down event can be initiated by a node:

  • Stopping cluster services and bringing resource groups offline
  • Stopping cluster services and moving resource groups to another node
  • Stopping cluster services and placing resource groups in an UNMANAGED state.
  • Failing.
  • Stopping Cluster Services and Bringing Resource Groups Offline

    When you stop cluster services and bring resource groups offline, HACMP stops on the local node after the node_down_complete event releases the stopped node’s resources. The other nodes run the node_down_complete event and do not take over the resources of the stopped node.

    Stopping Cluster Services and Moving Resource Groups

    When you stop cluster services and move the resource groups to another node, HACMP stops after the node_down_complete event on the local node releases its resource groups. The surviving nodes in the resource group nodelist take over these resource groups.

    Stopping Cluster Services and Placing Resource Groups in an UNMANAGED State

    When you stop cluster services and place resource groups in an UNMANAGED state, the HACMP software stops immediately on the local node. The node_down_complete event runs on the stopped node. The Cluster Managers on remote nodes process node_down events, but do not take over any resource groups. The stopped node does not release its resource groups.

    Node Failure

    When a node fails, the Cluster Manager on that node does not have time to generate a node_down event. In this case, the Cluster Managers on the surviving nodes recognize a node_down event has occurred (when they realize the failed node is no longer communicating) and trigger node_down events.

    This initiates a series of sub_events that reconfigure the cluster to deal with that failed node. Based upon the cluster configuration, surviving nodes in the resource group nodelist will take over the resource groups.

    Sequence of node_down Events

    The following list describes the default parallel sequence of node_down events:

    node_down

    This event occurs when a node intentionally leaves the cluster or fails.

    In some cases, the node_down event receives the forced parameter. For more information, see the section Stopping Cluster Services and Placing Resource Groups in an UNMANAGED State.

    All nodes run the node_down event.

    All nodes run the process_resources script. Once the Cluster Manager has evaluated the status of affected resource groups and the configuration, it initiates a series of sub_events, to redistribute resources as configured for fallover or fallback.

    All nodes run the process_resources_complete script.

    node_down_complete

    Network Events

    HACMP distinguishes between two types of network failure, local and global, and uses different network failure events for each type of failure. The network failure event script is often customized to send mail.

    Sequence of Network Events

    The following list shows the network events:

    network_down (Local)
    This event occurs when only a particular node has lost contact with a network. The event has the following format:
    network_down node_name network_name
    The Cluster Manager takes selective recovery action to move affected resource groups to other nodes. The results of the recovery actions are logged to hacmp.out.
    network_down (Global)
    This event occurs when all of the nodes connected to a network have lost contact with a network. It is assumed in this case that a network-related failure has occurred rather than a node-related failure. This event has the following format:
    network_down -1 network_name
    Note: The -1 argument is negative one. This argument indicates that the network_down event is global.
    The global network failure event mails a notification to the system administrator, but takes no further action since appropriate actions depend on the local network configuration.
    network_down_complete (Local)
    This event occurs after a local network failure event has completed. It has the following format:
    network_down_complete node_name network_name
    When a local network failure event occurs, the Cluster Manager takes selective recovery actions for resource groups containing a service NIC connected to that network.
    network_down_complete (Global)
    This event occurs after a global network failure event has completed. It has the following format:
    network_down_complete -1 network_name
    The default processing for this event takes no actions since appropriate actions depend on the network configuration.
    network_up
    This event occurs when the Cluster Manager determines a network has become available for use.
    Whenever a network becomes available again, HACMP attempts to bring resource groups containing service IP labels on that network back online.
    network_up_complete
    This event occurs only after a network_up event has successfully completed. This event is often customized to notify the system administrator that an event demands manual attention.
    Whenever a network becomes available again, HACMP attempts to bring resource groups containing service IP labels on that network back online.
    For more information, see Appendix B: Resource Group Behavior during Cluster Events in the Administration Guide.

    Network Interface Events

    The Cluster Manager reacts to the failure, unavailability, or joining of network interfaces by initiating one of the following events. (For exceptions, see the section that follows on single interface situations.)

    swap_adapter
    This event occurs when the interface hosting a service IP label on a node fails. The swap_adapter event moves the service IP label onto a non-service interface on the same HACMP network and then reconstructs the routing table. If the service IP label is an IP alias, it is put onto the non-service interface as an additional IP label. Otherwise, the non-service IP label is removed from the interface and placed on the failed interface.
    If the interface now holding the service IP label later fails, swap_adapter can switch to another non-service interface if one exists.
    If a persistent node IP label was assigned to the failed interface, it moves with the service label to the non-service interface.
    Note:  HACMP removes IP aliases from interfaces at shutdown. It creates the aliases again when the network becomes operational. The hacmp.out file records these changes.
    swap_adapter_complete
    This event occurs only after a swap_adapter event has successfully completed. The swap_adapter_complete event ensures that the local ARP cache is updated by deleting entries and pinging cluster IP addresses.
    fail_standby
    This event occurs if a non-service interface fails or becomes unavailable as the result of an IP address takeover. The fail_standby event displays a console message indicating that a non-service interface has failed or is no longer available.
    join_standby
    This event occurs if a non-service interface becomes available. The join_standby event displays a console message indicating that a non-service interface has become available.
    In HACMP 4.5 and up, whenever a network interface becomes available, HACMP attempts to bring resource groups back online. For more information, see Resource Group Recovery when the Network or Interface is Up in Appendix B: Resource Group Behavior During Cluster Events in the Administration Guide.
    fail_interface
    This event occurs if an interface fails and there is no non-service interface available to recover the service address.
    Takeover service addresses are monitored. It is possible for an interface to fail with no interface available to recover the service address, even though another interface on the same network is still up.
    This event applies to all networks, including those using IP aliasing for recovery. Note that when a non-service NIC fails on a network configured for IPAT via IP Aliases, the fail_interface event is run. An rg_move event is then triggered if the interface that failed was a service label.
    join_interface
    This event occurs if a non-service interface becomes available or recovers. This event applies to all networks, including those using IPAT via IP Aliases for recovery.
    Note that networks using IP aliases by definition do not have non-service interfaces defined, so the join_interface event that is run in this case simply indicates that a non-service interface joins the cluster.

    Failure of a Single Network Interface Does Not Generate Events

    If you have only one network interface active on a network, the Cluster Manager cannot generate a failure event for that network interface, as it has no peers with which to communicate to determine the health of the interface. Situations that have a single network interface include:

  • One-node clusters
  • Multi-node clusters with only one node active
  • Failure of all but one interface on a network, one at a time.

    For example, starting a cluster with all service or non-service interfaces disconnected produces the following results:

      1. First node up: No failure events are generated.
      2. Second node up: One failure event is generated.
      3. Third node up: One failure event is generated.
      4. And so on.

    See Identifying Service Adapter Failure for Two-Node Clusters in Chapter 3: Planning Cluster Network Connectivity for information on handling this situation.

    Cluster-Wide Status Events

    By default, the Cluster Manager recognizes a time limit for reconfiguring a cluster and processing topology changes. If the time limit is reached, the Cluster Manager initiates a config_too_long event. Whole cluster status events include:

    config_too_long
    This system warning occurs each time a cluster event takes more time to complete than a specified time-out period. This message is logged in the hacmp.out file.
    The time-out period for all events is set to 360 seconds by default. You can use SMIT to customize the time period allowed for a cluster event to complete before HACMP issues a config_too_long warning for it. For more information, see Chapter 5: Configuring Cluster Events in the Administration Guide.
    reconfig_topology_start
    This event marks the beginning of a dynamic reconfiguration of the cluster topology.
    reconfig_topology_complete
    This event indicates that a cluster topology dynamic reconfiguration has completed.
    reconfig_resource_acquire
    This event indicates that cluster resources that are affected by dynamic reconfiguration are being acquired by appropriate nodes.
    reconfig_resource_release
    This event indicates that cluster resources affected by dynamic reconfiguration are being released by appropriate nodes.
    reconfig_resource_complete
    This event indicates that a cluster resource dynamic reconfiguration has successfully completed.
    cluster_notify
    This event is triggered by verification when automatic cluster configuration monitoring detects errors in cluster configuration. The output of this event is logged in hacmp.out throughout the cluster on each node that is running cluster services.
    event_error
    Prior to HACMP 5.2, a non-recoverable event script failure resulted in the event_error event running only on the cluster node where the failure occurred; the remaining cluster nodes did not indicate the failure. With HACMP 5.2 and up, all cluster nodes run the event_error event if any node has a fatal error. All nodes log the error and call out the failing node name in the hacmp.out log file.

    Resource Group Event Handling and Recovery

    The Cluster Manager keeps track of the resource group node priority policies, site policies, and any dependencies configured as well as the necessary topology information and resource group status so that it can take a greater variety of recovery actions, often avoiding the need for user intervention. Event logging includes a detailed summary for each high-level event to help you understand exactly what actions were taken for each resource group during the handling of failures.

    For more information about how resource groups are handled in HACMP, see Appendix B: Resource Group Behavior During Cluster Events in the Administration Guide. It contains information about the following HACMP functions:

  • Selective fallover for handling resource groups
  • Handling of resource group acquisition failures
  • Handling of resource groups configured with IPAT via IP Aliases
  • Handling of HACMP/XD resource groups.

    Note: 1. If dependencies between resource groups or sites are specified, HACMP processes events in a different sequence than usual. For more information, see the section on the resource_state_change event below.
    2. The lists in this section do not include all possible resource group states: if sites are defined, a primary and a secondary instance of the resource group could be online, offline, in the error state, or unmanaged. Also, the resource group instances could be in the process of acquiring or releasing. The corresponding resource group states are not listed here, but have descriptive names that explain which actions take place.

    Resource Group Events

    The Cluster Manager may move resource groups as a result of recovery actions taken during the processing of events such as node down.

    rg_move
    This event moves a specified resource group from one node to another.
    rg_move_complete
    This action indicates that the rg_move event has successfully completed.
    resource_state_change
    This trigger event is used for resource group recovery if resource group dependencies or sites are configured in the cluster. This action indicates that the Cluster Manager needs to change the state of one or more resource groups, or there is a change in the state of a resource managed by the Cluster Manager. This event runs on all nodes if one of the following situations occurs:
    • Application monitoring failure
    • Selective fallover for loss of volume group
    • Local network down
    • WAN failure
    • Resource Group Acquisition Failure
    • Resource Group Recovery on IP Interface Availability
    • Expiry of Settling timer for a resource group
    • Expiry of fallback timer for a resource group
    While the event runs, the state of the resource group is changed to TEMP_ERROR. This is broadcast to all nodes.
    resource_state_change_complete
    This event runs when the resource_state_change event completes successfully. You can add pre- or post-events here if necessary. You may want to be notified about resource state changes, for example.
    external_resource_state_change
    This event runs when the user moves a resource group and HACMP uses the dynamic processing path to handle the request since resource group dependencies or sites are configured in the cluster.
    external_resource_state_change_complete
    This event runs when the external_resource_state_change event completes successfully.

    Resource Group sub_events

    Handling of individual resources during the processing of an event may include the following actions. For example, when a filesystem is unmounted and remounted during an event, it is taken offline and released by one node, then acquired by another node and brought online.

    releasing
    This action indicates that a resource group is being released either to be brought offline or to be acquired on another node.
    acquiring
    This action is used when a resource group is being acquired on a node.
    rg_up
    This action indicates that the resource group is online.
    rg_down
    This action indicates that the resource group is offline.
    rg_error
    This action indicates that the resource group is in error state.
    rg_temp_error_state
    This action indicates that the resource group is in a temporary error state. For example, it occurs due to a local network or an application failure. This state informs the Cluster Manager to initiate an rg_move event for this resource group. Resource groups should not be in this state when the cluster is stable.
    rg_acquiring_secondary
    The resource group is coming online at the target site (only the replicated resources are online).
    rg_up_secondary
    The resource group is online in the secondary role at the target site (only replicated resources are online).
    rg_error_secondary
    The resource group at the site receiving the mirror data is in error state.
    rg_temp_error_secondary
    The resource group at the site receiving the mirror data is in temporary error state.

    This list includes some, but not all, of the possible resource group states.

    After the completion of an event, the Cluster Manager knows the state of the resources and resource groups involved in the event. The Cluster Manager then analyzes the resource group information that it maintains internally and determines whether recovery events need to be queued for any of the resource groups. The Cluster Manager also uses the status of individual resources in resource groups to print a comprehensive event summary to the hacmp.out log file.

    For each resource group, the Cluster Manager keeps track of the nodes on which the resource group has tried to come online and failed. This information is updated when recovery events are processed. The Cluster Manager resets the nodelist for a resource group as soon as the resource group moves to the online or error states.

    In HACMP 5.4, the resource group ERROR states are displayed with more detail than before:

  • Parent group is NOT ONLINE; as a result, the child resource group is unavailable. HACMP displays: OFFLINE due to parent offline
  • Higher priority Different-Node Dependency group is ONLINE. HACMP displays: OFFLINE due to lack of available node
  • Another distributed group was acquired. HACMP displays: OFFLINE
  • Group is falling over and is temporarily in the OFFLINE state. HACMP displays: OFFLINE

    Manual intervention is only required when a resource group remains in ERROR state after the event processing finishes.

    When a resource group is in the process of being moved, application monitoring is suspended and resumed appropriately. The Application Monitor sees that the application is in “recovery” state while the event is being processed.

    resume_appmon
    This action is used by the Application Monitor to resume monitoring of an application.
    suspend_appmon
    This action is used by the Application Monitor to suspend monitoring of an application.

    For more information on how resource groups are handled in HACMP, see Appendix B: Resource Group Behavior During Cluster Events in the Administration Guide. It contains information on selective fallover for handling resource groups, handling of resource group acquisition failures, handling of resource groups configured with IPAT via IP Aliases, and HACMP/XD resource groups.

    Customizing Cluster Event Processing

    The Cluster Manager’s ability to recognize a specific series of events and sub_events permits a very flexible customization scheme. The HACMP event customization facility lets you customize cluster event processing to your site. Customizing event processing allows you to provide a highly efficient path to the most critical resources in the event of a failure. However, this efficiency depends on your configuration.

    As part of the planning process, you need to decide whether to customize event processing. If the actions taken by the default scripts are sufficient for your purposes, then you do not need to do anything further to configure events during the configuration process.

    If you do decide to customize event processing to your environment, use the HACMP event customization facility described in this chapter. If you customize event processing, register these user-defined scripts with HACMP during the configuration process.

    Although you can modify the default event scripts or replace them with your own, doing so is strongly discouraged: it makes maintaining, upgrading, and troubleshooting an HACMP cluster much more difficult. If you do write your own event customization scripts, you need to configure the HACMP software to use those scripts.

    The event customization facility includes the following functions:

  • Event notification
  • Pre- and post-event processing
  • Event recovery and retry.

    Complete customization of an event includes a notification to the system administrator (before and after event processing), and user-defined commands or scripts that run before and after event processing, as shown in the following example:

    Notify sysadmin of event to be processed
    Pre-event script or command
    HACMP event script
    Post-event script or command
    Notify sysadmin that event processing is complete

    For a discussion of event emulation, which lets you emulate HACMP event scripts without actually affecting the cluster, see the section Event Emulation in Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide.

    Event Notification

    You can specify a notify command that sends mail to indicate that an event is about to happen (or has just occurred), and that an event script succeeded or failed. You configure notification methods for cluster events in SMIT under the Change/Show a Custom Cluster Events panel. For example, a site may want to use a network failure notification event to inform system administrators that traffic may have to be re-routed. Afterwards, you can use a network_up notification event to tell system administrators that traffic can again be serviced through the restored network.
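
    For illustration, a minimal notify method might simply mail the event name to an administrator. The following sketch assumes the event name is passed as the first argument and uses a hypothetical recipient address; verify the arguments passed to notify commands against your HACMP level.

    #!/bin/ksh
    # Minimal notify method sketch (hypothetical recipient; adapt for your site).
    EVENT=$1
    ADMIN=root@localhost

    echo "HACMP cluster event ${EVENT} is being processed on $(hostname)" | \
        mail -s "HACMP event notification: ${EVENT}" $ADMIN

    exit 0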

    Event notification in an HACMP cluster can also be done using pre- and post-event scripts. For more information, see Chapter 5: Configuring Cluster Events in the Administration Guide.

    You can also configure a custom remote notification in response to events. For details, see the section Custom Remote Notification of Events.

    Pre- and Post-Event Scripts

    You can specify commands or multiple user-defined scripts that execute before and after the Cluster Manager calls an event script.

    For example, you can specify one or more pre-event scripts that run before the node_down event script is processed. When the Cluster Manager recognizes that a remote node is down, it first processes these user-defined scripts. One such script may designate that a message be sent to all users to indicate that performance may be affected (while adapters are swapped, while application servers are stopped and restarted). Following the node_down event script, a post processing event script for network_up notification may be included to broadcast a message to all users that a certain system is now available at another network address.

    The following scenarios are other examples of where pre- and post-event processing are useful:

  • If a node_down event occurs, site-specific actions may dictate that a pre-event script for the start_server subevent script be used. This script could notify users on the server about to take over for the downed application server that performance may vary, or that they should seek alternate systems for certain applications.
  • Due to a network being down, a custom installation may be able to re-route traffic through other machines by creating new IP routes. The network_up and network_up_complete event scripts could reverse the procedure, ensuring that the proper routes exist after all networks are functioning. (A sample post-event sketch follows this list.)
  • A site may want to stop cluster services and move resource groups to another node as a post-event if a network has failed on the local node (but otherwise the network is functioning).

    Note: When writing your HACMP pre- or post-event scripts, be aware that none of the shell environment variables defined in /etc/environment are available to your program. If you need to use any of these variables, explicitly source them by including this line in your script: ". /etc/environment".
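
    As an illustration of the network_up re-routing scenario above, and of sourcing /etc/environment as the note describes, the following is a minimal post-event sketch. The network addresses, gateway, and log file name are hypothetical, and the arguments your script receives should be verified against the event script headers described below.

    #!/bin/ksh
    # Minimal post-event sketch for network_up (hypothetical addresses and paths).
    # Shell variables from /etc/environment are not available unless sourced.
    . /etc/environment

    EVENTNAME=$1                             # event name passed through by HACMP
    LOGFILE=/tmp/network_up_post.log

    print "$(date): post-event processing for ${EVENTNAME}" >> $LOGFILE

    # Re-establish a static route that client traffic depends on
    # (replace the network, netmask, and gateway with values for your site).
    /usr/sbin/route add -net 192.168.20.0 -netmask 255.255.255.0 192.168.10.1 >> $LOGFILE 2>&1

    exit 0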

    If you plan to create pre- or post-event scripts for your cluster, be aware that your scripts will be passed the same parameters used by the HACMP event script you specify. For pre- and post-event scripts, the arguments passed to the event command are the event name, event exit status, and the trailing arguments passed to the event command.

    All HACMP event scripts are maintained in the /usr/es/sbin/cluster/events directory. The parameters passed to your script are listed in the event script headers.

    Warning: Be careful not to kill any HACMP processes as part of your script. If you are using the output of the ps command and using a grep to search for a certain pattern, make sure the pattern does not match any of the HACMP or RSCT processes.

    Pre- and Post-Event Scripts May No Longer Be Needed

    If you migrated from a previous version of HACMP, some of your existing pre- and post-event scripts may no longer be needed. HACMP itself handles more situations.

    Using the Forced Varyon Attribute Instead of Pre- or Post-Event Scripts

    Prior to HACMP 5.1, you could use either pre- or post-event scripts or event recovery routines to force activation of volume groups when the activation and acquisition of raw physical volumes and volume groups fails on a node. In HACMP 5.1 and up, you can still use the previously mentioned methods, or you can specify the forced varyon attribute for a volume group. For more information, see Using Forced Varyon.

    If the forced varyon attribute is specified for a volume group, special scripts to force a varyon operation are no longer required.

    event_error Now Indicates Failure on a Remote Node

    Prior to HACMP 5.2, non-recoverable event script failures result in the event_error event being run on the cluster node where the failure occurred. The remaining cluster nodes do not indicate the failure. With HACMP 5.2 and up, all cluster nodes run the event_error event if any node has a fatal error. All nodes log the error and call out the failing node name in the hacmp.out log file.

    If you have added pre- or post-events for the event_error event, be aware that those event methods are called on every node, not just the failing node, starting with HACMP 5.2.

    A Korn shell environment variable indicates the node where the event script failed: EVENT_FAILED_NODE is set to the name of the node where the event failed. Use this variable in your pre- or post-event script to determine where the failure occurred.

    The variable LOCALNODENAME identifies the local node; if LOCALNODENAME is not the same as EVENT_FAILED_NODE then the failure occurred on a remote node.
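
    A minimal sketch of how a pre- or post-event script for event_error might use these variables follows; the log file name is hypothetical.

    #!/bin/ksh
    # Report where the failed event script ran, using the variables described above.
    if [[ "$LOCALNODENAME" = "$EVENT_FAILED_NODE" ]]; then
        print "Event script failed on this node ($LOCALNODENAME)" >> /tmp/event_error.log
    else
        print "Event script failed on remote node $EVENT_FAILED_NODE" >> /tmp/event_error.log
    fi
    exit 0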

    Resource Groups Processed in Parallel and Using Pre- and Post-Event Scripts

    Resource groups are processed in parallel by default in HACMP 5.4 unless you specify a customized serial processing order for all or some of the resource groups in the cluster.

    When resource groups are processed in parallel, fewer cluster events occur in the cluster and appear in the event summaries.

    The use of parallel processing reduces the number of particular cluster events for which you can create customized pre- or post-event scripts. If you start using parallel processing for a list of resource groups in your configuration, be aware that some of your existing pre- and post-event scripts may not work for these resource groups.

    In particular, only the following events take place during parallel processing of resource groups:

    node_up

    node_down

    acquire_svc_addr

    acquire_takeover_addr

    release_svc_addr

    release_takeover_addr

    start_server

    stop_server

    Note: In parallel processing, these events apply to an entire list of resource groups that are being processed in parallel, and not to a single resource group, as in serial processing. If you had pre- and post-event scripts configured for these events prior to HACMP 5.1, then, after migration, these event scripts are launched not for a single resource group but for a list of resource groups, and may not work as expected.

    The following events do not occur in parallel processing of resource groups:

    get_disk_vg_fs

    release_vg_fs

    node_up_local

    node_up_remote

    node_down_local

    node_down_remote

    node_up_local_complete

    node_up_remote_complete

    node_down_local_complete

    node_down_remote_complete

    If you have pre- and post-event scripts for these events and plan to upgrade to the current version, keep in mind that these events do not occur when resource groups are processed in parallel.

    If you want to continue using pre- and post-event scripts, you could have one of the following cases:

    Scenario: You would like to use pre- and post-event scripts for newly added resource groups.
    What you should do: All newly added resource groups are processed in parallel, which results in fewer cluster events; therefore, there is a limited choice of events for which you can create pre- and post-event scripts. If you have resources in resource groups that require handling by pre- and post-event scripts written for specific cluster events, include these resource groups in the serial processing lists in SMIT to ensure that those scripts can be used for these resources. For information about specifying serial or parallel processing of resource groups, see the section Configuring Processing Order for Resource Groups in Chapter 4: Configuring HACMP Resource Groups (Extended) in the Administration Guide.

    Scenario: You upgrade to HACMP 4.5 or higher and choose parallel processing for some of the pre-existing resource groups in your configuration.
    What you should do: If you had configured customized pre- or post-event scripts before migration, then, now that these resource groups are processed in parallel, the event scripts for a number of events cannot be used for these resource groups, because those events do not occur in parallel processing. If you want existing event scripts to continue working for the resource groups, include these resource groups in the serial ordering lists in SMIT to ensure that the pre- and post-event scripts can be used for these resources. For information about specifying serial or parallel processing of resource groups, see the section Configuring Processing Order for Resource Groups in Chapter 4: Configuring HACMP Resource Groups (Extended) in the Administration Guide.

    Dependent Resource Groups and Pre- and Post-Event Scripts

    Prior to HACMP 5.2, to achieve resource group and application sequencing, system administrators had to build the application recovery logic in their pre- and post-event processing scripts. Every cluster would be configured with a pre-event script for all cluster events, and a post-event script for all cluster events.

    Such scripts could become all-encompassing case statements. For instance, if you want to take an action for a specific event on a specific node, you need to edit that individual case, add the required code for pre- and post-event scripts, and also ensure that the scripts are the same across all nodes.

    To summarize, even though the logic of such scripts captures the desired behavior of the cluster, they can be difficult to customize and even more difficult to maintain later on, when the cluster configuration changes.

    If you are using pre-and post-event scripts or other methods, such as resource group processing ordering to establish dependencies between applications that are supported by your cluster, then these methods may no longer be needed or can be significantly simplified. Instead, you can specify dependencies between resource groups in a cluster. For more information on planning dependent resource groups, see Resource Group Dependencies in Chapter 6: Planning Resource Groups.

    If you have applications included in dependent resource groups and still plan to use pre- and post-event scripts in addition to the dependencies, additional customization of pre- and post-event scripts may be needed. To minimize the chance of data loss during the application stop and restart process, customize your application server scripts to ensure that any uncommitted data is stored to a shared disk temporarily during the application stop process and read back to the application during the application restart process. It is important to use a shared disk as the application may be restarted on a node other than the one on which it was stopped.

    For information on how to configure resource group dependencies, see the Administration Guide.

    Event Recovery and Retry

    You can specify a command that attempts to recover from an event script failure. If the recovery command succeeds and the retry count for the event script is greater than zero, the event script is run again. You can also specify the number of times to attempt to execute the recovery command.

    For example, a recovery command could log users off a filesystem, verify that no one is still accessing it, and then retry the unmount.

    If a condition that affects the processing of a given event on a cluster is identified, such as a timing issue, you can insert a recovery command with a retry count high enough to cover for the problem.
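
    As a sketch of the filesystem example above, a recovery command might free a busy filesystem and report success so that HACMP retries the event script. The filesystem name is hypothetical, and the fuser options should be verified for your AIX level.

    #!/bin/ksh
    # Sample event recovery command (hypothetical filesystem /sharedfs).
    # Exit status 0 tells HACMP that recovery succeeded, so the event script
    # can be retried (up to the configured retry count).
    FS=/sharedfs

    # Terminate processes that still have files open in the filesystem,
    # then attempt the unmount.
    /usr/sbin/fuser -kc $FS > /dev/null 2>&1
    /usr/sbin/umount $FS

    if mount | grep -w $FS > /dev/null; then
        exit 1      # still mounted: recovery failed
    fi
    exit 0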

    Custom Remote Notification of Events

    You can define a notification method through the SMIT interface to issue a customized page in response to a cluster event. In HACMP 5.4, you can also send a text message notification to any number, including a cell phone, or send mail to an email address.

    You can use the cluster_notify event, which is triggered when automatic cluster configuration verification monitoring detects errors in the cluster configuration, to drive an HACMP remote notification method that sends out a message. The output of this event is logged in hacmp.out throughout the cluster on each node that is running cluster services.

    You can configure any number of notification methods, for different events and with different text or numeric messages and telephone numbers to dial. The same notification method can be used for several different events, as long as the associated text message conveys enough information to respond to all of the possible events that trigger the notification.

    After configuring the notification method, you can send a test message to make sure everything is configured correctly and that the expected message will be sent for a given event.

    Planning for Custom Remote Notification

    Remote notification requires the following conditions:

  • A tty port used for paging cannot also be used for heartbeat traffic or for the DBFS function of HAGEO.
  • Any tty port specified must be defined to AIX 5L and must be available.
  • Each node that may send a page or text messages must have an appropriate modem installed and enabled.
    Note: HACMP checks the availability of the tty port when the notification method is configured and before a page is issued. Modem status is not checked.
  • Each node that may send email messages from the SMIT panel using AIX 5L mail must have a TCP/IP connection to the Internet.
  • Each node that may send text messages to a cell phone must have an appropriate Hayes-compatible dialer modem installed and enabled.
  • Each node that may transmit an SMS message wirelessly must have a Falcom-compatible GSM modem installed in the RS232 port with the password disabled. Ensure that the modem connects to the cell phone system.

    For information about how to configure a remote notification method, see Chapter 5: Configuring Cluster Events in the Administration Guide.

    Customizing Event Duration Time Until Warning

    Depending on cluster configuration, the speed of cluster nodes and the number and types of resources that need to move during cluster events, certain events may take different time intervals to complete. For such events, you may want to customize the time period HACMP waits for an event to complete before issuing the config_too_long warning message.

    Cluster events that include acquiring and releasing resource groups take a longer time to complete. They are considered slow events and include the following:

  • node_up
  • node_down
  • reconfig_resource
  • rg_move
  • site_up
  • site_down.

    Customizing event duration time for slow cluster events lets you avoid getting unnecessary system warnings during normal cluster operation.

    All other cluster events are considered fast events. These events typically take a shorter time to complete and do not involve acquiring or releasing resources. Examples of fast events include:

  • swap_adapter
  • events that do not handle resource groups.

    Customizing event duration time before receiving a warning for fast events allows you to take corrective action faster.

    Consider customizing Event Duration Time Until Warning if, in the case of slow cluster events, HACMP issues warning messages too frequently; or, in the case of fast events, you want to speed up detection of a possible problem event.

    Note: Dependencies between resource groups offer a predictable and reliable way of building clusters with multi-tier applications. However, processing of some cluster events (such as node_up) in clusters with dependencies could take more time than processing of those events where all resource groups are processed in parallel. Whenever resource group dependencies allow, HACMP processes multiple non-concurrent resource groups in parallel, and processes multiple concurrent resource groups on all nodes at once. However, a resource group that is dependent on other resource groups cannot be started until the others have been started first. The config_too_long warning timer for node_up events should be set large enough to allow for this.

    For information about how to change the event duration before receiving a system warning, see the section Tuning Event Duration Time Until Warning in Chapter 5: Configuring Cluster Events in the Administration Guide.

    User-Defined Events

    You can define your own events for which HACMP can run your specified recovery programs. This adds a new dimension to the predefined HACMP pre- and post-event script customization facility.

    You specify the mapping between events that you define and recovery programs defining the event recovery actions through the SMIT interface. This lets you control both the scope of each recovery action and the number of event steps synchronized across all nodes.

    For details about registering events, see the IBM RSCT documentation.

    Note: The RSCT Event Management subsystem has been replaced by the Resource Monitoring and Control (RMC) subsystem in HACMP 5.2 and up. See the chapter on Upgrading an HACMP Cluster for more information.

    An RMC resource refers to an instance of a physical or logical entity that provides services to some other component of the system. The term resource is used very broadly to refer to software as well as hardware entities. For example, a resource could be a particular file system or a particular host machine. A resource class refers to all resources of the same type, such as processors or host machines.

    A resource manager (daemon) maps actual entities to RMC’s abstractions. Each resource manager represents a specific set of administrative tasks or system functions. The resource manager identifies the key physical or logical entity types related to that set of administrative tasks or system functions, and defines resource classes to represent those entity types.

    For example, the Host resource manager contains a set of resource classes for representing aspects of an individual host machine. It defines resource classes to represent:

  • Individual machines (IBM.Host)
  • Paging devices (IBM.PagingDevice)
  • Physical volumes (IBM.PhysicalVolume)
  • Processors (IBM.Processor)
  • A host’s identifier token (IBM.HostPublic)
  • Programs running on the host (IBM.Program)
  • Each type of adapter supported by the host, including ATM adapters (IBM.ATMDevice), Ethernet adapters (IBM.EthernetDevice), FDDI adapters (IBM.FDDIDevice), and Token Ring adapters (IBM.TokenRingDevice).

    The AIX 5L resource monitor generates events for OS-related resource conditions such as the percentage of CPU that is idle (IBM.Host.PctTotalTimeIdle) or the percentage of disk space in use (IBM.PhysicalVolume.PctBusy). The program resource monitor generates events for process-related occurrences such as the unexpected termination of a process. It uses the resource attribute IBM.Program.ProgramName.

    Note that you cannot use the Event Emulator to emulate a user-defined event.

    Writing Recovery Programs

    A recovery program has a sequence of recovery command specifications, possibly interspersed with barrier commands.

    The format of these specifications follows:

    :node_set recovery_command expected_status NULL 
    

    Where:

  • node_set is a set of nodes on which the recovery program is to run
  • recovery_command is a quote-delimited string specifying a full path to the executable program. The command cannot include any arguments. Any executable program that requires arguments must be a separate script. The recovery program must be in this path on all nodes in the cluster. The program must specify an exit status.
  • expected_status is an integer status to be returned when the recovery command completes successfully. The Cluster Manager compares the actual status returned to the expected status. A mismatch indicates unsuccessful recovery. If you specify the character X in the expected status field, the Cluster Manager omits the comparison.
  • NULL is not used now, but is included for future functions.

    You specify node sets by dynamic relationships. HACMP supports the following dynamic relationships:

  • all—the recovery command runs on all nodes in the current membership.
  • event—the node on which the event occurred.
  • other—all nodes except the one on which the event occurred.

    The specified dynamic relationship generates a set of recovery commands identical to the original, except that a node ID replaces node_set in each set of commands.

    The command string for user-defined event commands must start with a slash (/). The clcallev command runs commands that do not start with a slash.

    Useful Commands and Reference for RMC Information

    To list all persistent attribute definitions for the IBM.Host RMC resource (selection string field):

    lsrsrcdef -e -A p IBM.Host 
    

    To list all dynamic attribute definitions for the IBM.Host RMC resource (Expression field):

    lsrsrcdef -e -A d IBM.Host 
    

    See Chapter 3, Managing and Monitoring Resources Using RMC and Resource Managers in the IBM RSCT for AIX 5L and Linux Administration Guide for more information on the SQL-like expressions used to configure selection strings for user-defined events.
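
    For example, to see the current values that a selection string or expression would be evaluated against, you can list the resource instances themselves. This is a hedged example; verify the exact flags against your RSCT level:

    lsrsrc -A d IBM.Host 
    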

    Recovery Program Example

    The following sample recovery program sends a message to /tmp/r1.out when paging space is low on the node where the event occurred. For the recovery program r1.rp, the SMIT fields would be filled in as follows:

    Event Name
    E_page_space (user-defined name)
    Recovery program path
    /r1.rp
    Resource name
    IBM.Host (cluster node)
    Selection string
    Name = ?"%" (name of node)
    Expression
    TotalPgSpFree < 256000 (VMM is within 200 MB of the paging space warning level). The resource attribute plus the condition you want to flag.
    Rearm expression
    TotalPgSpFree > 256000. The resource attribute plus the adjusted condition.

    Where recovery program r1.rp is as follows:

    #format: 
    #relationship   command to run   expected status NULL 
    # 
    event  "/tmp/checkpagingspace"  0 NULL 
    

    Note that the recovery program does not execute a command with arguments itself. Instead, it points to a shell script, /tmp/checkpagingspace, which contains:

    #!/bin/ksh 
    /usr/bin/echo "Paging Space LOW!" > /tmp/r1.out 
    exit 0 
    

    Recovery Program for node_up Event Example

    The following example is a recovery program for the node_up event:

    #format: 
    #relationship  	command to run  	expected status NULL 
    # 
    other  "node_up"  0 NULL 
    # 
    barrier 
    # 
    event "node_up" 0 NULL 
    # 
    barrier 
    # 
    all "node_up_complete" X NULL 
    

    Barrier Commands

    You can put any number of barrier commands in the recovery program. All recovery commands before a barrier start in parallel. Once a node encounters a barrier command, all nodes must reach it before the recovery program continues.

    The syntax of the barrier command is barrier.

    Event Roll-up

    If there are multiple events outstanding simultaneously, you see only the highest priority event. Node events have higher priority than network events. User-defined events have the lowest priority and do not roll up at all, so you see all of them.

    Event Summaries and Preamble

    When events are logged to a node’s hacmp.out log file, the verbose output contains numerous lines of event details followed by a concise event summary. The event summaries make it easier to scan the log for important cluster events.

    You can view a compilation of just the event summary portions of the past seven days’ hacmp.out files by using the View Event Summaries option in the Problem Determination Tools SMIT panel. The event summaries can be compiled even if you have redirected the hacmp.out file to a non-default location. The Display Event Summaries report also includes resource group information generated by the clRGinfo command. You can also save the event summaries to a specified file instead of viewing them through SMIT.

    When events handle resource groups with dependencies or sites, a preamble is written to the hacmp.out log file listing the plan of sub_events for handling the resource groups.

    For more information on this file, see the section Event Summaries in Chapter 2: Using Cluster Log Files in the Troubleshooting Guide.

    Completing the Cluster Event Worksheet

    The Cluster Event Worksheet helps you plan the cluster event processing for your cluster. Appendix A: Planning Worksheets includes the worksheet referenced in the following procedure.

    For more information about cluster events, see Chapter 5: Configuring Cluster Events in the Administration Guide.

    For each node in the cluster, repeat the steps in the following procedure on a separate worksheet. To plan the customized processing for a specific cluster event, enter a value in the fields only as necessary:

      1. Record the cluster name in the Cluster Name field.
      2. Record the custom cluster event description in the Cluster Event Description field.
      3. Record the full pathname of the cluster event method in the Cluster Event Method field.
      4. Fill in the name of the cluster event in the Cluster Event Name field.
      5. Fill in the full pathname of the event script in the Event Command field.
      6. Record the full pathname of the event notification script in the Notify Command field.
      7. Record the full text of the remote notification method in Remote Notification Message Text.
      8. Record the full pathname of the file containing the remote notification method text in Remote Notification Message Location.
      9. Record the name of the pre-event script in the Pre-Event Command field.
      10. Record the name of the post-event script in the Post-Event Command field.
      11. Record the full pathname of the event retry script in the Event Recovery Command field.
      12. Indicate the number of times to retry in the Recovery Command Retry field.
      13. Record the time to allot to process an event for a resource group before a warning is displayed in the Time Until Warning field.
    Repeat steps 3 through 13 for each event you plan to customize.

    Where You Go from Here

    You have now planned the customized and user-defined event processing for your cluster. You will next address cluster clients issues, described in Chapter 8: Planning for HACMP Clients.

