Chapter 4: Configuring HACMP Cluster Topology and Resources (Extended)
This chapter describes how to configure cluster topology and resources for an HACMP cluster using the SMIT Extended Configuration path.
The main sections in this chapter include:
Understanding the Extended Configuration Options
To configure less common cluster elements, or when connectivity to each of the cluster nodes is unavailable, you can enter the information manually, much as in previous releases of the HACMP software.
When using the HACMP Extended Configuration SMIT paths, if any components are defined on remote nodes, you must manually initiate the discovery of cluster information. That is, discovery is optional (rather than automatic, as it is when using the Initialization and Standard Configuration SMIT path).
Using the options under the Extended Configuration menu, you can add the basic components of a cluster to the HACMP Configuration Database, as well as many additional types of resources. Use the Extended Configuration path to customize the cluster for all the components, policies, and options that are not included in the Initialization and Standard Configuration menus.
Note that certain options for configuring networks or IP labels are available only under the Extended Configuration path. In particular, make sure you use the Extended Configuration path if you plan to:
- Use IP Address Takeover via IP Replacement as a mechanism for binding IP labels/addresses to network interfaces. This option is available only under the Extended Configuration path in SMIT (configuring networks).
- Add or change an IP-based network.
- Add or change a non-IP-based network and devices.
- Configure persistent node IP labels, for cluster administrative purposes.
- Configure a distribution preference for service IP label aliases.

Note: You can use either ASCII SMIT or WebSMIT to configure a cluster. For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
Steps for Configuring an HACMP Cluster Using the Extended SMIT Menu
These are the basic steps to configure an HACMP cluster using the Extended SMIT menus.
Step 1: Run discovery (Optional). Run discovery if you have already configured some or all of the cluster components. Running discovery retrieves current AIX 5L configuration information from all cluster nodes. This information is displayed in picklists to help you make accurate selections of existing components. HACMP informs you about which components have been discovered by the system. Predefined components (those that are supported but are not discovered) are also made available as selections. See Discovering HACMP-Related Information for details.

Step 2: Configure, change, or customize the cluster topology. Under the Extended Topology Configuration menu, you can:
- Identify the nodes and establish communication paths between them using the Configure Nodes to an HACMP Cluster menu options. Here you name the cluster and select the nodes (listed in /etc/hosts) either by their names or their IP addresses. This gives HACMP the information it needs to communicate with the nodes that are participating in the cluster. Once each of the nodes is properly identified and working communications paths exist, you can run discovery to identify the basic components within the cluster.
The discovered hostnames are used as the node names and added to the HACMP Configuration Database (in particular, to HACMPnode ODM). The networks and the associated interfaces which share physical connectivity with two or more nodes in the cluster are automatically added to the HACMP Configuration Database (HACMPnetwork and HACMPadapter ODMs). Other discovered shared resource information includes PVIDs and volume groups.
- (Optional) Configure, change, or show sites.
- Configure, change, or show predefined or discovered IP-based networks, and predefined or discovered serial devices.
- Configure, change, show and update with AIX 5L settings HACMP Communication Interfaces and Devices. You can configure either previously defined, or previously discovered communication interfaces and devices.
- Configure, change, remove and show Network Interface Modules (NIMs).
- Configure, change, and show Persistent Node IP Labels.
Step 3: Configure or customize the resources to be made highly available. Use the Extended Resource Configuration menu to configure resources that are to be shared among the nodes in the cluster, such that if one component fails, another component automatically takes its place. You can configure the standard resources as well as several others:
- IP labels
- Application servers
- Volume groups
- Concurrent volume groups
- Logical volumes
- Filesystems
- Application monitors
- Tape resources
- Communication adapters and links for the operating system
- HACMP communication interfaces and links
- Disk, volume group and filesystems methods for OEM disks, volumes and filesystems. In particular, you can configure Veritas volumes and filesystems to work in an HACMP cluster.
Step 4: Configure the resource groups.

Step 5: Assign the resources that are to be managed together into resource groups. Place related resources into resource groups.

Step 6: Make any further additions or adjustments to the cluster configuration (all optional):
- Configure cluster security. See Chapter 17: Managing Cluster Security.
- Customize cluster events. See Chapter 6: Configuring Cluster Events.
- Configure HACMP file collections. For more information, see Managing HACMP File Collections in Chapter 7: Verifying and Synchronizing an HACMP Cluster.
- Configure performance tuning. See Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide.
- Configure a distribution preference for service IP label aliases. See Distribution Preference for Service IP Label Aliases: Overview in this chapter.
- Customize remote notification.
- Change attributes of nodes, communication interfaces and devices, networks, resources, or resource groups.
Step 7: Verify and synchronize the cluster configuration. Use the Verify and Synchronize HACMP Configuration menu to confirm that the desired configuration is feasible given the configured physical connections and devices, and to ensure that all nodes in the cluster have the same view of the configuration.

Step 8: Display the cluster configuration (Optional). View the cluster topology and resources configuration.

Step 9: Test cluster recovery procedures (Recommended). Run automated or custom tests before putting the cluster into the production environment. Select HACMP Cluster Test Tool from the Extended Configuration SMIT path.
Discovering HACMP-Related Information
Running the cluster discovery process is optional in the Extended Configuration SMIT path.
After you have configured and powered on all disks, communication devices, and point-to-point networks, and have configured communication paths to other nodes, HACMP can automatically collect this information and display it in the corresponding SMIT picklists, to help you make accurate selections of existing components. HACMP informs you about which components have been discovered by the system. Predefined components (those that are supported but are not discovered) are also made available as selections.
To run the HACMP cluster discovery process, take the following steps:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Discover HACMP-related Information from Configured Nodes and press Enter.
3. The software executes the discovery process.
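Discovery depends on working communication paths and name resolution for every node. As a quick pre-check, a script along the following lines can confirm that each prospective node name appears in a hosts file. This is a minimal sketch; the node names and file path in the usage example are placeholders, not values from this chapter.

```shell
#!/bin/sh
# Check that each prospective cluster node has an entry in a hosts file.
# Node names and the hosts-file path are illustrative placeholders.
check_hosts() {
    hosts_file=$1; shift
    missing=0
    for node in "$@"; do
        if grep -w "$node" "$hosts_file" >/dev/null 2>&1; then
            echo "ok: $node"
        else
            echo "missing: $node"
            missing=1
        fi
    done
    return $missing
}
```

For example, running `check_hosts /etc/hosts nodeA nodeB` before discovery flags any node name that does not resolve from /etc/hosts.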
Configuring Cluster Topology (Extended)
Complete the following procedures to define the cluster topology. You need to perform these steps on one node only. When you verify and synchronize the cluster topology, its definition is copied to the other nodes.
The Extended Topology Configuration panels include:
Configuring an HACMP Cluster
The only step necessary to configure the cluster is to assign the cluster name.
To assign a cluster name and configure a cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure an HACMP Cluster > Add/Change/Show an HACMP Cluster and press Enter.
3. Enter field values as follows:
Cluster Name: Enter an ASCII text string that identifies the cluster. The cluster name can include alphanumeric characters and underscores, but cannot have a leading numeric. Use no more than 32 characters. Do not use reserved names. For a list of reserved names, see List of Reserved Words.
4. Press Enter.
5. Return to the Extended Topology Configuration SMIT panel.
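The naming rules above (alphanumerics and underscores, no leading numeric, at most 32 characters) can be checked mechanically before you type the name into SMIT. The following shell function is an illustrative sketch of that check; it does not consult the reserved-words list, which you still need to verify against List of Reserved Words.

```shell
#!/bin/sh
# Validate a candidate HACMP cluster name: alphanumerics and underscores
# only, no leading digit, 1 to 32 characters. Reserved-word checking is
# deliberately omitted from this sketch.
valid_cluster_name() {
    name=$1
    [ ${#name} -ge 1 ] && [ ${#name} -le 32 ] || return 1
    case $name in
        [0-9]*) return 1 ;;              # no leading numeric
        *[!A-Za-z0-9_]*) return 1 ;;     # only alphanumerics and underscores
    esac
    return 0
}
```

For example, `valid_cluster_name my_cluster_01 && echo valid` prints "valid", while a name such as `1cluster` is rejected.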
Resetting Cluster Tunables
You can reset tunable values that were changed during cluster maintenance to their default (installation-time) settings.
Use this option to reset all the tunables (customizations) made to the cluster. Using this option returns all tunable values to their default values but does not change the cluster configuration. HACMP takes a snapshot file prior to resetting and informs you about the name and location of the snapshot file. You can choose to have HACMP synchronize the cluster when this operation is complete.
For instructions, see the Resetting HACMP Tunable Values section in Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide. For a list of what tunable values will change, see the section on the List of Tunable Values provided in Chapter 1.
Configuring HACMP Nodes
To configure the cluster nodes:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Nodes > Add a Node to the HACMP Cluster and press Enter.
3. Enter field values as follows:
Once communication paths are established, HACMP adds a new node to the cluster.
Defining HACMP Sites
Site definitions are optional. They are supplied to provide easier integration with the HACMP/XD feature components: Metro Mirror (previously known as synchronous PPRC), GLVM, and HAGEO. Sites must also be defined if you want to use cross-site LVM mirroring.
If you define sites to be used in some other way, appropriate methods or customization must be provided to handle site operations. If sites are defined, site events run during node_up and node_down events. See Chapter 7: Planning Cluster Events, in the Planning Guide for more information.
For information and documentation for HACMP/XD, see About This Guide and the following URL:
http://www.rs6000.ibm.com/aix/library
For more information on cross-site LVM mirroring, see the Planning Guide.
If you are configuring sites, two sites must be configured and all nodes must belong to one of the two sites.
To add a site definition to an HACMP cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Sites > Add a Site Definition and press Enter.
3. Enter the field values as follows. Also see the documentation for the product you are using that relies on the site definitions:
4. Press Enter to add the site definition to the HACMP Configuration Database.
5. Repeat the steps to add the second site.
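Because HACMP requires exactly two sites, with every node assigned to one of them, a small script can sanity-check a planned layout before you enter it in SMIT. This is an illustrative sketch; the "node site" input format is invented for the example and is not an HACMP file format.

```shell
#!/bin/sh
# Verify that a planned site layout defines exactly two sites.
# Reads "node site" pairs on stdin; the input format is illustrative.
check_sites() {
    n=$(awk '{print $2}' | sort -u | wc -l)
    n=$((n + 0))   # normalize any whitespace produced by wc
    if [ "$n" -eq 2 ]; then
        echo "ok: 2 sites"
    else
        echo "error: $n site(s) defined, need exactly 2"
        return 1
    fi
}
```

For example, `printf 'nodeA siteA\nnodeB siteB\n' | check_sites` confirms the two-site layout.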
Configuring HACMP Networks and Heartbeat Paths
To avoid a single point of failure, the cluster should have more than one network. Often the cluster has both IP and non-IP based networks, which allows HACMP to use different heartbeat paths. Use the Add a Network to the HACMP Cluster SMIT panel to configure HACMP IP and point-to-point networks.
To speed up the configuration process, run discovery before configuring networks.
You can use any or all of these methods for heartbeat paths:
- Point-to-point networks
- IP-based networks, including heartbeating using IP aliases
- Heartbeating over disk. For information about configuring heartbeating over disk, see the section Configuring Heartbeating over Disk.

Note: When using an SP Switch network, configure an additional network for HACMP. If only one network is configured, HACMP issues errors during the cluster verification.
If you need to configure networks in HACMP for use in a cluster with sites that has HACMP/XD for GLVM installed, see the HACMP/XD for GLVM Planning and Administration Guide for descriptions of the XD_data, XD_ip and XD_rs232 networks.
Configuring IP-Based Networks
To configure IP-based networks:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Networks > Add a Network to the HACMP Cluster and press Enter.
3. Select the type of network to configure.
4. Enter the information as follows:
5. Press Enter to configure this network.
6. Repeat the operation to configure more networks.
Configuring IP Address Takeover via IP Replacement
If you do not have extra subnets to use in the HACMP cluster, you may need to configure IPAT via IP Replacement for the HACMP cluster network.
Note: IPAT via IP Aliases is the default method for binding an IP label to a network interface, and for ensuring the IP label recovery. IPAT via IP Aliases saves hardware, but requires multiple subnets. For general information about IPAT and the differences between the two IPAT methods, see the Concepts and Facilities Guide. For planning IPAT methods, see the Planning Guide.
To configure IPAT via IP Replacement:
1. In the Add a Service IP Label/Address SMIT panel, specify that the IP label that you add as a resource to a resource group is Configurable on Multiple Nodes.
2. In the same panel, configure hardware address takeover (HWAT) by specifying the Alternate Hardware Address to Accompany IP Label/Address. For instructions, see the section Configuring Service IP Labels/Addresses later in this chapter.
Note: Do not use HWAT with gigabit Ethernet adapters (network interface cards) that support flow control. Network problems may occur after an HACMP fallover.
3. In the Add a Network to the HACMP Cluster panel, specify False in the Enable IP Takeover via IP Aliases SMIT field. For instructions, see the section Configuring Communication Interfaces/Devices to HACMP in this chapter.
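When planning IPAT via IP Replacement, the service IP label is placed on the same subnet as the interface address it replaces, so it can help to confirm subnet membership up front. The following pure-shell check is an illustrative sketch; the addresses in the usage example are placeholders.

```shell
#!/bin/sh
# Check whether two dotted-quad addresses fall in the same subnet for a
# given netmask. Pure-shell sketch for planning purposes.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1                       # split a.b.c.d into four fields
    IFS=$old_ifs
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}
same_subnet() {
    m=$(ip_to_int "$3")
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

For example, `same_subnet 192.3.42.1 192.3.42.20 255.255.255.0 && echo "same subnet"` confirms that the two addresses share a /24.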
Configuring Heartbeating over IP Aliases
You can configure heartbeating over IP Aliases to establish IP-based heartbeat rings that run over your existing cluster networks. Heartbeating over IP Aliases supports either IPAT via IP Aliases or IPAT via IP Replacement. The type of IPAT configured determines how HACMP handles the service label:
Note: HACMP removes the aliases from the interfaces at shutdown. It creates the aliases again when the network becomes operational. The hacmp.out file records these changes.
To configure heartbeating over IP Aliases, you specify an IP address offset when configuring an interface. Make sure that this address does not conflict with addresses configured on your network. For information about configuring an interface, see the section Configuring Communication Interfaces/Devices to HACMP.
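To illustrate why the offset must come from an unused range: conceptually, heartbeat aliases are derived from the offset base, one per interface. The derivation below is purely illustrative (HACMP computes the actual alias addresses internally), but it shows how an offset base fans out into a block of addresses that must not collide with anything already configured on the network.

```shell
#!/bin/sh
# Illustrative only: enumerate candidate heartbeat alias addresses
# starting from an offset base, one per interface. HACMP derives the
# real aliases internally; this sketch just demonstrates the fan-out.
derive_aliases() {
    base=$1; count=$2
    prefix=${base%.*}                 # e.g. 192.168.100
    last=${base##*.}                  # e.g. 1
    i=0
    while [ $i -lt "$count" ]; do
        echo "$prefix.$(( last + i ))"
        i=$(( i + 1 ))
    done
}
```

For example, `derive_aliases 192.168.100.1 3` lists 192.168.100.1 through 192.168.100.3, a block you would need to verify is unused.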
Verifying Configuration for Heartbeating over IP Aliases
The HACMP cluster verification ensures that:
- The configuration is valid for the address range.
- All interfaces are the same type (for example, Ethernet) and have the same subnet mask.
- The offset address allows sufficient addresses and subnets on the network.

Configuring an Application Service Interface
If you already have an application that is active and using a particular IP address as a base address on a network interface, you can configure this service IP label in HACMP without disrupting the application. The following steps guide you through this configuration:
1. Configure an HACMP cluster
2. Configure HACMP nodes
3. Configure HACMP networks
4. Run Discovery.
5. Configure HACMP communication interfaces/devices
6. Run verification and synchronization to propagate your configuration to all the nodes.
7. For each node that has an application using a particular IP Address:
a. For each IP Address used as a base address on network interface on that node, decide on a Boot_IP_Address. For more information, see the Planning Guide.
b. Run the sample utility clchipdev (described below):
/usr/es/sbin/cluster/samples/appsvclabel/clchipdev

The sample utility clchipdev helps you correctly configure an application service interface in HACMP when an application is already active and using a particular IP address as the base address on a network interface before HACMP is started.
Where:
- NODE is the node name.
- network_name is the name of the network that contains this service interface.
- App_IP_Address is the IP address currently in use by the application (and currently configured in the CuAt as the base address for the given interface).
- Boot_IP_Address is the IP address that is to be used as the new base (boot) address.
- Alternate_HW_address is an optional parameter for the hardware address (optionally used when configuring a service address on IPAT networks).

For example, suppose NodeA has an IP address 10.10.10.1 that is being used to make an application highly available. You would use the following steps:
1. Run the sample utility clchipdev.
clchipdev -n NodeA -w net_ip -a '10.10.10.1=192.3.42.1'

The utility performs an rsh to NodeA and determines the network interface for which 10.10.10.1 is currently configured as the base address (in this example, en0). It determines the network type as defined in the HACMPnetwork ODM, using the network name, and then runs:

chdev -l en0 -a netaddr=192.3.42.1 -P

This changes the CuAt on that node to use the new Boot_IP_Address as the base address.
The utility then replaces 10.10.10.1 with 192.3.42.1 in the HACMPadapter ODM, and configures 10.10.10.1 to HACMP as a service IP address.

2. Add this service IP label to a resource group.
3. Run verification and synchronization.
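The rebinding sequence above can be summarized as a dry run. The sketch below only prints the commands that the sample utility effectively performs for the example values (en0, 10.10.10.1, 192.3.42.1); it executes nothing against the system and is not part of HACMP.

```shell
#!/bin/sh
# Dry run: print the actions implied by the rebinding example above.
# Nothing is executed; the values are taken from the example.
rebind_dry_run() {
    iface=$1 app_ip=$2 boot_ip=$3
    echo "chdev -l $iface -a netaddr=$boot_ip -P"
    echo "replace $app_ip with $boot_ip in HACMPadapter ODM"
    echo "configure $app_ip to HACMP as a service IP address"
}
```

Running `rebind_dry_run en0 10.10.10.1 192.3.42.1` prints the three actions in order, which can be useful for reviewing the change before committing it.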
Configuring Point-to-Point Networks to HACMP
To configure a point-to-point network:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Networks > Add a Network to the HACMP Cluster and press Enter.
3. Select the type of network to configure.
4. Fill in the fields on the Add a non IP-based Network panel as follows:
Network Name: Name the network, using no more than 32 alphanumeric characters and underscores; do not begin the name with a numeric. Do not use reserved names. For a list of reserved names, see List of Reserved Words.

Network Type: Valid types are RS232, tmssa, tmscsi, and diskhb.
Note: The volume groups associated with the disks used for disk heartbeating (disk heartbeating networks) do not have to be defined as resources within an HACMP resource group. In other words, an enhanced concurrent volume group associated with the disk that enables heartbeating does not have to belong to any resource group in HACMP.
5. Press Enter to configure this network.
6. Repeat the operation to configure more networks.
Configuring Communication Interfaces/Devices to HACMP
When you are configuring these HACMP components, you can have three different scenarios:
- Communication interfaces and devices are already configured to AIX 5L, and you have run the HACMP discovery process to add them to HACMP picklists to aid in the HACMP configuration process. See Configuring Discovered Communication Interfaces to HACMP and Configuring Discovered Communication Devices to HACMP.
- Communication interfaces and devices are already configured to AIX 5L and need to be configured to HACMP (no discovery was run). See Configuring Predefined Communication Interfaces to HACMP and Configuring Predefined Communication Devices to HACMP.
- Network interfaces and serial devices need to be defined to AIX 5L before you can configure them in HACMP. In this case, HACMP SMIT provides links to AIX 5L SMIT, where you can configure, change, or delete communication interfaces/devices in the operating system, or update them with the AIX 5L settings, without leaving the HACMP user interface. To configure network interfaces and serial devices to the AIX 5L operating system without leaving HACMP SMIT, use the System Management (C-SPOC) > HACMP Communication Interface Management SMIT path.
For instructions, see the section Managing Communication Interfaces in HACMP in Chapter 13: Managing the Cluster Topology.
Configuring Discovered Communication Interfaces to HACMP
To add discovered communication interfaces to the HACMP cluster:
1. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Communication Interfaces/Devices > Add Communication Interfaces/Devices and press Enter. A panel appears that lets you add previously discovered, or previously defined network interfaces:
2. Select the Add Discovered Communication Interfaces and Devices option. SMIT displays a list of interfaces and devices.
3. Select Communication Interfaces you want to configure. The panel Select a Network Name appears.
4. Select a network name.
The panel Select One or More Discovered Communication Interfaces to Add appears. It displays a picklist of communication interfaces; the interfaces you select are added to the HACMP Configuration Database (ODM).
HACMP either uses HACMP Configuration Database defaults, or automatically generates values, if you did not specifically define them earlier. For example, the physical network name is automatically generated by combining the string “Net” with the network type (for instance, ether) plus the next available integer, as in NetEther3.
Interfaces that are already added to the cluster are filtered from the picklist, as in the following example:
5. Select All, or one or more discovered communication interfaces to add. The selected interfaces are added to the HACMP cluster (if you selected All, all discovered communication interfaces are added).
Configuring Predefined Communication Interfaces to HACMP
The Predefined Communication Interfaces panel provides fields and picklists that enable you to choose configuration options quickly.
While choosing options, make sure that your choices do not conflict with the existing network topology. For example, if AIX 5L refers to a Token-Ring NIC (Network Interface Card), make sure that HACMP refers to the same type of network interface card (for example, not an Ethernet NIC).
To add predefined network interfaces to the HACMP cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Communication Interfaces/Devices > Add Communication Interfaces/Devices and press Enter.
A panel appears that lets you add previously discovered, or previously defined network interfaces:
3. Select the Add Predefined Communication Interfaces and Devices option. SMIT displays a list of communication interfaces and devices for the selected type.
4. Select Communication Interfaces. The Select a Network Name panel appears.
5. Select a network name. The Add a Communication Interface panel appears.
6. Fill in the fields as follows:
7. Press Enter. You have added the communication interface(s) that were already predefined to AIX 5L to the HACMP cluster.
Configuring Discovered Communication Devices to HACMP
To configure discovered serial devices to the HACMP cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Communication Interfaces/Devices > Add Communication Interfaces/Devices and press Enter. A panel appears that lets you add previously discovered, or previously defined network interfaces or devices:
3. Select the Add Discovered Communication Interfaces and Devices option.
4. Select the Communications Devices type from the list.
The panel Select Point-to-Point Pair of Discovered Communication Devices to Add appears. It displays a picklist of communication devices; the devices you select are added to the HACMP Configuration Database. Devices that are already added to the cluster are filtered from the picklist, as in the following example:
5. Select only two devices in this panel. It is assumed that these devices are physically connected; you are responsible for making sure this is true.
6. Continue defining devices as needed.
Configuring Predefined Communication Devices to HACMP
To configure predefined communication devices for the cluster:
1. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Communication Interfaces/Devices > Add Communication Interfaces/Devices and press Enter. A panel appears that lets you add previously discovered, or previously defined network interfaces or devices:
2. Select the Add Predefined Communication Interfaces and Devices option.
3. Select the Communications Devices type from the list.
4. Select the non IP-based network to which you want to add the devices.
5. Enter the field values as follows:
6. Press Enter after filling in all required fields. HACMP now checks the validity of the device configuration. You may receive warnings if a node cannot be reached.
7. Repeat until each node has all appropriate communication devices defined.
Configuring Heartbeating over Disk
You can configure disk heartbeating over any shared disk that is part of an enhanced concurrent mode volume group. RSCT passes topology messages between two nodes over the shared disk. Heartbeating networks contain:
- Two nodes
- An enhanced concurrent mode disk that participates in only one heartbeating network.

The following considerations apply:
- Use the filemon command to check the disk load. If the disk is heavily used, it may be necessary to change the tuning parameters for the network to allow more missed heartbeats. Disk heartbeating networks use the same tuning parameters as RS232 networks.
- For troubleshooting purposes, you or a third-party system administrator assisting you with cluster support may optionally reset the HACMP tunable values (such as aliases for heartbeating, or the tuning parameters for the network module) to their installation-time defaults. For more information, see the Resetting HACMP Tunable Values section in Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide.

To configure a heartbeating over disk network:
1. Ensure that the disk is defined in an enhanced concurrent mode volume group.
2. Create a diskhb network.
For information about configuring a diskhb network, see the section Configuring Point-to-Point Networks to HACMP.
3. Add disk-node pairs to the network.
Ensure that each node is paired with the same disk, as identified by the device name, for example hdisk1, and PVID.
A device available as a physical volume to the Logical Volume Manager (LVM) is available for disk heartbeating, such as an hdisk or vpath disk. The lspv command displays a list of these devices.
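Since only devices with a PVID assigned are usable as physical volumes, a small filter over lspv-style output can shortlist heartbeat candidates. This is a sketch; the sample input in the usage example mimics typical lspv output rather than coming from a real node. On an actual node you would pipe in lspv itself.

```shell
#!/bin/sh
# Shortlist disks usable for a disk heartbeating pair: keep only
# physical volumes that have a PVID assigned. Reads lspv-style
# output ("name pvid vg state") on stdin.
hb_candidates() {
    while read -r disk pvid rest; do
        [ -n "$disk" ] || continue
        [ "$pvid" != "none" ] && echo "$disk $pvid"
    done
    return 0
}
```

On a cluster node, `lspv | hb_candidates` would list each disk name together with its PVID, ready for matching across the node pair.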
For information about adding communication interfaces to a disk heartbeating network, see the sections Configuring Discovered Communication Interfaces to HACMP and Configuring Discovered Communication Devices to HACMP.
Verifying Disk Heartbeating Configuration
The HACMP cluster verification ensures the following:
- Interface and network names are valid and unique.
- Disk heartbeating devices use valid device names (/dev/hdisk#, /dev/vpath#).
- Disk heartbeating network devices are included in enhanced concurrent mode volume groups.
- Each heartbeating network has two nodes with different names, and matching PVIDs for the node-hdisk pairs defined on each disk heartbeating network.

Using Disk Heartbeating Networks for Fast Failure Detection
HACMP 5.4 reduces the time it takes for a node failure to be realized throughout the cluster. When a node fails, HACMP uses disk heartbeating to place a departing message on the shared disk so neighboring nodes are aware of the node failure within one heartbeat period (hbrate). Topology Services then distributes the information about the node failure throughout the cluster nodes and a Topology Services daemon on each node sends a node_down event to any concerned client node.
You can turn on the fast method of node failure detection when you change the NIM values for disk heartbeating networks. See Reducing the Node Failure Detection Rate: Enabling Fast Detection for Node Failures in Chapter 13: Managing the Cluster Topology.
Using Disk Heartbeating Networks for Detecting Failed Disk Enclosures
In addition to providing a non-IP network to help ensure high availability, disk heartbeating networks can be used to detect a failure of a disk enclosure. To use this function, configure a disk heartbeating network for at least one disk in each disk enclosure.
To let HACMP detect a failed disk enclosure:
1. Configure a disk heartbeating network for a disk in the specified enclosure.
For information about configuring a disk heartbeating network, see the section Configuring Heartbeating over Disk.
2. Create a pre- or post-event, or a notification method, to determine the action to be taken in response to a failure of the disk heartbeating network. A failure of the disk enclosure is seen as a failure of the disk heartbeating network.
Configuring HACMP Persistent Node IP Labels/Addresses
A persistent node IP label is an IP alias that can be assigned to a network for a specified node. A persistent node IP label is a label that:
- Always stays on the same node (is node-bound)
- Co-exists with other IP labels present on an interface
- Does not require installing an additional physical interface on that node
- Is not part of any resource group.

Assigning a persistent node IP label to a network on a node gives you a node-bound address on a cluster network that you can use for administrative purposes to access a specific node in the cluster.
Prerequisites and Notes
If you are using persistent node IP Labels/Addresses, note the following issues:
- You can define only one persistent IP label on each node per cluster network.
- Persistent IP labels become available at a node’s boot time.
- On a non-aliased network, a persistent label may be placed on the same subnet as the service labels, or on an entirely different subnet. However, the persistent label must be placed on a different subnet than all non-service or non-boot IP labels (such as “backup” IP labels) on the network.
- On an aliased network, a persistent label may be placed on the same subnet as the aliased service label, or on an entirely different subnet. However, it must be placed on a different subnet than all boot IP labels on the network.
- Once a persistent IP label is configured for a network interface on a particular network on a particular node, it becomes available on that node on a boot interface at operating system boot time and remains configured on that network when HACMP is shut down on that node.
- You can remove a persistent IP label from the cluster configuration using the Delete a Persistent Node IP Label/Address SMIT panel. However, after the persistent IP label has been removed from the cluster configuration, it is not automatically deleted from the interface on which it was aliased. To completely remove the persistent IP label from the node, manually remove the alias with the ifconfig delete command or reboot the cluster node.
- Configure persistent node IP labels individually on each node. You cannot use the HACMP discovery process for this task.

To add persistent node IP labels:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Persistent Node IP Labels/Addresses > Add a Persistent Node IP Label and press Enter.
3. Enter the field values as follows:
4. Press Enter.
To change or show persistent node IP labels, use the Change/Show a Persistent Node IP label SMIT menu. To delete them, use the Delete a Persistent Node IP label menu.
Configuring Node-Bound Service IP Labels
A node-bound service IP label is a specific type of service IP label configured on a non-aliased network.
Node-bound service IP labels do not use IP aliases; therefore, to have a node-bound service IP label, the network must first be configured to use IPAT via IP Replacement.
These IP labels do not “float” with a resource group, but they are kept highly available on the node to which they are assigned. Node-bound service IP labels could be useful for the following purposes:
A concurrent resource group could have node-bound service IP addresses configured for it. This way, a “load balancer” type of application (or another application, in general) configured in front of the HACMP cluster could forward requests to each instance of the concurrent resource group via specifically assigned node-bound service IP addresses on each of the nodes in the resource group nodelist.
A node-bound service IP label could also be used for administrative purposes. Some of its functions can, of course, be achieved by using other capabilities in HACMP (such as using persistent service IP labels).
Node-bound service IP labels are required for configuring clusters in HACMP/XD for HAGEO. For more information, see the HACMP/XD for HAGEO documentation.
To configure a node-bound service IP label:
1. Add an IP-based network to the HACMP cluster. Use the procedure in the section Configuring IP-Based Networks.
2. Disable the option for IP Address Takeover via IP Aliases for this network. Use the procedure in the section Configuring IP Address Takeover via IP Replacement.
3. Add a communication interface to the network. Use the procedures in the section Configuring Discovered Communication Interfaces to HACMP, or in the section Configuring Discovered Communication Devices to HACMP.
4. Add a service IP label/address that is bound to a single node. Use the procedure in the section Configuring Service IP Labels/Addresses and select the option Bound to a Single Node.
Configuring HACMP Global Networks
This section describes global networks and steps to configure them.
HACMP Global Networks: Overview
In order to reduce the possibility of a partitioned HACMP cluster, you should configure multiple heartbeat paths between cluster nodes. Even if an IP network is not used for IP address takeover (IPAT), it should still be defined to HACMP for use by heartbeat. This reduces the chance that the loss of any single network will result in a partitioned cluster (and subsequent shutdown of one side of the partition).
When defining IP networks for heartbeating only (for example, the Administrative Ethernet on an SP), it is possible to combine multiple individual networks into a larger global network. This allows heartbeating across interfaces on multiple subnets.
Interfaces on each subnet are added to a single HACMP network (either manually or using the HACMP discovery process), and then the individual networks are added to the global network definition. HACMP treats all interfaces on the combined global network as a single network and ignores the subnet boundaries of the individual networks.
Networks combined in a global network therefore cannot be used for IP address recovery (that is, the IP labels from such networks should not be included in resource groups).
You define global networks by assigning a character string name to each HACMP network that you want to include as a member of the global network. All members of a global network must be of the same type (all Ethernet, for example).
Steps for Configuring HACMP Global Networks
To configure global networks, complete the following steps:
1. Enter smit hacmp
2. In SMIT, select Extended Topology Configuration > Configure HACMP Global Networks and press Enter. SMIT displays a picklist of defined HACMP networks.
3. Select one of these networks. SMIT displays the Change/Show a Global Network panel. The name of the network you selected is entered as the local network name.
4. Enter the name of the global network (character string).
5. Repeat these steps to define all the HACMP networks to be included in each global network you want to define.
Configuring HACMP Network Modules
Using HACMP SMIT, you can add, change, remove or list existing network interface modules (NIMs).
For information on listing existing NIMs, or adding and removing NIMs, see Changing the Configuration of a Network Module in Chapter 13: Managing the Cluster Topology.
To add a network interface module:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure HACMP Network Modules > Add a Network Module and press Enter.
3. Enter field values as follows:
After the command completes, a panel appears that shows the current settings for the specified network module.
Configuring Topology Services and Group Services Logs
You can change the settings for the length of the Topology and Group services logs. However, the default settings are highly recommended. The SMIT panel contains entries for heartbeat settings, but these are not adjustable.
Note: You can change the HACMP network module settings. See the section Configuring HACMP Network Modules.
To configure Topology Services and Group Services logs:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Topology Configuration > Configure Topology Services and Group Services > Change/Show Topology and Group Services Configuration and press Enter.
3. Enter field values as follows:
Topology Services log length (lines)    The default is 5000. This is usually sufficient.
Group Services log length (lines)    The default is 5000. This is usually sufficient.
4. Press Enter if you make any changes to field values, and then return to the menu.
Showing HACMP Topology
To view the current HACMP topology configuration, select the Show HACMP Topology option from the Extended Configuration > Extended Topology Configuration menu.
Configuring HACMP Resources (Extended)
Once you have configured the cluster topology, continue setting up your cluster by configuring the resources that will be placed in the resource groups. You may have already used the HACMP Standard path menus to configure some resources and groups. Use the Extended menus to add the resources not available on the standard path, to make changes, or to add more extensive customization.
Using the Extended Resources Configuration path you can configure the following types of resources:
Application server
Service IP label
Shared volume group
Concurrent volume group
Filesystem
Application monitor(s)
CS/AIX
Fast Connect
Tape drive
Configuring Service IP Labels as HACMP Resources
You should record the service IP label configuration on the planning worksheets. See Chapter 3: Planning Cluster Network Connectivity in the Planning Guide for information on service IP labels that are included as resources in the resource groups.
For the initial configuration, follow the procedures described in this section.
Discovering IP Network Information (Optional)
When you are using the extended configuration path, you can choose to run the HACMP cluster information discovery process. If you choose to run discovery, all communication paths must be configured first. Then HACMP will discover nodes, networks, and communication interfaces and devices for you and show them in the SMIT picklists. If you choose not to run discovery, HACMP will only include in the picklist network information that is predefined in AIX 5L.
To run cluster discovery:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Discover HACMP-Related Information from Configured Nodes and press Enter.
HACMP retrieves current AIX 5L configuration information from all cluster nodes. This information is displayed in picklists to help you make accurate selections of existing components. HACMP informs you about which components have been discovered by the system. Predefined components (those that are supported but are not discovered) are also made available as selections in picklists.
3. Return to the Extended Configuration menu.
Configuring Service IP Labels/Addresses
To add service IP labels/addresses as resources to the resource group in your cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Service IP Labels/Addresses > Add a Service IP Label/Address and press Enter. SMIT displays the following panel.
3. Select the type of service IP label you are configuring:
4. Fill in field values as follows:
5. Press Enter after filling in all required fields. HACMP now checks the validity of the IP label/address configuration.
6. Repeat the previous steps until you have configured all service IP labels/addresses for each network, as needed.
Distribution Preference for Service IP Label Aliases: Overview
You can configure a distribution preference for the service IP labels that are placed under HACMP control. HACMP lets you specify the distribution preference for the service IP label aliases. These are the service IP labels that are part of HACMP resource groups and that belong to IPAT via IP Aliasing networks.
A distribution preference for service IP label aliases is a network-wide attribute used to control the placement of the service IP label aliases on the physical network interface cards on the nodes in the cluster. Configuring a distribution preference for service IP label aliases does the following:
Lets you customize the load balancing for service IP labels in the cluster.
Enables HACMP to redistribute the alias service IP labels according to the preference you specify.
Allows you to configure the type of distribution preference suitable for the VPN firewall external connectivity requirements. For more information, see the section Planning for the VPN Firewall Network Configurations in HACMP in the Planning Guide.
The distribution preference is exercised as long as there are acceptable network interfaces available. HACMP always keeps service IP labels active, even if the preference cannot be satisfied.
Rules for the Distribution Preference for Service IP Label Aliases
The following rules apply to the distribution preference:
You can specify the distribution preference for service IP labels that belong to IPAT via IP Aliases networks.
If you do not specify any preference, HACMP by default distributes all service IP label aliases across all available boot interfaces on a network using the IPAT via IP Aliasing function. For more information on how the default method for service IP label distribution works, see Appendix B: Resource Group Behavior during Cluster Events.
If there are insufficient network interface cards available to satisfy the preference that you have specified, HACMP allocates service IP label aliases to an active network interface card that may be hosting other IP labels.
You can change the IP label distribution preference dynamically: the new selection becomes active during subsequent cluster events. (HACMP does not require the currently active service IP labels to conform to the newly changed preference.)
If you did not configure persistent labels, HACMP lets you select the Collocation with Persistent and Anti-Collocation with Persistent distribution preferences, but it issues a warning and uses the regular collocation or anti-collocation preferences by default.
When a service IP label fails and another network interface is available on the same node, HACMP recovers the service IP label aliases by moving them to another NIC on the same node. During this event, the distribution preference that you specified remains in effect.
You can view the distribution preference per network using the cltopinfo or the cllsnw commands.
Configuring Distribution Preference for Service IP Label Aliases
To specify a distribution preference for service IP label aliases:
1. (Optional). Configure a persistent IP label for each cluster node on the specific network. For instructions, see the section Configuring HACMP Persistent Node IP Labels/Addresses.
2. Configure the service IP labels for the network.
3. Select the type of the distribution preference for the network. For a list of available distribution preferences, see the section below.
Types of Distribution for Service IP Label Aliases
You can specify in SMIT the following distribution preferences for the placement of service IP label aliases:
Steps to Configure Distribution Preference for Service IP Label Aliases
To configure a distribution preference for service IP label aliases on any cluster node:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure Resource Distribution Preferences > Configure Service IP labels/addresses Distribution Preferences and press Enter.
HACMP displays a list of networks that use IPAT via IP Aliasing.
3. Select a network for which you want to specify the distribution preference.
4. SMIT displays the Configure Resource Distribution Preferences screen. Enter field values as follows:
Note: If you did not configure persistent IP labels, HACMP lets you select the Collocation with Persistent and Anti-Collocation with Persistent distribution preferences but issues a warning and uses the regular collocation or anti-collocation preferences by default.
5. Press Enter to add this information to the HACMP Configuration Database on the local node. Return to previous HACMP SMIT screens to perform other configuration tasks.
6. To synchronize the cluster definition, go to the Initialization and Standard Configuration or Extended Configuration menu and select Verification and Synchronization. If the Cluster Manager is running on the local node, synchronizing the cluster resources triggers a dynamic reconfiguration event.
See Synchronizing Cluster Resources in Chapter 14: Managing the Cluster Resources for more information.
Configuring HACMP Application Servers
An application server is a cluster component that is included in the resource group as a cluster resource, and that is used to control an application that must be kept highly available. An application server consists of application start and stop scripts. Configuring an application server does the following:
Associates a meaningful name with the server application. For example, you could give the tax software a name such as taxes. You then use this name to refer to the application server when you define it as a resource. When you set up the resource group, you add an application server as a resource.
Points the cluster event scripts to the scripts that they call to start and stop the server application.
Allows you to then configure application monitoring for that application server.
Note that this section does not discuss how to write the start and stop scripts. See the vendor documentation for specific product information on starting and stopping a particular application.
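Although the start and stop scripts themselves are application-specific, HACMP evaluates them through their exit status. The sketch below shows the core shape of a start script as a shell function; the daemon command passed in is a hypothetical placeholder, not any particular product's startup procedure:

```shell
# Sketch of the core logic of an application server start script.
# The command that launches the application is passed as an argument and is
# entirely hypothetical -- substitute your application's own startup command.
start_app() {
    if "$@" ; then
        echo "application started"
        return 0          # exit status 0 signals a successful start
    else
        echo "application failed to start" >&2
        return 1          # a nonzero status signals a failed start
    fi
}

# Example invocation with a stand-in command:
start_app true
# -> application started
```

The stop script follows the same pattern: terminate the application, then return 0 only if it actually stopped.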
Defining an HACMP Application Server
To configure an application server on any cluster node:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Servers > Add an Application Server and press Enter.
3. Enter field values as follows:
4. Press Enter to add this information to the HACMP Configuration Database on the local node. Return to previous HACMP SMIT panels to perform other configuration tasks.
Configuring Volume Groups, Logical Volumes, and Filesystems as Resources
You define volume groups, logical volumes, and filesystems in AIX 5L and then configure them as resources for HACMP. Plan them and record them on the worksheets before configuring them in HACMP. For more information, see Chapter 5 in the Installation Guide and Chapter 11: Managing Shared LVM Components in this guide.
Configuring Concurrent Volume Groups, Logical Volumes, and Filesystems as Resources
Concurrent volume groups, logical volumes, and filesystems must be defined in AIX 5L and then configured as resources for HACMP. Plan them and record them on the worksheets before configuring them in HACMP. See Chapter 5: Planning Shared LVM Components in the Planning Guide and Chapter 12: Managing Shared LVM Components in a Concurrent Access Environment in this guide.
Configuring Multiple Application Monitors
HACMP can monitor specified applications using application monitors. These application monitors can:
Check if an application is running before HACMP starts it.
Watch for the successful startup of the application.
Check that the application runs successfully after the stabilization interval has passed.
Monitor both the startup and the long-running process.
Automatically take action to restart applications upon detecting process termination or other application failures.
In HACMP 5.2 and up, you can configure multiple application monitors and associate them with one or more application servers.
By supporting multiple monitors per application, HACMP can support more complex configurations. For example, you can configure one monitor for each instance of an Oracle parallel server in use. Or, you can configure a custom monitor to check the health of the database along with a process termination monitor to instantly detect termination of the database process.
Note: If a monitored application is under control of the system resource controller (SRC), ensure that its action:multi settings are -O and -Q. The -O setting specifies that the subsystem is not restarted if it stops abnormally. The -Q setting specifies that multiple instances of the subsystem are not allowed to run at the same time. These values can be checked using the following command:
lssrc -Ss Subsystem | cut -d : -f 10,11
If the values are not -O and -Q, change them using the chssys command.
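To illustrate the check, the sketch below runs the documented cut command against a hypothetical colon-separated subsystem record (the field layout shown is illustrative only, not an exact reproduction of lssrc -Ss output, and the subsystem name "mydb" is made up); on a live node you would pipe real lssrc -Ss output instead:

```shell
# Hypothetical colon-separated subsystem record; in this illustration,
# fields 10 and 11 carry the action (-O) and multi (-Q) values.
sample_record="mydb:::/usr/sbin/mydbd:::/dev/console:/dev/console:/dev/console:-O:-Q:20:0"

echo "$sample_record" | cut -d : -f 10,11
# -> -O:-Q

# On a cluster node, the real check and (if needed) the fix would be:
#   lssrc -Ss mydb | cut -d : -f 10,11
#   chssys -s mydb -O -Q      # "mydb" is a hypothetical subsystem name
```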
Process and Custom Monitoring
You can select either of two application monitoring methods:
Process application monitoring detects the termination of one or more processes of an application, using RSCT Resource Monitoring and Control.
Custom application monitoring checks the health of an application with a custom monitor method at user-specified polling intervals.
Process monitoring is easier to set up, as it uses the built-in monitoring capability provided by RSCT and requires no custom scripts. However, process monitoring may not be an appropriate option for all applications. Custom monitoring can monitor more subtle aspects of an application’s performance and is more customizable, but it takes more planning, as you must create the custom scripts.
Fallover and Notify Actions
In both process and custom monitoring methods, when the monitor detects a problem, HACMP attempts to restart the application on the current node and continues the attempts until the specified retry count has been exhausted.
When an application cannot be restarted within the retry count, HACMP takes one of two actions, which you specified when configuring the application monitor:
Choosing fallover causes the resource group containing the application to fall over to the node with the next highest priority according to the nodelist. (See Note on the Fallover Option and Resource Group Availability for more information.)
Choosing notify causes HACMP to generate a server_down event, which informs the cluster of the failure.
Monitor Modes
When you configure process monitor(s) and custom monitor(s) for the application server, you can also specify the mode in which the application monitor is used:
Startup Monitoring Mode. In this mode, the monitor checks that the application server starts successfully within the specified stabilization interval and exits after the stabilization period expires. The monitor in startup mode may run more than once, but it always runs during the time specified by the stabilization interval value in SMIT. A zero return code within the stabilization interval indicates that the application started successfully; a non-zero return code within the interval is interpreted as a failure of the application to start. Use this mode for applications in parent resource groups. If you configure dependencies between resource groups in the cluster, the applications in these resource groups are started sequentially as well. To ensure that this process goes smoothly, we recommend configuring several application monitors and, especially, a monitor that checks application startup for the application that is included in the parent resource group. This ensures that the application in the parent resource group starts successfully.
Long-Running Mode. In this mode, the monitor periodically checks that the application is running successfully. The checking begins after the stabilization interval expires and it is assumed that the application server is started and the cluster has stabilized. The monitor in the long-running mode runs at multiple intervals based on the monitoring interval value that you specify in SMIT. Configure a monitor in this mode for any application server. For example, applications included in child and parent resource groups can use this mode of monitoring.
Both. In this mode, the monitor checks for the successful startup of the application server and periodically checks that the application is running successfully.
Retry Count and Restart Interval
The restart behavior depends on two parameters, the retry count and the restart interval, that you configure in SMIT.
Retry count. The retry count specifies how many times HACMP should try restarting before considering the application failed and taking subsequent fallover or notify action.
Restart interval. The restart interval dictates the number of seconds that the restarted application must remain stable before the retry count is reset to zero, thus completing the monitor activity until the next failure occurs.
Note: Do not specify both of these parameters if you are creating an application monitor that will only be used in startup monitoring mode.
If the application successfully starts up before the retry count is exhausted, the restart interval comes into play. By resetting the retry count after a stable run, it prevents unnecessary fallover action that could occur when applications fail several times over an extended time period. For example, a monitored application with a retry count set to three (the default) could fail to restart twice, and then successfully start and run cleanly for a week before failing again. This third failure should be counted as a new failure with three new restart attempts before invoking the fallover policy. The restart interval, set properly, would ensure the correct behavior: it would have reset the count to zero when the application was successfully started and found in a stable state after the earlier failure.
Be careful not to set the restart interval to too short a period of time. If the period is too short, the count could be reset to zero too soon, before the next failure, and the fallover or notify activity will never occur.
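The interaction of the two parameters can be sketched as a small state machine. The following is an illustration only, not HACMP code: failures consume the remaining retry count until fallover, and a run that stays stable past the restart interval resets the count.

```shell
# Illustration of retry count / restart interval behavior -- not HACMP code.
RETRY_COUNT=3          # matches the default retry count described above
remaining=$RETRY_COUNT

on_failure() {         # called each time the monitor detects a failure
    if [ "$remaining" -gt 0 ]; then
        remaining=$((remaining - 1))
        echo "restart"
    else
        echo "fallover"
    fi
}

on_stable() {          # called once the application stays up past the restart interval
    remaining=$RETRY_COUNT
    echo "reset"
}

# Two failures, then a stable week, then four failures in a row:
on_failure; on_failure; on_stable
on_failure; on_failure; on_failure; on_failure
# -> restart, restart, reset, restart, restart, restart, fallover (one per line)
```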
See the instructions for setting the retry count and restart intervals later in this chapter for additional details.
Application Monitoring Prerequisites and Considerations
Keep the following in mind when planning and configuring application monitoring:
Any application to be monitored must be defined to an application server in an existing cluster resource group.
If you have configured dependent resource groups, we recommend configuring multiple monitors: for applications included in parent resource groups, and for applications in child resource groups. For example, a monitor for a parent resource group can monitor the successful startup of the application, and a monitor for a child resource group can monitor the process for an application. For more information, see Monitor Modes.
Multiple monitors can be configured for the same application server. Each monitor can be assigned a unique name in SMIT.
The monitors that you configure must conform to existing configuration rules. For more information, see Configuring a Process Application Monitor and Configuring a Custom Application Monitor.
Before configuring an application monitor, configure all your application servers. Then configure the monitors and associate them with the servers. You can go back at any time and change the association of monitors to servers.
You can configure no more than 128 monitors per cluster. No limit exists on the number of monitors per application server, as long as the total number of monitors in the cluster does not exceed 128.
When multiple monitors are configured that use different fallover policies, each monitor specifies a failure action of either “notify” or “fallover”. HACMP processes actions in the order in which the monitors indicate an error.
For example, if two monitors are configured for an application server and one monitor uses the “notify” method and the other uses the “fallover” method, the following occurs:
If a monitor with the “fallover” action indicates an error first, HACMP moves the resource group to another node, and the remaining monitor(s) are shut down and restarted on another node. HACMP takes no actions specified in any other monitor.
If a monitor with the “notify” action indicates an error first, HACMP runs the “notify” method and shuts down that monitor, but any remaining monitors continue to operate as before. You can manually restart the “notify” monitor on that node using the Suspend/Resume Application Monitoring SMIT panel.
If multiple monitors are used, HACMP does not use a particular order for monitor startup or shutdown. All monitors for an application server are started at the same time. If two monitors are configured with different fallover policies and they fail at precisely the same time, HACMP does not guarantee that it processes the methods specified for one monitor before the methods for the other.
The same monitor can be associated with multiple application servers using the Application Monitor(s) field in the Change/Show an Application Server SMIT panel. You can select a monitor from the picklist.
If you remove an application monitor, HACMP removes it from the server definition for all application servers that were using the monitor, and indicates which servers are no longer using the monitor.
If you remove an application server, HACMP removes that server from the definition of all application monitors that were configured to monitor the application. HACMP also sends a message about which monitor will no longer be used for the application.
If you remove the last application server in use for any particular monitor (that is, if the monitor will no longer be used for any application), verification issues a warning that the monitor will no longer be used.
Note on the Fallover Option and Resource Group Availability
Be aware that if you select the fallover option of application monitoring in the Customize Resource Recovery SMIT panel—which could cause a resource group to migrate from its original node—the possibility exists that while the highest priority node is up, the resource group remains inactive. This situation occurs when an rg_move event moves a resource group from its highest priority node to a lower priority node, and then you stop the cluster services on the lower priority node with the option to take all the resources offline. Unless you bring the resource group up manually, it remains in an inactive state.
Also, for more information on resource group availability, see the section Selective Fallover for Handling Resource Groups in Appendix B: Resource Group Behavior during Cluster Events.
Steps for Configuring Multiple Application Monitors
To define multiple application monitors for an application:
1. Define one or more application servers. For instructions, see the section Configuring Application Servers in Configuring an HACMP Cluster (Standard).
2. Add the monitors to HACMP. The monitors can be added using the HACMP Extended Configuration path in SMIT. For instructions, see the sections Configuring a Process Application Monitor and Configuring a Custom Application Monitor.
Configuring a Process Application Monitor
You can configure multiple application monitors and associate them with one or more application servers. By supporting multiple monitors per application, HACMP can support more complex configurations.
Process application monitoring uses the RSCT subsystem functions to detect the termination of a process and to generate an event. This section describes how to configure process application monitoring, in which you specify one or more processes of a single application to be monitored.
Note: Process monitoring may not be the appropriate solution for all applications. For instance, you cannot monitor a shell script with a process application monitor. If you wish to monitor a shell script, configure a custom monitor. See Configuring a Custom Application Monitor for details on the other method of monitoring applications.
Identifying Correct Process Names
For process monitoring, it is very important that you list the correct process names in the SMIT Add Process Application Monitor panel. Use the process names that are listed in response to the ps -el command, not ps -f. (This is true for any process that is launched through a #!<path name> line in a script; for example, bsh, csh, and so on.)
If you are unsure of the correct names, use the following short procedure to identify all the process names for your list.
To identify correct process names:
1. Enter the following command:
ps -el | cut -c72-80 | sort > list1
2. Run the application server.
3. Enter the following command:
ps -el | cut -c72-80 | sort > list2
4. Compare the two lists by entering:
diff list1 list2 | grep \>
The result is a complete and accurate list of possible processes to monitor. You may choose not to include all of them in your process list.
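The comparison step can be demonstrated with simulated before/after snapshots. In the sketch below, the process names (including the application process "mydbd") are hypothetical stand-ins for what the two ps snapshots would contain on a real node:

```shell
# Simulated before/after snapshots of "ps -el | cut -c72-80 | sort" output;
# all process names here are hypothetical.
printf 'init\nksh\nsyncd\n'        | sort > list1   # before starting the app
printf 'init\nksh\nmydbd\nsyncd\n' | sort > list2   # after starting the app

# Lines present only in list2 are the candidate processes to monitor:
diff list1 list2 | grep '>'
# -> > mydbd
```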
Steps for Configuring a Process Application Monitor
An application must have been defined to an application server before you set up the monitor.
To configure a process application monitor (in any of the three running modes: startup mode, long-running mode or both):
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Monitoring > Configure Process Application Monitors > Add Process Application Monitor and press Enter. A list of previously defined application servers appears.
3. Select the application server to which you want to add a process monitor.
4. In the Add a Process Application Monitor panel, fill in the field values as follows:
Monitor Name    Enter the name of the application monitor. Each monitor can have a unique name that does not have to be the same as the application server name.
Monitor Mode    Select the mode in which the application monitor monitors the application:
- startup monitoring. In this mode the application monitor checks that the application server has successfully started within the specified stabilization interval. The monitor in this mode may run multiple times, as long as it is being run within the stabilization interval that you specify. If the monitor in this mode returns a zero code, this means that the application had started successfully. If a non-zero code is returned, this means that the application did not start within the stabilization interval. Select this mode if you are configuring an application monitor for an application that is included in a parent resource group (in addition to other monitors that you may need for dependent resource groups).
- long-running monitoring. In this mode, the application monitor periodically checks that the application server is running. The monitor is run multiple times based on the monitoring interval that you specify. If the monitor returns a zero code, it means that the application is running successfully. A non-zero return code indicates that the application has failed. The checking starts after the specified stabilization interval has passed. This mode is the default.
- both. In this mode, the application monitor checks that within the stabilization interval the application server has started successfully, and periodically monitors that the application server is running after the stabilization interval has passed. If the same monitor is used in the “both” mode, HACMP interprets the return codes differently, according to which type of monitoring is used (see the description of modes).
Processes to Monitor
    Specify the process(es) to monitor. You can type more than one process name; use spaces to separate the names.
    Note: To be sure you are using correct process names, use the names as they appear from the ps -el command (not ps -f), as explained in the section Identifying Correct Process Names.

Process Owner
    Specify the user ID of the owner of the processes specified above, for example root. Note that the process owner must own all processes to be monitored.

Instance Count
    Specify how many instances of the application to monitor. The default is 1 instance. The number of instances must exactly match the number of processes to monitor. If you specify one instance, and another instance of the application starts, you will receive an application monitor error.
    Note: This number must be greater than 1 if you have specified more than one process to monitor (1 instance for each process).

Stabilization Interval
    Specify the time (in seconds). HACMP uses the stabilization interval for the monitor in different ways, depending on which monitor mode is selected in this SMIT panel:
- If you select the startup monitoring mode, the stabilization interval is the period within which HACMP runs the monitor to check that the application has successfully started. When the specified time expires, HACMP terminates the monitoring of the application startup and continues event processing. If the application fails to start within the stabilization interval, the resource group’s acquisition fails on the node, and HACMP launches resource group recovery actions to acquire a resource group on another node. The number of seconds you specify should be approximately equal to the period of time it takes for the application to start. This depends on the application you are using.
- If you select the long-running mode for the monitor, the stabilization interval is the period during which HACMP waits for the application to stabilize, before beginning to monitor that the application is running successfully. For instance, with a database application, you may wish to delay monitoring until after the start script and initial database search have been completed. You may need to experiment with this value to balance performance with reliability.
- If you select both as a monitoring mode, the application monitor uses the stabilization interval to wait for the application to start successfully. It uses the same interval to wait until it starts checking periodically that the application is successfully running on the node.
Note: In most circumstances, this value should not be zero.

Restart Count
    Specify the number of times to try restarting the application before taking any other actions. The default is 3. If you are configuring a monitor that will be used only in the startup monitoring mode, the restart count does not apply, and HACMP ignores values entered in this field.
    Note: Make sure you enter a Restart Method if your Restart Count is any non-zero value.

Restart Interval
    Specify the interval (in seconds) that the application must remain stable before resetting the restart count. Do not set this to be shorter than (Restart Count) x (Stabilization Interval). The default is 10% longer than that value. For example, with a Restart Count of 3 and a Stabilization Interval of 60 seconds, the Restart Interval should be at least 180 seconds; the default would be 198 seconds. If the restart interval is too short, the restart count is reset too soon, and the desired fallover or notify action may not occur when it should. If you are configuring a monitor that will be used only in the startup monitoring mode, the restart interval does not apply, and HACMP ignores values entered in this field.

Action on Application Failure
    Specify the action to be taken if the application cannot be restarted within the restart count. You can keep the default choice notify, which runs an event to inform the cluster of the failure, or select fallover, in which case HACMP recovers the resource group containing the failed application on the cluster node with the next highest priority for that resource group. If you are configuring a monitor that will be used only in the startup monitoring mode, the action specified in this field does not apply, and HACMP ignores values entered in this field.
    See Note on the Fallover Option and Resource Group Availability for more information.
Notify Method
    (Optional) Define a notify method that will run when the application fails. This custom method runs during the restart process and during notify activity. If you are configuring a monitor that will be used only in the startup monitoring mode, the method specified in this field does not apply, and HACMP ignores values entered in this field.

Cleanup Method
    (Optional) Specify an application cleanup script to be called when a failed application is detected, before calling the restart method. The default is the application server stop script defined when the application server was set up (if you have only one application server defined; if you have multiple application servers, enter in this field the stop script used for the associated application server). If you are configuring a monitor that will be used only in the startup monitoring mode, the method specified in this field does not apply, and HACMP ignores values entered in this field.
    Note: With application monitoring, since the application is already stopped when this script is called, the server stop script may fail.

Restart Method
    (Required if Restart Count is not zero.) The default restart method is the application server start script defined previously, if only one application server was set up. This field is empty if multiple servers are defined. You can specify a different method here if desired. If you are configuring a monitor that will be used only in the startup monitoring mode, the method specified in this field does not apply, and HACMP ignores values entered in this field.
5. Press Enter.
SMIT checks the values for consistency and enters them into the HACMP Configuration Database. When the resource group is brought online, the application monitor in the long-running mode starts (if it is defined). Note that the application monitor in the startup monitoring mode starts before the resource group is brought online.
When you synchronize the cluster, verification ensures that all methods you have specified exist and are executable on all nodes.
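Before synchronizing, you can confirm by hand that the names you entered in the Processes to Monitor field match what ps -el reports, since ps -f may show a different string for the same process. The helper below is an illustration only, not part of HACMP:

```shell
#!/bin/sh
# Check whether a process name appears in 'ps -el' output -- the form
# HACMP process monitoring matches against. 'ps -f' may report a
# different string (for example, a full command line), so always
# confirm the name against ps -el.
monitored_name_visible() {
    name="$1"
    ps -el | awk '{ print $NF }' | grep -Fx "$name" >/dev/null
}

# Example usage (substitute the name your application runs as):
if monitored_name_visible "inetd"; then
    echo "inetd is visible to ps -el"
else
    echo "inetd not found; check the name against ps -el output"
fi
```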
Configuring a Custom Application Monitor
You can configure multiple application monitors and associate them with one or more application servers. By supporting multiple monitors per application, HACMP can support more complex configurations. For more information, see the section Steps for Configuring Multiple Application Monitors.
Custom application monitoring allows you to write a monitor method to test for conditions other than process termination. For example, if an application sometimes becomes unresponsive while still running, a custom monitor method could test the application at defined intervals and report when the application’s response is too slow. Also, some applications (shell scripts, for example) cannot be registered with RSCT, so process monitoring cannot be configured for them. A custom application monitor method can monitor these types of applications.
For instructions on defining a process application monitor, which requires no custom monitor method, refer to Application Monitoring Prerequisites and Considerations.
Notes on Defining a Monitor Method
Unlike process monitoring, custom application monitoring requires you to provide a script to test the health of the application. You must also decide on a suitable polling interval.
When devising your custom monitor method, keep the following points in mind:
- The monitor method must be an executable program (it can be a shell script) that tests the application and exits, returning an integer value that indicates the application's status. The return value must be zero if the application is healthy, and non-zero if the application has failed.
- HACMP cannot pass arguments to the monitor method.
- The method can log messages to /tmp/clappmond.<application monitor name>.monitor.log by simply printing them to standard output (stdout). The monitor log file is overwritten each time the application monitor runs.
- Since the monitor method is set up to terminate if it does not return within the specified polling interval, do not make the method overly complicated.

Steps for Configuring a Custom Application Monitor
To set up a custom application monitoring method:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Monitoring > Configure Custom Application Monitors > Add a Custom Application Monitor and press Enter.
3. Select the application server for which you want to add a monitoring method.
4. In the Add Custom Application Monitor panel, fill in field values as follows. Note that the Monitor Method and Monitor Interval fields require you to supply your own scripts and specify your own preference for the polling interval:
5. Press Enter.
SMIT checks the values for consistency and enters them into the HACMP Configuration Database. When the resource group comes online, the application monitor in the long-running mode starts. (The application startup monitor starts before the resource group is brought online).
When you synchronize the cluster, verification ensures that all methods you have specified exist and are executable on all nodes.
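As an illustration of the monitor-method requirements described above, here is a minimal custom monitor written as a shell function. The process-existence test and the myappd name are assumptions for the example; a real method would add an application-specific responsiveness probe. In the deployed script you would call the function and exit with its return code:

```shell
#!/bin/sh
# Minimal custom monitor sketch. HACMP passes no arguments; exit 0
# means healthy, non-zero means failed. Messages printed to stdout are
# captured in the monitor log. 'myappd' is a hypothetical process name.
monitor_app() {
    app="$1"
    # Is the process running at all?
    if ! ps -el | awk '{ print $NF }' | grep -Fx "$app" >/dev/null; then
        echo "monitor: $app is not running"
        return 1
    fi
    # Add an application-specific probe here (client query, port check,
    # and so on). Keep it quick -- HACMP terminates the method if it
    # does not return within the polling interval.
    echo "monitor: $app is healthy"
    return 0
}

# In the real monitor script: monitor_app myappd; exit $?
```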
Suspending, Changing, and Removing Application Monitors
You can temporarily suspend an application monitor in order to perform cluster maintenance. You should not change the application monitor configuration while it is in a suspended state.
If you have multiple application monitors configured, and choose to temporarily suspend an application monitor, all monitors configured for a specified server are suspended.
For instructions on temporarily suspending application monitoring, changing the configuration, or permanently deleting a monitor, see the section Changing or Removing Application Monitors in Managing the Cluster Resources.
Configuring Tape Drives as HACMP Resources
The HACMP SMIT panels enable the following actions for configuring tape drives:
- Add tape drives as HACMP resources
- Specify synchronous or asynchronous tape operations
- Specify appropriate error recovery procedures
- Change or show tape drive resources
- Remove tape drive resources
- Add tape drives to HACMP resource groups
- Remove tape drives from HACMP resource groups.

Adding a Tape Resource
To add a tape drive as a cluster resource:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Tape Resources > Add a Tape Resource and press Enter.
3. Enter the field values as follows:
Sample scripts are available in the /usr/es/sbin/cluster/samples/tape directory. The sample scripts rewind the tape drive explicitly. See Appendix A: Script Utilities in the Troubleshooting Guide for the syntax.
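Along the lines of those samples, a start script typically leaves the drive at a known position before the application uses it. The sketch below is illustrative only (the device name and the mt fallback are assumptions); see the samples in /usr/es/sbin/cluster/samples/tape for the supported syntax:

```shell
#!/bin/sh
# Illustrative tape start helper: rewind the drive explicitly, as the
# HACMP sample scripts do. tctl is the AIX tape control command; mt is
# a fallback for other systems (an assumption for this sketch).
rewind_tape() {
    dev="$1"
    if command -v tctl >/dev/null 2>&1; then
        tctl -f "$dev" rewind
    elif command -v mt >/dev/null 2>&1; then
        mt -f "$dev" rewind
    else
        echo "rewind_tape: no tape control command found" >&2
        return 2
    fi
}

# Example: rewind_tape /dev/rmt0   (hypothetical device name)
```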
Change or Show a Tape Resource
To change or show the current configuration of a tape drive resource, see Reconfiguring Tape Drive Resources in Chapter 14: Managing the Cluster Resources.
Adding a Tape Resource to a Resource Group
To add a tape drive resource to a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resources and Attributes for a Resource Group and press Enter.
3. Select the resource group to which you want to add the tape resource.
SMIT displays the Change/Show all Resources/Attributes for a <selected type of> Resource Group panel.
4. Enter the field value for Tape Resource.
Type in the resource name or press F4 to display a picklist of defined tape resources. Select the desired resource. If there are no tape resources defined, SMIT displays an error message.
Verifying and Synchronizing Tape Drive Configuration
After adding a resource to a resource group, verify that the configuration is correct and then synchronize shared tape resources to all nodes in the cluster.
Verification ensures the following:
- Validity of the specified tape special file (is it a tape drive?)
- Accessibility of the tape drive (does a device on the specified SCSI LUN exist?)
- Consistency of the configuration (does the device have the same LUN on the nodes sharing the tape drive?)
- Validity of the user-defined start and stop scripts (do the scripts exist and are they executable?)

Dynamic Reconfiguration of Tape Resources
When a tape drive is added to a resource group, or when a new resource group is created with tape resources, DARE will reserve the tape and invoke the user-provided tape start script.
When a tape drive is removed from a resource group, or when a resource group with tape resources is removed, DARE invokes the user-provided tape stop script and releases the tape drive.
Configuring AIX 5L Fast Connect
The AIX 5L Fast Connect for Windows application is integrated with HACMP, so you can configure it as a highly available resource in resource groups through the SMIT interface. This application does not need to be associated with application servers or special scripts.
Refer to your planning worksheets as you prepare to configure the application as a resource.
AIX 5L Fast Connect allows client PCs running Windows, DOS, and OS/2 operating systems to request files and print services from an AIX 5L server. Fast Connect supports the transport protocol NetBIOS over TCP/IP.
Prerequisites
Before you can configure Fast Connect resources in HACMP, make sure these steps have been taken:
- Install the Fast Connect Server on all nodes in the cluster.
- Make sure AIX 5L print queue names match on all nodes in the cluster if Fast Connect printshares are to be highly available.
- For non-concurrent resource groups, assign the same NetBIOS name to each node when the Fast Connect Server is configured. This action minimizes the steps needed for the client to connect to the server after fallover.
- For concurrently configured resource groups, assign different NetBIOS names across nodes.
- Configure on the Fast Connect Server those files and directories on the AIX 5L machine that you want shared.

Configuration Notes for Fast Connect
When configuring Fast Connect as a cluster resource in HACMP, keep the following points in mind:
- When starting cluster services, the Fast Connect server should be stopped on all nodes, so that HACMP can properly take over the starting and stopping of Fast Connect resources.
- In concurrent configurations, the Fast Connect server should have a second, non-concurrent resource group defined that does not have Fast Connect in it. Having a second resource group configured in a concurrent cluster keeps the AIX 5L filesystems used by Fast Connect cross-mountable and highly available in the event of a node failure.
- Fast Connect cannot be configured in a mutual takeover configuration. Make sure no node participates in more than one Fast Connect resource group at the same time.

For instructions on using SMIT to configure Fast Connect services as resources, see the section Adding Resources and Attributes to Resource Groups Using the Extended Path in Chapter 5: Configuring HACMP Resource Groups (Extended).
Verification of Fast Connect
After completing your resource configuration, you synchronize cluster resources. During this process, if Fast Connect resources are configured in HACMP, the verification ensures that:
- The Fast Connect server application exists on all participating nodes in a resource group.
- The Fast Connect fileshares are in filesystems that have been defined as resources on all nodes in the resource group.
- The Fast Connect resources are not configured in a mutual takeover form; that is, there are no nodes participating in more than one Fast Connect resource group.

Configuring Highly Available Communication Links
HACMP can provide high availability for three types of communication links:
- SNA configured over a LAN interface
- SNA over X.25
- Pure X.25.

Highly available SNA links can use either LAN interfaces (as in previous releases of HACMP) or X.25 links.
LAN interfaces are Ethernet, Token Ring, and FDDI interfaces; these interfaces are configured as part of the HACMP cluster topology.
X.25 interfaces are usually, though not always, used for WAN connections. They are used as a means of connecting dissimilar machines, from mainframes to dumb terminals. Because of the way X.25 networks are used, these interfaces are treated as a different class of devices that are not included in the cluster topology and not controlled by the standard HACMP topology management methods. This means that heartbeats are not used to monitor X.25 interface status, and you do not define X.25-specific networks in HACMP. Instead, an HACMP daemon, clcommlinkd, takes the place of heartbeats. This daemon monitors the output of the x25status command to make sure the link is still connected to the network.
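To illustrate the kind of check clcommlinkd performs, the function below decides whether a port still appears attached to the network from the text of a status command. This is a sketch only: the status output format and the CONNECTED keyword are assumptions for the example (real x25status output varies), and the actual daemon is supplied with HACMP:

```shell
#!/bin/sh
# Illustration of a clcommlinkd-style test: scan status-command output
# for the port and a connected state. The output format assumed here is
# hypothetical; real x25status output differs by product version.
port_connected() {
    port="$1"
    status_text="$2"
    echo "$status_text" | grep "$port" | grep -qw "CONNECTED"
}

# Example with hypothetical status text:
sample="sx25a0  CONNECTED
sx25a1  DOWN"
if port_connected sx25a0 "$sample"; then
    echo "sx25a0 still attached to the network"
fi
```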
Basic Steps for Configuring Highly Available Communication Links
Making a communication link highly available in HACMP involves these general steps:
1. Define the communication interfaces and links in AIX 5L.
2. Define the interfaces and links in HACMP.
3. Add the defined communication links as resources in HACMP resource groups.
These steps will be explained further in the sections covering each of the three communication link types.
Configuring SNA-Over-LAN Communication Links
Using the SMIT interface, you can configure an SNA link in AIX 5L and HACMP and add the link to a resource group.
Supported Software Versions
To configure highly available SNA links, you must have Communication Server for AIX 5L (CS/AIX) version 6.1 or higher.
Creating a Highly Available SNA-over-LAN Communication Link
To configure a highly available SNA-over-LAN communication link, you configure the link first in AIX 5L, then in HACMP, and finally you add the link to a cluster resource group; these procedures are described in the following sections.
Note that the AIX 5L configuration steps must be performed on each node; the HACMP steps can be performed on a single node and then the information will be copied to all cluster nodes during synchronization.
Configuring the SNA Link in AIX 5L
To define an SNA link in AIX 5L:
1. Enter smit hacmp
2. In SMIT, select System Management (C-SPOC) > HACMP Communication Interface Management > Configure Communication Interfaces/Devices to the Operating System on a Node and press Enter.
3. Select a node from the list.
4. Select the network interface type SNA Communication Links from the list.
If you have the Communication Server for AIX 5L (CS/AIX) version 6.1 or higher installed, this brings you to the main AIX 5L SMIT menu for SNA system configuration. Press F1 for help on entries required for configuring these links. Note that a valid SNA configuration must exist before an SNA link can be configured to HACMP and made highly available.
Configuring the SNA Link in HACMP
To configure a link in HACMP:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Communication Adapters and Links > Configure Highly Available Communication Links > Add Highly Available Communication Link > Add Highly Available SNA-over-LAN link and press Enter.
Enter field values as follows:
Name
    Enter the name by which you want the link to be known to HACMP throughout the cluster. This name must be unique among all highly available communication links, regardless of type, within the cluster. It can include alphanumeric characters and underscores, but cannot have a leading numeric. Maximum size is 32 characters.

DLC Name
    Specify the SNA Data Link Control (DLC) profile to be made highly available.

Port(s)
    Enter ASCII text strings for the names of any SNA ports to be started automatically.

Link Station(s)
    Enter ASCII text strings for the names of the SNA link stations.

Application Service File
    Enter the name of the file that this link should use to perform customized operations when this link is started or stopped. For more information on how to write an appropriate script, see Notes on Application Service Scripts for Communication Links.
3. After entering all values, press Enter to add this information to the HACMP Configuration Database.
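As a sketch of what an application service file might contain, the function below reacts to link start and stop events. The start/stop argument convention is an assumption for illustration; see Notes on Application Service Scripts for Communication Links for the actual calling interface:

```shell
#!/bin/sh
# Illustrative application service handler for a highly available
# communication link. The start/stop verbs are assumptions for this
# sketch; consult the HACMP documentation for the real interface.
link_service() {
    case "$1" in
        start)
            echo "link up: starting dependent application"
            # site-specific startup commands go here
            ;;
        stop)
            echo "link down: stopping dependent application"
            # site-specific shutdown commands go here
            ;;
        *)
            echo "usage: link_service {start|stop}" >&2
            return 1
            ;;
    esac
}

# In the deployed script: link_service "$1"; exit $?
```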
Important Notes
- It is possible to configure a DLC in HACMP when no connection actually exists. There is no guarantee that the port(s) or link station(s) are actually there. In other words, HACMP does not monitor the SNA DLCs, ports, and links directly for actual status; it monitors only the service interface over which SNA is running. Therefore, the SNA-over-LAN connection is only as highly available as the interface.
- The DLC follows the service label when the resource group falls over, so make sure you include a service label in the resource group that holds the SNA-over-LAN link.
- All SNA resources (DLC, ports, link stations) must be properly configured in AIX 5L when HACMP cluster services start up, in order for HACMP to start the SNA link and treat it as a highly available resource. If an SNA link is already running when HACMP starts, HACMP stops and restarts it.
- If multiple SNA links are defined in the operating system, make sure all of them are defined to HACMP. If not, HACMP will still stop them all at startup, but the links outside of HACMP will not be restarted.

Adding the SNA Link to a Resource Group
You may or may not have resource groups defined at this point. The process for creating resource groups and adding resources to them is covered in Chapter 5: Configuring HACMP Resource Groups (Extended).
Note: When a resource group has a list of Service IP labels and Highly Available Communication Links with configured SNA resources, the first Service IP label in the list of Service IP labels defined in the resource group will be used to configure SNA.
To complete the configuration of highly available SNA-over-LAN communication links, you must add them to a resource group.
To add the SNA Link to a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resources and Attributes for a Resource Group and press Enter. SMIT displays a list of resource groups.
3. Select a resource group.
4. Specify the links in the Communication Links field for the resource group.
5. Press Enter.
Changing or Removing an SNA Communication Link
To change or remove a highly available SNA-over-LAN communication link, see the relevant SMIT options on the Extended Configuration > Extended Resource Configuration menu.
SNA Communication Links as Highly Available Resources
HACMP protects SNA-over-LAN connections during interface and node failures. This section describes the HACMP actions that take place for each of these failures.
Network Interface Failure
When a service interface over which SNA is running fails, HACMP will take the following actions:
1. Stops the DLC, which stops the link stations and the ports
2. Modifies the DLC to use an available standby interface
3. Restarts the ports, the link stations, and the DLC.
Depending on the number of DLC ports and link stations, this process may take several minutes. The DLC and its associated ports and link stations may be unavailable during the time it takes to recover from an interface failure. Clients or applications connected before the failure may have to reconnect.
If no standby interface is available, the event is promoted to a local network failure event, causing the affected resource group to fallover to another node, if another node using the same network is available.
Node Failure
Once any communication link type is configured as a resource, it is treated like other cluster resources in the event of a node failure. When a node fails, the resource group is taken over in the normal fashion, and the link is restarted on the takeover node. Any identified resources of that link, such as link stations and ports, are restarted on the takeover node.
Network Failure
Network failures are handled as they are in a non-SNA environment. When a network failure occurs, HACMP detects an IP network_down event and logs an error message in the /tmp/hacmp.out file. Even though the SNA connection is independent of the IP network, it is assumed that an IP network failure event indicates that the SNA link is also down. The local network_down event causes the resource group containing the SNA link to fall over to another node, if another node using the same network is available.
Verification of SNA Communication Links
Verification ensures that:
- The specified DLCs, ports, and links exist and are correctly associated with each other
- The specified application service file exists and is readable and executable.

There is no checking for invalid SNA configuration information; it is assumed that the system administrator has properly configured and tested SNA.
Note: Verification will fail if the SNA server is not running when verification is run. If SNA is stopped on any node in the resource group at the time of verification, HACMP reports an error, even if the SNA DLC is properly configured.
Configuring X.25 Communication Links
An X.25 communication link can be made highly available when included as a resource in an HACMP resource group. It is then monitored by the clcommlinkd daemon and appropriate fallover actions are taken when an X.25 communication link fails.
Supported Adapters and Software Versions
The following adapters are supported for highly available X.25 communication links:
- IBM 2-Port Multiprotocol Adapter (DPMP)
- IBM Artic960Hx PCI Adapter (Artic)

To configure highly available X.25 links, you must have AIXLink/X.25 version 2 or higher.
Creating an X.25 Communication Link
These steps describe how to configure a highly available X.25 communication link. You must first configure the X.25 adapter in AIX 5L, then configure the adapter and the link in HACMP, and finally, add the link to an HACMP resource group.
Note that the AIX 5L configuration steps must be performed on each node; the HACMP steps can be performed on a single node and then the information will be copied to all cluster nodes during synchronization.
Warning: HACMP should handle the starting of your X.25 links. If HACMP attempts to start the link and finds that it already exists, link startup fails, because all ports must have unique names and addresses. For this reason, make sure the X.25 port is not defined to AIX 5L when the cluster starts.
It is highly recommended that you configure the adapters and drivers and test the X.25 link outside of HACMP before adding it to HACMP. However, if you do this, you must make sure to delete the X.25 port from AIX 5L before starting HACMP, so that HACMP can properly handle the link startup.
Configuring the X.25 Adapter in AIX 5L
To configure an X.25 communication link you first configure the adapter in AIX 5L as follows:
1. Enter smit hacmp
2. In SMIT, select System Management (C-SPOC) > HACMP Communication Interface Management > Configure Communication Interfaces/Devices to the Operating System on a Node and press Enter.
3. Select a node from the list.
4. Select the network interface type X.25 Communication Interfaces from the list.
If you have the Communication Server for AIX 5L (CS/AIX) version 6.1 or higher installed, this brings you to an AIX 5L menu listing adapter types.
5. Select an adapter and fill in the Adapter, Services, and User Applications fields.
Configuring the X.25 Adapter in HACMP
After you define the adapter in AIX 5L, you must configure the adapter and the link in HACMP as follows:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Communication Adapters and Links > Configure Communication Adapters for HACMP > Add Communication Adapter and press Enter.
3. Enter field values as follows:
4. Press Enter to add this information to the HACMP Configuration Database.
Configuring the X.25 Link in HACMP
After you configure the X.25 adapter in HACMP, you must configure the X.25 link in HACMP as follows:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Communication Adapters and Links > Configure Highly Available Communication Links > Add Highly Available Communication Link > Add Highly Available X.25 Link and press Enter.
3. Enter field values as follows:
Name
    Enter the name by which you want the link to be known to HACMP throughout the cluster. This name must be unique among all highly available communication links, regardless of type, within the cluster. It can include alphanumeric characters and underscores, but cannot have a leading numeric. The maximum size is 32 characters.

Port
    Enter the X.25 port designation you wish to use for this link, for example sx25a0. The port name must be unique across the cluster. This name must begin with "sx25a", but the final numeric portion is your choice. The port name can be up to 8 characters long; therefore, the final numeric portion can contain up to three digits.

Address/NUA
    Enter the X.25 address (local NUA) that will be used by this link.

Network ID
    Enter the X.25 network ID. The default value is 5, which will be used if this field is left blank.

Country Code
    Enter the X.25 country code. The system default will be used if this field is left blank.

Adapter Name(s)
    Press F4 and select from the picklist the communication adapters that you want this link to be able to use. Note that these are the HACMP names, not the device names.

Application Service File
    Enter the name of the file that this link should use to perform customized operations when this link is started and/or stopped. For more information on how to write an appropriate script, see Notes on Application Service Scripts for Communication Links.
4. Press Enter to add this information to the HACMP Configuration Database.
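The port-naming rule above (begins with "sx25a", one to three trailing digits, at most 8 characters total) can be expressed as a quick pattern check. This helper is an illustration, not an HACMP tool:

```shell
#!/bin/sh
# Validate an X.25 port name against the rule described above:
# "sx25a" followed by one to three digits (6 to 8 characters total).
valid_x25_port() {
    echo "$1" | grep -Eq '^sx25a[0-9]{1,3}$'
}

# Examples:
valid_x25_port sx25a0   && echo "sx25a0 is valid"
valid_x25_port sx25a999 && echo "sx25a999 is valid"
valid_x25_port tok0     || echo "tok0 is not a valid X.25 port name"
```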
Adding the X.25 Link to a Resource Group
You may not have any resource groups defined at this point. The process for creating resource groups and adding resources to them is covered in Chapter 5: Configuring HACMP Resource Groups (Extended).
To complete the configuration of highly available X.25 communication links, you add them to a resource group.
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resources and Attributes for a Resource Group and press Enter.
3. Select the resource group.
4. Specify the links in the Communication Links field for the resource group.
Changing or Removing an X.25 Communication Link
For information on changing or removing a highly available X.25 communication link, see Reconfiguring Communication Links in Chapter 14: Managing the Cluster Resources.
X.25 Communication Links as Highly Available Resources
X.25 connections are protected during network interface and node failures. This section describes the HACMP actions that take place for each of these failures.
Network Adapter Failure
The X.25 link is monitored by an HACMP daemon, clcommlinkd. The daemon monitors whether the port is connected to the network, by checking the output of x25status. It does not check for an actual connection between nodes. If network connectivity is lost, the link falls over to another adapter port on the same node, or—if another port is not available on the node—the affected resource group falls over to the node with the next highest priority.
Node Failure
Once any communication link type is configured as a resource, it is treated like other cluster resources in the event of a node failure. When a node fails, the resource group is taken over in the normal fashion, and the link is restarted on the takeover node. Any identified resources of that link, such as link stations and ports, are restarted on the takeover node.
Network Failure
Network failures are handled as they would be in a non-X.25 environment. When a network failure occurs, HACMP detects an IP network down and logs an error message in the /tmp/hacmp.out file. The local network failure event, network_down <node name> <network name>, causes the affected resource group to fall over to another node.
Verification of X.25 Communication Links
The HACMP cluster verification process ensures the following:
The specified adapters exist on the specified nodes.
There is at least one adapter defined for every node defined in the resource group (since all nodes must be able to acquire the link).
An adapter has not been specified for more than one link (this check occurs if the Multiple Links Allowed field is set to false).
The application service file exists and is readable and executable.
There is no checking for invalid X.25 configuration information; it is assumed that the system administrator has properly configured X.25.
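The application service file check can be mirrored with a local pre-check before you synchronize the cluster. This is a sketch only; the file path and the `check_service_file` helper are hypothetical, not HACMP utilities.

```shell
#!/bin/sh
# Pre-check mirroring one of the verification tests: confirm that an
# application service file exists and is readable and executable.

check_service_file() {
    [ -f "$1" ] || { echo "ERROR: $1 does not exist" >&2; return 1; }
    [ -r "$1" ] && [ -x "$1" ] || {
        echo "ERROR: $1 must be readable and executable" >&2; return 1; }
    echo "OK: $1"
}

# Demonstration with a throwaway file; in practice pass the real
# application service file path on each node.
f=$(mktemp) && chmod 755 "$f"
check_service_file "$f"
rm -f "$f"
```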
Configuring SNA-Over-X.25 Communication Links
Configuring communication links for SNA running over X.25 involves a combination of the steps for SNA-over-LAN and pure X.25.
Supported Adapters and Software Versions
The following adapters are supported for highly available X.25 communication links:
IBM Dual Port Multiprotocol Adapter (DPMP)
Artic960hx 4-Port Adapter (Artic)
To configure highly available SNA-over-X.25 links, you must have the following software:
Communication Server for AIX 5L (CS/AIX) version 6.1 or higher
AIXLink/X.25 version 2 or higher
Creating an SNA-over-X.25 Communication Link
These steps describe how to configure a highly available SNA-over-X.25 communication link.
Note that the AIX 5L configuration steps must be performed on each node; the HACMP steps can be performed on a single node and then the information will be copied to all cluster nodes during synchronization.
To create a highly available SNA-over-X.25 link, follow these steps:
1. Configure the SNA link in AIX 5L only (not in HACMP) on each node. For instructions, see Configuring the SNA Link in AIX 5L.
2. Configure the X.25 adapter in AIX 5L on each node. For instructions, see Configuring the X.25 Adapter in AIX 5L.
3. Configure the X.25 adapter in HACMP. For instructions, see Configuring the X.25 Adapter in HACMP.
Warning: HACMP should manage the startup of your X.25 links. If HACMP attempts to start the link and finds that it already exists, link startup fails, because all ports must have unique names and addresses. For this reason, make sure the X.25 port is not defined to AIX 5L when the cluster starts. Only the X.25 devices and drivers should be defined at cluster startup.
It is highly recommended that you configure the adapters and drivers and test the X.25 link outside of HACMP before adding it to HACMP. However, if you do this, you must make sure to delete the X.25 port from AIX 5L before starting HACMP, so that HACMP can properly handle the link startup.
In contrast, all SNA resources (DLC, ports, link stations) must be properly configured in AIX 5L when HACMP cluster services start up, in order for HACMP to start the SNA link and treat it as a highly available resource. If an SNA link is already running when HACMP starts, HACMP stops and restarts it. Also see the Important Notes in the SNA-over-LAN configuration section.
4. Configure the SNA-over-X.25 link in HACMP. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Communication Adapters and Links > Configure Highly Available Communication Links > Add Highly Available Communication Links > Add Highly Available SNA-over-X.25 Link and press Enter.
5. Enter field values as follows:
Name: Enter the name by which you want the link to be known to HACMP throughout the cluster. This name must be unique among all highly available communication links, regardless of type, within the cluster. It can include alphanumeric characters and underscores, but cannot have a leading numeric. The maximum length is 30 characters.
X.25 Port: Enter the X.25 port designation you wish to use for this link, for example: sx25a0. The port name must be unique across the cluster. This name must begin with “sx25a” but the final numeric character is your choice. The port name can be up to eight characters long; therefore the final numeric can contain up to three digits.
X.25 Address/NUA: Enter the X.25 Address/NUA that will be used by this link.
X.25 Network ID: Enter the X.25 Network ID. The default value is 5, which will be used if this field is left blank.
X.25 Country Code: Enter the X.25 Country Code. The system default will be used if this field is left blank.
X.25 Adapter Name(s): Enter the HACMP names (not the device names) of the communication adapters that you want this link to be able to use.
SNA DLC: Identify the SNA DLC profile to be made highly available. Press F4 to see a list of the DLC names.
SNA Port(s): Enter ASCII text strings for the names of any SNA ports to be started automatically.
SNA Link Station(s): Enter ASCII text strings for the names of the SNA link stations.
Application Service File: Enter the name of the file that this link should use to perform customized operations when this link is started and/or stopped. For more information on how to write an appropriate script, see Notes on Application Service Scripts for Communication Links.
6. Press Enter to add this information to the HACMP Configuration Database.
Adding the SNA-Over-X.25 Link to a Resource Group
You may or may not have resource groups defined at this point. The process for creating resource groups and adding resources to them is described in Chapter 5: Configuring HACMP Resource Groups (Extended).
To complete the configuration of a highly available SNA-over-X.25 communication link, you add it to a resource group.
1. In SMIT, select Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resources and Attributes for a Resource Group and press Enter. SMIT displays a list of resource groups.
2. Select a resource group.
3. Specify the link in the Communication Links field.
Changing or Removing an SNA-Over-X.25 Communication Link
To change or remove a highly available SNA-over-X.25 communication link, see Reconfiguring Communication Links in Chapter 14: Managing the Cluster Resources.
SNA-Over-X.25 Communication Links as Highly Available Resources
SNA-over-X.25 connections are protected during adapter and node failures. This section describes the HACMP actions that take place for each of these failures.
Adapter Failure
The X.25 link is monitored by an HACMP daemon, clcommlinkd. The daemon monitors whether the port is connected to the network, by checking the output of x25status. It does not check for an actual connection between nodes. When an X.25 link over which SNA is running fails, HACMP causes the link to fall over to another adapter port on the same node, or—if another port is not available on the node—the affected resource group falls over to the node with the next highest priority.
Node Failure
Once any communication link type is configured as a resource, it is treated like other cluster resources in the event of a node failure. When a node fails, the resource group is taken over in the normal fashion, and the link is restarted on the takeover node. Any identified resources of that link, such as link stations and ports, are restarted on the takeover node.
Network Failure
Network failures are handled as they are in a non-SNA/X.25 environment. When a network failure occurs, HACMP detects an IP network down and logs an error message in the /tmp/hacmp.out file. The local network failure event, network_down <node name> <network name>, causes the resource group containing the link to fall over to another node.
Verification of SNA-Over-X.25 Communication Links
Verification ensures the following:
The specified SNA DLCs, ports, and links exist and are correctly associated with each other.
The specified X.25 adapters exist on the specified nodes.
There is at least one adapter for every node defined in the resource group (since all nodes must be able to acquire the link).
An adapter has not been specified for more than one link (this check occurs if the Multiple Links Allowed field is set to false).
The application service file exists and is readable and executable.
There is no checking for invalid SNA or X.25 configuration information; it is assumed that the system administrator has properly configured SNA and X.25.
Note: Verification will fail if the SNA server is not running when verification is run. If SNA is stopped on any node in the resource group at the time of verification, HACMP reports an error (“The DLC <name> is not defined to CS/AIX on node <name>”), even if the SNA DLC is properly configured.
Notes on Application Service Scripts for Communication Links
When you define a communication link in HACMP, you specify the name of an application service file. The application service file may contain a script to perform customized operations when the link is started or stopped by HACMP. This script should contain a START line followed by startup instructions and a STOP line followed by stop instructions.
HACMP passes in a single parameter that identifies the type of link (“SNA” or “X25”) to start or stop. For SNA-over-LAN and pure X.25 links, this parameter does not need to appear in your script. However, for SNA-over-X.25, you must test for both “X25” and “SNA” so that the appropriate start and stop scripts run for both link types.
Here is an example application service file for an SNA-over-X.25 link:
START
if [[ "$1" = "X25" ]]; then
    /usr/local/scripts/my_x25link_app_start_script.sh &
elif [[ "$1" = "SNA" ]]; then
    /usr/local/scripts/my_snalink_app_start_script.sh &
fi
STOP
if [[ "$1" = "X25" ]]; then
    /usr/local/scripts/my_x25link_app_stop_script.sh &
elif [[ "$1" = "SNA" ]]; then
    /usr/local/scripts/my_snalink_app_stop_script.sh &
fi
When the application is to be started, all lines between the START and STOP tags are run. When the application is to be stopped, all lines after the STOP tag are run.
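HACMP's own handling of the START/STOP tags is internal, but when checking a service file by hand it can help to pull out each phase's commands yourself. A sketch, assuming POSIX sed and the tag layout described above; the file contents are a throwaway example.

```shell
#!/bin/sh
# Illustration of the application service file layout only. This sed
# extraction is a hand tool for eyeballing which commands belong to each
# phase; it is not how HACMP itself parses the file.

svcfile=$(mktemp)
cat > "$svcfile" <<'EOF'
START
echo "starting X25 side"
STOP
echo "stopping X25 side"
EOF

# Lines between the START and STOP tags (run at link startup):
sed -n '/^START$/,/^STOP$/{ /^START$/d; /^STOP$/d; p; }' "$svcfile"

# Lines after the STOP tag (run at link shutdown):
sed -n '/^STOP$/,${ /^STOP$/d; p; }' "$svcfile"

rm -f "$svcfile"
```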
For an SNA-over-LAN or an X.25 link, the file need only contain the individual start and stop instructions, as in this example (for SNA-over-LAN):
START
/usr/local/scripts/my_snalink_app_start_script.sh &
STOP
/usr/local/scripts/my_snalink_app_stop_script.sh &
Customizing Resource Recovery
HACMP monitors system resources and initiates recovery when a failure is detected. Recovery involves moving a set of resources (grouped together in a resource group) to another node. HACMP uses selective fallover function when it can. Selective fallover enables HACMP to recover only those resource groups that are affected by the failure of a specific resource.
HACMP uses selective fallover in the following cases:
Loss of a volume group
Local network failure
Resource group acquisition failure
Application failure
If you have configured resource groups with sites and an HACMP/XD disaster recovery solution, these are replicated resources. HACMP tracks and handles recovery of resource groups on both the primary and the secondary (backup) site. HACMP tries to recover the failure of a secondary instance as well as the primary instance.
You can customize recovery for two types of resources where HACMP uses selective fallover:
Service IP labels. By default, for a local network failure, HACMP responds by scanning the configuration for any service labels on that network and moving only the resource group containing the failed service IP label to another available node. Note: You cannot customize recovery for service IP labels for the secondary instance of a replicated resource group.
Volume groups. For volume groups where the recovery is triggered by an AIX 5L error notification from the LVM (caused by a loss of quorum for a volume group), HACMP moves the resource group to a takeover node. Note: Customizing volume group recovery (disabling selective fallover) in a cluster with this type of resource in a replicated resource group applies to both the primary and the secondary instances of the resource group.
However, selective fallover may not be the behavior you want when one of these resources fails. After upgrading from a previous release, if you have custom pre- and post-events to handle these situations, these may act in unexpected ways when combined with the selective fallover behavior. HACMP includes the Customize Resource Recovery option for changing the behavior of the selective fallover action for these resources. You can select to have the fallover occur, or to simply receive a notification.
Take the following steps to customize resource recovery for service label and volume group resources (especially if you have your own custom pre- and post-event scripts):
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Customize Resource Recovery and press Enter.
3. Select the resource to customize from the list.
4. Enter field values as follows:
5. Press Enter to apply the customized resource recovery action.
6. If you use a Notify Method, make sure it exists on all nodes in the resource group nodelist.
7. Verify and synchronize the cluster.
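A Notify Method is simply a script that HACMP runs when the notify action fires, so the copy registered in SMIT must be present, with execute permission, at the same path on every node. A minimal sketch follows; the path, log location, and message format are illustrative assumptions, not a documented interface.

```shell
#!/bin/sh
# Sketch of a Notify Method script. Install an identical copy on every
# node in the resource group nodelist (for example at
# /usr/local/scripts/notify_recovery.sh, chmod 755).

LOG=/tmp/resource_recovery_notify.log

# Record that the method fired; any arguments HACMP passes are logged as-is.
echo "$(date) notify fired on $(hostname) args: $*" >> "$LOG"

# Site-specific alerting (mail, SNMP trap, paging) would go here.
```

After inducing one of the failures described below, the method's output appears in hacmp.out, so keeping the script's own logging terse makes that log easier to read.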
Note on the Fallover Option and Resource Group Availability
Be aware that if you select the fallover option of customized resource recovery—which could cause a resource group to migrate from its original node—the possibility exists that while the highest priority node is up, the resource group remains down. This situation occurs when an rg_move event moves a resource group from its highest priority node to a lower priority node, and then you stop the cluster services on the lower priority node with an option to bring the resource groups offline. Unless you bring the resource group up manually, it remains in an inactive state.
For more information on resource group availability, see the section Selective Fallover for Handling Resource Groups in Appendix B: Resource Group Behavior during Cluster Events.
Testing Customized Resource Recovery
Once you have configured the options and synchronized your cluster successfully, you are ready to test that the new options provide the desired behavior.
Testing the Fallover Action on Resource Failure
This is the default behavior. When a resource failure occurs (local_network_down or volume group quorum loss), an rg_move event will be run for the affected resource group. You can test this behavior by inducing a local_network_down (fail all interfaces on that network on a single node) or by inducing the LVM_SA_QUORCLOSE error (power off a disk while writes are occurring such that quorum is lost for that volume group). For additional information, see the section Quorum Issues in Appendix A: 7x24 Maintenance.
You can also use the error emulation facility to create an LVM_SA_QUORCLOSE error log record. See the HACMP Event Emulation section in Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide.
Testing the Notify Action on Resource Failure
Induce the same failures mentioned above, selecting notify but no Notify Method. Instead of an rg_move event, a server_down event should run. Check the output in hacmp.out.
Testing the Notify Method
Configure a resource and resource group and specify the notify option for that resource, with a Notify Method. Induce one of the failures above to trigger the server_down event. The server_down event will call the Notify Method and any output from that method will be logged in hacmp.out.
Where You Go From Here
The next step is to configure the resource groups for the cluster, configure dependencies between resource groups, if needed, and add the resources to the resource groups. See Chapter 5: Configuring HACMP Resource Groups (Extended).