Chapter 5: Configuring HACMP Resource Groups (Extended)
This chapter describes all the options for configuring resource groups using the SMIT Extended Configuration path. It also contains sections for NFS considerations and for using the forced varyon option.
The main sections in this chapter include:
Note: Starting with HACMP 5.3, you can configure several types of dependencies between resource groups. See Configuring Dependencies between Resource Groups for information.
Overview
You may have already used the Standard Configuration panels to configure some resources and groups automatically. Use the Extended Configuration SMIT panels to add more resources and groups, to make changes, or to add more extensive customization.
The Extended Resources Configuration path includes three main menus that contain sub menus:
- HACMP Extended Resources Configuration. See Chapter 4: Configuring HACMP Cluster Topology and Resources (Extended) for this information. The SMIT panel for customizing cross-site recovery of resource groups is located here.
- Resource Group Runtime Policies Configuration:
  - Dependencies between Resource Groups
  - Workload Manager
  - Configure Resource Group Processing Order
  - Delayed Fallback Timer
  - Settling Time
  - Node Distribution Policy
- HACMP Extended Resource Group Configuration:
  - Add a Resource Group
  - Change/Show a Resource Group
  - Change/Show Resources and Attributes for a Resource Group
  - Remove a Resource Group
  - Show All Resources by Node or Resource Group
The chapter also includes sections on two specific configuration issues: NFS considerations and use of the forced varyon option.
Note: You can use either ASCII SMIT or WebSMIT to configure and manage the cluster. For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
Configuring Resource Groups
You can add resource groups with different startup, fallover and fallback policies. See the Concepts Guide for definitions and an overview of the different resource group policies. For information about planning new resource groups, see the Planning Guide. For information about migration of the pre-5.2 resource groups, see the Installation Guide.
This chapter explains how to configure resource groups with different combinations of inter-site resource group management with startup, fallover and fallback policies, and runtime policies.
We recommend that prior to configuring resource groups, you read the planning information. For more information, see Chapter 6: Planning Resource Groups in the Planning Guide.
Note: You can use either ASCII SMIT or WebSMIT to configure and manage the cluster and view interactive cluster status. Starting with HACMP 5.4, you can also use WebSMIT to navigate, configure, and view the status of the running cluster through graphical displays. For more information about WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
The Extended Configuration path enables you to specify parameters that precisely describe the resource group’s behavior at startup, fallover, and fallback, including delayed fallback timers (you cannot configure the fallback timers using the Initialization and Standard Configuration path).
This section describes the parameters that you can specify for resource groups, provides definitions and limitations for their use, and includes rules on how to set these parameters correctly.
Limitations and Prerequisites for Configuring Resource Groups
When configuring a resource group, the following conditions apply:
- Networks can be configured to use IPAT via IP Aliases or IPAT via IP Replacement. You can add both aliased and non-aliased service IP labels as resources to resource groups.
- By default, HACMP processes resource groups in parallel. You may include a resource group in a list of resource groups that are processed serially. However, if you do not include a resource group in a serially-processed list, but specify a settling time or a delayed fallback timer for a resource group, the acquisition of this resource group is delayed. For complete information, see the section Configuring Processing Order for Resource Groups.
- Clocks on all nodes must be synchronized for the settings for fallover and fallback of a resource group to work as expected.
- You can configure sites for all resource groups. In SMIT, select the inter-site management policy according to your requirements. For planning inter-site management policies, see the Planning Guide.
- To view information about resource groups and for troubleshooting purposes, use the clRGinfo command. Also, for troubleshooting purposes, you can use the Show All Resources by Node or Resource Group SMIT option.
Steps for Configuring Resource Groups in SMIT
Steps to configure a resource group:
1. Configure the Runtime Policies that you want to assign to your resource group:
   a) (Optional.) Configure a delayed fallback timer. See the section Defining Delayed Fallback Timers for instructions. After you have configured a delayed fallback policy, you can assign it to the resource group by specifying the appropriate fallback policy, and by adding this policy as an attribute to your resource group.
   b) (Optional.) Configure a settling time. See Configuring Resource Group Runtime Policies for instructions.
2. Define a startup policy for the resource group. Select the SMIT option Startup Policy. Instructions follow this list of steps.
3. Define a fallover policy for the resource group. Select the SMIT option Fallover Policy. Instructions follow this list of steps.
4. Define a fallback policy for the resource group. Select the SMIT option Fallback Policy. Instructions follow this list of steps.
5. After you have set the resource group’s startup, fallover, and fallback policies as necessary, add resources and attributes to the resource group. If you have configured a delayed fallback timer, you can include it as an attribute of the resource group. If you want to use one of the predefined dynamic node priority policies, include it. Also, if you have specified the settling time, it will be used for this resource group if the appropriate startup policy is specified. If you configured the node distribution policy for the cluster, it will be used for those resource groups that have the Online Using Node Distribution Policy startup policy specified. Note that during this step you can also define a dependency between resource groups, and a customized serial processing order of resource groups, if needed.
To configure a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Add a Resource Group and press Enter.
3. Enter the field values as follows:
Resource Group Name
The name of the resource group must be unique within the cluster and distinct from the volume group and service IP label; it should relate to the application it serves, as well as to any corresponding device, such as websphere_service_address. Use no more than a total of 32 alphanumeric characters and underscores. Do not use a leading numeric. Duplicate entries and reserved words are not allowed. See List of Reserved Words.
Inter-site Management Policy
The default is Ignore. Select one of these options:
- Ignore. If you select this option, the resource group will not have ONLINE SECONDARY instances. Use this option if you use cross-site LVM mirroring. You can also use it with HACMP/XD for Metro Mirror.
- Prefer Primary Site. The primary instance of the resource group is brought ONLINE on the primary site at startup, the secondary instance is started on the other site. The primary instance falls back when the primary site rejoins.
- Online on Either Site. During startup the primary instance of the resource group is brought ONLINE on the first node that meets the node policy criteria (either site). The secondary instance is started on the other site. The primary instance does not fall back when the original site rejoins.
- Online on Both Sites. During startup the resource group (node policy must be defined as Online on All Available Nodes) is brought ONLINE on both sites. There is no fallover or fallback.
Note that the resource group moves to another site only if no node or condition exists under which it can be brought or kept ONLINE on the site where it is currently located. The site that owns the active resource group is called the primary site.
Participating Nodes (Default Node Priority)
Enter the names of the nodes that can own or take over this resource group. Enter the node with the highest priority first, followed by the other nodes in priority order. Leave a space between node names. NOTE: If you defined sites for the cluster, this panel is divided into Participating Nodes from Primary Site and Participating Nodes from Secondary Site.
Startup Policy
Select a value from the list that defines the startup policy of the resource group:
- Online On Home Node Only. The resource group should be brought online only on its home (highest priority) node during the resource group startup. This requires the highest priority node to be available.
- Online On First Available Node. The resource group activates on the first participating node that becomes available. If you have configured the settling time for resource groups, it will only be used for this resource group if you use this startup policy option. For information on the settling time, see the section Configuring Resource Group Runtime Policies.
- Online Using Node Distribution Policy. The resource group is brought online according to the node-based distribution policy. This policy allows only one resource group to be brought online on a node during startup. Note: Rotating resource groups upgraded from HACMP 5.1 will have the node-based distribution policy. For more information, see Using the Node Distribution Startup Policy.
- Online On All Available Nodes. The resource group is brought online on all nodes. If you select this option for the resource group, ensure that resources in this group can be brought online on multiple nodes simultaneously.
Fallover Policy
Select a value from the list that defines the fallover policy of the resource group:
- Fallover To Next Priority Node In The List. In the case of fallover, the resource group that is online on only one node at a time follows the default node priority order specified in the resource group’s nodelist.
- Fallover Using Dynamic Node Priority. Select one of the predefined dynamic node priority policies. See Dynamic Node Priority Policies for more information.
- Bring Offline (On Error Node Only). Select this option to bring a resource group offline on a node during an error condition. This option is most suitable when you want to ensure that if a particular node fails, the resource group goes offline only on that node but remains online on other nodes. Selecting this option as the fallover preference when the startup preference is not Online On All Available Nodes may allow resources to become unavailable during error conditions. If you do so, HACMP issues an error.
Fallback Policy
Select a value from the list that defines the fallback policy of the resource group:
- Fallback To Higher Priority Node In The List. A resource group falls back when a higher priority node joins the cluster. If you select this option, you can use the delayed fallback timer that you previously specified in the Configure Resource Group Runtime Policies SMIT menu. If you do not configure a delayed fallback timer, the resource group falls back immediately when a higher priority node joins the cluster. See the section Defining Delayed Fallback Timers for instructions.
- Never Fallback. A resource group does not fall back when a higher priority node joins the cluster.
4. Press Enter to add the resource group information to the HACMP Configuration Database.
If, during the configuration of resource groups, you select an option that prevents high availability of a resource group, HACMP issues a warning message. Also, HACMP prevents invalid or incompatible resource group configurations.
5. Return to the Extended Resource configuration panel or exit SMIT.
Note: For additional information on resource group behavior in clusters with sites, see the Planning Guide, Chapter 6, Planning Resource Groups.
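Once the resource group has been added and synchronized, you can confirm its definition from the command line as well as from SMIT. The following is a hedged sketch only: the utility path assumes a default HACMP 5.x installation under /usr/es/sbin/cluster/utilities, and the exact output format varies by release.

    # List the names of all resource groups defined in the cluster
    /usr/es/sbin/cluster/utilities/cllsgrp

    # Show the resources configured per node and resource group
    # (roughly the command-line counterpart of the Show All Resources
    # by Node or Resource Group SMIT option)
    /usr/es/sbin/cluster/utilities/clshowres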
Dynamic Node Priority Policies
The default node priority policy is the order in the participating nodelist. However, you may want to have a takeover node selected dynamically, according to the value of a specific system property at the time of failure.
Dynamic node priority policies based on three RMC resource attributes are preconfigured. You can see these listed when you choose Dynamic Node Priority as the Fallover Policy for the resource group on the Add a Resource Group SMIT panel:
cl_highest_free_mem - select the node with the highest percentage of free memory
cl_highest_idle_cpu - select the node with the most available processor time
cl_lowest_disk_busy - select the node whose disks are least busy
(A hedged example of inspecting the corresponding RMC attributes follows the note below.)
Note: If you have defined a resource group over multiple sites (using the HACMP/XD software) and a dynamic node priority policy is configured for the group, you will receive this warning when verification runs:
"Warning: Dynamic Node Priority is configured in a resource group with nodes in more than one site. The priority calculation may fail due to slow communication, in which case the default node priority will be used instead."Configuring Resource Group Runtime Policies
Configuring Resource Group Runtime Policies
Resource Group runtime policies include:
- Dependencies between resource groups. See the section Configuring Dependencies between Resource Groups.
- Resource group processing order. See the section Configuring Processing Order for Resource Groups.
- Workload Manager. See the section Configuring Workload Manager.
- Settling Time for resource groups. See the section Configuring Resource Groups.
- Delayed Fallback Timer for resource groups. See the section Configuring Resource Groups.
- Node distribution policy. See the section Using the Node Distribution Startup Policy.
Configuring Dependencies between Resource Groups
You can set up more complex clusters by specifying dependencies between resource groups.
Business configurations that use multi-tiered applications can utilize parent/child dependent resource groups. For example, the back end database must be online before the application server. In this case, if the database goes down and is moved to a different node, the resource group containing the application server must be brought down and back up on any node in the cluster. For more information about examples of multi-tiered applications, see the Concepts Guide.
Business configurations that require different applications to run on the same node, or on different nodes can use location dependency runtime policies. See Examples of Location Dependency and Resource Group Behavior in Appendix B: Resource Group Behavior during Cluster Events for more information.
In releases prior to HACMP 5.2, support for resource group ordering and customized serial processing of resources accommodated cluster configurations where a dependency existed between applications residing in different resource groups. With customized serial processing, you can specify that a resource group is processed before another resource group, on a local node. However, it is not guaranteed that a resource group will be processed in the order specified, as this depends on other resource groups policies and conditions.
The dependencies that you configure are:
- Explicitly specified using the SMIT interface
- Established cluster-wide, not just on the local node
- Guaranteed to occur in the cluster, that is, they are not affected by the current cluster conditions.
You can configure four types of dependencies between resource groups:
- Parent/child dependency
- Online On Same Node Location Dependency
- Online On Different Nodes Location Dependency
- Online On Same Site Location Dependency.
See the Planning Guide for more details and examples of these types of dependencies.
Considerations for Dependencies between Resource Groups
This section lists additional considerations you may need to keep in mind when configuring resource group dependencies. These include interaction with sites, use of pre- and post-event scripts, and information about the clRGinfo command.
- If, prior to HACMP 5.2, you were using pre- and post-event scripts or other methods, such as resource group processing ordering, to establish dependencies between the applications that are supported by your cluster environment, then these methods may no longer be needed or could be significantly simplified. For more information, see the section Dependent Resource Groups and the Use of Pre- and Post-Event Scripts in the Planning Guide.
- To obtain more granular control over the resource group movements, use the clRGinfo -a command to view what resource groups are going to be moved during the current cluster event (see the hedged example after this list). Also, use the output in the hacmp.out file. For more information, see the section Using Resource Groups Information Commands in Monitoring an HACMP Cluster.
- Dependencies between resource groups offer a predictable and reliable way of building clusters with multi-tiered applications. However, node_up processing in clusters with dependencies could take more time than in clusters where the processing of resource groups upon node_up is done in parallel. A resource group that is dependent on other resource groups cannot be started until the others have been started first. The config_too_long warning timer for node_up should be set large enough to allow for this.
- During verification, HACMP verifies that your configuration is valid and that application monitoring is configured.
- You can configure resource group dependencies in HACMP/XD clusters that use replicated resources for disaster recovery. However, you cannot combine a non-concurrent startup policy with the concurrent (Online on Both Sites) inter-site management policy. You can combine a concurrent startup policy with a non-concurrent inter-site management policy.
The high-level steps required to specify resource group dependencies are described in the following sections.
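For example, a hedged sketch of the commands referred to above (the path assumes a default HACMP installation; output format varies by release):

    # Show the current state and location of every resource group
    /usr/es/sbin/cluster/utilities/clRGinfo

    # During event processing, show which resource groups the current
    # cluster event is going to move
    /usr/es/sbin/cluster/utilities/clRGinfo -a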
Steps to Configure Dependencies between Resource Groups
This section provides a high-level outline of the steps required to configure a dependency between resource groups:
1. For each application that is going to be included in dependent resource groups, configure application servers and application monitors. For more information, see Application Monitoring for Dependent Resource Groups.
2. Create resource groups and include application servers as resources. For instructions, see Configuring Resource Groups and Adding Resources and Attributes to Resource Groups Using the Extended Path.
3. Specify a dependency between resource groups. For instructions, see Configuring Resource Groups with Dependencies.
4. Use the SMIT Verify and Synchronize HACMP Configuration option to guarantee the desired configuration is feasible given the dependencies specified, and ensure that all nodes in the cluster have the same view of the configuration.
Application Monitoring for Dependent Resource Groups
To ensure that the applications in the dependent resource groups start successfully, we recommend that you configure multiple application monitors.
In general, we recommend that you configure a monitor that will check the running process for an application in the child resource group, and a monitor that will check the running process for an application in the parent resource group.
For a parent resource group, it is also advisable to configure a monitor in a startup monitoring mode to watch the application startup. This ensures that after the parent resource group is acquired, the child resource group(s) can also be acquired successfully.
For information on monitor modes that you can specify (long-running mode, startup monitoring mode, and both), see Monitor Modes in Chapter 4: Configuring HACMP Cluster Topology and Resources (Extended).
For instructions on configuring application monitoring, see Configuring Multiple Application Monitors in Chapter 4: Configuring HACMP Cluster Topology and Resources (Extended).
Configuring Resource Groups with Dependencies
You can configure four types of dependencies between resource groups:
- Parent/child dependency
- Online On Same Node Location Dependency
- Online On Different Nodes Location Dependency
- Online On Same Site Location Dependency.
See the Planning Guide for more details and examples of how to use these types of dependencies.
Limitations for Combinations of Location Dependencies
The following limitations apply to configurations that combine dependencies:
- Only one resource group can belong to a Same Node Dependency and a Different Node Dependency at the same time.
- If a resource group belongs to both a Same Node Dependency and a Different Node Dependency, all nodes in the Same Node Dependency set have the same Priority as the shared resource group.
- Only resource groups with the same Priority within a Different Node Dependency can participate in a Same Site Dependency.
Configuring a Parent/Child Dependency Between Resource Groups
In this type of dependency, the parent resource group must be online on any node in the cluster before a child (dependent) resource group can be activated on a node. These are the guidelines and limitations:
- A resource group can serve as both a parent and a child resource group, depending on which end of a given dependency link it is placed.
- You can specify three levels of dependencies for resource groups.
- You cannot specify circular dependencies between resource groups.
- A child resource group cannot be acquired on a node until its parent resource group is fully functional; if the parent is not functional, the child resource group goes into an ERROR state. If you notice that a resource group is in this state, you may need to troubleshoot which resources might need to be brought online manually to resolve the resource group dependency.
- When a resource group in a parent role falls over from one node to another, the resource groups that depend on it are stopped before the parent resource group falls over, and restarted again once the parent resource group is stable again.
- If a parent resource group is concurrent, the child resource group(s) that depend on it are stopped and restarted again. This allows the child resource group to update its knowledge about the nodes on which the parent resource group is currently online.
- For information on dynamic reconfiguration (DARE), see Reconfiguring Resources in Clusters with Dependent Resource Groups in Chapter 14: Managing the Cluster Resources.
To configure a parent/child dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Parent/Child Dependency >Add Parent/Child Dependency between Resource Groups and press Enter.
3. Fill in the fields as follows:
4. Press Enter and verify the cluster.
Configuring Online on the Same Node Dependency for Resource Groups
When you configure two or more resource groups to establish a location dependency between them, they belong to a set for that particular dependency. The following rules and restrictions apply to the Online On Same Node Dependency set of resource groups:
- All resource groups configured as part of a given Same Node Dependency set must have the same nodelist (the same nodes in the same order).
- All non-concurrent resource groups in the Same Node Dependency set must have the same Startup/Fallover/Fallback Policies. Online Using Node Distribution Policy is not allowed for Startup.
- If a Dynamic Node Priority Policy is chosen as the Fallover Policy, then all resource groups in the set must have the same policy.
- If one resource group in the set has a fallback timer, it applies to the set. All resource groups in the set must have the same setting for fallback timers.
- Both concurrent and non-concurrent resource groups are allowed.
- You can have more than one Same Node Dependency set in the cluster.
- All resource groups in the Same Node Dependency set that are active (ONLINE) are required to be ONLINE on the same node, even though some resource groups in the set may be OFFLINE or in the ERROR state.
- If one or more resource groups in the Same Node Dependency set fails, HACMP tries to place all resource groups in the set on the node that can host all resource groups that are currently ONLINE (the ones that are still active) plus one or more failed resource groups.
To configure an Online on Same Node dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Same Node Dependency >Add Online on Same Node Dependency between Resource Groups and press Enter.
3. Fill in the field as follows:
4. Press Enter.
5. Verify the configuration.
Configuring Online on Different Nodes Dependency for Resource Groups
When you configure two or more resource groups to establish a location dependency between them, they belong to a set for that particular dependency. The following rules and restrictions apply to the Online On Different Nodes Dependency set of resource groups:
- Only one Online On Different Nodes Dependency set is allowed per cluster.
- Each resource group in the set should have a different home node for startup.
- When you configure resource groups in the Online On Different Nodes Dependency set, you assign priorities to each resource group in case there is contention for a given node at any point in time. You can assign High, Intermediate, and Low priority. Higher priority resource groups take precedence over lower priority groups at startup, fallover, and fallback:
  - If a resource group with High Priority is ONLINE on a node, then no other resource group in the Different Nodes Dependency set can come ONLINE on that node.
  - If a resource group in this set is ONLINE on a node, but a resource group with a higher priority falls over or falls back to this node, the resource group with the higher priority will come ONLINE and the one with the lower priority will be taken OFFLINE and moved to another node if this is possible.
  - Resource groups with the same priority cannot come ONLINE (startup) on the same node. Priority of a resource group for a node within the same Priority Level is determined by alphabetical order of the groups.
  - Resource groups with the same priority do not cause one another to be moved from the node after a fallover or fallback.
  - If a parent/child dependency is specified, then the child cannot have a higher priority than its parent.
To configure an Online On Different Nodes dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Different Nodes Dependency > Add Online on Different Nodes Dependency between Resource Groups and press Enter. The following screen appears.
3. Fill in the fields as follows and then press Enter
4. Continue configuring runtime policies for other resource groups or verify the cluster.
Configuring Online on the Same Site Dependency for Resource Groups
When you configure two or more resource groups to establish a location dependency between them, they belong to a set for that particular dependency. The following rules and restrictions apply to the Online On Same Site Dependency set of resource groups:
- All resource groups in a Same Site Dependency set must have the same Inter-site Management Policy but may have different Startup/Fallover/Fallback Policies. If fallback timers are used, these must be identical for all resource groups in the set.
- All resource groups in the Same Site Dependency set must be configured so that the nodes that can own the resource groups are assigned to the same primary and secondary sites.
- The Online Using Node Distribution Policy startup policy is supported.
- Both concurrent and non-concurrent resource groups are allowed.
- You can have more than one Same Site Dependency set in the cluster.
- All resource groups in the Same Site Dependency set that are active (ONLINE) are required to be ONLINE on the same site, even though some resource groups in the set may be OFFLINE or in the ERROR state.
- If you add a resource group included in a Same Node Dependency set to a Same Site Dependency set, then you must add all the other resource groups in the Same Node Dependency set to the Same Site Dependency set.
To configure an Online On Same Site dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Same Site Dependency >Add Online on Same Site Dependency between Resource Groups and press Enter.
3. Fill in the field as follows and press Enter:
4. Verify the cluster.
Configuring Processing Order for Resource Groups
This section describes how to set up the order in which HACMP acquires and releases resource groups.
By default, HACMP acquires and releases resource groups in parallel. If you upgraded your cluster from previous releases, see the Planning Guide for more information on which processing order is used for various resource groups in this case.
Resource group acquisition occurs in the following order:
1. Those resource groups for which the customized order is specified are acquired in the customized serial order.
2. If some of the resource groups in the cluster have dependencies between them, these resource groups are acquired in phases. Parent resource groups are acquired before the child resource groups and resource group location dependencies are taken into account.
3. Resource groups in which only NFS filesystems must be mounted are processed in the specified order.
4. Resource groups which are not included in the customized ordering lists are acquired in parallel.
Resource group release occurs in the following order:
1. Those resource groups for which no customized order has been specified are released in parallel.
2. HACMP releases resource groups that are included in the customized release ordering list.
3. If some of the resource groups in the cluster have dependencies between them, these resource groups are released in phases. Child resource groups are released before the parent resource groups are released, for example.
4. Resource groups that must unmount NFS filesystems are processed in the specified order.
Resource Groups Processing Order and Timers
HACMP acquires resource groups in parallel, but if the settling time or the delayed fallback timer policy is configured for a particular resource group, HACMP delays its acquisition for the duration specified in the timer policy.
Settling and delayed fallback timers do not affect the release process.
Prerequisites and Notes
The following sections detail limitations of the resource group ordering.
Serial Processing Notes
When you configure individual resource groups that depend on other resource groups, you can customize the serial processing order, which dictates the order of processing on the local node. If you specify dependencies between resource groups, the order in which HACMP processes resource groups cluster-wide is dictated by the dependency.
- Specify the same customized serial processing order on all nodes in the cluster. To do this, you specify the order on one node and synchronize cluster resources to propagate the change to the other nodes in the cluster.
- Since resource group dependencies override any serial processing order, make sure that the serial order you specify does not contradict the dependencies. If it does, it will be ignored.
- If you have specified a serial processing order for resource groups, and if in some of the resource groups only the NFS cross-mounting takes place during the acquisition (node_up event) or release (node_down event), then HACMP automatically processes these resource groups after other resource groups in the list.
- If you remove a resource group that has been included in the customized serial ordering list from the cluster, the name of that resource group is automatically removed from the processing order list. If you change the name of a resource group, the list is updated appropriately.
Parallel Processing Notes
In clusters where some groups have dependencies defined, these resource groups are processed in parallel using event phasing. For information on the order of processing in clusters with dependent resource groups, see the Job Types: Processing in Clusters with Dependent Resource Groups section in Chapter 2: Using Cluster Log Files in the Troubleshooting Guide.
If, prior to migrating to HACMP 5.3 and up (where parallel processing is the default), you had pre- and post-event scripts configured for specific cluster events, you may need to change them as they may no longer work as expected. See if you can reconfigure the resource groups to take advantage of the new dependency options. If you want to continue using these scripts for these resource groups, make sure you add each of the resource groups to the serial processing lists in SMIT. For more information, see the section Resource Groups Processed in Parallel and the Use of Pre- and Post-Event Scripts in Chapter 7: Planning for Cluster Events, in the Planning Guide.
Error Handling
If an error occurs during the acquisition of a resource group, recovery procedures are run after the processing of all other resource groups is complete. See the section Updated Cluster Event Processing in Chapter 7: Planning for Cluster Events, in the Planning Guide.
If an error occurs during the release of a resource group, the resource group goes offline temporarily while HACMP tries to recover it. If it moves to the ERROR state, you should take care of it manually.
Steps for Changing Resource Group Processing Order
To view or change the current resource group processing order in SMIT:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Configure Resource Group Run-time Policies > Configure Resource Group Processing Ordering and press Enter.
3. Enter field values as follows:
4. Press Enter to accept the changes. HACMP checks that a resource group name is entered only once on a list and that all specified resource groups are configured in the cluster. Then it stores the changes in the HACMP Configuration Database.
5. Synchronize the cluster in order for the changes to take effect across the cluster.
You can determine whether or not the resource groups are being processed in the expected order based on the content of the event summaries. For more information, see the section Tracking Parallel and Serial Processing of Resource Groups in the hacmp.out File in Chapter 2: Using Cluster Log Files in the Troubleshooting Guide.
Configuring Workload Manager
IBM offers AIX 5L Workload Manager (WLM) as a system administration resource included with AIX 5L. WLM allows users to set targets for and limits on CPU time, physical memory usage, and disk I/O bandwidth for different processes and applications; this provides better control over the usage of critical system resources at peak loads. HACMP allows you to configure WLM classes in HACMP resource groups so that the starting, stopping, and active configuration of WLM can be under cluster control.
For complete information on how to set up and use Workload Manager, see the AIX 5L Workload Manager (WLM) Redbook at the URL:
Steps for Configuring WLM in HACMP
Configuring WLM classes in HACMP involves these basic steps:
1. Configure WLM classes and rules, using the appropriate AIX 5L SMIT panels, as described below.
2. If you select a configuration other than the default (“HACMP_WLM_config”), specify the WLM configuration to be used in HACMP, as described below.
3. Assign the classes for this configuration to a resource group, selecting from a picklist of the classes associated with the default WLM configuration or the configuration you specified in Step 2. For instructions on adding resources to resource groups, see Adding Resources and Attributes to Resource Groups Using the Extended Path in this chapter.
4. After adding the WLM classes to the resource group—or after all resource group configuration is complete—verify and synchronize the configuration.
Note: Once WLM is configured in HACMP, HACMP starts and stops WLM. If WLM is already running when HACMP is started, HACMP restarts it with a new configuration file. Therefore, only the WLM rules associated with classes in a resource group that can be acquired on a given node will be active on that node. Once HACMP is stopped, WLM will be switched back to the configuration it was using when it was started.
Creating a New Workload Manager Configuration
To set up WLM classes and rules, use the AIX 5L SMIT panels.
1. In AIX 5L SMIT, select Performance & Resource Scheduling > Workload Management > Work on alternate configurations > Create a configuration. (You can also get to the “alternate configurations” panel by typing smitty wlm.)
2. Enter the new name for the configuration in the New configuration name field. It is recommended to use the default name that HACMP supplies: HACMP_WLM_config.
3. Define classes and rules for the HACMP configuration.
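For orientation, a configuration named HACMP_WLM_config is stored as flat files under /etc/wlm/HACMP_WLM_config. The sketch below is illustrative only: the class name db_class, the tier value, and the application path are hypothetical, and the exact stanza and rules-file column syntax should be confirmed against the AIX WLM documentation.

    # /etc/wlm/HACMP_WLM_config/classes  (one stanza per class)
    db_class:
            description = "Database workload placed under HACMP control"
            tier = 0

    # /etc/wlm/HACMP_WLM_config/rules  (columns: class, reserved, user, group, application)
    * class     resvd  user  group  application
    db_class    -      -     -      /usr/local/db/bin/dbserver*
    Default     -      -     -      -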
Defining a Non-Default Workload Manager Configuration in HACMP
You may have a non-default Workload Manager configuration. In this case, make this configuration known to HACMP, so that it is managed.
To ensure that a non-default Workload Manager Configuration is managed by HACMP:
1. Change the WLM runtime parameters to specify the HACMP configuration.
2. From the main HACMP SMIT panel, select Extended Configuration > Extended Resource Configuration > Configure Resource Group Run-time Policies > Configure HACMP Workload Manager Parameters.
This field indicates the WLM configuration to be managed by HACMP. By default, the configuration name is set to HACMP_WLM_config.
3. Specify a different configuration name if needed.
Verification of the Workload Manager Configuration
After adding WLM classes to resource groups, or after you have finished configuring all your resource groups, verify that the configuration is correct. The verification step is included in the synchronization process, as described in Synchronizing Cluster Resources later in this chapter.
Verification checks for the following conditions:
- For each resource group with which a WLM class is associated, an application server is associated with this resource group. It is not required that an application server exists in the resource group, but it is expected. HACMP issues a warning if no application server is found.
- Each WLM class defined to an HACMP resource group exists in the specified HACMP WLM configuration directory.
- A non-concurrent resource group (that does not have the Online Using Node Distribution Policy startup policy) does not contain a secondary WLM class without a primary class.
- A resource group with the startup policy Online on All Available Nodes has only a primary WLM class.
- A resource group with the startup policy Online Using Node Distribution Policy has only a primary WLM class.
Note: The verification utility cannot check class assignment rules to verify that the correct assignment will take place, since HACMP has no way of determining the eventual gid, uid and pathname of the user application. The user is entirely responsible for assigning user applications to the WLM classes when configuring WLM class assignment rules.
Cluster verification looks only for obvious problems and cannot verify all aspects of your WLM configuration; for proper integration of WLM with HACMP, you should take the time to plan your WLM configuration carefully in advance.
Reconfiguration, Startup, and Shutdown of WLM by HACMP
This section describes the way WLM is reconfigured or started or stopped once you have placed WLM under the control of HACMP.
Workload Manager Reconfiguration
When WLM classes are added to an HACMP resource group, then at the time of cluster synchronization on the node, HACMP reconfigures WLM so that it will use the rules required by the classes associated with the node. In the event of dynamic resource reconfiguration on the node, WLM will be reconfigured in accordance with any changes made to WLM classes associated with a resource group.
Workload Manager Startup
WLM startup occurs either when the node joins the cluster or when a dynamic reconfiguration of the WLM configuration takes place.
The configuration is node-specific and depends upon the resource groups in which the node participates. If the node cannot acquire any resource groups associated with WLM classes, WLM will not be started.
For a non-concurrent resource group with the startup policy other than Online Using Node Distribution Policy, the startup script will determine whether the resource group is running on a primary or on a secondary node and will add the corresponding WLM class assignment rules to the WLM configuration.
For each concurrent access resource group, and for each non-concurrent resource group with the startup policy Online Using Node Distribution Policy that the node can acquire, the primary WLM class associated with the resource group will be placed in the WLM configuration; the corresponding rules will be put into the rules table.
Finally, if WLM is currently running and was not started by HACMP, the startup script restarts WLM from the user-specified configuration, saving the prior configuration. When HACMP is stopped, it returns WLM back to its prior configuration.
Failure to start up WLM generates an error message logged in the hacmp.out log file, but node startup and/or the resource reconfiguration will proceed normally.
Workload Manager Shutdown
WLM shutdown occurs either when the node leaves the cluster or on dynamic cluster reconfiguration. If WLM is currently running, the shutdown script checks whether WLM was running prior to being started by HACMP and which configuration it was using. It then either does nothing (if WLM is not currently running), stops WLM (if it was not running prior to HACMP startup), or stops it and restarts it in the previous configuration (if WLM was running before HACMP started it).
Configuring Resources in a Resource Group
Once you have defined a resource group, you assign resources to it. If a node is powered off, SMIT cannot list the shared resources available to that node, which makes configuration errors more likely.
Configuring a Settling Time for Resource Groups
The settling time specifies how long HACMP waits for a higher priority node to join the cluster before activating a resource group that is currently offline. If you set the settling time, HACMP waits for the duration of the settling time interval to see if a higher priority node may join the cluster, rather than simply activating the resource group on the first possible node that reintegrates into the cluster.
To configure a settling time for resource groups, do the following:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Settling Time for Resource Group and press Enter.
3. Enter field values as follows:
4. Press Enter to commit the changes and synchronize the cluster. This settling time is assigned to all resource groups with the Online on First Available Node startup policy.
You can change, show or delete a previously configured settling time using the same SMIT path as described for configuring a settling time.
For an example of the event summary showing the settling time, see the Event Summary for the Settling Time section in Chapter 2: Using Cluster Log Files in the Troubleshooting Guide.
Defining Delayed Fallback Timers
A delayed fallback timer lets a resource group fall back to its higher priority node at a specified time. This lets you plan for outages for maintenance associated with this resource group.
You can specify a recurring time at which a resource group will be scheduled to fall back, or a specific time and date when you want to schedule a fallback to occur.
You can specify the following types of delayed fallback timers for a resource group:
- Daily
- Weekly
- Monthly
- Yearly
- On a specific date.
Note: It is assumed that the delayed timer is configured so that the fallback time is valid. If the configured time occurs in the past or is not valid, you receive a warning and the delayed fallback policy is ignored. If you use a specific date, the fallback attempt is made only once, at the specified time.
To make a resource group use a delayed fallback policy, follow these steps:
1. Configure a delayed fallback timer that you want to use. After you have configured the delayed fallback timer, you can use it in one or several resource groups as the default fallback policy. For instructions, see the following section.
2. Select the Fallback to Higher Priority Node option from the picklist of fallback policies for your resource group. You can do so when configuring a resource group. For instructions, see the section Steps for Configuring Resource Groups in SMIT.
3. Assign a fallback timer to a resource group, by adding it as an attribute to the resource group. If the delayed fallback timer entry does not show up in the list of attributes/resources that you can add to a resource group, this indicates that you did not follow the instructions in steps 1 and 2, because HACMP only displays attributes and resources that are valid in each particular case. For instructions, see the section Assigning a Delayed Fallback Policy to a Resource Group.
Configuring Delayed Fallback Timers in SMIT
To configure a delayed fallback timer, follow these steps:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Delayed Fallback Timer Policies > Add a Delayed Fallback Timer Policy and press Enter.
A picklist Recurrence for Fallback Timer displays. It lists Daily, Weekly, Monthly, Yearly and Specific Date policies.
3. Select the timer policy from the picklist and press Enter. Depending on which option you select, a corresponding SMIT panel displays that lets you configure this type of a fallback policy.
Assigning a Delayed Fallback Policy to a Resource Group
You must define the delayed fallback policies before you can assign them as attributes to resource groups.
To assign a delayed fallback policy to a resource group:
1. In HACMP SMIT, create a resource group, or select an existing resource group.
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resource and Attributes for a Resource Group and press Enter. SMIT displays a list of resource groups.
3. Select the resource group for which you want to assign a delayed fallback policy. The following panel appears. (The SMIT panel is abbreviated below. All valid options for the resource group are displayed based on the startup, fallover and fallback preferences that you have specified for the resource group.)
4. Enter field values as follows:
5. Press the F4 key to see the picklist in the Fallback Timer Policy field and select the fallback timer policy you want to use for this resource group.
6. Press Enter to commit the changes. The configuration is checked before populating the HACMP Configuration Database. You can assign the same fallback timer policy to other resource groups.
7. Assign fallback timer policies to other resource groups and synchronize the cluster when you are done.
Using the Node Distribution Startup Policy
For each resource group in the cluster, you can specify a startup policy to be Online Using Node Distribution Policy. The only distribution policy supported in HACMP 5.3 and up is node-based distribution. You can use this policy whether or not you have sites configured in the cluster.
Note: If you upgrade to HACMP 5.3 or 5.4 from a previous release that allowed network-based distribution, that configuration is automatically changed to node-based distribution.
This distribution policy is a cluster-wide attribute that causes the resource groups to distribute themselves in a way that only one resource group is acquired on a node during startup. Using this policy ensures that you distribute your CPU-intensive applications on different nodes.
If two or more resource groups are offline at the time when a particular node joins, the node acquires the resource group that has the least number of nodes in its nodelist. After considering the number of nodes, HACMP sorts the list of resource groups alphabetically.
Note: If one of the resource groups is a parent resource group (has a dependent resource group), HACMP gives preference to the parent resource group.
Prerequisites and Limitations for the Node Distribution Startup Policy
When configuring the node distribution startup policy, take into consideration the following:
- If the number of resource groups is larger than the number of cluster nodes, HACMP issues a warning. It is recommended that all resource groups that use node-based distribution have potential nodes on which they could be brought online during the cluster startup.
- Resource groups configured for distribution during startup cannot have the fallover policy set to Bring Offline (On Error Node Only). If you select this combination of policies, HACMP issues an error.
- Resource groups configured for distribution during startup must use the Never Fallback policy. This is the only fallback policy HACMP allows for such resource groups.
- If you configure multiple resource groups to use the Online Using Node Distribution startup policy, and you select the Prefer Primary Site inter-site management policy for all groups, the node-based distribution policy ensures that the primary site hosts one group per node. Whether the resource group will fall back to the primary site depends on the availability of nodes on that site.
- HACMP allows only valid startup, fallover and fallback policy combinations and prevents you from configuring invalid combinations.
Adding Resources and Attributes to Resource Groups Using the Extended Path
Keep the following in mind as you prepare to define the resources in your resource group:
- If you are configuring a resource group, first configure the optional timers and the startup, fallover, and fallback policies for the resource group, and then add specific resources to it. For information on configuring resource groups, see Configuring Resource Groups.
- You cannot change a resource group’s policies once it contains resources. If you have added resources, you need to remove them prior to changing the resource group’s policies.
- If you configure a non-concurrent resource group (with the Online on Home Node startup policy) with an NFS mount point, you must also configure the resource to use IP Address Takeover. If you do not do this, takeover results are unpredictable. You should also set the field value Filesystems Mounted Before IP Configured to true so that the takeover process proceeds correctly.
- A resource group may include multiple service IP addresses. When a resource group configured with IPAT via IP Aliasing is moved, all service labels in the resource group are moved as aliases to the available interfaces, according to the resource group management policies in HACMP. For more information on how HACMP handles resource groups configured with IPAT via IP Aliasing, see Appendix B: Resource Group Behavior during Cluster Events.
- When setting up a non-concurrent resource group with the startup policy of either Online on Home Node Only or Online on First Available Node, and with an IPAT via IP Address Replacement configuration, each cluster node should be configured in no more than (N + 1) resource groups on a particular network. Here, N is the number of backup (standby) interfaces on a particular node and network. IPAT functionality is not applicable to concurrent resource groups.
- If you configure application monitoring, remember that HACMP can monitor only one application in a given resource group, so you should put applications you intend to have HACMP monitor in separate resource groups.
- If you plan to request HACMP to use a forced varyon option to activate volume groups in case a normal varyon operation fails due to a loss of quorum, the logical volumes should be mirrored. It is recommended to use the super strict disk allocation policy for the logical volumes in AIX 5L. A hedged example of the related AIX commands follows this list.
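The following is a hedged sketch of the related AIX 5L LVM commands, assuming a hypothetical volume group datavg and logical volume datalv; HACMP performs the forced varyon itself when the option is set, so the last command only illustrates what that option does.

    # Create a mirrored logical volume (two copies) with the super strict
    # allocation policy, so that the mirror copies never share a physical volume
    mklv -y datalv -c 2 -s s datavg 100

    # Or change an existing logical volume to the super strict policy
    chlv -s s datalv

    # A forced varyon activates the volume group even when quorum is lost
    varyonvg -f datavg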
Steps for Adding Resources and Attributes to Resource Groups (Extended Path)
To configure resources and attributes for a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Extended Resource Group Configuration > Change/Show Resources and Attributes for a Resource Group and press Enter.
3. Select the resource group you want to configure and press Enter. SMIT returns the panel that matches the type of resource group you selected, with the Resource Group Name, Inter-site Management Policy, and Participating Node Names (Default Node Priority) fields filled in.
SMIT displays only valid choices for resources, depending on the resource group startup, fallover, and fallback policies that you selected.
Note: Be aware that once you add resources to a resource group, its startup, fallover, and fallback policies cannot be changed unless you remove the resources. You can only change the resource group policies in a resource group that does not contain any resources yet. Plan your resource group’s policies in advance, before adding resources to it.
If the participating nodes are powered on, press F4 to list the shared resources. If a resource group/node relationship has not been defined, or if a node is not powered on, the picklist displays the appropriate warnings.
4. Enter the field values as follows (non-concurrent resource group is shown):
5. Press Enter to add the values to the HACMP Configuration Database.
6. Return to the top of the Extended Configuration menu and synchronize the cluster.
Customizing Inter-Site Resource Group Recovery
When you install HACMP 5.3 or 5.4 and configure a new cluster with sites and an HACMP/XD product, selective fallover of resources included in replicated resource groups is enabled by default. If necessary for recovery, HACMP moves the resource group containing the resource to the other site.
If you have migrated from a previous release, the pre-5.3 release behavior is the default. In releases prior to 5.3, a particular instance of a resource group can fall over within one site, but cannot move between sites. If no nodes are available on the site where the affected instance resides, that instance goes into ERROR or ERROR_SECONDARY state. It does not stay on the node where it failed. This behavior applies to both primary and secondary instances.
Note that in HACMP 5.3 and 5.4, as in prior releases, even if the Cluster Manager cannot initiate a selective fallover inter-site rg_move (if this recovery is disabled), it will still move the resource group if a node_down or node_up event occurs, and you can manually move the resource group across sites.
Enabling or Disabling Selective Fallover between Sites
You can change the resource group recovery policy to allow or disallow the Cluster Manager to move a resource group to another site in cases where it can use selective fallover to avoid having the resource group go into ERROR state.
Inter-Site Recovery of Both Instances of Replicated Resource Groups
If selective fallover across sites is enabled, HACMP tries to recover both the primary and the secondary instance of a resource group:
- If an acquisition failure occurs while the secondary instance of a resource group is being acquired, the Cluster Manager tries to recover the resource group's secondary instance, as it does for the primary instance. If no nodes are available for the acquisition, the resource group's secondary instance goes into global ERROR_SECONDARY state.
- If quorum loss is triggered and the resource group has its secondary instance online on the affected node, HACMP tries to recover the secondary instance on another available node.
- If a local_network_down occurs on an XD_data network, HACMP moves replicated resource groups that are ONLINE on the particular node and that have GLVM or HAGEO resources to another available node on that site. This functionality of the primary instance is mirrored to the secondary instance so that secondary instances may be recovered via selective fallover. (For more information on XD_data networks and GLVM, see the HACMP/XD for GLVM: Planning and Administration Guide.)
Using SMIT to Enable or Disable Inter-Site Selective Fallover
To enable or disable the Resource Group Recovery with Selective Fallover behavior:
1. In SMIT, select Extended Configuration > Extended Resource Configuration > Extended Resource Configuration > Customize Resource Group and Resource Recovery > Customize Inter-site Resource Group Recovery and press Enter.
A selector screen lists the resource groups that contain nodes from more than one site, including those with a site management policy of Ignore. (Resource groups with an Ignore policy are not affected by this function, even if you select one of them.)
2. Select the resource groups for recovery customization.
The next screen lists the selected resource groups and includes the field to enable or disable inter-site selective fallover.
3. To enable inter-site selective fallover (initiated by the Cluster Manager), select true. The default is false for a cluster migrated from a previous release, and true for a new HACMP 5.3 or 5.4 cluster.
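To see how inter-site recovery behaves for a replicated resource group, clRGinfo shows both the primary and the secondary instance of each group. The output sketched below is illustrative only; the group and node names are hypothetical and the exact format varies by release:

    # Display both instances of each replicated resource group
    /usr/es/sbin/cluster/utilities/clRGinfo

    # Illustrative output (hypothetical names; format varies by release):
    #   Group Name    State               Node
    #   rg_db01       ONLINE              nodeA    <- primary instance, site 1
    #   rg_db01       ONLINE SECONDARY    nodeC    <- secondary instance, site 2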
Reliable NFS Function
You can configure NFS in all non-concurrent resource groups. See the chapter on Planning Shared LVM Components in the Planning Guide for information on planning prerequisites for configuring NFS as resources in resource groups.
As you configure resources, you can specify the following NFS-related items:
Use the Reliable NFS server capability that preserves locks and dupcache (two-node clusters only).
Specify a network for NFS mounting.
Define NFS exports and mounts at the directory level.
Specify export options for NFS-exported directories and filesystems.
Relinquishing Control over NFS Filesystems in an HACMP Cluster
Once you configure resource groups that contain NFS filesystems, you relinquish control over NFS filesystems to HACMP.
Once NFS filesystems become part of resource groups that belong to an active HACMP cluster, HACMP takes care of cross-mounting and unmounting the filesystems, during cluster events, such as fallover of a resource group containing the filesystem to another node in the cluster.
If for some reason you stop cluster services and must manage the NFS filesystems manually, unmount the filesystems before you restart cluster services. This allows HACMP to resume managing the NFS filesystems once the nodes rejoin the cluster.
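For example, a minimal manual cleanup before restarting cluster services might look like the following; the mount point /mnt/shared_app is hypothetical:

    # List NFS filesystems currently mounted on this node
    mount | grep nfs

    # Unmount a cross-mounted filesystem before restarting cluster services
    # (/mnt/shared_app is a hypothetical mount point)
    umount /mnt/shared_app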
NFS Exporting Filesystems and Directories
The process of NFS-exporting filesystems and directories in HACMP differs from that in AIX 5L. The sections in Chapter 6: Planning Shared LVM Components in the Planning Guide explain the NFS-exporting process in HACMP. The following sections provide additional information.
Specifying Filesystems and Directories to NFS Export
In AIX 5L, you list filesystems and directories to be NFS-exported in the /etc/exports file; in HACMP, you must put these in a resource group.
You can configure NFS in all non-concurrent resource groups. See Chapter 6: Planning Shared LVM Components in the Planning Guide for information on planning prerequisites for configuring NFS as resources in resource groups.
Specifying Export Options for NFS Exported Filesystems and Directories
If you want to specify special options for NFS-exporting in HACMP, you can create a /usr/es/sbin/cluster/etc/exports file. This file has the same format as the regular /etc/exports file used in AIX 5L.
Use of this alternate exports file is optional. HACMP checks the /usr/es/sbin/cluster/etc/exports file when NFS-exporting a filesystem or directory. If there is an entry for the filesystem or directory in this file, HACMP uses the options listed. If the filesystem or directory is not listed in the file, or if you have not created the /usr/es/sbin/cluster/etc/exports file, the filesystem or directory is NFS-exported with the default option of root access for all cluster nodes.
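As an illustration, entries in /usr/es/sbin/cluster/etc/exports follow the same syntax as /etc/exports; the directory and node names below are hypothetical:

    # Example entry in /usr/es/sbin/cluster/etc/exports (hypothetical names):
    # export /sharedvg1/appdata with root and read-write access restricted
    # to the cluster nodes nodeA and nodeB
    /sharedvg1/appdata -root=nodeA:nodeB,rw=nodeA:nodeB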
Configuring the Optional /usr/es/sbin/cluster/etc/exports File
In this step, you add the directories of the shared filesystems to the exports file. Complete the following steps for each filesystem you want to add to the exports file. Refer to your NFS-Exported Filesystem Worksheet.
Remember that this alternate exports file does not specify what will be exported, only how it will be exported. To specify what to export, you must put it in a resource group.
To add a directory to Exports List:
1. In SMIT, enter the fastpath smit mknfsexp.
2. In the EXPORT directory now, system restart or both field, enter restart.
3. In the PATHNAME of alternate Exports file field, enter /usr/es/sbin/cluster/etc/exports. This step creates the alternate exports file which will list the special NFS export options.
4. Add values for the other fields as appropriate for your site, and press Enter. Use this information to update the /usr/es/sbin/cluster/etc/exports file.
5. Return to the Add a Directory to Exports List panel, or exit SMIT if you are finished.
6. Repeat steps 1 through 4 for each filesystem or directory listed in the FileSystems/Directories to Export field on your planning worksheets.
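To confirm the result, you can review the alternate exports file directly and, once cluster services have exported the directories, list the active exports:

    # Review the export options recorded in the alternate exports file
    cat /usr/es/sbin/cluster/etc/exports

    # With cluster services active, list what is currently NFS-exported
    exportfs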
Forcing a Varyon of Volume Groups
Forcing a varyon of volume groups is an option that you should use only with a full understanding of its consequences. This section describes the conditions under which you can safely attempt to bring a volume group online forcefully on a node when a normal varyon operation fails due to a loss of quorum.
For a complete overview of the forced varyon functionality and quorum issues, see the section Forcing a Varyon in Chapter 5: Planning Shared LVM Components in the Planning Guide.
We recommend that you specify the super strict disk allocation policy for the logical volumes in volume groups for which forced varyon is specified. Configuring the super strict disk allocation policy for volume groups that may be forced online does the following:
Guarantees that copies of a logical volume are always on separate disks.
Increases the chances that forced varyon will be successful after a failure of one or more disks.
Note: You should apply the super strict disk allocation policy for disk enclosures in the cluster. You specify the super strict policy under the Allocate each logical partition copy on a separate physical volume? option in the Add a Logical Volume or Change/Show a Logical Volume SMIT panels in AIX 5L. Also, if you are using the super strict disk allocation policy, specify the correct number of physical volumes for this logical volume; do not accept the default setting of 32 physical volumes.
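For reference, the super strict allocation policy can also be set from the command line when a logical volume is created. The sketch below uses hypothetical names and sizes; check the mklv documentation for how the upper bound (-u) interacts with super strict allocation on your system:

    # Create a mirrored logical volume with two copies (-c 2), super strict
    # allocation (-s s), and an explicit upper bound of physical volumes (-u 2)
    # (appdatalv, sharedvg1 and the size of 100 LPs are hypothetical)
    mklv -y appdatalv -t jfs2 -c 2 -s s -u 2 sharedvg1 100

    # Confirm the policy: lslv shows whether each LP copy is kept on a
    # separate physical volume and shows the configured upper bound
    lslv appdatalv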
Use independent disk enclosures that use logical volume mirroring; place logical volume mirror copies on separate disks that rely on separate power supplies, and use separate physical network interfaces to ensure access. This ensures that no disk is a single point of failure for your cluster.
You can specify a forced varyon attribute for:
Volume groups on SSA or SCSI disks that use LVM mirroring, where you want to NFS mount the filesystems.
Volume groups that are mirrored between separate RAID or ESS devices.
Note: Be aware that when the forced varyon facility is used successfully and the volume group is brought online on the node (using the one complete copy of the data that was found), the data that you recover by forcing a volume group to go online is guaranteed to be consistent, but not necessarily the latest.
Note: During runtime, for large volume groups (those with more than 256 disks), checking logical partition maps may take extra processing time. However, since this time delay occurs only when you select a forced varyon for a large volume group in the case when a normal varyon failed due to a lack of quorum, enduring a slow varyon process that enables data recovery is preferable to having no chance at all to activate the volume group.
When HACMP Attempts a Forced Varyon
For troubleshooting purposes, it is helpful to know under what conditions or cluster events HACMP attempts a forced varyon, when this is configured. In general, HACMP attempts a forced varyon in the event of a cluster failure. The following list contains examples of cluster event failures that can trigger a forced varyon:
Cluster startup, when a normal varyon fails due to a loss of quorum on one of the disks.
Nodes joining the cluster, when a normal varyon fails due to a loss of quorum on one of the disks.
Node reintegration, when a normal varyon fails for concurrent resource groups.
Selective fallover caused by an application or a node failure that moves a resource group to a takeover node.
Selective fallover caused by a loss of quorum for a volume group that moves a resource group to a takeover node.
When HACMP selectively moves a resource group for which a loss of quorum error has occurred for a volume group, it tries to bring the volume groups online on the takeover node. If a normal varyon process for the volume groups fails at this point, and you have specified a forced varyon for the volume groups in this resource group, then, since quorum is lost, HACMP attempts a forced varyon operation.
To summarize, for the cases where HACMP uses selective fallover to move the resource groups, the sequence of events would be the following:
If, after an rg_move event, a forced varyon is launched and is successful, the resource group remains online on the node to which it has been moved.
If, after an rg_move event, a forced varyon is launched and fails, selective fallover continues to move the resource group down the node chain.
Note: If a resource failure occurs in a concurrent resource group, HACMP takes this resource group offline on a particular node. In this case, use the clRGmove utility to manually bring the resource group online on that node; a sketch follows this note.
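As a sketch, once the underlying resource problem is resolved, the offline instance of the concurrent resource group might be brought back online with clRGmove; the group and node names are hypothetical, and the flag syntax can differ by release, so verify it against the clRGmove documentation on your system:

    # Bring the concurrent resource group instance online on one node
    # (rg_conc01 and nodeB are hypothetical; verify flags for your release)
    /usr/es/sbin/cluster/utilities/clRGmove -g rg_conc01 -n nodeB -u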
Avoiding a Partitioned Cluster
The forced option to activate a volume group must be used with care. Should the cluster become partitioned, each partition might force on the volume group and continue to run. In this case, two unequal copies of the data will be active at the same time. This situation can cause data divergence and does not allow a clean recovery. Were this to happen with a concurrent volume group, the consequences would be even worse, as the two sides of the cluster would have made uncoordinated updates.
To prevent cluster partitioning, configure multiple heartbeating paths. Where possible, use heartbeating through a disk path (TMSCSI, TMSSA or disk heartbeating).
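To review which heartbeat paths are configured, the cltopinfo utility (where available in your release) lists the cluster networks, so you can confirm that a non-IP path such as a disk heartbeat network exists alongside the IP networks:

    # Display the cluster topology, including all configured networks;
    # look for a non-IP heartbeat network (for example, a diskhb network)
    /usr/es/sbin/cluster/utilities/cltopinfo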
Verification Checks for Forced Varyon
If you specified a forced varyon attribute for a resource group, and HACMP detects that the logical volumes are not mirrored with the super strict disk allocation policy, HACMP issues a warning during verification of cluster resources. In this case, a forced varyon operation may not succeed.
As part of the process, HACMP checks the logical partitions on each disk for each volume group:
If HACMP cannot find a complete copy of every logical volume for a volume group, the following error message is written to the hacmp.out file: “Unable to vary on volume group <vg name> because logical volume <logical volume name> is incomplete”. In this case, the forced varyon operation fails and you see an event error.
If HACMP can find a complete copy of every logical volume for all volume groups in this resource group that require a forced varyon, it varies on the volume groups on the node.
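If a forced varyon fails, you can locate the message above in the hacmp.out log; the path shown is the usual default, but your log directory may differ:

    # Search the cluster event log for forced varyon failures
    # (/tmp/hacmp.out is a common default location; adjust the path if
    #  your cluster logs to a different directory)
    grep "Unable to vary on volume group" /tmp/hacmp.out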
Testing Your Configuration
After you configure a cluster, you should test it before making it available in a production environment. For information about using the Cluster Test Tool to test your cluster, see Chapter 8: Testing an HACMP Cluster.