Chapter 15: Managing Resource Groups in a Cluster
This chapter describes how to reconfigure the cluster resource groups. It describes adding and removing resource groups, and changing resource group attributes and processing order. It also covers the Resource Group Management utility that allows you to change the status and location of resource groups dynamically using the SMIT interface, or the clRGmove command. This utility lets you move resource groups to other cluster nodes, for instance to perform system maintenance on a particular cluster node.
If you have dependent resource groups in the cluster, see the section Reconfiguring Resources in Clusters with Dependent Resource Groups in Chapter 14: Managing the Cluster Resources for information on making dynamic reconfiguration changes to the cluster resources.
Changes to the cluster resource groups are grouped under two general categories in this chapter: changes to the resource groups themselves (adding, removing, and reconfiguring resource groups and their resources) and resource group migration (dynamically changing the status and location of resource groups).
Note: You can use either ASCII SMIT or WebSMIT to configure and manage the cluster and view interactive cluster status. You can also use WebSMIT to navigate, configure and view the status of the running cluster and graphical displays of sites, networks, nodes and resource group dependencies.
Changes to Resource Groups
Changes you make to resource groups consist of the following actions:
- Adding a resource group
- Removing a resource group
- Changing the resource group processing order
- Changing the configuration of a resource group and its attributes
- Adding or removing individual resources.
Reconfiguring Cluster Resources and Resource Groups
When you initially configured your HACMP system, you defined each resource as part of a resource group. This allows you to combine related resources into a single logical entity for easier configuration and management. You then configured each resource group to have a particular kind of relationship with a set of nodes. You also assigned a priority to each participating node for some non-concurrent resource groups.
To change the nodes associated with a given resource group or to change the priorities assigned to the nodes in a resource group chain, you must redefine the resource group. You must also redefine the resource group if you add or change a resource assigned to the group.
You can also redefine the order in which HACMP attempts to acquire and release the resource groups in your cluster. In general, HACMP processes all individual resource groups configured in your cluster in parallel unless you define a specific serial order in which certain resource groups are acquired or released, using the Change/Show Resource Group Processing Order panel in SMIT.
For general information about customizing the serial order of processing resource groups, see Chapter 6 in the Planning Guide.
This section describes how to view, change, add, and delete a resource group. For more information about the initial configuration of cluster resources, see Chapter 6 in the Planning Guide and the chapters in this Guide that describe initial configuration, such as Chapter 5: Configuring HACMP Resource Groups (Extended).
Adding a Resource Group
You can add a resource group to an active cluster. You do not need to stop and then restart cluster services for the resource group to become part of the current cluster configuration.
To add a resource group, see Chapter 5: Configuring HACMP Resource Groups (Extended).
If the Cluster Manager is running on the local node, synchronizing the cluster triggers a dynamic reconfiguration event. For more information, see Chapter 7: Verifying and Synchronizing an HACMP Cluster.
Removing a Resource Group
You can remove a resource group from an active cluster. You do not need to stop and then restart cluster services for the resource group to be removed from the current cluster configuration.
To remove a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Extended Resource Group Configuration > Remove a Resource Group and press Enter.
3. Select the resource group you want to remove and press Enter. SMIT displays a popup warning, reminding you that all information about the resource group will be lost.
Note: If you have the following parent/child resource group dependency chain configured: A > B > C, and remove the resource group B, HACMP sends a warning that the dependency links between A and B, and between B and C are also removed. For more information, see Configuring Dependencies between Resource Groups in Chapter 5: Configuring HACMP Resource Groups (Extended).
4. Press Enter again to confirm your action.
5. Return to previous SMIT panels to perform other configuration tasks.
6. To synchronize the cluster definition, in SMIT, select Extended Configuration > Extended Verification and Synchronization and press Enter.
If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event. For more information, see Chapter 7: Verifying and Synchronizing an HACMP Cluster.
Changing Resource Group Processing Order
By default, HACMP acquires and releases resource groups in parallel. To view or change the current order in which HACMP processes resource groups in your cluster use the Change/Show Resource Group Processing Order panel in SMIT. See Configuring Processing Order for Resource Groups in Chapter 5: Configuring HACMP Resource Groups (Extended).
Resource Group Ordering during DARE
In general, HACMP processes all individual resource groups configured in your cluster in parallel unless you define a specific serial order in which certain resource groups are acquired or released. Handling of any dependencies between resource groups takes priority over any serial processing you specify.
If you need to control the actual processing order during dynamic reconfiguration (DARE), make the changes to only one resource group at a time. Otherwise, the order in which resource groups are acquired and released may be unpredictable.
During the dynamic reconfiguration process, you could have two scenarios:
- Prior to the dynamic reconfiguration, the processing order for all resource groups was parallel, and you did not change it during dynamic reconfiguration (DARE). In this case, during the dynamic reconfiguration process, HACMP processes the resource groups in alphabetically sorted order, and not in parallel. If you made changes to particular resource groups in the cluster, these changes may affect the order in which these resource groups are actually released and acquired.
- Prior to the dynamic reconfiguration, the processing order for some resource groups was parallel, and some resource groups were included in the list for serial processing. In this case, if during DARE you change the serial order in which some of the resource groups are acquired or released on the nodes, the newly specified order becomes valid during the reconfiguration process. HACMP uses the new order during the same cluster reconfiguration cycle.
After reconfiguration is complete, HACMP returns to the usual order of processing, as described below.
Resource group acquisition in HACMP occurs in the following order:
1. Resource groups for which the customized order is specified are acquired in the customized serial order.
2. If some of the resource groups in the cluster have dependencies between them, these resource groups are processed in phases. For example, parent resource groups are acquired before child resource groups are acquired.
3. Resource groups that mount NFS only are processed in the specified order.
4. Resource groups that are not included in the customized ordering lists are acquired in parallel.
Resource group release in HACMP occurs in the following order:
1. Resource groups for which no customized order has been specified are released in parallel.
2. HACMP releases resource groups that are included in the customized release ordering list.
3. If some of the resource groups in the cluster have dependencies between them, these resource groups are processed in phases. For example, the child resource groups are released before the parent resource groups are released.
4. Resource groups that must unmount NFS are processed in the specified order.
However, if you made changes to particular resource groups in the cluster, these changes may affect the order in which these resource groups are released and acquired. As a result, during the dynamic reconfiguration process, the actual order in which resource groups are acquired and released is unpredictable.
This order is dependent on the changes you make to the order during DARE, and on the types of dynamic changes you make to the resource groups themselves. For instance, due to the changes you made to a particular resource group, this resource group may need to be released before others in the list, even though the alphabetically-sorted order is used for the remaining resource groups.
Changing the Configuration of a Resource Group
You can change the following for a configured resource group:
- The name of the resource group
- The nodes in the list of participating nodes
- The site management policy of a resource group
- The priority of participating nodes (by changing their position in the list of participating nodes)
- The startup, fallover and fallback policies for resource groups
- Attributes of the resource group.

Warning: If you have added resources to the resource group, you need to remove them prior to changing the resource group’s startup, fallover and fallback policies.
You can change most of the attributes of a resource group in an active cluster without having to stop and then restart cluster services. However, to change the name of a resource group, you must stop and then restart the cluster to make the change part of the current cluster configuration.
Changing the Basic Configuration of a Resource Group
To change the basic configuration of a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show a Resource Group. SMIT displays a list of the currently defined resource groups.
3. Select the resource group to change and press Enter.
Note: HACMP shows only the valid choices for the specified resource group.
4. Enter field values as necessary.
5. Press Enter to change the resource group information stored in the HACMP Configuration Database (ODM).
6. Return to previous SMIT panels to perform other configuration tasks or to synchronize the changes you just made.
If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event. For more information, see Chapter 7: Verifying and Synchronizing an HACMP Cluster.
Changing Resource Group Attributes
To change the attributes of a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration > Change/Show Resources/Attributes for a Resource Group. SMIT displays a list of the currently defined resource groups.
3. Select the resource group you want to change and press Enter.
4. Change field values as needed.
5. Press Enter to change the resource group information stored in the HACMP Configuration Database.
6. Return to previous SMIT panels to perform other configuration tasks.
7. To synchronize the changes you made, in SMIT, select Extended Configuration > Extended Verification and Synchronization and press Enter.
If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event.
Changing a Dynamic Node Priority Policy
You can use SMIT to change or show a dynamic node priority policy.
To show or change the dynamic node priority policy for a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Extended Resource Group Configuration > Change/Show Resources/Attributes for a Resource Group and press Enter.
3. Select the resource group.
You can change the dynamic node priority policy on the next panel if you have previously configured one.
4. Select the policy you want and press Enter. See Dynamic Node Priority Policies in Chapter 5: Configuring HACMP Resource Groups (Extended) for more information.
Changing a Delayed Fallback Timer Policy
You can use SMIT to change or show a delayed fallback timer policy.
To change or show a previously configured fallback policy, follow these steps:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Extended Resource Group Configuration > Configure Resource Group Run-Time Policies > Change/Show a Delayed Fallback Timer Policy and press Enter. A picklist displays the previously configured timer policies.
3. Select the fallback timer policy to change.
4. Change the fallback timer policy on the next panel.
The new value for the timer takes effect after you synchronize the cluster and after the resource group is released and restarted (on the same node or on a different node), either as a result of a cluster event or because you move the group to another node.
Note that you can change the parameters, but you cannot change the type of recurrence for the specific fallback timer. However, you can configure another fallback timer policy that uses a different predefined recurrence, and assign it to a resource group.
Removing a Delayed Fallback Timer Policy for a Resource Group
Note that you cannot remove a delayed fallback timer while any resource group is configured to use it. First, change or remove the delayed fallback timer attribute in any resource group configured to use the unwanted timer; then remove the timer policy, as described in the following procedure.
To delete a previously configured delayed fallback timer policy:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration> Extended Resource Configuration > Extended Resource Group Configuration > Configure Resource Group Run-Time Policies > Delete a Delayed Fallback Timer Policy and press Enter. A picklist displays the previously configured timer policies.
3. Select the fallback timer policy to remove and press Enter. You will be asked Are you sure?
4. Press Enter again.
Showing, Changing, or Deleting a Settling Time Policy
You can change, show or delete previously configured settling time policies using the Extended HACMP Resource Group Configuration > Configure Resource Group Run-Time Policies > Configure Settling Time Policy SMIT path.
Changing a Location Dependency between Resource Groups
There are three types of location dependencies between resource groups:
- Online On Same Node
- Online On Different Nodes
- Online On Same Site

Changing an Online on Same Node Dependency
To change an Online on Same Node location dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Same Node Dependency > Change/Show Online on Same Node Dependency between Resource Groups and press Enter.
3. Select the Online on Same Node dependency set of resource groups to show.
4. Add a resource group to the selected Online on Same Node dependency set of resource groups.
5. Press Enter.
6. Verify the cluster.
Changing an Online on Different Nodes Dependency
To change an Online on Different Nodes location dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Different Nodes Dependency > Change/Show Online on Different Nodes Dependency between Resource Groups and press Enter.
3. Select the Online on Different Nodes dependency set of resource groups to show.
4. Make changes as required and then press Enter.
5. Verify the cluster.
Changing an Online on Same Site Dependency
To change an Online on Same Site location dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Same Site Dependency > Change/Show Online on Same Site Dependency between Resource Groups and press Enter.
3. Select the Online on Same Site dependency set of resource groups to show.
4. Add or remove resource groups from the list.
5. Verify the cluster.
Changing a Parent/Child Dependency between Resource Groups
To change a parent/child dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Parent/Child Dependency > Change/Show Parent/Child Dependency between Resource Groups and press Enter.
A list of child-parent resource group pairs appears.
3. Select a pair from the list and press Enter. A screen appears where you can change the parent resource group or the child resource group.
4. Change the resource groups as required and press Enter. Note that you cannot change the Dependency Type.
Displaying a Parent/Child Dependency between Resource Groups
You can display parent/child dependencies between resource groups.
Note: You can use either ASCII SMIT or WebSMIT to display parent/child dependencies between resource groups. For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
To display a dependency between parent/child resource groups:
1. Enter smit hacmp
2. In SMIT, select Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Parent/Child Dependency > Display All Parent/Child Resource Group Dependencies and press Enter.
3. Select Display per Child or Display per Parent to display all resource group dependencies for a child resource group, or for a parent resource group. Press Enter.
4. HACMP displays a list similar to the following:

Resource Group (RG_a) has the following child resource groups:
RG_b
RG_c
RG_d

Resource Group (RG_e) has the following child resource groups:
RG_b
RG_c
RG_d
Displaying a Parent/Child Dependency between Resource Groups in WebSMIT
You can display parent/child dependencies between resource groups using WebSMIT.
Starting with HACMP 5.4, you can use WebSMIT to view graphical displays of sites, networks, nodes and resource group dependencies. For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
Removing a Dependency between Resource Groups
You can remove any of the four types of dependencies between resource groups.
Deleting a Parent/Child Dependency between Resource Groups
To delete a parent/child dependency between resource groups:
1. Enter smit hacmp
2. In SMIT, select HACMP Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Parent/Child Dependency > Delete a Dependency between Parent/Child Resource Groups and press Enter.
HACMP displays a list of child-parent resource group pairs.
3. Select a pair from the list to delete and press Enter. Deleting a dependency between resource groups does not delete the resource groups themselves.
Note: If you have the following dependency chain configured: A > B > C, and remove the resource group B, HACMP sends a warning that the dependency links between A and B, and between B and C are also removed.
Deleting a Location Dependency between Resource Groups
To delete a location dependency between resource groups:
1. In SMIT, select the path for configuring the location dependency that you want to remove.
This example shows the path for Online on Same Node Dependency: Extended Resource Configuration > Configure Resource Group Run-Time Policies > Configure Dependencies between Resource Groups > Configure Online on Same Node Dependency > Remove Online on Same Node Dependency between Resource Groups. Press Enter.
2. Select the Online on Same Node dependency to remove and press Enter.
Deleting a dependency between resource groups does not delete the resource groups themselves. The resource groups are now handled individually according to their site management, startup, fallover, and fallback policies.
Adding or Removing Individual Resources
You can add a resource to or remove a resource from a resource group in an active cluster without having to stop and restart cluster services to apply the change to the current configuration. You can add or remove resources from resource groups even if another node in the cluster is inactive. However, it is more convenient to have nodes active, so you can obtain a list of possible shared resources for each field by pressing the F4 key when you are in the SMIT Change/Show Resources/Attributes for a Resource Group panel.
Resource groups can contain many different types of cluster resources, including IP labels/addresses, filesystems, volume groups and application servers. You can change the mix of resources in a resource group and the settings of other cluster resource attributes by using the SMIT Change/Show Resources/Attributes for a Resource Group panel. See the following section.
Reconfiguring Resources in a Resource Group
To change the resources in a resource group:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > Configure Resource Groups > Change/Show Resources/Attributes for a Resource Group and press Enter. SMIT displays a picklist of configured resource groups.
3. Select the resource group you want to change and press Enter. SMIT displays a panel that lists all the types of resources that can be added to the type of selected resource group, with their current values.
Note: If you specify filesystems to NFS mount in a non-concurrent resource group with the startup policy of either Online on Home Node Only or Online on First Available Node, you must also configure the resource to use IP Address Takeover. If you do not do this, takeover results are unpredictable. You should also set the field value Filesystems Mounted Before IP Configured to true so that the takeover process proceeds correctly.
4. Enter the field values you want to change, and press Enter.
5. Return to previous SMIT panels to perform other configuration tasks or to synchronize the changes you just made.
If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event. For more information, see Chapter 7: Verifying and Synchronizing an HACMP Cluster.
Forcing a Varyon of a Volume Group
You can force a varyon of a volume group either by specifying an attribute in SMIT, or by entering a command at the command line. It is recommended that you use SMIT to force a varyon because HACMP does the following before attempting to activate a volume group on a node:
- Checks whether LVM mirroring is used for the disks
- Verifies that at least one copy of every logical volume for this volume group can be found.

It is recommended that you specify the super strict allocation policy for the logical volumes in volume groups for which forced varyon is specified.
As with regular volume group operations, you can determine the final status of the volume group using the messages logged by HACMP during the verification process and the information logged in the hacmp.out file. You can also use the lsvg -o command to verify whether a volume group is offline or online, and the lsvg -l command to check the volume group status and attributes.
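For example, a quick status check from the command line might look like this (sharedvg is a placeholder volume group name):

# List the volume groups that are currently varied on (online)
lsvg -o

# List the logical volumes in sharedvg, with their state and mount points
lsvg -l sharedvg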
If, after checking the partition maps, HACMP cannot find a complete copy of every logical volume for a volume group, the error message “Unable to vary on volume group <vg name> because logical volume <logical volume name> is incomplete” appears in the hacmp.out file and the volume group remains offline.
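To locate this message, you can search the hacmp.out log directly; a minimal sketch, assuming a log location of /tmp/hacmp.out (the path may differ on your cluster):

# Search the HACMP event log for forced varyon failures
grep "Unable to vary on volume group" /tmp/hacmp.out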
For more information about the forced varyon functionality and quorum issues, see Chapter 5: Planning Shared LVM Components in the Planning Guide.
Forcing a Varyon of a Volume Group from SMIT
Note that you specify a forced varyon attribute for all volume groups that belong to a resource group. For instructions on setting a forced varyon attribute using SMIT, see Forcing a Varyon of Volume Groups in Chapter 5: Configuring HACMP Resource Groups (Extended).
With this attribute, if a normal varyonvg fails, HACMP checks whether there is at least one complete copy of all data available in the volume group. If there is, it runs varyonvg -f; otherwise, the volume group remains offline. Specifying the forced varyon attribute for a volume group eliminates the need for a quorum buster disk or special scripts to force a varyon, although you can continue to use these methods.
Use the following procedure to ensure that you always have access to your data if there is a copy available, and that you receive notification if you lose either a copy of your data, or all copies.
To use HACMP forced varyon and error notification:
1. Disable quorum on your volume group. This will ensure that it does not vary off if you still have access to a copy of your data.
2. Use the SMIT forced varyon option to vary on your volume group if your data is available.
3. Set up error notification to inform you if a filesystem or logical volume becomes unavailable.
For information about creating scripts for cluster events, see Chapter 7: Planning for Cluster Events in the Planning Guide.
Forcing a Varyon of a Volume Group from the Command Line
Issue the varyonvg -f command for a specific volume group on a node in the cluster. If you use this method, HACMP does not verify that the disks are LVM mirrored, and does not check the logical partitions to verify that at least one complete copy of every logical volume can be found for this volume group. You should use this command with caution to avoid forcing a varyon of a volume group in a partitioned cluster. For more information, see the section Avoiding a Partitioned Cluster.
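A minimal sketch of the command line approach (sharedvg is a placeholder volume group name; first confirm that the cluster is not partitioned):

# Force the volume group online, bypassing quorum checks.
# Note: no LVM mirroring or partition-map verification is performed here.
varyonvg -f sharedvg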
Warning: Forcing a varyon with non-mirrored logical volumes and missing disk resources can cause unpredictable results (both conditions must be present to cause problems). Force a varyon only with a complete understanding of the risks involved. For more information, see the following section. Also, refer to the AIX 5L documentation.
Avoiding a Partitioned Cluster
Use care when using forced varyon to activate a volume group. If the cluster becomes partitioned, each partition might force the volume group to vary on and continue to run. In this case, two different copies of the data are active at the same time. This situation is referred to as data divergence, and does not allow a clean recovery. If this happens in a concurrent volume group, the two sides of the cluster have made uncoordinated updates.
To prevent cluster partitioning, you should configure multiple heartbeat paths between the disks in the cluster.
Resource Group Migration
The Resource Group Management utility lets you perform maintenance on a node without losing access to the node’s resources. You are not required to synchronize cluster resources or stop cluster services.
The Resource Group Management utility provides improved cluster management by allowing you to:
- Bring a resource group online or offline (see the command sketch following this list).
- Move a non-concurrent resource group to a new location. This location can be a node in the same site or a node in the other site.

Non-concurrent resource groups are resource groups that do not have the Online on All Available Nodes startup policy; that is, they are not online simultaneously on multiple nodes in the cluster. Concurrent resource groups are resource groups that have the Online on All Available Nodes startup policy; that is, they start on all nodes in the cluster.

If you have requested HACMP to move, activate, or stop a particular resource group, no additional operations on any other groups run until this operation is completed.
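As a sketch only: from the command line, the same online and offline operations use the clRGmove flags documented later in this chapter (the resource group and node names below are placeholders):

# Take resource group myRG offline on node nodeA
clRGmove -g myRG -n nodeA -d

# Bring resource group myRG back online on node nodeA
clRGmove -g myRG -n nodeA -u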
Specific considerations related to resource group migrations are:
HACMP attempts to recover resource groups in the ERROR state upon node_up events. However, if you moved a group to Node A, the group remains on Node A (even in the ERROR state). When Node B joins the cluster, it does not acquire any resource groups that are currently in the ERROR state on Node A. To recover such resource groups, manually bring them online or move them to other nodes.

For a summary of clRGmove actions that are allowed in clusters with sites, see the section Migrating Resource Groups with Replicated Resources later in this chapter.

Note: When you request HACMP to perform resource group migration, it uses the clRGmove utility, which moves resource groups by calling an rg_move event. It is important to distinguish between an rg_move event that is triggered automatically by HACMP, and an rg_move event that occurs when you explicitly request HACMP to manage resource groups for you. To track and identify the causes of operations performed on the resource groups in the cluster, look for the command output in SMIT and for the information in the hacmp.out file.
This section covers:
- Requirements before Migrating a Resource Group
- Migrating Resource Groups Using SMIT
- Migrating Resource Groups from the Command Line
- Checking Resource Group State
- Migrating Resource Groups with Replicated Resources.
Requirements before Migrating a Resource Group
Before attempting to explicitly move a resource group from one node to another, or to take a resource group online or offline, ensure that:
- HACMP 5.4 is installed on all nodes.
- The Cluster Manager is running on the node that releases the resource group and on the node that acquires it.
- The cluster is stable. If the cluster is not stable, the operation that you request with the resource group terminates and you receive an error message.

Migrating Resource Groups with Dependencies
HACMP prevents you from moving resource groups online or to another node under the following conditions:
- If you took the parent resource groups offline with the Resource Group Management utility, clRGmove, HACMP rejects manual attempts to bring online the resource groups that depend on these resource groups. The error message lists the parent resource groups that you must activate first to satisfy the resource group dependency.
- If you have a parent and a child resource group online, and would like to move the parent resource group to another node or take it offline, HACMP prevents you from doing so before the child resource group is taken offline. However, if both parent and child are in the same Same Node or Same Site location dependency set, you can move them both as you move the whole set.
- You can move Same Node dependency or Same Site dependency sets of resource groups. If you move one member of one of these sets, the whole set moves. The rules for location dependencies may not allow some moves. See the section Moving Resource Groups with Dependencies in a Cluster with Sites.
Migrating Resource Groups Using SMIT
Note: You can access the resource group migration functions using the ASCII SMIT, WebSMIT, or the command line interface. This section includes instructions on how to move, bring online, or take offline concurrent and non-concurrent resource groups using ASCII SMIT.
For more information on the command line, see Migrating Resource Groups from the Command Line in this chapter.
For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
To manage resource groups in SMIT:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management and press Enter.
SMIT presents the following options for resource group migration.
Show Current State of Applications and Resource Groups
Displays the current states of applications and resource groups for each resource group:
- For non-concurrent groups, HACMP shows only the node on which they are online and the applications state on this node.
- For concurrent groups, HACMP shows ALL nodes on which they are online and the applications states on the nodes.
- For groups that are offline on all nodes, only the application states are displayed; node names are not listed.
Bring a Resource Group Online
This option brings a resource group online by calling the rg_move event. Select this option to activate all resources in a specified resource group on the destination node that you specify, or on the node that is the current highest priority node for that resource group. For more information, see Bringing a Non-Concurrent Resource Group Online with SMIT or Bringing a Concurrent Resource Group Online with SMIT.

Bring a Resource Group Offline
This option brings a resource group offline on a node. Use this option to deactivate all resources in a specified resource group on a specific destination node. For more information, see Taking a Concurrent Resource Group Offline or Taking a Non-Concurrent Resource Group Offline with SMIT.

Move a Resource Group to Another Node/Site
This option moves resource group(s) between nodes within the same site, or between sites. This option is applicable only to non-concurrent resource groups. For more information, see the section Moving Non-Concurrent Resource Groups to Another Node with SMIT.
3. Select a type of resource group migration you want to perform and press Enter.
Note: If you have replicated resources defined in your cluster, resource group selector panels and node selector panels in SMIT contain additional information to help you in the selection process.
In resource group selector panels, the resource group state, owner node and site appear to the right of the resource group name. In node selector panels, if the node belongs to a site, the name of that site appears to the right of the node name. Only the nodes within the same site appear in the selector panel for the destination node.

Moving Non-Concurrent Resource Groups to Another Node with SMIT
You can move a non-concurrent resource group to a specified node in the cluster. You have two choices: move the group to another node within the same site, or move it to a node at the other site. Note the following:
If you have defined location dependencies between resource groups (Online On Same Node, Online At Same Site) the picklist of resource groups lists these sets together on the same line. Moving any one of these resource groups causes the whole set to move.
If you have defined an Online on Different Nodes location dependency, HACMP may prevent you from moving a resource group or a set to a node if a higher priority resource group is already online on that node.
Moving a non-concurrent resource group to a node at the same site with SMIT
To move a non-concurrent resource group to another node at the same site:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Move a Resource Group to Another Node/Site and press Enter.
3. Select Move to another node within the same site and press Enter.
A picklist with resource groups appears. The list includes only those non-concurrent resource groups that are online on any node in the cluster.
Note: Resource groups with Online On The Same Node dependencies are listed as sets together (to move the whole set).
4. Select the resource group from the picklist and press Enter. The Select a Destination Node panel with potential destination nodes appears.
5. SMIT displays the nodes on which the Cluster Manager is running, that are included in the resource group’s nodelist, and that can acquire the resource group or set of groups. Select a destination node and press Enter.
6. Enter field values as follows:
Resource Groups to be Moved
The resource group(s) that will be moved.

Destination Node
The destination node that you selected in the previous panel. This field is non-editable.
7. Confirm your selections and press Enter. You do not need to synchronize the cluster. The clRGmove utility waits until the rg_move event completes. This may result in a successful migration of the resource group, or a failure. If the move fails, check hacmp.out for the reason.
Moving a non-concurrent resource group to a node at another site with SMIT
To move a non-concurrent resource group to a node at another site:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Move Resource Groups to Another Node/Site > Move Resource Groups to Another Site and press Enter.
3. A picklist with resource groups appears. The list includes only those non-concurrent resource groups that are online on any node in the cluster. If you have defined dependencies between resource groups, these resource groups are listed on the same line in the same order you entered them when you defined the dependency.
Resource groups with same site dependencies are listed first, followed by resource groups with same node dependencies, then individual resource groups.
4. Select the resource group (or set) from the picklist and press Enter. The Select a Destination Site panel appears. If HACMP finds that the originally configured Primary site for the group is now available to host the group, it indicates this in the picklist with an asterisk (*).
5. Select a node at the destination site and press Enter.
6. Confirm your selections and press Enter. You do not need to synchronize the cluster. The clRGmove utility waits until the rg_move event completes. This may result in a successful migration of the resource group, or a failure.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was successfully moved. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If the resource group migration fails, HACMP displays a message with the cause of failure. Be sure to take action in this case to stabilize the cluster, if needed. For more information, see the section No Automatic Recovery for Resource Groups That Fail to Migrate.
Bringing a Non-Concurrent Resource Group Online with SMIT
To bring a non-concurrent resource group online:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Bring a Resource Group Online. The picklist appears. It lists resource groups that are offline or in the ERROR state on all nodes in the cluster.
3. Select the non-concurrent resource group from the picklist and press Enter.
The Select a Destination Node picklist appears. It displays only the nodes that have cluster services running, participate in the resource group nodelist, and have enough available resources to host the resource group. The nodes in this list appear in the same order of priority as in the resource group nodelist.
4. Select a destination node and press Enter. If HACMP finds that the originally configured highest priority node for the group is now available to host the group, it indicates this in the picklist with an asterisk (*).
5. Enter field values as follows:
Resource Group to Bring Online
The resource group to be activated.

Destination Node
The destination node you selected.
6. Confirm your selections and press Enter to start the execution of the rg_move event and bring the resource group online. You do not need to synchronize the cluster.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was successfully brought online on the specified node. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If you requested HACMP to activate the resource group on a particular node, and this node fails to bring the resource group online, the resource group is put into the ERROR state. In this case, HACMP does not attempt to activate the resource group on any other node in the cluster without your intervention. The error message in this case indicates that your intervention is required to activate the resource group on another node and stabilize the cluster.
Taking a Non-Concurrent Resource Group Offline with SMIT
To take a non-concurrent resource group offline:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Bring a Resource Group Offline. The picklist appears. It lists only the resource groups that are online or in the ERROR state on all nodes in the cluster.
3. Select the non-concurrent resource group from the picklist and press Enter.
After selecting a resource group, the Select a Destination Node picklist appears. It lists only the nodes on which cluster services are running and on which the resource group is currently online or in the ERROR state.
4. Select a destination node from the picklist. When you select it, it becomes the temporarily set highest priority node for this resource group.
5. Enter field values as follows:
Resource Group to Bring Offline
The resource group that is stopped or brought offline.

Destination Node
The node on which the resource group will be stopped.
6. Confirm your selections and press Enter to start the execution of the rg_move event and bring the resource group offline. You do not need to synchronize the cluster.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was successfully stopped on the specified node. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If you requested to bring a resource group offline on a particular node, and the resource group fails to release from the node on which it is online, an error message indicates that your intervention is required to stabilize the cluster.
Bringing a Concurrent Resource Group Online with SMIT
You can use SMIT to bring a concurrent resource group online either on one node or on ALL nodes in the cluster.
Note: Concurrent resource groups are those resource groups that have the Online on All Available Nodes startup policy; that is, they start on all nodes in the cluster.
Bringing a resource group online in SMIT or through the command line activates (starts) the specified resource group on one node, or on ALL nodes.
To bring a concurrent resource group online:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Bring a Resource Group Online. The picklist appears. It lists only the resource groups that are offline or in the ERROR state on some or all nodes in the cluster.
3. Select the concurrent resource group from the picklist and press Enter.
The Select a Destination Node picklist appears. HACMP displays only the nodes that have cluster services running, participate in the resource group’s nodelist and have enough available resources to host the resource group. If HACMP finds that an originally configured highest priority node for the group is now available to host the group, it indicates this in the picklist with an asterisk (*).
4. Select a destination node and press Enter.
After you select a destination node, the Bring a Resource Group Online panel appears, showing your selections.
5. Confirm your selections and press Enter to start the execution of the rg_move event and bring the resource group online. You do not need to synchronize the cluster.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was successfully brought online. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If you requested HACMP to activate the resource group on the node(s), and a particular node fails to bring the resource group online, the resource group is put into the ERROR state. In this case, HACMP does not attempt to activate the resource group on any other node in the cluster without your intervention. The error message in this case indicates that your intervention is required to activate the resource group on another node and stabilize the cluster.
Taking a Concurrent Resource Group Offline
You can use SMIT to take a concurrent resource group offline either on one node or on ALL nodes in the cluster.
When taking a resource group offline in SMIT or through the command line, you can select whether the offline state persists after the cluster is restarted.
To take a concurrent resource group offline:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Bring a Resource Group Offline. The picklist appears. It lists only the resource groups that are online or in the ERROR state on at least one node in the resource group nodelist.
3. Select the concurrent resource group from the picklist and press Enter.
The Select an Online Node picklist appears. It displays a list of nodes that have cluster services running, and that appear in the same order of priority as in the resource group nodelist.
4. Select a destination node from the picklist and press Enter.
5. Enter field values as follows:
6. Press Enter to start the execution of the rg_move event to take the resource group offline. You do not need to synchronize the cluster.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was stopped on the node(s) in the cluster. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If you requested to bring a resource group offline on a particular node, and the resource group fails to release from the node on which it is online, an error message indicates that your intervention is required to stabilize the cluster.
No Automatic Recovery for Resource Groups That Fail to Migrate
If you request HACMP to move a resource group to a node and during this operation the destination node fails to acquire the group, the resource group is put into an ERROR state. If you try to move a resource group that has a dependency (parent/child or location) that prohibits the move, the resource group will be in the DEPENDENCY_ERROR state.
Similarly, if you request HACMP to activate the resource group on a particular node, and this node fails to bring the resource group online, the resource group is put into an ERROR state.
In either case, HACMP does not attempt to acquire or activate the resource group on any other node in the cluster. The error messages in these cases indicate that your intervention is required to move the resource group to another node.
If you request HACMP to migrate a resource group to another node, but the node that owns it fails to release it, or if you request to bring a resource group offline on a particular node, but the node fails to release it, an error message indicates that your intervention is required to stabilize the cluster.
Returning Previously Moved Resource Groups to Their Originally Configured Highest Priority Nodes
This section applies only to resource groups with:
- Fallback to Highest Priority Node fallback policies
- Fallback policies that use timers (resource groups that are configured to fall back according to delayed fallback timers, for instance)
- Fallback to Primary Site site fallback policies (if sites are defined).

If you move such a resource group to a node other than its highest priority node (or to a site other than its Primary site), and the resource group is normally set to fall back to its highest priority node (or Primary site), then after the move it falls back to the “new” node or site, not to the originally set highest priority node or Primary site.
Therefore, you may want to move this group back to the originally configured highest priority node, or Primary site, when it becomes available. You do so using the same SMIT panels as for moving a resource group, selecting from the node list a node marked with an asterisk (which indicates that this node is the originally configured highest priority node). The same is true for sites.
These actions restore the highest priority node or Primary site for resource groups that you previously manually moved to other nodes. From this point on, these groups will continue to fall back to the highest priority node (or to a node at a Primary site).
Migrating Resource Groups from the Command Line
This section provides information about resource group migrations from the command line using the clRGmove command. For information on how to migrate a resource group using SMIT, see the section Migrating Resource Groups Using SMIT. Before performing either method of resource group migration, you should read the preceding overview sections.
Note: You can access the resource group migration functions using the ASCII SMIT, WebSMIT, or the command line interface. This section includes instructions on how to move, bring online, or take offline concurrent and non-concurrent resource groups using the command line.
For more information on ASCII SMIT, see Migrating Resource Groups Using SMIT in this chapter.
For more information on WebSMIT, see Chapter 2: Administering a Cluster Using WebSMIT.
For full information on the clRGmove command and all of its associated flags, see the clRGmove man page in Appendix A: Script Utilities in the Troubleshooting Guide.
The clRGmove utility lets you manually control the location and the state of resource groups by calling the rg_move event. With this command you can bring a specified resource group offline or online, or move a resource group to a different node. This utility provides the command line interface to the Resource Group Migration functionality, which can be accessed through SMIT. You can also use this command from the command line, or include it in the pre- and post-event scripts.
In this section, the phrase “non-concurrent resource group” refers to a resource group with a startup policy that is not Online On All Available Nodes. The phrase “concurrent resource group” refers to a resource group with the startup policy of Online On All Available Nodes.
For a non-concurrent resource group, you can:
- Take the resource group offline from an online node
- Bring the resource group online on a specific node
- Move the resource group from its current hosting node to a new location.

For a concurrent resource group, you can:

- Take the resource group offline from all nodes in the group's nodelist
- Take the resource group offline from one node in the group's nodelist
- Bring the resource group online on all nodes in the group's nodelist
- Bring the resource group online on one node in the group's nodelist.

Example: Using clRGmove to Swap Resource Groups
In the three-node cluster indicated here, each node—Node1, Node2, and Node3—has a service label/IP address and a standby label/IP address. There are three non-concurrent resource groups with the following policies:
- Startup: Online on Home Node Only
- Fallover: Fallover to Next Priority Node in the List
- Fallback: Fallback to Higher Priority Node in the List.

These resource groups have node priority lists as follows:
Each node is up and possesses a resource group as follows:
Node2’s resources—contained in CrucialRG—are of particular importance to your operation. A situation occurs in which two cluster nodes fail. Node1 fails first; its resources fall over to Node3, since Node3 is in RG1’s priority list. Then Node2 fails. In this case, Node2’s crucial resources remain down; they have nowhere to go, since Node3’s only standby label/IP address has already been taken. The cluster now looks like this:
The crucial resource group is unavailable. HACMP is able to take care of only one failure, because there are no more standby label/IP addresses, so it handles the first failure, Node1, but not the second. However, if you need CrucialRG’s resources more than you need RG1’s, you can use the Resource Group Management utility to “swap” the resource groups so you can access CrucialRG instead of RG1.
You do this by issuing the following commands:
clRGmove -g RG1 -n node3 -d
to bring RG1 offline on Node3, and:
clRGmove -g CrucialRG -n node3 -u
to bring CrucialRG online on Node3.
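Run as a sequence, and followed by a status check, the swap might look like this (clRGinfo is described later in this chapter; the node and group names are those of this example):

# Release RG1 on Node3 so its standby resources become free
clRGmove -g RG1 -n node3 -d

# Acquire CrucialRG on Node3
clRGmove -g CrucialRG -n node3 -u

# Confirm the new locations and states of the resource groups
clRGinfo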
For more information, see the reference page for the clRGmove command in Appendix A: Script Utilities in the Troubleshooting Guide.
After these resource group migration commands are completed, access to CrucialRG is restored, and the cluster looks like this:
Special Considerations when Stopping a Resource Group
After taking a resource group offline, you should not assume that a joining or rejoining node will bring that resource group online. The following are instances when a resource group must be brought back online using the Resource Group and Application Management utility.
- If you use clRGmove -d to bring down a resource group that has the Online on Home Node Only startup policy, the Fallover to Next Priority Node in the List fallover policy, and the Fallback to Higher Priority Node in the List fallback policy, and that resides on its highest priority node, it remains in an inactive state. You must manually bring the resource group online through resource group management.
- If you specify the fallover option of application monitoring for a resource group using the Customize Resource Recovery SMIT panel, which may cause resource groups to migrate from their original owner node, the possibility exists that while the highest priority node is up, the resource group remains down. Unless you bring the resource group up manually, it remains in an inactive state.
- If your resource group was placed in an UNMANAGED state because cluster services were stopped without stopping the applications, you may need to bring this resource group online manually. See Chapter 3: Investigating System Components and Solving Common Problems in the Troubleshooting Guide for more information.
Checking Resource Group State
As with regular cluster events, you can debug the status of resource groups using the messages logged by HACMP in the hacmp.out file.
In addition, you can use clRGinfo to view the resource group location and status. See the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster for an example of the command output. Use clRGinfo -p to view the node that is temporarily the highest priority node.
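For example, a quick check from the command line might look like this:

# Show the location and status of all resource groups
clRGinfo

# Also show the node that is temporarily set as the highest priority node
clRGinfo -p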
Migrating Resource Groups with Replicated Resources
Starting with HACMP 5.3, the Resource Group Management utility provides additional support for moving resource groups that contain replicated resources. If you have sites defined in your cluster, you can take a resource group offline, online, or move it to another node in either site in the cluster. You use the same SMIT panels to perform these operations as you use in a cluster without sites.
If sites are defined in the cluster, the resource group can be in one of the following states:
On the Primary Site    On the Secondary Site
ONLINE                 ONLINE SECONDARY
OFFLINE                OFFLINE
ERROR                  ERROR SECONDARY
UNMANAGED              UNMANAGED SECONDARY
Resource group states depend upon the Inter-Site Management Policy defined for a resource group.
The actions that a resource group in the ONLINE or ONLINE SECONDARY state is allowed to perform (such as being acquired on a node, or mounting filesystems) depend on the type of replicated resources you configure in the cluster.
You can use clRGmove to move a resource group online, offline, or to another node within a site, or to a node on the other site. See Migrating Resource Groups from the Command Line for more information.
If you have configured resource group dependencies, these may limit possible user actions. See Migrating Resource Groups with Dependencies for more information.
If you have configured replicated resources in your cluster, you can perform the following resource group migrations:
On the Primary Site:
On the Secondary Site:
Migrating between Sites in a Non-concurrent Node Policy/Non-concurrent Site Policy Configuration
Migrating between Sites in a Concurrent Node Policy/Non-concurrent Site Policy Configuration
For more information on configuring replicated resources in HACMP, see the Planning Guide.
Moving Resource Groups with Dependencies in a Cluster with Sites
The following table shows the various actions you can take to migrate resource groups and the interactions or limitations caused by resource group dependencies in the cluster:
Bringing Resource Groups in a Cluster with Sites Offline
You can take primary or secondary instances of non-concurrent or concurrent resource groups offline. However, if you try to bring the last primary instance of a concurrent or non-concurrent resource group offline on a node, HACMP prevents this to avoid an unnecessary inter-site fallover.
Bringing Resource Groups in a Cluster with Sites Online
The following table shows the various actions you can take to bring resource groups online and the limitations to this action.
Migrating Replicated Resource Groups Using SMIT
Using the SMIT interface is recommended for these migrations.
Moving a Concurrent Replicated Resource Group to Another Site
You can move a concurrent resource group to the other site in the cluster through the HACMP Resource Group Management > Move a Resource Group SMIT path. In effect, you swap the primary and secondary instances of the resource group between the sites.
To move a concurrent resource group to another site:
1. Enter smit cl_admin
2. In SMIT, select HACMP Resource Group and Application Management > Move a Resource Group. A picklist with resource groups appears. The list includes only those resource groups that are online on any node in the cluster and have defined a site policy other than Ignore or Online on Both Sites.
3. Select the resource group from the picklist and press Enter. The Select a Destination Site panel appears showing potential destination sites.
4. SMIT displays the sites on which the Cluster Manager is running, which are included in the resource group’s nodelist, and which can acquire the resource group. Select a destination site and press Enter.
5. Enter field values as follows:
6. Confirm your selections and press Enter. You do not need to synchronize the cluster.
If the event completes successfully, HACMP displays a message and the status and location of the resource group that was successfully moved. For an example of such output, see the section Using the clRGinfo Command in Chapter 10: Monitoring an HACMP Cluster.
If the resource group migration fails, HACMP displays a message with the cause of failure. Be sure to take action in this case to stabilize the cluster, if needed. For more information, see the section No Automatic Recovery for Resource Groups That Fail to Migrate.
Customizing Inter-Site Resource Group Recovery
Starting with HACMP 5.3, inter-site resource group recovery is enabled by default for a new installation.
Selective fallover of resource groups between sites is disabled by default when you upgrade to HACMP 5.3 or 5.4 from a previous release. This is the pre-5.3 release behavior for non-Ignore site management policy. A particular instance of a resource group can fall over within one site, but cannot move between sites. If no nodes are available on the site where the affected instance resides, that instance goes into ERROR or ERROR_SECONDARY state. It does not stay on the node where it failed. This behavior applies to both primary and secondary instances.
Note that even if the Cluster Manager is not enabled to initiate a selective fallover across sites, it will still move the resource group if a node_down or node_up event occurs. You can manually move a resource group between sites.
Enabling or Disabling Selective Fallover between Sites
You can change the resource group recovery policy to allow or disallow the Cluster Manager to move a resource group to another site in cases where it can use selective fallover to avoid having the resource group go into ERROR state.
Inter-Site Recovery of Replicated Resource Groups
If you enable selective fallover across sites, HACMP tries to recover both the primary and the secondary instance of a resource group in these situations:
- If an acquisition failure occurs while the secondary instance of a resource group is being acquired, the Cluster Manager tries to recover the resource group's secondary instance, as it does for the primary instance. If no nodes are available for the acquisition, the resource group's secondary instance goes into the global ERROR_SECONDARY state.
- If quorum loss is triggered, and the resource group has its secondary instance online on the affected node, HACMP tries to recover the secondary instance on another available node.
- If a local_network_down occurs on an XD_data or Geo_primary network, HACMP moves replicated resource groups that are ONLINE on the particular node and that have GLVM or HAGEO resources to another available node on that site. This functionality of the primary instance is mirrored to the secondary instance so that secondary instances may be recovered via selective fallover.

Using SMIT to Enable or Disable Inter-Site Selective Fallover
To enable or disable the Resource Group Recovery with Selective Fallover behavior:
1. In SMIT, select Extended Configuration > Extended Resource Configuration > Customize Resource Group and Resource Recovery > Customize Inter-site Resource Group Recovery and press Enter.
A selector screen lists the resource groups that contain nodes from more than one site (including those with a site management policy of Ignore; these are not affected by this function even if you select one of them).
2. Select the resource groups for recovery customization.
The next screen lists the selected resource groups and includes the field to enable or disable inter-site selective fallover.
3. To enable inter-site selective fallover (initiated by the Cluster Manager), select true. The default is false for a cluster migrated from a previous release, and true for a new HACMP 5.3 or 5.4 cluster.