Appendix E: Using DLPAR and CUoD in an HACMP Cluster
This appendix describes how to configure and use HACMP in a hardware and software configuration that uses Dynamic Logical Partitions (DLPARs) and the Capacity Upgrade on Demand (CUoD) function.
The topics in this appendix include:
- Overview
- HACMP Integration with the CUoD Function
- Planning for CUoD and DLPAR
- Types of CUoD Licenses
- Configuring CUoD in HACMP
- How Application Provisioning Works in HACMP
- Examples of Using DLPAR and CUoD Resources
- Using Pre- and Post-Event Scripts
- Troubleshooting DLPAR and CUoD Operations in HACMP
Overview
The IBM pSeries servers let you configure multiple Logical Partitions (LPARs) on a single physical frame, where each of the LPARs behaves as a standalone pSeries processor. Using this configuration, you can install and run multiple applications on different LPARs that use a single physical hardware component. The applications running on LPARs are completely isolated from each other at the software level. Each LPAR can be optimally tuned for a particular application that runs on it.
In addition, Dynamic Logical Partitioning (DLPAR) allows you to dynamically allocate additional resources (such as memory and CPUs) to each logical partition, if needed, without stopping the application. These additional resources must be physically present on the frame that uses logical partitions.
Capacity Upgrade on Demand (CUoD) is a feature of the DLPAR function that lets you activate preinstalled but inactive (and not yet paid for) processors as resource requirements change.
Related Documentation
The following related publications provide more information:
- DLPAR information: Planning for Partitioned-System Operations, SA38-0626
- CUoD information: IBM eServer pSeries Planning Guide for Capacity Upgrade on Demand
- HMC information: IBM Hardware Management Console for pSeries Installation and Operations Guide, SA38-0590

These guides and other related eServer pSeries documentation are available at:
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/base/
The following whitepapers provide useful information:
Minimizing Downtime by Using HACMP in a Single Frame Environment, http://www.ibm.com/servers/eserver/pseries/software/whitepapers/hacmp_lpar.pdf
Dynamic Logical Partitioning in IBM eServer pSeries, http://www.ibm.com/servers/eserver/pseries/hardware/whitepapers/dlpar.pdf
LPAR, DLPAR, and CUoD Terminology
The appendix uses the terms listed in this section. For more information on the terms, refer to the IBM guides on LPARs and CUoD listed in the Related Documentation section.
Logical Partition (LPAR). The division of a computer's processors, memory, and hardware resources into multiple environments so that each environment can be operated independently with its own operating system and applications. The number of logical partitions that can be created depends on the system. Typically, partitions are used for different purposes, such as database operation, client/server operations, Web server operations, test environments, and production environments. Each partition can communicate with the other partitions as if each partition is a separate machine.
Dynamic Logical Partitioning (DLPAR). A facility in some pSeries processors that provides the ability to logically attach and detach a managed system's resources to and from a logical partition's operating system without rebooting. Features related to DLPAR include:
- Capacity Upgrade on Demand (CUoD), a feature of the pSeries that allows you to activate preinstalled but inactive processors as resource requirements change.
- Dynamic Processor Deallocation, a feature of pSeries servers and some SMP models that lets a processor be taken offline dynamically when an internal threshold of recoverable errors is exceeded. DLPAR allows the inactive processor, if one exists, to be substituted for the processor that is suspected of being defective. This online switch does not impact applications and kernel extensions. This function is not supported by HACMP. See the IBM Planning for Partitioned-System Operations guide.
- Cross-partition workload management, which is particularly important for server consolidation in that it can be used to manage system resources across partitions. This function is not supported by HACMP. See the IBM Planning for Partitioned-System Operations guide.

Capacity Upgrade on Demand (CUoD or COD). A facility in some pSeries processors that lets you acquire, but not pay for, a fully configured system. The additional CPUs and memory, while physically present, are not used until you decide that the additional capacity you need is worth the cost. This provides a fast and easy upgrade in capacity to meet peak or unexpected loads.

Hardware Management Console (HMC). An interface that lets you manage all DLPAR operations on several or all LPARs created on the frame, collect CUoD system profile information, and enter activation codes for CUoD. For integration with HACMP, the HMC must have a TCP/IP connection to the LPAR and a configured IP label through which a connection will be established. The lshmc command displays the HMC configuration (see the example after these definitions).

Managed System. A pSeries frame that is LPAR-capable and that is managed by an HMC.

On/Off Capacity Upgrade on Demand (CUoD). A type of CUoD license that allows temporary activation of processors only. For more information, see Types of CUoD Licenses.

Trial Capacity Upgrade on Demand (CUoD). A type of CUoD license that allows no-charge usage for testing inactive CUoD processors and memory. For more information, see Types of CUoD Licenses.

CUoD Vital Product Data (VPD). A collection of system profile information that describes the hardware configuration and identification numbers. In this document, the term VPD refers to CUoD VPD.

Activation Code (or License Key). A password used to activate inactive (standby) processors or memory in CUoD. Each activation code is uniquely created for a system and requires the system VPD (Vital Product Data) to ensure correctness.

Note: In HACMP SMIT and in this guide, the activation code is also referred to as the license key.
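For example, you can display the HMC configuration from the HMC command line (locally or over ssh). The flags shown below are typical for pSeries HMCs but can differ between releases, so treat this as a sketch and verify against your HMC documentation:

```sh
# Run on the HMC (for example over ssh). Flag behavior may differ between
# HMC releases -- confirm with your HMC documentation before relying on it.
lshmc -v     # vital product data, including the HMC model and serial number
lshmc -n     # network settings, including the HMC IP addresses and hostnames
```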
HACMP Integration with the CUoD Function
This section describes how HACMP integrates with the DLPAR and CUoD facilities.
By integrating with DLPAR and CUoD, HACMP ensures that each node can support the application with reasonable performance at a minimum cost. This way, you can upgrade the capacity of the logical partition in cases when your application requires more resources, without having to pay for idle capacity until you actually need it.
You can configure cluster resources so that the logical partition with minimally allocated resources serves as a standby node, and the application resides on another LPAR node that has more resources than the standby node. This way, you do not use any additional resources that the frames have until the resources are required by the application.
When it is necessary to run the application on the standby node, HACMP ensures that the node has sufficient resources to successfully run the application. The resources can be allocated from two sources:
- The free pool. The DLPAR function provides resources to the standby node by allocating the resources available in the free pool on the frame.
- CUoD provisioned resources. If there are not enough available resources in the free pool that can be allocated through DLPAR, the CUoD function provides additional resources to the standby node, should the application require more memory or CPU.

For information on how to plan CUoD in HACMP, see Planning for CUoD and DLPAR.
For information on how to configure CUoD in HACMP, see Configuring CUoD in HACMP.
A Typical HACMP Cluster to Use with the CUoD Function
You can configure an HACMP cluster within one or more pSeries servers, using two or more logical partitions. You can also configure a cluster on a subset of LPARs within one frame. Or, the cluster can use partitions from two or more frames, where the nodes can be defined as a subset of LPARs from one frame and a subset of LPARs from another frame, all connected to one or more HMCs. The following figure illustrates a typical two-frame configuration:
Terminology for Resource Types and Memory Allocation
The following terms are used in this guide. They help you to distinguish between different types of resources allocation that can occur in an HACMP cluster that uses DLPAR and CUoD functions.
Total amount of resources, or permanent resources. The number of CPUs and the amount of memory that are physically available for use by all LPARs on the frame. This amount includes all permanent (paid for) resources and may also include CUoD resources that have already been paid for.

Free pool. The number of CPUs and the amount of memory that HACMP can dynamically allocate through the HMC to an LPAR, should the LPAR require additional resources. The free pool is the total amount of resources on the frame minus the resources currently used by the LPARs.

Note: The free pool includes resources on a particular frame only. For instance, if a cluster is configured with LPARs that reside on frames A and B, HACMP does not request resources from a pool on frame B for an LPAR that resides on frame A.

CUoD pool. The number of CPUs and the amount of memory that HACMP can allocate using the CUoD license, should the LPAR require additional resources. The CUoD pool depends on the type of CUoD license you have.

Note: The CUoD pool includes resources on a particular frame only.

LPAR minimum amount. The minimum amount (or quantity) of a resource, such as CPU or memory, that an LPAR requires to be brought online or started. The LPAR does not start unless it meets the specified LPAR minimum. When DLPAR operations are performed between LPARs, the amount of resources removed from an LPAR cannot go below this value. This value is set on the HMC and is not modified by HACMP. Use the lshwres command on the HMC to verify this value.

LPAR desired amount. The amount of a resource that an LPAR acquires when it starts, if the resources are available. This value is set on the HMC and is not modified by HACMP. Use the lshwres command on the HMC to verify this value.

LPAR maximum amount. The maximum amount (or quantity) of a resource that an LPAR can acquire. When DLPAR operations are performed, the amount of resources added to an LPAR cannot go above this value. This value is set on the HMC and is not modified by HACMP. Use the lshwres command on the HMC to verify this value (see the example after these definitions).

The following figure illustrates the relationship between the total amount of memory and resources on the frame (server), the free pool, and the resources that could be obtained through CUoD:
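To check the LPAR minimum, desired, and maximum values described above, you can run lshwres on the HMC. The exact resource-type flags vary between HMC releases, so confirm the syntax for your HMC before relying on it:

```sh
# Run on the HMC. Lists per-partition processor and memory settings,
# including the minimum, desired and maximum amounts. Flag names vary
# between HMC releases -- confirm the syntax on your HMC first.
lshwres -r cpu -m <managed-system>     # processor settings for each LPAR
lshwres -r mem -m <managed-system>     # memory settings for each LPAR
```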
Planning for CUoD and DLPAR
If you plan to utilize the DLPAR and CUoD functions in an HACMP cluster, it is assumed that:
- You have already planned and allocated resources to the LPARs through the HMC.
- You are familiar with the types of Capacity on Demand licenses that are available. For more information on licenses, see Configuring CUoD in HACMP in this appendix.

Also refer to the Related Documentation section for a list of IBM guides that help with initial planning of LPARs and CUoD before using CUoD in HACMP.
Software and Hardware Requirements for CUoD and DLPAR
To use the CUoD and DLPAR functions in HACMP, all LPAR nodes in the cluster should have the following installed:
- AIX 5L v.5.2 or v.5.3
- HACMP 5.3 or greater
- RSCT 2.3.3.1
- HMC 3 version 2.6
- HMC build level/firmware 20040113.1 or greater
- OpenSSH 3.4p1 or greater

Planning Requirements
Planning for Capacity on Demand in the HACMP cluster requires performing the following steps for each LPAR that serves as a cluster node:
- Obtain the LPAR resources information and resource group policies information:
  - How much memory and how many CPUs the applications supported by your cluster require when they run on their regular hosting nodes. Under normal running conditions, check how much memory and how many CPUs each application uses to run with optimum performance on the LPAR node on which its resource group normally resides (the home node for the resource group).
  - The startup, fallover and fallback policies of the resource group that contains the application server. Use the clRGinfo command (see the example after this list). This identifies the LPAR node to which the resource group will fall over in case of a failure.
  - How much memory and how many CPUs are allocated to the LPAR node to which the resource group will fall over, should a failure occur. This LPAR node is referred to as a standby node. With these numbers in mind, consider whether the application's performance would be impaired on the standby node if it runs with fewer resources.
  - The existing values for the LPAR minimum, LPAR maximum and LPAR desired amounts (resources and memory). Use the lshwres command on the HMC to check these values for the standby node.
- Estimate the resources the application will require:
  - The minimum amount of resources that must be allocated to the standby node.
  - The desired amount of resources that the standby node would need to obtain through DLPAR or CUoD so that the application can run with performance equivalent to its performance on the home node. In other words, for each standby node that can host the resource group, estimate the desired amount of resources (memory and CPU) that this node requires so that the application runs successfully.

  In both cases, note that the CUoD resources are used in addition to the LPAR resources currently available in the free pool on the frame through dynamic allocation, and only if you explicitly tell HACMP to use them.
These minimum and desired amounts for memory and CPUs are the ones you will specify to HACMP. HACMP verifies that these amounts are contained within the boundaries of LPAR maximums that are set outside of HACMP for each LPAR node.
- Revise existing pre- and post-event scripts that were used to allocate DLPAR resources. If you were using LPAR nodes in your cluster before utilizing the CUoD and HACMP integration function, you may need to revise or rewrite your existing pre- and post-event scripts. For more information on this subject, see Using Pre- and Post-Event Scripts.
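As a starting point for the information-gathering step above, the clRGinfo utility reports where each resource group is currently online and its startup, fallover and fallback policies. A typical invocation (the path shown is the usual HACMP utilities directory):

```sh
# Run on any active cluster node. The -v flag adds the startup, fallover
# and fallback policies for each resource group to the location report.
/usr/es/sbin/cluster/utilities/clRGinfo -v
```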
Types of CUoD Licenses
The following describes the types of Capacity Upgrade on Demand (CUoD) licenses that are available and indicates whether HACMP allows the use of a particular license:

On/Off license
Description: CPU: allows you to start and stop using processors as needs change. Memory: not allowed.
Supported by HACMP: CPU: Yes. Memory: N/A.
Comments: HACMP does not manage licenses. The resources remain allocated to an LPAR until HACMP releases them through a DLPAR operation, or until you release them dynamically outside of HACMP. If the LPAR node goes down outside of HACMP, the CUoD resources are also released. For more information, see Stopping LPAR Nodes.

Trial license
Description: CPU and Memory: the resources are activated for a single period of 30 consecutive days. If your system was ordered with CUoD features and they have not yet been activated, you can turn the features on for a one-time trial period. With the trial capability, you can gauge how much capacity you might need in the future, if you decide to permanently activate the resources you need.
Supported by HACMP: CPU: Yes. Memory: Yes.
Comments: HACMP activates and deactivates trial CUoD resources. Note: Once the resources are deactivated, the trial license is used and cannot be reactivated.
Allocation and Release of Resources when Using Different Licenses
The trial and on/off types of licenses differ in how they allocate and release resources during the initial application startup (when resources are requested), and during the fallover of the resource group containing the application to another LPAR node.
To summarize, the differences are as follows:
When HACMP determines that a particular type of trial CUoD resource is needed, it activates the entire amount of that resource. This is done because only one activation instance of a particular trial CUoD resource is allowed. Note that even though HACMP activates all trial CUoD resources, only what is needed for the application is allocated to the LPAR node. The remaining resources are left in the free pool. For instance, if you have 16 GB of trial CUoD memory available, and request 2 GB, all 16 GB will be put into the free pool, but the application will acquire 2 GB.
When HACMP places the allocated trial CUoD resources back into the free pool, either during an application fallover or when stopping the application server, these resources are deactivated (that is, moved back to the CUoD pool) only if the trial time has expired; otherwise they remain in the free pool.
In detail, the resources are allocated and released as follows:
Startup and Fallover when Using the Trial License
Application startup. If an application server needs to allocate any additional memory or CPU resources from the CUoD pool, then all available memory or CPU resources from the CUoD pool (that is, all resources from the trial license) are allocated to the free pool and then the necessary resources from that amount are allocated to the LPAR node.
Application Fallover. If an application server has allocated any CPU or memory resources from the CUoD pool, upon fallover all CPU or memory resources are released into the free pool. Upon fallover to another LPAR node, the application then acquires only the required memory and CPU resources for the application server to come online. No CPU or memory resources are released back into the CUoD pool until the trial time expires.
Startup and Fallover when Using the On/Off License
Application Startup. If an application server needs to allocate CPU resources from the CUoD pool, only the needed amount of resources is allocated to the free pool and then to the LPAR node, in order for an application server to come online.
Application Fallover. If an application server has allocated CPU resources, upon fallover resources that were initially allocated from the CUoD pool will be released into the CUoD pool. Similarly, resources that were initially allocated from the free pool will be released into the free pool. Upon fallover, the application on the fallover node initiates the allocation again and allocates the required resources from the free and CUoD pools in order for the application server to come online.
Configuring CUoD in HACMP
This section contains the following topics:
- Overview
- Prerequisites
- Steps for Configuring DLPAR and CUoD in HACMP
- Configuring a Communication Path to HMC
- Changing, Showing or Deleting Communication Path to HMC
- Configuring Dynamic LPAR and CUoD Resources for Applications
- Changing Dynamic LPAR and CUoD Resources for Applications
- Deleting Dynamic LPAR and CUoD Resources for Applications
- Changing the DLPAR and CUoD Resources Dynamically
Overview
You can configure HACMP to use DLPAR and CUoD operations so that additional resources are made available to applications when needed, for instance when applications fall over to standby LPAR nodes.
At a high level, to enable the use of dynamically allocated LPARs and CUoD in HACMP:
1. Configure application servers.
For each application server, check its resource group policies and identify to which LPAR nodes the group (and its application) could potentially fall over, should fallover occur in the cluster.
2. For each LPAR node, establish a communication path between the node and one or more Hardware Management Consoles (HMCs). This includes configuring the IP addresses HACMP will use to communicate with the HMC(s), and the Managed System name (see the example following these steps).
If no connection to the HMC is configured for this node, HACMP assumes that this node is not DLPAR-capable. This allows you to have a cluster configuration with non-LPAR backup nodes (along with LPAR nodes).
3. Confirm that you want to use CUoD if the amount of DLPAR resources in the free pool is not sufficient. (Note that this requires accepting the license first and may result in additional charges). By default, this option in SMIT is set to No.
4. Configure the minimum and the desired amounts of CPU and memory that you would like to provision for your application server with the use of DLPAR function and CUoD. This task is often referred to as application provisioning.
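If you are unsure of the Managed System name referenced in step 2, it can be listed on the HMC. The command below is a common way to do this, but the syntax varies by HMC release, so verify it against your HMC documentation:

```sh
# Run on the HMC. Lists the managed systems (frames) this HMC controls;
# the name reported is what you enter as the Managed System name in SMIT.
lssyscfg -r sys -F name
```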
Prerequisites
Prior to using the Capacity Upgrade on Demand (CUoD) function in HACMP do the following:
Check software and hardware levels. Verify that the system is configured to use the required software and hardware for the DLPAR/CUoD integration. For information, see Software and Hardware Requirements for CUoD and DLPAR.
Check the LPAR node name. The node name on the HMC must be the same as the AIX 5L hostname. HACMP uses the hostname to pass DLPAR commands to the HMC.
Check what DLPAR resources you have available and what CUoD licenses you have access to. HACMP cannot determine in advance whether the required resources will be available in all circumstances: HACMP has no control over whether you actually made the resources physically available on the pSeries frame and whether they currently remain unallocated and free. In addition, HACMP provides dynamic allocation only for CPU and memory resources; HACMP does not allow dynamic changes of I/O slots.
Enter the license key (activation code) for CUoD. Obtain and enter the license key (also called the activation code) on the Hardware Management Console (HMC). Note that this may result in extra charges due to the usage of the CUoD license.
For information on the activation code, see the IBM’s Planning Guide for Capacity Upgrade on Demand.
Establish secure connections to the HMC. Since HACMP has to securely communicate with LPAR nodes through HMCs, you must correctly install and configure SSH to allow HACMP’s access to the HMC without forcing a username and password to be entered each time.
In the HMC’s System Configuration window, select the option Enable remote command execution using the SSH facility. Install SSH on AIX 5L and generate the necessary public and private keys. HACMP 5.2 and up always uses the root user on the cluster nodes to issue SSH commands to the HMC. On the HMC system, the commands run as the hscroot user.
For further information on configuring SSH and remote execution, refer to the Hardware Management Console for pSeries Installations and Operations Guide.
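The following is a minimal sketch of that SSH setup, assuming an OpenSSH client on the LPAR node and the hscroot user on the HMC; adapt the key-installation step to your HMC release as described in the guide above:

```sh
# On each LPAR node, as root (HACMP issues its ssh commands to the HMC as root):
ssh-keygen -t rsa                       # generate a key pair; use an empty passphrase

# Install the resulting public key (~/.ssh/id_rsa.pub) for the hscroot user on
# the HMC; the exact procedure depends on the HMC release (see the guide above).

# Verify that the HMC answers without prompting for a password:
ssh hscroot@<hmc-ip-label> lshmc -V     # should report the HMC version and return

# Also verify that the AIX hostname matches the LPAR name known to the HMC:
hostname
```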
Steps for Configuring DLPAR and CUoD in HACMP
To configure HACMP for DLPAR and CUoD resources:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP for Dynamic LPAR and CUoD Resources and press Enter.
The Configure HACMP for Dynamic LPAR and CUoD Resources screen appears.
3. Select one of the two options:
4. Press Enter.
Depending on which option you selected, HACMP prompts you to either configure a communication path first, or to configure resource requirements.
Configuring a Communication Path to HMC
To establish a communication path between an LPAR node and an HMC, for each LPAR node:
1. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP for Dynamic LPAR and CUoD Resources > Configure Communication Path to HMC > Add HMC IP Address for a Node and press Enter.
2. Enter field values as follows:
3. Press Enter.
HACMP verifies that the HMC is reachable and establishes a communication path to the HMC for the node.
Note that in some LPAR configurations outside of HACMP, LPARs could use two or more HMCs. HACMP uses only those HMCs to which the communication paths are configured. You may want to configure HACMP paths for all HMCs in the system, or only for a subset of HMCs. If HACMP successfully establishes a connection to an HMC and sends a command to it but the command fails, HACMP does not attempt to run this command on other HMCs for which IP labels are configured in HACMP.
Changing, Showing or Deleting Communication Path to HMC
Use the parent panel Configure Communication Path to HMC to change, show, or delete communication paths. Before proceeding, HACMP prompts you to select the LPAR node for which you want to change, show or delete the communication path.
Configuring Dynamic LPAR and CUoD Resources for Applications
To configure dynamic LPAR and CUoD resources, for each application server that could use DLPAR-allocated or CUoD resources:
1. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP for Dynamic LPAR and CUoD Resources > Configure Dynamic LPAR and CUoD Resources for Applications > Add Dynamic LPAR and CUoD Resources for Applications and press Enter.
2. Select an application server from the list and press Enter.
3. Enter field values as follows:
4. Press Enter.
When the application requires additional resources to be allocated on this node, HACMP calculates whether requesting only the DLPAR resources from the free pool on the frame satisfies the requirement, or whether CUoD resources are also needed for the application server. HACMP then requests the desired amounts of memory and numbers of CPUs, if you selected to use them.
During verification, HACMP ensures that the entered values are below LPAR maximum values for memory and CPU. Otherwise HACMP issues an error, stating these requirements.
HACMP also verifies that the total of required resources for ALL application servers that can run concurrently on the LPAR is less than the LPAR maximum. If this requirement is not met, HACMP issues a warning. Note that this scenario can happen upon subsequent fallovers. That is, if the LPAR node is already hosting application servers that require DLPAR and CUoD resources, then upon acquiring yet another application server, it is possible that the LPAR cannot acquire any additional resources beyond its LPAR maximum. HACMP verifies this case and issues a warning.
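The following shell fragment is an illustrative sketch of this decision, not HACMP code; the variable names and values are assumptions used only to show the order in which the free pool and the CUoD pool are considered:

```sh
# Illustrative sketch only -- not part of HACMP. "needed" is the number of
# CPUs still required to reach the application minimum on this LPAR node.
free_pool=1          # CPUs currently unallocated on the frame (assumed value)
cuod_allowed=yes     # the "use CUoD" answer configured for this application server
needed=2             # additional CPUs required (assumed value)

if [ "$needed" -le "$free_pool" ]; then
    echo "Satisfy the request from the free pool through DLPAR only."
elif [ "$cuod_allowed" = "yes" ]; then
    echo "Take $free_pool CPUs from the free pool and request the rest from CUoD."
else
    echo "Requirement cannot be met; resource group recovery actions are started."
fi
```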
Changing Dynamic LPAR and CUoD Resources for Applications
To change or show dynamic LPAR and CUoD resources:
1. In SMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP for Dynamic LPAR and CUoD Resources > Configure Dynamic LPAR and CUoD Resources for Applications > Change/Show Dynamic LPAR and CUoD Resources for Applications and press Enter.
2. Select an application server from the list and press Enter.
The screen displays the previously entered values for the application server’s minimum and desired amounts. Note that each time you request CUoD resources, you must select Yes in the appropriate fields to allow HACMP to proceed with using CUoD.
3. Change field values as follows:
4. Press Enter.
When the application requires additional resources to be allocated on this node, HACMP calculates whether requesting only the DLPAR resources from the free pool on the frame satisfies the requirement, or whether CUoD resources are also needed for the application server. HACMP then requests the desired amounts of memory and numbers of CPUs, if you selected to use them.
Deleting Dynamic LPAR and CUoD Resources for Applications
Use the parent panel Configure Dynamic LPAR and CUoD Resources for Applications to remove the resource requirements for application servers. The Remove Dynamic LPAR and CUoD Resources for Applications screen prompts you to select the application server, and lets you remove the application resource provisioning information.
If you delete the application server, HACMP also deletes the application provisioning information for it.
Changing the DLPAR and CUoD Resources Dynamically
You can change the DLPAR and CUoD resource requirements for application servers without stopping the cluster services. Synchronize the cluster after making the changes.
The new configuration is not reflected until the next event that causes the application (hence the resource group) to be released and reacquired on another node. In other words, a change in the resource requirements for CPUs, memory or both does not cause the recalculation of the DLPAR resources. HACMP does not stop and restart application servers solely for the purpose of making the application provisioning changes.
If another dynamic reconfiguration change causes the resource groups to be released and reacquired, the new resource requirements for DLPAR and CUoD are used at the end of this dynamic reconfiguration event.
How Application Provisioning Works in HACMP
This section describes the flow of actions in the HACMP cluster, if the application provisioning function through DLPAR and CUoD is configured. It also includes several examples that illustrate how resources are allocated, depending on different resource requirements.
In addition, the section provides some recommendations on using pre-and post-scripts.
Overview
When you configure an LPAR on the HMC (outside of HACMP), you provide LPAR minimum and LPAR maximum values for the number of CPUs and the amount of memory. You can obtain these values by running the lshwres command on the HMC. The stated minimum values of the resources must be available when an LPAR node starts. If more resources are available in the free pool on the frame, an LPAR can allocate up to the stated desired values. During dynamic allocation operations, the system does not allow the values for CPU and memory to go below the minimum or above the maximum amounts specified for the LPAR.
HACMP obtains the LPAR minimum and LPAR maximum amounts and uses them to allocate and release CPU and memory when application servers are started and stopped on the LPAR node.
HACMP requests the DLPAR resource allocation on the HMC before the application servers are started, and releases the resources after the application servers are stopped. The Cluster Manager waits for the completion of these events before continuing the event processing in the cluster.
HACMP handles the resource allocation and release for application servers serially, regardless of whether the resource groups are processed in parallel. This minimizes conflicts between application servers trying to allocate or release the same CPU or memory resources. Even so, you must carefully configure the cluster to properly handle all CPU and memory requests on an LPAR.
These considerations are important:
- Once HACMP has acquired additional resources for the application server, when the application server moves again to another node, HACMP releases only those resources that are no longer necessary to support this application on the node.
- HACMP does not start and stop LPAR nodes.

Acquiring DLPAR and CUoD Resources
If you configure an application server that requires a minimum and a desired amount of resources (CPU or memory), HACMP determines if additional resources need to be allocated for the node and allocates them if possible.
In general, HACMP tries to allocate as many resources as possible to meet the desired amount for the application, and uses CUoD, if allowed, to do this.
The LPAR Node has the LPAR Minimum
If the node owns only the minimum amount of resources, HACMP requests additional resources through DLPAR and CUoD.
In general, HACMP starts counting the extra resources required for the application from the minimum amount. That is, the minimum resources are retained for the node’s overhead operations, and are not utilized to host an application.
The LPAR Node has Enough Resources to Host an Application
The LPAR node that is about to host an application may already contain enough resources (in addition to the LPAR minimum) to meet the desired amount of resources for this application.
In this case, HACMP does not allocate any additional resources and the application can be successfully started on the LPAR node. HACMP also calculates that the node has enough resources for this application in addition to hosting all other application servers that may be currently running on the node.
Resources Requested from the Free Pool and from the CUoD Pool
If the amount of resources in the free pool is insufficient to satisfy the total amount requested for allocation (minimum requirements for one or more applications), HACMP requests resources from CUoD.
If HACMP meets the requirement for a minimum amount of resources for the application server, application server processing continues. Application server processing continues even if the total desired resources (for one or more applications) have not been met or are only partially met. In general, HACMP attempts to acquire up to the desired amount of resources requested for an application.
If the amount of resources is insufficient to host an application, HACMP starts resource group recovery actions to move the resource group to another node.
The Minimum Amount Requested for an Application Cannot be Satisfied
In some cases, even after HACMP requests to use resources from the CUoD pool, the amount of resources it can allocate is less than the minimum amount specified for an application.
If the amount of resources is still insufficient to host an application, HACMP starts resource group recovery actions to move the resource group to another node.
The LPAR node is Hosting Application Servers
In all cases, HACMP checks whether the node is already hosting application servers that require application provisioning, and verifies that the LPAR maximum for the node is not exceeded:

- Upon subsequent fallovers, HACMP checks whether the minimum amount of requested resources for yet another application server, plus the amount of resources already allocated to applications residing on the node, exceeds the LPAR maximum. In this case, HACMP attempts resource group recovery actions to move the resource group to another LPAR.
- Note that when you configure the DLPAR and CUoD requirements for this application server, then during cluster verification, HACMP warns you if the total amount of resources requested for all applications exceeds the LPAR maximum.

Allocation of Resources in a Cluster With Multiple Applications
If you have multiple applications in different resource groups in the cluster with LPAR nodes, and more than one application is configured to potentially request additional resources through the DLPAR and CUoD function, the resource allocation in the cluster becomes more complex.
Based on the resource group processing order, some resource groups (hence the applications) might not be started. See Example 2: Failure to Allocate CPUs due to Resource Group Processing Order.
In general, to better understand how HACMP allocates resources in different scenarios, see Examples of using DLPAR and CUoD Resources.
Releasing DLPAR and CUoD Resources
When the application server is stopped on the LPAR node (the resource group moves to another node), HACMP releases only those resources that are no longer necessary to support this application server on the node. The resources are released to the free pool on the frame.
HACMP first releases the DLPAR or CUoD resources it acquired last. This implies that the CUoD resources may not always be released before the dynamic LPAR resources are released.
The free pool is limited to the single frame only. That is, for clusters configured on two frames, HACMP does not request resources from the second frame for an LPAR node residing on the first frame.
Also, if LPAR 1 releases an application and thereby puts some DLPAR resources into the free pool, LPAR 2, which is using CUoD resources, does not attempt to release its CUoD resources and acquire the freed DLPAR resources.
Stopping LPAR Nodes
When the Cluster Manager is forced down on an LPAR node, and that LPAR is subsequently shutdown (outside of HACMP), the CPU and memory resources are released (not by HACMP) and become available for other resource groups running on other LPARs. HACMP does not track CPU and memory resources that were allocated to the LPAR and does not retain them for use when the LPAR node rejoins the cluster.
Note: If you are using the On/Off license for CUoD resources, and the LPAR node is shut down (outside of HACMP), the CUoD resources are released (not by HACMP) to the free pool, but the On/Off license continues to be turned on. You may need to manually turn off the license for the CUoD resources that are now in the free pool. (This ensures that you do not pay for resources that are not currently being used.)
If the LPAR is not stopped after the Cluster Manager is forced down on the node, the CPU and memory resources remain allocated to the LPAR for use when the LPAR rejoins the cluster.
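If your HMC release provides a Capacity on Demand command line, you can check whether the On/Off capacity mentioned in the note above is still active before turning it off. The command and flags below are an assumption to verify against your HMC documentation; older HMCs may expose this information only through the GUI:

```sh
# Run on the HMC. Hypothetical example of checking On/Off CoD processor
# capacity -- verify the command name and flags for your HMC release.
lscod -m <managed-system> -t cap -r proc -c onoff
```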
Examples of using DLPAR and CUoD Resources
The following examples show CPU allocation and release. (The memory allocation process is similar.)
It is important to remember that once HACMP acquires additional resources for an application server, when the server moves to another node it takes its resource requirements with it; that is, the LPAR node releases all the additional resources it acquired and remains with just the minimum.
The configuration is an 8-CPU frame with a two-node cluster (each node is an LPAR). There are 2 CPUs available in the CUoD pool, that is, available through CUoD activations. The nodes have the following characteristics:
The following application servers are defined in separate resource groups:
Application server name   CPU Desired   CPU Minimum   Allow to Use CUoD?
AS1                       1             1             Yes
AS2                       2             2             No
AS3                       4             4             No
Example 1: No CPUs Are Allocated at Application Server Start, some CPUs are Released at Server Stop
Current configuration settings:
- Node1 has 3 CPUs allocated.
- Node2 has 1 CPU allocated.
- The free pool has 4 CPUs.

HACMP starts application servers as follows:

- Node1 starts AS2; no CPUs are allocated because the requirement of 3 CPUs is already met. (3 CPUs is the sum of Node1's LPAR minimum of 1 plus AS2's desired amount of 2.)
- Node1 stops AS2; 2 CPUs are released, leaving 1 CPU, the minimum requirement. (Since no other application servers are running, the only requirement is Node1's LPAR minimum of 1.)

Example 2: Failure to Allocate CPUs due to Resource Group Processing Order
Current configuration settings:
- Node1 has 3 CPUs allocated.
- Node2 has 1 CPU allocated.
- The free pool has 4 CPUs.

HACMP starts application servers as follows:

- Node1 starts AS1; no CPUs are allocated since the requirement of 2 is already met.
- Node1 starts AS3; 3 CPUs are allocated to meet the requirement of 6. There is now 1 CPU in the free pool.
- Node1 attempts to start AS2. After Node1 has acquired AS1 and AS3, the total number of CPUs Node1 must own to satisfy these requirements is 6, which is the sum of Node1's LPAR minimum of 1 plus AS1's desired amount of 1 plus AS3's desired amount of 4. Since AS2's minimum amount is 2, in order to acquire AS2, Node1 needs to allocate 2 more CPUs. But there is only 1 CPU left in the free pool, which does not meet the minimum requirement of 2 CPUs for AS2, and CUoD use is not allowed for AS2. The resource group with AS2 goes into error state.
Example 3: Successful CUoD Resources Allocation and Release
Current configuration settings:
- Node1 has 3 CPUs allocated.
- Node2 has 1 CPU allocated.
- The free pool has 4 CPUs.

HACMP starts application servers as follows:

- Node1 starts AS3; 2 CPUs are allocated to meet the requirement of 5.
- Node1 starts AS2; 2 CPUs are allocated to meet the requirement of 7. There are now no CPUs in the free pool.
- Node1 starts AS1; 1 CPU is taken from CUoD and allocated to meet the requirement of 8.
- Node1 stops AS3; 4 CPUs are released and 1 of those CPUs is put back into the CUoD pool.

Example 4: Resource Group Failure (the Minimum for the Server is not Met, but the LPAR Maximum for the Node is Reached)
Current configuration settings:
- Node1 has 1 CPU allocated.
- Node2 has 1 CPU allocated.
- The free pool has 6 CPUs.

HACMP starts application servers as follows:

- Node2 starts AS3; 4 CPUs are allocated to meet the requirement of 5. There are now 2 CPUs in the free pool.
- Node2 attempts to start AS2, but AS2 goes into error state because the LPAR maximum for Node2 is 5 and Node2 cannot acquire more CPUs.

Example 5: Resource Group Fallover
Current configuration settings:
- Node1 has 3 CPUs allocated.
- Node2 has 1 CPU allocated.
- The free pool has 4 CPUs.

HACMP starts application servers as follows:

- Node1 starts AS2; no CPUs are allocated because the requirement of 3 is already met.
- The resource group with AS2 falls over from Node1 to Node2.
- Node1 stops AS2; 2 CPUs are released, leaving 1 CPU on the LPAR, the minimum requirement for the node.
- Node2 starts AS2; 2 CPUs are allocated to meet the requirement of 3.

Using Pre- and Post-Event Scripts
The existing pre- and post-event scripts that you were using in a cluster with LPARs (before using the CUoD integration with HACMP) may need to be modified or rewritten, if you plan to configure CUoD and DLPAR requirements in HACMP.
Keep in mind the following:
- HACMP performs all DLPAR operations before the application servers are started, and after they are stopped. You may need to rewrite the scripts to account for this.
- Since HACMP takes care of the resource calculations and requests additional resources through DLPAR operations and, if allowed, from CUoD, you can remove the portions of your scripts that do this.
- HACMP only takes into consideration the free pool on a single frame. If your cluster is configured within one frame, then modifying the scripts as stated above is sufficient. However, if a cluster is configured with LPAR nodes residing on two frames, you may still require the portions of the existing pre- and post-event scripts that deal with dynamically allocating resources from the free pool on one frame to a node on another frame, should the application require those resources.
Troubleshooting DLPAR and CUoD Operations in HACMP
To troubleshoot the DLPAR operations in your cluster, use the event summaries in the hacmp.out file and syslog.
HACMP logs the following information:
- All resource allocation attempts, along with the outcome, in the event summaries. If CUoD resources are used, a separate log entry indicates this in the event summary in hacmp.out and in syslog.
- Information about the released resources.
- Each time you use CUoD, remember to select Yes in the appropriate SMIT fields for application provisioning configuration. HACMP logs your replies in syslog.
- HACMP processing may wait until the DLPAR operations are completed on a particular node. This also affects the time it takes HACMP to start a resource group. Use the event summaries to track processing.
Use the following commands on the LPAR node or on the HMC:
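For example, the checks below show the kind of information to compare on each side. The AIX commands are standard; the lshwres flags vary by HMC release, so confirm them on your HMC:

```sh
# On the LPAR node (AIX) -- what the partition currently owns:
lsdev -Cc processor              # processors visible to this partition
lsattr -El sys0 -a realmem       # memory configured in the partition, in KB

# On the HMC -- what the managed system and its partitions own
# (flag names vary between HMC releases; confirm before use):
lshwres -r cpu -m <managed-system>
lshwres -r mem -m <managed-system>
```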