
Chapter 8: HACMP 5.4: Summary of Changes


This chapter lists all new or enhanced features in HACMP 5.4 and also notes discontinued features.

  • List of New Features
  • Discontinued Features

    List of New Features

    HACMP 5.4 includes the following new or enhanced features. The features are listed under a particular type of enhancement, although some improve both usability and performance, and some also offer improved interoperability with other IBM products.

    New Features That Enhance Ease of Use

    These features make the product easier to use:

  • HACMP Smart Assist Programs Enhancements
  • Better Handling of Stopping and Starting HACMP Cluster Services
  • Resource Group Management (clRGmove) Enhancements
  • Verification Enhancements
  • Improved WebSMIT Application

    HACMP Smart Assist Programs Enhancements

    In HACMP 5.4, the three HACMP Smart Assists are enhanced to include more automatic discovery to help you easily integrate these applications into an HACMP cluster:

  • Smart Assist for WebSphere. Extends an existing HACMP configuration to include monitoring and recovery support for various WebSphere components.
  • Smart Assist for DB2. Extends an existing HACMP configuration to include monitoring and recovery support for DB2 Universal Database (UDB) Enterprise Server Edition.
  • Smart Assist for Oracle. Provides assistance with installing the Oracle® Application Server 10g (9.0.4) (AS10g) Cold Failover Cluster (CFC) solution on the IBM AIX 5L™ (5200) operating system.
  • HACMP 5.4 also includes a General Configuration Smart Assist that helps you quickly configure other applications.

    HACMP 5.4 provides a documented Smart Assist Framework and API that allows users to write a Smart Assist program to integrate their own applications with HACMP.

    HACMP Smart Assist for Oracle

    HACMP Smart Assist for Oracle has been greatly improved. It helps you configure Oracle Application Server, an associated Oracle database, or both in a highly available cluster environment. HACMP 5.4 Smart Assist for Oracle extends and improves upon the high availability solutions available in HACMP 5.3.

    HACMP 5.4 Smart Assist for Oracle helps you configure Oracle components in one of several high availability configurations. You can choose the best high availability configuration for your environment based on Oracle's recommendations, and then use the SMIT screens to implement the configuration in HACMP under the AIX 5L environment.

    HACMP Smart Assist for Oracle also monitors Oracle Application Server and Oracle database instances and processes. To bring the Application Server and database under the control of HACMP, you must install the respective Oracle components beforehand.

    Better Handling of Stopping and Starting HACMP Cluster Services

    In HACMP 5.4, your options for starting, stopping, and restarting cluster services have been streamlined and improved to give you full control over your applications without disrupting them.

    You can:

  • Start and restart HACMP cluster services. When you start cluster services, or restart them after a shutdown, HACMP by default automatically activates the resources according to how you defined them, taking into consideration application dependencies, application start and stop scripts, dynamic attributes and other parameters. That is, HACMP automatically manages (and activates, if needed) resource groups and applications in them.
  • Note that you can also start HACMP cluster services and tell it not to start up any resource groups (and applications) automatically for you. If an application is already running, you no longer need to stop it before starting the cluster services. HACMP relies on the application monitor and application startup scripts to verify whether it needs to start the application for you or if the application is already running (HACMP tries not to start a second instance of the application).
  • Shut down HACMP cluster services. You can tell HACMP to stop cluster services on the node(s) and along with this action to either bring the resources and applications offline, move them to other nodes, or keep them running on the same nodes (but stop managing them for high availability).
  • The Cluster Manager “remembers” the state of the nodes and responds appropriately when users attempt to restart the nodes.

    For detailed information on how to configure application monitors as well as HACMP cluster startup and shutdown options, see the chapter on Starting and Stopping Cluster Services in the Administration Guide.
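
    For example, on a running node you can start and stop cluster services either through SMIT or with the cluster utilities directly. The fast paths below are standard, but treat the invocations as a sketch and check the Administration Guide for the options your release supports:

        # Start cluster services on this node (SMIT fast path)
        smit clstart

        # Stop cluster services (SMIT fast path); the dialog lets you bring
        # resource groups offline, move them, or leave them running unmanaged
        smit clstop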

    Resource Group Management (clRGmove) Enhancements

    In HACMP 5.4, the Resource Group Management utility, clRGmove, has been improved:

  • Improved SMIT interface. Using a more straightforward user interface in SMIT, you can select a resource group and then a node (at either site) to which you want to move it. SMIT also informs you (in the picklists with destination nodes) if it finds that a node with a higher priority exists that can host a group. You can always choose to move the group to that node. (This is useful for groups with Fallback to Highest Priority Node fallback policy. Such groups will fall back to their “new” nodes, once you move them).
  • Proper handling of non-concurrent resource groups with No Fallback resource group policies. In HACMP 5.4, once you move such a resource group, it remains on the destination node until you tell HACMP where to move it again, and does not fall back immediately to the node from which it was moved. Note that it may move to other nodes later if HACMP reacts to some cluster events and has to take action to recover or redistribute resource groups.
  • Proper handling of non-concurrent resource groups with No Fallback site policy. Similarly, HACMP 5.4 now honors the No Fallback policy for sites when you move resource groups between sites. For instance, suppose a resource group contains nodes that belong to different sites and the site policy is No Fallback. If you move the group to a node at another site, it does not fall back to its primary site immediately after you move it. Instead, it remains on the node at the other site until you tell HACMP to move it again, or until HACMP decides to move it upon subsequent cluster events.
  • Clear method to maintain the previously configured behavior for a resource group. Once you move a resource group, this does not change the nodelist that was specified for this resource group before you moved it, or its startup, fallover or fallback policies. It may temporarily change the home node (for groups that have Fallback to Highest Priority Node fallback policy). The node to which you move such a group becomes its temporary home node and it falls back to this node.
  • In HACMP 5.4, when you move a resource group, it either stays on the destination node until you move it again, or HACMP moves it upon subsequent cluster events. This consistent behavior is especially important to notice for those resource groups that have the Fallback to Highest Priority Node fallback policy: Such resource groups fall back to their “new” nodes. Note that this new node may not be the highest priority node available. If HACMP sees this, SMIT clearly indicates to you that in the list of destination nodes there is a higher priority node available to which you can always move the group.
  • Improved status and troubleshooting utilities. You can now use clRGinfo -p to obtain the history of the last cluster event that caused the resource group to move, as shown in the example after this list. (You can use this command only if HACMP is running on the nodes.)
  • No need to set the Priority Override Location (POL) for the node to which a resource group is moved. POL is a setting you had to specify for manually-moved resource groups in releases prior to HACMP 5.4. In HACMP 5.4, you no longer have to set it when moving a resource group to another node.
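
    As a command-line illustration of the above, the sketch below moves a resource group and then queries its state and event history with clRGinfo -p. The clRGmove flags shown (-g for the group, -n for the destination node, -m to request a move) are an assumption based on common usage; verify them against the man page on your system:

        # Move resource group rg1 to node nodeB (flag usage assumed)
        /usr/es/sbin/cluster/utilities/clRGmove -g rg1 -n nodeB -m

        # Show resource group state, including the last cluster event that
        # caused the group to move (requires cluster services to be running)
        /usr/es/sbin/cluster/utilities/clRGinfo -p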

    Verification Enhancements

    The following enhancements have been added to the verification process:

  • Non-IP Network Enhancement. Verification checks that each node can reach each other node in the cluster through non-IP connections. If this is not true, a message is displayed. In addition, the SMIT Configure HACMP Communication Interfaces/Devices panel has been enhanced so that all 16 characters of the PVID field are viewable.
  • Failed component warnings. The final verification report lists any nodes, networks and/or network interfaces that are in the 'failed' state at the time that cluster verification is run. The final verification report also lists other 'failed' components, if accessible from the Cluster Manager, such as applications, resource groups, sites, and application monitors that are in the suspended state. (Resource groups may be listed as in the ‘unmanaged’ state.)
  • Volume group verification checks. Volume group verification checks have been restructured for faster processing.
  • Message format. Messages have been reformatted for consistency and to remove repetitious entries.
  • Invalid netmasks. Verification now checks for valid netmasks.
  • Broadcast addresses. Verification now checks for valid broadcast addresses.
  • Mixed Volume Group state. When an SDD VPATH device volume group is installed, all volume groups and PVIDs must be on the vpath devices, not the hdisks. If PVs appear as hdisks on one node but as vpaths on another, verification reports an error.
  • Persistent labels. Verification now checks whether a “with persistent” collocation or anti-collocation distribution preference is used when no persistent labels have been defined; if so, a warning message is displayed during cluster verification.

    Improved WebSMIT Application

    With HACMP 5.4, WebSMIT expands its cluster management function with these enhancements:

  • New WebSMIT framework for the user interface
  • Graphical representation of:
      • Resource groups and their dependencies
      • Cluster site, network, and node information
  • Ability to simultaneously view the cluster configuration and the cluster status
  • Ability to navigate the running cluster
  • Assisted WebSMIT setup. To configure WebSMIT, you modify a sample post-install script (see the sketch after this list) to:
      • Copy and/or link the WebSMIT HTML and CGI files to the appropriate location
      • Change file permissions as needed
      • Update the Web server's httpd.conf file
      • Update the wsm_smit.conf file with the proper settings, if needed
  • User Authentication utility (optional). Administrators can specify a group of users that have read-only access. Those users can view the cluster configuration and status, and navigate through SMIT stanza screens, but not execute commands or make changes.
  • Support for Mozilla-based browsers (Mozilla 1.7.3 for AIX and Firefox 1.0.6) in addition to Internet Explorer versions 6.0 and higher.
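
    A minimal sketch of such a post-install script appears below. Every path in it is an assumption (the locations depend on where HACMP installed the WebSMIT files and on your Web server's layout), so adjust it for your installation:

        #!/bin/ksh
        # Hypothetical locations; adjust for your installation
        WSM_SRC=/usr/es/sbin/cluster/wsm   # where the WebSMIT files were installed (assumed)
        DOCROOT=/usr/HTTPServer/htdocs     # Web server document root (assumed)
        CGIDIR=/usr/HTTPServer/cgi-bin     # Web server CGI directory (assumed)

        # Copy (or link) the WebSMIT HTML and CGI files into place
        cp -Rp $WSM_SRC/htdocs/* $DOCROOT/
        cp -Rp $WSM_SRC/cgi-bin/* $CGIDIR/

        # Change file permissions as needed (CGI scripts must be executable)
        chmod 755 $CGIDIR/wsm_*

        # httpd.conf and wsm_smit.conf still need the manual edits described
        # above; see the Administration Guide for the required settings
        echo "Remember to update httpd.conf and wsm_smit.conf."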

    Cluster Test Tool

    HACMP 5.4 enhances the Cluster Test Tool to support a more complete set of cluster events, including tests for managing resources, resource groups, and sites. The tool includes tests that move resource groups, fail and rejoin various resources, and provide rudimentary support for stopping and starting entire sites.

    In addition, HACMP 5.4 adds specific test plans for running site tests, non-IP network tests, IP network tests, and volume group tests, and extends the logic in the automated test tool to run these test plans as appropriate based on the cluster configuration. For more information, see the chapter about testing your cluster in the Administration Guide.

    Fast Method for Node Failure Detection

    HACMP 5.4 can use a fast method for node failure detection, which takes considerably less time to detect a node failure in the cluster while still detecting failures reliably. With this method, when a node fails, HACMP 5.4 uses disk heartbeating to place a departing message on the shared disk, so neighboring nodes are aware of the node failure within one heartbeat period.

    Remote nodes that share the disks receive this message and broadcast that the node has failed. Directly broadcasting the node failure event greatly reduces the time it takes for the entire cluster to become aware of the failure compared to waiting for the missed heartbeats, and therefore HACMP can take over critical resources faster.

    Starting with HACMP 5.4, you can reduce the time it takes to detect a node failure by configuring disk heartbeating networks and specifying an FFD_ON parameter for the disk heartbeating network NIM. Once the cluster configuration contains disk heartbeating networks and this parameter is specified in SMIT, HACMP uses the fast method of node failure detection.
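
    To confirm that Topology Services has picked up the disk heartbeating network and its failure detection settings, you can query the RSCT Topology Services subsystem. The command below is a standard RSCT status query; the exact fields in its output vary by release:

        # Display Topology Services status, including the heartbeat rings
        # and their current failure detection parameters
        lssrc -ls topsvcs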

    For more information, see the section Decreasing Node Fallover Time in Chapter 3: Planning Cluster Network Connectivity of the Planning Guide, and a chapter on configuring NIMs in the Administration Guide.

    New Features That Enhance Geographic Distance Capability

    These features add to the capability for distributing the cluster over a geographic distance, for improved availability and disaster recovery.

    The following new functions are supported in clusters with HACMP/XD for GLVM 5.4:

  • Enhanced Concurrent Volume Groups. In addition to non-concurrent volume groups, you can have enhanced concurrent mode volume groups configured with RPVs, so that they can serve as geographically mirrored volume groups. You can include such volume groups into both concurrent and non-concurrent resource groups in an HACMP cluster with GLVM.
  • If you have enhanced concurrent volume groups, you can also configure disk heartbeating over disks that belong to a geographically mirrored volume group and to nodes at the same site. (NOTE: disk heartbeating across sites is not supported; the heartbeating function is already performed in the HACMP cluster through an XD_ip network.) Note, however, that fast disk takeover, another useful function allowed on enhanced concurrent volume groups in base HACMP, is not supported for geographically mirrored volume groups that are also enhanced concurrent.
  • Multiple Data Mirroring Networks. In an HACMP cluster that has sites configured, you can now have up to four XD_data networks used for data mirroring (in previous releases, HACMP/XD for GLVM allowed only one XD_data network). Having more mirroring networks in the cluster increases data availability and mirroring performance. For instance, if one of the mirroring networks fails, the GLVM mirroring can continue over the redundant networks. Also, you have the flexibility to configure several low bandwidth XD_data networks and take advantage of the aggregate network bandwidth (you can also combine high bandwidth networks in the same manner).
  • Plan the data mirroring networks so that they provide similar network latency and bandwidth, since each RPV client communicates with its corresponding RPV server over more than one IP-based network at the same time to ensure load balancing.
  • IP Address Takeover (IPAT) via IP Aliasing is Supported on XD-type Networks. HACMP/XD for GLVM 5.4 supports adding highly available alias service IP labels to XD-type networks. IP Address Takeover via IP Aliases is now the default on XD networks. This allows users to create an alias service IP label on an XD network that can reside on multiple nodes.
  • HACMP 5.4 lets you configure site-specific service IP labels, thus you can create a resource group that activates a given IP address on one site and a different IP address on another site.

    Other Features

    This section lists other features related to installing and operating HACMP.

    Upgrading HACMP 5.4 with HACMP Cluster Services and Applications Running

    With HACMP 5.4, you can upgrade the HACMP software on an individual node using rolling migration while your critical applications and resources continue running on that node, though they will not be highly available during the upgrade.

    Starting Cluster Services While Applications Continue to Run

    With HACMP 5.4, you can allow your applications that run outside of HACMP to continue running during installation of HACMP and when starting HACMP. There is no need to stop, restart or reboot the system or applications.

    Stopping Cluster Services

    With HACMP 5.4 you can stop cluster services using the Unmanage Resource Groups option (prior to HACMP 5.4, this was known as forced down) on a maximum of one node at a time. You then upgrade the node, start cluster services to begin monitoring resource groups and make the running applications highly available, and reintegrate the node into the cluster before upgrading the next node.

    Other options for stopping cluster services include Bring Resource Groups Offline (formerly known as Graceful stop) and Move Resource Groups (formerly known as Graceful with Takeover).

    Prior to HACMP 5.4, after installing HACMP you were required to restart your nodes in order to start all HACMP subsystems and to set some attributes in the scripts. In addition, the reboot process was required to keep the clinfo utility and the RSCT deadman switch (DMS) synchronized. HACMP 5.4 takes care of these issues without requiring you to stop and restart your applications.
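
    A node-by-node upgrade might therefore look like the sketch below. The fileset name and installp flags are typical for AIX but are shown as assumptions; follow the Installation Guide for the exact procedure:

        # On one node at a time:

        # 1. Stop cluster services, leaving the applications running but
        #    unmanaged (choose "Unmanage Resource Groups" in the dialog)
        smit clstop

        # 2. Upgrade the HACMP filesets (install media location assumed)
        installp -acgXd /dev/cd0 cluster.es

        # 3. Restart cluster services so HACMP resumes monitoring and
        #    managing the running applications, then repeat on the next node
        smit clstart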

    For more information about installing HACMP while keeping critical applications running, see the Installation Guide.

    Discontinued Features

    Features discontinued in HACMP 5.2 and 5.3 are listed here.

    Features Discontinued in HACMP 5.2 and Up

    The following features, utilities, or functions are discontinued starting with HACMP 5.2:

  • The cllockd or cllockdES (the Cluster Lock Manager) is no longer supported. During node-by-node migration, it is uninstalled. Installing HACMP 5.2 or 5.3 removes the Lock Manager binaries and definitions. Once a node is upgraded to HACMP 5.2 or 5.3, the Lock Manager state information in SNMP and clinfo shows the Lock Manager as being in the down state on all nodes, regardless of whether or not the Lock Manager is still running on a back-level node.
  • Before upgrading, make sure your applications use their own locking mechanism. Check with your application’s vendor about concurrent access support.

  • Cascading, rotating and predefined concurrent resource groups are not supported. Also, Cascading without Fallback and Inactive Takeover settings are not used. In HACMP 5.2 and 5.3 you can continue using the groups migrated from previous releases. You can also configure these types of groups using the combinations of startup, fallover and fallback policies available for resource groups in HACMP 5.2 and 5.3. For information on how the predefined resource groups and their settings are mapped to the startup, fallover and fallback policies, see the chapter on upgrading to HACMP 5.3 in the Installation Guide.
  • Manual reconfiguration of user-defined events is required. HACMP 5.2 and 5.3 interact with the RSCT Resource Monitoring and Control (RMC) subsystem instead of with the Event Management subsystem. This affects the following utilities:
  • Dynamic Node Priority (DNP)
  • Application Monitoring
  • User-Defined Events (UDE).
  • You must manually reconfigure all user-defined event definitions, with the exception of the user-defined event definitions defined by DB2. The clconvert command only converts a subset of Event Management user-defined event definitions to the corresponding RMC event definitions. For complete information on the mapping of the Event Management resource variables to the RMC resource attributes, see Appendix D: RSCT: Resource Monitoring and Control Subsystem in the Administration Guide.

    Features Discontinued in HACMP 5.3

    The following features, utilities, or functions are discontinued starting with HACMP 5.3:

  • Changes due to the new architecture of Clinfo and Cluster Manager communication:
  • The clsmuxpd daemon is eliminated in HACMP 5.3. Clinfo now only obtains data from SNMP when requested; it no longer obtains the entire cluster configuration at startup.
  • Shared memory is no longer used by Clinfo in HACMP 5.3. Client API requests and responses now flow through the message queue connection with the clinfo daemon.
  • HACMP 5.3 removes the cl_registerwithclsmuxpd() API routine; application monitoring effectively supersedes this function. If, before upgrading to HACMP 5.3, you were using a pre- or post-event script for your application that referenced the cl_registerwithclsmuxpd() API, the script will no longer work as expected. Instead, use the application monitoring function in HACMP, accessible through HACMP SMIT. See Chapter 5: Ensuring Application Availability in this guide for more information.
  • Changes to utilities available from the command line:
  • The cldiag utility. The cldiag utility is no longer supported from the command line; however, it is available from the HACMP SMIT Problem Determination Tools menu in an easier, more robust format. The cldiag command line utility is deprecated in HACMP 5.3.
  • The clverify utility. The clverify utility is no longer supported from the command line; however, it is available from the HACMP SMIT Verification and Synchronization menu in an easier, more robust format. The clverify command line utility is deprecated in HACMP 5.3.

    Where You Go from Here

    For planning an HACMP cluster and installing HACMP, see the Planning and Installation Guides.

    For configuring HACMP cluster components and troubleshooting HACMP clusters, see the Administration and Troubleshooting Guides.

