Table of Contents

Administration Guide

About This Guide

Chapter 1: Administering an HACMP Cluster

Options for Configuring an HACMP Cluster

Configuration Tasks

Configuring HACMP Using the Standard Configuration Path

Configuring HACMP Using the Extended Configuration Path

Configuring Cluster Events

Verifying and Synchronizing the Configuration

Testing the Cluster

Maintaining an HACMP Cluster

Starting and Stopping Cluster Services

Maintaining Shared Logical Volume Manager Components

Managing the Cluster Topology

Managing Cluster Resources

Managing Cluster Resource Groups

Managing Users and Groups in a Cluster

Managing Cluster Security and Inter-Node Communications

Understanding the /usr/es/sbin/cluster/etc/rhosts File

Saving and Restoring HACMP Cluster Configurations

Additional HACMP Maintenance Tasks

Monitoring the Cluster

Troubleshooting an HACMP Cluster

Related Administrative Tasks

Backing Up Your System

Documenting Your System

Maintaining Highly Available Applications

Helping Users

AIX 5L Files Modified by HACMP

/etc/hosts

/etc/inittab

/etc/rc.net

/etc/services

/etc/snmpd.conf

/etc/snmpd.peers

/etc/syslog.conf

/etc/trcfmt

/var/spool/cron/crontabs/root

HACMP Scripts

Startup and Shutdown Scripts

Event Scripts

Chapter 2: Administering a Cluster Using WebSMIT

Working with WebSMIT

Header Frame

Navigation Frame

Activity Frame

Configuring HACMP Using WebSMIT

Common WebSMIT Panel Options

Browser Controls

Functional Limitations

WebSMIT Logs

Configuring and Managing Nodes and Networks in WebSMIT

Configuring and Managing Resources in WebSMIT

Viewing the Cluster Components

Viewing Cluster Configuration Information in WebSMIT

Viewing HACMP Documentation in WebSMIT

Chapter 3: Configuring an HACMP Cluster (Standard)

Overview

Prerequisite Tasks for Using the Standard Path

Assumptions and Defaults for the Standard Path

Steps for Configuring a Cluster Using the Initialization and Standard Configuration Path

Configuring a Two-Node Cluster, or Using Smart Assists

Limitations and Prerequisites

Configuring Applications with the General Configuration Smart Assist

Defining HACMP Cluster Topology (Standard)

Configuring HACMP Resources (Standard)

Configuring Application Servers

Configuring HACMP Service IP Labels/Addresses

Configuring Volume Groups, Logical Volumes, and Filesystems as Cluster Shared Resources

Configuring Concurrent Volume Groups, Logical Volumes, and Filesystems

Configuring HACMP Resource Groups (Standard)

Creating HACMP Resource Groups Using the Standard Path

Configuring Resources in Resource Groups (Standard)

Resource Group Configuration Considerations

Assigning Resources to Resource Groups (Standard)

Verifying and Synchronizing the Standard Configuration

The Cluster Topology Summary

Procedure to Verify and Synchronize the HACMP Configuration

Viewing the HACMP Configuration

Additional Configuration Tasks

Testing Your Configuration

Chapter 4: Configuring HACMP Cluster Topology and Resources (Extended)

Understanding the Extended Configuration Options

Steps for Configuring an HACMP Cluster Using the Extended SMIT Menu

Discovering HACMP-Related Information

Configuring Cluster Topology (Extended)

Configuring an HACMP Cluster

Resetting Cluster Tunables

Configuring HACMP Nodes

Defining HACMP Sites

Configuring HACMP Networks and Heartbeat Paths

Configuring Communication Interfaces/Devices to HACMP

Configuring Heartbeating over Disk

Configuring HACMP Persistent Node IP Labels/Addresses

Configuring Node-Bound Service IP Labels

Configuring HACMP Global Networks

Configuring HACMP Network Modules

Configuring Topology Services and Group Services Logs

Showing HACMP Topology

Configuring HACMP Resources (Extended)

Configuring Service IP Labels as HACMP Resources

Configuring HACMP Application Servers

Configuring Volume Groups, Logical Volumes, and Filesystems as Resources

Configuring Concurrent Volume Groups, Logical Volumes, and Filesystems as Resources

Configuring Multiple Application Monitors

Steps for Configuring Multiple Application Monitors

Configuring Tape Drives as HACMP Resources

Configuring AIX 5L Fast Connect

Configuring Highly Available Communication Links

Configuring SNA-Over-LAN Communication Links

Configuring X.25 Communication Links

Configuring SNA-Over-X.25 Communication Links

Notes on Application Service Scripts for Communication Links

Customizing Resource Recovery

Where You Go From Here

Chapter 5: Configuring HACMP Resource Groups (Extended)

Overview

Configuring Resource Groups

Limitations and Prerequisites for Configuring Resource Groups

Steps for Configuring Resource Groups in SMIT

Dynamic Node Priority Policies

Configuring Resource Group Runtime Policies

Configuring Dependencies between Resource Groups

Considerations for Dependencies between Resource Groups

Steps to Configure Dependencies between Resource Groups

Configuring Resource Groups with Dependencies

Configuring Processing Order for Resource Groups

Configuring Workload Manager

Reconfiguration, Startup, and Shutdown of WLM by HACMP

Configuring a Settling Time for Resource Groups

Defining Delayed Fallback Timers

Assigning a Delayed Fallback Policy to a Resource Group

Using the Node Distribution Startup Policy

Adding Resources and Attributes to Resource Groups Using the Extended Path

Steps for Adding Resources and Attributes to Resource Groups (Extended Path)

Customizing Inter-Site Resource Group Recovery

Enabling or Disabling Selective Fallover between Sites

Reliable NFS Function

Relinquishing Control over NFS Filesystems in an HACMP Cluster

NFS Exporting Filesystems and Directories

Forcing a Varyon of Volume Groups

When HACMP Attempts a Forced Varyon

Avoiding a Partitioned Cluster

Verification Checks for Forced Varyon

Testing Your Configuration

Chapter 6: Configuring Cluster Events

Considerations for Pre- and Post-Event Scripts

Using Shell Environment Variables in Pre- and Post-Event Scripts

event_error Now Indicates Failure on a Remote Node

Parallel Processing of Resource Groups Affects Event Processing

Dependent Resource Groups and the Use of Pre- and Post-Event Scripts

Configuring Pre- and Post-Event Commands

Configuring Pre- and Post-Event Processing

Configuring User-Defined Events

Changing or Showing User-Defined Events

Removing User-Defined Events

Tuning Event Duration Time Until Warning

Prerequisites and Notes

Changing Event Duration Time Until Warning

Configuring a Custom Remote Notification Method

Prerequisites

Defining a Remote Notification Method

Changing or Removing a Custom Remote Notification Method

Chapter 7: Verifying and Synchronizing an HACMP Cluster

Overview

Running Cluster Verification

Automatic Verification and Synchronization

Understanding the HACMP Cluster Verification Process

Cluster Verification during a Dynamic Cluster Reconfiguration Event

Parameters Automatically Corrected

Understanding the Detailed Phases of Verification

Verifying the HACMP Configuration Using SMIT

Verifying and Synchronizing a Cluster Configuration

Verifying and Synchronizing the Cluster Configuration

Running Corrective Actions during Verification

Managing HACMP File Collections

Default HACMP File Collections

Options for Propagating an HACMP File Collection

Using SMIT to Manage HACMP File Collections

Adding a Custom Verification Method

Changing or Showing a Custom Verification Method

Removing a Custom Verification Method

List of Reserved Words

Chapter 8: Testing an HACMP Cluster

Prerequisites

Overview

Automated Testing

Custom Testing

Test Duration

Security

Limitations

Running Automated Tests

Launching the Cluster Test Tool

Modifying Logging and Stopping Processing in the Cluster Test Tool

Understanding Automated Testing

General Topology Tests

Network Tests

Volume Group Tests

Site-Specific Tests

Catastrophic Failure Test

Setting Up Custom Cluster Testing

Planning a Test Procedure

Creating a Custom Test Procedure

Creating a Test Plan

Specifying Parameters for Tests

Using a Variables File

Using Environment Variables

Using the Test Plan

Description of Tests

Test Syntax

Node Tests

Network Tests for an IP Network

Network Interface Tests for IP Networks

Network Tests for a Non-IP Network

Resource Group Tests

Volume Group Tests

Site Tests

General Tests

Example Test Plan

Running Custom Test Procedures

Launching a Custom Test Procedure

Evaluating Results

Criteria for Test Success or Failure

Recovering the Control Node after Cluster Manager Stops

How to Avoid Manual Intervention

Error Logging

Log Files: Overview

Log File Example

The hacmp.out File

Verbose Logging: Overview

Customizing the Types of Information to Collect

Adding Data from hacmp.out to the Cluster Test Tool Log File

Fixing Problems when Running Cluster Tests

Cluster Test Tool Stops Running

Control Node Becomes Unavailable

Cluster Does Not Return to a Stable State

Working with Timer Settings

Testing Does Not Progress as Expected

Unexpected Test Results

Chapter 9: Starting and Stopping Cluster Services

Overview

Starting Cluster Services

A Note on Application Monitors

Procedure for Starting Cluster Services

Modifying the Startup of Cluster Services

Stopping Cluster Services

Procedure for Stopping Cluster Services

Stopping HACMP Cluster Services without Stopping Applications

Abnormal Termination of Cluster Manager Daemon

AIX 5L Shutdown and Cluster Services

Stopping HACMP Cluster Services and RSCT

Maintaining Cluster Information Services

Starting Clinfo on a Client

Stopping Clinfo on a Client

Enabling Clinfo for Asynchronous Event Notification

Gratuitous ARP Support

Chapter 10: Monitoring an HACMP Cluster

Periodically Monitoring an HACMP Cluster

Automatic Cluster Configuration Monitoring

Tools for Monitoring an HACMP Cluster

Monitoring a Cluster with HAView

HAView Installation Requirements

HAView File Modification Considerations

Tivoli NetView Hostname Requirements for HAView

Starting HAView

Viewing Clusters and Components

Obtaining Component Details in HAView

Customizing HAView Polling Intervals

Removing a Cluster from HAView

Using the HAView Cluster Administration Utility

HAView Browsers

Monitoring Clusters with Tivoli Distributed Monitoring

Cluster Monitoring and Cluster Administration Options

Using Tivoli to Monitor the Cluster

Using Tivoli to Perform Cluster Administration Tasks

Uninstalling HACMP-Related Files from Tivoli

Monitoring Clusters with clstat

Viewing clstat with WebSMIT

Viewing clstat in ASCII Display Mode

Viewing clstat in X Window System Display Mode

Viewing clstat with a Web Browser

Monitoring Applications

A Note on Application Monitors

Displaying an Application-Centric Cluster View

Measuring Application Availability

Planning and Configuring for Measuring Application Availability

Configuring and Using the Application Availability Analysis Tool

Reading the clavan.log File

Using Resource Groups Information Commands

Using the clRGinfo Command

Using the cldisp Command

Using HACMP Topology Information Commands

Monitoring Cluster Services

Monitoring Cluster Services on a Node

Monitoring Cluster Services on a Client

HACMP Log Files

Size of /var Filesystem May Need to Be Increased

/tmp/clinfo.debug File

/tmp/clsmuxtrmgr.debug Log File

/tmp/hacmp.out File

/tmp/clstrmgr.debug Log File

/tmp/cspoc.log File

/tmp/emuhacmp.out File

/usr/es/adm/cluster.log File

/usr/es/sbin/cluster/history/cluster.mmddyyyy File

/var/adm/clavan.log File

/var/hacmp/clcomd/clcomd.log File

/var/hacmp/clcomd/clcomddiag.log File

/var/hacmp/clverify/clverify.log File

/var/hacmp/log/clutils.log File

/var/ha/log/grpsvcs.<filename> File

/var/ha/log/topsvcs.<filename> File

/var/ha/log/grpglsm File

Chapter 11: Managing Shared LVM Components

Overview

Common Maintenance Tasks

Understanding C-SPOC

Understanding C-SPOC and Its Relation to Resource Groups

Updating LVM Components in an HACMP Cluster

Lazy Update Processing in an HACMP Cluster

Forcing an Update before Fallover

Maintaining Shared Volume Groups

Enabling Fast Disk Takeover

Understanding Active and Passive Varyon in Enhanced Concurrent Mode

Collecting Information on Current Volume Group Configuration

Importing Shared Volume Groups

Creating a Shared Volume Group with C-SPOC

Setting Characteristics of a Shared Volume Group

Mirroring a Volume Group Using C-SPOC

Unmirroring a Volume Group Using C-SPOC

Synchronizing Volume Group Mirrors

Synchronizing a Shared Volume Group Definition

Maintaining Logical Volumes

Adding a Logical Volume to a Cluster Using C-SPOC

Setting Characteristics of a Shared Logical Volume Using C-SPOC

Changing a Shared Logical Volume

Removing a Logical Volume Using C-SPOC

Synchronizing LVM Mirrors by Logical Volume

Maintaining Shared Filesystems

Journaled Filesystem and Enhanced Journaled Filesystem

Creating Shared Filesystems with C-SPOC

Adding the Filesystem to an HACMP Cluster Logical Volume

Changing a Shared Filesystem in HACMP Using C-SPOC

Removing a Shared Filesystem Using C-SPOC

Maintaining Physical Volumes

Adding a Disk Definition to Cluster Nodes Using C-SPOC

Removing a Disk Definition on Cluster Nodes Using C-SPOC

Using SMIT to Replace a Cluster Disk

Managing Data Path Devices with C-SPOC

Configuring Cross-Site LVM Mirroring

Prerequisites

Steps to Configure Cross-Site LVM Mirroring

Showing and Changing Cross-Site LVM Mirroring Definition

Removing a Disk from a Cross-Site LVM Mirroring Site Definition

Troubleshooting Cross-Site LVM Mirroring

Chapter 12: Managing Shared LVM Components in a Concurrent Access Environment

Overview

Understanding Concurrent Access and HACMP Scripts

Nodes Join the Cluster

Nodes Leave the Cluster

Maintaining Concurrent Access Volume Groups

Activating a Volume Group in Concurrent Access Mode

Determining the Access Mode of a Volume Group

Restarting the Concurrent Access Daemon (clvmd)

Verifying a Concurrent Volume Group

Maintaining Concurrent Volume Groups with C-SPOC

Creating a Concurrent Volume Group on Cluster Nodes Using C-SPOC

Converting Volume Groups to Enhanced Concurrent Mode

Listing All Concurrent Volume Groups in the Cluster

Importing a Concurrent Volume Group with C-SPOC

Extending a Concurrent Volume Group with C-SPOC

Enabling or Disabling Cross-Site LVM Mirroring

Removing a Physical Volume from a Concurrent Volume Group with C-SPOC

Mirroring a Concurrent Volume Group Using C-SPOC

Unmirroring a Concurrent Volume Group Using C-SPOC

Synchronizing Concurrent Volume Group Mirrors

Maintaining Concurrent Logical Volumes

Listing All Concurrent Logical Volumes in the Cluster

Adding a Concurrent Logical Volume to a Cluster

Removing a Concurrent Logical Volume

Setting Characteristics of a Concurrent Logical Volume

Chapter 13: Managing the Cluster Topology

Reconfiguring a Cluster Dynamically

Requirements before Reconfiguring

Viewing the Cluster Topology

Using the cltopinfo Command

Managing Communication Interfaces in HACMP

Configuring Communication Interfaces/Devices to the Operating System on a Node

Updating HACMP Communication Interfaces/Devices with AIX 5L Settings

Swapping IP Addresses between Communication Interfaces Dynamically

Replacing a PCI Hot-Pluggable Network Interface Card

Changing a Cluster Name

Changing the Configuration of Cluster Nodes

Adding a Cluster Node to the HACMP Configuration

Removing a Cluster Node from the HACMP Configuration

Changing the Name of a Cluster Node

Changing the Configuration of an HACMP Network

Adding a Network

Changing Network Attributes

Removing an HACMP Network

Converting an HACMP Network to Use IP Aliasing

Establishing Default and Static Routes on Aliased Networks

Converting an SP Switch Network to an Aliased Network

Disabling IPAT via IP Aliases

Controlling Distribution Preferences for Service IP Label Aliases

Changing the Configuration of Communication Interfaces

Configuring Multiple Logical Interfaces on the Same ATM NIC

Adding HACMP Communication Interfaces/Devices

Removing a Communications Interface from a Cluster Node

Managing Persistent Node IP Labels

Configuring Persistent Node IP Labels/Addresses

Changing Persistent Node IP Labels

Deleting Persistent Node IP Labels

Changing the Configuration of a Global Network

Adding an HACMP Network to a Global Network

Removing an HACMP Network from a Global Network

Changing the Configuration of a Network Module

Understanding Network Module Settings

Resetting the Network Module Tunable Values to Defaults

Behavior of Network Down on Serial Networks

Changing the Failure Detection Rate of a Network Module

Showing a Network Module

Removing a Network Module

Changing an RS232 Network Module Baud Rate

Changing the Configuration of a Site

Removing a Site Definition

Synchronizing the Cluster Configuration

Dynamic Reconfiguration Issues and Synchronization

Releasing a Dynamic Reconfiguration Lock

Chapter 14: Managing the Cluster Resources

Dynamic Reconfiguration: Overview

Reconfiguring a Cluster Dynamically

Requirements before Reconfiguring

Dynamic Cluster Resource Changes

Reconfiguring Application Servers

Changing an Application Server

Removing an Application Server

Changing or Removing Application Monitors

Suspending and Resuming Application Monitoring

Changing the Configuration of an Application Monitor

Removing an Application Monitor

Reconfiguring Service IP Labels as Resources in Resource Groups

Steps for Changing the Service IP Labels/Addresses Definitions

Deleting Service IP Labels

Changing Distribution Preference for Service IP Label Aliases

Viewing Distribution Preference for Service IP Label Aliases

Reconfiguring Communication Links

Changing Communication Adapter Information

Removing a Communication Adapter from HACMP

Changing Communication Link Information

Removing a Communication Link from HACMP

Reconfiguring Tape Drive Resources

Changing a Tape Resource

Removing a Tape Device Resource

Using NFS with HACMP

Reconfiguring Resources in Clusters with Dependent Resource Groups

Reconfiguring Resources and Topology Dynamically

Making Dynamic Changes to Dependent Resource Groups

Cluster Processing During DARE in Clusters with Dependent Resource Groups

Synchronizing Cluster Resources

Chapter 15: Managing Resource Groups in a Cluster

Changes to Resource Groups

Reconfiguring Cluster Resources and Resource Groups

Adding a Resource Group

Removing a Resource Group

Changing Resource Group Processing Order

Resource Group Ordering during DARE

Changing the Configuration of a Resource Group

Changing Resource Group Attributes

Changing a Dynamic Node Priority Policy

Changing a Delayed Fallback Timer Policy

Showing, Changing, or Deleting a Settling Time Policy

Changing a Location Dependency between Resource Groups

Changing a Parent/Child Dependency between Resource Groups

Displaying a Parent/Child Dependency between Resource Groups

Removing a Dependency between Resource Groups

Adding or Removing Individual Resources

Reconfiguring Resources in a Resource Group

Forcing a Varyon of a Volume Group

Resource Group Migration

Requirements before Migrating a Resource Group

Migrating Resource Groups with Dependencies

Migrating Resource Groups Using SMIT

Migrating Resource Groups from the Command Line

Special Considerations when Stopping a Resource Group

Checking Resource Group State

Migrating Resource Groups with Replicated Resources

Customizing Inter-Site Resource Group Recovery

Chapter 16: Managing Users and Groups

Overview

Requirements for Managing User Accounts in an HACMP Cluster

User Account Configuration

Status of C-SPOC Actions

Managing User Accounts across a Cluster

Listing Users on All Cluster Nodes

Adding User Accounts on All Cluster Nodes

Changing Attributes of User Accounts in a Cluster

Removing User Accounts from a Cluster

Managing Password Changes for Users

Prerequisites for Allowing Users to Change Passwords

Allowing Users to Change Their Own Passwords

Configuring the Cluster Password Utility

Configuring Authorization

Changing Passwords for User Accounts

Changing the Password for Your Own User Account

Managing Group Accounts

Listing Groups on All Cluster Nodes

Adding Groups on Cluster Nodes

Changing Characteristics of Groups in a Cluster

Removing Groups from the Cluster

Chapter 17: Managing Cluster Security

Overview

Configuring Cluster Security

Configuring Connection Authentication

Standard Security Mode

Kerberos Security Mode

Setting the HACMP Security Mode

Setting Up Cluster Communications over a VPN

Configuring Message Authentication and Encryption

Prerequisites

Managing Keys

About Configuring Message Authentication and Encryption

Configuring Message Authentication and Encryption Using Automatic Key Distribution

Configuring Message Authentication and Encryption Using Manual Key Distribution

Changing the Security Authentication Mode

Changing a Key

Troubleshooting Message Authentication and Encryption

Chapter 18: Saving and Restoring Cluster Configurations

Overview

Relationship between the OLPW Cluster Definition File and a Cluster Snapshot

Information Saved in a Cluster Snapshot

Format of a Cluster Snapshot

clconvert_snapshot Utility

Defining a Custom Snapshot Method

Changing or Removing a Custom Snapshot Method

Creating (Adding) a Cluster Snapshot

Applying a Cluster Snapshot

Dynamic Changes and Cluster Snapshots

Undoing an Applied Snapshot

Changing a Cluster Snapshot

Removing a Cluster Snapshot

Appendix A: 7x24 Maintenance

Appendix B: Resource Group Behavior during Cluster Events

Appendix C: HACMP for AIX Commands

Appendix D: RSCT: Resource Monitoring and Control Subsystem

Appendix E: Using DLPAR and CUoD in an HACMP Cluster

Notices for HACMP Administration Guide

Index