
Chapter 18: Saving and Restoring Cluster Configurations


This chapter explains how to use the cluster snapshot utility to save and restore cluster configurations. The following sections explain the utility:

  • Overview
  • Defining a Custom Snapshot Method
  • Changing or Removing a Custom Snapshot Method
  • Creating (Adding) a Cluster Snapshot
  • Applying a Cluster Snapshot
  • Changing a Cluster Snapshot
  • Removing a Cluster Snapshot

    Overview

    The cluster snapshot utility allows you to save to a file a record of all the data that defines a particular cluster configuration. This facility gives you the ability to recreate a particular cluster configuration—a process called applying a snapshot—provided the cluster is configured with the requisite hardware and software to support the configuration.

    In addition, a snapshot can provide useful information for troubleshooting cluster problems. Because the snapshots are simple ASCII files that can be sent via e-mail, they can make remote problem determination easier.

    Note: You cannot use the cluster snapshot facility in a cluster concurrently running different versions of HACMP.

    By default, HACMP does not collect cluster log files when you create the cluster snapshot. Cluster snapshots are used for recording the cluster configuration information, whereas cluster logs only record the operation of the cluster and not the configuration information. Skipping the log collection reduces the size of the snapshot and speeds up running the snapshot utility. The size of the cluster snapshot depends on the configuration. For instance, a basic two-node configuration requires roughly 40KB.

    Note: If you need logs for problem reporting, you can use SMIT to change the default so that cluster log files are collected. This option is available under the SMIT menu Problem Determination Tools > HACMP Log Viewing and Management. Use this option only if IBM support personnel request logs.

    You can also add your own custom snapshot methods to store additional user-specified cluster and system information in your snapshots. The output from these user-defined custom methods is reported along with the conventional snapshot information.

    Relationship between the OLPW Cluster Definition File and a Cluster Snapshot

    There is more than one way of capturing the cluster configuration:

  • The cluster snapshot
  • The cluster definition file that you create and edit using the Online Planning Worksheets application.

    This section clarifies the relationship between these two utilities and helps you decide when to use each one.

    The Online Planning Worksheets (OLPW) application allows you to save your cluster configuration data, as does the cluster snapshot. In addition, OLPW allows you to edit your configuration data, reading data exported from your current HACMP configuration or data exported from a converted snapshot file. However, the set of data saved from OLPW in the cluster definition file is less comprehensive than a cluster snapshot. For example, OLPW does not contain all ODM information.

    For information about using the Online Planning Worksheets application, see Chapter 9: Using Online Planning Worksheets in the Planning Guide.

    To help you decide whether you should use a cluster snapshot or a cluster definition file, see the following table, which lists various scenarios:

    Use this type of configuration file:    When you are:
    Cluster Snapshot                        • Upgrading HACMP to the current version
                                            • Troubleshooting HACMP configuration problems
    OLPW Cluster Definition File            • Planning your HACMP cluster configuration
                                            • Viewing your HACMP cluster configuration in an easy-to-read format
                                            • Editing your HACMP cluster configuration information
    Either type                             • Capturing cluster and node configuration information to record the state of your cluster

    Information Saved in a Cluster Snapshot

    The primary information saved in a cluster snapshot is the data stored in the HACMP Configuration Database classes (such as HACMPcluster, HACMPnode, HACMPnetwork, HACMPdaemons). This information is used to recreate the cluster configuration when a cluster snapshot is applied to nodes installed with HACMP.

    The cluster snapshot does not save any user-customized scripts, applications, or other non-HACMP configuration parameters. For example, the names of application servers and the locations of their start and stop scripts are stored in the HACMPserver Configuration Database object class. However, the scripts themselves as well as any applications they may call are not saved.

    The cluster snapshot also does not save any device- or configuration-specific data that is outside the scope of HACMP. For instance, the facility saves the names of shared filesystems and volume groups; however, other details, such as NFS options or LVM mirroring configuration are not saved.

    If you moved resource groups using the Resource Group Management utility clRGmove, once you apply a snapshot, the resource groups return to behaviors specified by their default nodelists.

    To investigate a cluster after a snapshot has been applied, run clRGinfo to view the locations and states of resource groups.

    Note: In HACMP 5.2 and up, you can reset cluster tunable values using the SMIT interface. HACMP creates a cluster snapshot before resetting the values. After the values have been reset to their defaults, you can apply that snapshot to return to your customized cluster settings, if needed. For more information, see the Resetting HACMP Tunable Values section in Chapter 1: Troubleshooting HACMP Clusters in the Troubleshooting Guide.

    Format of a Cluster Snapshot

    The cluster snapshot utility stores the data it saves in two separate files created in the directory /usr/es/sbin/cluster/snapshots:

    ODM Data File (.odm)
    This file contains all the data stored in the HACMP Configuration Database object classes for the cluster. This file is given a user-defined basename with the .odm file extension. Because the Configuration Database information is largely the same on every cluster node, the cluster snapshot saves the values from only one node.
    Cluster State Information File (.info)
    This file contains the output from standard AIX 5L and HACMP system management commands. This file is given the same user-defined basename with the .info file extension. Output from custom snapshot methods is appended to this file.

    Cluster Snapshot ODM Data File

    The cluster snapshot Configuration Database data file is an ASCII text file divided into three delimited sections:

    Version section
    This section identifies the version of the cluster snapshot. The characters <VER identify the start of this section; the characters </VER identify the end of this section. The version number is set by the cluster snapshot software.
    Description section
    This section contains user-defined text that describes the cluster snapshot. You can specify up to 255 characters of descriptive text. The characters <DSC identify the start of this section; the characters </DSC identify the end of this section.
    ODM data section
    This section contains the HACMP Configuration Database object classes in generic AIX 5L ODM stanza format. The characters <ODM identify the start of this section; the characters </ODM identify the end of this section.

    The following is an excerpt from a sample cluster snapshot Configuration Database data file showing some of the ODM stanzas that are saved.

    <VER
    1.0
    </VER
    <DSC
    My Cluster Snapshot
    </DSC
    <ODM
    HACMPcluster:
        id = 97531
        name = "Breeze1"
        nodename = "mynode"
        sec_level = "Standard"
        last_node_ids = "2,3"
        highest_node_id = 3
        last_network_ids = "3,6"
        highest_network_id = 6
        last_site_ids = " "
        highest_site_id = 0
        handle = 3
        cluster_version = 5
        reserved1 = 0
        reserved2 = 0
        wlm_subdir = " "
    HACMPnode:
        name = "mynode"
        object = "VERBOSE_LOGGING"
        value = "high"
    .
    .
    .
    </ODM
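
    The three delimited sections lend themselves to simple scripted inspection, for example when a snapshot arrives by e-mail for remote problem determination. The following is a minimal sketch in Python, not part of HACMP: the function names split_sections and parse_stanzas are hypothetical, and the parser assumes the layout shown in the excerpt above (a stanza header ending in a colon, followed by attribute = value lines). All values are returned as strings.

```python
import re

# Matches "<VER ... </VER", "<DSC ... </DSC" and "<ODM ... </ODM" blocks.
SECTION_RE = re.compile(r"<(VER|DSC|ODM)\s*\n(.*?)\n\s*</\1", re.DOTALL)


def split_sections(text):
    """Return a dict mapping 'VER', 'DSC' and 'ODM' to each section's raw body."""
    return {tag: body for tag, body in SECTION_RE.findall(text)}


def parse_stanzas(odm_body):
    """Parse ODM stanza text into a list of (class_name, attributes) pairs."""
    stanzas = []
    for raw in odm_body.splitlines():
        line = raw.strip()
        if not line or line == ".":
            continue  # skip blank lines and the dots used to elide an excerpt
        if line.endswith(":") and "=" not in line:
            stanzas.append((line[:-1], {}))  # stanza header such as "HACMPnode:"
        elif "=" in line and stanzas:
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip().strip('"')
    return stanzas
```

    A caller would read the .odm file, pass its text to split_sections, and then pass the 'ODM' entry to parse_stanzas to walk the saved object classes.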
    

    clconvert_snapshot Utility

    You can run clconvert_snapshot to convert cluster snapshots from a release supported for upgrade to a recent HACMP release. The clconvert_snapshot utility is not run automatically during installation; you must always run it from the command line. Each time you run the clconvert_snapshot command, conversion progress is logged to the /tmp/clconvert.log file.

    Note: Root user privilege is required to run clconvert_snapshot. You must know the HACMP version from which you are converting in order to run this utility.

    For more information on the clconvert_snapshot utility, refer to the clconvert_snapshot man page or to Appendix C: HACMP for AIX Commands.

    Defining a Custom Snapshot Method

    If you want additional, customized system and cluster information to be appended to the .info file, you should define custom snapshot methods to be executed when you create your cluster snapshot.

    To define a custom snapshot method, perform the following steps.

      1. Enter smit hacmp
      2. In SMIT, select HACMP Extended Configuration > Snapshot Configuration > Configure Custom Snapshot Method > Add a Custom Snapshot Method and press Enter.
      3. Enter field values as follows:
    Custom Snapshot Method Name
    A name for the custom snapshot method you would like to create.
    Custom Snapshot Method Description
    Add any descriptive information about the custom method.
    Custom Snapshot Script Filename
    Enter the full pathname of the custom snapshot script file.

    Once you have defined one or more custom snapshot methods, when you create a cluster snapshot you are asked to specify which custom method(s) you wish to run in addition to the conventional snapshot.
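
    As an illustration of what a custom snapshot script might look like, here is a minimal sketch in Python. It rests on several assumptions that this chapter does not confirm: that a Python 3 interpreter is installed on every node, that the text the script writes to standard output is what ends up appended to the .info file, and that the two commands shown (uname -a and df -k) are the extra data you want. The function names are hypothetical.

```python
import subprocess
import sys


def run_section(title, command):
    """Run one command and return its output under a labelled header."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=False)
        body = result.stdout if result.returncode == 0 else result.stderr
    except OSError as exc:  # for example, the command is not installed on this node
        body = f"could not run {command[0]}: {exc}\n"
    return f"=== {title} ===\n{body}"


def main(out=sys.stdout):
    # Hypothetical selection of extra data; replace with whatever your site needs.
    sections = [
        ("Kernel and host", ["uname", "-a"]),
        ("Filesystem usage", ["df", "-k"]),
    ]
    for title, command in sections:
        out.write(run_section(title, command))


if __name__ == "__main__":
    main()
```

    Reporting a failed command as text rather than raising an exception means one missing tool does not suppress the remaining sections.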

    Changing or Removing a Custom Snapshot Method

    After you have defined a custom snapshot method, you can change or delete it using the other menu items in the Configure Custom Snapshot Method SMIT panel: Change/Show a Custom Snapshot Method and Remove a Custom Snapshot Method.

    When you select one of these menus, a picklist of existing custom snapshot methods appears. Select the one you wish to change or remove and fill in the appropriate fields, or answer the prompt to confirm deletion.

    Creating (Adding) a Cluster Snapshot

    You can initiate cluster snapshot creation from any cluster node. You can create a cluster snapshot on a running cluster. The cluster snapshot facility retrieves information from each node in the cluster. Accessibility to all nodes is required. The snapshot is stored on the local node.

    Note: To get an accurate snapshot of a system that has been configured with Kerberos security, you must set up all Kerberos service principals before taking the snapshot. For details about configuring cluster security, see Chapter 16: Managing Users and Groups.

    To create a cluster snapshot:

      1. Enter smit hacmp
      2. In SMIT, select HACMP Extended Configuration > Snapshot Configuration > Add a Cluster Snapshot and press Enter.
      3. Enter field values as follows:
    Cluster Snapshot Name
    The name you want for the basename for the cluster snapshot files. The default directory path for storage and retrieval of the snapshot is /usr/es/sbin/cluster/snapshots. You can specify an alternate path using the SNAPSHOTPATH environment variable.
    Custom Defined Snapshot Methods
    Specify one or more custom snapshot methods to be executed if desired; press F4 for a picklist of custom methods on this node. If you select All, the custom methods will be executed in alphabetical order on each node.
    Save Cluster Log Files in a Snapshot
    The default is No. If you select Yes, HACMP collects cluster log files from all nodes and saves them in the snapshot. Saving log files can significantly increase the size of the snapshot.
    Cluster Snapshot Description
    Enter any descriptive text you want inserted into the cluster snapshot. You can specify any text string up to 255 characters in length.
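
    The directory rule described under Cluster Snapshot Name (the default path unless SNAPSHOTPATH is set) can be sketched as follows. This is an illustrative Python helper, not an HACMP interface; the function names are hypothetical, and the listing assumes that every file ending in .odm in the directory is a snapshot.

```python
import os
from pathlib import Path

DEFAULT_SNAPSHOT_DIR = "/usr/es/sbin/cluster/snapshots"


def snapshot_dir(environ=os.environ):
    """Return the snapshot directory, honouring SNAPSHOTPATH when it is set."""
    return Path(environ.get("SNAPSHOTPATH") or DEFAULT_SNAPSHOT_DIR)


def snapshot_files(basename, environ=os.environ):
    """Return the (.odm, .info) paths that belong to one snapshot basename."""
    directory = snapshot_dir(environ)
    return directory / f"{basename}.odm", directory / f"{basename}.info"


def list_snapshots(environ=os.environ):
    """List snapshot basenames, taken from the *.odm files in the directory."""
    directory = snapshot_dir(environ)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.odm"))
```

    Passing the environment as a parameter keeps the helpers easy to exercise against a scratch directory without touching the real snapshot store.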

    Applying a Cluster Snapshot

    Applying a cluster snapshot overwrites the data in the existing HACMP Configuration Database classes on all nodes in the cluster with the new Configuration Database data contained in the snapshot. You can apply a cluster snapshot from any cluster node.

    Note: Only the information in the .odm file is applied. The .info file is not needed to apply a snapshot.

    Applying a cluster snapshot may affect HACMP Configuration Database objects and system files as well as user-defined files.

  • If cluster services are inactive on all cluster nodes, applying the snapshot changes the Configuration Database data stored in the system default configuration directory (DCD).
  • If cluster services are active on the local node, applying a snapshot triggers a cluster-wide dynamic reconfiguration event.

    During dynamic reconfiguration, in addition to synchronizing the Configuration Database data stored in the DCDs on each node, HACMP replaces the current configuration data stored in the active configuration directory (ACD) with the updated configuration data in the DCD. The snapshot becomes the active configuration. For more information about dynamic reconfiguration of a cluster, see Chapter 15: Managing Resource Groups in a Cluster.

    Note: A cluster snapshot used for dynamic reconfiguration may contain changes to the cluster topology and to the cluster resources. You can change both the cluster topology and cluster resources in a single dynamic reconfiguration event.

    To apply a cluster snapshot using SMIT, perform the following steps:

      1. Enter smit hacmp
      2. In SMIT, select HACMP Extended Configuration > Snapshot Configuration > Apply a Cluster Snapshot and press Enter.
    SMIT displays the Cluster Snapshot to Apply panel containing a list of all the cluster snapshots that exist in the directory specified by the SNAPSHOTPATH environment variable.
      3. Select the cluster snapshot that you want to apply and press Enter. SMIT displays the Apply a Cluster Snapshot panel.
      4. Enter field values as follows:
    Cluster Snapshot Name
    Displays the current basename of the cluster snapshot. This field is not editable.
    Cluster Snapshot Description
    Displays the text stored in the description section of the snapshot files. This field is not editable.
    Un/Configure Cluster Resources?
    If you set this field to Yes, HACMP changes the definition of the resource in the Configuration Database and performs any configuration triggered by the resource change. For example, if you remove a filesystem, HACMP removes the filesystem from the Configuration Database and also unmounts the filesystem. By default, this field is set to Yes.
    If you set this field to No, HACMP changes the definition of the resource in the Configuration Database but does not perform any configuration processing that the change may require. For example, a filesystem would be removed from the HACMP cluster definition but would not be unmounted. This processing is left to be performed by HACMP during a fallover.
    HACMP attempts to limit the impact on the resource group when a component resource is changed. For example, if you add a filesystem to the resource group that already includes the underlying volume group as a resource, HACMP does not require any processing of the volume group. Other modifications made to the contents of a resource group may cause the entire resource group to be unconfigured and reconfigured during the dynamic reconfiguration. Cluster clients experience an interruption in related services while the dynamic reconfiguration is in progress.
    Force apply if verify fails?
    If this field is set to No, synchronization aborts if verification of the new configuration fails. As part of dynamic reconfiguration processing, the new configuration is verified before it is made the active configuration. By default, this field is set to No.
    If you want synchronization to proceed even if verification fails, set this value to Yes.
    Note: In some cases, the verification uncovers errors that do not cause the synchronization to fail. HACMP reports the errors in the SMIT command status window so that you are aware of an area of the configuration that may be a problem. You should investigate any error reports, even when they do not interfere with the synchronization.

    If the apply process fails or you want to go back to the previous configuration for any reason, you can re-apply an automatically saved configuration. See Undoing an Applied Snapshot below for details.

    Dynamic Changes and Cluster Snapshots

    If you create a cluster snapshot and then make a dynamic (DARE) change to a working cluster, such as removing and then re-adding a network, a later attempt to apply that snapshot may fail due to naming issues. For example, the following steps would cause the apply to fail:

      1. Start the cluster.
      2. Create a snapshot.
      3. Remove a network dynamically.
      4. Add a network dynamically using the same name as the one that was removed in step 3.
      5. Attempt to apply the snapshot from step 2.

    However, if you use a different network name in step 4 than the network that was removed, you can apply the snapshot successfully. (The problem is that a different network ID is used when the network is added back into the cluster.)
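
    The ID mismatch can be pictured with a toy model. The Python sketch below is not how HACMP stores or allocates network IDs; it simply assumes that each newly added network receives the next unused number, which is enough to reproduce the effect described above. The class, function, and network names are hypothetical.

```python
class ToyCluster:
    """Toy model: every added network gets the next unused numeric ID."""

    def __init__(self):
        self.networks = {}           # network name -> network ID
        self.highest_network_id = 0

    def add_network(self, name):
        self.highest_network_id += 1
        self.networks[name] = self.highest_network_id

    def remove_network(self, name):
        del self.networks[name]

    def snapshot(self):
        return dict(self.networks)


def conflicting_networks(snapshot, current):
    """Names present in both configurations but bound to different IDs."""
    return sorted(
        name for name, net_id in snapshot.items()
        if name in current and current[name] != net_id
    )


cluster = ToyCluster()
cluster.add_network("net_ether_01")
saved = cluster.snapshot()                # step 2: create a snapshot
cluster.remove_network("net_ether_01")    # step 3: remove a network
cluster.add_network("net_ether_01")       # step 4: same name, new ID
print(conflicting_networks(saved, cluster.networks))  # prints ['net_ether_01']
```

    Re-adding the network under a different name leaves no name bound to two different IDs, which matches the case in which the snapshot applies successfully.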

    Undoing an Applied Snapshot

    Before the new configuration is applied, the cluster snapshot facility automatically saves the current configuration in a snapshot called ~snapshot.n.odm, where n is either 1 (the most recent), 2, or 3. The saved snapshots are cycled so that only three generations of snapshots exist. If the apply process fails or you want to go back to the previous configuration for any reason, you can re-apply the saved configuration. The saved snapshots are stored in the directory specified by the SNAPSHOTPATH environment variable.
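
    The three-generation cycle can be sketched as a simple file rotation. The Python code below only illustrates the naming scheme described above; it is not HACMP's implementation, and the function name is hypothetical.

```python
from pathlib import Path

GENERATIONS = 3  # ~snapshot.1.odm is the most recent, ~snapshot.3.odm the oldest


def rotate_saved_snapshots(directory):
    """Shift generation 1 to 2 and 2 to 3, discarding the old generation 3.

    Returns the path where the newest saved configuration should be written.
    """
    directory = Path(directory)
    oldest = directory / f"~snapshot.{GENERATIONS}.odm"
    if oldest.exists():
        oldest.unlink()
    # Rename from the oldest remaining generation down to avoid overwriting.
    for n in range(GENERATIONS - 1, 0, -1):
        src = directory / f"~snapshot.{n}.odm"
        if src.exists():
            src.rename(directory / f"~snapshot.{n + 1}.odm")
    return directory / "~snapshot.1.odm"
```

    After four successive saves, the first saved configuration has been discarded and ~snapshot.1.odm holds the fourth.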

    Changing a Cluster Snapshot

    After creating a cluster snapshot, you can change the basename assigned to cluster snapshot files and the description contained in these files. Note that you must use the SMIT interface to perform this task.

    To change a cluster snapshot, perform the following steps:

      1. Enter smit hacmp
      2. In SMIT, select HACMP Extended Configuration > Snapshot Configuration > Change/Show a Cluster Snapshot and press Enter.

    SMIT displays the Change/Show a Cluster Snapshot panel with a list of all the cluster snapshots that exist in the directory specified by SNAPSHOTPATH.

      3. Select the cluster snapshot to change and press Enter.
      4. Enter field values as follows:
    Cluster Snapshot Name
    Displays the current basename of the cluster snapshot.
    New Cluster Snapshot Name
    Enter the new name you want assigned as the basename of the cluster snapshot files.
    Cluster Snapshot Description
    SMIT displays the current description. You can edit the text using up to 255 characters.

    Removing a Cluster Snapshot

    Removing a cluster snapshot deletes both of the ASCII files (.odm and .info) that define the snapshot from the snapshots directory. (The directory in which the snapshots are stored is defined in the SNAPSHOTPATH environment variable.) You must use SMIT to remove a cluster snapshot.

    To remove a cluster snapshot using the SMIT interface, perform the following steps:

      1. Enter smit hacmp
      2. In SMIT, select HACMP Extended Configuration > HACMP Snapshot Configuration > Remove a Cluster Snapshot and press Enter.
    SMIT generates and displays a list of all the cluster snapshots that exist in the directory specified by the SNAPSHOTPATH environment variable.
      3. Select the cluster snapshot to remove and press Enter.
    The cluster snapshot facility deletes the files in the snapshot directory that are associated with that snapshot.
