
Chapter 1: Cluster Information Program


This chapter provides an overview of the Cluster Information Program (Clinfo) and a description of the status information that Clinfo receives and maintains about an HACMP for AIX 5L cluster.

Cluster Information Program Overview

An HACMP cluster can undergo various transitions in its state over time. For example, a node can join or leave the cluster, or an application can fall over to a backup node. Each of these changes affects the state of the cluster. Because a cluster is dynamic, an application must be able to obtain current, accurate information about the cluster so that it can respond to changes as they occur. Clinfo provides this service.

The HACMP cluster software exports state information and cluster events through SNMP, an industry-standard network protocol. Writing programs to the SNMP API can be non-trivial and may require several transactions to obtain all the information related to a single cluster entity, such as a node or resource group.

The Clinfo API and associated components provide a library of access routines that supply the same cluster information using a straightforward programming model. For more information about the SNMP information available from HACMP or details of the Clinfo API implementation, please refer to Appendix A: Implementation Specifics.

The following sections provide additional details on using Clinfo in the HACMP environment:

  • Getting Started
  • Clinfo APIs
  • Events Tracked by Clinfo
  • Cluster Information Tracked by Clinfo.

    Getting Started

    The Clinfo API and its associated components are installed and configured when you install HACMP. There are a few things to consider before you begin using Clinfo:

      1. In order to use Clinfo you must specify that the Clinfo agent be started when you start HACMP cluster services. This is the Startup cluster information Daemon option in smitty clstart.
      2. Clinfo will recognize and use SNMP community names other than public. If your installation is using a non-public name or if you want to specify a specific name, see Appendix A: Implementation Specifics.
      3. Clinfo will run on each node where you install HACMP. It can also run on other machines provided they have TCP/IP connectivity to one of the HACMP nodes. For additional information about this capability, see Appendix A: Implementation Specifics.

    Clinfo APIs

    An application accesses cluster information through the Clinfo API functions. Developers can use either the Clinfo C API or the Clinfo C++ API to access the cluster status information, which is then available locally for clients running the Clinfo program.

    HACMP for AIX 5L includes two versions of the Clinfo C and C++ API libraries: one for single-threaded (non-threaded) applications (libcl.a and libclpp.a) and one for multi-threaded applications (libcl_r.a and libclpp_r.a). Be sure to link with the version appropriate for your application. libcl_r.a is the thread-safe version of libcl.a; libclpp_r.a is the thread-safe version of libclpp.a.

    Each of these libraries contains 32-bit and 64-bit objects, which are loaded at runtime depending on the AIX 5L operating environment.

    See Chapter 2: Clinfo C API and Chapter 3: Clinfo C++ API, for detailed descriptions of the routines in these APIs.

    Events Tracked by Clinfo

    Clinfo receives status information about the cluster events from the Cluster Manager. This information is accessible by the routines in the APIs. Clinfo tracks topology events, as the cluster passes through various states. Some of the states Clinfo tracks include:

  • Cluster state is up or down
  • Cluster substate has become stable or unstable
  • Application has come online or failed over to a backup node
  • Network has failed
  • Node is in the process of joining the cluster
  • Node has completed joining the cluster
  • Node is leaving the cluster (that is, the node has failed)
  • Node has left the cluster
  • New primary Cluster Manager has been elected (optional event)
  • Change of state for a cluster site (if configured).

    A complete listing of events can be found in subsequent sections of this document or in the clinfo.h include file, which is compiled into your application.

    Clinfo receives dynamic reconfiguration events but does not track them; that is, applications cannot register to receive notification of dynamic reconfiguration events. Clinfo sets the cluster substate to CLSS_RECONFIG when it receives dynamic reconfiguration events. Applications can obtain this information using the cl_getcluster routine. The events triggered by the dynamic reconfiguration, such as a node up or node down event, are visible to applications.

    Cluster Information Tracked by Clinfo

    Clinfo maintains information about the following entities:

  • Clusters
  • Nodes in the clusters
  • Network Interfaces attached to each node
  • Available Network Interface Information
  • State and location of Resource Group
  • Cluster Networks
  • Cluster Site(s).

    Clusters

    An HACMP cluster is a group of processors that cooperate to provide a highly available environment.

    Available Cluster Information

    Clinfo maintains the following information about configured clusters:

  • Cluster Name
  • Cluster ID
  • Cluster State
  • Cluster Substate
  • Primary Node Name (note that the Primary node is an HACMP concept and does not necessarily correspond to the role of the node with respect to the application in use)
  • Cluster Number of Nodes
  • Cluster Number of Networks
  • Cluster Number of Resource Groups
  • Cluster Number of Sites.

    Cluster Name

    A cluster name uniquely identifies a cluster. The administrator specifies the name when the cluster is configured.

    Cluster ID

    A cluster ID identifies each cluster. The cluster ID is a numeric value that HACMP assigns when the cluster is configured; it is not user-defined.

    Cluster State

    A cluster can be in one of the following defined states:

    CLS_UP
    At least one node in the cluster is up, and a primary is defined.
    CLS_DOWN
    No nodes are up.
    CLS_UNKNOWN
    Clinfo is unable to communicate, or is not yet communicating, with an SNMP process on any active cluster.
    CLS_NOTCONFIGURED
    The cluster is not yet configured.

    Cluster Substate

    A cluster can be in one of several defined substates:

    CLSS_ERROR
    An error occurred and manual intervention is required.
    CLSS_RECONFIG
    A dynamic reconfiguration of the cluster is in progress.
    CLSS_STABLE
    The cluster is stable (no reconfiguration is occurring).
    CLSS_UNSTABLE
    The cluster is unstable. The event history will show what events are happening.
    CLSS_UNKNOWN
    Clinfo is unable to communicate with an SNMP process on a cluster node.
    CLSS_NOTCONFIGURED
    The cluster is not yet configured.
    CLSS_NOTSYNCHED
    Some basic cluster configuration information exists, but the cluster has not yet been verified and synchronized.
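
    The way a client might combine the cluster state and substate can be sketched as follows. This is only an illustration under stated assumptions: the demo_ types and constants are hypothetical stand-ins for the definitions in clinfo.h (their numeric values are not taken from that header), and the proceed/wait/alert policy is one possible choice, not behavior prescribed by Clinfo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cluster state and substate constants.
 * The real definitions live in clinfo.h; these values are NOT copied
 * from that header. */
enum demo_state {
    DEMO_CLS_UP, DEMO_CLS_DOWN, DEMO_CLS_UNKNOWN, DEMO_CLS_NOTCONFIGURED
};
enum demo_substate {
    DEMO_CLSS_ERROR, DEMO_CLSS_RECONFIG, DEMO_CLSS_STABLE, DEMO_CLSS_UNSTABLE,
    DEMO_CLSS_UNKNOWN, DEMO_CLSS_NOTCONFIGURED, DEMO_CLSS_NOTSYNCHED
};

/* Hypothetical snapshot of the two fields an application has read. */
struct demo_cluster {
    enum demo_state state;
    enum demo_substate substate;
};

enum demo_action { DEMO_PROCEED, DEMO_WAIT, DEMO_ALERT };

/* Decide what the client does next (an assumed policy, for illustration). */
enum demo_action demo_next_action(const struct demo_cluster *c)
{
    assert(c != NULL);
    if (c->substate == DEMO_CLSS_ERROR)
        return DEMO_ALERT;      /* manual intervention is required */
    if (c->state == DEMO_CLS_UP && c->substate == DEMO_CLSS_STABLE)
        return DEMO_PROCEED;    /* cluster is up and no change is in progress */
    return DEMO_WAIT;           /* reconfiguring, unstable, unknown, and so on */
}
```

    In a real client, the snapshot would be filled from the cluster information returned by the Clinfo API rather than constructed by hand.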

    Primary Node Name

    The name of the node elected primary by its peers. This is an optional function carried over from a previous version of the software.

    Cluster Number of Nodes

    The number of nodes defined in the cluster.

    Cluster Number of Networks

    The total number of networks in the cluster.

    Cluster Number of Resource Groups

    The total number of resource groups configured.

    Cluster Number of Sites

    If sites are in use, the number of sites configured.

    Nodes

    A node is one of the processors that make up the cluster. Each node in the cluster runs the clstrmgr, clinfo, and clsmuxpd daemons.

    Available Node Information

    Clinfo maintains the following information about a node:

  • Cluster ID
  • Node Name
  • Node State
  • Network Interfaces (service, standby, and tty).

    Cluster ID

    The ID of the cluster to which this node belongs.

    Node Name

    The node name is a user-assigned string. The node name can contain up to 32 characters and cannot begin with a numeric character.
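
    The two naming rules stated above can be checked with a short helper. This is a minimal sketch: the function name and the DEMO_MAX_NODE_NAME macro are hypothetical and not part of the Clinfo API, rejecting an empty name is an assumption, and any other naming restrictions HACMP may enforce are not checked here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical limit taken from the text above, not from clinfo.h. */
#define DEMO_MAX_NODE_NAME 32

/* Return true if the name satisfies the two rules stated in the text:
 * at most 32 characters, and no leading digit. */
bool demo_node_name_ok(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    if (len == 0 || len > DEMO_MAX_NODE_NAME)
        return false;           /* empty (assumed invalid) or too long */
    if (isdigit((unsigned char)name[0]))
        return false;           /* must not begin with a numeric character */
    return true;
}
```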

    Node State

    A node can be in one of the following defined states:

    CLS_UP
    The node is up and running.
    CLS_DOWN
    The node is down.
    CLS_JOINING
    The node is in the process of joining the cluster.
    CLS_LEAVING
    The node is in the process of leaving the cluster.

    Network Interfaces

    The number and addresses of service interfaces attached to the node.

    Network Interfaces

    A network interface is the physical connection between a node and a network.

    An HACMP cluster can support multiple networks and point-to-point connections, as well as an RS232 serial line point-to-point connection. Each network will have one or more network interfaces on each cluster node.

    Available Network Interface Information

    Clinfo maintains the following information about a network interface:

  • Cluster ID
  • Node Name
  • Active Node ID
  • Interface Name
  • Interface ID
  • Interface Address
  • Interface State
  • Interface Role.

    Cluster ID

    The ID of the cluster to which this interface belongs.

    Node Name

    The name of the node to which this interface is attached.

    Active Node ID

    The ID of the node where the address is currently active.

    Interface Name

    An interface's name is the same as the name in the /etc/hosts file for the interface (that is, the name associated with the IP address of the host).

    Interface ID

    The network ID of the network to which this interface is connected.

    Interface Address

    The IP address for the interface as defined in the /etc/hosts file.

    Interface State

    An interface can be in one of several defined states. The following values describe the state of a network interface:

    CLS_UP
    The interface is up and running.
    CLS_DOWN
    The network interface or network is down.
    CLS_INVALID
    This interface is not defined for this node.

    Interface Role

    An interface can be in one of several defined roles. The following values describe the roles of a network interface:

    CL_INT_ROLE_INVALID
    The network interface is invalid.
    CL_INT_ROLE_SERVICE
    The network interface is defined as a service interface.
    CL_INT_ROLE_STANDBY
    The network interface role is deprecated.
    CL_INT_ROLE_BOOT
    The network interface is defined as a boot interface.
    CL_INT_ROLE_SH_SERVICE
    The network interface role is deprecated.
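
    An application often needs both the state and the role of an interface, for example to count the service interfaces that are currently usable on a node. The following is a minimal sketch; the demo_ structure and constants are hypothetical stand-ins for the definitions in clinfo.h, and their values are not taken from that header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interface state and role constants. */
enum demo_if_state { DEMO_IF_UP, DEMO_IF_DOWN, DEMO_IF_INVALID };
enum demo_if_role {
    DEMO_ROLE_INVALID, DEMO_ROLE_SERVICE, DEMO_ROLE_STANDBY,
    DEMO_ROLE_BOOT, DEMO_ROLE_SH_SERVICE
};

/* Hypothetical record holding a subset of the interface information. */
struct demo_interface {
    const char *name;
    enum demo_if_state state;
    enum demo_if_role role;
};

/* Count the interfaces that have the service role and are up. */
size_t demo_count_usable_service(const struct demo_interface *ifs, size_t count)
{
    assert(ifs != NULL || count == 0);
    size_t usable = 0;
    for (size_t i = 0; i < count; i++) {
        if (ifs[i].role == DEMO_ROLE_SERVICE && ifs[i].state == DEMO_IF_UP)
            usable++;
    }
    return usable;
}
```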

    Resource Group

    A resource group contains all the resources associated with an instance of an application that HACMP is keeping highly available.

    Resource groups are brought online as nodes join the cluster. Resource groups are moved between cluster nodes as failures occur. The groups’ state and location are available through Clinfo.

    Available Resource Group Information

    Clinfo maintains the following information about a resource group:

  • Cluster ID
  • Group Name
  • Group ID
  • Group Startup Policy
  • Group Fallover Policy
  • Group Fallback Policy
  • Group Site Policy
  • Number of Nodes
  • Group Node IDs
  • Group Node State.

    Cluster ID

    The ID of the cluster to which this resource group belongs.

    Group Name

    This is the name given to the resource group when it is first defined.

    Group ID

    The numeric ID associated with the group.

    Group Startup Policy

    The startup policy used for this resource group:

  • Online on home node only
  • Online on first available node
  • Online on all available nodes
  • Online using distribution policy.

    Group Fallover Policy

    The fallover policy for this resource group:

  • Fall over to the next priority node in the list
  • Fall over using dynamic node priority
  • Bring offline (on error node only).

    Group Fallback Policy

    The fallback policy for this resource group:

  • Fall back to a higher priority node in the list
  • Never fall back.

    Group Site Policy

    When the group contains replicated resources, the following policies are used for resource group startup, fallover, and fallback as it occurs between two sites:

  • Prefer primary site
  • Online on either site
  • Online on both sites.

    Number of Nodes

    The number of nodes participating in the resource group.

    Group Node IDs

    The node IDs of all nodes participating in the resource group.

    Group Node State

    The state of the resource group on each node.

    Resource Group States

    Resource groups can be in one of the following states on one or more cluster nodes:

    CL_RGNS_INVALID
    The node is not part of this resource group.
    CL_RGNS_ONLINE
    The group and all its resources are up and available on the node.
    CL_RGNS_OFFLINE
    The group is currently not active on the node.
    CL_RGNS_ACQUIRING
    The group is in the process of being acquired on this node.
    CL_RGNS_RELEASING
    The group is being released from this node.
    CL_RGNS_ERROR
    An error occurred when trying to bring the group online or offline. Resources for this group are not active. Manual intervention is required.

    Note: This list does not include all possible resource group states: if sites are defined, a primary and a secondary instance of the resource group could be online, offline, in the error state, or unmanaged. In addition, the resource group instances could be in the process of acquiring or releasing. The corresponding resource group states are not listed here, but have descriptive names that explain which actions take place.
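
    Because the group state is reported per node, locating a resource group means scanning its node list for the online state. The following is a minimal sketch; the demo_ types are hypothetical stand-ins for the clinfo.h definitions, only the six states listed above are modeled, and the -1 return value assumes that valid node IDs are non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CL_RGNS_* constants; the values are
 * NOT taken from clinfo.h, and site-related states are omitted. */
enum demo_rg_state {
    DEMO_RGNS_INVALID, DEMO_RGNS_ONLINE, DEMO_RGNS_OFFLINE,
    DEMO_RGNS_ACQUIRING, DEMO_RGNS_RELEASING, DEMO_RGNS_ERROR
};

/* Hypothetical pairing of a group node ID with the group state there. */
struct demo_rg_node {
    int node_id;
    enum demo_rg_state state;
};

/* Return the ID of the first node where the group is online, or -1 if
 * the group is not online on any node in the list. */
int demo_rg_online_node(const struct demo_rg_node *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].state == DEMO_RGNS_ONLINE)
            return nodes[i].node_id;
    }
    return -1;
}
```

    A group that uses the "Online on all available nodes" startup policy can be online on several nodes at once; this sketch reports only the first one it finds.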

    Cluster Networks

    An HACMP cluster may contain both TCP/IP and non-IP based networks. Typically, non-IP based networks are used for disk heartbeating traffic.

    Available Network Information

    Clinfo provides the following information about the networks in the cluster:

  • Network Name
  • Network ID
  • Network Type
  • Network Attribute
  • Network Node Information
  • Network State.

    Network Name

    The name associated with the network. You can supply this name or HACMP can generate it.

    Network ID

    A numeric network identifier generated by HACMP.

    Network Type

    The physical type of the network, such as ether, token, or hps.

    Network Attribute

    The network attribute can be one of the following:

    Serial
    The network is non-IP based, such as RS232, target mode, or disk heartbeating.
    Public or Private
    For IP based networks, this attribute can be set to Public or Private. Setting the attribute to Private identifies this network for use by Oracle as a private network. The default is Public.

    Network Node Information

    For each cluster node connected to the network, the following information is provided:

    Node ID
    The node ID of each node.
    Node State
    The state of the network on the node. Note that this may differ from the global network state.

    Network State

    The network state, including the global state and the per node state, can be one of several predefined values:

    CLS_UP
    The network is up. If the network is up on any node, the global state will be set to up as well.
    CLS_DOWN
    The network is not up. A network can be down on one or more nodes (that is, the per-node state is CLS_DOWN) and still have a global state of CLS_UP.
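
    The relationship between the per-node state and the global state described above can be expressed in a few lines. This is a sketch only: the demo_ names are hypothetical stand-ins for CLS_UP and CLS_DOWN in clinfo.h, and treating an empty node list as down is an assumption. Clinfo reports the global state itself, so an application would not normally need to derive it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real CLS_UP and CLS_DOWN values are
 * defined in clinfo.h and are not reproduced here. */
enum demo_net_state { DEMO_NET_UP, DEMO_NET_DOWN };

/* Derive the global network state: up if the network is up on any node. */
enum demo_net_state demo_global_net_state(const enum demo_net_state *per_node,
                                          size_t count)
{
    assert(per_node != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (per_node[i] == DEMO_NET_UP)
            return DEMO_NET_UP;
    }
    return DEMO_NET_DOWN;   /* down everywhere, or no nodes (assumption) */
}
```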

    Cluster Site(s)

    When sites are configured, nodes in the cluster are grouped into cluster sites. The state of the cluster sites is tracked and cluster events are generated.

    Available Cluster Site Information

    Clinfo maintains the following information about a cluster site:

  • Site ID
  • Site Name
  • Site Priority
  • Site Backup Method
  • Site State
  • Site Number of Nodes
  • Site Node IDs.

    Site ID

    The HACMP-generated ID for the site.

    Site Name

    The site name specified when the site was created.

    Site Priority

    Site priority determines the precedence and direction for data mirroring between sites. A cluster site can have a priority of one of the following:

    CL_SITE_PRIMARY
    This is the site where the application runs. Data is mirrored from this site to the backup site.
    CL_SITE_SECONDARY
    This site serves as a backup.
    CL_SITE_TERTIARY
    This site is mirroring data sent from the secondary site. This value is reserved for future use.

    Site Backup Method

    This is the backup communications method for the site. If a backup method is configured, it can be one of the following:

    CL_SITE_BACKUP_DBFS
    Dial Back Fail Safe
    CL_SITE_BACKUP_SGN
    Serial Global Network
    CL_SITE_BACKUP_NONE
    No backup communications are configured.

    Site State

    The global state of the site:

    CLS_UP
    One or more nodes in the site are up.
    CLS_DOWN
    All the nodes in the site are down.

    Site Number of Nodes

    The number of nodes in this site.

    Site Node IDs

    A list of node IDs that participate in this site.

