
Appendix C: HACMP Tracing


This appendix describes how to trace HACMP-related events.

HACMP Tracing Overview

The trace facility helps you isolate a problem within an HACMP system by allowing you to monitor selected events. Using the trace facility, you can capture a sequential flow of time-stamped system events that provide a fine level of detail on the activity within an HACMP cluster.

The trace facility is a low-level debugging tool that augments the troubleshooting facilities described earlier in this book. While tracing is extremely useful for problem determination and analysis, interpreting a trace report typically requires IBM support.

The trace facility generates large amounts of data. The most practical way to use the trace facility is for short periods of time—from a few seconds to a few minutes. This should be ample time to gather sufficient information about the event you are tracking and to limit use of space on your storage device.

The trace facility is efficient and has a negligible impact on system performance.

Using the Trace Facility for HACMP Daemons

Use the trace facility to track the operation of the following HACMP daemons:

  • The Cluster Manager daemon (clstrmgrES)
  • The Cluster Information Program daemon (clinfoES)
  • The Cluster Communication daemon (clcomdES)

The clstrmgrES, clinfoES, and clcomd daemons are controlled by the System Resource Controller (SRC).

    Enabling Daemons under the Control of the System Resource Controller

    The clstrmgrES and clinfoES daemons are user-level applications under the control of the SRC. Before you can start a trace on one of these daemons, enable tracing for that daemon. Enabling tracing on a daemon adds that daemon to the master list of daemons for which you want to record trace data.

    clcomd Daemon

    The clcomd daemon is also under control of the SRC. To start a trace of this daemon, use the AIX 5L traceson command and specify the clcomd subsystem.
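
As a minimal sketch of the command named above: traceson takes an SRC subsystem name with its -s flag, and tracesoff turns the trace off again. The subsystem name clcomdES is an assumption taken from the daemon list in this appendix, so confirm the name registered on your system with lssrc before running anything. The helper name clcomd_trace_cmd and the CLCOMD_SUBSYSTEM variable are hypothetical, and the helper only prints the command so that you can review it first.

```shell
# Hypothetical helper: print (do not run) the SRC command that switches
# tracing of the Cluster Communication daemon on or off.
# Assumption: the subsystem is registered with the SRC as "clcomdES".
CLCOMD_SUBSYSTEM=${CLCOMD_SUBSYSTEM:-clcomdES}

clcomd_trace_cmd() {
  case "$1" in
    on)  echo "traceson -s $CLCOMD_SUBSYSTEM" ;;
    off) echo "tracesoff -s $CLCOMD_SUBSYSTEM" ;;
    *)   echo "usage: clcomd_trace_cmd on|off" >&2; return 1 ;;
  esac
}

clcomd_trace_cmd on    # prints: traceson -s clcomdES
```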

    Initiating a Trace Session

    Use SMIT to initiate a trace session for the clstrmgrES or clinfoES daemons. SMIT lets you enable tracing in the HACMP SRC-controlled daemons, start and stop a trace session in the daemons, and generate a trace report. The following sections describe how to use SMIT for initiating a trace session.

    Using SMIT to Obtain Trace Information

    To initiate a trace session using the SMIT interface:

      1. Enable tracing on the SRC-controlled daemon or daemons you specify.
         Use the SMIT Problem Determination Tools > HACMP Trace Facility > Enable/Disable Tracing of HACMP for AIX Daemons panel to indicate that the selected daemons should have trace data recorded for them.
      2. Start the trace session.
         Use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to trigger the collection of data.
      3. Stop the trace session.
         You must stop the trace session before you can generate a report. The tracing session stops either when you use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to stop it or when the log file becomes full.
      4. Generate a trace report.
         Once the trace session is stopped, use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to generate a report.

    Each step is described in the following sections.

    Enabling Tracing on SRC-controlled Daemons

    To enable tracing on an SRC-controlled daemon (clstrmgrES or clinfoES):

      1. Enter: smit hacmp
      2. Select Problem Determination Tools > HACMP Trace Facility and press Enter.
      3. Select Enable/Disable Tracing of HACMP for AIX Daemons and press Enter.
      4. Select Start Trace and press Enter. SMIT displays the Start Trace panel. Note that you only use this panel to enable tracing, not to actually start a trace session. It indicates that you want events related to this particular daemon captured the next time you start a trace session. See Starting a Trace Session for more information.
      5. Enter the PID of the daemon whose trace data you want to capture in the Subsystem PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the daemon and press Enter. Note that you can select only one daemon at a time. Repeat these steps for each additional daemon that you want to trace.
      6. Indicate whether you want a short or long trace event in the Trace Type field. A short trace contains terse information. For the clstrmgrES daemon, a short trace produces messages only when topology events occur. A long trace contains detailed information on time-stamped events.
      7. Press Enter to enable the trace. SMIT displays a panel that indicates that tracing for the specified process is enabled.
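
The PID that step 5 asks for can also be read from lssrc output on the command line. The sketch below assumes the usual four-column lssrc listing (Subsystem, Group, PID, Status). It also assumes that the SMIT panel corresponds to the SRC traceson command, whose -l flag requests a long trace and whose -p flag takes a subsystem PID; the source does not state which command SMIT runs. Both function names are hypothetical, and the second one only prints the command.

```shell
# Hypothetical helper: read `lssrc -s <subsystem>` output on stdin and
# print the PID of the named subsystem, but only if it is active.
# Assumes the columns are: Subsystem Group PID Status.
pid_of_subsystem() {
  awk -v name="$1" '$1 == name && $NF == "active" { print $3 }'
}

# Hypothetical helper: print (do not run) the command that enables a
# short or long trace for one daemon PID.
# usage: enable_trace_cmd PID [short|long]
enable_trace_cmd() {
  pid=$1
  type=${2:-short}
  if [ "$type" = long ]; then
    echo "traceson -l -p $pid"
  else
    echo "traceson -p $pid"
  fi
}

# On a cluster node this would be used roughly as:
#   pid=$(lssrc -s clstrmgrES | pid_of_subsystem clstrmgrES)
#   [ -n "$pid" ] && enable_trace_cmd "$pid" long
```

An inoperative subsystem has no PID column in the listing, which is why the lookup checks the status field before printing anything.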

    Disabling Tracing on SRC-controlled Daemons

    To disable tracing on the clstrmgrES or clinfoES daemons:

      1. Enter: smit hacmp
      2. Select Problem Determination Tools > HACMP Trace Facility > Enable/Disable Tracing of HACMP for AIX Daemons > Stop Trace. SMIT displays the Stop Trace panel. Note that you only use this panel to disable tracing, not to actually stop a trace session. It indicates that you do not want events related to this particular daemon captured the next time you run a trace session.
      3. Enter the PID of the process for which you want to disable tracing in the Subsystem PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the process for which you want to disable tracing and press Enter. Note that you can disable only one daemon at a time. To disable more than one daemon, repeat these steps.
      4. Press Enter to disable the trace. SMIT displays a panel that indicates that tracing for the specified daemon has been disabled.

    Starting a Trace Session

    Starting a trace session triggers the actual recording of data on system events into the system trace log from which you can later generate a report.

    Remember, you can start a trace on the clstrmgrES and clinfoES daemons only if you have previously enabled tracing for them.

    To start a trace session:

      1. Enter: smit hacmp
      2. Select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Start Trace. SMIT displays the Start Trace panel.
      3. Enter the trace IDs of the daemons that you want to trace in the ADDITIONAL event IDs to trace field.
         Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the cursor to the first daemon whose events you want to trace and press F7 to select it. Repeat this process for each daemon that you want to trace. When you are done, press Enter. The values that you selected are displayed in the ADDITIONAL event IDs to trace field. The HACMP daemons have the following trace IDs:

         Daemon        Trace ID
         clstrmgrES    910
         clinfoES      911

      4. Enter values as necessary into the remaining fields and press Enter. SMIT displays a panel that indicates that the trace session has started.
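
For reference, the AIX trace command accepts the same trace IDs on the command line. The sketch below only assembles the command. It assumes the -a flag (run the trace in the background) and the -j flag (comma-separated list of IDs to record); whether SMIT passes exactly these flags is not stated in the source. The function name is hypothetical.

```shell
# Hypothetical helper: join the given trace IDs with commas and print
# (do not run) the matching AIX trace command.
# 910 = clstrmgrES, 911 = clinfoES (the IDs listed above).
build_trace_cmd() {
  [ "$#" -gt 0 ] || { echo "need at least one trace ID" >&2; return 1; }
  ids=$(IFS=,; echo "$*")
  echo "trace -a -j $ids"
}

build_trace_cmd 910 911    # prints: trace -a -j 910,911
```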

    Stopping a Trace Session

    You need to stop a trace session before you can generate a trace report. A trace session ends when you actively stop it or when the log file is full.

    To stop a trace session:

      1. Enter: smit hacmp
      2. In SMIT, select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Stop Trace. SMIT displays the Command Status panel, indicating that the trace session has stopped.
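
Because a session should run only for seconds or minutes, it can help to script the start and stop together. The following is a sketch under stated assumptions: it uses the AIX trace and trcstop commands, keeps DRY_RUN=1 by default so that nothing is executed until you have reviewed the printed commands, and uses hypothetical names (run, timed_trace, DRY_RUN).

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: trace the given IDs for a fixed number of
# seconds, then stop the session so a report can be generated.
# usage: timed_trace SECONDS ID[,ID...]
timed_trace() {
  seconds=$1
  ids=$2
  run trace -a -j "$ids"
  run sleep "$seconds"
  run trcstop
}

timed_trace 30 910,911
```

Stopping the session explicitly after a fixed interval keeps the trace log small and avoids relying on the log file filling up to end the session.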

    Generating a Trace Report

    A trace report formats the information stored in the trace log file and displays it in a readable form. The report displays text and data for each event according to the rules provided in the trace format file.

    When you generate a report, you can specify:

  • Events to include (or omit)
  • The format of the report

    To generate a trace report:

      1. Enter: smit hacmp
      2. In SMIT, select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Generate a Trace Report. A dialog box prompts you for a destination, either a filename or a printer.
      3. Indicate the destination and press Enter. SMIT displays the Generate a Trace Report panel.
      4. Enter the trace IDs of the daemons whose events you want to include in the report in the IDs of events to INCLUDE in Report field.
      5. Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the cursor to the first daemon whose events you want to include in the report and press F7 to select it. Repeat this procedure for each event that you want to include in the report. When you are done, press Enter. The values that you selected are displayed in the IDs of events to INCLUDE in Report field. The HACMP daemons have the following trace IDs:

         Daemon        Trace ID
         clstrmgrES    910
         clinfoES      911

      6. Enter values as necessary in the remaining fields and press Enter.
      7. When the information is complete, press Enter to generate the report. The output is sent to the specified destination. For an example of a trace report, see the following Sample Trace Report section.
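
The report can also be produced with the AIX trcrpt command. This sketch assembles a command line, assuming trcrpt's -d flag (restrict the report to the listed IDs) and -o flag (write the report to a file). The output path and the function name are illustrative assumptions, not values from the source.

```shell
# Hypothetical helper: print (do not run) a trcrpt command that limits
# the report to the given trace IDs and writes it to a file.
# usage: build_report_cmd OUTPUT_FILE ID [ID...]
build_report_cmd() {
  out=$1
  shift
  [ "$#" -gt 0 ] || { echo "need at least one trace ID" >&2; return 1; }
  ids=$(IFS=,; echo "$*")
  echo "trcrpt -d $ids -o $out"
}

# The path below is only an example destination.
build_report_cmd /tmp/hacmp_trace.rpt 910 911
```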

    Sample Trace Report

    The following is a sample trace report.

    Wed Mar  10 13:01:37 1998 
    System: AIX steamer Node: 3 
    Machine: 000040542000 
    Internet Address: 00000000 0.0.0.0 
    trace -j 011  -s -a  
    ID  PROCESS NAME   I SYSTEM CALL   ELAPSED     APPL    SYSCALL KERNEL  INTERRUPT 
    001 trace                          0.000000    TRACE ON channel 0 
    Fri Mar 10 13:01:38 1995 
    011 trace                          19.569326   HACMP for AIX:clinfo Exiting Function: broadcast_map_request
    011 trace                          19.569336   HACMP for AIX:clinfo Entering Function: skew_delay
    011 trace                          19.569351   HACMP for AIX:clinfo Exiting Function: skew_delay, amount: 718650720
    011 trace                          19.569360   HACMP for AIX:clinfo Exiting Function: service_context
    011 trace                          19.569368   HACMP for AIX:clinfo Entering Function: dump_valid_nodes
    011 trace                          19.569380   HACMP for AIX:clinfo Entering Function: dump_valid_nodes
    011 trace                          19.569387   HACMP for AIX:clinfo Entering Function: dump_valid_nodes
    011 trace                          19.569394   HACMP for AIX:clinfo Entering Function: dump_valid_nodes
    011 trace                          19.569402   HACMP for AIX:clinfo Waiting for event
    011 trace                          22.569933   HACMP for AIX:clinfo Entering Function: service_context
    011 trace                          22.569995   HACMP for AIX:clinfo Cluster ID: -1
    011 trace                          22.570075   HACMP for AIX:clinfo Cluster ID: -1
    011 trace                          22.570087   HACMP for AIX:clinfo Cluster ID: -1
    011 trace                          22.570097   HACMP for AIX:clinfo Time Expired: -1
    011 trace                          22.570106   HACMP for AIX:clinfo Entering Function: broadcast_map_request
    002 trace                          23.575955                TRACE OFF channel 0 
                                                                Wed Nov 15 13:02:01 1999 
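
Reports like this one quickly grow long, so a useful first pass is often just counting events per trace ID. The sketch below is an illustration, not part of HACMP. It assumes that every event line begins with a three-digit hexadecimal ID, as in the sample above, and the function name and file name are hypothetical.

```shell
# Hypothetical helper: read a trace report on stdin and print the number
# of event lines seen for each trace ID, sorted by ID.
# Assumes event lines start with a three-hex-digit ID (for example 011);
# date lines and wrapped continuation lines are ignored.
count_hook_events() {
  awk '$1 ~ /^[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]$/ { n[$1]++ }
       END { for (id in n) print id, n[id] }' | sort
}

# Typical use on a saved report (file name is illustrative):
#   count_hook_events < /tmp/hacmp_trace.rpt
```

A continuation line that happens to begin with a three-character hexadecimal word would be miscounted, so treat the totals as a rough guide rather than an exact tally.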
    
