Appendix C: HACMP Tracing
This appendix describes how to trace HACMP-related events.
HACMP Tracing Overview
The trace facility helps you isolate a problem within an HACMP system by allowing you to monitor selected events. Using the trace facility, you can capture a sequential flow of time-stamped system events that provide a fine level of detail on the activity within an HACMP cluster.
The trace facility is a low-level debugging tool that augments the troubleshooting facilities described earlier in this book. While tracing is extremely useful for problem determination and analysis, interpreting a trace report typically requires IBM support.
The trace facility generates large amounts of data. The most practical way to use the trace facility is for short periods of time, from a few seconds to a few minutes. This should be ample time to gather sufficient information about the event you are tracking and to limit use of space on your storage device.
The trace facility has a negligible impact on system performance because of its efficiency.
Using the Trace Facility for HACMP Daemons
Use the trace facility to track the operation of the following HACMP daemons:
The Cluster Manager daemon (clstrmgrES)
The Cluster Information Program daemon (clinfoES)
The Cluster Communication daemon (clcomdES)
The clstrmgrES, clinfoES and clcomd daemons are controlled by the System Resource Controller (SRC).
Enabling Daemons under the Control of the System Resource Controller
The clstrmgrES and clinfoES daemons are user-level applications under the control of the SRC. Before you can start a trace on one of these daemons, you must enable tracing for that daemon. Enabling tracing on a daemon adds that daemon to the master list of daemons for which you want to record trace data.
clcomd Daemon
The clcomd daemon is also under control of the SRC. To start a trace of this daemon, use the AIX 5L traceson command and specify the clcomd subsystem.
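As a minimal sketch, the command line for this step can be assembled as shown below. It assumes the daemon is registered with the SRC under the subsystem name clcomdES, as listed earlier in this appendix; confirm the name on your system before using it. The helper name src_trace_command is hypothetical, and the sketch only builds the argument list rather than running the command.

```python
def src_trace_command(subsystem, enable=True):
    """Return the argv list that turns SRC tracing on or off for one subsystem."""
    if not subsystem or any(ch.isspace() for ch in subsystem):
        raise ValueError("subsystem name must be a non-empty single word")
    # traceson enables tracing; tracesoff disables it. Both accept -s <subsystem>.
    program = "traceson" if enable else "tracesoff"
    return [program, "-s", subsystem]


if __name__ == "__main__":
    # "clcomdES" is an assumed subsystem name taken from the daemon list above.
    print(" ".join(src_trace_command("clcomdES")))
    print(" ".join(src_trace_command("clcomdES", enable=False)))
```

Building the argument list separately from executing it makes it possible to review the exact command before it is run as root on a cluster node.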
Initiating a Trace Session
Use SMIT to initiate a trace session for the clstrmgr or clinfo utilities. SMIT lets you enable tracing in the HACMP SRC-controlled daemons, start and stop a trace session in the daemons, and generate a trace report. The following sections describe how to use SMIT for initiating a trace session.
Using SMIT to Obtain Trace Information
To initiate a trace session using the SMIT interface:
1. Enable tracing on the SRC-controlled daemon or daemons you specify.
Use the SMIT Problem Determination Tools > HACMP Trace Facility > Enable/Disable Tracing of HACMP for AIX Daemons panel to indicate that the selected daemons should have trace data recorded for them.
2. Start the trace session.
Use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to trigger the collection of data.
3. Stop the trace session.
You must stop the trace session before you can generate a report. The tracing session stops either when you use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to stop it or when the log file becomes full.
4. Generate a trace report.
Once the trace session is stopped, use the SMIT Start/Stop/Report Tracing of HACMP for AIX Services panel to generate a report.
Each step is described in the following sections.
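For orientation, steps 2 through 4 can be sketched as a sequence of AIX commands. This assumes the SMIT panels correspond to the standard trace, trcstop, and trcrpt commands, which this appendix does not state explicitly. The hook ID 011 is taken from the trace command shown in the sample report in this appendix and may differ on your release. The function name trace_session_plan and the report path are hypothetical. The sketch only builds the commands and does not run them.

```python
def trace_session_plan(hook_ids, report_path):
    """Return argv lists for starting, stopping, and reporting a trace session."""
    if not hook_ids:
        raise ValueError("at least one trace hook ID is required")
    ids = ",".join(hook_ids)
    return [
        # Step 2: start tracing asynchronously, limited to the given hook IDs.
        ["trace", "-a", "-j", ids],
        # Step 3: stop the trace session.
        ["trcstop"],
        # Step 4: format the trace log, limited to the same IDs, into a file.
        ["trcrpt", "-d", ids, "-o", report_path],
    ]


if __name__ == "__main__":
    # "011" comes from the sample report; the output path is illustrative only.
    for argv in trace_session_plan(["011"], "/tmp/hacmp_trace.rpt"):
        print(" ".join(argv))
```

Because the facility generates large amounts of data, keep the interval between the first and second commands short.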
Enabling Tracing on SRC-controlled Daemons
To enable tracing on an SRC-controlled daemon (clstrmgrES or clinfoES):
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility and press Enter.
3. Select Enable/Disable Tracing of HACMP for AIX Daemons and press Enter.
4. Select Start Trace and press Enter. SMIT displays the Start Trace panel. Note that you only use this panel to enable tracing, not to actually start a trace session. It indicates that you want events related to this particular daemon captured the next time you start a trace session. See Starting a Trace Session for more information.
5. Enter the PID of the daemon whose trace data you want to capture in the Subsystem PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the daemon and press Enter. Note that you can select only one daemon at a time. Repeat these steps for each additional daemon that you want to trace.
6. Indicate whether you want a short or long trace event in the Trace Type field. A short trace contains terse information. For the clstrmgrES daemon, a short trace produces messages only when topology events occur. A long trace contains detailed information on time-stamped events.
7. Press Enter to enable the trace. SMIT displays a panel that indicates that tracing for the specified process is enabled.
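The PID lookup and the short or long choice in the steps above can be sketched outside SMIT as follows. This assumes the panel maps onto the AIX traceson command, where -p selects a subsystem by PID and -l requests a long trace; the appendix does not confirm that mapping. The lssrc output in SAMPLE_LSSRC is a made-up illustration, including the group name and PID, and the function names are hypothetical.

```python
def pid_from_lssrc(output, subsystem):
    """Return the PID listed for subsystem, or None if it is not active."""
    for line in output.splitlines():
        fields = line.split()
        # Active rows have four columns: Subsystem, Group, PID, Status.
        # Inoperative rows have no PID column and are skipped.
        if len(fields) == 4 and fields[0] == subsystem and fields[2].isdigit():
            return int(fields[2])
    return None


def enable_trace_command(pid, long_trace=False):
    """Return the argv list that enables tracing for one daemon by PID."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    argv = ["traceson"]
    if long_trace:
        argv.append("-l")  # long trace: detailed, time-stamped events
    return argv + ["-p", str(pid)]


# Illustrative output only; real values come from running lssrc on the node.
SAMPLE_LSSRC = """\
Subsystem         Group            PID          Status
 clstrmgrES       cluster          12345        active
 clinfoES         cluster                       inoperative
"""

if __name__ == "__main__":
    pid = pid_from_lssrc(SAMPLE_LSSRC, "clstrmgrES")
    if pid is not None:
        print(" ".join(enable_trace_command(pid, long_trace=True)))
```

As in the SMIT procedure, the command applies to one daemon at a time, so it would be built once per daemon.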
Disabling Tracing on SRC-controlled Daemons
To disable tracing on the clstrmgrES or clinfoES daemons:
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility > Enable/Disable Tracing of HACMP for AIX Daemons > Stop Trace. SMIT displays the Stop Trace panel. Note that you only use this panel to disable tracing, not to actually stop a trace session. It indicates that you do not want events related to this particular daemon captured the next time you run a trace session.
3. Enter the PID of the process for which you want to disable tracing in the Subsystem PROCESS ID field. Press F4 to see a list of all processes and their PIDs. Select the process for which you want to disable tracing and press Enter. Note that you can disable only one daemon at a time. To disable more than one daemon, repeat these steps.
4. Press Enter to disable the trace. SMIT displays a panel that indicates that tracing for the specified daemon has been disabled.
Starting a Trace Session
Starting a trace session triggers the actual recording of data on system events into the system trace log from which you can later generate a report.
Remember, you can start a trace on the clstrmgrES and clinfoES daemons only if you have previously enabled tracing for them.
To start a trace session:
1. Enter: smit hacmp
2. Select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Start Trace. SMIT displays the Start Trace panel.
3. Enter the trace IDs of the daemons that you want to trace in the ADDITIONAL event IDs to trace field.
Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the cursor to the first daemon whose events you want to trace and press F7 to select it. Repeat this process for each event that you want to trace. When you are done, press Enter. The values that you selected are displayed in the ADDITIONAL event IDs to trace field. The HACMP daemons have the following trace IDs:
4. Enter values as necessary into the remaining fields and press Enter. SMIT displays a panel that indicates that the trace session has started.
Stopping a Trace Session
You need to stop a trace session before you can generate a trace report. A trace session ends when you actively stop it or when the log file is full.
To stop a trace session:
1. Enter: smit hacmp
2. In SMIT, select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Stop Trace. SMIT displays the Command Status panel, indicating that the trace session has stopped.
Generating a Trace Report
A trace report formats the information stored in the trace log file and displays it in a readable form. The report displays text and data for each event according to the rules provided in the trace format file.
When you generate a report, you can specify:
Events to include (or omit)
The format of the report.
To generate a trace report:
1. Enter: smit hacmp
2. In SMIT, select Problem Determination Tools > HACMP Trace Facility > Start/Stop/Report Tracing of HACMP for AIX Services > Generate a Trace Report. A dialog box prompts you for a destination, either a filename or a printer.
3. Indicate the destination and press Enter. SMIT displays the Generate a Trace Report panel.
4. Enter the trace IDs of the daemons whose events you want to include in the report in the IDs of events to INCLUDE in Report field.
5. Press F4 to see a list of the trace IDs. (Press Ctrl-v to scroll through the list.) Move the cursor to the first daemon whose events you want to include in the report and press F7 to select it. Repeat this procedure for each event that you want to include in the report. When you are done, press Enter. The values that you selected are displayed in the IDs of events to INCLUDE in Report field. The HACMP daemons have the following trace IDs:
6. Enter values as necessary in the remaining fields and press Enter.
7. When the information is complete, press Enter to generate the report. The output is sent to the specified destination. For an example of a trace report, see the following Sample Trace Report section.
Sample Trace Report
The following is a sample trace report.
Wed Mar 10 13:01:37 1998
System: AIX steamer Node: 3
Machine: 000040542000
Internet Address: 00000000 0.0.0.0

trace -j 011 -s -a

ID   PROCESS NAME   I SYSTEM CALL   ELAPSED   APPL SYSCALL KERNEL INTERRUPT

001  trace    0.000000  TRACE ON channel 0
                        Fri Mar 10 13:01:38 1995
011  trace   19.569326  HACMP for AIX:clinfo Exiting Function: broadcast_map_request
011  trace   19.569336  HACMP for AIX:clinfo Entering Function: skew_delay
011  trace   19.569351  HACMP for AIX:clinfo Exiting Function: skew_delay, amount: 718650720
011  trace   19.569360  HACMP for AIX:clinfo Exiting Function: service_context
011  trace   19.569368  HACMP for AIX:clinfo Entering Function: dump_valid_nodes
011  trace   19.569380  HACMP for AIX:clinfo Entering Function: dump_valid_nodes
011  trace   19.569387  HACMP for AIX:clinfo Entering Function: dump_valid_nodes
011  trace   19.569394  HACMP for AIX:clinfo Entering Function: dump_valid_nodes
011  trace   19.569402  HACMP for AIX:clinfo Waiting for event
011  trace   22.569933  HACMP for AIX:clinfo Entering Function: service_context
011  trace   22.569995  HACMP for AIX:clinfo Cluster ID: -1
011  trace   22.570075  HACMP for AIX:clinfo Cluster ID: -1
011  trace   22.570087  HACMP for AIX:clinfo Cluster ID: -1
011  trace   22.570097  HACMP for AIX:clinfo Time Expired: -1
011  trace   22.570106  HACMP for AIX:clinfo Entering Function: broadcast_map_request
002  trace   23.575955  TRACE OFF channel 0 Wed Nov 15 13:02:01 1999
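A report like this one can be post-processed with a short script, for example to find gaps between events. The following is a minimal sketch, assuming the report file has one event per line in the column order shown in the sample (ID, process name, elapsed seconds, then text). The regular expression and the function name parse_trace_events are hypothetical and cover only that layout, not every report format the facility can produce.

```python
import re

# Three-character hook ID, process name, elapsed seconds, then free-form text.
EVENT_RE = re.compile(
    r"^(?P<hook>[0-9A-Fa-f]{3})\s+(?P<process>\S+)\s+"
    r"(?P<elapsed>\d+\.\d+)\s+(?P<text>.*)$"
)


def parse_trace_events(report):
    """Return one dict per event line; header and continuation lines are skipped."""
    events = []
    for line in report.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events.append({
                "hook": match.group("hook"),
                "process": match.group("process"),
                "elapsed": float(match.group("elapsed")),
                "text": match.group("text").strip(),
            })
    return events


# A few lines copied from the sample report above.
SAMPLE_REPORT = """\
001 trace 0.000000 TRACE ON channel 0
011 trace 19.569402 HACMP for AIX:clinfo Waiting for event
011 trace 22.569933 HACMP for AIX:clinfo Entering Function: service_context
002 trace 23.575955 TRACE OFF channel 0
"""

if __name__ == "__main__":
    events = parse_trace_events(SAMPLE_REPORT)
    # Time clinfo spent waiting before re-entering service_context.
    gap = events[2]["elapsed"] - events[1]["elapsed"]
    print(f"{len(events)} events, wait of {gap:.6f} seconds")
```

In the sample, the roughly three-second gap between "Waiting for event" and the next "Entering Function: service_context" line is the kind of interval such a script would surface.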