Recovery and problem determination

Recovery and restart
Making sure that messages are not lost (logging)
What logs look like
The log control file
Types of logging
Circular logging
Linear logging
Using checkpointing to ensure complete recovery
Checkpointing with long-running transactions
Calculating the size of the log
Managing logs
What happens when a disk gets full
Managing log files
Log file location
Using the log for recovery
Recovering from power loss or communications failures
Recovering damaged objects
Media recovery
Recovering from media images
Recovering damaged objects during startup
Recovering damaged objects at other times
Protecting WebSphere MQ log files
Backing up and restoring WebSphere MQ
Backing up queue manager data
Restoring queue manager data
Using a backup queue manager
Creating a backup queue manager
Updating a backup queue manager
Starting a backup queue manager
Recovery scenarios
Disk drive failures
Damaged queue manager object
Damaged single object
Automatic media recovery failure
Dumping the contents of the log using the dmpmqlog command
Problem determination
Preliminary checks
Has WebSphere MQ run successfully before?
Are there any error messages?
Are there any return codes explaining the problem?
Can you reproduce the problem?
Have any changes been made since the last successful run?
Has the application run successfully before?
If the application has not run successfully before
Common programming errors
Problems with commands
Does the problem affect specific parts of the network?
Does the problem occur at specific times of the day?
Is the problem intermittent?
Have you applied any service updates?
Looking at problems in more detail
Have you obtained incorrect output?
Messages that do not appear on the queue
Messages that contain unexpected or corrupted information
Problems with incorrect output when using distributed queues
Have you failed to receive a response from a PCF command?
Are some of your queues failing?
Does the problem affect only remote queues?
Is your application or system running slowly?
Tuning performance for nonpersistent messages on AIX
Application design considerations
Effect of message length
Effect of message persistence
Searching for a particular message
Queues that contain messages of different lengths
Frequency of syncpoints
Use of the MQPUT1 call
Number of threads in use
Error logs
Error log files
Early errors
An example of an error log
Error log access restrictions under UNIX systems
Ignoring error codes under UNIX systems
Ignoring error codes under Windows systems
Operator messages
Dead-letter queues
Configuration files and problem determination
Tracing
Tracing WebSphere MQ for Windows
Selective component tracing on WebSphere MQ for Windows
Trace files
An example of WebSphere MQ for Windows trace data
Tracing WebSphere MQ for UNIX systems
Selective component tracing on WebSphere MQ for UNIX systems
Example trace data for WebSphere MQ for UNIX systems
Trace files
Tracing Secure Sockets Layer (SSL) on UNIX systems
Tracing with the AIX system trace
Selective component tracing on WebSphere MQ for AIX
An example of WebSphere MQ for AIX trace data
First-failure support technology (FFST)
FFST: WebSphere MQ for Windows
FFST: WebSphere MQ for UNIX systems
Problem determination with WebSphere MQ clients
Terminating clients
Java diagnostics
Using com.ibm.mq.commonservices
Java trace and FFDC files