IBM Cluster Systems Management for Linux(R)
Overview HOWTO
Version 1 Release 1
Document Number SA22-7857-00
5799-GNJ
Note: Before using this information and the product it supports, read the information in Notices.
First Edition (June 2001)
This edition of the IBM Cluster Systems Management for Linux Overview HOWTO applies to IBM Cluster Systems Management for Linux Version 1 Release 1, program number 5799-GNJ, and to all subsequent releases of this product until otherwise indicated in new editions.
IBM(R) welcomes your comments. A form for readers' comments may be provided at the back of this publication, or you may address your comments to the following address:
If you would like a reply, be sure to include your name, address, telephone number, or FAX number.
Make sure to include the following in your comment or note:
When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 2001. All rights reserved.
U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
IBM Cluster Systems Management for Linux Overview
This HOWTO is an overview of the IBM Cluster Systems Management for Linux set of tools. It briefly describes the administrative tasks that these tools make easier and more efficient, and then directs you to detailed related information for each task.
This HOWTO is intended for system administrators who want to use Cluster Systems Management for Linux. The system administrator should have experience in UNIX(R) administration and networked systems.
This book uses the following typographic conventions:
Typographic | Usage |
---|---|
Bold |
|
Italic |
|
Constant width | Examples and information that the system displays appear in constant width typeface. |
[ ] | Brackets enclose optional items in format and syntax descriptions. |
{ } | Braces enclose a list from which you must choose an item in format and syntax descriptions. |
| | A vertical bar separates items in a list of choices. (In other words, it means "or.") |
< > | Angle brackets (less-than and greater-than) enclose the name of a key on the keyboard. For example, <Enter> refers to the key on your terminal or workstation that is labeled with the word Enter. |
... | An ellipsis indicates that you can repeat the preceding item one or more times. |
<Ctrl-x> | The notation <Ctrl-x> indicates a control character sequence. For example, <Ctrl-c> means that you hold down the control key while pressing <c>. |
\ | The continuation character is used in coding examples in this book for formatting purposes. |
IBM Cluster Systems Management for Linux Monitoring HOWTO, SA22-7852-00
IBM Cluster Systems Management for Linux Remote Control HOWTO, SA22-7856-00
IBM Cluster Systems Management for Linux Set-Up HOWTO, SA22-7853-00
IBM Cluster Systems Management for Linux Technical Reference, SA22-7851-00
The IBM Cluster Systems Management for Linux publications are available as HTML and PDF files on the CD-ROM in the /doc directory or on the installed system in the /opt/csm/doc directory.
A README is available on the CD-ROM in the root directory (/). The file names are as follows:
At the time of this release, publications for IBM Cluster Systems Management for Linux were also available at the following URL:
http://www.ibm.com/eserver/clusters/linux
IBM Cluster Systems Management for Linux (CSM) provides a distributed system management solution for machines, or nodes, that are running the Linux operating system. With this software, an administrator can easily set up and maintain a Linux cluster by using functions like monitoring, hardware control, and configuration file management. The concepts and software are derived from IBM Parallel System Support Programs for AIX (PSSP) and from applications available as open source tools.
Specifically, within the cluster, nodes can be added, removed, changed, or listed (with persistent configuration information displayed about each node in the list). Commands can be run across nodes or node groups in the cluster, and responses can be gathered. Nodes and applications can be monitored as to whether they are up or down; CPU, memory, and system utilization can be monitored; and automated responses can be run when events occur in the cluster. Configuration File Manager is provided for synchronization of files across multiple nodes. A single management server is the control point for the CSM cluster.
Note that CSM manages a loose cluster of machines. It does not provide high-availability services or fail-over technology, although high-availability clusters can be part of the set of machines that CSM is managing.
More information is provided on these tasks as follows:
The IBM Cluster Systems Management for Linux Set-Up HOWTO provides a simple process for installing and configuring CSM on an existing Linux system. This process allows you to do the following:
For more information, see the man pages or IBM Cluster Systems Management for Linux Technical Reference for the following set up commands and files:
Installs the management server
Can be used instead of definenode and installnode for suitable installations
Gathers all the information necessary to install the nodes
Installs the nodes and brings up the necessary servers on them
Node definition file for cluster nodes
The distributed management server provides a set of commands for managing nodes. It stores information about nodes in a central repository, and it defines static and dynamic node groups. These definitions are then accessible to the Configuration File Manager (cfm) command for configuration file management, to the dsh command for running shell commands remotely, to the hardware control commands, and to the Event Response subsystem (ERRM) for monitoring the cluster. All of these functions rely on the definitions stored by the nodegrp command. Thus, a node group is defined in only one place and is then accessible for use by other functions.
Persistent information on each node is kept, including operating system type, host name, machine type, model, and serial number. In addition, the status of the node is determined periodically by means of the fping command.
The node and node group commands are built on top of a Perl DBI layer backed by a set of DBDs (database drivers) so that data can be stored in a variety of formats and shared with other tools.
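The idea of defining a node group in one place and reusing it everywhere can be sketched in a few lines of Python. This is only a conceptual model, not the CSM implementation: the class names, field names, and methods below are hypothetical, and the sketch keeps everything in memory rather than going through Perl DBI.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Node:
    """Persistent attributes kept for one node (field names are illustrative)."""
    hostname: str
    os_type: str
    machine_type: str
    model: str
    serial: str
    reachable: bool = False  # cached status, refreshed periodically elsewhere


@dataclass
class NodeRepository:
    """Hypothetical single store for nodes and node group definitions."""
    nodes: Dict[str, Node] = field(default_factory=dict)
    static_groups: Dict[str, Set[str]] = field(default_factory=dict)
    dynamic_groups: Dict[str, Callable[[Node], bool]] = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        self.nodes[node.hostname] = node

    def define_static_group(self, name: str, hostnames: List[str]) -> None:
        # A static group is an explicit list of member host names.
        self.static_groups[name] = set(hostnames)

    def define_dynamic_group(self, name: str, rule: Callable[[Node], bool]) -> None:
        # A dynamic group is a rule evaluated against current node attributes.
        self.dynamic_groups[name] = rule

    def members(self, name: str) -> List[str]:
        """Resolve a group name to a sorted list of known host names."""
        if name in self.static_groups:
            return sorted(h for h in self.static_groups[name] if h in self.nodes)
        if name in self.dynamic_groups:
            rule = self.dynamic_groups[name]
            return sorted(h for h, n in self.nodes.items() if rule(n))
        raise KeyError(f"unknown node group: {name}")
```

Any consumer (file distribution, remote shell, monitoring) would call `members()` with a group name, so membership of a dynamic group changes automatically when node attributes change.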
See the man pages or the IBM Cluster Systems Management for Linux Technical Reference for details on the following commands that manage node and node-group information:
Adds a node to the CSM cluster database.
Changes an attribute of a node in the CSM cluster database.
Allows fping and power status parameters to be changed.
Displays information about the nodes in the CSM cluster, for example, the cached status on whether the node has been reachable.
Defines node groups within the CSM cluster for use by other functions such as the configuration file manager, the dsh command, the event response subsystem, and the hardware control commands.
Removes a node from the CSM cluster database.
Specifies the repository for node information.
You can control the hardware on remote nodes by using the remote control commands. For example, you can control computers on a ship from an office on the mainland, provided the correct connectivity exists.
See IBM Cluster Systems Management for Linux Remote Control HOWTO for details on how to set up remote power control. See the man pages or IBM Cluster Systems Management for Linux Technical Reference for details on the following commands:
Opens a remote console.
Boots and resets hardware, powers hardware on and off, and queries the power state; for example, the resetsp option resets the service processor.
The distributed shell (dsh) command runs commands remotely across multiple nodes. It optionally can use any underlying remote shell that is specified by the user (for instance, a remote secure shell that complies with the IETF (Internet Engineering Task Force) Secure Shell protocol). By default, rsh is used.
The dsh command can retrieve a complete list of the nodes in the CSM cluster or the list of nodes in a specified node group.
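As a rough illustration of the fan-out that dsh performs, the following Python sketch builds one remote-shell invocation per node. It is not the dsh implementation. The function name is hypothetical, and the sketch assumes that the remote shell accepts the form `<shell> <node> <command>`, as rsh does. The shell is taken from the DSH_REMOTE_CMD environment variable when it is set, and falls back to rsh otherwise.

```python
import os
import shlex
from typing import Dict, List, Mapping, Optional


def build_remote_commands(
    nodes: List[str],
    command: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, List[str]]:
    """Return the argument vector to run `command` on each node.

    Nothing is executed here; the caller decides how to run the result.
    """
    env = os.environ if env is None else env
    # Use the remote shell named in DSH_REMOTE_CMD, or rsh by default.
    remote_shell = shlex.split(env.get("DSH_REMOTE_CMD") or "rsh")
    return {node: remote_shell + [node, command] for node in nodes}
```

Passing the environment as a parameter keeps the function easy to test. In real use, the node list would come from the cluster's node or node-group definitions rather than being typed by hand.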
See the man pages or IBM Cluster Systems Management for Linux Technical Reference for details on the following commands:
Issues remote shell commands and the options associated with them to multiple nodes.
Presents formatted output from the dsh command.
Configuration File Manager provides a file repository for the common configuration files among nodes in a cluster. In general, all the configuration files that need to be shared are stored in one location on the management server. Changes to these files are propagated and synchronized throughout the cluster. Though the files are common, there are mechanisms to allow for variations based on groups, IP address, and host name.
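One way to picture per-group, per-address, and per-host variations of a common file is the lookup sketched below. This is a conceptual model only. The repository layout, the selector names, and the precedence order (host name first, then IP address, then group, then the common copy) are all assumptions made for illustration and are not taken from the product.

```python
from typing import Dict, Iterable, Optional, Tuple

# Key: (file path, selector kind, selector value).
# The common copy uses kind "common" with an empty value.
Repository = Dict[Tuple[str, str, str], str]


def select_variant(
    repo: Repository,
    path: str,
    hostname: str,
    ip_address: str,
    groups: Iterable[str],
) -> Optional[str]:
    """Return the most specific stored content for `path` on a given node."""
    # Assumed precedence: most specific selector wins.
    candidates = [("host", hostname), ("ip", ip_address)]
    candidates += [("group", g) for g in sorted(groups)]
    candidates.append(("common", ""))
    for kind, value in candidates:
        content = repo.get((path, kind, value))
        if content is not None:
            return content
    return None  # the file is not managed in this repository
```

Sorting the group names makes the result deterministic when a node belongs to several groups that each carry their own variant.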
Configuration File Manager is built on top of the GNU software package cfengine. The cfengine software package is a scripting package that uses a class-based decision structure to test and configure UNIX-like systems attached to a TCP/IP network. There are many capabilities built into cfengine itself, which a system administrator can use over and above what Configuration File Manager uses.
Configuration File Manager greatly enhances the copy functionality and usability of cfengine by providing the concept of a repository. Instead of requiring an administrator to write a cfengine script to keep files up to date, the repository allows automatic updating without script changes.
See the man pages or IBM Cluster Systems Management for Linux Technical Reference for more details on the cfm and cforce commands.
At the time this document was written, detailed information on cfengine could be found at the following URL: http://www.iu.hioslo.no/cfengine
CSM provides a flexible distributed system monitoring application. This monitoring application allows the administrator to define conditions of interest to monitor on the system. An event occurs when a monitored condition of interest reaches a threshold that is defined in an event expression. When an event occurs, automated responses to that event take place. Many actions can be defined as part of these responses, including notification, running a predefined response script, or running a user-defined script.
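The relationship between a condition, its event expression, and its responses can be modeled in a few lines of Python. This is a minimal sketch of the concept, not the monitoring application's interface: the class, the attribute names, and the sample format are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# One observation of monitored values, for example {"cpu_percent": 95.0}.
Sample = Dict[str, float]


@dataclass
class Condition:
    """A monitored condition with an event expression and attached responses."""
    name: str
    # True when the sample reaches the threshold of interest.
    event_expression: Callable[[Sample], bool]
    # Actions to run when an event occurs (notify, run a script, and so on).
    responses: List[Callable[[str, Sample], None]] = field(default_factory=list)

    def evaluate(self, sample: Sample) -> bool:
        """Run every response if the event expression holds for this sample."""
        if not self.event_expression(sample):
            return False
        for respond in self.responses:
            respond(self.name, sample)
        return True
```

For example, a condition whose event expression is `lambda s: s["cpu_percent"] > 90.0` and whose single response appends to a log would record an entry only for samples above 90 percent.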
A full set of commands is provided to tailor this application to your needs. In addition, predefined conditions and responses are provided for easy implementation so that you can get up and running quickly and easily. This rich set of predefined conditions and responses can be used directly or can be taken as examples to be copied and modified. Among the system resources that can be monitored are:
The application, its components, and the predefined conditions and responses are fully described in the IBM Cluster Systems Management for Linux Monitoring HOWTO. The commands are available as man pages and are also compiled for easy reference in the IBM Cluster Systems Management for Linux Technical Reference.
Security on a single system is provided by the operating system in that only root can run or modify functions. Flexibility is provided for the degree of security required by the specific environment because remote shells that conform to the IETF (Internet Engineering Task Force) Secure Shell protocol can be specified by using the dsh command for the appropriate situations. Network security for other functions is built on the identd function.
See the IBM Cluster Systems Management for Linux Monitoring HOWTO for details on authorization and the dsh man page or the IBM Cluster Systems Management for Linux Technical Reference for details on how to specify the remote shell of your choice by using the DSH_REMOTE_CMD environment variable.
Cluster Systems Management (CSM) makes use of several other tools. It is helpful to understand the relationship between CSM and these tools in order to diagnose problems. The tools that CSM uses are described in the following table:
Tool | What It Does |
---|---|
Perl DBI package | Stores database information in a variety of formats |
Resource Monitoring and Control (RMC) subsystem | Monitors conditions and communicates with all nodes. RMC needs to be running on each node, and the security access control list (ACL) file needs to allow the nodes to communicate with the management server. See "Security Considerations" in the "Overview" chapter of the IBM Cluster Systems Management for Linux Monitoring HOWTO. |
dsh | Runs some commands on the nodes. Security needs to be set up on each node to allow this for the remote shell that is used by dsh. The default remote shell is rsh. |
fping | Periodically gets the status of each node |
cfengine | Transfers files for the Configuration File Manager |
Here are a few tips to help diagnose problems with a CSM cluster:
lsnode -Al
lssrc -a
lsaudrec
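When collecting information for a problem report, it can be convenient to capture the output of the three commands above in one pass. The following Python sketch does that under some assumptions: the function name is hypothetical, the commands are expected to be on the PATH of the management server, and any command that is not found is recorded as missing rather than treated as an error.

```python
import shutil
import subprocess
from typing import Dict, List, Optional, Sequence

# The diagnostic commands listed in this section.
DIAGNOSTIC_COMMANDS: List[List[str]] = [
    ["lsnode", "-Al"],
    ["lssrc", "-a"],
    ["lsaudrec"],
]


def collect_diagnostics(
    commands: Sequence[Sequence[str]] = DIAGNOSTIC_COMMANDS,
    timeout: float = 60.0,
) -> Dict[str, Optional[str]]:
    """Run each command and return its combined output keyed by command line.

    A value of None means the command was not found on this system.
    """
    results: Dict[str, Optional[str]] = {}
    for argv in commands:
        label = " ".join(argv)
        if shutil.which(argv[0]) is None:
            results[label] = None
            continue
        completed = subprocess.run(
            list(argv), capture_output=True, text=True,
            timeout=timeout, check=False,
        )
        results[label] = completed.stdout + completed.stderr
    return results
```

Calling `collect_diagnostics()` with no arguments runs the three commands from this section. A different list can be passed to include other commands.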
See the "Diagnosis Information" chapter and the "Security Considerations" section of the "Overview" chapter in the IBM Cluster Systems Management for Linux Monitoring HOWTO for troubleshooting hints and tips and for detailed information on authorization and the ACL file respectively. See the ACL File FAQ in the IBM Cluster Systems Management for Linux Set-Up HOWTO for information on troubleshooting the RMC ACL file.
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.
The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
The following trademarks apply to this book:
IBM and AIX are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
Red Hat and RPM are trademarks of Red Hat, Inc.
UNIX is a registered trademark in the United States and other countries licensed exclusively through The Open Group.
Other company, product, and service names may be the trademarks or service marks of others.
IBM Cluster Systems Management for Linux includes software that is publicly available:
This book discusses the use of these products only as they apply specifically to the IBM Cluster Systems Management for Linux product.