How to integrate Nagios and CCMS to monitor SAP system landscapes

Stefan Schindewolf

<sch1nd0r@gmx.net>

Copyright 2004 by Stefan Schindewolf

2004-07-28 ver 1.0


Abstract

SAP AG's Computing Center Management System (CCMS) is a unified platform for
monitoring and administering all systems in an SAP system landscape.
Nearly everything can be checked by CCMS, from the lowest hardware level
through operating system data and databases up to whole business processes in
an ERP system.

Nagios is an open source tool for system monitoring that also offers a wide
range of tools to check and manage a system landscape. It displays its data in
a browser interface and is capable of reacting automatically to failures,
following escalation routes and notifying admins via mail, SMS or pager.

This document explains how to configure Nagios so that it collects data from
SAP systems. A solid working knowledge of Nagios is recommended.

...............................................................................

Table of Contents
1. Administrata
   1.1. Authorship and Copyright
   1.2. Acknowledgements
   1.3. Comments and Corrections
   1.4. Latest Versions and Translations

2. How Nagios works
   2.1. The Architecture
   2.2. The Plugins

3. How CCMS works
   3.1. The Architecture
   3.2. Important Technical Terms
   3.3. How CCMS Data can be retrieved by Nagios

4. How to configure Nagios
   4.1. Installing the CCMS Plugins
      4.1.1. RPM installation
      4.1.2. Source tarball installation
      4.1.3. Further steps
   4.2. Features and Choosing Configuration
   4.3. Preparations in your SAP Systems
   4.4. The Files in /etc/sapmon
      4.4.1. The agent.cfg
      4.4.2. The login.cfg
      4.4.3. The moni_tr.cfg
   4.5. Nagios Configuration
      4.5.1. Nagios' hosts.cfg
      4.5.2. Nagios' commands.cfg
      4.5.3. Nagios' services.cfg
   4.6. Testing and Troubleshooting

5. Errata
   5.1. Resources

...............................................................................



1. Administrata

1.1. Authorship and Copyright

This document is copyright (c) 2004 Stefan Schindewolf, <sch1nd0r@gmx.net>.
Permission is granted to copy, distribute and/or modify this document under
the terms of the Open Software License (OSL), version 1.1, except for the
provisions I list in the next paragraph. The OSL license terms can be found at
http://opensource.org/licences/osl-1.1.txt.

If you want to create a derivative work or publish this HOWTO for commercial
purposes, I would really appreciate it if you contacted me first. You will
then be given the most recent version of this document. Additionally, it would
be very nice if you provided me with a copy of your work, so that the benefit
is also on my side of the table. If you spot factual errors, please do not
hesitate to provide me with a corrected version of this document.
-------------------------------------------------------------------------------

1.2. Acknowledgements
Thanks to the SAP LinuxLab for all the great work they do and for the support
they give on even the smallest of problems.
On their behalf I took over the maintenance of this project. If anyone wants to
join and help me, I would more than appreciate it!
-------------------------------------------------------------------------------

1.3. Comments and Corrections
There are no Comments or Corrections in version 1.0 at this time.
-------------------------------------------------------------------------------

1.4. Latest Versions and Translations
You are now reading the latest version. No translations into other languages
are planned.
-------------------------------------------------------------------------------



2. How Nagios works

2.1. The Architecture

Nagios is a plugin-based system monitoring solution. The plugins can be
written in any language. Nagios itself merely starts the plugin in a shell and
reads the command's output and return code. Therefore the plugin should always
return an exit code and some text. The text is later displayed in the "Status
Information" field in Nagios, while the exit code is the basis for the
"Status" field (OK, Warning, Critical, Unknown).
Nagios is free software under the terms of the GPL and can be obtained via
http://www.nagios.org
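
The contract described above can be illustrated from the calling side. The
following Python sketch is not how Nagios itself is implemented; it only
mimics the idea of starting a plugin, reading its output and mapping the exit
code to a status. The function name run_plugin is made up for this
illustration. The numeric codes 0 to 3 are the standard Nagios plugin return
codes.

```python
import subprocess

# Standard Nagios plugin return codes and the status they stand for.
STATUS_BY_EXIT_CODE = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=60):
    """Start a plugin and return (status, status_information).

    Hypothetical helper for illustration only. argv is the plugin command
    line as a list of strings.
    """
    result = subprocess.run(argv, capture_output=True, text=True,
                            timeout=timeout)
    # Any exit code outside 0..3 is treated as UNKNOWN in this sketch.
    status = STATUS_BY_EXIT_CODE.get(result.returncode, "UNKNOWN")
    # Only the first line of the output is used as the status text here.
    lines = result.stdout.splitlines()
    text = lines[0] if lines else "(no output)"
    return status, text
```

Calling run_plugin with the command line of any plugin returns the pair that
would end up in the "Status" and "Status Information" fields.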
-------------------------------------------------------------------------------


2.2. The Plugins

Nagios' plugins provide the actual data gathering for the Nagios core. They
connect to hosts and databases, collect various data about system resources
and deliver the data back to Nagios. The plugins for Nagios can be obtained via
http://nagiosplug.sourceforge.net
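
To give an idea of what such a plugin looks like from the inside, here is a
minimal sketch in Python. It is not one of the official plugins. The function
names and the default thresholds are my own assumptions, and it only works on
Unix-like systems where os.getloadavg() is available.

```python
import os

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios return codes


def evaluate_load(load, warn, crit):
    """Compare a load value with two thresholds; return (exit_code, text)."""
    if load >= crit:
        return CRITICAL, f"CRITICAL - load average {load:.2f} >= {crit:.2f}"
    if load >= warn:
        return WARNING, f"WARNING - load average {load:.2f} >= {warn:.2f}"
    return OK, f"OK - load average {load:.2f}"


def main(warn=4.0, crit=8.0):
    """Gather the 5 minute load average of the local host and report it."""
    try:
        load = os.getloadavg()[1]  # tuple of (1 min, 5 min, 15 min)
    except OSError:
        print("UNKNOWN - load average not available")
        return UNKNOWN
    code, text = evaluate_load(load, warn, crit)
    print(text)   # the text Nagios reads from the plugin
    return code   # the value to be used as the exit code
```

A real plugin script would finish by passing the result of main() to
sys.exit(), so that Nagios receives the value as the return code.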
-------------------------------------------------------------------------------



3. How CCMS works
3.1. The Architecture

The CCMS Monitoring Architecture was introduced with SAP Basis Release 4.0.
It is a system that reports problems (called alerts) in a tree structure and
assigns a warning level to each of them (green, yellow, red).
CCMS consists of several layers and different elements, which makes it very
flexible and capable.
The basis of CCMS are the Data Suppliers. They provide monitoring data for
external monitoring systems, the so-called Data Consumers.
The SAP Alert Monitor is such a Data Consumer. It provides the necessary
infrastructure to control monitoring objects and their thresholds and to
raise alerts to administrators.

For more information on CCMS please see http://service.sap.com/monitoring
-------------------------------------------------------------------------------


3.2. Important Technical Terms

Alert -   Problem notification within the CCMS Alert Monitor. The values of a
          data supplier are compared to the given thresholds. The
          administrator's attention is then drawn to the problems by
          different colors.

Monitor - Collection of MTEs (Monitor Tree Elements), assembled in a hierarchy

MTE -     Monitor Tree Elements are nodes in the monitor tree. There are three
          different kinds: monitor attributes, monitor objects and monitor
          collections.

Monitor
objects - Represent objects which you can monitor. They can be database
          tablespaces, hard disks or SAP System components.

Monitor
attributes - These are the basic elements in the monitor tree. They describe
             the status of the monitor objects. Four different kinds exist:
             Performance attr., Status attr., Protocol attr. and Text attr.
             Except for Text attr. all attributes can raise alerts (and are
             therefore valuable to Nagios).
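
The relationship between these terms can be sketched as a small data model.
The Python classes below are my own illustration and not part of any SAP
interface. Monitor collections are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from enum import Enum


class AttributeKind(Enum):
    PERFORMANCE = "Performance"
    STATUS = "Status"
    PROTOCOL = "Protocol"
    TEXT = "Text"


@dataclass
class MonitorAttribute:
    name: str
    kind: AttributeKind

    def can_raise_alert(self):
        # Every kind except Text attributes can raise alerts.
        return self.kind is not AttributeKind.TEXT


@dataclass
class MonitorObject:
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class Monitor:
    name: str
    objects: list = field(default_factory=list)

    def alert_capable(self):
        """Return 'object/attribute' names that are useful for Nagios."""
        return [
            f"{obj.name}/{attr.name}"
            for obj in self.objects
            for attr in obj.attributes
            if attr.can_raise_alert()
        ]
```

A monitor "Operating System" with a monitor object "CPU" that carries a
performance attribute and a text attribute would report only the performance
attribute from alert_capable().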
-------------------------------------------------------------------------------


3.3. How CCMS Data can be retrieved by Nagios
-------------------------------------------------------------------------------



4. How to configure Nagios

4.1. Installing the CCMS Plugins

The easiest way to install at the moment is via RPM. Please download the RPMs
from one of the various FTP servers of the SuSE company. I will provide source
RPMs and self-compiled RPMs as soon as possible. Additionally, there will be a
tarball soon.
-------------------------------------------------------------------------------


4.1.1. RPM installation

Please refer to the most recent version of SuSE Linux Professional and take
the packages from its FTP server. You should always install the whole Nagios
suite from RPM. Do not compile Nagios yourself and then try to install the
RPMs, because the source code version of Nagios lacks certain features we
will need for the CCMS Plugins.

Example:
- Go to ftp.gwdg.de and enter the directory /pub/suse/i386/9.1/suse/i586
- Download all packages with "mget nagios*"
- Install the RPMs on your Linux box

Now you only have to edit the configuration files and start Nagios.
-------------------------------------------------------------------------------


4.1.2. Source tarball installation

This section will follow once I have finished the source tarballs.
-------------------------------------------------------------------------------


4.1.3. Further steps

Now please configure your Nagios system for all the monitoring you want to do
except the SAP part. I will not cover the whole configuration procedure of
Nagios in this document. It is documented well and in detail at
http://www.nagios.org/docs/
-------------------------------------------------------------------------------


4.2. Features and Choosing Configuration

- Starting/Stopping of SAP Systems
- Start SAP GUI and connect to SAP System
- Open Secure Shell to SAP Host
Provided that you configured your Nagios "alias" values according to the SAP
standard, you will be able to start/stop your SAP System by clicking on the
host name in Nagios and choosing the option in the next window. The same
applies to the SSH and SAP GUI connections.

- Change thresholds of CCMS Monitors
By clicking on the Nagios command result on the right of the Nagios status
screen you will be able to change the thresholds of the queried monitors.

- Command List:
These commands provide HTML/CGI output for Nagios, enabling the features
mentioned above. Note that for full feature use you will have to install the
SuSE Nagios version, which provides the correct interface for all features.

check_sap		Shows a single value from the given monitor template
check_sap_multiple	Shows a set of values at a time
check_sap_instance	Provides SAP standard data for an app server - CCMS
			Ping has to be installed on the Host
check_sap_system	Checks a whole system landscape including application
			servers and their data

The following commands provide the same data as the ones above, but their
output is plain text, making them compatible with every Nagios version.

check_sap_cons
check_sap_system_cons
check_sap_instance_cons
check_sap_mult_no_thr	Console version of check_sap_multiple
-------------------------------------------------------------------------------


4.3. Preparations in your SAP Systems

Once you have chosen a scenario for Nagios (see section 3.3 for details)
you can continue with the configuration of the Solution Manager or your SAP
systems.
In each SAP instance you want to monitor with Nagios you will have to create
an RFC user with minimum authorizations. Or, as already mentioned, you create
the user once in the Solution Manager.
In most cases you will only want to check your production systems and leave
the rest unmonitored, so this procedure should have minimal impact.
I will describe the procedure for an R/3 4.6C system; it might differ for
other SAP products.

First create a role with the required authorizations in your SAP system.
Please check with your user administration for a minimum set of
authorizations and an explicit role.

For example:
- Log on to your SAP System
- Go to transaction "PFCG"
- Choose "0000:B:NAG" as role name and "Credentials for Nagios user" as
description
- Select Create from the buttons below
- Select the authorizations for your role. As I did not perform this step
myself, I strongly recommend that you let an experienced role administrator
do this.
- Apply the settings, generate the role and exit

The next step is to create the user and assign the role to it.

Example:
- Log on to your SAP System
- Go to transaction "SU01"
- Choose "nagios" as the user id
- Select Create from the buttons above
- Maintain minimum personal data for this user
- Select the tab "Logon Data"
- Choose "User group for authorization check" -> SUPPORT
- Choose "User type" -> SERVICE
- Choose "Initial Password" -> "monitor"
- Select tab "Roles"
- Choose "0000:B:NAG"
- Select tab "Profiles"
- Choose "0000:B:NAG"
- Save your data and exit

You have now created your RFC user for Nagios. After editing the config files
for the CCMS Plugins you should be able to retrieve data from your system.
------------------------------------------------------------------------------


4.4. The Files in /etc/sapmon

In a standard installation all the configuration data of the plugins is stored
in /etc/sapmon. A detailed description of each file follows.
------------------------------------------------------------------------------


4.4.1. The agent.cfg

This file provides information about the CCMS Agents you want to query with
Nagios. These Agents are represented in this file as TEMPLATES.
There are two kinds of templates, simple ones and extended ones.
------------------------------------------------------------------------------

4.4.1.1. Command structure and names of the templates

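The command structure of the Nagios plugins is as follows (see also sections
4.5.3. and 4.6.):

<command> <Template Name> <RFC Template>

<Template Name> is the <NAME> part of a [TEMPLATE_<NAME>] entry in this file.
<RFC Template> selects the logon data, which is described in detail in section
4.4.2.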

4.4.1.2. Simple Templates

Simple Templates:
[TEMPLATE_<NAME>]                            Choose a name for the template
DESCRIPTION = <Description of the Template>  Arbitrary description field
SYSTEM = <SAPSID>                            Wildcards may be used
APPL-SERVER = <Logical System Name>          Wildcards may be used
VALUE = <What to retrieve>                   What do you want to monitor?

Example for a simple template:
[TEMPLATE_MYFIRSTTEMPLATE]
SYSTEM = P03
APPL-SERVER = *
VALUE = DIALOG_RESPONSE_TIME

This template will show the dialog response time of all application servers of
the system P03.
Note: The VALUE parameter is a string which has to be copied character for
character from the monitor names in transaction RZ20 in your SAP system.
Important: The name of the template in TEMPLATE_<NAME> must be in upper case!
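
If you want to check such a file before pointing the plugins at it, a small
script can help. The following Python sketch assumes that agent.cfg can be
read by Python's configparser module, which I have not verified against the
parser used by the plugins. The function name load_templates is made up for
this example.

```python
import configparser

EXAMPLE = """\
[TEMPLATE_MYFIRSTTEMPLATE]
SYSTEM = P03
APPL-SERVER = *
VALUE = DIALOG_RESPONSE_TIME
"""


def load_templates(text):
    """Parse agent.cfg-style text; return {template name: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys such as APPL-SERVER in upper case
    parser.read_string(text)
    templates = {}
    for section in parser.sections():
        if not section.upper().startswith("TEMPLATE_"):
            continue  # not a template section
        if section != section.upper():
            raise ValueError(f"template name must be upper case: [{section}]")
        templates[section[len("TEMPLATE_"):]] = dict(parser[section])
    return templates
```

load_templates(EXAMPLE) returns one entry named MYFIRSTTEMPLATE with the three
parameters of the example, and it raises a ValueError for a template name
containing lower case letters.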
-------------------------------------------------------------------------------

4.4.1.3. Extended Templates

Extended Templates provide much more information as they can be used to query
whole CCMS monitor sets. Wildcards are also allowed.

The templates are structured as follows:

[TEMPLATE_<NAME>]
DESCRIPTION = <Description of the Template>	Arbitrary description field
MONI_SET_NAME= <Monitor collection>		Monitor collection in RZ20
MONI_NAME= <Monitor name>			Name of the specific monitor
MAX_TREE_DEPTH = <number>			Monitor loading depth
PATTERN_0=<SAPSID>\<Context>\<Monitor object>\<Monitor attribute>

MONI_SET_NAME is the name of the monitor collection in the entry screen of
transaction RZ20, e.g. "SAP Admin Workplace" or any self-configured monitor
set.
MONI_NAME is the name of the monitor within the given monitor set, for
example "Operating System".
MAX_TREE_DEPTH controls how many levels of the monitor are loaded. In RZ20
you can expand the monitor tree to see the monitor objects and attributes.
Each expansion level is a step in the monitor tree. The default value is 0,
which loads the whole monitor.
PATTERN_0 is a pattern which you can use to select the data provided to
Nagios at a finer granularity.
Examples:
Examples:

PATTERN_0="P03\sapserv01_P03_00\CPU\5minLoadAverage" selects the monitor
attribute "5minLoadAverage" from the monitor object "CPU". The server
(context) will be "sapserv01_P03_00".

PATTERN_0="P03\*\CPU\5minLoadAverage" will select the same data as above but
for all servers of the system landscape of system P03.

PATTERN_0="*\*\/oracle\Percentage_Used" will select the "Percentage_Used"
value of the filesystem "/oracle" in the system against which the command is
run.

Important: The name of the template in TEMPLATE_<NAME> must be in upper case!
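
To get a feeling for how such a pattern selects data, the matching can be
imitated in a few lines of Python. This is only my sketch of the idea: it
assumes that a four-part pattern is split at the backslashes and that each
part is compared separately with shell-style wildcards. The real matching is
done by the plugins and may differ in detail.

```python
from fnmatch import fnmatchcase


def pattern_matches(pattern, mte_path):
    """Compare a backslash-separated pattern with a path, part by part."""
    pattern_parts = pattern.split("\\")
    path_parts = mte_path.split("\\")
    if len(pattern_parts) != len(path_parts):
        return False
    # Each part (SAPSID, context, object, attribute) must match on its own.
    return all(
        fnmatchcase(value, part)
        for part, value in zip(pattern_parts, path_parts)
    )
```

With this function the second example pattern matches the attribute
"5minLoadAverage" of "CPU" on any server of P03, while the third pattern
matches only "Percentage_Used" of "/oracle".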
-------------------------------------------------------------------------------


4.4.2. The login.cfg

This file provides the logon information for each SAP System you want to check
with Nagios. Please remove its permissions for the group and for others so
that no one can spy out your logon data. Again: this file is a potential
security risk to your system landscape! Make it safe and do not assign any
administrative roles or profiles to your Nagios RFC user!
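
Whether the file is really closed to the group and to others can be checked
with a few lines of Python. This is a sketch with function names of my own
choosing. It assumes a Unix file system and that the file belongs to the user
who runs the plugins.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any access to path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def make_private(path):
    """Restrict path to read/write for its owner, like 'chmod 600'."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling make_private() on the login.cfg and then is_private() on the same path
should return True.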

The entries in this file follow this scheme:
[LOGIN_<HOSTNAME>]
LOGIN=-d <SID> -u <RFC User> -p <RFC Password> -c <Client> -h <HOSTNAME> -s \
<System Number>

For instance:
[LOGIN_sapser01]
LOGIN=-d P03 -u nagios -p monitor -c 030 -h sapser01 -s 02

This will log on the user "nagios" with the password "monitor" to the SAP
System "P03" on host "sapser01". Client will be "030", system number "02".
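
The meaning of the individual flags can be made explicit with a small parser.
The Python sketch below is only an illustration. The field names and the
assumption that the value is split like a shell command line are mine.

```python
import shlex

# Meaning of the flags as given in the scheme above.
FLAG_NAMES = {"-d": "sid", "-u": "user", "-p": "password",
              "-c": "client", "-h": "host", "-s": "system_number"}


def parse_login(value):
    """Turn the right-hand side of a LOGIN= line into a dict."""
    tokens = shlex.split(value)
    if len(tokens) % 2 != 0:
        raise ValueError("every flag needs exactly one value")
    fields = {}
    for flag, arg in zip(tokens[0::2], tokens[1::2]):
        if flag not in FLAG_NAMES:
            raise ValueError(f"unknown flag: {flag}")
        fields[FLAG_NAMES[flag]] = arg
    missing = set(FLAG_NAMES.values()) - set(fields)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return fields
```

For the example entry, parse_login() returns the SID "P03", the user "nagios",
the client "030", the host "sapser01" and the system number "02", together
with the password.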
-------------------------------------------------------------------------------


4.4.3. The moni_tr.cfg

Nothing to edit here for Nagios Admins.
-------------------------------------------------------------------------------


4.5. Nagios Configuration

Depending on which features of the CCMS plugins you would like to use, several
changes to your Nagios installation and configuration have to be made.
Note that the output of your commands should not exceed 80 characters, as this
is the maximum Nagios will accept. This is especially important if you use the
commands providing HTML output.
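
If you wrap a command in a script of your own, the length limit can be
enforced before the text reaches Nagios. The following Python function is a
sketch of that idea. Its name is made up, and it simply cuts the first line,
which may break HTML output in the middle of a tag.

```python
MAX_OUTPUT = 80  # limit stated in this section


def fit_output(text, limit=MAX_OUTPUT):
    """Return the first line of text, cut to at most limit characters."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    if len(first_line) <= limit:
        return first_line
    # Mark the cut so that a shortened message is recognisable.
    return first_line[: limit - 3] + "..."
```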
-------------------------------------------------------------------------------


4.5.1. Nagios' hosts.cfg

Define your SAP host like any other host in your Nagios landscape. After that
set the parameter "alias" to "<HOSTNAME>_<SAPSID>_<SYSNR>".
Example:

define host{
	host_name	sapserv01
	alias		sapserv01_P03_00
	address		10.17.72.234
	....
}
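
If you generate your host definitions with a script, the alias can be built
and checked with two small helper functions. This Python sketch uses function
names of my own invention.

```python
def make_alias(hostname, sapsid, sysnr):
    """Build '<HOSTNAME>_<SAPSID>_<SYSNR>' for the "alias" parameter."""
    return f"{hostname}_{sapsid}_{sysnr}"


def split_alias(alias):
    """Split an alias back into (hostname, sapsid, sysnr)."""
    # Split from the right so that a host name containing "_" stays intact.
    parts = alias.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"not of the form HOSTNAME_SAPSID_SYSNR: {alias}")
    return tuple(parts)
```

For the host above, make_alias("sapserv01", "P03", "00") yields the alias
used in the example.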
-------------------------------------------------------------------------------


4.5.2. Nagios' commands.cfg

The CCMS commands have to be defined as follows:

define command{
	command_name	check_sap
	command_line	$USER1$/check_sap $ARG1$ $ARG2$
}

Repeat this definition for each command you would like to use. See section
4.2. "Features and Choosing Configuration" for all commands you can configure.
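
Instead of typing every definition by hand, the blocks can be generated. The
Python sketch below prints one definition per command name from section 4.2.
It assumes that every command takes the same two arguments as check_sap, which
you should verify for the commands you actually use.

```python
# Command names as listed in section 4.2.
COMMANDS = [
    "check_sap", "check_sap_multiple", "check_sap_instance",
    "check_sap_system", "check_sap_cons", "check_sap_system_cons",
    "check_sap_instance_cons", "check_sap_mult_no_thr",
]


def command_definition(name):
    """Render one 'define command' block in the form shown above."""
    return (
        "define command{\n"
        f"\tcommand_name\t{name}\n"
        f"\tcommand_line\t$USER1$/{name} $ARG1$ $ARG2$\n"
        "}\n"
    )


def all_definitions(names=COMMANDS):
    """Render the blocks for all names, separated by blank lines."""
    return "\n".join(command_definition(name) for name in names)
```

The output of print(all_definitions()) can be pasted into the commands.cfg.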
-------------------------------------------------------------------------------


4.5.3. Nagios' services.cfg

A service for CCMS is defined like any other service for Nagios. The most
important line is the check_command, where you define the parameters for your
check commands.

define service{
	use		generic-service
	host_name	sapserv01
	....
	check_command	check_sap!<Template Name>!<RFC Template>
	...
}
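
How the parts of the check_command end up on the command line can be shown
with a short Python function. Nagios splits the value at the "!" characters
and fills the macros $ARG1$, $ARG2$ and so on. The function below is a
simplified imitation of that step, not Nagios code, and it ignores escaping.
The values "01", "sapserv01" and the plugin directory in the usage note are
hypothetical.

```python
def expand_check_command(check_command, command_line, user1):
    """Show how a check_command value fills the $ARGn$ macros."""
    name, *args = check_command.split("!")
    # $USER1$ normally holds the plugin directory.
    expanded = command_line.replace("$USER1$", user1)
    for number, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{number}$", value)
    return name, expanded
```

For example, the value "check_sap!01!sapserv01" combined with the command line
"$USER1$/check_sap $ARG1$ $ARG2$" and the directory "/usr/lib/nagios/plugins"
expands to "/usr/lib/nagios/plugins/check_sap 01 sapserv01".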
-------------------------------------------------------------------------------


4.6. Testing and Troubleshooting

Use the console commands (e.g. check_sap_cons) to test your configuration.
A file named "def_rfc.trc" will be generated in your working directory,
indicating problems with your logon data.
If your command does not return any data, perhaps you did not configure your
agent.cfg correctly. Look out for misspellings, especially upper/lower case
mistakes.
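
When testing repeatedly it is handy to look at the end of the trace file
automatically. The following Python sketch does nothing more than that. The
function name is made up, and nothing is assumed about the format of the file
beyond it being text.

```python
from pathlib import Path


def read_rfc_trace(workdir=".", max_lines=20):
    """Return the last lines of def_rfc.trc in workdir, or [] if absent."""
    trace = Path(workdir) / "def_rfc.trc"
    if not trace.is_file():
        return []
    lines = trace.read_text(errors="replace").splitlines()
    return lines[-max_lines:]
```

An empty result means that no "def_rfc.trc" exists in the given directory.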

Testing extended templates:
Start configuring your extended template like this:
[TEMPLATE_01] 
DESCRIPTION=Load average<> 
MONI_SET_NAME=SAP CCMS Admin Workplace 
MONI_NAME=Operating system 
MAX_TREE_DEPTH=0 
PATTERN_0="*" 

Then check it with ./check_sap_cons 01 <RFC Template>. Provided you have
spelled MONI_SET_NAME and MONI_NAME correctly, you should receive a list from
your SAP system with all monitor contexts you can query.
Once you have the list, edit PATTERN_0 to refine your list. Repeat this until
you have configured your template correctly.
-------------------------------------------------------------------------------


5. Errata

Feel free to use this document as a basis for your Nagios and CCMS
configuration, but only at your own risk!
I give no guarantee that everything works or that your system landscape
is safe from damage.
I also cannot take responsibility for the results of using this document.
If you are uncertain about any point, please write to me directly or to the
mailing list for general SAP/Linux issues:

sch1nd0r@gmx.net
linux.general@listserv-sap.com
-------------------------------------------------------------------------------