mirror of https://github.com/Icinga/icinga2.git, synced 2025-10-26 08:43:51 +01:00
| # Additional Agent-based Checks <a id="agent-based-checks-addon"></a>
 | |
| 
 | |
| If the remote services are not directly accessible through the network, a
 | |
| local agent installation exposing the results to check queries can
 | |
| come in handy.
 | |
| 
 | |
| ## SNMP <a id="agent-based-checks-snmp"></a>
 | |
| 
 | |
| The SNMP daemon runs on the remote system and answers SNMP queries sent by plugin
 | |
| binaries. The [Monitoring Plugins package](02-getting-started.md#setting-up-check-plugins) ships
 | |
| the `check_snmp` plugin binary, but there are plenty of [existing plugins](05-service-monitoring.md#service-monitoring-plugins)
 | |
| for specific use cases already around, for example monitoring Cisco routers.
 | |
| 
 | |
| The following example uses the [SNMP ITL](10-icinga-template-library.md#plugin-check-command-snmp) `CheckCommand` and just
 | |
| overrides the `snmp_oid` custom attribute. A service is created for all hosts which
 | |
| have the `snmp-community` custom attribute.
 | |
| 
 | |
|     apply Service "uptime" {
 | |
|       import "generic-service"
 | |
| 
 | |
|       check_command = "snmp"
 | |
|       vars.snmp_oid = "1.3.6.1.2.1.1.3.0"
 | |
|       vars.snmp_miblist = "DISMAN-EVENT-MIB"
 | |
| 
 | |
|       assign where host.vars.snmp_community != ""
 | |
|     }
 | |
| 
 | |
| Additional SNMP plugins are available using the [Manubulon SNMP Plugins](10-icinga-template-library.md#snmp-manubulon-plugin-check-commands).
 | |
| 
 | |
| If no `snmp_miblist` is specified, the plugin will default to `ALL`. As the number of available MIB files
 | |
| on the system increases, so will the load generated by this plugin if no `MIB` is specified.
 | |
| As such, it is recommended to always specify at least one `MIB`.
 | |
| 
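To test the OID by hand before deploying the service, the custom attributes above can be translated into a direct plugin call. The following is a minimal sketch, not the exact command line Icinga 2 generates (the ITL `snmp` command passes further arguments). The helper name `snmp_uptime_cmd`, the address `192.0.2.10` and the community `public` are placeholders chosen for illustration; the plugin path is the one used elsewhere on this page and may differ on your system.

```shell
#!/bin/bash
# Sketch only: prints a manual check_snmp call matching the "uptime" example.
# snmp_uptime_cmd is a hypothetical helper; host and community are placeholders.
snmp_uptime_cmd() {
    local host=$1 community=$2
    # -o carries vars.snmp_oid, -m carries vars.snmp_miblist.
    printf '%s -H %s -C %s -o %s -m %s\n' \
        /usr/lib/nagios/plugins/check_snmp \
        "$host" "$community" \
        1.3.6.1.2.1.1.3.0 DISMAN-EVENT-MIB
}

# Prints the command line; copy and run it on the monitoring host.
snmp_uptime_cmd 192.0.2.10 public
```

Passing `-m` explicitly mirrors the recommendation above to always name at least one MIB.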
 | |
| ## SSH <a id="agent-based-checks-ssh"></a>
 | |
| 
 | |
| A plugin can be called over the SSH protocol, which executes it on the remote server and fetches
 | |
| its return code and output. The `by_ssh` command object is part of the built-in templates and
 | |
| requires the `check_by_ssh` check plugin which is available in the [Monitoring Plugins package](02-getting-started.md#setting-up-check-plugins).
 | |
| 
 | |
|     object CheckCommand "by_ssh_swap" {
 | |
|       import "by_ssh"
 | |
| 
 | |
|       vars.by_ssh_command = "/usr/lib/nagios/plugins/check_swap -w $by_ssh_swap_warn$ -c $by_ssh_swap_crit$"
 | |
|       vars.by_ssh_swap_warn = "75%"
 | |
|       vars.by_ssh_swap_crit = "50%"
 | |
|     }
 | |
| 
 | |
|     object Service "swap" {
 | |
|       import "generic-service"
 | |
| 
 | |
|       host_name = "remote-ssh-host"
 | |
| 
 | |
|       check_command = "by_ssh_swap"
 | |
| 
 | |
|       vars.by_ssh_logname = "icinga"
 | |
|     }
 | |
| 
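For troubleshooting it can help to see which arguments end up on the `check_by_ssh` command line. The sketch below only approximates what the `by_ssh` command produces from the example above; `by_ssh_swap_args` is a hypothetical helper, and the plugin paths are taken from the examples on this page.

```shell
#!/bin/bash
# Sketch only: lists the check_by_ssh arguments for the "swap" example,
# one argument per line. by_ssh_swap_args is a hypothetical helper.
by_ssh_swap_args() {
    local host=$1 logname=$2 warn=$3 crit=$4
    # The remote command must stay a single argument after -C.
    local remote="/usr/lib/nagios/plugins/check_swap -w $warn -c $crit"
    printf '%s\n' /usr/lib/nagios/plugins/check_by_ssh \
        -H "$host" -l "$logname" -C "$remote"
}

by_ssh_swap_args remote-ssh-host icinga 75% 50%
```

The last printed line is the command executed on the remote host, which is why the remote user (`icinga` here) needs the plugin installed at that path.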
 | |
| ## NSClient++ <a id="agent-based-checks-nsclient"></a>
 | |
| 
 | |
| [NSClient++](https://nsclient.org/) works on both Windows and Linux platforms and is well
 | |
| known for its magnificent Windows support. There are alternatives like the WMI interface,
 | |
| but using `NSClient++` will allow you to run local scripts similar to check plugins fetching
 | |
| the required output and performance counters.
 | |
| 
 | |
| You can use the `check_nt` plugin from the Monitoring Plugins project to query NSClient++.
 | |
| Icinga 2 provides the [nscp check command](10-icinga-template-library.md#plugin-check-command-nscp) for this:
 | |
| 
 | |
| Example:
 | |
| 
 | |
|     object Service "disk" {
 | |
|       import "generic-service"
 | |
| 
 | |
|       host_name = "remote-windows-host"
 | |
| 
 | |
|       check_command = "nscp"
 | |
| 
 | |
|       vars.nscp_variable = "USEDDISKSPACE"
 | |
|       vars.nscp_params = "c"
 | |
|       vars.nscp_warn = 70
 | |
|       vars.nscp_crit = 80
 | |
|     }
 | |
| 
 | |
| For details on the `NSClient++` configuration please refer to the [official documentation](https://docs.nsclient.org/).
 | |
| 
 | |
| ## NSCA-NG <a id="agent-based-checks-nsca-ng"></a>
 | |
| 
 | |
| [NSCA-ng](http://www.nsca-ng.org) provides a client-server pair that allows the
 | |
| remote sender to push check results into the Icinga 2 `ExternalCommandListener`
 | |
| feature.
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > This addon works in a similar fashion to the Icinga 1.x distributed model. If you
 | |
| > are looking for a real distributed architecture with Icinga 2, scroll down.
 | |
| 
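On the sending side, a check result has to be formatted before it is handed to the NSCA-ng client. The sketch below assumes the tab-separated layout known from the classic NSCA client (host, service, state, output); this is an assumption that should be verified against the NSCA-ng documentation for your version. `nsca_service_line` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: formats one service check result as a tab-separated line.
# The field layout is assumed from the classic NSCA client; verify it
# against your NSCA-ng version. nsca_service_line is a hypothetical helper.
nsca_service_line() {
    local host=$1 service=$2 state=$3 output=$4
    printf '%s\t%s\t%s\t%s\n' "$host" "$service" "$state" "$output"
}

# State 0 means OK; the host and service names are placeholders.
nsca_service_line remote-nsca-host disk 0 "DISK OK"
```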
 | |
| ## NRPE <a id="agent-based-checks-nrpe"></a>
 | |
| 
 | |
| [NRPE](https://docs.icinga.com/latest/en/nrpe.html) runs as a daemon on the remote client including
 | |
| the required plugins and command definitions.
 | |
| Icinga 2 calls the `check_nrpe` plugin binary in order to query the configured command on the
 | |
| remote client.
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > The NRPE protocol is considered insecure and has multiple flaws in its
 | |
| > design. Upstream is not willing to fix these issues.
 | |
| >
 | |
| > In order to stay safe, please use the native [Icinga 2 client](06-distributed-monitoring.md#distributed-monitoring)
 | |
| > instead.
 | |
| 
 | |
| The NRPE daemon uses its own configuration format in `nrpe.cfg` while `check_nrpe`
 | |
| can be embedded into the Icinga 2 `CheckCommand` configuration syntax.
 | |
| 
 | |
| You can use the `check_nrpe` plugin from the NRPE project to query the NRPE daemon.
 | |
| Icinga 2 provides the [nrpe check command](10-icinga-template-library.md#plugin-check-command-nrpe) for this:
 | |
| 
 | |
| Example:
 | |
| 
 | |
|     object Service "users" {
 | |
|       import "generic-service"
 | |
| 
 | |
|       host_name = "remote-nrpe-host"
 | |
| 
 | |
|       check_command = "nrpe"
 | |
|       vars.nrpe_command = "check_users"
 | |
|     }
 | |
| 
 | |
| nrpe.cfg:
 | |
| 
 | |
|     command[check_users]=/usr/local/icinga/libexec/check_users -w 5 -c 10
 | |
| 
 | |
| If you are planning to pass arguments to NRPE using the `-a`
 | |
| command line parameter, make sure that your NRPE daemon supports them
 | |
| and has them enabled.
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > Enabling command arguments in NRPE is considered harmful
 | |
| > and exposes a security risk allowing attackers to execute
 | |
| > commands remotely. Details at [seclists.org](http://seclists.org/fulldisclosure/2014/Apr/240).
 | |
| 
 | |
| The plugin check command `nrpe` provides the `nrpe_arguments` custom
 | |
| attribute which expects either a single value or an array of values.
 | |
| 
 | |
| Example:
 | |
| 
 | |
|     object Service "nrpe-disk-/" {
 | |
|       import "generic-service"
 | |
| 
 | |
|       host_name = "remote-nrpe-host"
 | |
| 
 | |
|       check_command = "nrpe"
 | |
|       vars.nrpe_command = "check_disk"
 | |
|       vars.nrpe_arguments = [ "20%", "10%", "/" ]
 | |
|     }
 | |
| 
 | |
| Icinga 2 will execute the nrpe plugin like this:
 | |
| 
 | |
|     /usr/lib/nagios/plugins/check_nrpe -H <remote-nrpe-host> -c 'check_disk' -a '20%' '10%' '/'
 | |
| 
 | |
| NRPE expects all additional arguments in an ordered fashion
 | |
| and interprets the first value as `$ARG1$` macro, the second
 | |
| value as `$ARG2$`, and so on.
 | |
| 
 | |
| nrpe.cfg:
 | |
| 
 | |
|     command[check_disk]=/usr/local/icinga/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
 | |
| 
 | |
| Using the above example with `nrpe_arguments` the command
 | |
| executed by the NRPE daemon looks similar to this:
 | |
| 
 | |
|     /usr/local/icinga/libexec/check_disk -w 20% -c 10% -p /
 | |
| 
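The positional mapping described above can be reproduced with a few lines of shell. This is an illustration of the substitution rule only, not NRPE's actual implementation; `expand_nrpe_args` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: mimics how the n-th argument replaces the $ARGn$ macro.
# expand_nrpe_args is a hypothetical helper, not part of NRPE.
expand_nrpe_args() {
    local line=$1
    shift
    local i=1 arg
    for arg in "$@"; do
        # Replace every literal $ARG<i>$ with the i-th argument.
        line=${line//"\$ARG${i}\$"/"$arg"}
        i=$((i + 1))
    done
    printf '%s\n' "$line"
}

# Command definition from nrpe.cfg plus the values of nrpe_arguments.
expand_nrpe_args \
    '/usr/local/icinga/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$' \
    '20%' '10%' '/'
```

Because the mapping is purely positional, the order of the values in `nrpe_arguments` must match the order of the macros in `nrpe.cfg`.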
 | |
| You can pass arguments in a similar manner to [NSClient++](07-agent-based-monitoring.md#agent-based-checks-nsclient)
 | |
| when using its NRPE-supported check method.
 | |
| 
 | |
| 
 | |
| ## Passive Check Results and SNMP Traps <a id="agent-based-checks-snmp-traps"></a>
 | |
| 
 | |
| SNMP Traps can be received and filtered by using [SNMPTT](http://snmptt.sourceforge.net/)
 | |
| and specific trap handlers passing the check results to Icinga 2.
 | |
| 
 | |
| Following the SNMPTT [Format](http://snmptt.sourceforge.net/docs/snmptt.shtml#SNMPTT.CONF-FORMAT)
 | |
| documentation and the Icinga external command syntax found [here](24-appendix.md#external-commands-list-detail),
 | |
| we can create generic services that can accommodate any number of hosts for a given scenario.
 | |
| 
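The trap handlers in this section all submit results as `PROCESS_SERVICE_CHECK_RESULT` external commands: a Unix timestamp in square brackets, followed by the command name, host name, service name, state and output text, separated by semicolons. The sketch below builds such a line; `format_check_result` is a hypothetical helper and the host name is a placeholder.

```shell
#!/bin/bash
# Sketch only: builds a PROCESS_SERVICE_CHECK_RESULT line.
# format_check_result is a hypothetical helper name.
format_check_result() {
    local timestamp=$1 host=$2 service=$3 state=$4 output=$5
    # Service states: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$timestamp" "$host" "$service" "$state" "$output"
}

# Printed to stdout here; a real handler would append the line to
# /var/run/icinga2/cmd/icinga2.cmd instead.
format_check_result "$(date +%s)" remote-snmp-host Coldstart 2 \
    "The snmp agent has reinitialized."
```

The host and service names must match existing Icinga 2 objects, otherwise the result cannot be assigned to a service.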
 | |
| ### Simple SNMP Traps <a id="simple-traps"></a>
 | |
| 
 | |
| A simple example might be monitoring host reboots indicated by an SNMP agent reset.
 | |
| It is important to build the event so that it automatically resets after dispatching a notification.
 | |
| Set up the manual check parameters to reset the event from an initial unhandled
 | |
| state or from a missed reset event.
 | |
| 
 | |
| Add a directive in `snmptt.conf`:
 | |
| 
 | |
|     EVENT coldStart .1.3.6.1.6.3.1.1.5.1 "Status Events" Normal
 | |
|     FORMAT Device reinitialized (coldStart)
 | |
|     EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;Coldstart;2;The snmp agent has reinitialized." >> /var/run/icinga2/cmd/icinga2.cmd
 | |
|     SDESC
 | |
|     A coldStart trap signifies that the SNMPv2 entity, acting
 | |
|     in an agent role, is reinitializing itself and that its
 | |
|     configuration may have been altered.
 | |
|     EDESC
 | |
| 
 | |
| 1. Define the `EVENT` as per your need.
 | |
| 2. Construct the `EXEC` statement with the service name matching your template
 | |
| applied to your _n_ hosts. The host address inferred by SNMPTT will be the
 | |
| correlating factor. You can have SNMPTT provide host names or IP addresses to
 | |
| match your Icinga convention.
 | |
| 
 | |
| Add an `EventCommand` configuration object for the passive service auto reset event.
 | |
| 
 | |
|     object EventCommand "coldstart-reset-event" {
 | |
|       command = [ SysconfDir + "/icinga2/conf.d/custom/scripts/coldstart_reset_event.sh" ]
 | |
| 
 | |
|       arguments = {
 | |
|         "-i" = "$service.state_id$"
 | |
|         "-n" = "$host.name$"
 | |
|         "-s" = "$service.name$"
 | |
|       }
 | |
|     }
 | |
| 
 | |
| Create the `coldstart_reset_event.sh` shell script to pass the expanded variable
 | |
| data in. The `$service.state_id$` is important in order to prevent an endless loop
 | |
| of event firing after the service has been reset.
 | |
| 
 | |
|     #!/bin/bash
 | |
| 
 | |
|     SERVICE_STATE_ID=""
 | |
|     HOST_NAME=""
 | |
|     SERVICE_NAME=""
 | |
| 
 | |
|     show_help()
 | |
|     {
 | |
|     cat <<-EOF
 | |
|     	Usage: ${0##*/} [-h] -i SERVICE_STATE_ID -n HOST_NAME -s SERVICE_NAME
 | |
|     	Writes a coldstart reset event to the Icinga command pipe.
 | |
| 
 | |
|     	  -h                  Display this help and exit.
 | |
|     	  -i SERVICE_STATE_ID The associated service state id.
 | |
|     	  -n HOST_NAME        The associated host name.
 | |
|     	  -s SERVICE_NAME     The associated service name.
 | |
|     EOF
 | |
|     }
 | |
| 
 | |
|     while getopts "hi:n:s:" opt; do
 | |
|         case "$opt" in
 | |
|           h)
 | |
|               show_help
 | |
|               exit 0
 | |
|               ;;
 | |
|           i)
 | |
|               SERVICE_STATE_ID=$OPTARG
 | |
|               ;;
 | |
|           n)
 | |
|               HOST_NAME=$OPTARG
 | |
|               ;;
 | |
|           s)
 | |
|               SERVICE_NAME=$OPTARG
 | |
|               ;;
 | |
|           '?')
 | |
|               show_help
 | |
|               exit 1
 | |
|               ;;
 | |
|           esac
 | |
|     done
 | |
| 
 | |
|     if [ -z "$SERVICE_STATE_ID" ]; then
 | |
|         show_help
 | |
|         printf "\n  Error: -i required.\n"
 | |
|         exit 1
 | |
|     fi
 | |
| 
 | |
|     if [ -z "$HOST_NAME" ]; then
 | |
|         show_help
 | |
|         printf "\n  Error: -n required.\n"
 | |
|         exit 1
 | |
|     fi
 | |
| 
 | |
|     if [ -z "$SERVICE_NAME" ]; then
 | |
|         show_help
 | |
|         printf "\n  Error: -s required.\n"
 | |
|         exit 1
 | |
|     fi
 | |
| 
 | |
|     if [ "$SERVICE_STATE_ID" -gt 0 ]; then
 | |
|         echo "[`date +%s`] PROCESS_SERVICE_CHECK_RESULT;$HOST_NAME;$SERVICE_NAME;0;Auto-reset (`date +"%m-%d-%Y %T"`)." >> /var/run/icinga2/cmd/icinga2.cmd
 | |
|     fi
 | |
| 
 | |
| Finally create the `Service` and assign it:
 | |
| 
 | |
|     apply Service "Coldstart" {
 | |
|       import "generic-service-custom"
 | |
| 
 | |
|       check_command         = "dummy"
 | |
|       event_command         = "coldstart-reset-event"
 | |
| 
 | |
|       enable_notifications  = 1
 | |
|       enable_active_checks  = 0
 | |
|       enable_passive_checks = 1
 | |
|       enable_flapping       = 0
 | |
|       volatile              = 1
 | |
|       enable_perfdata       = 0
 | |
| 
 | |
|       vars.dummy_state      = 0
 | |
|       vars.dummy_text       = "Manual reset."
 | |
| 
 | |
|       vars.sla              = "24x7"
 | |
| 
 | |
|       assign where (host.vars.os == "Linux" || host.vars.os == "Windows")
 | |
|     }
 | |
| 
 | |
| ### Complex SNMP Traps <a id="complex-traps"></a>
 | |
| 
 | |
| A more complex example might be passing dynamic data from a trap's varbind list
 | |
| for a backup scenario where the backup software dispatches status updates. By
 | |
| utilizing active and passive checks, the older freshness concept can be leveraged.
 | |
| 
 | |
| By defining the active check as a hard failed state, a missed backup can be reported.
 | |
| As long as the most recent passive update has occurred, the active check is bypassed.
 | |
| 
 | |
| Add a directive in `snmptt.conf`:
 | |
| 
 | |
|     EVENT enterpriseSpecific <YOUR OID> "Status Events" Normal
 | |
|     FORMAT Enterprise specific trap
 | |
|     EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;$1;$2;$3" >> /var/run/icinga2/cmd/icinga2.cmd
 | |
|     SDESC
 | |
|     An enterprise specific trap.
 | |
|     The varbinds in order denote the Icinga service name, state and text.
 | |
|     EDESC
 | |
| 
 | |
| 1. Define the `EVENT` as per your need using your actual OID.
 | |
| 2. The service name, state and text are extracted from the first three varbinds.
 | |
| This has the advantage of accommodating an unlimited set of use cases.
 | |
| 
 | |
| Create a `Service` for the specific use case associated with the host. If the host
 | |
| matches and the first varbind value is `Backup`, SNMPTT will submit the corresponding
 | |
| passive update with the state and text from the second and third varbind:
 | |
| 
 | |
|     object Service "Backup" {
 | |
|       import "generic-service-custom"
 | |
| 
 | |
|       host_name             = "host.domain.com"
 | |
|       check_command         = "dummy"
 | |
| 
 | |
|       enable_notifications  = 1
 | |
|       enable_active_checks  = 1
 | |
|       enable_passive_checks = 1
 | |
|       enable_flapping       = 0
 | |
|       volatile              = 1
 | |
|       max_check_attempts    = 1
 | |
|       check_interval        = 87000
 | |
|       enable_perfdata       = 0
 | |
| 
 | |
|       vars.sla              = "24x7"
 | |
|       vars.dummy_state      = 2
 | |
|       vars.dummy_text       = "No passive check result received."
 | |
|     }
 | |
| 
 |
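A bare number for `check_interval` is interpreted as seconds. The small sketch below converts the value used above into hours and minutes; `seconds_to_hm` is a hypothetical helper. That the extra minutes are meant as slack for a backup that runs once a day is an assumption, the example itself does not say so.

```shell
#!/bin/bash
# Sketch only: converts a check_interval given in seconds to hours/minutes.
# seconds_to_hm is a hypothetical helper name.
seconds_to_hm() {
    local seconds=$1
    printf '%dh %dm\n' $(( seconds / 3600 )) $(( (seconds % 3600) / 60 ))
}

# check_interval from the "Backup" service above.
seconds_to_hm 87000
```

With this interval the active `dummy` check, which reports the hard failed state, only runs when no passive result has arrived for slightly more than a day.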