mirror of
				https://github.com/Icinga/icinga2.git
				synced 2025-10-30 10:43:50 +01:00 
			
		
		
		
	
		
			
				
	
	
		
			485 lines
		
	
	
		
			16 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			485 lines
		
	
	
		
			16 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| # Agent-based Checks <a id="agent-based-checks-addon"></a>
 | |
| 
 | |
| If the remote services are not directly accessible through the network, a
 | |
| local agent installation exposing the results to check queries can
 | |
| become handy.
 | |
| 
 | |
| Prior to installing and configuration an agent service, evaluate possible
 | |
| options based on these requirements:
 | |
| 
 | |
| * Security (authentication, TLS certificates, secure connection handling, etc.)
 | |
| * Connection direction
 | |
|     * Master/satellite can execute commands directly or
 | |
|     * Agent sends back passive/external check results
 | |
| * Availability on specific OS types and versions
 | |
|     * Packages available
 | |
| * Configuration and initial setup
 | |
| * Updates and maintenance, compatibility
 | |
| 
 | |
| Available agent types:
 | |
| 
 | |
| * [Icinga Agent](07-agent-based-monitoring.md#agent-based-checks-icinga) on Linux/Unix and Windows
 | |
| * [SSH](07-agent-based-monitoring.md#agent-based-checks-ssh) on Linux/Unix
 | |
| * [SNMP](07-agent-based-monitoring.md#agent-based-checks-snmp) on Linux/Unix and hardware
 | |
| * [SNMP Traps](07-agent-based-monitoring.md#agent-based-checks-snmp-traps) as passive check results
 | |
| * [REST API](07-agent-based-monitoring.md#agent-based-checks-rest-api) for passive external check results
 | |
| * [NSClient++](07-agent-based-monitoring.md#agent-based-checks-nsclient) and [WMI](07-agent-based-monitoring.md#agent-based-checks-wmi) on Windows
 | |
| 
 | |
| 
 | |
| ## Icinga Agent <a id="agent-based-checks-icinga"></a>
 | |
| 
 | |
| For the most common setups on Linux/Unix and Windows, we recommend
 | |
| to setup the Icinga agent in a [distributed environment](06-distributed-monitoring.md#distributed-monitoring).
 | |
| 
 | |
| 
 | |
| 
 | |
| Key benefits:
 | |
| 
 | |
| * Directly integrated into the distributed monitoring stack of Icinga
 | |
| * Works on Linux/Unix and Windows
 | |
| * Secure communication with TLS
 | |
| * Connection can be established from both sides. Once connected, command execution and check results are exchanged.
 | |
|     * Master/satellite connects to agent
 | |
|     * Agent connects to parent satellite/master
 | |
| * Same configuration language and binaries
 | |
| * Troubleshooting docs and community best practices
 | |
| 
 | |
| Follow the setup and configuration instructions [here](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite).
 | |
| 
 | |
| On Windows hosts, the Icinga agent can query a local NSClient++ service
 | |
| for additional checks in case there are no plugins available.
 | |
| 
 | |
| 
 | |
| 
 | |
| ## SSH <a id="agent-based-checks-ssh"></a>
 | |
| 
 | |
| > **Tip**
 | |
| >
 | |
| > This is the recommended way for systems where the Icinga agent is not available
 | |
| > Be it specific hardware architectures, old systems or forbidden to install an additional software.
 | |
| 
 | |
| This method uses the SSH service on the remote host to execute
 | |
| an arbitrary plugin command line. The output and exit code is
 | |
| returned and used by the core.
 | |
| 
 | |
| The `check_by_ssh` plugin takes care of this. It is available in the
 | |
| [Monitoring Plugins](https://www.monitoring-plugins.org/) package.
 | |
| For your convenience, the Icinga template library provides the [by_ssh](10-icinga-template-library.md#plugin-check-command-by-ssh)
 | |
| CheckCommand already.
 | |
| 
 | |
| ### SSH: Preparations <a id="agent-based-checks-ssh-preparations"></a>
 | |
| 
 | |
| SSH key pair for the Icinga daemon user. In case the user has no shell, temporarily enable this.
 | |
| When asked for a passphrase, **do not set it** and press enter.
 | |
| 
 | |
| ```bash
 | |
| sudo su - icinga
 | |
| 
 | |
| ssh-keygen -b 4096 -t rsa -C "icinga@$(hostname) user for check_by_ssh" -f $HOME/.ssh/id_rsa
 | |
| ```
 | |
| 
 | |
| On the remote agent, create the icinga user and generate a temporary password.
 | |
| 
 | |
| ```bash
 | |
| useradd -m icinga
 | |
| passwd icinga
 | |
| ```
 | |
| 
 | |
| Copy the public key from the Icinga server to the remote agent, e.g. with `ssh-copy-id`
 | |
| or manually into `/home/icinga/.ssh/authorized_keys`.
 | |
| This will ask for the password once.
 | |
| 
 | |
| ```bash
 | |
| sudo su - icinga
 | |
| 
 | |
| ssh-copy-id -i $HOME/.ssh/id_rsa icinga@ssh-agent1.localdomain
 | |
| ```
 | |
| 
 | |
| After the SSH key is copied, test at the connection **at least once** and
 | |
| accept the host key verification. If you forget about this step, checks will
 | |
| become UNKNOWN later.
 | |
| 
 | |
| ```bash
 | |
| ssh -i $HOME/.ssh/id_rsa icinga@ssh-agent1.localdomain
 | |
| ```
 | |
| 
 | |
| After the SSH key login works, disable the previously enabled logins.
 | |
| 
 | |
| * Remote agent user's password with `passwd -l icinga`
 | |
| * Local icinga user terminal
 | |
| 
 | |
| Also, ensure that the permissions are correct for the `.ssh` directory
 | |
| as otherwise logins will fail.
 | |
| 
 | |
| * `.ssh` directory: 700
 | |
| * `.ssh/id_rsa.pub` public key file: 644
 | |
| * `.ssh/id_rsa` private key file: 600
 | |
| 
 | |
| 
 | |
| ### SSH: Configuration <a id="agent-based-checks-ssh-config"></a>
 | |
| 
 | |
| First, create a host object which has SSH configured and enabled.
 | |
| Mark this e.g. with the custom variable `agent_type` to later
 | |
| use this for service apply rule matches. Best practice is to
 | |
| store that in a specific template, either in the static configuration
 | |
| or inside the Director.
 | |
| 
 | |
| ```
 | |
| template Host "ssh-agent" {
 | |
|   check_command = "hostalive"
 | |
| 
 | |
|   vars.agent_type = "ssh"
 | |
|   vars.os_type = "linux"
 | |
| }
 | |
| 
 | |
| object Host "ssh-agent1.localdomain" {
 | |
|   import "ssh-agent"
 | |
| 
 | |
|   address = "192.168.56.115"
 | |
| }
 | |
| ```
 | |
| 
 | |
| Example for monitoring the remote users:
 | |
| 
 | |
| ```
 | |
| apply Service "users" {
 | |
|   check_command = "by_ssh"
 | |
| 
 | |
|   vars.by_ssh_command = [ "/usr/lib/nagios/plugins/check_users" ]
 | |
| 
 | |
|   // Follows the same principle as with command arguments, e.g. for ordering
 | |
|   vars.by_ssh_arguments = {
 | |
|     "-w" = {
 | |
|       value = "$users_wgreater$" // Can reference an existing custom variable defined on the host or service, evaluated at runtime
 | |
|     }
 | |
|     "-c" = {
 | |
|       value = "$users_cgreater$"
 | |
|     }
 | |
|   }
 | |
| 
 | |
|   vars.users_wgreater = 3
 | |
|   vars.users_cgreater = 5
 | |
| 
 | |
|   assign where host.vars.os_type == "linux" && host.vars.agent_type == "ssh"
 | |
| }
 | |
| ```
 | |
| 
 | |
| A more advanced example with better arguments is shown in [this blogpost](https://www.netways.de/blog/2016/03/21/check_by_ssh-mit-icinga-2/).
 | |
| 
 | |
| 
 | |
| ## SNMP <a id="agent-based-checks-snmp"></a>
 | |
| 
 | |
| The SNMP daemon runs on the remote system and answers SNMP queries by plugin scripts.
 | |
| The [Monitoring Plugins](https://www.monitoring-plugins.org/) package provides
 | |
| the `check_snmp` plugin binary, but there are plenty of [existing plugins](05-service-monitoring.md#service-monitoring-plugins)
 | |
| for specific use cases already around, for example monitoring Cisco routers.
 | |
| 
 | |
| The following example uses the [SNMP ITL](10-icinga-template-library.md#plugin-check-command-snmp)
 | |
| CheckCommand and sets the `snmp_oid` custom variable. A service is created for all hosts which
 | |
| have the `snmp-community` custom variable.
 | |
| 
 | |
| ```
 | |
| template Host "snmp-agent" {
 | |
|   check_command = "hostalive"
 | |
| 
 | |
|   vars.agent_type = "snmp"
 | |
| 
 | |
|   vars.snmp_community = "public-icinga"
 | |
| }
 | |
| 
 | |
| object Host "snmp-agent1.localdomain" {
 | |
|   import "snmp-agent"
 | |
| }
 | |
| ```
 | |
| 
 | |
| ```
 | |
| apply Service "uptime" {
 | |
|   import "generic-service"
 | |
| 
 | |
|   check_command = "snmp"
 | |
|   vars.snmp_oid = "1.3.6.1.2.1.1.3.0"
 | |
|   vars.snmp_miblist = "DISMAN-EVENT-MIB"
 | |
| 
 | |
|   assign where host.vars.agent_type == "snmp" && host.vars.snmp_community != ""
 | |
| }
 | |
| ```
 | |
| 
 | |
| If no `snmp_miblist` is specified, the plugin will default to `ALL`. As the number of available MIB files
 | |
| on the system increases so will the load generated by this plugin if no `MIB` is specified.
 | |
| As such, it is recommended to always specify at least one `MIB`.
 | |
| 
 | |
| Additional SNMP plugins are available using the [Manubulon SNMP Plugins](10-icinga-template-library.md#snmp-manubulon-plugin-check-commands).
 | |
| 
 | |
| For network monitoring, community members advise to use [nwc_health](05-service-monitoring.md#service-monitoring-network)
 | |
| for example.
 | |
| 
 | |
| 
 | |
| ## SNMP Traps and Passive Check Results <a id="agent-based-checks-snmp-traps"></a>
 | |
| 
 | |
| SNMP Traps can be received and filtered by using [SNMPTT](http://snmptt.sourceforge.net/)
 | |
| and specific trap handlers passing the check results to Icinga 2.
 | |
| 
 | |
| Following the SNMPTT [Format](http://snmptt.sourceforge.net/docs/snmptt.shtml#SNMPTT.CONF-FORMAT)
 | |
| documentation and the Icinga external command syntax found [here](24-appendix.md#external-commands-list-detail)
 | |
| we can create generic services that can accommodate any number of hosts for a given scenario.
 | |
| 
 | |
| ### Simple SNMP Traps <a id="simple-traps"></a>
 | |
| 
 | |
| A simple example might be monitoring host reboots indicated by an SNMP agent reset.
 | |
| Building the event to auto reset after dispatching a notification is important.
 | |
| Setup the manual check parameters to reset the event from an initial unhandled
 | |
| state or from a missed reset event.
 | |
| 
 | |
| Add a directive in `snmptt.conf`
 | |
| 
 | |
| ```
 | |
| EVENT coldStart .1.3.6.1.6.3.1.1.5.1 "Status Events" Normal
 | |
| FORMAT Device reinitialized (coldStart)
 | |
| EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;Coldstart;2;The snmp agent has reinitialized." >> /var/run/icinga2/cmd/icinga2.cmd
 | |
| SDESC
 | |
| A coldStart trap signifies that the SNMPv2 entity, acting
 | |
| in an agent role, is reinitializing itself and that its
 | |
| configuration may have been altered.
 | |
| EDESC
 | |
| ```
 | |
| 
 | |
| 1. Define the `EVENT` as per your need.
 | |
| 2. Construct the `EXEC` statement with the service name matching your template
 | |
| applied to your _n_ hosts. The host address inferred by SNMPTT will be the
 | |
| correlating factor. You can have snmptt provide host names or ip addresses to
 | |
| match your Icinga convention.
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > Replace the deprecated command pipe EXEC statement with a curl call
 | |
| > to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).
 | |
| 
 | |
| Add an `EventCommand` configuration object for the passive service auto reset event.
 | |
| 
 | |
| ```
 | |
| object EventCommand "coldstart-reset-event" {
 | |
|   command = [ ConfigDir + "/conf.d/custom/scripts/coldstart_reset_event.sh" ]
 | |
| 
 | |
|   arguments = {
 | |
|     "-i" = "$service.state_id$"
 | |
|     "-n" = "$host.name$"
 | |
|     "-s" = "$service.name$"
 | |
|   }
 | |
| }
 | |
| ```
 | |
| 
 | |
| Create the `coldstart_reset_event.sh` shell script to pass the expanded variable
 | |
| data in. The `$service.state_id$` is important in order to prevent an endless loop
 | |
| of event firing after the service has been reset.
 | |
| 
 | |
| ```bash
 | |
| #!/bin/bash
 | |
| 
 | |
| SERVICE_STATE_ID=""
 | |
| HOST_NAME=""
 | |
| SERVICE_NAME=""
 | |
| 
 | |
| show_help()
 | |
| {
 | |
| cat <<-EOF
 | |
| 	Usage: ${0##*/} [-h] -n HOST_NAME -s SERVICE_NAME
 | |
| 	Writes a coldstart reset event to the Icinga command pipe.
 | |
| 
 | |
| 	  -h                  Display this help and exit.
 | |
| 	  -i SERVICE_STATE_ID The associated service state id.
 | |
| 	  -n HOST_NAME        The associated host name.
 | |
| 	  -s SERVICE_NAME     The associated service name.
 | |
| EOF
 | |
| }
 | |
| 
 | |
| while getopts "hi:n:s:" opt; do
 | |
|     case "$opt" in
 | |
|       h)
 | |
|           show_help
 | |
|           exit 0
 | |
|           ;;
 | |
|       i)
 | |
|           SERVICE_STATE_ID=$OPTARG
 | |
|           ;;
 | |
|       n)
 | |
|           HOST_NAME=$OPTARG
 | |
|           ;;
 | |
|       s)
 | |
|           SERVICE_NAME=$OPTARG
 | |
|           ;;
 | |
|       '?')
 | |
|           show_help
 | |
|           exit 0
 | |
|           ;;
 | |
|       esac
 | |
| done
 | |
| 
 | |
| if [ -z "$SERVICE_STATE_ID" ]; then
 | |
|     show_help
 | |
|     printf "\n  Error: -i required.\n"
 | |
|     exit 1
 | |
| fi
 | |
| 
 | |
| if [ -z "$HOST_NAME" ]; then
 | |
|     show_help
 | |
|     printf "\n  Error: -n required.\n"
 | |
|     exit 1
 | |
| fi
 | |
| 
 | |
| if [ -z "$SERVICE_NAME" ]; then
 | |
|     show_help
 | |
|     printf "\n  Error: -s required.\n"
 | |
|     exit 1
 | |
| fi
 | |
| 
 | |
| if [ "$SERVICE_STATE_ID" -gt 0 ]; then
 | |
|     echo "[`date +%s`] PROCESS_SERVICE_CHECK_RESULT;$HOST_NAME;$SERVICE_NAME;0;Auto-reset (`date +"%m-%d-%Y %T"`)." >> /var/run/icinga2/cmd/icinga2.cmd
 | |
| fi
 | |
| ```
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > Replace the deprecated command pipe EXEC statement with a curl call
 | |
| > to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).
 | |
| 
 | |
| Finally create the `Service` and assign it:
 | |
| 
 | |
| ```
 | |
| apply Service "Coldstart" {
 | |
|   import "generic-service-custom"
 | |
| 
 | |
|   check_command         = "dummy"
 | |
|   event_command         = "coldstart-reset-event"
 | |
| 
 | |
|   enable_notifications  = 1
 | |
|   enable_active_checks  = 0
 | |
|   enable_passive_checks = 1
 | |
|   enable_flapping       = 0
 | |
|   volatile              = 1
 | |
|   enable_perfdata       = 0
 | |
| 
 | |
|   vars.dummy_state      = 0
 | |
|   vars.dummy_text       = "Manual reset."
 | |
| 
 | |
|   vars.sla              = "24x7"
 | |
| 
 | |
|   assign where (host.vars.os == "Linux" || host.vars.os == "Windows")
 | |
| }
 | |
| ```
 | |
| 
 | |
| ### Complex SNMP Traps <a id="complex-traps"></a>
 | |
| 
 | |
| A more complex example might be passing dynamic data from a traps varbind list
 | |
| for a backup scenario where the backup software dispatches status updates. By
 | |
| utilizing active and passive checks, the older freshness concept can be leveraged.
 | |
| 
 | |
| By defining the active check as a hard failed state, a missed backup can be reported.
 | |
| As long as the most recent passive update has occurred, the active check is bypassed.
 | |
| 
 | |
| Add a directive in `snmptt.conf`
 | |
| 
 | |
| ```
 | |
| EVENT enterpriseSpecific <YOUR OID> "Status Events" Normal
 | |
| FORMAT Enterprise specific trap
 | |
| EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;$1;$2;$3" >> /var/run/icinga2/cmd/icinga2.cmd
 | |
| SDESC
 | |
| An enterprise specific trap.
 | |
| The varbinds in order denote the Icinga service name, state and text.
 | |
| EDESC
 | |
| ```
 | |
| 
 | |
| 1. Define the `EVENT` as per your need using your actual oid.
 | |
| 2. The service name, state and text are extracted from the first three varbinds.
 | |
| This has the advantage of accommodating an unlimited set of use cases.
 | |
| 
 | |
| > **Note**
 | |
| >
 | |
| > Replace the deprecated command pipe EXEC statement with a curl call
 | |
| > to the REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result).
 | |
| 
 | |
| Create a `Service` for the specific use case associated to the host. If the host
 | |
| matches and the first varbind value is `Backup`, SNMPTT will submit the corresponding
 | |
| passive update with the state and text from the second and third varbind:
 | |
| 
 | |
| ```
 | |
| object Service "Backup" {
 | |
|   import "generic-service-custom"
 | |
| 
 | |
|   host_name             = "host.domain.com"
 | |
|   check_command         = "dummy"
 | |
| 
 | |
|   enable_notifications  = 1
 | |
|   enable_active_checks  = 1
 | |
|   enable_passive_checks = 1
 | |
|   enable_flapping       = 0
 | |
|   volatile              = 1
 | |
|   max_check_attempts    = 1
 | |
|   check_interval        = 87000
 | |
|   enable_perfdata       = 0
 | |
| 
 | |
|   vars.sla              = "24x7"
 | |
|   vars.dummy_state      = 2
 | |
|   vars.dummy_text       = "No passive check result received."
 | |
| }
 | |
| ```
 | |
| 
 | |
| 
 | |
| ## Agents sending Check Results via REST API <a id="agent-based-checks-rest-api"></a>
 | |
| 
 | |
| Whenever the remote agent cannot run the Icinga agent, or a backup script
 | |
| should just send its current state after finishing, you can use the [REST API](12-icinga2-api.md#icinga2-api)
 | |
| as secure transport and send [passive external check results](08-advanced-topics.md#external-check-results).
 | |
| 
 | |
| Use the [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result) API action to send the external passive check result.
 | |
| You can either use `curl` or implement the HTTP requests in your preferred programming
 | |
| language. Examples for API clients are available in [this chapter](12-icinga2-api.md#icinga2-api-clients).
 | |
| 
 | |
| Feeding check results from remote hosts requires the host/service
 | |
| objects configured on the master/satellite instance.
 | |
| 
 | |
| ## NSClient++ on Windows <a id="agent-based-checks-nsclient"></a>
 | |
| 
 | |
| [NSClient++](https://nsclient.org/) works on both Windows and Linux platforms and is well
 | |
| known for its magnificent Windows support. There are alternatives like the WMI interface,
 | |
| but using `NSClient++` will allow you to run local scripts similar to check plugins fetching
 | |
| the required output and performance counters.
 | |
| 
 | |
| > **Tip**
 | |
| >
 | |
| > Best practice is to use the Icinga agent as secure execution
 | |
| > bridge (`check_nt` and `check_nrpe` are considered insecure)
 | |
| > and query the NSClient++ service [locally](06-distributed-monitoring.md#distributed-monitoring-windows-nscp).
 | |
| 
 | |
| You can use the `check_nt` plugin from the Monitoring Plugins project to query NSClient++.
 | |
| Icinga 2 provides the [nscp check command](10-icinga-template-library.md#plugin-check-command-nscp) for this:
 | |
| 
 | |
| Example:
 | |
| 
 | |
| ```
 | |
| object Service "disk" {
 | |
|   import "generic-service"
 | |
| 
 | |
|   host_name = "remote-windows-host"
 | |
| 
 | |
|   check_command = "nscp"
 | |
| 
 | |
|   vars.nscp_variable = "USEDDISKSPACE"
 | |
|   vars.nscp_params = "c"
 | |
|   vars.nscp_warn = 70
 | |
|   vars.nscp_crit = 80
 | |
| }
 | |
| ```
 | |
| 
 | |
| For details on the `NSClient++` configuration please refer to the [official documentation](https://docs.nsclient.org/).
 | |
| 
 | |
| ## WMI on Windows <a id="agent-based-checks-wmi"></a>
 | |
| 
 | |
| The most popular plugin is [check_wmi_plus](https://edcint.co.nz/checkwmiplus/).
 | |
| 
 | |
| > Check WMI Plus uses the Windows Management Interface (WMI) to check for common services (cpu, disk, sevices, eventlog…) on Windows machines. It requires the open source wmi client for Linux.
 | |
| 
 | |
| Community examples:
 | |
| 
 | |
| * [Icinga 2 check_wmi_plus example by 18pct](https://18pct.com/icinga2-check_wmi_plus-example/)
 | |
| * [Agent-less monitoring with WMI](https://www.devlink.de/linux/icinga2-nagios-agentless-monitoring-von-windows/)
 |