icinga2/doc/3.01-hosts-and-services.md

2.9 KiB

Hosts and Services

Icinga 2 can be used to monitor the availability of hosts and services. Services can be virtually anything which can be checked in some way:

  • Network services (HTTP, SMTP, SNMP, SSH, etc.)
  • Printers
  • Switches / Routers
  • Temperature Sensors
  • Other local or network-accessible services

Host objects provide a mechanism to group services that are running on the same physical device.

Here is an example of a host object which defines two child services:

object Host "my-server1" {
  services["ping4"] = {
    check_command = "ping4"
  },

  services["http"] = {
    check_command = "http_ip"
  },

  check = "ping4",

  macros["address"] = "10.0.0.1"
}

The example host my-server1 creates two services which belong to this host: ping4 and http.

It also specifies that the host should inherit its availability state from the ping4 service.

Note

In Icinga 1.x hosts had their own check command, check interval and notification settings. Instead, in Icinga 2 hosts inherit their state from one of its child services. No checks are performed for the host itself.

The address macro is used by check commands to determine which network address is associated with the host object.

Host States

Hosts inherit their state from the host check service that is specified using the check attribute.

Hosts can be in any of the following states:

Name Description
UP The host is available.
DOWN The host is unavailable.
UNREACHABLE At least one of the host's dependencies (e.g. its upstream router) is unavailable causing the host to be unreachable.

Service States

Services can be in any of the following states:

Name Description
OK The service is working properly.
WARNING The service is experiencing some problems but is still considered to be in working condition.
CRITICAL The service is in a critical state.
UNKNOWN The check could not determine the service's state.

Hard and Soft States

When detecting a problem with a service Icinga re-checks the service a number of times (based on the max_check_attempts and retry_interval settings) before sending notifications. This ensures that no unnecessary notifications are sent for transient failures. During this time the service is in a SOFT state.

After all re-checks have been executed and the service is still in a non-OK state the service switches to a HARD state and notifications are sent.

Name Description
HARD The host/service's state hasn't recently changed.
SOFT The host/service has recently changed state and is being re-checked.