## <a id="hosts-services"></a> Hosts and Services

Icinga 2 can be used to monitor the availability of hosts and services. Services
can be virtually anything which can be checked in some way:

* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Printers
* Switches / Routers
* Temperature Sensors
* Other local or network-accessible services

Host objects provide a mechanism to group services that are running
on the same physical device.

Here is an example of a host object and two services that belong to it:

    object Host "my-server1" {
      vars.address = "10.0.0.1"
      check_command = "hostalive"
    }

    object Service "ping4" {
      host_name = "my-server1"
      check_command = "ping4"
    }

    object Service "http" {
      host_name = "my-server1"
      check_command = "http_ip"
    }

The example creates two services, `ping4` and `http`, which belong to the
host `my-server1` through their `host_name` attribute.

It also specifies that the host should perform its own check using the `hostalive`
check command.

The `address` custom attribute is used by check commands to determine which network
address is associated with the host object.
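
To illustrate how a check command might consume this attribute, here is a minimal sketch. The command name `my-ping4`, the threshold values, and the choice of the `check_ping` plugin are assumptions made for illustration; whether the `plugin-check-command` template must be imported explicitly, and how the `$address$` macro is resolved, can depend on the Icinga 2 version in use.

```
/* Hypothetical check command, for illustration only. */
object CheckCommand "my-ping4" {
  /* Assumption: older releases need this template, newer ones apply it by default. */
  import "plugin-check-command"

  /* Assumption: the monitoring plugins are installed in PluginDir.
     "$address$" is expected to be replaced with the host's address
     when the check is executed. */
  command = [
    PluginDir + "/check_ping",
    "-H", "$address$",
    "-w", "100,5%",
    "-c", "200,15%"
  ]
}
```

A host or service would then refer to this command by name, for example with `check_command = "my-ping4"`.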

### <a id="host-states"></a> Host States

A host's state is determined by the result of the check command that is specified
using the `check_command` attribute.

Hosts can be in any of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.
  UNREACHABLE | At least one of the host's dependencies (e.g. its upstream router) is unavailable causing the host to be unreachable.

### <a id="service-states"></a> Service States

Services can be in any of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

### <a id="hard-soft-states"></a> Hard and Soft States

When Icinga 2 detects a problem with a service, it re-checks the service a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This avoids unnecessary notifications for transient failures.
During this time the service is in a `SOFT` state.

If the service is still in a non-OK state after all re-checks have been executed,
it switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed.
  SOFT        | The host/service has recently changed state and is being re-checked.
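
As a rough illustration of how these settings interact, the following sketch configures re-checks for an `http` service. The host name and all interval values are arbitrary assumptions chosen for the example, not recommendations.

```
object Service "http" {
  host_name = "my-server1"
  check_command = "http_ip"

  check_interval = 5m      /* regular interval between checks */
  retry_interval = 1m      /* shorter interval used while the state is SOFT */
  max_check_attempts = 3   /* failing checks needed before the state becomes HARD */
}
```

With these values, and assuming the initial failure counts as the first attempt, a failed check would put the service into a `SOFT` state. Icinga 2 would then re-check it every minute, and if the third consecutive check still failed, the service would switch to a `HARD` state and notifications would be sent.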