2.7 KiB
Hosts and Services
Icinga 2 can be used to monitor the availability of hosts and services. Services can be virtually anything which can be checked in some way:
- Network services (HTTP, SMTP, SNMP, SSH, etc.)
- Printers
- Switches / Routers
- Temperature Sensors
- Other local or network-accessible services
Host objects provide a mechanism to group services that are running on the same physical device.
Here is an example of a host object which defines two child services:
object Host "my-server1" {
vars.address = "10.0.0.1"
check_command = "hostalive"
}
apply Service "ping4" {
check_command = "ping4"
assign where host.name == "my-server1"
}
apply Service "http" {
check_command = "http_ip"
assign where host.name == "my-server1"
}
The example host my-server1
creates two services which belong to this host:
ping4
and http
.
It also specifies that the host should perform its own check using the hostalive
check command.
The address
custom attribute is used by check commands to determine which network
address is associated with the host object.
Host States
Hosts inherit their state from the host check service that is specified using
the check
attribute.
Hosts can be in any of the following states:
Name | Description |
---|---|
UP | The host is available. |
DOWN | The host is unavailable. |
UNREACHABLE | At least one of the host's dependencies (e.g. its upstream router) is unavailable causing the host to be unreachable. |
Service States
Services can be in any of the following states:
Name | Description |
---|---|
OK | The service is working properly. |
WARNING | The service is experiencing some problems but is still considered to be in working condition. |
CRITICAL | The service is in a critical state. |
UNKNOWN | The check could not determine the service's state. |
Hard and Soft States
When detecting a problem with a service Icinga re-checks the service a number of
times (based on the max_check_attempts
and retry_interval
settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the service is in a SOFT
state.
After all re-checks have been executed and the service is still in a non-OK
state the service switches to a HARD
state and notifications are sent.
Name | Description |
---|---|
HARD | The host/service's state hasn't recently changed. |
SOFT | The host/service has recently changed state and is being re-checked. |