mirror of https://github.com/Icinga/icinga2.git
Documentation: Better apply rule best practice in monitoring basics
fixes #7480 fixes #7543 fixes #7187 fixes #7573
parent
711517bd0e
commit
c1f4d2243e
|
@ -1219,7 +1219,7 @@ re-implementation of the Livestatus protocol which is compatible with MK
|
|||
Livestatus.
|
||||
|
||||
Details on the available tables and attributes with Icinga 2 can be found
|
||||
in the [Livestatus Schema](#schema-livestatus) section.
|
||||
in the [Livestatus](#livestatus) section.
|
||||
|
||||
You can enable Livestatus using icinga2 feature enable:
|
||||
|
||||
|
|
|
@ -37,6 +37,12 @@ Here is an example of a host object which defines two child services:
|
|||
The example creates two services `ping4` and `http` which belong to the
|
||||
host `my-server1`.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> When using [apply](#using-apply) rules, a service apply definition will
|
||||
> implicitly create a relationship to each host by setting the `host_name`
|
||||
> attribute.
|
||||
|
||||
It also specifies that the host should perform its own check using the `hostalive`
|
||||
check command.
|
||||
|
||||
|
@ -109,7 +115,7 @@ requirements first and then decide for a possible strategy.
|
|||
There are many ways of creating Icinga 2 configuration objects:
|
||||
|
||||
* Manually with your preferred editor, for example vi(m), nano, notepad, etc.
|
||||
* Generated by a configuration management tool such as Puppet, Chef, Ansible, etc.
|
||||
* Generated by a [configuration management tool](#configuration-tools) such as Puppet, Chef, Ansible, etc.
|
||||
* A configuration addon for Icinga 2
|
||||
* A custom exporter script from your CMDB or inventory tool
|
||||
* your own.
|
||||
|
@ -143,7 +149,7 @@ You can later use them for applying assign/ignore rules, or export them into ext
|
|||
* Put hosts into hostgroups, services into servicegroups and use these attributes for your apply rules.
|
||||
* Use templates to store generic attributes for your objects and apply rules making your configuration more readable.
|
||||
Details can be found in the [using templates](#using-templates) chapter.
|
||||
* Apply rules may overlap. Keep a central place (for example, `services.conf` or `notifications.conf`) storing
|
||||
* Apply rules may overlap. Keep a central place (for example, [services.conf](#services-conf) or [notifications.conf](#notifications-conf)) storing
|
||||
the configuration instead of defining apply rules deep in your configuration tree.
|
||||
* Every plugin used as check, notification or event command requires a `Command` definition.
|
||||
Further details can be looked up in the [check commands](#check-commands) chapter.
|
||||
|
@ -164,22 +170,31 @@ object:
|
|||
enable_perfdata = true
|
||||
}
|
||||
|
||||
object Service "ping4" {
|
||||
template Service "ipv6-service" {
|
||||
notes = "IPv6 critical != IPv4 broken."
|
||||
}
|
||||
|
||||
apply Service "ping4" {
|
||||
import "generic-service"
|
||||
|
||||
host_name = "localhost"
|
||||
check_command = "ping4"
|
||||
|
||||
assign where host.address
|
||||
}
|
||||
|
||||
object Service "ping6" {
|
||||
apply Service "ping6" {
|
||||
import "generic-service"
|
||||
import "ipv6-service"
|
||||
|
||||
host_name = "localhost"
|
||||
check_command = "ping6"
|
||||
|
||||
assign where host.address6
|
||||
}
|
||||
|
||||
|
||||
In this example the `ping4` and `ping6` services inherit properties from the
|
||||
template `generic-service`.
|
||||
template `generic-service`. The `ping6` service additionally imports the `ipv6-service`
|
||||
template with the `notes` attribute.
|
||||
|
||||
Objects as well as templates themselves can import an arbitrary number of
|
||||
templates. Attributes inherited from a template can be overridden in the
|
||||
|
@ -187,42 +202,135 @@ object if necessary.
|
|||
|
||||
### <a id="using-apply"></a> Apply objects based on rules
|
||||
|
||||
Instead of assigning each object (`Service`, `Notification`, `Dependency`, `ScheduledDowntime`)
|
||||
Instead of assigning each object ([Service](#objecttype-service),
|
||||
[Notification](#objecttype-notification), [Dependency](#objecttype-dependency),
|
||||
[ScheduledDowntime](#objecttype-scheduleddowntime))
|
||||
based on attribute identifiers, for example `host_name`, objects can be [applied](#apply).
|
||||
|
||||
Detailed scenario examples are used in their respective chapters, for example
|
||||
[apply services with custom command arguments](#using-apply-services-command-arguments).
|
||||
Before you start using the apply rules keep the following in mind:
|
||||
|
||||
* Define the best match.
|
||||
* A set of unique [custom attributes](#custom-attributes-apply) for these hosts/services?
|
||||
* Or [group](#groups) memberships, e.g. a host being a member of a hostgroup, applying services to it?
|
||||
* A generic pattern [match](#function-calls) on the host/service name?
|
||||
* [Multiple expressions combined](#using-apply-expressions) with `&&` or `||` [operators](#expression-operators)
|
||||
* All expressions must return a boolean value (for example, an empty string evaluates to `false`).
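The matching strategies listed above can be sketched as follows. This is a minimal illustration, not taken from the sample configuration: the `os` custom attribute value, the `linux-servers` host group and the `web-*` naming scheme are assumptions, and the `generic-service` template is expected to exist.

```
// Match on a custom attribute (the attribute value is an assumption).
apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"

  assign where host.vars.os == "Linux"
}

// Match on group membership (assumes a host group "linux-servers").
apply Service "load" {
  import "generic-service"
  check_command = "load"

  assign where "linux-servers" in host.groups
}

// Match on a name pattern, combined with a second condition using &&.
apply Service "http" {
  import "generic-service"
  check_command = "http"

  assign where match("web-*", host.name) && host.address
}
```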
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> You can set/override object attributes in apply rules using the respectively available
|
||||
> objects in that scope (host and/or service objects).
|
||||
|
||||
[Custom attributes](#custom-attributes) can also store nested dictionaries and arrays. That way you can use them
|
||||
not only to match on their existence or values in apply expressions, but also to assign
|
||||
("inherit") their values into the objects generated by apply rules.
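As a rough sketch of this idea, the host below carries a nested dictionary that is used both for matching and for filling attributes of the generated service. The host name, the `webshop` dictionary and its keys are hypothetical; the sketch also assumes that the `http` check command reads the `http_vhost` and `http_uri` custom attributes.

```
object Host "web-01" {              // hypothetical host
  import "generic-host"
  address = "192.0.2.10"

  vars.webshop = {                  // hypothetical nested dictionary
    vhost = "shop.example.org"
    uri = "/status"
  }
}

apply Service "webshop-http" {
  import "generic-service"
  check_command = "http"

  // Copy ("inherit") values from the host's custom attribute.
  vars.http_vhost = host.vars.webshop.vhost
  vars.http_uri = host.vars.webshop.uri

  // Match on the existence of the dictionary.
  assign where host.vars.webshop
}
```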
|
||||
|
||||
* [Apply services to hosts](#using-apply-services)
|
||||
* [Apply notifications to hosts and services](#using-apply-notifications)
|
||||
* [Apply dependencies to hosts and services](#using-apply-dependencies)
|
||||
* [Apply scheduled downtimes to hosts and services](#using-apply-scheduledowntimes)
|
||||
|
||||
A more advanced example is using [apply with for loops on arrays or
|
||||
dictionaries](#using-apply-for) for example provided by
|
||||
[custom attributes](#custom-attributes-apply) or groups.
|
||||
|
||||
> **Tip**
|
||||
>
|
||||
> Building configuration in that dynamic way requires detailed information
|
||||
> of the generated objects. Use the `object list` [cli command](#cli-command-object)
|
||||
> after successful [configuration validation](#config-validation).
|
||||
|
||||
|
||||
#### <a id="using-apply-expressions"></a> Apply Rules Expressions
|
||||
|
||||
You can use simple or advanced combinations of apply rule expressions. Each
|
||||
expression must evaluate to the boolean value `true` for the rule to match. An empty string
|
||||
is, for instance, interpreted as `false`. Similarly, undefined
|
||||
attributes evaluate to `false`.
|
||||
|
||||
Returns `false`:
|
||||
|
||||
assign where host.vars.attribute_does_not_exist
|
||||
|
||||
Multiple `assign where` lines are combined with a logical `OR`.
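A minimal sketch of that behaviour, assuming a hypothetical `os` custom attribute on the hosts: either line on its own is enough for the rule to match.

```
apply Service "ping4" {
  import "generic-service"
  check_command = "ping4"

  // Two separate lines: the rule matches if either one is true.
  assign where host.vars.os == "Linux"
  assign where host.vars.os == "Windows"
}
```

The same condition could be written on one line as `assign where host.vars.os == "Linux" || host.vars.os == "Windows"`.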
|
||||
|
||||
You can combine multiple expressions for matching only a subset of objects. In some cases,
|
||||
you want to be able to add more than one assign/ignore where expression which matches
|
||||
a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.
|
||||
|
||||
|
||||
Match all `*mysql*` patterns in the host name and (`&&`) custom attribute `prod_mysql_db`
|
||||
matches the `db-*` pattern. All hosts with the custom attribute `test_server` set to `true`
|
||||
are ignored, as is any host whose name matches the `*internal` pattern.
|
||||
|
||||
object HostGroup "mysql-server" {
|
||||
display_name = "MySQL Server"
|
||||
|
||||
assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
|
||||
ignore where host.vars.test_server == true
|
||||
ignore where match("*internal", host.name)
|
||||
}
|
||||
|
||||
Similar example for advanced notification apply rule filters: If the service
|
||||
attribute `notes` contains the `has gold support 24x7` string `AND` one of the
|
||||
two conditions passes: either the `customer` host custom attribute is set to `customer-xy`
|
||||
`OR` the host custom attribute `always_notify` is set to `true`.
|
||||
|
||||
The notification is ignored for services whose host name ends with `*internal`
|
||||
`OR` the `priority` custom attribute is [less than](#expression-operators) `2` `AND` the host attribute `is_clustered` is `true`.
|
||||
|
||||
template Notification "cust-xy-notification" {
|
||||
users = [ "noc-xy", "mgmt-xy" ]
|
||||
command = "mail-service-notification"
|
||||
}
|
||||
|
||||
apply Notification "notify-cust-xy-mysql" to Service {
|
||||
import "cust-xy-notification"
|
||||
|
||||
assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
|
||||
ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.is_clustered == true)
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
#### <a id="using-apply-services"></a> Apply Services to Hosts
|
||||
|
||||
apply Service "load" {
|
||||
The sample configuration already ships a detailed example in [hosts.conf](#hosts-conf)
|
||||
and [services.conf](#services-conf) for this use case.
|
||||
|
||||
The example for `ssh` applies a service object to all hosts with the `address`
|
||||
attribute being defined and the custom attribute `os` set to the string `Linux` in `vars`.
|
||||
|
||||
apply Service "ssh" {
|
||||
import "generic-service"
|
||||
|
||||
check_command = "load"
|
||||
check_command = "ssh"
|
||||
|
||||
assign where "linux-server" in host.groups
|
||||
ignore where host.vars.no_load_check
|
||||
assign where host.address && host.vars.os == "Linux"
|
||||
}
|
||||
|
||||
In this example the `load` service will be created as object for all hosts in the `linux-server`
|
||||
host group. If the `no_load_check` custom attribute is set, the host will be
|
||||
ignored.
|
||||
|
||||
Other detailed scenario examples are used in their respective chapters, for example
|
||||
[apply services with custom command arguments](#using-apply-services-command-arguments).
|
||||
|
||||
#### <a id="using-apply-notifications"></a> Apply Notifications to Hosts and Services
|
||||
|
||||
Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
|
||||
manner:
|
||||
|
||||
|
||||
apply Notification "mail-noc" to Service {
|
||||
import "mail-service-notification"
|
||||
command = "mail-service-notification"
|
||||
|
||||
user_groups = [ "noc" ]
|
||||
|
||||
assign where service.vars.sla == "24x7"
|
||||
assign where host.vars.notification.mail
|
||||
}
|
||||
|
||||
|
||||
In this example the `mail-noc` notification will be created as object for all services having the
|
||||
`sla` custom attribute set to `24x7`. The notification command is set to `mail-service-notification`
|
||||
`notification.mail` custom attribute defined. The notification command is set to `mail-service-notification`
|
||||
and all members of the user group `noc` will get notified.
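The same pattern works with `Host` as the target. The following sketch assumes that a `mail-host-notification` command and a `noc` user group are defined; the rule name `mail-noc-host` is made up for illustration.

```
apply Notification "mail-noc-host" to Host {
  command = "mail-host-notification"   // assumed to be defined elsewhere

  user_groups = [ "noc" ]

  assign where host.vars.notification.mail
}
```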
|
||||
|
||||
#### <a id="using-apply-dependencies"></a> Apply Dependencies to Hosts and Services
|
||||
|
@ -231,9 +339,138 @@ Detailed examples can be found in the [dependencies](#dependencies) chapter.
|
|||
|
||||
### <a id="using-apply-scheduledowntimes"></a> Apply Recurring Downtimes to Hosts and Services
|
||||
|
||||
The sample configuration ships an example in [downtimes.conf](#downtimes-conf).
|
||||
|
||||
Detailed examples can be found in the [recurring downtimes](#recurring-downtimes) chapter.
|
||||
|
||||
|
||||
#### <a id="using-apply-for"></a> Using Apply For Rules
|
||||
|
||||
Next to the standard way of using apply rules there is the `apply for` rule, which loops over arrays or dictionaries.
|
||||
|
||||
The sample configuration already ships a detailed example in [hosts.conf](#hosts-conf)
|
||||
and [services.conf](#services-conf) for this use case.
|
||||
|
||||
Imagine a different example: You are monitoring your switch (hosts) with many
|
||||
interfaces (services). The following requirements/problems apply:
|
||||
|
||||
* Each interface service check should be named with a prefix and a running number
|
||||
* Each interface has its own vlan tag
|
||||
* Some interfaces have QoS enabled
|
||||
* Additional attributes such as `display_name`, `notes`, `notes_url` and `action_url` must be
|
||||
dynamically generated
|
||||
|
||||
By defining the `interfaces` dictionary with three example interfaces on the `core-switch`
|
||||
host object, you provide the data required by the for loop in the service apply
|
||||
rule.
|
||||
|
||||
|
||||
object Host "core-switch" {
|
||||
import "generic-host"
|
||||
address = "127.0.0.1"
|
||||
|
||||
vars.interfaces["0"] = {
|
||||
port = 1
|
||||
vlan = "internal"
|
||||
address = "127.0.0.2"
|
||||
qos = "enabled"
|
||||
}
|
||||
vars.interfaces["1"] = {
|
||||
port = 2
|
||||
vlan = "mgmt"
|
||||
address = "127.0.1.2"
|
||||
}
|
||||
vars.interfaces["2"] = {
|
||||
port = 3
|
||||
vlan = "remote"
|
||||
address = "127.0.2.2"
|
||||
}
|
||||
}
|
||||
|
||||
You can also omit the `"if-"` string, then all generated service names are directly
|
||||
taken from the `if_name` variable value.
|
||||
|
||||
The config dictionary contains all key-value pairs for the specific interface in one
|
||||
loop cycle, like `port`, `vlan`, `address` and `qos` for the `0` interface.
|
||||
|
||||
By defining a default value for the custom attribute `qos` in the `vars` dictionary
|
||||
before adding the `config` dictionary we'll ensure that this attribute is always defined.
|
||||
|
||||
After `vars` is fully populated, all object attributes can be set. For strings, you can use
|
||||
string concatenation with the `+` operator.
|
||||
|
||||
You can also specify the check command that way.
|
||||
|
||||
apply Service "if-" for (if_name => config in host.vars.interfaces) {
|
||||
import "generic-service"
|
||||
check_command = "ping4"
|
||||
|
||||
vars.qos = "disabled"
|
||||
vars += config
|
||||
|
||||
display_name = "if-" + if_name + "-" + vars.vlan
|
||||
|
||||
notes = "Interface check for Port " + string(vars.port) + " in VLAN " + vars.vlan + " on Address " + vars.address + " QoS " + vars.qos
|
||||
notes_url = "http://foreman.company.com/hosts/" + host.name
|
||||
action_url = "http://snmp.checker.company.com/" + host.name + "/if-" + if_name
|
||||
|
||||
assign where host.vars.interfaces
|
||||
}
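As a variation on the rule above, the name prefix can be left out. This sketch reuses the `interfaces` dictionary of the `core-switch` host, so the generated service names would be taken directly from the dictionary keys (`0`, `1` and `2`).

```
apply Service for (if_name => config in host.vars.interfaces) {
  import "generic-service"
  check_command = "ping4"

  // Copy the per-interface key-value pairs into the service.
  vars += config

  assign where host.vars.interfaces
}
```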
|
||||
|
||||
Note that numbers must be explicitly cast to strings when they are concatenated with strings.
|
||||
This can be achieved by wrapping them into the [string()](#function-calls) function.
|
||||
|
||||
> **Tip**
|
||||
>
|
||||
> Building configuration in that dynamic way requires detailed information
|
||||
> of the generated objects. Use the `object list` [cli command](#cli-command-object)
|
||||
> after successful [configuration validation](#config-validation).
|
||||
|
||||
|
||||
#### <a id="using-apply-object attributes"></a> Use Object Attributes in Apply Rules
|
||||
|
||||
Since apply rules are evaluated after the generic objects, you
|
||||
can reference existing host and/or service object attributes as
|
||||
values for any object attribute specified in that apply rule.
|
||||
|
||||
object Host "opennebula-host" {
|
||||
import "generic-host"
|
||||
address = "10.1.1.2"
|
||||
|
||||
vars.hosting["xyz"] = {
|
||||
http_uri = "/shop"
|
||||
customer_name = "Customer xyz"
|
||||
customer_id = "7568"
|
||||
support_contract = "gold"
|
||||
}
|
||||
vars.hosting["abc"] = {
|
||||
http_uri = "/shop"
|
||||
customer_name = "Customer xyz"
|
||||
customer_id = "7568"
|
||||
support_contract = "silver"
|
||||
}
|
||||
}
|
||||
|
||||
apply Service for (customer => config in host.vars.hosting) {
|
||||
import "generic-service"
|
||||
check_command = "ping4"
|
||||
|
||||
vars.qos = "disabled"
|
||||
|
||||
vars += config
|
||||
|
||||
vars.http_uri = "/" + customer + "/" + config.http_uri
|
||||
|
||||
display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id
|
||||
|
||||
notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."
|
||||
|
||||
notes_url = "http://foreman.company.com/hosts/" + host.name
|
||||
action_url = "http://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
|
||||
|
||||
assign where host.vars.hosting
|
||||
}
|
||||
|
||||
### <a id="groups"></a> Groups
|
||||
|
||||
Groups are used for combining hosts, services, and users into
|
||||
|
@ -296,13 +533,16 @@ If there is a certain number of hosts, services, or users matching a pattern
|
|||
it's reasonable to assign the group object to these members.
|
||||
Details on the `assign where` syntax can be found [here](#apply).
|
||||
|
||||
object HostGroup "mssql" {
|
||||
display_name = "MSSQL Servers"
|
||||
assign where host.vars.mssql_port
|
||||
object HostGroup "prod-mssql" {
|
||||
display_name = "Production MSSQL Servers"
|
||||
assign where host.vars.mssql_port && host.vars.prod_mysql_db
|
||||
ignore where host.vars.test_server == true
|
||||
ignore where match("*internal", host.name)
|
||||
}
|
||||
|
||||
In this inherited example from above all hosts with the `vars` attribute `mssql_port`
|
||||
set will be added as members to the host group `mssql`.
|
||||
set will be added as members to the host group `prod-mssql`. All `*internal`
|
||||
hosts and hosts with the `test_server` attribute set to `true` will be ignored.
|
||||
|
||||
## <a id="notifications"></a> Notifications
|
||||
|
||||
|
@ -367,17 +607,20 @@ to the defined notifications. That way you'll save duplicated attributes in each
|
|||
|
||||
The time period `24x7` is shipped as example configuration with Icinga 2.
|
||||
|
||||
|
||||
|
||||
Use the `apply` keyword to create `Notification` objects for your services:
|
||||
|
||||
apply Notification "mail" to Service {
|
||||
apply Notification "notify-cust-xy-mysql" to Service {
|
||||
import "generic-notification"
|
||||
|
||||
command = "mail-notification"
|
||||
users = [ "icingaadmin" ]
|
||||
users = [ "noc-xy", "mgmt-xy" ]
|
||||
|
||||
assign where service.name == "mysql"
|
||||
assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
|
||||
ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.is_clustered == true)
|
||||
}
|
||||
|
||||
|
||||
Instead of assigning users to notifications, you can also add the `user_groups`
|
||||
attribute with a list of user groups to the `Notification` object. Icinga 2 will
|
||||
send notifications to all group members.
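A minimal sketch of the group-based variant: the `noc` user group and the user `jdoe` are hypothetical, while `generic-notification` and `mail-notification` are assumed to be defined as in the examples of this chapter.

```
object UserGroup "noc" {             // hypothetical group
  display_name = "Network Operations"
}

object User "jdoe" {                 // hypothetical user
  email = "jdoe@example.org"
  groups = [ "noc" ]
}

apply Notification "mail-noc" to Service {
  import "generic-notification"

  command = "mail-notification"
  user_groups = [ "noc" ]            // notifies every member of "noc"

  assign where service.name == "mysql"
}
```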
|
||||
|
@ -424,7 +667,7 @@ Define an additional [NotificationCommand](#notification) for SMS notifications.
|
|||
"..."
|
||||
}
|
||||
|
||||
The two new notification escalations are added onto the host `localhost`
|
||||
The two new notification escalations are added onto the local host
|
||||
and its service `ping4` using the `generic-notification` template.
|
||||
The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
|
||||
command) after `30m` until `1h`.
|
||||
|
@ -482,8 +725,9 @@ notified, but only for one hour (`2h` as `end` key for the `times` dictionary).
|
|||
Sometimes the problem in question should not be notified when the notification is due
|
||||
(the object reaching the `HARD` state) but a defined time duration afterwards. In Icinga 2
|
||||
you can use the `times` dictionary and set `begin = 15m` as key and value if you want to
|
||||
postpone the first notification for 15 minutes. Leave out the `end` key - if not set,
|
||||
Icinga 2 will not check against any end time for this notification.
|
||||
postpone the notification window for 15 minutes. Leave out the `end` key - if not set,
|
||||
Icinga 2 will not check against any end time for this notification. Make sure to
|
||||
specify a relatively low notification `interval` to get notified soon enough again.
|
||||
|
||||
apply Notification "mail" to Service {
|
||||
import "generic-notification"
|
||||
|
@ -491,7 +735,9 @@ Icinga 2 will not check against any end time for this notification.
|
|||
command = "mail-notification"
|
||||
users = [ "icingaadmin" ]
|
||||
|
||||
times.begin = 15m // delay first notification
|
||||
interval = 5m
|
||||
|
||||
times.begin = 15m // delay notification window
|
||||
|
||||
assign where service.name == "ping4"
|
||||
}
|
||||
|
@ -528,7 +774,7 @@ Available state and type filters for notifications are:
|
|||
|
||||
If you are familiar with Icinga 1.x `notification_options` please note that they have been split
|
||||
into type and state to allow more fine-grained filtering, for example on downtimes and flapping.
|
||||
You can filter for acknowledgements and custom notifications too.
|
||||
You can filter for acknowledgements and custom notifications too.
|
||||
|
||||
|
||||
## <a id="timeperiods"></a> Time Periods
|
||||
|
@ -1337,13 +1583,33 @@ re-notify if the problem persists.
|
|||
|
||||
## <a id="custom-attributes"></a> Custom Attributes
|
||||
|
||||
### <a id="custom-attributes-apply"></a> Using Custom Attributes for Apply Rules
|
||||
|
||||
Custom attributes are not only used at runtime in command definitions to pass
|
||||
command arguments, but are also a smart way to define patterns and groups
|
||||
for applying objects for dynamic config generation.
|
||||
|
||||
There are several ways of using custom attributes with [apply rules](#using-apply):
|
||||
|
||||
* As a simple attribute literal ([number](#numeric-literal), [string](#string-literal),
|
||||
[boolean](#boolean-literal)) for expression conditions (`assign where`, `ignore where`)
|
||||
* As an [array](#array) or [dictionary](#dictionary) attribute with nested values
|
||||
(e.g. dictionaries in dictionaries) in [apply for](#using-apply-for) rules.
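Both styles can be sketched on a single host. The host name and the `production` and `vhosts` custom attributes are assumptions for illustration, as is the use of `http_vhost` by the `http` check command.

```
object Host "web-02" {               // hypothetical host
  import "generic-host"
  address = "192.0.2.20"

  vars.production = true             // simple boolean literal
  vars.vhosts = [ "shop.example.org", "blog.example.org" ]   // array
}

// Simple literal used in an expression condition.
apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"

  assign where host.vars.production == true
}

// Array used in an apply for rule: one service per array entry.
apply Service "http-" for (vhost in host.vars.vhosts) {
  import "generic-service"
  check_command = "http"

  vars.http_vhost = vhost

  assign where host.vars.vhosts
}
```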
|
||||
|
||||
Features like [DB IDO](#db-ido), [Livestatus](#livestatus) or [StatusData](#status-data)
|
||||
dump this column as a JSON-encoded string and set `is_json` or `cv_is_json`, respectively, to `1`.
|
||||
|
||||
If arrays are used in runtime macros (for example `$host.groups$`) all entries
|
||||
are separated using the `;` character. If an entry contains a semi-colon itself,
|
||||
it is escaped like this: `entry1;ent\;ry2;entry3`.
|
||||
|
||||
### <a id="runtime-custom-attributes"></a> Using Custom Attributes at Runtime
|
||||
|
||||
Custom attributes may be used in command definitions to dynamically change how the command
|
||||
is executed.
|
||||
|
||||
Additionally there are Icinga 2 features such as the `PerfDataWriter` type
|
||||
which use custom attributes to format their output.
|
||||
which use custom runtime attributes to format their output.
|
||||
|
||||
> **Tip**
|
||||
>
|
||||
|
|
|
@ -280,6 +280,13 @@ Functions can be called using the `()` operator:
|
|||
check_interval = len(MyGroups) * 1m
|
||||
}
|
||||
|
||||
> **Tip**
|
||||
>
|
||||
> Use these functions in [apply](#using-apply) rule expressions.
|
||||
|
||||
assign where match("192.168.*", host.address)
|
||||
|
||||
|
||||
Function | Description
|
||||
--------------------------------|-----------------------
|
||||
regex(pattern, text) | Returns true if the regex pattern matches the text, false otherwise.
|
||||
|
|
|
@ -50,6 +50,7 @@ apply Service "ssh" {
|
|||
check_command = "ssh"
|
||||
|
||||
assign where host.address && host.vars.os == "Linux"
|
||||
ignore where host.name == "localhost" /* for upgrade safety */
|
||||
}
|
||||
|
||||
|
||||
|
|