icinga2/doc/3.04-notifications.md

8.8 KiB

Notifications

Notifications for service and host problems are an integral part of your monitoring setup.

There are many ways of sending notifications, e.g. by e-mail, XMPP, IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications. Instead it relies on external mechanisms such as shell scripts to notify users.

A notification specification requires one or more users (and/or user groups) who will be notified in case of problems. These users must have all custom attributes defined which will be used in the NotificationCommand on execution.

TODO

The user icingaadmin in the example below will get notified only on WARNING and CRITICAL states and problem and recovery notification types.

object User "icingaadmin" {
  display_name = "Icinga 2 Admin"
  enable_notifications = 1
  notification_state_filter = [ OK, Warning, Critical ]
  notification_type_filter = [ Problem, Recovery ]
  vars.email = "icinga@localhost"
  vars.pager = "+49123456789"
  }
}

If you don't set the notification_state_filter and notification_type_filter configuration attributes for the User object, all states and types will be notified.

You should choose which information you (and your notified users) are interested in case of emergency, and also which information does not provide any value to you and your environment.

There are various custom attributes available at runtime execution of the NotificationCommand. The example below may or may not fit your needs.

object NotificationCommand "mail-service-notification" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/mail-notification.sh" ]

  env = {
    "NOTIFICATIONTYPE" = "$notification.type$"
    "SERVICEDESC" = "$service.description$"
    "HOSTALIAS" = "$host.displayname$",
    "HOSTADDRESS" = "$host.vars.address$",
    "SERVICESTATE" = "$service.state$",
    "LONGDATETIME" = "$icinga.longdatetime$",
    "SERVICEOUTPUT" = "$service.output$",
    "NOTIFICATIONAUTHORNAME" = "$notification.author$",
    "NOTIFICATIONCOMMENT" = "$notification.comment$",
	"HOSTDISPLAYNAME" = "$host.displayname$",
    "SERVICEDISPLAYNAME" = "$service.displayname$",
    "USEREMAIL" = "$user.vars.email$"
  }
}

The command attribute in the mail-service-notification command refers to the shell script installed into /etc/icinga2/scripts/mail-notification.sh. The macros specified in the env dictionary are exported as environment variables and can be used in the notification script.

You can add all shared attributes to a Notification template which is inherited to the defined notifications. That way you'll save duplicated attributes in each Notification object. Attributes can be overridden locally.

TODO

template Notification "generic-notification" {
  notification_interval = 15m

  notification_command = "mail-service-notification"

  notification_state_filter = [ Warning, Critical, Unknown ]
  notification_type_filter = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                               FlappingEnd, DowntimeStart,DowntimeEnd, DowntimeRemoved ]

  notification_period = "24x7"
}

The time period 24x7 is shipped as example configuration with Icinga 2.

Use the apply keyword to create Notification objects for your services:

apply Notification "mail" to Service {
  import "generic-notification"

  notification_command = "mail-notification"
  users = [ "icingaadmin" ]

  assign where service.name == "mysql"
}

Instead of assigning users to notifications, you can also add the user_groups attribute with a list of user groups to the Notification object. Icinga 2 will resolve all group members and send notifications to all of them.

Notification Escalations

When a problem notification is sent and a problem still exists after re-notification you may want to escalate the problem to the next support level. A different approach is to configure the default notification by email, and escalate the problem via sms if not already solved.

You can define notification start and end times as additional configuration attributes making the Notification object a so-called notification escalation. Using templates you can share the basic notification attributes such as users or the notification_interval (and override them for the escalation then).

Using the example from above, you can define additional users being escalated for sms notifications between start and end time.

object User "icinga-oncall-2nd-level" {
  display_name = "Icinga 2nd Level"
  enable_notifications = true

  vars.mobile = "+1 555 424642"
}

object User "icinga-oncall-1st-level" {
  display_name = "Icinga 1st Level"
  enable_notifications = true

  vars.mobile = "+1 555 424642"
  }
}

Define an additional NotificationCommand for SMS notifications.

Note

The example is not complete as there are many different SMS providers. Please note that sending SMS notifications will require an SMS provider or local hardware with a sim card active.

object NotificationCommand "sms-notification" {
   command = PluginDir + "/send_sms_notification $mobile$ ..."
}

The two new notification escalations are added onto the host localhost and its service ping4 using the generic-notification template. The user icinga-oncall-2nd-level will get notified by SMS (sms-notification command) after 30m until 1h.

Note

The notification_interval was set to 15m in the generic-notification template example. Lower that value in your escalations by using a secondary template or overriding the attribute directly in the notifications array position for escalation-sms-2nd-level.

If the problem does not get resolved or acknowledged preventing further notifications the escalation-sms-1st-level user will be escalated 1h after the initial problem was notified, but only for one hour (2h as end key for the times dictionary).

apply Notification "mail" to Service {
  import "generic-notification"
  notification_command = "mail-notification"
  users = [ "icingaadmin" ]

  assign where service.name == "ping4"
}

apply Notification "escalation-sms-2nd-level" to Service {
  import "generic-notification"
  notification_command = "sms-notification"
  users = [ "icinga-oncall-2nd-level" ]
      
  times = {
    begin = 30m
    end = 1h
  }

  assign where service.name == "ping4"
}

apply Notification "escalation-sms-1st-level" to Service {
  import "generic-notification"
  notification_command = "sms-notification"
  users = [ "icinga-oncall-1st-level" ]
      
  times = {
    begin = 1h
    end = 2h
  }

  assign where service.name == "ping4"
}        

Instead of assigning users to notifications, you can also add the user_groups attribute with a list of user groups to the Notification object. Icinga 2 will resolve all group members and send notifications and notification escalations to all of them.

First Notification Delay

Sometimes the problem in question should not be notified when the first notification happens, but a defined time duration afterwards. In Icinga 2 you can use the times dictionary and set begin = 15m as key and value if you want to suppress notifications in the first 15 minutes. Leave out the end key - if not set, Icinga 2 will not check against any end time for this notification.

apply Notification "mail" to Service {
  import "generic-notification"
  notification_command = "mail-notification"
  users = [ "icingaadmin" ]
      
  times.begin = 15m // delay first notification

  assign where service.name == "ping4"
}

Notification Filters by State and Type

If there are no notification state and type filter attributes defined at the Notification or User object Icinga 2 assumes that all states and types are being notified.

Available state and type filters for notifications are:

template Notification "generic-notification" {

  notification_state_filter = [ Warning, Critical, Unknown ]
  notification_type_filter = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
                               FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]
}

If you are familiar with Icinga 1.x notification_options please note that they have been split into type and state, and allow more fine granular filtering for example on downtimes and flapping. You can filter for acknowledgements and custom notifications too.