92 Commits

Author SHA1 Message Date
Jean Flach 1a9c1591c0 Fix check behavior on restart
This patch changes the way checkresults are handled during a restart.

  1. Check results coming in during a shutdown are ignored.
  2. Upon start, checks which should have run (next_check in the past),
  are re-scheduled within the first minute.

This new behavior means there will be no more "Unknown - Terminated"
checkresults during a restart and checks with high check_interval will
be run earlier if they were already scheduled to run. The downside is
that after Icinga2 was down for a while, there will be a lot of checks
within the first minute. Our max concurrent checks limit should take care of
this, though.
2018-04-10 15:52:50 +02:00
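
The two rules in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `ShouldProcessResult`, `RescheduleOnStart`, and the `spread_fraction` parameter are hypothetical names, and spreading overdue checks evenly over 60 seconds is an assumption about what "within the first minute" means.

```cpp
#include <cassert>

// Hypothetical helper: check results that arrive while the daemon is
// shutting down are dropped instead of being stored as "terminated".
bool ShouldProcessResult(bool shutting_down)
{
    return !shutting_down;
}

// Hypothetical helper: on start, a check whose next_check lies in the past
// is moved into the first minute after `now`. `spread_fraction` is a value
// in [0, 1) (for example a random number) that spreads overdue checks
// across that minute. Checks that are not overdue keep their schedule.
double RescheduleOnStart(double next_check, double now, double spread_fraction)
{
    assert(spread_fraction >= 0.0 && spread_fraction < 1.0);

    if (next_check >= now)
        return next_check; // still in the future, leave untouched

    return now + spread_fraction * 60.0;
}
```

For instance, a check that was due at timestamp 100 when the daemon starts at timestamp 200 ends up somewhere in the range 200 to 260, depending on `spread_fraction`.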
Noah Hilverling e28277175b Implement concurrent checks limit for remote checks
fixes #4841
2018-01-29 14:50:14 +01:00
Gunnar Beutner c2fb9fe226 Use initializer lists for arrays and dictionaries 2018-01-16 12:27:44 +01:00
Michael Friedrich 211a07f49a Add 'ttl' support for check result freshness via REST API
The `process-check-result` action can now optionally set the
`ttl` parameter. This overrules the configured freshness
check (check_interval).

The main idea behind this is to allow the external sender
to specify when the next check result is coming in.

For example, a backup script which should be run every
24h can specify the exact expected next check result.

The addition to the CheckResult class is necessary to
forward the check result throughout the cluster and
calculate the `next_check` value on each node. This
allows us to send in a check result on a satellite,
and the master determines the freshness and possible
notifications/state changes for Icinga Web 2.
2018-01-15 13:54:11 +01:00
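
As a rough sketch of the idea in the commit above: when a check result carries a `ttl`, that value replaces `check_interval` when working out when the next result is expected. The struct and function names below are hypothetical, and treating a non-positive `ttl` as "not set" is an assumption made for this illustration only.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a check result.
struct CheckResultSketch {
    double execution_end; // Unix timestamp at which the result was produced
    double ttl;           // seconds until the next result is expected; <= 0 means "not set"
};

// Returns the timestamp by which the next check result is expected.
// A ttl supplied by the sender overrules the configured check_interval.
double NextExpectedResult(const CheckResultSketch& cr, double check_interval)
{
    assert(check_interval > 0.0);

    double window = (cr.ttl > 0.0) ? cr.ttl : check_interval;
    return cr.execution_end + window;
}
```

A backup script that runs every 24 hours could send `ttl = 86400`, so that the result is only considered stale a day later, even if the configured `check_interval` is five minutes.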
Gunnar Beutner ac155d1dda Apply clang-tidy fix 'modernize-redundant-void-arg' 2018-01-04 12:24:57 +01:00
Michael Insel 158ae2188e Change copyright header for 2018 2018-01-02 12:08:55 +01:00
Jean Flach 2636e6a77a Whitespace fix
What does this change?
* Remove use of spaces for formatting
These could be found by using `grep -r -l -P '^\t+ +[^*]'`
* Removal of trailing whitespace
* A few lines longer than 120 chars
2017-12-20 14:53:52 +01:00
Gunnar Beutner 1ad83886ac Replace a few more NULLs with nullptr 2017-12-14 15:37:20 +01:00
Gunnar Beutner 325e4a2fb9 Use nullptr instead of <Type>::Ptr() 2017-11-30 17:47:09 +01:00
Michael Friedrich 41d54029c8 Fix log messages for flapping 2017-11-08 12:12:27 +01:00
Jean Flach a21ffd6fe4 Fix flapping
Re-implement flapping following the 'old way' of just observing the last
20 state changes.

refs #4982
2017-10-24 15:54:05 +02:00
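
A sliding window over the most recent check results is one way to picture the approach named in the commit above. The sketch below is an assumption-laden illustration, not the code from the patch: the class name `FlappingWindow`, the unweighted percentage, and the caller-supplied threshold are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical sketch: remembers, for the last `size` check results,
// whether each one was a state change.
class FlappingWindow {
public:
    explicit FlappingWindow(std::size_t size = 20) : m_Size(size)
    {
        assert(size > 0);
    }

    // Record one check result; drop the oldest entry once the window is full.
    void Record(bool state_changed)
    {
        m_Changes.push_back(state_changed);
        if (m_Changes.size() > m_Size)
            m_Changes.pop_front();
    }

    // Share of state changes relative to the full window size, in percent.
    double ChangePercent() const
    {
        std::size_t changed = 0;
        for (bool c : m_Changes) {
            if (c)
                ++changed;
        }
        return 100.0 * changed / m_Size;
    }

    // The threshold is supplied by the caller; no default is assumed here.
    bool IsFlapping(double threshold_percent) const
    {
        return ChangePercent() > threshold_percent;
    }

private:
    std::size_t m_Size;
    std::deque<bool> m_Changes;
};
```

Dividing by the window size rather than by the number of recorded results keeps the percentage low while the window is still filling up, which avoids reporting flapping after only one or two results.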
Michael ac0fdd7144 Change more loglines for checkables so checkable is quoted
refs #5528
2017-08-24 13:35:55 +02:00
Thomas Widhalm de9a097a97 Change loglines for checkables so checkable is quoted 2017-08-23 19:11:46 +02:00
Michael Friedrich b7caf0820d Ensure that *.icinga.com is used everywhere
fixes #13897
fixes #13277
2017-01-10 17:19:12 +01:00
Michael Friedrich 7e0c48643b Fix Flapping{Start,End} notifications in SOFT states or downtimes
fixes #12560
fixes #12892
2016-11-10 14:16:02 +01:00
Gunnar Beutner 5fdc874377 Don't generate 'UNKNOWN' results when the endpoint's log is still being resynced
fixes #12844
2016-10-24 08:38:58 +02:00
Michael Friedrich 3571965fef Fix SOFT/HARD state counting logic for check attempts <= 2
fixes #12592
2016-09-27 11:30:57 +02:00
Michael Friedrich ae75575874 Remove unused last_in_downtime field
fixes #12602
2016-08-31 15:21:26 +02:00
Gunnar Beutner 288413f046 Replace BOOST_FOREACH with range-based for loops
fixes #12538
2016-08-25 06:46:17 +02:00
Gunnar Beutner deb938d412 Fix incorrect notifications for soft recoveries
fixes #12529
2016-08-24 12:22:08 +02:00
Michael Friedrich cd1b2cdddd Fix that recovery notifications are sent in SOFT NOT-OK states
fixes #12517
2016-08-23 14:58:24 +02:00
Michael Friedrich 42818ab758 Fix downtime notification events and missing author/comment
fixes #12333
fixes #11851
2016-08-10 16:04:37 +02:00
Gunnar Beutner 1beef64dc4 Fix crash in Checkable::ProcessCheckResult when cr is NULL
refs #12329
2016-08-08 14:17:44 +02:00
Gunnar Beutner 597dc0dea2 Fix incorrect behavior for max_check_attempts
fixes #11898
2016-08-08 11:02:08 +02:00
Michael Friedrich 3f89a6dd09 Disable immediate hard state for first check result
fixes #7354
2016-08-04 16:16:58 +02:00
Michael Friedrich 34655d77d3 Ensure to send recovery notifications if there was a problem notification before a downtime
fixes #12293
2016-08-03 18:28:09 +02:00
Michael Friedrich cdd858a0ec Flapping{Start,End} notifications must not depend on state changes
fixes #11899
2016-06-15 17:43:37 +02:00
Michael Friedrich f7f976b962 DB IDO: Ensure that SOFT state changes with the same state are logged
fixes #11933
2016-06-14 11:08:28 +02:00
Gunnar Beutner a8209c1a1a Change which instance is responsible for initiating notifications in a HA setup
refs #9242
2016-06-14 07:57:52 +02:00
Markus Frosch 8808e709c9 Make change to OK always a hard state
refs #11654
2016-06-13 10:43:57 +02:00
Gunnar Beutner 0eb0992d5e Fix custom notifications in a HA zone
fixes #9242
2016-06-07 12:44:12 +02:00
Gunnar Beutner aeb7a4a70b Fix incorrect check interval for SOFT->HARD transitions
fixes #11825
2016-05-24 11:05:29 +02:00
Michael Friedrich d49b63d2ab Fix: First HARD state does not change retry_interval to check_interval
refs #11825
2016-05-21 18:58:19 +02:00
Michael Friedrich 3f1a9f150b Silence compiler warnings
refs #11823
2016-05-21 14:16:47 +02:00
Michael Friedrich b4843dc81b Fix: Volatile check results for OK->OK transitions are logged into DB IDO statehistory
fixes #11823
2016-05-21 13:41:43 +02:00
Gunnar Beutner 97a5091abc Fix incorrect re-scheduling behavior for command_endpoint checks
refs #8137
2016-05-12 13:47:32 +02:00
Michael Friedrich ba82d2eb20 Move CalculateExecutionTime and CalculateLatency into the CheckResult class
fixes #11751
2016-05-10 12:16:49 +02:00
Gunnar Beutner f6f3bd1e4c Implement support for limiting the number of concurrent checks
fixes #8137
2016-05-10 11:26:55 +02:00
Gunnar Beutner c6a015e317 Fix crash in Checkable::ExecuteCheck
fixes #11582
2016-04-19 09:37:04 +02:00
Michael Friedrich a30cb86ca1 Only call UpdateNextCheck() for soft states
refs #11336
2016-03-15 14:02:19 +01:00
Michael Friedrich d682f56c38 Use UpdateNextCheck() for determining the retry_interval in ProcessCheckResult()
This patch also moves the next check updates for passive
check results into ProcessCheckResult(). That way the
next check status updates for DB IDO work in a sane way
again.

refs #11336
2016-03-15 13:02:38 +01:00
Michael Friedrich 3bd6848763 Refactor patch for host recovery notifications
refs #10225
2016-03-15 09:47:59 +01:00
Michael Friedrich 3e050bd0cd Fix: Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications
fixes #11320
2016-03-11 13:19:03 +01:00
Michael Friedrich 7fb8bcd933 Use retry_interval on first OK -> NOT-OK state change
Only valid for active check results. The API actions were
missing that marker similar to the external command processor.

The initial OK -> NOT-OK transition should use the retry_interval
but nothing else.

fixes #11336
2016-03-11 12:00:30 +01:00
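
The interval selection described in the commit above might look roughly like this. It is a simplified sketch under stated assumptions: `NextCheckInterval` and `StateType` are hypothetical names, and the rule "SOFT NOT-OK uses retry_interval, everything else uses check_interval" is inferred from this and the neighbouring commit messages rather than taken from the code.

```cpp
// Hypothetical, simplified state type.
enum class StateType { Soft, Hard };

// Picks the interval until the next active check.
// A NOT-OK result in a SOFT state (such as the first OK -> NOT-OK
// transition) is re-checked after retry_interval; OK results and
// HARD states fall back to the regular check_interval.
double NextCheckInterval(bool is_ok, StateType type,
                         double check_interval, double retry_interval)
{
    if (!is_ok && type == StateType::Soft)
        return retry_interval;

    return check_interval;
}
```

With `check_interval = 300` and `retry_interval = 30`, the first failing result is followed up after 30 seconds, and the schedule returns to 300 seconds once the state becomes HARD or recovers.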
Michael Friedrich 5b6a6f86b1 Fix host recovery notifications for warning states
fixes #10225
2016-03-11 09:29:07 +01:00
Michael Friedrich ef532f20eb Revert "Fix check scheduling w/ retry_interval"
This reverts commit a51e647cc7.

This patch causes trouble with check results received
1) passively 2) throughout the cluster. A proper patch
for setting the retry_interval on NOT-OK state changes
is required.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287
2016-03-05 18:16:49 +01:00
Michael Friedrich b8e3d61820 Revert "Properly set the next check time for active and passive checks"
This reverts commit 2a11b27972.

This patch does not properly work and breaks the check_interval setting
for passive checks. Requires a proper patch.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287
2016-03-05 18:15:03 +01:00
Sebastian Chrostek 83845e609e Fix problem notifications while flapping is active
fixes #9969
fixes #9642
2016-02-23 16:27:22 +01:00
Gunnar Beutner e224e74994 Make sure the "syncing" attribute is set to false
refs #11083
2016-02-08 13:15:24 +01:00
Gunnar Beutner 6d5014b610 Increase grace period for agent-based checks
refs #11020
2016-02-08 09:46:01 +01:00