Commit Graph

141 Commits

Author SHA1 Message Date
Noah Hilverling e28277175b Implement concurrent checks limit for remote checks
fixes #4841
2018-01-29 14:50:14 +01:00
Gunnar Beutner c2fb9fe226 Use initializer lists for arrays and dictionaries 2018-01-16 12:27:44 +01:00
Michael Friedrich 211a07f49a Add 'ttl' support for check result freshness via REST API
The `process-check-result` action can now optionally set the
`ttl` parameter. This overrules the configured freshness
check (check_interval).

The main idea behind this is to allow the external sender
to specify when the next check result is coming in.

For example, a backup script which should be run every
24h can specify the exact expected next check result.

The addition to the CheckResult class is necessary to
forward the check result throughout the cluster and
calculate the `next_check` value on each node. This
allows us to send in a check result on a satellite,
and the master determines the freshness and possible
notifications/state changes for Icinga Web 2.
2018-01-15 13:54:11 +01:00
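
The `ttl` behavior described in the commit above can be sketched in Python. Only the `process-check-result` action and its `ttl` parameter come from the commit message. The helper names, the host and service names, and the freshness arithmetic are assumptions made for illustration, and the `type`/`filter` fields follow the usual shape of Icinga 2 API action requests and should be checked against the API documentation.

```python
import json


def build_check_result_payload(service_filter, exit_status, plugin_output, ttl=None):
    """Build a JSON body for a process-check-result request (hypothetical helper)."""
    payload = {
        "type": "Service",
        "filter": service_filter,
        "exit_status": exit_status,
        "plugin_output": plugin_output,
    }
    # 'ttl' is optional; when omitted, the configured check_interval applies.
    if ttl is not None:
        payload["ttl"] = ttl
    return payload


def expected_next_result(received_at, check_interval, ttl=None):
    """Assumed model: a given ttl overrules check_interval as the freshness window."""
    window = ttl if ttl is not None else check_interval
    return received_at + window


# A backup script that runs every 24h announces when its next result is due.
body = build_check_result_payload(
    'host.name=="backup-host" && service.name=="nightly-backup"',  # hypothetical names
    0,
    "Backup finished",
    ttl=24 * 60 * 60,
)
print(json.dumps(body, indent=2))
```

In this model the sender, not the monitoring configuration, decides how long the result stays fresh, which matches the backup example in the commit message.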
Gunnar Beutner ac155d1dda Apply clang-tidy fix 'modernize-redundant-void-arg' 2018-01-04 12:24:57 +01:00
Michael Insel 158ae2188e Change copyright header for 2018 2018-01-02 12:08:55 +01:00
Jean Flach 2636e6a77a Whitespace fix
What does this change?
* Remove use of spaces for formatting
These could be found by using `grep -r -l -P '^\t+ +[^*]'`
* Removal of trailing whitespace
* A few lines longer than 120 chars
2017-12-20 14:53:52 +01:00
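
The three checks listed in the commit above can be approximated with a short Python scan. This is a sketch, not the tooling used for the commit: the function name `whitespace_issues` and the issue labels are hypothetical, the first pattern is the one quoted in the `grep` command, and the length check counts a tab as one character, which may differ from how the 120-character limit was actually measured.

```python
import re

# Pattern quoted in the commit message: tabs followed by spaces, except for
# the " * " continuation lines of block comments.
SPACE_AFTER_TABS = re.compile(r"^\t+ +[^*]")
TRAILING_WS = re.compile(r"[ \t]+$")


def whitespace_issues(text, max_len=120):
    """Return (line_number, issue) tuples for the three kinds of problems."""
    issues = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SPACE_AFTER_TABS.search(line):
            issues.append((number, "spaces after tabs"))
        if TRAILING_WS.search(line):
            issues.append((number, "trailing whitespace"))
        if len(line) > max_len:  # assumption: a tab counts as one character
            issues.append((number, "line too long"))
    return issues


sample = "\tint a;\n\t  int b;\n\tint c; \n\t * comment\n"
for number, issue in whitespace_issues(sample):
    print(number, issue)
```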
Gunnar Beutner 1ad83886ac Replace a few more NULLs with nullptr 2017-12-14 15:37:20 +01:00
Gunnar Beutner 325e4a2fb9 Use nullptr instead of <Type>::Ptr() 2017-11-30 17:47:09 +01:00
Michael Friedrich 41d54029c8 Fix log messages for flapping 2017-11-08 12:12:27 +01:00
Jean Flach a21ffd6fe4 Fix flapping
Re-implement flapping following the 'old way' of just observing the last
20 state changes.

refs #4982
2017-10-24 15:54:05 +02:00
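
The idea of observing only the last 20 state changes can be sketched as a sliding window. This is a simplified, unweighted model written for illustration and not the code from the commit: the class name `FlappingWindow` and the two threshold values are assumptions.

```python
from collections import deque


class FlappingWindow:
    """Track whether each of the last `size` check results changed the state."""

    def __init__(self, size=20, threshold_high=30.0, threshold_low=25.0):
        self.changes = deque(maxlen=size)  # True where a result changed the state
        self.last_state = None
        self.threshold_high = threshold_high  # hypothetical start threshold (%)
        self.threshold_low = threshold_low    # hypothetical stop threshold (%)
        self.flapping = False

    def change_percentage(self):
        # Percentage of the full window, so a half-filled window counts less.
        return 100.0 * sum(self.changes) / self.changes.maxlen

    def record(self, state):
        """Feed one check result state and return the current flapping flag."""
        if self.last_state is not None:
            self.changes.append(state != self.last_state)
        self.last_state = state
        pct = self.change_percentage()
        if not self.flapping and pct > self.threshold_high:
            self.flapping = True
        elif self.flapping and pct < self.threshold_low:
            self.flapping = False
        return self.flapping


window = FlappingWindow()
for i in range(21):
    window.record(i % 2)  # alternate between two states
print(window.flapping, window.change_percentage())
```

Using two thresholds gives hysteresis, so a checkable hovering around a single limit does not toggle its flapping flag on every result.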
Michael ac0fdd7144 Change more loglines for checkables so checkable is quoted
refs #5528
2017-08-24 13:35:55 +02:00
Thomas Widhalm de9a097a97 Change loglines for checkables so checkable is quoted 2017-08-23 19:11:46 +02:00
Michael Friedrich b7caf0820d Ensure that *.icinga.com is used everywhere
fixes #13897
fixes #13277
2017-01-10 17:19:12 +01:00
Michael Friedrich 7e0c48643b Fix Flapping{Start,End} notifications in SOFT states or downtimes
fixes #12560
fixes #12892
2016-11-10 14:16:02 +01:00
Gunnar Beutner 5fdc874377 Don't generate 'UNKNOWN' results when the endpoint's log is still being resynced
fixes #12844
2016-10-24 08:38:58 +02:00
Michael Friedrich 3571965fef Fix SOFT/HARD state counting logic for check attempts <= 2
fixes #12592
2016-09-27 11:30:57 +02:00
Michael Friedrich ae75575874 Remove unused last_in_downtime field
fixes #12602
2016-08-31 15:21:26 +02:00
Gunnar Beutner 288413f046 Replace BOOST_FOREACH with range-based for loops
fixes #12538
2016-08-25 06:46:17 +02:00
Gunnar Beutner deb938d412 Fix incorrect notifications for soft recoveries
fixes #12529
2016-08-24 12:22:08 +02:00
Michael Friedrich cd1b2cdddd Fix that recovery notifications are sent in SOFT NOT-OK states
fixes #12517
2016-08-23 14:58:24 +02:00
Michael Friedrich 42818ab758 Fix downtime notification events and missing author/comment
fixes #12333
fixes #11851
2016-08-10 16:04:37 +02:00
Gunnar Beutner 1beef64dc4 Fix crash in Checkable::ProcessCheckResult when cr is NULL
refs #12329
2016-08-08 14:17:44 +02:00
Gunnar Beutner 597dc0dea2 Fix incorrect behavior for max_check_attempts
fixes #11898
2016-08-08 11:02:08 +02:00
Michael Friedrich 3f89a6dd09 Disable immediate hard state for first check result
fixes #7354
2016-08-04 16:16:58 +02:00
Michael Friedrich 34655d77d3 Ensure to send recovery notifications if there was a problem notification before a downtime
fixes #12293
2016-08-03 18:28:09 +02:00
Michael Friedrich cdd858a0ec Flapping{Start,End} notifications must not depend on state changes
fixes #11899
2016-06-15 17:43:37 +02:00
Michael Friedrich f7f976b962 DB IDO: Ensure that SOFT state changes with the same state are logged
fixes #11933
2016-06-14 11:08:28 +02:00
Gunnar Beutner a8209c1a1a Change which instance is responsible for initiating notifications in a HA setup
refs #9242
2016-06-14 07:57:52 +02:00
Markus Frosch 8808e709c9 Make change to OK always a hard state
refs #11654
2016-06-13 10:43:57 +02:00
Gunnar Beutner 0eb0992d5e Fix custom notifications in a HA zone
fixes #9242
2016-06-07 12:44:12 +02:00
Gunnar Beutner aeb7a4a70b Fix incorrect check interval for SOFT->HARD transitions
fixes #11825
2016-05-24 11:05:29 +02:00
Michael Friedrich d49b63d2ab Fix: First HARD state does not change retry_interval to check_interval
refs #11825
2016-05-21 18:58:19 +02:00
Michael Friedrich 3f1a9f150b Silence compiler warnings
refs #11823
2016-05-21 14:16:47 +02:00
Michael Friedrich b4843dc81b Fix: Volatile check results for OK->OK transitions are logged into DB IDO statehistory
fixes #11823
2016-05-21 13:41:43 +02:00
Gunnar Beutner 97a5091abc Fix incorrect re-scheduling behavior for command_endpoint checks
refs #8137
2016-05-12 13:47:32 +02:00
Michael Friedrich ba82d2eb20 Move CalculateExecutionTime and CalculateLatency into the CheckResult class
fixes #11751
2016-05-10 12:16:49 +02:00
Gunnar Beutner f6f3bd1e4c Implement support for limiting the number of concurrent checks
fixes #8137
2016-05-10 11:26:55 +02:00
Gunnar Beutner c6a015e317 Fix crash in Checkable::ExecuteCheck
fixes #11582
2016-04-19 09:37:04 +02:00
Michael Friedrich a30cb86ca1 Only call UpdateNextCheck() for soft states
refs #11336
2016-03-15 14:02:19 +01:00
Michael Friedrich d682f56c38 Use UpdateNextCheck() for determining the retry_interval in ProcessCheckResult()
This patch also moves the next check updates for passive
check results into ProcessCheckResult(). That way the
next check status updates for DB IDO work in a sane way
again.

refs #11336
2016-03-15 13:02:38 +01:00
Michael Friedrich 3bd6848763 Refactor patch for host recovery notifications
refs #10225
2016-03-15 09:47:59 +01:00
Michael Friedrich 3e050bd0cd Fix: Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications
fixes #11320
2016-03-11 13:19:03 +01:00
Michael Friedrich 7fb8bcd933 Use retry_interval on first OK -> NOT-OK state change
Only valid for active check results. The API actions were
missing that marker similar to the external command processor.

The initial OK -> NOT-OK transition should use the retry_interval
but nothing else.

fixes #11336
2016-03-11 12:00:30 +01:00
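
The interval selection described in the commit above can be modeled with a small state machine. This is a simplified sketch under stated assumptions, not the logic in `ProcessCheckResult()`: the class `CheckableModel`, its default values, and the exact attempt counting are hypothetical, and passive results are not modeled.

```python
from dataclasses import dataclass


@dataclass
class CheckableModel:
    """Toy model of SOFT/HARD states for active check results."""

    check_interval: float = 300.0
    retry_interval: float = 60.0
    max_check_attempts: int = 3
    ok: bool = True
    hard: bool = True
    attempt: int = 1

    def process(self, result_ok: bool) -> float:
        """Apply one active check result and return seconds until the next check."""
        if result_ok:
            # Assumption: a change to OK is always treated as a hard state.
            self.ok, self.hard, self.attempt = True, True, 1
        elif self.ok:
            # First OK -> NOT-OK transition: enter a soft state at attempt 1.
            self.ok = False
            self.attempt = 1
            self.hard = self.max_check_attempts <= 1
        elif not self.hard:
            # Further NOT-OK results count up until the state becomes hard.
            self.attempt += 1
            if self.attempt >= self.max_check_attempts:
                self.hard = True
        # Soft states are rechecked faster; hard states use the regular interval.
        return self.check_interval if self.hard else self.retry_interval


model = CheckableModel()
intervals = [model.process(ok) for ok in (False, False, False, True)]
print(intervals)
```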
Michael Friedrich 5b6a6f86b1 Fix host recovery notifications for warning states
fixes #10225
2016-03-11 09:29:07 +01:00
Michael Friedrich ef532f20eb Revert "Fix check scheduling w/ retry_interval"
This reverts commit a51e647cc7.

This patch causes trouble with check results received
1) passively 2) throughout the cluster. A proper patch
for setting the retry_interval on NOT-OK state changes
is required.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287
2016-03-05 18:16:49 +01:00
Michael Friedrich b8e3d61820 Revert "Properly set the next check time for active and passive checks"
This reverts commit 2a11b27972.

This patch does not properly work and breaks the check_interval setting
for passive checks. Requires a proper patch.

refs #11248
refs #11257
refs #11273

(the old issue)
refs #7287
2016-03-05 18:15:03 +01:00
Sebastian Chrostek 83845e609e Fix problem notifications while flapping is active
fixes #9969
fixes #9642
2016-02-23 16:27:22 +01:00
Gunnar Beutner e224e74994 Make sure the "syncing" attribute is set to false
refs #11083
2016-02-08 13:15:24 +01:00
Gunnar Beutner 6d5014b610 Increase grace period for agent-based checks
refs #11020
2016-02-08 09:46:01 +01:00
Michael Friedrich 7a3848af1e Remove debug output
refs #11014
2016-01-29 14:03:58 +01:00
Michael Friedrich b960850ce3 DB IDO: Only update 'next_check' column when manually scheduling a check
Otherwise the changes from #7287 already take care of setting
the proper next check time from inside ProcessCheckResult().

There is no need to use the generic OnNextCheckChanged signal
but instead we're using a new one, locally just for DB IDO.

fixes #11019
2016-01-22 18:42:15 +01:00
Michael Friedrich 2a11b27972 Properly set the next check time for active and passive checks
fixes #7287
refs #11019
2016-01-22 18:40:14 +01:00
Gunnar Beutner 72c3b6d75b Make sure we're not running command_endpoint-based checks more than once
refs #10963
2016-01-21 10:37:47 +01:00
Michael Friedrich a51e647cc7 Fix check scheduling w/ retry_interval
fixes #7287
2016-01-20 16:29:01 +01:00
Gunnar Beutner 599929b0f6 Update copyright headers for 2016 2016-01-12 08:29:59 +01:00
Gunnar Beutner e3c75faabc Implement support for recursive object locks
fixes #10596
2015-11-11 10:21:30 +01:00
Michael Friedrich 43976d3989 Add host.last_state_{up,down} and last_check attribute, hide *_raw attributes
fixes #10508
fixes #10509
2015-11-02 14:10:44 +01:00
Gunnar Beutner 4aa0165701 Add getter for endpoint 'connected' attribute
fixes #10394
2015-10-22 10:52:38 +02:00
Michael Friedrich 286538c17e Implement api event streams
Documentation is not yet complete.

refs #9078
2015-10-21 15:34:26 +02:00
Gunnar Beutner 1a6b41787a Implement joins for status queries
fixes #10060
2015-09-22 09:45:23 +02:00
Michael Friedrich 3403765900 Use the command_endpoint name as check_source value if defined
fixes #9218
2015-09-05 15:18:10 +02:00
Michael Friedrich d7970f5bb1 Implement modified attributes v2
refs #9081
refs #9093
2015-08-15 20:07:10 +02:00
Gunnar Beutner 10441e9cd7 Fix permissions for agent CheckResult messages
fixes #8821
2015-03-30 13:50:14 +02:00
Michael Friedrich 05c237c780 Don't increment check attempt counter on OK->NOT-OK transition
refs #7287

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2015-03-11 16:33:36 +01:00
James Pharaoh 9fe52d0dc1 Make checks using 'command_endpoint' work inside HA zones
Previously there was no local processing of the executed
check result, which is mandatory inside a HA cluster.

Additionally this patch splits the command execution and
check result processing into more logical parts, executing
local checks, checks on the same command endpoint, and
remote checks.

More details in the referenced issue.

fixes #8249

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2015-02-12 17:53:50 +01:00
Michael Friedrich 78bfd0204c Update copyright year 2015-01-22 12:00:23 +01:00
Michael Friedrich 6ae9685cee Fix sending notifications for volatile checks on OK->OK changes
Volatile checks make state changes behave like HARD state changes,
but OK -> OK transitions must not trigger notifications.

Increased log information for notifications too.

fixes #8063
2015-01-08 16:20:44 +01:00
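
The rule stated in the commit above can be written as a small predicate. This is an illustrative sketch only: the function name `should_notify` and its parameters are hypothetical, and real notification handling also depends on downtimes, acknowledgements, and flapping, none of which are modeled here.

```python
def should_notify(volatile, old_ok, new_ok, hard_state_change):
    """Decide whether a check result should trigger a notification (toy model)."""
    if volatile:
        # Every result acts like a hard state change, except OK -> OK.
        return not (old_ok and new_ok)
    # Non-volatile checkables notify only on hard state changes.
    return hard_state_change


print(should_notify(volatile=True, old_ok=True, new_ok=True, hard_state_change=False))
print(should_notify(volatile=True, old_ok=False, new_ok=False, hard_state_change=False))
```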
Michael Friedrich d11286e9a5 DB IDO: Update child object reachability if parent changes to !{OK,UP}
fixes #7683
2014-12-12 16:12:05 +01:00
Michael Friedrich cc8fe684fe Execute checks locally if command_endpoint == local endpoint
fixes #7863
2014-12-05 11:35:00 +01:00
Gunnar Beutner 7321e45abc Implement support for executing remote commands
fixes #7559
2014-11-13 14:54:55 +01:00
Gunnar Beutner 478f03b49a Replace boost::shared_ptr with boost::intrusive_ptr
refs #7622
2014-11-09 16:54:41 +01:00
Gunnar Beutner 2d5e9514a5 Refactor logging code 2014-10-19 17:52:17 +02:00
Gunnar Beutner b18f57a745 Remove logger_fwd.hpp 2014-10-19 14:50:39 +02:00
Michael Friedrich 31c9406684 Add OnCheckPeriodChanged event
refs #5219
2014-08-26 17:11:19 +02:00
Michael Friedrich 0db1b5095d Add OnMaxCheckAttemptsChanged event
refs #5219
2014-08-26 17:11:19 +02:00
Michael Friedrich 552d0a7d18 Add On{Event,Check}CommandChanged event
refs #5219
2014-08-26 17:11:19 +02:00
Michael Friedrich 3899601744 Add On{Check,Retry}IntervalChanged event
refs #5219
2014-08-26 17:11:19 +02:00
Gunnar Beutner 2d6ed4c9be Make sure that event handlers are run for hard recoveries
fixes #6686
2014-07-22 14:16:22 +02:00
Gunnar Beutner ec92309349 Don't run event commands when hosts/services are OK
fixes #6686
2014-07-16 11:48:36 +02:00
Michael Friedrich 3ecec31af3 Change log message identifier for libicinga.
Refs #6346
2014-05-28 14:42:00 +02:00
Michael Friedrich e070db65c8 Fix check statistics are mixing host/service checks.
Fixes #6313
2014-05-26 20:56:59 +02:00
Gunnar Beutner 632026cd9f Rename C++ header files.
Fixes #6291
2014-05-25 16:27:14 +02:00
Michael Friedrich 1436575095 Fix incorrect host state change logs.
Fixes #6290
2014-05-25 12:45:29 +02:00
Gunnar Beutner 820b1a340c Improve log messages.
Refs #6070
2014-05-23 19:07:44 +02:00
Michael Friedrich 1df7518b35 Move more log messages to 'notice' severity.
Refs #6070
2014-05-22 23:47:03 +02:00
Gunnar Beutner 657b3c6a1a Fix deadlock in db_ido.
Fixes #6230
2014-05-19 10:56:50 +02:00
Gunnar Beutner 9c3e399188 Remove unnecessary includes.
Fixes #6189
2014-05-11 18:11:32 +02:00
Gunnar Beutner 45270f1bb8 Refactor the agent and cluster components.
Refs #6107
2014-05-08 09:13:04 +02:00
Gunnar Beutner 2961364e97 Implement support for agent-based checks.
Refs #4865
2014-04-12 04:21:09 +02:00
Gunnar Beutner 1c115297f9 Rename the service state constants.
Fixes #5964
2014-04-08 09:11:54 +02:00
Gunnar Beutner 23e9630682 Implement host checks.
Refs #5919
2014-04-04 15:57:54 +02:00