6120 Commits

Author SHA1 Message Date
Alexander Aleksandrovič Klimov
9293dea787
Merge pull request #9303 from Icinga/bugfix/icingadb-sticky-comments-2.13
Icinga DB: correct ack comments' is_sticky
2022-03-30 14:15:27 +02:00
Alexander Aleksandrovič Klimov
5e0ac8c457
Merge pull request #9306 from Icinga/bugfix/add-some-missing-locks-2.13
Add some missing locks to prevent data races
2022-03-30 10:53:38 +02:00
Alexander A. Klimov
df3be79194 Icinga DB: correct ack comments' is_sticky
On ack Icinga first adds a comment, then acks the checkable
so the ack event has the comment ID.

But due to the yet missing ack the comment was missing is_sticky.
That's corrected now.
2022-03-30 09:45:39 +02:00
Alexander A. Klimov
45b723644c Introduce Comment#sticky
Carries whether ack was sticky for ack comments.
2022-03-30 09:45:39 +02:00
Alexander Aleksandrovič Klimov
5f1e0ee2aa
Merge pull request #9302 from Icinga/bugfix/icingadb-ignore-acks-in-comment-history-2.13
IcingaDB#SendRemovedComment(): ignore ack comments like #SendAddedCom…
2022-03-29 21:37:29 +02:00
Alexander Aleksandrovič Klimov
5a65190d02
Merge pull request #9301 from Icinga/bugfix/icingadb-remove-comment-history-2.13
Icinga DB: discard comment removals with missing information
2022-03-29 20:29:06 +02:00
Yonas Habteab
36c75218e4 ConfigObject: Initialize local static var at declaration to ensure thread safety 2022-03-29 16:38:09 +02:00
Yonas Habteab
178eb928e6 ConfigItem: Use atomic variables for notified and commited items count 2022-03-29 16:38:09 +02:00
Alexander A. Klimov
8d3a847998 IcingaDB#SendRemovedComment(): ignore ack comments like #SendAddedComment()
Icinga DB doesn't expect comment history for ack comments.

Before:

1. Acked checkable recovers
2. Icinga clears ack comments w/o setting removal time
3. Icinga DB gets neither removal time, nor expire time
4. Icinga DB falls back to NULL and violates NOT NULL constraint
2022-03-29 16:16:06 +02:00
Julian Brost
41043f86a8 Icinga DB: discard comment removals with missing information
If comments get removed in unintended ways (i.e. not by expiring or by using
the remove-comment API action), the comment object misses information to create
a proper history event for Icinga DB. Therefor, discard these events.
2022-03-29 15:56:25 +02:00
Julian Brost
de8960feaa Add missing array locking in IcingaDB::GetArrayDeletedValues()
icinga::Array requires locking by the caller when iterating using Begin() and
End(). This is only checked in debug builds but there it makes this function
fail.
2022-03-29 15:54:07 +02:00
Julian Brost
f67a5532dc
Merge pull request #9285 from Icinga/bugfix/suppressed-state-notifications-2.13
Checkable: send state notifications after suppression if and only if the state differs compared to before the suppression started
2022-03-29 15:16:04 +02:00
Julian Brost
332dcb8bec
Merge pull request #9271 from Icinga/feature/icingadb-redis-lost-history-memory-213
Icinga DB: keep history in memory until written to Redis
2022-03-29 13:57:48 +02:00
Julian Brost
3be1202eb3
Merge pull request #9290 from Icinga/bugfix/override-default-template-apply-rules-7914
Apply rules: import default templates first
2022-03-29 13:55:41 +02:00
Alexander A. Klimov
07cd15f48f Apply rules: import default templates first
... to allow to override the attributes they set.

refs #7914
2022-03-24 14:04:58 +01:00
Julian Brost
ccb18a04ec Checkable: Add test for state notifications after a suppression ends 2022-03-09 17:06:09 +01:00
Julian Brost
6303d8df09 Checkable: sync state_before_suppression in cluster
This ensures that in case of a failover in an HA zone, the other can take over
properly and has the required state to send the proper notifications.
2022-03-09 17:06:09 +01:00
Julian Brost
29fc3ad151 Checkable: improve state notifications after suppression ends
This commit changes the Checkable notification suppression logic (notifications
are currently suppressed on the Checkable if it is unreachable, in a downtime,
or acknowledged) to that after the suppression reason ends, a state
notification is sent if and only if the first hard state after is different
from the last hard state from before. If the checkable is in a soft state after
the suppression ends, the notification is further suppressed until a hard state
is reached.

To achieve this behavior, a new attribute state_before_suppression is added to
Checkable. This attribute is set to the last hard state the first time either a
PROBLEM or a RECOVERY notification is suppressed. Compared to from before,
neither of these two flags in the suppressed_notification will ever be cleared
while the supression is still ongoing but only after the suppression ended and
the current state is compared with the old state stored in
state_before_suppression.
2022-03-09 17:06:09 +01:00
Julian Brost
e09eaa3ad2
Merge pull request #9239 from Icinga/bugfix/adjust-behavior-of-service-get-severity
Service#GetSeverity(): behave as the respective IDO query of Icinga Web
2022-03-08 16:34:18 +01:00
Julian Brost
316dd110e2
Merge pull request #9237 from Icinga/bugfix/influxdb-writer-synchronization
Fix unsafe concurrent access to m_DataBuffer in InfluxdbCommonWriter
2022-03-08 16:25:14 +01:00
Julian Brost
dbe13e2f32
Merge pull request #9238 from Icinga/bugfix/timeperiod-dst-2.0
LegacyTimePeriod::ScriptFunc: fix DST edge-cases
2022-03-08 15:22:09 +01:00
Julian Brost
12293d999c
Merge pull request #9190 from Icinga/bugfix/sync-missing-history-information-213
Icinga DB: ensure consistent history streams in HA setup
2022-03-07 11:32:15 +01:00
Julian Brost
db84b834ca
Merge pull request #9214 from Icinga/feature/icingadb-add-previous_soft_state-to-host_state-and-service_state-9210-213
IcingaDB: Add previous_soft_state to host_state and service_state
2022-03-07 11:19:50 +01:00
Julian Brost
50ef32a0ad
Merge pull request #9228 from Icinga/bugfix/processcheckresult-dependency-deadlock-2.13
Prevent deadlock in ProcessCheckResult
2022-03-07 11:16:00 +01:00
Julian Brost
c6bac19da8
Merge pull request #9241 from Icinga/bugfix/icingadb-reachabilitychangehandler-9143
Icinga DB: ensure is_reachable and severity don't miss updates
2022-03-07 09:27:35 +01:00
Julian Brost
c68ab0f3bc
Merge pull request #9242 from Icinga/bugfix/multi-ido-notification-id
IDO: fix incorrect contacts in notification history with multiple IDO instances on a single node
2022-03-07 09:22:27 +01:00
Alexander A. Klimov
a9c6161def IcingaDB#Stop(): don't block shutdown, timeout instead 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
f1a942681f IcingaDB#Send*(): don't enqueue any history once stopped 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
2ab52d5c59 RedisConnection#Connect(): wait for all promises to be completed
by the read loop from the previous connection.
2022-03-03 09:57:03 +01:00
Alexander A. Klimov
1d1e2b2888 Introduce IoEngine::YieldCurrentCoroutine() 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
59dd3592c2 RedisConnection#ReadLoop(): don't crash (silently) if a promise to be set is already set 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
b7426f4ee6 Icinga DB: include amount of history kept in memory in /v1/status 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
3cc82069cd Icinga DB: log amount of history kept in memory every 10s 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
0137713d15 Icinga DB: keep history in memory until written to Redis
by putting the messages into a Bulker and retrying each chunk.
2022-03-03 09:57:03 +01:00
Alexander A. Klimov
8538ba97aa Introduce Bulker 2022-03-03 09:57:03 +01:00
Julian Brost
53a389769c
Merge pull request #9260 from Icinga/bugfix/event-handler-spamming-8704-213
Checkable#ExecuteEventHandler(): don't outsource event command run twice
2022-02-25 16:52:27 +01:00
Alexander A. Klimov
74935dad7b Checkable#ExecuteEventHandler(): don't outsource event command run twice
refs #8704
2022-02-24 14:03:57 +01:00
Alexander A. Klimov
88b041c7c9 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged less often
Call it only on state changes to reduce no-op Redis/IDO updates a lot.

refs #9143
2022-02-23 16:06:31 +01:00
Alexander A. Klimov
4ea65076b0 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged last
to ensure Checkable#IsReachable() returns correctly for dependency children inside OnReachabilityChanged().
That needs the dependency parent to be already in the correct state.

refs #9143
2022-02-23 16:06:31 +01:00
Alexander A. Klimov
3c8c248508 Icinga DB: ensure is_reachable and severity don't miss updates
refs #9143
2022-02-23 16:06:31 +01:00
Julian Brost
175fd2df44 Icinga DB: remove obsolete StateChangeHandler overload
This version of StateChangeHandler is no longer called anywhere as it was the
wrong function for all previous callers anyways.
2022-02-23 11:08:56 +01:00
Julian Brost
b60d3e5665 Icinga DB: make host problem change events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
3466dacaac Icinga DB: make acknowledgement events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
5ea0208b83 Icinga DB: make downtime events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
f671374762 Icinga DB: don't reimplement volatile state update in SendConfigUpdate
Sending a volatile state update is already implemented in UpdateState, so just
use that function instead of generating the update queries.
2022-02-23 11:08:56 +01:00
Julian Brost
244575ecbd Icinga DB: Merge SendStatusUpdate into UpdateState
Previously, both funktions did related operations but had unclear and confusing
naming:
- UpdateState updated the icinga:{host,service}:state Redis keys.
- SendStatusUpdate sent a runtime update for the icinga:{host,service}:state.

This commit merges both functions into one with a new mode parameter. The
following modes are now supported:
- Volatile: Update the icinga:{host,service}:state Redis key.
- Full: Perform the volatile state update and in addition send a corresponding
  runtime update so that this state update gets written through to the
  persistent database by a running icingadb process.
- RuntimeOnly: Special mode for callers that can ensure that a volatile update
  for the current state was already performed but has to be upgraded to a full
  update.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
ed5ad46266 IDO: use per-instance notification_id in history
When there are multiple active IDO instances on the same node, before this
commit, all of them would share a single DbValue object for the notification_id
column of the icinga_contactnotifications table. This resulted in the issue
that one database references the notification_id in another database.

This commit fixes this by using a separate DbValue value for each IDO instance.
This needs a new signal as the existing OnQuery and OnMultipleQueries signals
perform the same queries on all IDO instances, but different queries are needed
here per instance (they only differ in the referenced DbValue). Therefore, a
new signal OnMakeQueries is added that takes a std::function which is called
once per IDO instance and can access callbacks to perform one or multiple
queries only on this specific IDO instance.
2022-02-21 15:51:04 +01:00
Alexander Aleksandrovič Klimov
501691cdde Service#GetSeverity(): behave as the respective IDO query of Icinga Web
which doesn't include host reachability.
2022-02-21 15:30:49 +01:00
Julian Brost
26246a4601 LegacyTimePeriod::ScriptFunc: fix DST edge-cases
This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
  full days, but ScriptFunc may have called them with some random timestamp
  during the day. This is fixed by always using midnight of the day as
  reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
  On days with DST changes, this could either still be on the same day (a day
  may have 25 hours) or skip an entire day (a day may have 23 hours). This is
  fixed by using a struct tm to advance the time to the next day.
2022-02-21 15:24:15 +01:00
Julian Brost
9b2f95d6d2 InfluxdbCommonWriter: use atomic_size_t to data buffer size from stats function
m_DataBuffer may be modified concurrently while StatsFunc() is called, thus
it's unsafe to call size() on it. As write access to m_DataBuffer is already
synchronized by only modifying it from the single work queue thread, instead of
adding a mutex, this commit adds a new std::atomic_size_t which is additionally
updated when modifying m_DataBuffer and can safely be accessed in StatsFunc().
2022-02-21 15:11:56 +01:00