13014 Commits

Author SHA1 Message Date
Alexander A. Klimov
07cd15f48f Apply rules: import default templates first
... to allow to override the attributes they set.

refs #7914
2022-03-24 14:04:58 +01:00
Julian Brost
6071d93981
Merge pull request #9269 from Icinga/feature/subscription-213
GHA: build RHEL and Amazon Linux
2022-03-22 15:09:54 +01:00
Julian Brost
ccb18a04ec Checkable: Add test for state notifications after a suppression ends 2022-03-09 17:06:09 +01:00
Julian Brost
6303d8df09 Checkable: sync state_before_suppression in cluster
This ensures that in case of a failover in an HA zone, the other can take over
properly and has the required state to send the proper notifications.
2022-03-09 17:06:09 +01:00
Julian Brost
29fc3ad151 Checkable: improve state notifications after suppression ends
This commit changes the Checkable notification suppression logic (notifications
are currently suppressed on the Checkable if it is unreachable, in a downtime,
or acknowledged) to that after the suppression reason ends, a state
notification is sent if and only if the first hard state after is different
from the last hard state from before. If the checkable is in a soft state after
the suppression ends, the notification is further suppressed until a hard state
is reached.

To achieve this behavior, a new attribute state_before_suppression is added to
Checkable. This attribute is set to the last hard state the first time either a
PROBLEM or a RECOVERY notification is suppressed. Compared to from before,
neither of these two flags in the suppressed_notification will ever be cleared
while the supression is still ongoing but only after the suppression ended and
the current state is compared with the old state stored in
state_before_suppression.
2022-03-09 17:06:09 +01:00
Julian Brost
e09eaa3ad2
Merge pull request #9239 from Icinga/bugfix/adjust-behavior-of-service-get-severity
Service#GetSeverity(): behave as the respective IDO query of Icinga Web
2022-03-08 16:34:18 +01:00
Julian Brost
316dd110e2
Merge pull request #9237 from Icinga/bugfix/influxdb-writer-synchronization
Fix unsafe concurrent access to m_DataBuffer in InfluxdbCommonWriter
2022-03-08 16:25:14 +01:00
Julian Brost
dbe13e2f32
Merge pull request #9238 from Icinga/bugfix/timeperiod-dst-2.0
LegacyTimePeriod::ScriptFunc: fix DST edge-cases
2022-03-08 15:22:09 +01:00
Julian Brost
12293d999c
Merge pull request #9190 from Icinga/bugfix/sync-missing-history-information-213
Icinga DB: ensure consistent history streams in HA setup
2022-03-07 11:32:15 +01:00
Julian Brost
db84b834ca
Merge pull request #9214 from Icinga/feature/icingadb-add-previous_soft_state-to-host_state-and-service_state-9210-213
IcingaDB: Add previous_soft_state to host_state and service_state
2022-03-07 11:19:50 +01:00
Julian Brost
50ef32a0ad
Merge pull request #9228 from Icinga/bugfix/processcheckresult-dependency-deadlock-2.13
Prevent deadlock in ProcessCheckResult
2022-03-07 11:16:00 +01:00
Julian Brost
c6bac19da8
Merge pull request #9241 from Icinga/bugfix/icingadb-reachabilitychangehandler-9143
Icinga DB: ensure is_reachable and severity don't miss updates
2022-03-07 09:27:35 +01:00
Julian Brost
99e1b2efe2
Merge pull request #9240 from Icinga/bugfix/doc-allow-to-change-severity-formula-across-icinga2-releases
Doc: technical concepts: allow to change severity formula across Icin…
2022-03-07 09:23:36 +01:00
Julian Brost
c68ab0f3bc
Merge pull request #9242 from Icinga/bugfix/multi-ido-notification-id
IDO: fix incorrect contacts in notification history with multiple IDO instances on a single node
2022-03-07 09:22:27 +01:00
Alexander A. Klimov
a9c6161def IcingaDB#Stop(): don't block shutdown, timeout instead 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
f1a942681f IcingaDB#Send*(): don't enqueue any history once stopped 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
2ab52d5c59 RedisConnection#Connect(): wait for all promises to be completed
by the read loop from the previous connection.
2022-03-03 09:57:03 +01:00
Alexander A. Klimov
1d1e2b2888 Introduce IoEngine::YieldCurrentCoroutine() 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
59dd3592c2 RedisConnection#ReadLoop(): don't crash (silently) if a promise to be set is already set 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
b7426f4ee6 Icinga DB: include amount of history kept in memory in /v1/status 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
3cc82069cd Icinga DB: log amount of history kept in memory every 10s 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
0137713d15 Icinga DB: keep history in memory until written to Redis
by putting the messages into a Bulker and retrying each chunk.
2022-03-03 09:57:03 +01:00
Alexander A. Klimov
8538ba97aa Introduce Bulker 2022-03-03 09:57:03 +01:00
Alexander A. Klimov
dbe4093bab GHA: preserve .rpm job names 2022-03-02 18:17:50 +01:00
Alexander A. Klimov
41d9422919 GHA: build Amazon Linux 2022-03-02 18:17:50 +01:00
Alexander A. Klimov
5d5cd1fd31 GHA: build RHEL 2022-03-02 18:17:50 +01:00
Alexander A. Klimov
8cdc53ffd4 GHA: correct subscription packages repo 2022-03-02 18:17:50 +01:00
Alexander A. Klimov
add9c823a1 GHA: new subscription packages repo access token
with more permissions and URL-friendlier login name.
2022-03-02 18:17:50 +01:00
Alexander A. Klimov
726497a0b5 GHA: explicitly specify whether $DISTRO packages require subscription
to have all info at one place in the file.
2022-03-02 18:17:50 +01:00
Julian Brost
53a389769c
Merge pull request #9260 from Icinga/bugfix/event-handler-spamming-8704-213
Checkable#ExecuteEventHandler(): don't outsource event command run twice
2022-02-25 16:52:27 +01:00
Alexander A. Klimov
74935dad7b Checkable#ExecuteEventHandler(): don't outsource event command run twice
refs #8704
2022-02-24 14:03:57 +01:00
Alexander A. Klimov
88b041c7c9 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged less often
Call it only on state changes to reduce no-op Redis/IDO updates a lot.

refs #9143
2022-02-23 16:06:31 +01:00
Alexander A. Klimov
4ea65076b0 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged last
to ensure Checkable#IsReachable() returns correctly for dependency children inside OnReachabilityChanged().
That needs the dependency parent to be already in the correct state.

refs #9143
2022-02-23 16:06:31 +01:00
Alexander A. Klimov
3c8c248508 Icinga DB: ensure is_reachable and severity don't miss updates
refs #9143
2022-02-23 16:06:31 +01:00
Julian Brost
ca37dbb9c6
Merge pull request #9252 from Icinga/bugfix/icingadb-state-history
Icinga DB: don't write state history for ack/downtime/host problem changes
2022-02-23 15:43:55 +01:00
Julian Brost
175fd2df44 Icinga DB: remove obsolete StateChangeHandler overload
This version of StateChangeHandler is no longer called anywhere as it was the
wrong function for all previous callers anyways.
2022-02-23 11:08:56 +01:00
Julian Brost
b60d3e5665 Icinga DB: make host problem change events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
3466dacaac Icinga DB: make acknowledgement events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
5ea0208b83 Icinga DB: make downtime events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
f671374762 Icinga DB: don't reimplement volatile state update in SendConfigUpdate
Sending a volatile state update is already implemented in UpdateState, so just
use that function instead of generating the update queries.
2022-02-23 11:08:56 +01:00
Julian Brost
244575ecbd Icinga DB: Merge SendStatusUpdate into UpdateState
Previously, both funktions did related operations but had unclear and confusing
naming:
- UpdateState updated the icinga:{host,service}:state Redis keys.
- SendStatusUpdate sent a runtime update for the icinga:{host,service}:state.

This commit merges both functions into one with a new mode parameter. The
following modes are now supported:
- Volatile: Update the icinga:{host,service}:state Redis key.
- Full: Perform the volatile state update and in addition send a corresponding
  runtime update so that this state update gets written through to the
  persistent database by a running icingadb process.
- RuntimeOnly: Special mode for callers that can ensure that a volatile update
  for the current state was already performed but has to be upgraded to a full
  update.

refs #9063
2022-02-23 11:08:56 +01:00
Julian Brost
ed5ad46266 IDO: use per-instance notification_id in history
When there are multiple active IDO instances on the same node, before this
commit, all of them would share a single DbValue object for the notification_id
column of the icinga_contactnotifications table. This resulted in the issue
that one database references the notification_id in another database.

This commit fixes this by using a separate DbValue value for each IDO instance.
This needs a new signal as the existing OnQuery and OnMultipleQueries signals
perform the same queries on all IDO instances, but different queries are needed
here per instance (they only differ in the referenced DbValue). Therefore, a
new signal OnMakeQueries is added that takes a std::function which is called
once per IDO instance and can access callbacks to perform one or multiple
queries only on this specific IDO instance.
2022-02-21 15:51:04 +01:00
Alexander Aleksandrovič Klimov
d559b30f0a Doc: technical concepts: allow to change severity formula across Icinga 2 releases
so nobody is surprised in that case.
2022-02-21 15:35:51 +01:00
Alexander Aleksandrovič Klimov
501691cdde Service#GetSeverity(): behave as the respective IDO query of Icinga Web
which doesn't include host reachability.
2022-02-21 15:30:49 +01:00
Julian Brost
d4b9d7e51b Add tests for LegacyTimePeriod::ScriptFunc when used by TimePeriod::IsInside 2022-02-21 15:24:15 +01:00
Julian Brost
26246a4601 LegacyTimePeriod::ScriptFunc: fix DST edge-cases
This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
  full days, but ScriptFunc may have called them with some random timestamp
  during the day. This is fixed by always using midnight of the day as
  reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
  On days with DST changes, this could either still be on the same day (a day
  may have 25 hours) or skip an entire day (a day may have 23 hours). This is
  fixed by using a struct tm to advance the time to the next day.
2022-02-21 15:24:15 +01:00
Julian Brost
9b2f95d6d2 InfluxdbCommonWriter: use atomic_size_t to data buffer size from stats function
m_DataBuffer may be modified concurrently while StatsFunc() is called, thus
it's unsafe to call size() on it. As write access to m_DataBuffer is already
synchronized by only modifying it from the single work queue thread, instead of
adding a mutex, this commit adds a new std::atomic_size_t which is additionally
updated when modifying m_DataBuffer and can safely be accessed in StatsFunc().
2022-02-21 15:11:56 +01:00
Julian Brost
6b69235e15 InfluxdbCommonWriter: only flush from work queue
There is no explicit synchronization of access to m_DataBuffer which is fine if
it is only accessed from the single-threaded work queue. However, Stop() also
called Flush() in another thread, leading to concurrent write access to
m_DataBuffer which can result in a crash due to use after free/double free.

Changes in this commit:
* Flush() is renamed to FlushWQ() to show that it should only be called from
  the work queue. Additionally, it now asserts that it is running on the work
  queue.
* Visibility of some data members is changed from protected to private. No
  other classes have to access these at the moment. By this change, accidental
  concurrent access from derived classes in the future is prevented.
* Stop() now flushes by posting FlushWQ() to the work queue and joining it.
2022-02-21 15:11:56 +01:00
Alexander Aleksandrovič Klimov
40f647588e
Merge pull request #9233 from Icinga/bugfix/gha-pin-windows-version-2.13
GitHub Actions: pin Windows Server version to 2019
2022-02-21 15:08:18 +01:00
Alexander Aleksandrovič Klimov
d2c90ad66e
Merge pull request #9231 from Icinga/bugfix/gha-drop-centos8-2.13
GHA: drop CentOS 8
2022-02-21 14:04:42 +01:00