6672 Commits

Author SHA1 Message Date
William Calliari
9d40de78eb
Address comments from review 2025-05-29 09:16:46 +02:00
William Calliari
abf2fb392b
Keep object locked until events are dispatched. 2025-05-29 09:16:44 +02:00
Julian Brost
c253e7eb6e
Merge pull request #10397 from Icinga/activation-priority-10179
Checkable#ProcessCheckResult(): discard🗑️ CR or delay its producers shutdown
2025-05-28 12:30:40 +02:00
Yonas Habteab
7d2f1c2030 Drop Windows VISTA from the supported platform
Boost `1.88.0` introduced a feature [^1] that makes use of the Windows API, but it
uses API functions that are only available with `PSAPI_VERSION >= 2` and
Windows VISTA only supports `PSAPI_VERSION == 1`. Actually, that new feature
can also be disabled by setting the `BOOST_STACKTRACE_DISABLE_OFFSET_ADDR_BASE`
macro, but since it seems to be a useful feature and isn't even disabled by default,
we can just drop it that ancient Windows version instead of disabling it.

[^1]: https://github.com/boostorg/stacktrace/pull/200
2025-05-28 09:39:03 +02:00
Yonas Habteab
d265329a17
Merge commit from fork
Fix for master
2025-05-27 13:50:26 +02:00
Yannick Martin
ab58ff2928
fix api deadlock that can appears on two simultaneous actions
With this mutex, we can have deadlock in the following case:
    1/ Thread A processes a /v1/actions/acknowledge-problem request and locks the checkable
    2/ Thread B processes a /v1/actions/add-comment and enters first the ConfigItem::ActivateItems() method and locks the static mutex there and starts the just created comment object, which triggers the OnCommentAdded() event.
    3/ Thread A wants to activate the just created ack comment as well but since the mutex is already locked by TB, it blocks.
    4/ Thread B's OnCommentAdded() even dispatch causes the IcingaDB::CommentAddedHandler() to be called and implicitly triggers full state update for the checkable. Now, the state serialization of that checkable (remember that's the same checkable currently locked by TA) also includes computing its severity, thus it calls either service->GetSeverity() or host->GetSeverity(). However, since computing the checkable severity (as of now) requires acquiring the object lock, and boom - they deadlock each other.
2025-05-26 15:05:42 +02:00
Alexander Aleksandrovič Klimov
56d9f38b35
Merge pull request #10456 from Icinga/SharedObject-delete
SharedObject: delete unused methods
2025-05-26 10:00:52 +02:00
Alexander A. Klimov
4f351f625f SharedObject: delete unused methods
None of the derived classes use them, none shall have to explicitly delete them.
2025-05-23 15:47:02 +02:00
Alexander A. Klimov
36743f3100 Checkable#ProcessCheckResult(): discard CR or delay its producers shutdown 2025-05-23 14:53:58 +02:00
Alexander A. Klimov
f4691dd054 Require to pass WaitGroup::Ptr to several methods
Namely:

Checkable#ProcessCheckResult()
ClusterCheckTask::ScriptFunc()
ClusterZoneCheckTask::ScriptFunc()
DummyCheckTask::ScriptFunc()
ExceptionCheckTask::ScriptFunc()
IcingaCheckTask::ScriptFunc()
IfwApiCheckTask::ScriptFunc()
NullCheckTask::ScriptFunc()
PluginCheckTask::ScriptFunc()
RandomCheckTask::ScriptFunc()
SleepCheckTask::ScriptFunc()
IdoCheckTask::ScriptFunc()
IcingadbCheck::ScriptFunc()
CheckCommand#Execute()
Checkable#ExecuteCheck()
ClusterEvents::ExecuteCheckFromQueue()
ExternalCommandProcessor::Process*CheckResult()
ExternalCommandCallback
ExternalCommandProcessor::Execute()
ExternalCommandProcessor::ExecuteFromFile()
ExternalCommandProcessor::ProcessFile()
LivestatusQuery#ExecuteCommandHelper()
LivestatusQuery#Execute()
2025-05-23 14:53:58 +02:00
Alexander A. Klimov
c7cca7b460 Add a StoppableWaitGroup, and join it on #Stop(), to:
ApiListener
CheckerComponent
ExternalCommandListener
LivestatusListener
2025-05-23 14:53:58 +02:00
Alexander A. Klimov
18fb93fc11 Introduce WaitGroup and StoppableWaitGroup 2025-05-23 14:53:58 +02:00
Alexander A. Klimov
77b86bba52 Move l_MyCapabilities -> ApiCapabilities::MyCapabilities 2025-05-23 10:44:16 +02:00
Alexander A. Klimov
6cd83ba2b8 ApiListener::UpdateObjectAuthority(): distribute auth. by object's host
Pin child objects of hosts (HOST!...) to the same endpoint as the host.
This reduces cross-object action latency withing the same host.
If all endpoints know this algorithm, we can use it.
2025-05-23 10:44:16 +02:00
Yonas Habteab
2b0f73987e
Merge pull request #10443 from Icinga/use-correct-timeout-for-command-endpoints
Checkable: Use correct timeout for rescheduling remote checks
2025-05-23 10:14:49 +02:00
Alexander A. Klimov
18f810a1ea For the same zone, call ApiListener::UpdateObjectAuthority() in icinga::Hello
after setting remote capabilities. They'll become important for teamwork in a zone.
2025-05-22 15:01:06 +02:00
Alexander Aleksandrovič Klimov
ec2080dcc1
Merge pull request #9731 from Icinga/fix-compiler-warnings-by-copy-constructing-loop-variables-explicitly
Fix compiler warnings by (copy-)constructing loop variables explicitly or not at all
2025-05-21 14:26:47 +02:00
Alexander A. Klimov
22e75f08fa Fix compiler warnings by not unnecessarily (copy-)constructing loop variables 2025-05-21 11:36:32 +02:00
Julian Brost
4023128be4 VerifyCertificate: Work around issue in OpenSSL < 1.1.0 causing invalid certifcates being treated as valid
Old versions of OpenSSL stored a valid flag in the certificate (see inline code
comment for details) that if already set, causes parts of the verification to
be skipped and return that the certificate is valid, even if it's not actually
signed by the CA in the trust store.

This issue was assigned CVE-2025-48057.
2025-05-21 10:50:12 +02:00
Julian Brost
00864d1096 VerifyCertificate: fix use after free
`X509_STORE_CTX_get_error(csc)` was called after `X509_STORE_CTX_free(csc)`.
This is fixed by automatically freeing variables at the end of the function
using `std::unique_ptr`.
2025-05-21 10:46:25 +02:00
Alexander Aleksandrovič Klimov
5c20b1ae12
Merge pull request #10444 from Icinga/ProcessingResult-NoCheckResult
Remove unused ProcessingResult::NoCheckResult
2025-05-20 12:54:41 +02:00
Alexander A. Klimov
69d2a0442a Remove unused ProcessingResult::NoCheckResult
No one passes a NULL CR to Checkable#ProcessCheckResult() anymore.
2025-05-20 10:41:26 +02:00
Alexander A. Klimov
55fc0e51ff Checkable#process_check_result(): don't pass NULL CR to Checkable#ProcessCheckResult()
It's ignored anyway.
2025-05-20 10:33:20 +02:00
Yonas Habteab
44b3382f5f
Merge pull request #10442 from Icinga/jschmidt/fix-compiler-warnings
Fix two low-hanging fruit compiler warnings
2025-05-19 14:40:55 +02:00
Yonas Habteab
0c2215a9f8 Checkable: Use correct timeout for rescheduling remote checks
Previously, the `command#timeout` which by default is `1m`, was used to reschedule
the just sent remote check. However, this results into a bunch of extra checks being
sent to the remote host, even though the first one is still running. That's because
if one want to override the default timeout of the command for a specific host/service,
one has to set the `checkable#check_timeout` attribute to the desired value. So, this
commit makes sure that the `checkable#check_timeout` attribute (if set) is used to
reschedule the remote check.
2025-05-19 14:07:33 +02:00
Johannes Schmidt
f8d3bacc29 Fix warnings related to enum integer conversion 2025-05-19 12:31:22 +02:00
Yonas Habteab
8a1d9df767
Merge pull request #10070 from Icinga/time-period-schedule-next-check-on-next-transition-9984
If skipped due to time period, schedule next check on next transition
2025-05-19 12:29:09 +02:00
Johannes Schmidt
6a6c494279 Mark MakeName and ParseName virtual methods as override 2025-05-19 11:33:22 +02:00
Yonas Habteab
45c651499b
Merge pull request #10379 from Icinga/set-cancel-time-conditionally
IcingaDB: Sync downtime `cancel_time` conditionally
2025-05-16 12:18:20 +02:00
Yonas Habteab
83a0f9d217
Merge pull request #10361 from Icinga/reset-no-more-notifications-only-on-recovery
Notification: Reset internal states on (missed)recovery
2025-05-16 09:53:10 +02:00
Yonas Habteab
7acec6fc36 IcingaDB: Set downtime cancel_time conditionally
If the downtime ended automatically `cancel_time` should just be `NULL`
instead of a `0` timestamp.
2025-05-16 09:49:58 +02:00
Yonas Habteab
5ea666a7ad IcingaDB: Don't set cancel_time for downtime start event
It's a downtime start event there's now way the downtime could be
cancelled before it even started.
2025-05-16 09:49:16 +02:00
Julian Brost
1a386ad55d
Merge pull request #10265 from Icinga/RedisConnection-spinlock
RedisConnection#Connect(): get rid of spin lock
2025-05-14 15:06:58 +02:00
Yonas Habteab
cef6fb77e5 Serialize fields before queueing the event to the workqueue 2025-05-14 14:42:04 +02:00
Alexander A. Klimov
daeab09334 If skipped due to time period, schedule next check on next transition
and not after yet another check interval. Otherwise checks done every 24h may get suppressed due to being re-scheduled outside time period every 24h.
2025-05-14 12:47:34 +02:00
Alexander A. Klimov
2739f7f189 RedisConnection#Connect(): get rid of spin lock
Instead of IoEngine::YieldCurrentCoroutine(yc) until m_Queues.FutureResponseActions.empty(), async-wait a CV which is updated along with m_Queues.FutureResponseActions.
2025-05-14 12:28:11 +02:00
Alexander A. Klimov
060d8b185e Introduce AsioDualEvent 2025-05-14 12:24:28 +02:00
Yonas Habteab
a589b87d6c Remove unused parameters 2025-05-13 15:31:29 +02:00
Yonas Habteab
2e19fce31d Remove some superfluous if statements
They're just useless, since a `CheckResult` handler is never going to be
called without a check result and a checkable can't exist without a
checkcommand.
2025-05-13 15:31:29 +02:00
Yonas Habteab
d750bff193 Notification: Fix incorrectly dropped recovery & ACK notifications
Previously, recovery and ACK notifications were not delivered to users
who weren't notified about the problem state while having a configured
`Problem` type filter. However, since the type filter can also be
configured on the `Notification` object level, this resulted to an
incorrect behaviour. This PR changes the existing logic so that the
recovery and ACK notifications gets dropped only if the `Problem` filter
is configured on both the `User` and `Notification` object levels.
2025-05-13 09:46:35 +02:00
Alvar Penning
7e65a60a5d
Fix PerfdataValue Counter Parsing
Ensure that the counter unit of measurement, "c", is parsed correctly
for performance data values again.

A prior refactoring in 720a88c29a489cec91815af49755413202802d7a changed
the parsing logic, resulting in an incorrect behavior for counter units.
By passing the raw input into the l_CsUoMs map first, the "c" UoM is
removed. Moving the explicit counter check before passing the raw unit
into the map resolves this issue.

Fixes #9540.
2025-05-12 16:34:05 +02:00
Yonas Habteab
4596b44171 Reset no_more_notifications on filter mismatch correctly
Previously, if you enable flapping for a Checkable but the corresponding
`Notification` object does not have `FlappingStart` or `FlappingEnd`
types set, the `no_more_notifications` flag wasn't reset to false again.
This commit ensures that this flag is always reset on `Recovery` even
the type filter does not match including when we miss the `Recovery` due
to Flapping state.
2025-05-12 12:03:13 +02:00
Yonas Habteab
9166326876 Notification: Reset notified problem users on flapping end as well 2025-05-12 12:03:13 +02:00
Yonas Habteab
86365a4e2b Notification: Clear last notified state per user on flapping end as well 2025-05-12 12:03:13 +02:00
Yonas Habteab
89f12c2323 Notification: Reset no_more_notifications only on recovery 2025-05-12 12:03:13 +02:00
Julian Brost
b2b47981a5
Merge pull request #10422 from Icinga/mktime-dst-consistency
Ensure consistent mktime() DST behavior across different implementations
2025-04-30 16:51:05 +02:00
Julian Brost
cc48c924ae Load Notification objects after User and UserGroup
Notification objects can refer User and Group objects similar to how they can
refer Host and Service objects, so that dependency feels quite natural. Note
that for evaluating most configuration, this order doesn't really matter, the
configuration will successfully evaluate in either case, the difference can be
noticed mainly in more advanced configurations, for example when dynamically
assigning user based on their groups. When accessing user objects from the
Notification object definition (like in the following example), without this
change, only groups configured directly in groups attribute of User objects are
visible and those added via assign clauses in UserGroup objects are missing.
With this commit, these are also visible.

    apply Notification "n" to Host {
        for (var u in get_objects(User)) {
            log(u.name + " -> " + Json.encode(u.groups))
        }

        # [...]
    }
2025-04-29 12:14:46 +02:00
Alexander A. Klimov
98d097517b Introduce Endpoint#messages_received_per_type 2025-04-29 11:42:14 +02:00
Alexander A. Klimov
3dd7b15808 Count incoming messages per type and endpoint 2025-04-29 11:42:14 +02:00
Alexander A. Klimov
c566f6dc31 ApiFunction: store own name 2025-04-29 11:42:14 +02:00