1812 Commits

Author SHA1 Message Date
Julian Brost
85f032dc7c Prevent worst-case exponential complexity in dependency evaluation
So far, calling Checkable::IsReachable() traversed all possible paths to it's
parents. In case a parent is reachable via multiple paths, all it's parents
were evaluated multiple times, result in a worst-case exponential complexity.

With this commit, the implementation keeps track of which checkables were
already visited and uses the already-computed reachability instead of repeating
the computation, ensuring a worst-case linear runtime within the graph size.
2025-07-30 17:13:38 +02:00
Julian Brost
073a35554a Move code involved in recursive dependency evaluation to helper class
Checkable::IsReachable() and DependencyGroup::GetState() call each other
recursively. Moving them to a common helper class allows adding caching to them
in a later commit without having to pass a cache between the functions (through
a public interface) or resorting to thread_local variables.
2025-07-30 17:13:38 +02:00
Chris Malton
ec48dae331 Correct a problem with expiry times not being passed through to AcknowledgeProblem 2025-06-23 11:57:29 +02:00
Julian Brost
dcd9a2dd41
Merge pull request #10457 from Icinga/remove-superfluous-dsl-functions
Drop superfluous (broken) DSL functions
2025-06-10 16:33:37 +02:00
Julian Brost
c41fe682c5
Merge pull request #10452 from Icinga/streamline-redis-and-db-values
IcingaDB: Make Redis & DB values consistent
2025-06-10 11:30:07 +02:00
Julian Brost
a15b706ee3
Merge pull request #10372 from WuerthPhoenix/10364-race-condition-while-calculating-object-state
Keep object locked until events are dispatched.
2025-06-06 17:19:21 +02:00
Julian Brost
4d9e9e7ed6
Merge pull request #9924 from ymartin-ovh/pr-9916
Fix `/v1/actions/` deadlock & nullptr dereference
2025-06-06 15:15:51 +02:00
Yonas Habteab
186571ec99 Refresh the states & types bitsets whenever states & types attrs change
Since the types and states attributes are user configurable and allowed to change at
runtime, we need to update the actual filter bitsets whenever these attributes change.
Otherwise, the filter bitsets would be stale and not reflect their current state.
2025-06-06 13:31:44 +02:00
Yonas Habteab
7037b18b34 IcingaDB: Send the int representation of states & types filter 2025-06-06 13:31:44 +02:00
Yonas Habteab
953a2e2e96 Merge {host,service}::StateTypeToString() & drop unused StateTypeFromString() 2025-06-06 13:31:44 +02:00
Yonas Habteab
9577af8e6d Drop Checkable#process_check_result() DSL function
Not sure why it's introduced in the first place, maybe for debugging
purposes at the early stage of Icinga 2 dev but I failed to see an
actual useful use case for it that's worth its maintenance burden. So,
this commit dropped it entirely from the DSL language.
2025-06-03 17:09:57 +02:00
Yannick Martin
e6fc1b91a7
AddComment: return Comment::Ptr instead of String containing the name
Mimic 88e5744d54f79ab6a84260bd4a8d2cc0ffd43465 and address nullptr on concurrent
  add-comment / remove-comment actions.
2025-06-03 17:00:07 +02:00
William Calliari
9d40de78eb
Address comments from review 2025-05-29 09:16:46 +02:00
William Calliari
abf2fb392b
Keep object locked until events are dispatched. 2025-05-29 09:16:44 +02:00
Julian Brost
c253e7eb6e
Merge pull request #10397 from Icinga/activation-priority-10179
Checkable#ProcessCheckResult(): discard🗑️ CR or delay its producers shutdown
2025-05-28 12:30:40 +02:00
Alexander A. Klimov
36743f3100 Checkable#ProcessCheckResult(): discard CR or delay its producers shutdown 2025-05-23 14:53:58 +02:00
Alexander A. Klimov
f4691dd054 Require to pass WaitGroup::Ptr to several methods
Namely:

Checkable#ProcessCheckResult()
ClusterCheckTask::ScriptFunc()
ClusterZoneCheckTask::ScriptFunc()
DummyCheckTask::ScriptFunc()
ExceptionCheckTask::ScriptFunc()
IcingaCheckTask::ScriptFunc()
IfwApiCheckTask::ScriptFunc()
NullCheckTask::ScriptFunc()
PluginCheckTask::ScriptFunc()
RandomCheckTask::ScriptFunc()
SleepCheckTask::ScriptFunc()
IdoCheckTask::ScriptFunc()
IcingadbCheck::ScriptFunc()
CheckCommand#Execute()
Checkable#ExecuteCheck()
ClusterEvents::ExecuteCheckFromQueue()
ExternalCommandProcessor::Process*CheckResult()
ExternalCommandCallback
ExternalCommandProcessor::Execute()
ExternalCommandProcessor::ExecuteFromFile()
ExternalCommandProcessor::ProcessFile()
LivestatusQuery#ExecuteCommandHelper()
LivestatusQuery#Execute()
2025-05-23 14:53:58 +02:00
Yonas Habteab
2b0f73987e
Merge pull request #10443 from Icinga/use-correct-timeout-for-command-endpoints
Checkable: Use correct timeout for rescheduling remote checks
2025-05-23 10:14:49 +02:00
Alexander Aleksandrovič Klimov
ec2080dcc1
Merge pull request #9731 from Icinga/fix-compiler-warnings-by-copy-constructing-loop-variables-explicitly
Fix compiler warnings by (copy-)constructing loop variables explicitly or not at all
2025-05-21 14:26:47 +02:00
Alexander A. Klimov
22e75f08fa Fix compiler warnings by not unnecessarily (copy-)constructing loop variables 2025-05-21 11:36:32 +02:00
Alexander Aleksandrovič Klimov
5c20b1ae12
Merge pull request #10444 from Icinga/ProcessingResult-NoCheckResult
Remove unused ProcessingResult::NoCheckResult
2025-05-20 12:54:41 +02:00
Alexander A. Klimov
69d2a0442a Remove unused ProcessingResult::NoCheckResult
No one passes a NULL CR to Checkable#ProcessCheckResult() anymore.
2025-05-20 10:41:26 +02:00
Alexander A. Klimov
55fc0e51ff Checkable#process_check_result(): don't pass NULL CR to Checkable#ProcessCheckResult()
It's ignored anyway.
2025-05-20 10:33:20 +02:00
Yonas Habteab
0c2215a9f8 Checkable: Use correct timeout for rescheduling remote checks
Previously, the `command#timeout` which by default is `1m`, was used to reschedule
the just sent remote check. However, this results into a bunch of extra checks being
sent to the remote host, even though the first one is still running. That's because
if one want to override the default timeout of the command for a specific host/service,
one has to set the `checkable#check_timeout` attribute to the desired value. So, this
commit makes sure that the `checkable#check_timeout` attribute (if set) is used to
reschedule the remote check.
2025-05-19 14:07:33 +02:00
Johannes Schmidt
f8d3bacc29 Fix warnings related to enum integer conversion 2025-05-19 12:31:22 +02:00
Johannes Schmidt
6a6c494279 Mark MakeName and ParseName virtual methods as override 2025-05-19 11:33:22 +02:00
Yonas Habteab
83a0f9d217
Merge pull request #10361 from Icinga/reset-no-more-notifications-only-on-recovery
Notification: Reset internal states on (missed)recovery
2025-05-16 09:53:10 +02:00
Yonas Habteab
d750bff193 Notification: Fix incorrectly dropped recovery & ACK notifications
Previously, recovery and ACK notifications were not delivered to users
who weren't notified about the problem state while having a configured
`Problem` type filter. However, since the type filter can also be
configured on the `Notification` object level, this resulted to an
incorrect behaviour. This PR changes the existing logic so that the
recovery and ACK notifications gets dropped only if the `Problem` filter
is configured on both the `User` and `Notification` object levels.
2025-05-13 09:46:35 +02:00
Yonas Habteab
4596b44171 Reset no_more_notifications on filter mismatch correctly
Previously, if you enable flapping for a Checkable but the corresponding
`Notification` object does not have `FlappingStart` or `FlappingEnd`
types set, the `no_more_notifications` flag wasn't reset to false again.
This commit ensures that this flag is always reset on `Recovery` even
the type filter does not match including when we miss the `Recovery` due
to Flapping state.
2025-05-12 12:03:13 +02:00
Yonas Habteab
9166326876 Notification: Reset notified problem users on flapping end as well 2025-05-12 12:03:13 +02:00
Yonas Habteab
86365a4e2b Notification: Clear last notified state per user on flapping end as well 2025-05-12 12:03:13 +02:00
Yonas Habteab
89f12c2323 Notification: Reset no_more_notifications only on recovery 2025-05-12 12:03:13 +02:00
Julian Brost
b2b47981a5
Merge pull request #10422 from Icinga/mktime-dst-consistency
Ensure consistent mktime() DST behavior across different implementations
2025-04-30 16:51:05 +02:00
Julian Brost
cc48c924ae Load Notification objects after User and UserGroup
Notification objects can refer User and Group objects similar to how they can
refer Host and Service objects, so that dependency feels quite natural. Note
that for evaluating most configuration, this order doesn't really matter, the
configuration will successfully evaluate in either case, the difference can be
noticed mainly in more advanced configurations, for example when dynamically
assigning user based on their groups. When accessing user objects from the
Notification object definition (like in the following example), without this
change, only groups configured directly in groups attribute of User objects are
visible and those added via assign clauses in UserGroup objects are missing.
With this commit, these are also visible.

    apply Notification "n" to Host {
        for (var u in get_objects(User)) {
            log(u.name + " -> " + Json.encode(u.groups))
        }

        # [...]
    }
2025-04-29 12:14:46 +02:00
Julian Brost
5404143dee Ensure consistent mktime() DST behavior across different implementations
There are inputs to mktime() where the behavior is not specified and there's
also no single obviously correct behavior. In particular, this affects how
auto-detection of whether DST is in effect is done when tm_isdst = -1 is set
and the time specified does not exist at all or exists twice on that day.

If different implementations are used within an Icinga 2 cluster, that can lead
to inconsistent behavior because different nodes may interpret the same
TimePeriod differently.

This commit introduces a wrapper to mktime(), namely Utility::NormalizeTm()
that implements the behavior provided by glibc. The choice for glibc's behavior
is pretty arbitrary, it was simply picked because most systems that are
officially/fully supported use it (with the only exception being Windows), so
this should give the least possible amount of user-visible changes.

As part of this commit, the closely related helper function mktime_const() is
also moved to Utility::TmToTimestamp() and made a wrapper around the newly
introduced NormalizeTm().
2025-04-28 13:38:55 +02:00
Alexander A. Klimov
c2ddd20ef3 Fix compiler warnings by (copy-)constructing loop variables explicitly
for (const T& needle : haystack) creates the illusion that haystack is a
container of T and we're just borrowing needle. In these cases that's not true.
2025-04-22 13:55:49 +02:00
Yonas Habteab
9cc3971288
Merge pull request #10352 from Icinga/checkable-checkercomponent-fixed-timestamp-debug-logs
Fixed double output for timestamps in debug log
2025-04-14 12:16:51 +02:00
Alvar Penning
2ce34e8134
Fixed double output for timestamps in debug log
The timestamps used both in the CheckerComponent and Checkable debug
logs were printed in the scientific notation, making them effectively
useless.

> debug/CheckerComponent: Scheduling info for checkable 'host!service' (2025-02-26 14:53:16 +0100): Object 'host!service', Next Check: 2025-02-26 14:53:16 +0100(1.74058e+09).
> debug/Checkable: Update checkable 'host!service' with check interval '300' from last check time at 2025-02-26 14:48:47 +0100 (1.74058e+09) to next check time at 2025-02-26 14:58:12 +0100 (1.74058e+09).

Switching to std::fixed actually shows the complete Unix timestamp.

> debug/CheckerComponent: Scheduling info for checkable 'host!service' (2025-02-26 15:36:44 +0000): Object 'host!service', Next Check: 2025-02-26 15:36:44 +0000 (1740584204).
> debug/Checkable: Update checkable 'host!service' with check interval '60' from last check time at 2025-02-26 15:37:11 +0000 (1740584232) to next check time at 2025-02-26 15:38:09 +0000 (1740584290).
2025-04-14 10:09:42 +02:00
Julian Brost
5a6b2044b1
Merge pull request #10290 from Icinga/icingadb-dependencies-sync
Sync dependencies to Redis
2025-04-04 15:13:05 +02:00
Julian Brost
31a224c509 Checkable::GetSeverity(): always take reachability into account
So far, Service::GetSeverity() only considered the state of its own host, i.e.
the implicit service to its own host dependency, and treated it similar to
acknowledgements and downtimes. In contrast, Host::GetSeverity() considered
reachability and treated it like a state, i.e. for the severity calculation,
the host was either up, down, or unreachable.

This commit changes the following things:
1. Make the service severity also consider explicitly configured dependencies
   by using IsReachable().
2. Prefer acknowledgements and downtimes over unreachability in the severity
   calculation so that if an already acknowledged or in-downtime services (i.e.
   already handled service) becomes unreachable, it shouln't become more
   severe.
3. To unify host and service severities a bit, hosts now use the same logic
   that treats reachability more like acknowledgements/downtimes instead of
   like a state (changing the other way around would the state from the check
   plugin would not affect the severity for unrachable services anymore).
2025-03-31 15:23:51 +02:00
Julian Brost
1e05a166f1 Host::GetSeverity(): remove empty line at end of method 2025-03-31 15:23:51 +02:00
Julian Brost
d8271c6568 Host::GetSeverity(): remove explicit unlocking
No change in functionality. The ObjectLock destructor will implicitly release
the locks when returning from the function.
2025-03-31 15:23:51 +02:00
Julian Brost
2ebee010f0 Host::GetHost(): return early to remove a nesting level
No change in functionality. The first two branches actually set the final
return value for the method, so they can just return directly, removing the
need to have the rest of the function inside an else block.
2025-03-31 15:23:51 +02:00
Julian Brost
6443f8997f Host::GetSeverity(): add braces to if statements
No change in functionality, just makes the code a bit nicer.
2025-03-31 15:23:51 +02:00
Julian Brost
c899d52e2f Service::GetSeverity(): remove explicit unlocking
No change in functionality. The ObjectLock destructor will implicitly release
the locks when returning from the function.
2025-03-31 15:23:50 +02:00
Julian Brost
01acfb47a9 Service::GetHost(): return early to remove a nesting level
No change in functionality. The first two branches actually set the final
return value for the method, so they can just return directly, removing the
need to have the rest of the function inside an else block.
2025-03-31 15:23:50 +02:00
Julian Brost
5ca6047b35 Service::GetSeverity(): replace switch with if
No change in functionality, just making the code a bit more compact.
2025-03-31 15:23:50 +02:00
Julian Brost
a1865e1b43 Service::GetSeverity(): simplify nested if, add braces
No change in functionality, just making the code a bit nicer and more compact.
2025-03-31 15:23:50 +02:00
Julian Brost
061338156c
Merge pull request #10345 from Icinga/remove-child-downtimes
ApiActions: Remove child downtimes recursively
2025-03-21 16:37:43 +01:00
Julian Brost
065118bc22 Make DependencyGroup::State an enum
The previous struct used two bools to represent three useful states. Make this
more explicit by having these three states as an enum.
2025-03-19 16:28:00 +01:00