Commit Graph

1678 Commits

Author SHA1 Message Date
Julian Brost 6390911262
Merge pull request #9123 from Icinga/bugfix/icinga2-crashes-when-sending-notifications-8186
Avoid "type" key in dicts being part of object state attrs
2022-01-19 11:48:40 +01:00
Julian Brost 463b159414
Merge pull request #9171 from Icinga/bugfix/icinga-db-notification-history-might-use-incorrect-previous_hard_state-9132
IcingaDB#SendSentNotification(): make stream deterministic via CheckResult#previous_hard_state
2022-01-18 16:54:16 +01:00
Alexander A. Klimov 1fee3f1b12 IcingaDB#SendSentNotification(): make stream deterministic via CheckResult#previous_hard_state
Now it gets everything from one source, the CheckResult.

refs #9132
2022-01-10 19:18:11 +01:00
Julian Brost e518dc2436
Merge pull request #9112 from Icinga/bugfix/sync-missing-history-information
Icinga DB: ensure consistent history streams in HA setup
2022-01-07 15:14:06 +01:00
Julian Brost 3e73a262cc Sync comment and downtime removal info for Icinga DB history
When a comment or downtime is removed manually, the name of the requestor and
timestamp have to be synced to other nodes in the cluster to allow all of them
to generate a consistent Icinga DB history stream.

refs #9101
2022-01-05 10:27:13 +01:00
Alexander Aleksandrovič Klimov 80663cf5e6
Merge pull request #9048 from Icinga/bugfix/timeperiod-dst-2.0
LegacyTimePeriod::ScriptFunc: fix DST edge-cases
2022-01-03 18:11:32 +01:00
Julian Brost 13ea635188 Don't trigger a fixed downtime like a flexible one
When creating a fixed downtime that starts immediately while the checkable is
in a non-OK state, previously the code path for flexible downtimes was used to
trigger this downtime. This is fixed by this commit which resolves two issued:

1. Missing downtime start notification: notifications work differently for
   fixed and flexible downtimes. This resulted in missing downtime start
   notifications under the conditions described above.
2. Incorrect downtime trigger time: this code path would incorrectly assume the
   timestamp of the last checkable as the trigger time which is incorrect for
   fixed downtimes.
2021-12-14 11:02:40 +01:00
Alexander A. Klimov eb71fb7529 Avoid "type" key in dicts being part of object state attrs
not to confuse the state file deserializator with e.g. `"type":32` on startup.
That would unexpectedly restore null (not `{"type":32}`) as there's no type "32".

refs #8186
2021-12-13 17:56:12 +01:00
Julian Brost c71029f2e8 Set downtime trigger time deterministically
When triggering a downtime, the time of the causing event is now passed on as
the trigger time. That time is:

* For fixed downtimes: the later one of start and entry time.
* If a check result triggers the downtime: The execution end of the check
  result.
* If another downtime triggers the downtime: The trigger time of the first
  downtime.

This is done so two nodes in a HA setup can write consistent Icinga DB downtime
history streams.

refs #9101
2021-12-08 14:15:50 +01:00
Alexander Aleksandrovič Klimov 31c564182a
Merge pull request #8990 from Icinga/bugfix/downtime-all-services-on-child-hosts
Fix scheduling of downtimes for all services on child hosts
2021-12-07 12:48:01 +01:00
Julian Brost 596fcdc123 Downtime::DowntimesExpireTimerHandler: don't copy vector
`ConfigType::GetObjectsByType<Downtime>()` already returns a
`std::vector<Downtime::Ptr>` so there is no point in copying it into another
vector of the same type just to then iterate the copied vector instead of the
original one.
2021-12-01 13:05:23 +01:00
Julian Brost 2ad0a4b8c3 Add missing include to fix non-unity builds
This commit fixes the following build error:

    [ 55%] Building CXX object lib/icinga/CMakeFiles/icinga.dir/usergroup.cpp.o
    lib/icinga/usergroup.cpp:79:24: error: incomplete type ‘icinga::Notification’ used in nested name specifier
       79 | std::set<Notification::Ptr> UserGroup::GetNotifications() const
          |                        ^~~
2021-11-17 16:11:15 +01:00
Julian Brost a740b1d66c LegacyTimePeriod::ScriptFunc: fix DST edge-cases
This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
  full days, but ScriptFunc may have called them with some random timestamp
  during the day. This is fixed by always using midnight of the day as
  reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
  On days with DST changes, this could either still be on the same day (a day
  may have 25 hours) or skip an entire day (a day may have 23 hours). This is
  fixed by using a struct tm to advance the time to the next day.
2021-11-17 13:09:10 +01:00
Noah Hilverling 73e0d6e61b Icinga DB: Make sure object relationships are handled correctly 2021-11-12 13:34:57 +01:00
Julian Brost bb0dcdf0b4 Prevent duplicate donwtimes when combining child_options and all_services 2021-09-03 15:44:01 +02:00
Julian Brost e556d3c489 Fix scheduling of downtimes for all services on child hosts
The loop iterated over the services of the wrong host resulting in duplicate
downtimes scheduled for services of the parent host instead of downtimes for
services of the child host.
2021-09-03 15:19:27 +02:00
Alexander A. Klimov 2818245e01 Introduce Checkable#GetLastComment() 2021-07-29 12:10:42 +02:00
Julian Brost 42eb055c5f
Merge pull request #8921 from Icinga/bugfix/timeperiod-dst
TimePeriod/ScheduledDowntime: improve DST handling
2021-07-27 18:11:34 +02:00
Noah Hilverling 07145d2e61
Merge pull request #8913 from Icinga/feature/remove-child-downtimes
API Action "remove-downtime": Also remove child downtimes
2021-07-27 18:02:15 +02:00
Noah Hilverling 7217959206 API Action 'remove-downtime': Also remove child downtimes 2021-07-23 13:53:44 +02:00
Julian Brost 4273f30157 LegacyTimePeriod: Prevent modification of input parameters
Many functions of LegacyTimePeriod take a tm pointer as an input parameter and
then pass it to mktime() which actually modifies it. This causes problems if
tm_isdst was intentionally set to -1 (to automatically detect whether DST is
active at some time) and then a function is called that implicitly sets
tm_isdst and then the values of tm are modified in a way that crosses a DST
change. This resulted in 1 hour offsets with ScheduledDowntimes on days with
DST changes.
2021-07-22 15:17:06 +02:00
Michael Insel da394b2ab0
Implement scheduling_source attribute (#6326)
* Implement scheduling_source attribute

This implements the attribute `scheduling_source` for hosts and services to show which endpoint is running the scheduler for the check.

refs #4814
2021-07-20 11:10:26 +02:00
Alexander Aleksandrovič Klimov bad8059969
Merge pull request #8761 from Icinga/feature/icingadb-perfdata
Icinga DB: introduce icinga:*:state#normalized_performance_data
2021-07-07 12:29:21 +02:00
Julian Brost 7d2a1bbffe
Merge pull request #8310 from Icinga/feature/scheduleddowntime-change-remove-downtimes-8309
On ScheduledDowntime change: remove downtimes created before change
2021-07-07 10:44:08 +02:00
Alexander A. Klimov 43e4ab4760 Checkable::NotifyDowntimeEnd(): don't send Downtime end notification unless triggered
... for fixed Downtimes as well.
2021-07-06 12:50:44 +02:00
Alexander A. Klimov ea5411a6e0 PluginUtility::FormatPerfdata(): normalize UoMs if desired 2021-07-05 19:05:32 +02:00
Alexander A. Klimov 666c5818bb On ScheduledDowntime change: remove future downtimes created before change
refs #8309
2021-07-02 10:37:29 +02:00
Alexander Aleksandrovič Klimov 31f97d3e6a
Merge pull request #8828 from Icinga/bugfix/execute-command-origin-check
event::ExecuteCommand: add missing origin check
2021-06-29 18:08:07 +02:00
Alexander A. Klimov bcc3870f3a On ScheduledDowntime change: ignore downtimes created before change
... while creating new downtimes.

refs #8309
2021-06-29 17:08:41 +02:00
Alexander A. Klimov 1ee26ac89e Introduce Downtime#config_owner_hash
refs #8309
2021-06-29 16:38:33 +02:00
Julian Brost 8f585bd2ee event::ExecuteCommand: add missing origin check
Only handle messages with a trusted origin in
ClusterEvents::ExecuteCommandAPIHandler. Previously, it would not locally
execute any command but forward them to other nodes where they would then have
a trusted origin and be executed.
2021-06-29 11:15:22 +02:00
Julian Brost 5fdfd47176
Merge pull request #8848 from Icinga/bugfix/harden-scheduled-downtimes
ScheduledDowntime::TimerProc(): Catch exceptions to make sure other downtimes are still created
2021-06-28 17:16:57 +02:00
Noah Hilverling f48ad574d7 ScheduledDowntime::TimerProc(): Catch exceptions to make sure other downtimes are still created 2021-06-24 14:05:08 +02:00
Alexander A. Klimov d8e5e07c4f Downtime#Start(): trigger fixed downtimes immediately instead of waiting for the timer
... not to cause e.g. notifications if a problem occurs
between the downtime start time and the timer routine.
2021-06-23 19:16:15 +02:00
Alexander A. Klimov f28b9fb7f3 ScheduledDowntime: ignore not related Downtimes while creating Downtimes 2021-05-19 16:10:57 +02:00
Alexander Aleksandrovič Klimov 1c0ce89cb3
Merge pull request #8681 from Icinga/bugfix/problem-notification-at-downtime-end
Send problem notifications after downtime end for checkables in child zones
2021-03-22 17:56:25 +01:00
Alexander Aleksandrovič Klimov ef8619f76b
Merge pull request #8601 from Icinga/feature/replace-std-boost-bind-with-lambdas-7006
Feature: Replace std/boost::bind() with lambdas
2021-03-18 17:56:13 +01:00
Julian Brost 29727e06c0 Only handle event::SetSuppressed{Notifications,NotificationTypes} within the local zone
Note that even when passing `nullptr` as target zone to `RelayMessage()`, the
cluster message will still be sent to the parent zone. These incoming messages
will now be rejected by the parent nodes. At the moment, there's no way to only
send within the local zone.
2021-03-17 15:05:12 +01:00
Yonas Habteab 43ba2da39c Replace std/boost::bind() function with lambda expression 2021-03-10 16:29:40 +01:00
Julian Brost 02fd60934f
Merge pull request #8008 from Icinga/bugfix/ascii-tables-in-plugin-output-8006
PluginUtility::ParseCheckOutput(): if it doesn't look like perfdata, it's not perfdata
2021-03-05 17:19:38 +01:00
Julian Brost ddbad7937d
Merge pull request #8622 from Icinga/bugfix/dependency-ti-typo-8180
dependency.ti: fix typo
2021-02-05 11:49:03 +01:00
Alexander A. Klimov ebfa73388f dependency.ti: fix typo
refs #8180
2021-02-04 18:29:54 +01:00
Alexander Aleksandrovič Klimov aa0baf6f69
Merge pull request #8099 from Icinga/feature/std-mutex
Use std::mutex, not boost::mutex
2021-02-04 10:19:04 +01:00
Alexander A. Klimov c3388e9af6 Use std::mutex, not boost::mutex 2021-02-03 09:54:57 +01:00
Alexander Aleksandrovič Klimov 9a867c2c25
Merge pull request #8513 from Icinga/bugfix/notifications-downtime-change-in-timeperiod-8509
FireSuppressedNotifications(const Notification::Ptr&): don't send notifications while suppressed by checkable
2021-01-28 10:01:23 +01:00
Alexander Aleksandrovič Klimov 8d1e958275
Make code doc more readable
Co-authored-by: Julian Brost <julian.brost@icinga.com>
2021-01-27 15:43:37 +01:00
Julian Brost 9219f68c83
Merge pull request #8158 from Icinga/bugfix/check-source-passive-7948
Checkable#ProcessCheckResult(): don't overwrite check source
2021-01-26 10:49:55 +01:00
Alexander A. Klimov c3eba7e88d Checkable#ProcessCheckResult(): don't overwrite check source
... set by passive check results.

refs #7948
2021-01-25 16:05:03 +01:00
Alexander Aleksandrovič Klimov 124f98eed4
Merge pull request #8600 from Icinga/feature/flapping-ignore-unknown
Flapping: Allow to ignore states in flapping detection
2021-01-21 13:47:44 +01:00
Alexander Aleksandrovič Klimov ef23ae5f3c
Merge pull request #8267 from efuss/passive_reach
Drop passive check results for unreachable hosts/services
2021-01-20 17:07:52 +01:00
Noah Hilverling e060995fd8 Flapping: Allow to ignore states in flapping calculation 2021-01-20 11:09:03 +01:00
Edgar Fuß 3c050fcc46 Drop passive check results for unreachable hosts/services
Disregard passive check results while no active checks are being scheduled due to violated dependencies.

This copes with the fact that programs feeding passive check results into Icinga may have no notion of reachability and so drive a checkable into HARD state although dependencies have caused active check scheduling being suspended. This may prevent superflous problem notifications being emitted during recovery.

As disable_checks defaults to false, it was regarded OK (by @Al2Klimov) to make this behaviour (which resembles the active check case) unconditional and not conditionalize it on an additional attribute.

In the description of disable_checks, note that a value of true both disables scheduling of active checks and drops passive check results.
2021-01-19 20:08:38 +01:00
Alexander Aleksandrovič Klimov cbd0d6ea6e
Merge pull request #8588 from Icinga/bugfix/concurrent-schedule-downtime-delete-host
Fix null pointer dereferences when deleting objects while scheduling downtimes
2021-01-19 13:51:08 +01:00
Julian Brost 88e5744d54 AddDowntime: return Downtime::Ptr instead of String containing the name
At numerous places in the code, something like this is performed:

    String name = Downtime::AddDowntime(...);
    Downtime::Ptr downtime = Downtime::GetByName(name);

However, `downtime` can be a `nullptr` after this as it is possible that
the downtime is deleted in between.

This commit changes the return type of `Downtime::AddDowntime` to return
a Downtime::Ptr instead of the full name of the downtime. `AddDowntime`
performs the very same `GetByName()` operation internally, but handles
the `nullptr` case correctly and throws an exception.
2021-01-15 16:34:48 +01:00
Alexander Aleksandrovič Klimov c549a6657e
Merge pull request #8562 from Icinga/bugfix/fix-no-renotification-for-non-ok-state-changes-8545
Fix no re-notification for non OK state changes with time delay
2021-01-14 17:49:29 +01:00
Alexander Aleksandrovič Klimov 70b438a2bf
Merge pull request #8104 from Icinga/bugfix/remove-downtime-returns-wrong-status-7408
API: Display a correct status when removing a downtime
2021-01-14 17:49:00 +01:00
Yonas Habteab 997ad86225 Fix no re-notification for non OK state changes with time delay 2021-01-14 11:54:25 +01:00
Julian Brost f12666c166
Merge pull request #8157 from Icinga/bugfix/temporary-files-5124
Clean up temp files
2021-01-13 15:45:29 +01:00
Alexander A. Klimov 450b2117d2 Add ".tmp" to state and modified attributes temp files
refs #5124
2021-01-12 17:35:29 +01:00
Alexander A. Klimov 18c2dae941 Clean up temp files
refs #5124
2021-01-12 17:35:29 +01:00
Julian Brost aea06a27dc Use reference-counted pointer in notification callback
`this` could be deleted after `Notification::BeginExecuteNotification`
exited and before `Notification::ExecuteNotificationHelper` finished.
This is fixed by constructing a `Notification::Ptr` and operate on that
one as it is properly reference-counted.
2021-01-12 17:19:29 +01:00
Yonas Habteab 756abbb2ff ApiEvents: Implement new API event streams response 2021-01-11 14:59:48 +01:00
Julian Brost 0276c0b052 Properly handle service downtime referencing a deleted host
Only two out of three cases were handled properly by the code: host
downtimes referencing a deleted host and service downtimes referencing a
deleted service worked fine. However, if a service downtime references a
deleted host, `Host::GetByName()` returns `nullptr` which isn't
accounted for. Use `Service::GetByNamePair()` instead as this performs a
check for the host being null internally.
2021-01-08 11:12:15 +01:00
Alexander A. Klimov f311dfb775 Apply rules: import default templates first
... to allow to override the attributes they set.

refs #7914
2020-12-14 18:15:18 +01:00
Alexander A. Klimov 5547488cd5 Introduce Checkable#NotificationReasonSuppressed()
refs #8509
2020-12-14 13:27:58 +01:00
Alexander Aleksandrovič Klimov 915a3c3001
Merge pull request #8436 from Icinga/bugfix/children-recover-too-late
On recovery: re-check children
2020-12-11 15:41:31 +01:00
Yonas Habteab dd02e3b6d8 API: Display a correct status code when removing a scheduled downtime 2020-12-07 13:19:41 +01:00
Julian Brost f2a532de32
Merge pull request #8035 from Icinga/feature/expiry-date-comments-4663
/v1/actions/add-comment: add param expiry
2020-12-04 15:48:50 +01:00
Alexander A. Klimov 854939a8ce On recovery: re-check children 2020-12-02 12:24:40 +01:00
Alexander A. Klimov 668bf06424 Don't fire suppressed notifications if last parent recovery >= last check result 2020-12-02 12:03:19 +01:00
Alexander Aleksandrovič Klimov bee4ac7f7c
Merge pull request #8040 from Icinga/feature/v1-actions-execute-command-8034
Add API endpoint: /v1/actions/execute-command
2020-12-02 10:53:24 +01:00
Alexander A. Klimov 5cfac1f643 Fix function and variable names
refs #8034
2020-11-23 16:43:47 +01:00
Alexander A. Klimov 0ad1ab20aa Fix code style
refs #8034
2020-11-23 16:39:24 +01:00
Alexander Aleksandrovič Klimov 4f6fecc74c
Merge pull request #8101 from Icinga/bugfix/timestamps-checkresult-differ-across-nodes-8092
State timestamps set by the same check result differ across nodes
2020-10-30 17:24:15 +01:00
Alexander Aleksandrovič Klimov 3fa1eab344
Merge pull request #7736 from froehl/7735
API-Event StateChange & CheckResult: Added acknowledgement and downtime_depth…
2020-10-29 13:41:52 +01:00
Alexander A. Klimov bb851b0558 Merge branch 'master' into feature/v1-actions-execute-command-8034 2020-10-28 18:37:08 +01:00
Alexander A. Klimov 40ac05c182 Introduce Endpoint#capabilities
refs #8034
2020-10-19 13:04:20 +02:00
Alexander A. Klimov 9e29936b8f Fix missing include
refs #8034
2020-10-19 13:04:20 +02:00
Alexander Aleksandrovič Klimov 9dd197d0af
Merge pull request #8165 from Icinga/bugfix/service-get-severity-deadlock-8160
ProcessCheckResult(): Make sure hosts aren't locked during Service::GetSeverity()
2020-10-16 10:41:23 +02:00
Alexander Aleksandrovič Klimov be8dd201eb
Merge pull request #8229 from Icinga/bugfix/downtime-checkable-getname
Check !!downtime->GetCheckable() before downtime->GetCheckable()->GetName()
2020-10-16 10:39:57 +02:00
Fabian Röhl ca487ed732 #7735 API-Event StateChange & CheckResult: Added acknowledgement and downtime_depth 2020-09-18 08:20:29 +02:00
Alexander A. Klimov fe4a44096a Merge branch 'feature/v1-actions-execute-command-8034-deadline-timer' into feature/v1-actions-execute-command-8034
refs #8034
2020-09-15 18:14:01 +02:00
Alexander A. Klimov 8b0ba2275a Check !!downtime->GetCheckable() before downtime->GetCheckable()->GetName()
... not to crash while removing a downtime from a disappeared checkable.
2020-09-11 14:47:46 +02:00
Mattia Codato 6801419bd6 Add newline ad the end of file 2020-08-26 16:41:02 +02:00
Mattia Codato 19628252f8 Add timer to clean deadlined executions 2020-08-26 15:48:04 +02:00
Alexander A. Klimov ade891bbf5 Revert "MacroProcessor::ResolveArguments(): skip null argument values"
This reverts commit e4bdcedbca.
2020-08-13 10:39:55 +02:00
Noah Hilverling ddf1e50d93 ProcessCheckResult(): Make sure hosts aren't locked during Service::GetSeverity() 2020-08-11 15:24:54 +02:00
Mattia Codato df2b82f7fe Remove an useless check 2020-08-05 16:14:57 +02:00
Mattia Codato 7474ab6de5 Set exit code 126 if endpoint doens't support the new executeCommand API 2020-08-05 15:53:34 +02:00
Mattia Codato 7c004af6be Check child endpoint versions and check child zone can access to the target endpoint 2020-08-05 14:08:54 +02:00
Mattia Codato a90068cc78 Check satellites Icinga version before relay the execute command message 2020-08-05 10:01:29 +02:00
mcodato 730075a177
Merge pull request #1 from Al2Klimov/version
Introduce Endpoint#icinga_version
2020-08-05 09:23:28 +02:00
Mattia Codato ac71cc67f8 Use local zone for update executions 2020-08-04 16:09:21 +02:00
Mattia Codato c6c1849106 Change checkable with the endpoint zone for execute command relay message 2020-08-04 14:32:36 +02:00
Mattia Codato 951388797a Forward the execute command through the zones 2020-08-03 21:09:57 +02:00
Eric Lippmann e8745f7e96
Merge pull request #7816 from Icinga/feature/notification-timeperiod-6167
Re-send notifications previously suppressed by their time periods
2020-08-03 10:04:27 +02:00
Mattia Codato 9c4a3aed1b Use ExecuteOverride to override the command 2020-07-31 17:28:33 +02:00
Mattia Codato cf1430c409 Fix update execution 2020-07-31 14:07:48 +02:00
Mattia Codato 604b938ade Fix macros substitutions 2020-07-31 11:15:17 +02:00
Mattia Codato 44fc841ee1 Notify to all nodes that execution has completed 2020-07-31 10:42:01 +02:00