Commit Graph

12513 Commits

Author SHA1 Message Date
Alexander Aleksandrovič Klimov ef23ae5f3c
Merge pull request #8267 from efuss/passive_reach
Drop passive check results for unreachable hosts/services
2021-01-20 17:07:52 +01:00
Noah Hilverling e060995fd8 Flapping: Allow to ignore states in flapping calculation 2021-01-20 11:09:03 +01:00
Alexander Aleksandrovič Klimov 5e810f30a7
Merge pull request #8605 from Icinga/bugfix/icingadb-initial-sync-log
IcingaDB: wait for queries to be executed in initial sync
2021-01-20 10:47:46 +01:00
Julian Brost 2d080f14eb IcingaDB: start initial dump in callback instead of timer
Previously, the initial config dump was started in a timer executed
every 15 seconds. During the first execution of the timer, the Redis
connection is typically not established yet. Therefore, this delayed the
initial sync by up to 15 seconds.

This commit instead triggers the sync from a callback that is executed
after the connection is successfully established.

The timer is removed completely. At first glance, it looks like it would
ensure that a lost connection is reestablished, but this is handled
internally by RedisConnection. After the config has been dumped once,
that timer would never attempt a reconnect anyway.
2021-01-20 09:31:27 +01:00
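
The switch from a polling timer to a connect callback described in 2d080f14eb can be sketched roughly as follows. This is only an illustration: `SketchConnection`, `SketchDumper`, and all their members are hypothetical names and do not reflect the actual RedisConnection or IcingaDB interfaces.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a connection class (not the real RedisConnection API).
class SketchConnection {
public:
    // Register a callback that runs once the connection is established.
    void SetConnectedCallback(std::function<void()> cb) { m_OnConnected = std::move(cb); }

    // Called by the (simulated) I/O layer as soon as the connection is up.
    void MarkConnected() {
        m_Connected = true;
        if (m_OnConnected) {
            m_OnConnected();
        }
    }

    bool IsConnected() const { return m_Connected; }

private:
    bool m_Connected = false;
    std::function<void()> m_OnConnected;
};

// Hypothetical consumer: starts its initial dump from the callback,
// so no time is lost waiting for the next timer tick.
class SketchDumper {
public:
    explicit SketchDumper(SketchConnection& conn) : m_Conn(conn) {
        m_Conn.SetConnectedCallback([this] { RunInitialDump(); });
    }

    int DumpCount() const { return m_Dumps; }

private:
    void RunInitialDump() {
        assert(m_Conn.IsConnected()); // the callback only fires after connecting
        ++m_Dumps;
    }

    SketchConnection& m_Conn;
    int m_Dumps = 0;
};
```

In this sketch the dump runs exactly when `MarkConnected()` is called, rather than up to one timer interval later.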
Edgar Fuß 3c050fcc46 Drop passive check results for unreachable hosts/services
Disregard passive check results while no active checks are being scheduled due to violated dependencies.

This copes with the fact that programs feeding passive check results into Icinga may have no notion of reachability and so drive a checkable into a HARD state although dependencies have caused active check scheduling to be suspended. This may prevent superfluous problem notifications from being emitted during recovery.

As disable_checks defaults to false, it was regarded OK (by @Al2Klimov) to make this behaviour (which resembles the active check case) unconditional and not conditionalize it on an additional attribute.

In the description of disable_checks, note that a value of true both disables scheduling of active checks and drops passive check results.
2021-01-19 20:08:38 +01:00
Julian Brost 174f7f75a8 IcingaDB: wait for queries to be executed in initial sync
This delays the log message stating that the initial dump is done until
all queries are actually done and now logs a meaningful duration. In
addition, this delays the return of the function and therefore when
state variables are updated by the caller.
2021-01-19 17:14:42 +01:00
Alexander Aleksandrovič Klimov 3976f256a8
Merge pull request #8593 from Icinga/bugfix/activate-items-assertion
Remove incorrect assertion in ConfigItem::ActivateItems
2021-01-19 13:52:16 +01:00
Alexander Aleksandrovič Klimov cbd0d6ea6e
Merge pull request #8588 from Icinga/bugfix/concurrent-schedule-downtime-delete-host
Fix null pointer dereferences when deleting objects while scheduling downtimes
2021-01-19 13:51:08 +01:00
Alexander Aleksandrovič Klimov e9abbf1803
Merge pull request #8603 from Icinga/probot/update-authors/master/126f586d882235ccd7fe5d932aeb595e817435d3
Update AUTHORS
2021-01-19 13:48:03 +01:00
icinga-probot[bot] f7c7c329dc
Update AUTHORS 2021-01-19 11:44:39 +00:00
Alexander Aleksandrovič Klimov 126f586d88
Merge pull request #8589 from iustin/feature/check-dns-all-argument
Improve check_dns command when used with monitoring-plugins 2.3
2021-01-19 12:44:24 +01:00
Alexander Aleksandrovič Klimov 54541eccfd
Merge pull request #8596 from efuss/VERSION
Avoid name clashes on case-insensitive file systems
2021-01-19 12:04:23 +01:00
Julian Brost 509db4ab94 Delay start of IcingaDB until most config objects are activated
This commit sets the activation priority of IcingaDB objects to 100 (the
same value as IDO uses) so that it gets activated after most regular
config objects (hosts, services, ...).

Before (note how Icinga 2 continues to activate objects for over a minute
after IcingaDB is started and thinks the initial dump is done):

    [2021-01-19 08:33:19 +0000] information/IcingaDB: 'icingadb' started.
    [2021-01-19 08:34:02 +0000] information/IcingaDB: Initial config/status dump finished in 28.247 seconds.
    [2021-01-19 08:35:49 +0000] information/ConfigItem: Activated all objects.

After (now activation of objects is done right after IcingaDB is
started, as it's one of the last objects to be activated):

    [2021-01-19 08:39:01 +0000] information/IcingaDB: 'icingadb' started.
    [2021-01-19 08:39:02 +0000] information/ConfigItem: Activated all objects.
    [2021-01-19 08:39:38 +0000] information/IcingaDB: Initial config/status dump finished in 21.6606 seconds.
2021-01-19 09:45:47 +01:00
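
The effect of the activation priority in 509db4ab94 can be illustrated with a small ordering sketch. It assumes that types with lower priority values are activated first, which is what the commit message implies; `SketchType`, `ActivationOrder`, and the priority value 0 used for hosts and services in the usage note are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a config type and its activation priority.
struct SketchType {
    std::string name;
    int activationPriority;
};

// Returns the type names in the order in which they would be activated,
// assuming lower priority values come first.
std::vector<std::string> ActivationOrder(std::vector<SketchType> types) {
    // stable_sort keeps the original order among types with equal priority.
    std::stable_sort(types.begin(), types.end(),
        [](const SketchType& a, const SketchType& b) {
            return a.activationPriority < b.activationPriority;
        });

    std::vector<std::string> order;
    for (const auto& type : types) {
        order.push_back(type.name);
    }

    assert(order.size() == types.size()); // sanity check: nothing was dropped
    return order;
}
```

With the input `{{"IcingaDB", 100}, {"Host", 0}, {"Service", 0}}`, the sketch places `IcingaDB` last, matching the "After" log excerpt in the commit message.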
Julian Brost 6abab6bddc
Merge pull request #8599 from Icinga/bugfux/check-if-cr-is-not-nullptr-while-writing-history-8592
IcingaDB: Check whether or not cr is nullptr
2021-01-18 18:36:06 +01:00
Iustin Pop 8509e55b78 Improve check_dns command when used with monitoring-plugins 2.3
Version 2.3 of monitoring plugins did two things:

- change how multiple addresses are expected; no longer a single
  argument, with comma separated values, but repeated "-a" argument;
  sadly this is an incompatible change, and configs need to be changed
  manually; this is commit
  monitoring-plugins/monitoring-plugins@a03068743f;
- add a "-L" argument that requires all passed addresses to be
  matched, which allows for stronger validation (all vs. at least
  one); this is commit
  monitoring-plugins/monitoring-plugins@fd9a7d2e00;

Both of these were committed a long while ago (2018), but were only
released very recently, in the 2.3 release (December 2020).

I've tried to make the descriptions as good as I could, but not sure
they're very readable, feedback welcome.

Signed-off-by: Iustin Pop <iustin@k1024.org>
2021-01-18 18:10:33 +01:00
Yonas Habteab 5b0bbd6351 IcingaDB: Check whether or not cr is nullptr 2021-01-18 11:38:31 +01:00
Alexander Aleksandrovič Klimov b8e0d23164
Merge pull request #8597 from Icinga/probot/update-authors/master/a55c7d4b86cc9723aee078d3a4a81fcf70a4b788
Update AUTHORS
2021-01-15 17:56:30 +01:00
icinga-probot[bot] fe8b07ddc8
Update AUTHORS 2021-01-15 16:55:36 +00:00
Alexander Aleksandrovič Klimov a55c7d4b86
Merge pull request #8568 from yayayayaka/feature/itl-check-systemd
ITL: Add systemd CheckCommand
2021-01-15 17:55:28 +01:00
Edgar Fuß 718ebe3cbd Avoid name clashes on case-insensitive file systems
On case-insensitive file systems (e.g. macOS), the VERSION file collides with the Boost-provided version file on #include <version>.

Work around by re-naming VERSION to ICINGA2_VERSION.
2021-01-15 17:46:16 +01:00
Julian Brost e727675aaf Remove incorrect assertion in ConfigItem::ActivateItems
There is an assertion that after activating items, all these items are
active, which sounds reasonable at first. However, with concurrent API
queries, some of these could already be deleted and therefore be
deactivated again.
2021-01-15 16:40:07 +01:00
Julian Brost 88e5744d54 AddDowntime: return Downtime::Ptr instead of String containing the name
At numerous places in the code, something like this is performed:

    String name = Downtime::AddDowntime(...);
    Downtime::Ptr downtime = Downtime::GetByName(name);

However, `downtime` can be a `nullptr` after this as it is possible that
the downtime is deleted in between.

This commit changes the return type of `Downtime::AddDowntime` to return
a Downtime::Ptr instead of the full name of the downtime. `AddDowntime`
performs the very same `GetByName()` operation internally, but handles
the `nullptr` case correctly and throws an exception.
2021-01-15 16:34:48 +01:00
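
The API change in 88e5744d54 can be sketched as follows. This is a simplified model, not the Icinga 2 code: `SketchRegistry` and `SketchDowntime` are hypothetical, and `std::shared_ptr` stands in for `Downtime::Ptr`. The point is that the caller receives the object itself and no longer has to perform a second lookup that may return `nullptr`.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for Downtime and Downtime::Ptr.
struct SketchDowntime {
    std::string name;
};
using SketchDowntimePtr = std::shared_ptr<SketchDowntime>;

class SketchRegistry {
public:
    // Lookup by name; returns nullptr if the object no longer exists.
    SketchDowntimePtr GetByName(const std::string& name) const {
        auto it = m_Items.find(name);
        if (it == m_Items.end()) {
            return nullptr;
        }
        return it->second;
    }

    void Remove(const std::string& name) { m_Items.erase(name); }

    // Returns the created object instead of its name. If the object has
    // vanished by the time it is looked up, this throws instead of handing
    // the caller a null pointer.
    SketchDowntimePtr Add(const std::string& name) {
        auto downtime = std::make_shared<SketchDowntime>();
        downtime->name = name;
        m_Items[name] = downtime;

        SketchDowntimePtr created = GetByName(name);
        if (!created) {
            throw std::runtime_error("downtime '" + name + "' was deleted during creation");
        }
        return created;
    }

private:
    std::map<std::string, SketchDowntimePtr> m_Items;
};
```

A caller holding the returned pointer can keep using the object even if it is removed from the registry afterwards.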
Lara bcaa7d6292 ITL: Add check_systemd
The [check_systemd.py](https://github.com/Josef-Friedrich/check_systemd)
plugin will report a degraded system to your monitoring solution. It
requires only the
[nagiosplugin](https://nagiosplugin.readthedocs.io/en/stable) library.
2021-01-15 13:27:25 +01:00
Alexander Aleksandrovič Klimov 986bedd9a0
Merge pull request #8594 from Icinga/feature/remove-upq-from-activate-items
Remove upq from ConfigItem::ActivateItems
2021-01-15 12:09:57 +01:00
Alexander Aleksandrovič Klimov 4063e39d5f
Merge pull request #8515 from Icinga/feature/update-ssl-context-after-accepting-new-connection-8501
API: Update the SSL context after accepting each incoming connection
2021-01-15 11:21:36 +01:00
Noah Hilverling 4554d3c50e
Merge pull request #8529 from Icinga/feature/fedora31
GitHub actions: drop Fedora 31
2021-01-15 09:28:36 +01:00
Yonas Habteab d27f533e5f ApiListener: Update the SSL context after accepting each incoming connection 2021-01-14 18:40:20 +01:00
Yonas Habteab 057254695d Utility: Introduce new helper function Utility::GetFileCreationTime() 2021-01-14 18:39:14 +01:00
Alexander Aleksandrovič Klimov d82e4987bd
Merge pull request #8427 from Icinga/bugfix/fork-exit
StartUnixWorker(): don't exit() on fork() failure
2021-01-14 17:49:41 +01:00
Alexander Aleksandrovič Klimov c549a6657e
Merge pull request #8562 from Icinga/bugfix/fix-no-renotification-for-non-ok-state-changes-8545
Fix no re-notification for non OK state changes with time delay
2021-01-14 17:49:29 +01:00
Alexander Aleksandrovič Klimov 70b438a2bf
Merge pull request #8104 from Icinga/bugfix/remove-downtime-returns-wrong-status-7408
API: Display a correct status when removing a downtime
2021-01-14 17:49:00 +01:00
Alexander Aleksandrovič Klimov 3ebd44ce32
Merge pull request #8001 from Icinga/feature/documentation-time-zones-7069
Doc: clarify TimePeriod/ScheduledDowntime time zone handling
2021-01-14 17:23:55 +01:00
Alexander A. Klimov 28bd23824d Doc: clarify TimePeriod/ScheduledDowntime time zone handling
refs #7069
2021-01-14 16:49:08 +01:00
Alexander A. Klimov 931b9307ae StartUnixWorker(): don't exit() on fork() failure
... but let the caller handle the failure.

Not to stop working completely just because of fork() failure during a reload.
2021-01-14 13:40:18 +01:00
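
The idea behind 931b9307ae, reporting a fork() failure to the caller instead of calling exit(), might look like this in isolation. `SpawnWorkerSketch` and `WaitForWorkerSketch` are hypothetical helpers and do not mirror the real StartUnixWorker() signature.

```cpp
#include <cassert>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the child's PID, or -1 if fork() failed,
// so that the caller decides how to react.
pid_t SpawnWorkerSketch(const std::function<int()>& work) {
    assert(static_cast<bool>(work)); // an empty callable is a programming error

    pid_t pid = fork();
    if (pid < 0) {
        return -1; // report the failure; the parent process keeps running
    }
    if (pid == 0) {
        _exit(work()); // child: run the work and leave without further cleanup
    }
    return pid; // parent: hand the PID to the caller
}

// Waits for the worker and returns its exit code, or -1 if it did not exit normally.
int WaitForWorkerSketch(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

A caller can treat a return value of -1 as "reload failed, keep the current worker" rather than losing the whole process.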
Alexander A. Klimov e1bc4d474f On check_timeout first send SIGTERM
... to allow check plugins to terminate gracefully.

refs #6162
2021-01-14 12:00:11 +01:00
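
A minimal sketch of the "SIGTERM first" approach from e1bc4d474f follows. It assumes that a forced kill follows if the process ignores SIGTERM; the grace period, the polling interval, and the name `TerminateWithGrace` are assumptions for illustration and are not taken from the Icinga 2 source.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: asks a child process to terminate with SIGTERM and
// falls back to SIGKILL once the grace period has elapsed.
// Returns true if the child ended within the grace period, false if SIGKILL was needed.
bool TerminateWithGrace(pid_t pid, std::chrono::milliseconds grace) {
    assert(pid > 0); // only a concrete child PID makes sense here

    kill(pid, SIGTERM); // give the process a chance to clean up

    const auto deadline = std::chrono::steady_clock::now() + grace;
    int status = 0;

    while (std::chrono::steady_clock::now() < deadline) {
        pid_t result = waitpid(pid, &status, WNOHANG);
        if (result == pid) {
            return true; // the child ended within the grace period
        }
        if (result < 0) {
            return true; // nothing left to wait for (e.g. already reaped)
        }
        usleep(10 * 1000); // poll every 10 ms
    }

    kill(pid, SIGKILL); // the grace period is over
    waitpid(pid, &status, 0);
    return false;
}
```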
Yonas Habteab 997ad86225 Fix no re-notification for non OK state changes with time delay 2021-01-14 11:54:25 +01:00
Alexander Aleksandrovič Klimov 5efe3e662c
Merge pull request #8025 from Icinga/bugfix/downtime-for-host-service-with-long-name-8022
ConfigObjectUtility::GetObjectConfigPath(): hash names of not already existing objects
2021-01-14 10:42:04 +01:00
Julian Brost a68d774f78
Merge pull request #8581 from bebehei/shell-exitcode
Shell exitcode
2021-01-14 09:56:21 +01:00
Julian Brost db30704d14
Merge pull request #8532 from Icinga/bugfix/do-not-override-error-codes-that-are-not-200
HTTP: Do not override status codes that are not 200
2021-01-14 09:34:04 +01:00
Yonas Habteab 066db5ef60 HTTP: Don't override status codes that are not OK 2021-01-13 18:56:56 +01:00
Julian Brost f12666c166
Merge pull request #8157 from Icinga/bugfix/temporary-files-5124
Clean up temp files
2021-01-13 15:45:29 +01:00
Julian Brost 0c6abc817b Remove upq from ConfigItem::ActivateItems
Since commit d9010c7b9f, ActivateItems no
longer uses the WorkQueue upq to perform tasks but instead performs
these locally. One instance of `upq.Join()`/`upq.HasExceptions()`
remained in the function, but I believe this was just missed when
removing the `upq.Enqueue()` call just before.

This commit removes the corresponding parameter and updates all call
sites accordingly.
2021-01-13 15:19:55 +01:00
Alexander Aleksandrovič Klimov 5a0118c6d8
Merge pull request #8442 from Icinga/bugfix/close-ebadf-8437
Close FDs based on /proc/self/fd
2021-01-13 10:59:15 +01:00
Alexander Aleksandrovič Klimov f1110eb321
Merge pull request #8591 from Icinga/bugfix/concurent-notification-send-and-delete
Fix crash when notifications are sent while the notification object is deleted
2021-01-13 10:58:37 +01:00
Alexander A. Klimov 68a0079c26 ConfigObjectUtility::GetObjectConfigPath(): hash names of not already existing objects
... to avoid too long file names.

refs #8022
2021-01-12 18:03:22 +01:00
Alexander A. Klimov 450b2117d2 Add ".tmp" to state and modified attributes temp files
refs #5124
2021-01-12 17:35:29 +01:00
Alexander A. Klimov 18c2dae941 Clean up temp files
refs #5124
2021-01-12 17:35:29 +01:00
Alexander A. Klimov 26c944125b Close FDs based on /proc/self/fd
... not to waste time with close(2)ing RLIMIT_NOFILE-3 non-existing FDs.

Newer kernel = higher RLIMIT_NOFILE = more time wasted

refs #8437
2021-01-12 17:32:28 +01:00
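
The approach named in 26c944125b, closing only the descriptors that are actually open, could be sketched like this on Linux. `CloseFdsFrom` is a hypothetical helper, and returning false when /proc/self/fd is unavailable (so the caller can fall back to another strategy) is an assumption of this sketch, not something the commit message states.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>
#include <dirent.h>
#include <unistd.h>

// Hypothetical helper: closes every open file descriptor >= minFd by
// listing /proc/self/fd instead of looping up to RLIMIT_NOFILE.
// Returns false if the directory cannot be opened.
bool CloseFdsFrom(int minFd) {
    assert(minFd >= 0);

    DIR* dir = opendir("/proc/self/fd");
    if (!dir) {
        return false;
    }

    const int dirFd = dirfd(dir); // the directory stream itself owns an FD
    std::vector<int> toClose;

    while (struct dirent* entry = readdir(dir)) {
        char* end = nullptr;
        long fd = std::strtol(entry->d_name, &end, 10);
        if (end == entry->d_name || *end != '\0') {
            continue; // skips "." and ".."
        }
        if (fd >= minFd && fd != dirFd) {
            toClose.push_back(static_cast<int>(fd));
        }
    }

    closedir(dir); // also releases dirFd

    // Close after the listing is complete so the directory is not modified while reading it.
    for (int fd : toClose) {
        close(fd);
    }
    return true;
}
```

The work done is proportional to the number of open descriptors rather than to the RLIMIT_NOFILE value.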
Julian Brost aea06a27dc Use reference-counted pointer in notification callback
`this` could be deleted after `Notification::BeginExecuteNotification`
exited and before `Notification::ExecuteNotificationHelper` finished.
This is fixed by constructing a `Notification::Ptr` and operating on that
one, as it is properly reference-counted.
2021-01-12 17:19:29 +01:00
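
The lifetime fix in aea06a27dc follows a common pattern: the deferred callback captures a reference-counted pointer instead of a raw `this`. The sketch below uses `std::shared_ptr` and `std::enable_shared_from_this` as stand-ins for `Notification::Ptr`; `SketchNotification` and `BeginExecute` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a notification object whose work is deferred.
class SketchNotification : public std::enable_shared_from_this<SketchNotification> {
public:
    explicit SketchNotification(std::string name) : m_Name(std::move(name)) {}

    // Returns a deferred task. The task owns a reference to the object,
    // so the object stays alive until the task itself is destroyed.
    std::function<std::string()> BeginExecute() {
        std::shared_ptr<SketchNotification> self = shared_from_this();
        assert(self); // the object must be managed by a shared_ptr

        return [self] { return "sent: " + self->m_Name; };
    }

private:
    std::string m_Name;
};
```

Even if every other owner drops its reference before the task runs, the captured `self` keeps the object valid.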
Julian Brost 5f548c8f89
Merge pull request #8431 from Icinga/feature/windows-lower-fqdn-7407
Windows agent: Default to lower case FQDN
2021-01-12 12:44:58 +01:00