Previously, the initial config dump was started in a timer executed
every 15 seconds. During the first execution of the timer, the Redis
connection is typically not established yet. Therefore, this delayed the
initial sync by up to 15 seconds.
This commit instead triggers the sync from a callback that is executed
after the connection is successfully established.
The timer is removed completely. On first glance, it looks like it would
ensure that a lost connection is reestablished, but this is handled
internally by RedisConnection. After the config has been dumped once,
that timer wouldn't ever attempt a reconnect anyways.
Disregard passive check results while no active checks are being scheduled due to violated dependencies.
This copes with the fact that programs feeding passive check results into Icinga may have no notion of reachability and so drive a checkable into HARD state although dependencies have caused active check scheduling being suspended. This may prevent superflous problem notifications being emitted during recovery.
As disable_checks defaults to false, it was regarded OK (by @Al2Klimov) to make this behaviour (which resembles the active check case) unconditional and not conditionalize it on an additional attribute.
In the description of disable_checks, note that a value of true both disables scheduling of active checks and drops passive check results.
This delays the log message stating that the initial dump is done until
all queries are actually done and now logs a meaningful duration. In
addition, this delays the return of the function and therefore when
state variables are updated by the caller.
This commit sets the activation priority if IcingaDB objects to 100 (the
same value as IDO uses) so that it get's activated after most regular
config objects (hosts, services, ...).
Before (note how Icinga 2 continues to active objects for over a minute
after IcingaDB is started and thinks the initial dump is done):
[2021-01-19 08:33:19 +0000] information/IcingaDB: 'icingadb' started.
[2021-01-19 08:34:02 +0000] information/IcingaDB: Initial config/status dump finished in 28.247 seconds.
[2021-01-19 08:35:49 +0000] information/ConfigItem: Activated all objects.
After (now activation of objects is done right after IcingaDB is
started, as it's one of the last objects to be activated):
[2021-01-19 08:39:01 +0000] information/IcingaDB: 'icingadb' started.
[2021-01-19 08:39:02 +0000] information/ConfigItem: Activated all objects.
[2021-01-19 08:39:38 +0000] information/IcingaDB: Initial config/status dump finished in 21.6606 seconds.
Version 2.3 of monitoring plugins did two things:
- change how multiple addresses are expected; no longer a single
argument, with comma separated values, but repeated "-a" argument;
sadly this is incompatible change, and configs need to be changed
manually; this is commit
monitoring-plugins/monitoring-plugins@a03068743f;
- add a "-L" argument that requires all passed addresses to be
matched, which allows for stronger validation (all vs. at least
one); this is commit
monitoring-plugins/monitoring-plugins@fd9a7d2e00;
Both of these were committed a long while ago (2018), but were only
released very recently, in the 2.3 release (December 2020).
I've tried to make the descriptions as good as I could, but not sure
they're very readable, feedback welcome.
Signed-off-by: Iustin Pop <iustin@k1024.org>
On case-insensitive file systems (i.e. macOS), the VERSION file collides with the Boost-provided version file on #include <version>.
Work around by re-naming VERSION to ICINGA2_VERSION.
There is an assertion that after activating items, all these items are
active, which sounds reasonable at first. However, with concurrent API
queries, some of these could already be deleted and therefore be
deactivated again.
At numerous places in the code, something like this is performed:
String name = Downtime::AddDowntime(...);
Downtime::Ptr downtime = Downtime::GetByName(name);
However, `downtime` can be a `nullptr` after this as it is possible that
the downtime is deleted in between.
This commit changes the return type of `Downtime::AddDowntime` to return
a Downtime::Ptr instead of the full name of the downtime. `AddDowntime`
performs the very same `GetByName()` operation internally, but handles
the `nullptr` case correctly and throws an exception.
Since commit d9010c7b9f, ActivateItems no
longer uses the WorkQueue upq to perform tasks but instead performs
these locally. One instance of `upq.Join()`/`upq.HasExceptions()`
remained in the function, but I believe this was just missed when
removing the `upq.Enqueue()` call just before.
This commit removes the corresponding parameter and updates all call
sites accordingly.
`this` could be deleted after `Notification::BeginExecuteNotification`
exited and before `Notification::ExecuteNotificationHelper` finished.
This is fixed by constructing a `Notification::Ptr` and operate on that
one as it is properly reference-counted.