The Icinga DB daemon processes the data from the `IcingaApplication`
type only and Icinga DB Web also uses only those stats. However, before
this commit, Icinga DB published all kinds of useless stats to Redis
each second, like the number of (un)reachable hosts, services, and so
on, which is waste of CPU and some other resources. This commit reduces
the published data drastically to only those simple stats coming from
the `IcingaApplication` type.
It's not used. Also, the callback shall run completely at once. This ensures that it won't (continue to) run once another coroutine on the strand calls Timeout#Cancel().
By design, only one Icinga 2 instance should be responsible in the HA
context. If this promise is broken, the Icinga 2 IcingaDB check should
report it.
The code did not check for invalid data in icingadb:telemetry:heartbeat.
With this change, it will go CRITICAL with a descriptive message and
report the actual number of icingadb_responsible_instances in the
performance data.
The Redis ACL system was introduced with Redis 6.0. It introduced users
with precisely granular permissions. This change allows Icinga 2 to use
the Icinga DB feature against a Redis with an ACL user.
This was reflected in the documentation, next to the already
implemented, but undocumented Redis database.
Closes#9536.
Each configuration field of an IcingaDB Object was marked with
no_user_modify as modifications via the API would not result in an
actual change. While the Object would be updated, the internal Redis
connection would not be restarted, resulting in an unexpected behavior.
The missing db_index was added to the documentation.
The table sla_history_downtime requires a downtime_end.
The Go daemon takes the cancel_time if has_been_cancelled is 1.
So we must supply a cancel_time whereever has_been_cancelled is 1.
Otherwise the Go daemon can't process some entries.
and, in case of null, fall back to Checkable#check_command.timeout, just like
IcingaDB#SerializeState(). Otherwise the Go daemon crashes. It expects a number.
At the moment, the Icinga DB feature will use that value as-is and
serialize it to JSON, resulting in a crash in Icinga DB down the road
because it expects a boolean.
This commit removes EmbeddedNamespaceValue and ConstEmbeddedNamespaceValue and
reduces NamespaceValue down to a simple struct without inheritance or member
functions. The code from these clases is inlined into the Namespace class. The
class hierarchy determining whether a value is const is moved to an attribute
of NamespaceValue.
This is done in preparation for changes to the locking in the Namespace class.
Currently, it relies on a recursive mutex. In the future, a shared mutex
(read/write lock) should be used instead, which cannot allow recursive locking
(without failing or risk deadlocking on lock upgrades). With this change, all
operations requiring a lock for one operation are within one function, no
recursive locking is not needed any more.
Before (time: vertical, stack: horizontal):
* Checkable::ExecuteCheck
* Checkable::UpdateNextCheck
* IcingaDB::NextCheckChangedHandler
* HSET icinga:host:state
* HSET icinga:checksum:host:state
* ZADD icinga:nextupdate:host
* RandomCheckTask::ScriptFunc
* Checkable::ProcessCheckResult
* Checkable::UpdateNextCheck
* IcingaDB::NextCheckChangedHandler
* HSET icinga:host:state
* HSET icinga:checksum:host:state
* ZADD icinga:nextupdate:host
* IcingaDB::NewCheckResultHandler
* HSET icinga:host:state
* HSET icinga:checksum:host:state
* ZADD icinga:nextupdate:host
* IcingaDB::StateChangeHandler
* XADD icinga:runtime:state
* IcingaDB::ForwardHistoryEntries
* XADD icinga:history:stream:state
After:
* Checkable::ExecuteCheck
* Checkable::UpdateNextCheck
* RandomCheckTask::ScriptFunc
* Checkable::ProcessCheckResult
* Checkable::UpdateNextCheck
* IcingaDB::NewCheckResultHandler
* HSET icinga:host:state
* HSET icinga:checksum:host:state
* ZADD icinga:nextupdate:host
* IcingaDB::StateChangeHandler
* XADD icinga:runtime:state
* IcingaDB::ForwardHistoryEntries
* XADD icinga:history:stream:state
The first state + nextupdate (for overdue) update comes from next_check being
set to now + interval immediately before doing the actual check (not to trigger
it twice). This update is not only not important for the end user, but even
inappropriate. The end user SHALL see next_check being e.g. in -4s, not 5m, as
the check is running at the moment.
The second one is just redundant as IcingaDB::NewCheckResultHandler (the third
one) is called anyway and will update state + nextupdate as well.