Without this commit, children and parents of a checkable were rescheduled on a
state change while holding the lock for the current checkable. If both ends of
a dependency are checked at the same time and both change state, they could end
up in a deadlock waiting for each other.
This commit fixes this problem by changing the code so that other checkables
are rescheduled only after releasing the lock for the current checkable.
to ensure Checkable#IsReachable() returns correctly for dependency children inside OnReachabilityChanged().
That needs the dependency parent to be already in the correct state.
refs #9143
When triggering a downtime, the time of the causing event is now passed on as
the trigger time. That time is:
* For fixed downtimes: the later one of start and entry time.
* If a check result triggers the downtime: The execution end of the check
result.
* If another downtime triggers the downtime: The trigger time of the first
downtime.
This is done so two nodes in a HA setup can write consistent Icinga DB downtime
history streams.
refs #9101
* Implement scheduling_source attribute
This implements the attribute `scheduling_source` for hosts and services to show which endpoint is running the scheduler for the check.
refs #4814
This patch changes the way checkresults are handled during a restart.
1. Check results coming in during a shutdown are ignored.
2. Upon start, checks which should have ran (next_check in the past),
are re-scheduled within the first minute.
This new behavior means there will be no more "Unknown - Terminated"
checkresults during a restart and checks with high check_interval will
be run earlier if they were already scheduled to run. The downside is
that after Icinga2 was down for a while, there will be a lot of checks
within the first minute. Our max concurrent check should take care of
this though.
The `process-check-result` action can now optionally set the
`ttl` parameter. This overrules the configured freshness
check (check_interval).
The main idea behind this is to allow the external sender
to specify when the next check result is coming in.
For example, a backup script which should be run every
24h can specify the exact expected next check result.
The addition to the CheckResult class is necessary to
forward the check result throughout the cluster and
calculate the `next_check` value on each node. This
allows us to send in a check result on a satellite,
and the master determines the freshness and possible
notifications/state changes for Icinga Web 2.