`async_shutdown()` performs a TLS shutdown, which exchanges messages, which can
hang. Therefore, it has to be protected by a timeout that cancels it if needed.
When a HTTP connection dies prematurely while the response is sent,
`http::async_write()` sets the error code to something like broken pipe for
example. When calling `async_flush()` afterwards, it sometimes happens that
this never returns. This results in a resource leak as the coroutine isn't
cleaned up. This commit makes the individual functions throw exceptions instead
of silently ignoring the errors, resulting in the function terminating early
and also resulting in an error being logged as well.
The table sla_history_downtime requires a downtime_end.
The Go daemon takes the cancel_time if has_been_cancelled is 1.
So we must supply a cancel_time whereever has_been_cancelled is 1.
Otherwise the Go daemon can't process some entries.
The problem was that some PerfData labels contained several whitespace characters,
not just one, and therefore it was parsed incorrectly in `SplitPerfdata()`. I.e. the condition
in line 144 checks whether the first and last character is a normal quote, but since the
label can contain spaces at the beginning and at the end respectively, this caused the problems.
This PR fixes the problem by removing all occurring whitespace from the beginning and end,
before starting to parse the actual label.
Normally if for some reason an ack comment still exists on a checkable not
acked anymore, still clean it up. But while replaying log config objects
incl. ack comments come before check results and acks. I.e. 1) ack comment,
2) DOWN check result and 3) ack. Not 1) DOWN check result, 2) ack and 3) ack
comment. So the checkable is temporarily not acked, but already has the ack
comment. In this case the DOWN check result which is older than the ack
comment shall not clean up the latter.
the last state change could be a long time ago. If it's longer than
the new downtime's duration, the downtime expires immediately.
trigger time + duration < now
If a connection hangs for too long in ApiListener#NewClientHandler(),
ApiListener#AddConnection()'s Timeout calls boost::asio::basic_socket#cancel()
on that connection to trigger an exception which unwinds
ApiListener#NewClientHandler(). Previously that unwind could trigger a Defer
which called boost::asio::ssl::stream#async_shutdown() which extended the hang.
and, in case of null, fall back to Checkable#check_command.timeout, just like
IcingaDB#SerializeState(). Otherwise the Go daemon crashes. It expects a number.
At the moment, the Icinga DB feature will use that value as-is and
serialize it to JSON, resulting in a crash in Icinga DB down the road
because it expects a boolean.
if caused by dependency or check period.
Now as long as any of the above causes check skips
next check and next update will be up-to-date in Icinga DB,
so the checkable won't slide into false positive overdue.