6129 Commits

Author SHA1 Message Date
Julian Brost
0dd94ece28 TimeoutLogger: don't log stack traces
Stack traces provide little benefit given the different log messages, but carry
a high risk of flooding the log for HTTP requests.
2024-02-09 19:21:53 +01:00
Julian Brost
0690a1ce70 testing 2024-02-09 17:13:45 +01:00
Julian Brost
04cbeac49e Log long-running and long-waiting CpuBoundWork tasks 2024-02-09 17:13:45 +01:00
Yonas Habteab
5c844504e5 HttpServerConnection: Drop superfluous CpuBoundWork usage 2024-02-09 16:03:26 +01:00
Yonas Habteab
64f469d28e Revert "Boost Coroutines: Increase the default stack size from 64 to 256KB"
This reverts commit f62f2eb25ed5592e4432dd59fe903fccfe0b165a.
2024-02-09 14:55:30 +01:00
Yonas Habteab
dbf1ebdc82 EventsHandler: Drop superfluous CpuBoundWork usage 2024-02-09 12:13:40 +01:00
Yonas Habteab
cd7d413d9e Drop redundant CpuBoundWork usages in lib/remote 2024-02-09 12:13:40 +01:00
Julian Brost
78c5ebd588 HttpServerConnection: log disconnected message after the client was actually disconnected
Previously, the "HTTP client disconnected" message was logged before the
connection shutdown had even started.
2024-02-09 12:05:09 +01:00
Julian Brost
1a11ad2147 HttpServerConnection: add timeout for TLS shutdown in Disconnect()
`async_shutdown()` performs a TLS shutdown, which exchanges messages, which can
hang. Therefore, it has to be protected by a timeout that cancels it if needed.
2024-02-09 12:05:09 +01:00
Julian Brost
06584c2880 HttpServerConnection: use exceptions for error handling
When an HTTP connection dies prematurely while the response is being sent,
`http::async_write()` sets the error code to something like broken pipe. When
`async_flush()` is called afterwards, it sometimes never returns. This results
in a resource leak, as the coroutine isn't cleaned up. This commit makes the
individual functions throw exceptions instead of silently ignoring the errors,
so that the function terminates early and the error is logged.
2024-02-09 12:05:09 +01:00
Yonas Habteab
2054525e78 Drop redundant CpuBoundWork usage in JsonRpcConnection::Disconnect()
Although there is locking involved here, it shouldn't take too long for
the thread to actually acquire it, since there aren't that many threads
dealing with endpoint clients concurrently. Trying to obtain a CPU slot
here is just a waste of time.
2024-02-09 11:06:28 +01:00
Eric Lippmann
f33c301a3c IoEngine: Always log coroutine exception diagnostics
While analyzing a possible memory leak, we encountered several coroutine
exception messages, which unfortunately do not provide any information
about what exactly went wrong, as exception diagnostics were previously
only logged at the notice level.
2024-02-09 11:06:13 +01:00
Alexander Aleksandrovič Klimov
600e631a4d
Merge pull request #9945 from Icinga/2139backport
Disable TLS renegotiation, bump Windows deps and fix Icinga DB crashes
2023-12-20 12:14:30 +01:00
Alexander A. Klimov
b2d975f916 IcingaDB#SendConfigDelete(): fix missing nullptr check before deref 2023-12-20 10:29:17 +01:00
Alexander A. Klimov
873988129f Icinga DB downtime history: provide cancel_time where has_been_cancelled may be 1
The table sla_history_downtime requires a downtime_end.
The Go daemon takes the cancel_time if has_been_cancelled is 1.
So we must supply a cancel_time wherever has_been_cancelled is 1.
Otherwise the Go daemon can't process some entries.
2023-12-20 10:29:17 +01:00
Alexander A. Klimov
89c54ca5e5 Disable TLS renegotiation
The API doesn't need it and a customer's security scanner
is afraid of a potential DoS attack vector.
2023-12-20 10:05:35 +01:00
Alexander A. Klimov
7a8bd0f6ea RequestCertificateHandler(): also renew if CA needs a renewal
and a newer one is available.
2023-12-18 17:07:44 +01:00
Alexander A. Klimov
5bf8db41ef CertificateToString(): allow raw pointer input 2023-12-18 17:07:44 +01:00
Alexander A. Klimov
e7a50f3e7c ApiListener#Start(): auto-renew CA on its owner
otherwise it would expire.
2023-12-18 17:07:44 +01:00
Alexander A. Klimov
1e31bc13f0 ApiListener#RenewCert(): enable optional CA creation 2023-12-18 17:07:44 +01:00
Alexander A. Klimov
d1098dc959 CreateCertIcingaCA(EVP_PKEY*, X509_NAME*): enable optional CA creation 2023-12-18 17:07:44 +01:00
Alexander A. Klimov
35317f14e7 Introduce IsCaUptodate() by splitting IsCertUptodate() 2023-12-18 17:07:44 +01:00
Alexander Aleksandrovič Klimov
eacf5f27cf
Merge pull request #9816 from Icinga/2.13.8/vendor
Update vendored libs
2023-07-07 16:29:20 +02:00
Alexander Aleksandrovič Klimov
3e682a99ef
Merge pull request #9814 from Icinga/2.13.8/icingadb
Icinga DB feature: normalize several Redis data not to crash the Go daemon
2023-07-06 17:32:14 +02:00
Alexander Aleksandrovič Klimov
8cd11a9146
Merge pull request #9822 from Icinga/2.13.8/bugfix/cluster-zone-own-zone-8570
cluster-zone: consider own zone connected if there's only one endpoint
2023-07-06 14:24:52 +02:00
Alexander Aleksandrovič Klimov
56a22461c9
Merge pull request #9818 from Icinga/2.13.8/ElasticsearchWriter-Pause
ElasticsearchWriter#Pause(): call Flush() only once
2023-07-06 10:17:57 +02:00
Alexander Aleksandrovič Klimov
45d5a3f5f3
Merge pull request #9817 from Icinga/flexible-downtimes-disappear-too-early-9797
Downtime#Start(): trigger flexible downtimes not earlier than fixed ones
2023-07-05 17:06:03 +02:00
Alexander Aleksandrovič Klimov
51afc74310
Merge pull request #9820 from Icinga/2.13.8/checkable-processcheckresult-only-clean-up-ack-comments-older-than-check-result-9718
Checkable#ProcessCheckResult(): only clean up ack comments older than check result
2023-07-05 11:20:55 +02:00
Alexander Aleksandrovič Klimov
3f9769f4ed
Merge pull request #9821 from Icinga/2.13.8/bugfix/perfdata-dont-get-parsed-correctly-8912
PluginUtility: Fix PerfData parsing for values separated with multiple spaces
2023-07-04 21:11:08 +02:00
Alexander A. Klimov
363c1b2986 cluster-zone: consider own zone connected if there's only one endpoint
... because in this case only the checking node can be (not) connected to itself.

refs #8570
2023-07-04 11:12:29 +02:00
Yonas Habteab
ff0b45eca0 PluginUtility: Fix PerfData don't get parsed correctly
The problem was that some PerfData labels contained several whitespace characters,
not just one, and were therefore parsed incorrectly in `SplitPerfdata()`. The condition
in line 144 checks whether the first and last characters are normal quotes, but since the
label can contain spaces at the beginning and at the end, that check could fail.

This PR fixes the problem by removing all whitespace from the beginning and end of the
label before parsing it.
2023-07-04 11:10:38 +02:00
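The whitespace handling described in ff0b45eca0 can be illustrated with a minimal sketch. The helpers `TrimLabel()` and `UnquoteLabel()` are hypothetical names chosen for this example and are not taken from `SplitPerfdata()`; the sketch only shows why trimming has to happen before the quote check.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Icinga 2 code): strips leading and trailing
// whitespace from a PerfData label.
std::string TrimLabel(const std::string& label)
{
	const char* ws = " \t\r\n";
	const auto first = label.find_first_not_of(ws);

	if (first == std::string::npos)
		return "";

	const auto last = label.find_last_not_of(ws);
	return label.substr(first, last - first + 1);
}

// Hypothetical helper: removes one pair of surrounding single quotes, if
// present. Trimming first ensures the quote check inspects the real first
// and last characters of the label rather than padding.
std::string UnquoteLabel(const std::string& rawLabel)
{
	std::string label = TrimLabel(rawLabel);

	if (label.size() >= 2 && label.front() == '\'' && label.back() == '\'')
		return label.substr(1, label.size() - 2);

	return label;
}
```

Without the trim step, a label such as `  'disk usage'  ` would start and end with a space, so the quote check would not match and the quotes would remain part of the label.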
Alexander A. Klimov
6dffc57a37 Checkable#ProcessCheckResult(): only clean up ack comments older than check result
Normally, if for some reason an ack comment still exists on a checkable that
is not acked anymore, it is cleaned up. But while replaying the log, config
objects incl. ack comments come before check results and acks, i.e. the order
is 1) ack comment, 2) DOWN check result and 3) ack, not 1) DOWN check result,
2) ack and 3) ack comment. So the checkable is temporarily not acked, but
already has the ack comment. In this case the DOWN check result, which is
older than the ack comment, shall not clean up the latter.
2023-07-04 10:56:30 +02:00
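The ordering rule from 6dffc57a37 can be sketched as a filter on the comment entry time. The `AckComment` struct and `RemoveAckCommentsOlderThan()` are hypothetical simplifications for illustration, not the actual signature of `Checkable#RemoveAckComments()`; treating "older" as a strict less-than comparison is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an acknowledgement comment.
struct AckComment
{
	double entryTime; // Unix timestamp at which the comment was created
};

// Removes only those ack comments that were entered before the given
// check result time; comments newer than the check result survive.
void RemoveAckCommentsOlderThan(std::vector<AckComment>& comments, double checkResultTime)
{
	comments.erase(
		std::remove_if(comments.begin(), comments.end(),
			[checkResultTime](const AckComment& c) { return c.entryTime < checkResultTime; }),
		comments.end());
}
```

In the replay scenario, a DOWN check result with time 150 is older than an ack comment entered at 200, so that comment is kept, while a stale comment entered at 100 is removed.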
Alexander A. Klimov
c160c4b62e Checkable#RemoveAckComments(): add optional comment entry time filter 2023-07-04 10:56:30 +02:00
Alexander A. Klimov
0470fe12a7 Checkable#RemoveCommentsByType(): remove redundant parameter 2023-07-04 10:56:30 +02:00
Alexander Aleksandrovič Klimov
8dca4d7121 ElasticsearchWriter#Pause(): lock m_DataBufferMutex during Flush()
just to be sure regarding race conditions.
2023-07-04 10:50:03 +02:00
Alexander Aleksandrovič Klimov
73f8c4473e ElasticsearchWriter#Pause(): call Flush() only once
The first Flush() is redundant and may access m_DataBuffer at the same time as some Flush() in m_WorkQueue (race condition) which isn't joined yet.
2023-07-04 10:49:53 +02:00
Alexander A. Klimov
43c4feb645 Downtime#Start(): trigger flexible downtimes not earlier than fixed ones
The last state change could be a long time ago. If it lies further back than
the new downtime's duration, the downtime expires immediately.

trigger time + duration < now
2023-07-04 10:39:14 +02:00
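The expiry condition from 43c4feb645 can be worked through with a small sketch. Both function names are hypothetical, and clamping the trigger time to the downtime's start time is only one assumed reading of "not earlier than fixed ones"; the actual change in `Downtime#Start()` may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: a flexible downtime counts as expired once its
// trigger time plus its duration lies in the past.
bool IsFlexibleDowntimeExpired(double triggerTime, double duration, double now)
{
	return triggerTime + duration < now;
}

// Assumption for this sketch: the trigger time is never earlier than the
// downtime's own start time, even if the last state change is much older.
double ClampTriggerTime(double lastStateChange, double startTime)
{
	return std::max(lastStateChange, startTime);
}
```

With a last state change at 1000, a start time of 5000, a duration of 600 and the current time 5000, triggering at the last state change gives 1000 + 600 < 5000, so the downtime expires at once. Triggering at the clamped time gives 5000 + 600 < 5000, which is false, so the downtime stays active.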
Alexander A. Klimov
b3d90f5418 Update third-party/nlohmann_json to v3.9.1
the latest version w/o Apache 2.0 licensed code which conflicts with GPL 2.
2023-07-03 17:40:53 +02:00
Alexander A. Klimov
e0e10a7efa ApiListener#NewClientHandlerInternal(): on basic_socket#cancel() (due to timeout) don't ssl::stream#async_shutdown()
If a connection hangs for too long in ApiListener#NewClientHandler(),
ApiListener#AddConnection()'s Timeout calls boost::asio::basic_socket#cancel()
on that connection to trigger an exception which unwinds
ApiListener#NewClientHandler(). Previously that unwind could trigger a Defer
which called boost::asio::ssl::stream#async_shutdown() which extended the hang.
2023-07-03 17:16:26 +02:00
Alexander A. Klimov
243b8aa7a8 Connect(): don't try next DNS record if operation is canceled
Instead return immediately to meet the caller's expectations.
2023-07-03 17:16:26 +02:00
Alexander A. Klimov
0735966e23 IcingaDB::PrepareObject(): cut off (null) negative Notification#times.{begin,end} not to crash Go daemon
At least our PostgreSQL schema enforces positive values.
2023-07-03 17:08:40 +02:00
Alexander A. Klimov
04457f5f16 IcingaDB::PrepareObject(): round Notification#times.{begin,end} not to crash Go daemon
The latter expects ints, not floats - not to mention strings.
Luckily Icinga already enforces numeric strings so that we can cast them to numbers.
2023-07-03 17:08:40 +02:00
Alexander A. Klimov
b7ecefb3c0 IcingaDB::PrepareObject(): round Notification#interval and limit it to >=0
otherwise, e.g. with -42.5, the Go daemon crashes. It expects uints there.
2023-07-03 17:08:40 +02:00
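The normalization described in b7ecefb3c0 amounts to rounding and then clamping at zero. The following is a minimal sketch with a hypothetical helper name, `NormalizeInterval()`, not the code in `IcingaDB::PrepareObject()`; values beyond the range of `std::uint64_t` are not handled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: rounds an interval to the nearest integer and maps
// negative results to 0, so that the consumer always receives a value that
// fits an unsigned integer.
std::uint64_t NormalizeInterval(double interval)
{
	// std::max requires both arguments to have the same type, hence 0.0.
	return static_cast<std::uint64_t>(std::max(0.0, std::round(interval)));
}
```

For the value from the commit message, -42.5 is rounded to -43 and then clamped, so the result is 0. A value such as 1799.6 becomes 1800.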
Alexander A. Klimov
2f1732e7e6 IcingaDB::PrepareObject(): cut off (0) negative Command#timeout for Redis
not to crash the Go daemon which expects positive values there.
2023-07-03 17:08:40 +02:00
Alexander A. Klimov
766e28e1aa IcingaDB::PrepareObject(): convert non-null Checkable#check_timeout to number
and, in case of null, fall back to Checkable#check_command.timeout, just like
IcingaDB#SerializeState(). Otherwise the Go daemon crashes. It expects a number.
2023-07-03 17:08:40 +02:00
Alexander A. Klimov
f0176001fe Icinga DB: don't write negative Downtime durations into Redis
via `std::max(0, x)` not to crash the Go daemon which can't handle such values.
2023-07-03 17:08:40 +02:00
Alexander A. Klimov
99350e6b27 Icinga DB feature: normalize *Command.arguments[*].{required,skip_key,repeat_key} to boolean
At the moment, the Icinga DB feature will use that value as-is and
serialize it to JSON, resulting in a crash in Icinga DB down the road
because it expects a boolean.
2023-07-03 17:08:40 +02:00
Alexander Aleksandrovič Klimov
c0bd0936f9
Merge pull request #9682 from Icinga/9631-213
Setup all signal handlers with SA_RESTART flag
2023-02-16 16:24:26 +01:00
Alexander Aleksandrovič Klimov
fe2fed4817
Merge pull request #9680 from Icinga/9488-213
Fix compile error on Solaris 11.4
2023-02-16 16:24:05 +01:00
Alexander Aleksandrovič Klimov
6dfc21f9bd
Merge pull request #9678 from Icinga/181b213
Bump Boost to v1.81
2023-02-16 16:23:50 +01:00