icinga2

mirror of https://github.com/Icinga/icinga2.git synced 2025-04-08 17:05:25 +02:00

Author	SHA1	Message	Date
Alexander Aleksandrovič Klimov	c538324040	Merge pull request #10304 from Icinga/win-progfiles-icinga2-var214 On Windows, don't create C:\Program Files\Icinga2\var during MSI build	2025-01-16 14:03:58 +01:00
Alexander A. Klimov	e241a240a8	On Windows, don't create C:\Program Files\Icinga2\var during MSI build	2025-01-16 12:04:12 +01:00
Alexander A. Klimov	32d604f954	Ido*sqlConnection#FieldToEscapedString(): don't write out of range time MySQL's FROM_UNIXTIME() NULLs ts <1970, errors for >2038. Postgres' TO_TIMESTAMP() errors for all ts not between 4713BC - 294276AD.	2025-01-14 10:05:36 +01:00
Alexander A. Klimov	ecce7f8dcb	Ido*sqlConnection#FieldToEscapedString(): don't overflow timestamps > long	2025-01-14 10:05:36 +01:00
Yonas Habteab	2c0925cedd	Merge pull request #10293 from Icinga/graceful-tls-disconnect-214 Add a dedicated method for disconnecting TLS connections	2025-01-14 10:03:22 +01:00
Yonas Habteab	ee98a9e335	Merge pull request #10298 from Icinga/timestamp-serialization-issues IcingaDB: limit several numbers not to crash Go daemon	2025-01-13 16:27:27 +01:00
Yonas Habteab	f53e5343c8	Merge pull request #10292 from Icinga/rpc-sync-failures Runtime RPC sync failures	2025-01-13 14:44:02 +01:00
Alexander A. Klimov	cf895e7e3f	IcingaDB::TimestampToMilliseconds(): limit output to four year digits Too high timestamps may overflow uint64_t (and the YYYY format) and negative ones don't fit into uint64_t. Those may crash our Go daemon.	2025-01-13 14:30:54 +01:00
Alexander A. Klimov	c21e99a15c	IcingaDB#SerializeState(): limit execution_time and latency to 2^32-1 not to write higher values into Redis than the Icinga DB schema can hold. This fixes yet another potential Go daemon crash.	2025-01-13 14:30:32 +01:00
Yonas Habteab	14b854d891	Merge pull request #10296 from Icinga/comment-loading-nullptr-deference Address comment loading where host reference is not found gracefully	2025-01-13 13:17:40 +01:00
Yonas Habteab	7defb0c942	Merge pull request #10295 from Icinga/do-not-write-new-messages-on-shutdown JsonRpcConnection: don't write new messages on shutdown	2025-01-13 13:11:17 +01:00
Yannick Martin	ec2645d33c	icinga2: address comment loading where host reference is not found address #9752: check if host reference is valid	2025-01-13 11:19:42 +01:00
Alexander Aleksandrovič Klimov	ebf905a220	JsonRpcConnection: don't write new messages on shutdown In fact, this is already done for the outer loop (for each bulk), just not yet for the inner one (for each message of a bulk). So once the remote signals EOF, don't try to process the remaining queue until write error (which can't be associated with a particular message anyway, due to buffering), but just let the peer go. Flush already half-written messages, though, if possible.	2025-01-13 11:17:23 +01:00
Alexander A. Klimov	2bc1c8e1dc	Document Timeout	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	e544fef7a2	Timeout: explicitly delete #Timeout(const Timeout&), #Timeout(Timeout&&), #operator=(const Timeout&), #operator=(Timeout&&)	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	d956920bd7	Move Timeout instances from heap to stack	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	1703f99d14	Don't call Timeout#Cancel() where Timeout#~Timeout() is called	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	fe1420523a	Timeout#~Timeout(), #Cancel(): support boost::asio::io_context running on multiple threads	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	a47508b7b3	Timeout#Timeout(): drop unnecessary template parameters	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	d69291739f	While using Timeout, don't unnecessarily keep the strand alive via smart pointer	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	ff5ae18b9c	Timeout: use a plain callback, not an unnecessary coroutine	2025-01-13 10:42:36 +01:00
Alexander A. Klimov	f839707c4a	Timeout#Timeout(): don't pass yield_context to callback It's not used. Also, the callback shall run completely at once. This ensures that it won't (continue to) run once another coroutine on the strand calls Timeout#Cancel().	2025-01-13 10:42:36 +01:00
Yonas Habteab	a88d6988b4	JsonRpcConnection: Log message processing time stats Co-Authored-By: Julian Brost <julian.brost@icinga.com>	2025-01-13 10:39:23 +01:00
Yonas Habteab	7225d78047	HttpServerConnection: Log noticable CPU semaphore wait time	2025-01-13 10:39:23 +01:00
Yonas Habteab	7b30cb3431	Don't endlessly wait on writer coroutine on disconnect	2025-01-13 10:36:21 +01:00
Yonas Habteab	f2fbb61ad8	Log before & after an RPC client is disconnected	2025-01-13 10:36:21 +01:00
Yonas Habteab	7ed5c6a2c7	JsonRpcConnection: Don't drop client from cache prematurely PR #7445 incorrectly assumed that a peer that had already disconnected and never reconnected was due to the endpoint client being dropped after a successful socket shutdown. However, the issue at that time was that there was not a single timeout guards that could cancel the `async_shutdown` call, petentially blocking indefinetely. Although removing the client from cache early might have allowed the endpoint to reconnect, it did not resolve the underlying problem. Now that we have a proper cancellation timeout, we can wait until the currently used socket is fully closed before dropping the client from our cache. When our socket termination works reliably, the `ApiListener` reconnect timer should attempt to reconnect this endpoint after the next tick. Additionally, we now have logs both for before and after socket termination, which may help identify if it is hanging somewhere in between.	2025-01-13 10:36:21 +01:00
Julian Brost	2fffb28ab0	Add comment for remaining uses of async_shutdown() why it's safe The reason for introducing AsioTlsStream::GracefulDisconnect() was to handle the TLS shutdown properly with a timeout since it involves a timeout. However, the implementation of this timeout involves spwaning coroutines which are redundant in some cases. This commit adds comments to the remaining calls of async_shutdown() stating why calling it is safe in these places.	2025-01-13 10:33:11 +01:00
Julian Brost	f99d35ed91	HttpServerConnection: use AsioTlsStream::GracefulDisconnect() This new helper function has proper timeout handling which was missing here.	2025-01-13 10:33:11 +01:00
Julian Brost	28776cb37c	JsonRpcConnection: use AsioTlsStream::GracefulDisconnect() This new helper functions allows deduplicating the timeout handling for `async_shutdown()`.	2025-01-13 10:33:11 +01:00
Julian Brost	a593bdfa5f	AsioTlsStream: add GracefulDisconnect() and ForceDisconnect() Calling `AsioTlsStream::async_shutdown()` performs a TLS shutdown which exchanges messages (that's why it takes a `yield_context`) and thus has the potential to block the coroutine. Therefore, it should be protected with a timeout. As `async_shutdown()` doesn't simply take a timeout, this has to be implemented using a timer. So far, these timers are scattered throughout the codebase with some places missing them entirely. This commit adds helper functions to properly shutdown a TLS connection with a single function call.	2025-01-13 10:33:11 +01:00
Julian Brost	156ba265e1	Simplify `DependencyGraph:RemoveDependency()` method	2025-01-13 10:25:48 +01:00
Yonas Habteab	7501525550	ApiListener: Sync runtime configs in order	2025-01-13 10:25:48 +01:00
Yonas Habteab	5c0ce6350c	DependencyGraph: Allow lookups by parent & child dependencies	2025-01-13 10:25:48 +01:00
Alexander A. Klimov	c64ff492ec	DependencyGraph: use ConfigObject, not Object This saves dynamic_cast<ConfigObject*> + if() on every item of GetChildren().	2025-01-13 10:25:42 +01:00
Alexander A. Klimov	6aa2355427	DependencyGraph: switch "parent" and "child" terminology The .ti files call `DependencyGraph::AddDependency(this, service.get())`. Obviously, `service.get()` is the parent and `this` (Downtime, Notification, ...) is the child. The DependencyGraph terminology should reflect this not to confuse its future users.	2025-01-13 10:23:28 +01:00
Alexander A. Klimov	3b40ba1fe4	doc/: fix "a HA" -> "an HA"	2024-12-02 10:13:54 +01:00
Yonas Habteab	d768c90937	HttpServerConnection: Don't spawn useless coroutines Currently, for each `Disconnect()` call, we spawn a coroutine, but every one of them is just usesless, except the first one. However, since all `Disconnect()` usages share the same asio strand and cannot interfere with each other, spawning another coroutine within `Disconnect()` isn't even necessary. When a coroutine calls `Disconnect()` now, it will immediately initiate an async shutdown of the socket, potentially causing the coroutine to yield and allowing the others to resume. Therefore, the `m_ShuttingDown` flag is still required by the coroutines to be checked regularly.	2024-11-19 16:08:37 +01:00
Yonas Habteab	6f9ae05948	Merge pull request #10239 from Icinga/state-before-suppression214 Fix lost recovery notifications after recovery outside of notification time period	2024-11-14 13:49:15 +01:00
Yonas Habteab	eb32283751	Merge pull request #10237 from Icinga/log-connected-endpoint-connection-attempts-214 ApiListener: Log connection attempts from an already connected client	2024-11-14 12:57:41 +01:00
Julian Brost	e313b3f2aa	Use `Checkable::GetStateBeforeSuppression()` only where relevant This fixes an issue where recovery notifications get lost if they happen outside of a notification time period. Not all calls to `Checkable::NotificationReasonApplies()` need `GetStateBeforeSuppression()` to be checked. In fact, for one caller, `FireSuppressedNotifications()` in `lib/notification/notificationcomponent.cpp`, the state before suppression may not even be initialized properly, so that the default value of OK is used which can lead to incorrect return values. Note the difference between suppressions happening on the level of the `Checkable` object level and the `Notification` object level. Only the first sets the state before suppression in the `Checkable` object, but so far, also the latter used that value incorrectly. This commit moves the check of `GetStateBeforeSuppression()` from `Checkable::NotificationReasonApplies()` to the one place where it's actually relevant: `Checkable::FireSuppressedNotifications()`. This made the existing call to `NotificationReasonApplies()` unneccessary as it would always return true: the `type` argument is computed based on the current check result, so there's no need to check it against the current check result.	2024-11-14 12:07:02 +01:00
Yonas Habteab	8acfb9b214	ApiListener: Log connection attempts from an already connected client Something is definitely going wrong if a client tries to reconnect to this endpoint while it still has an active connection to that client. So we shouldn't hide this, but at least log it at info level. Apart from that, I've added some additional information about the currently active client, such as when the last message was sent and received.	2024-11-14 11:09:00 +01:00
Yonas Habteab	d5051c7ea3	ApiListener: Log error context only once When logging at the warning level, the logger will automatically look up for registered context and append them to the log entry accordingly.	2024-11-14 11:05:53 +01:00
Yonas Habteab	d5cd5aff2c	Merge pull request #10080 from Icinga/net-stack-2.14.3 Fix network stack stability issues	2024-11-14 11:02:36 +01:00
Yonas Habteab	4850018464	Don't use thread-local variable in coroutine & process final `cr` in global thread pool	2024-11-13 15:36:57 +01:00
Yonas Habteab	2854c618dd	HttpServerConnection: Drop yet another superfluous `CpuBoundWork` usage	2024-11-13 15:36:57 +01:00
Yonas Habteab	660b82b4f9	JsonRpcConnection: Don't read any data on shutdown When the `Desconnect()` method is called, clients are not disconnected immediately. Instead, a new coroutine is spawned using the same strand as the other coroutines. This coroutine calls `async_shutdown` on the TCP socket, which might be blocking. However, in order not to block indefintely, the `Timeout` class cancels all operations on the socket after `10` seconds. Though, the timeout does not trigger the handler immediately; it creates spawns another coroutine using the same strand as in the `JsonRpcConnection` class. This can cause unexpected delays if e.g. `HandleIncomingMessages` gets resumed before the coroutine from the timeout class. Apart from that, the coroutine for writing messages uses the same condition, making the two symmetrical.	2024-11-13 15:35:57 +01:00
Julian Brost	58197dbbaa	Merge commit from fork Icinga 2.14.3	2024-11-12 15:01:58 +01:00
Julian Brost	2febc5e18a	Security: fix TLS certificate validation bypass The previous validation in set_verify_callback() could be bypassed, tricking Icinga 2 into treating invalid certificates as valid. To fix this, the validation checks were moved into the IsVerifyOK() function. This is tracked as CVE-2024-49369, more details will be published at a later time.	2024-10-22 10:41:00 +02:00
Yonas Habteab	3b57e4915f	Merge pull request #10163 from Icinga/next-check-cluster-sync-issue-2.14 Checkable: Don't recalculate `next_check` while processing remotely genrated check	2024-09-20 11:32:47 +02:00

1 2 3 4 5 ...

6402 Commits