Commit Graph

1137 Commits

Author SHA1 Message Date
Yonas Habteab 1425641931 Don't endlessly wait on writer coroutine on disconnect 2025-01-08 16:30:36 +01:00
Yonas Habteab 41373ad0e5 Log before & after an RPC client is disconnected 2025-01-08 16:30:36 +01:00
Yonas Habteab 3af7cfe2ec JsonRpcConnection: Don't drop client from cache prematurely
PR #7445 incorrectly assumed that a peer that had already disconnected
and never reconnected was due to the endpoint client being dropped after
a successful socket shutdown. However, the issue at that time was that
there was not a single timeout guards that could cancel the `async_shutdown`
call, petentially blocking indefinetely. Although removing the client from
cache early might have allowed the endpoint to reconnect, it did not
resolve the underlying problem. Now that we have a proper cancellation
timeout, we can wait until the currently used socket is fully closed
before dropping the client from our cache. When our socket termination
works reliably, the `ApiListener` reconnect timer should attempt to
reconnect this endpoint after the next tick. Additionally, we now have
logs both for before and after socket termination, which may help
identify if it is hanging somewhere in between.
2025-01-08 16:30:36 +01:00
Alexander A. Klimov 27e0e236cb Move Timeout instances from heap to stack 2025-01-07 18:20:50 +01:00
Alexander A. Klimov d77d7506f1 Don't call Timeout#Cancel() where Timeout#~Timeout() is called 2025-01-07 18:20:14 +01:00
Alexander A. Klimov cb51649363 Timeout#Timeout(): drop unnecessary template parameters 2025-01-07 18:19:39 +01:00
Alexander A. Klimov d2285bcf0e While using Timeout, don't unnecessarily keep the strand alive via smart pointer 2025-01-07 18:18:46 +01:00
Alexander A. Klimov 92ab913226 Timeout#Timeout(): don't pass yield_context to callback
It's not used. Also, the callback shall run completely at once. This ensures that it won't (continue to) run once another coroutine on the strand calls Timeout#Cancel().
2025-01-07 18:18:18 +01:00
Yonas Habteab ff0e12e6ac ApiListener: Sync runtime configs in order 2025-01-07 11:07:46 +01:00
Alexander Aleksandrovič Klimov 383773eb2b
Merge pull request #10264 from Icinga/DependencyGraph-ConfigObject
DependencyGraph: use ConfigObject*, not Object*
2024-12-18 13:36:56 +01:00
Alexander A. Klimov 3a09cf72d6 DependencyGraph: use ConfigObject*, not Object*
This saves dynamic_cast<ConfigObject*> + if() on every item of GetChildren().
2024-12-17 18:33:05 +01:00
Julian Brost 452386cdb6
Merge pull request #10005 from Icinga/graceful-tls-disconnect
Add a dedicated method for disconnecting TLS connections
2024-12-12 16:20:14 +01:00
Julian Brost 3642ca3369
Merge pull request #10263 from Icinga/DependencyGraph-parent-child
DependencyGraph: switch "parent" and "child" terminology
2024-12-12 15:13:08 +01:00
Julian Brost a506d562ae Add comment for remaining uses of async_shutdown() why it's safe
The reason for introducing AsioTlsStream::GracefulDisconnect() was to handle
the TLS shutdown properly with a timeout since it involves a timeout. However,
the implementation of this timeout involves spwaning coroutines which are
redundant in some cases. This commit adds comments to the remaining calls of
async_shutdown() stating why calling it is safe in these places.
2024-12-12 12:10:59 +01:00
Julian Brost e6d103d0dd HttpServerConnection: use AsioTlsStream::GracefulDisconnect()
This new helper function has proper timeout handling which was missing here.
2024-12-12 12:10:59 +01:00
Julian Brost 007e3fbe7e JsonRpcConnection: use AsioTlsStream::GracefulDisconnect()
This new helper functions allows deduplicating the timeout handling for
`async_shutdown()`.
2024-12-12 12:10:59 +01:00
Alexander A. Klimov 188ba53b74 DependencyGraph: switch "parent" and "child" terminology
The .ti files call `DependencyGraph::AddDependency(this, service.get())`. Obviously, `service.get()` is the parent and `this` (Downtime, Notification, ...) is the child. The DependencyGraph terminology should reflect this not to confuse its future users.
2024-12-04 10:57:30 +01:00
Alexander Aleksandrovič Klimov 8f51f54f19
Merge pull request #10221 from Icinga/Al2Klimov-patch-7
JsonRpcConnection: don't write new messages on shutdown
2024-11-29 09:24:10 +01:00
Yonas Habteab 4564c068fe JsonRpcConnection: Log message processing time stats
Co-Authored-By: Julian Brost <julian.brost@icinga.com>
2024-11-27 09:57:38 +01:00
Yonas Habteab e0b053cbe1 HttpServerConnection: Log noticable CPU semaphore wait time 2024-11-27 09:57:38 +01:00
Yonas Habteab 5c0f9bfdaa HttpServerConnection: Don't spawn useless coroutines
Currently, for each `Disconnect()` call, we spawn a coroutine, but every
one of them is just usesless, except the first one. However, since all
`Disconnect()` usages share the same asio strand and cannot interfere
with each other, spawning another coroutine within `Disconnect()` isn't
even necessary. When a coroutine calls `Disconnect()` now, it will
immediately initiate an async shutdown of the socket, potentially causing
the coroutine to yield and allowing the others to resume. Therefore, the
`m_ShuttingDown` flag is still required by the coroutines to be checked
regularly.
2024-11-14 16:47:01 +01:00
Alexander Aleksandrovič Klimov aa7f159a0f
JsonRpcConnection: don't write new messages on shutdown
In fact, this is already done for the outer loop (for each bulk), just not yet for the inner one (for each message of a bulk). So once the remote signals EOF, don't try to process the remaining queue until write error (which can't be associated with a particular message anyway, due to buffering), but just let the peer go. Flush already half-written messages, though, if possible.
2024-11-07 17:32:12 +01:00
Yonas Habteab 1c34610a78 JsonRpcConnection: Don't read any data on shutdown
When the `Desconnect()` method is called, clients are not disconnected
immediately. Instead, a new coroutine is spawned using the same strand
as the other coroutines. This coroutine calls `async_shutdown` on the
TCP socket, which might be blocking. However, in order not to block
indefintely, the `Timeout` class cancels all operations on the socket
after `10` seconds. Though, the timeout does not trigger the handler
immediately; it creates spawns another coroutine using the same strand
as in the `JsonRpcConnection` class. This can cause unexpected delays if
e.g. `HandleIncomingMessages` gets resumed before the coroutine from the
timeout class. Apart from that, the coroutine for writing messages uses
the same condition, making the two symmetrical.
2024-10-31 17:09:13 +01:00
Yonas Habteab 8574357443 ApiListener: Log error context only once
When logging at the warning level, the logger will automatically look up
for registered context and append them to the log entry accordingly.
2024-10-30 16:55:13 +01:00
Yonas Habteab e8b7baa298 JsonRpcConnection: Drop unused `m_NextHeartbeat` variable 2024-10-30 14:31:48 +01:00
Yonas Habteab 9d4625e1ec ApiListener: Log connection attempts from an already connected client
Something is definitely going wrong if a client tries to reconnect to
this endpoint while it still has an active connection to that client. So
we shouldn't hide this, but at least log it at info level. Apart from
that, I've added some additional information about the currently active
client, such as when the last message was sent and received.
2024-10-30 11:26:21 +01:00
Yonas Habteab c9159494c0 HttpServerConnection: Drop yet another superfluous `CpuBoundWork` usage 2024-09-05 15:10:14 +02:00
Yonas Habteab 9f84c1516e ApiListener: Reorder logging in `ApiTimerHandler()` 2024-08-28 16:53:53 +02:00
Yonas Habteab e062ceb901 ApiListener: Catch & supress clients runtime errors 2024-08-28 16:53:53 +02:00
Yonas Habteab 932a53449d JsonRpcConnection: Raise an exception when trying to send to disconnected clients 2024-08-27 14:23:41 +02:00
Julian Brost 9222a63ff7 Make sure log file is reopened when `ApiListener::ReplayLog()` returns 2024-08-27 14:23:41 +02:00
Yonas Habteab 73db30c08b Use `Defer` class for cleanup in `ApiListener::ReplayLog()` 2024-08-27 14:23:41 +02:00
Alexander A. Klimov f074e24d2a ApiListener#ReplayLog(): stop reading files ASAP on send error 2024-08-27 14:23:41 +02:00
Alexander A. Klimov b538ad2528 JsonRpcConnection#Send*(): discard messages ASAP once shutting down
Especially ApiListener#ReplayLog() enqueued lots of messages into
JsonRpcConnection#{m_IoStrand,m_OutgoingMessagesQueue} (RAM) even if
the connection was shut(ting) down. Now #Disconnect() takes effect ASAP.
2024-08-27 14:23:41 +02:00
Alexander A. Klimov 33f8ea6dcc JsonRpcConnection#Disconnect(): spawn coroutine only if necessary
by checking the now atomic #m_ShuttingDown outside of it.
2024-08-27 14:23:41 +02:00
Alexander Aleksandrovič Klimov 02ba5e4101
Merge pull request #10015 from Icinga/malloc_info
/v1/debug/malloc_info: call malloc_info(3) if available
2024-08-12 14:41:09 +02:00
Alexander A. Klimov f3c7ac11e9 /v1/debug/malloc_info: call malloc_info(3) if available
The GNU libc function malloc_info(3) provides memory allocation and usage
statistics of Icinga 2 itself.
2024-08-09 12:59:25 +02:00
Yonas Habteab 546dea95a2 Don't allow to modify/create/delete an object concurrently 2024-06-13 11:26:19 +02:00
Yonas Habteab 099f664ce6 `ConfigObjectUtility#CreateObject()`: Use `Defer` for config path cleanup 2024-06-13 11:26:19 +02:00
Yonas Habteab 433e2de13a ApiListener: Process cluster config updates sequentially 2024-06-13 11:26:19 +02:00
Yonas Habteab 1a55b68541 Introduce RAII style `ObjectNameLock` class 2024-06-13 11:26:19 +02:00
Yonas Habteab 2218ebd6b0 `ConfigObjectUtility`: Use `AtomicFile` to store object config files 2024-06-13 11:26:19 +02:00
Julian Brost 31be43ff6c
Merge pull request #10018 from Icinga/revert-9980-config-sync-conflicts
Revert "Process `config::update/delete` cluster events gracefully"
2024-03-08 16:58:28 +01:00
Julian Brost af97431bfb
Merge pull request #10006 from Icinga/http-error-handling
HttpServerConnection: use exceptions for error handling
2024-03-08 15:06:51 +01:00
Yonas Habteab a924a49cd8
Revert "Process `config::update/delete` cluster events gracefully" 2024-03-07 17:17:17 +01:00
Julian Brost abea2f270c
Merge pull request #9997 from Icinga/ListenerCoroutineProc-remote_endpoint
ApiListener#ListenerCoroutineProc(): get remote endpoint ASAP for logging
2024-02-20 13:46:02 +01:00
Julian Brost 700c5a13d7 HttpServerConnection: use exceptions for error handling
When a HTTP connection dies prematurely while the response is sent,
`http::async_write()` sets the error code to something like broken pipe for
example. When calling `async_flush()` afterwards, it sometimes happens that
this never returns. This results in a resource leak as the coroutine isn't
cleaned up. This commit makes the individual functions throw exceptions instead
of silently ignoring the errors, resulting in the function terminating early
and also resulting in an error being logged as well.
2024-02-19 14:12:41 +01:00
Julian Brost 04ef105caa
Merge pull request #9980 from Icinga/config-sync-conflicts
Process `config::update/delete` cluster events gracefully
2024-02-19 13:49:41 +01:00
Julian Brost 7d1c887a32
Merge pull request #9999 from Icinga/reset-log-message-count-correctly
ApiListener: Reset `m_LogMessageCount` when rotating
2024-02-15 17:06:16 +01:00
Yonas Habteab 456144c1dc ApiListener: Process cluster config updates sequentially 2024-02-14 14:25:53 +01:00