1125 Commits

Author SHA1 Message Date
Alexander Aleksandrovič Klimov
c538324040
Merge pull request #10304 from Icinga/win-progfiles-icinga2-var214
On Windows, don't create C:\Program Files\Icinga2\var during MSI build
2025-01-16 14:03:58 +01:00
Alexander A. Klimov
e241a240a8 On Windows, don't create C:\Program Files\Icinga2\var during MSI build 2025-01-16 12:04:12 +01:00
Yonas Habteab
2c0925cedd
Merge pull request #10293 from Icinga/graceful-tls-disconnect-214
Add a dedicated method for disconnecting TLS connections
2025-01-14 10:03:22 +01:00
Yonas Habteab
f53e5343c8
Merge pull request #10292 from Icinga/rpc-sync-failures
Runtime RPC sync failures
2025-01-13 14:44:02 +01:00
Yonas Habteab
7defb0c942
Merge pull request #10295 from Icinga/do-not-write-new-messages-on-shutdown
JsonRpcConnection: don't write new messages on shutdown
2025-01-13 13:11:17 +01:00
Alexander Aleksandrovič Klimov
ebf905a220 JsonRpcConnection: don't write new messages on shutdown
In fact, this is already done for the outer loop (for each bulk), just not yet for the inner one (for each message of a bulk). So once the remote signals EOF, don't keep processing the remaining queue until a write error occurs (which can't be associated with a particular message anyway, due to buffering), but just let the peer go. Flush already half-written messages, though, if possible.
2025-01-13 11:17:23 +01:00
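The outer/inner loop distinction described in this commit can be pictured with a minimal sketch. It is not Icinga 2's actual code: `WriterSketch`, `WriteOutgoing`, and the `send` callback are hypothetical names, and the real writer runs as a coroutine on an Asio strand, which is left out here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

// Hypothetical stand-in for the writer side of a connection (assumption, not Icinga 2's class).
struct WriterSketch {
    std::atomic<bool> shuttingDown{false};
    std::deque<std::string> queue;

    // Drains the queue in bulks of at most bulkSize messages and returns
    // how many messages were handed to send().
    std::size_t WriteOutgoing(std::size_t bulkSize,
                              const std::function<void(const std::string&)>& send) {
        assert(bulkSize > 0);
        std::size_t written = 0;

        // Outer loop: one iteration per bulk.
        while (!shuttingDown.load() && !queue.empty()) {
            const std::size_t n = std::min(bulkSize, queue.size());

            // Inner loop: one iteration per message of the bulk.
            for (std::size_t i = 0; i < n; ++i) {
                // Checking the flag here as well means no new message is
                // started once the connection is shutting down.
                if (shuttingDown.load()) {
                    return written;
                }
                send(queue.front());
                queue.pop_front();
                ++written;
            }
        }
        return written;
    }
};
```

With the flag checked only in the outer loop, a shutdown signalled in the middle of a bulk would still let the rest of that bulk go out.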
Alexander A. Klimov
d956920bd7 Move Timeout instances from heap to stack 2025-01-13 10:42:36 +01:00
Alexander A. Klimov
1703f99d14 Don't call Timeout#Cancel() where Timeout#~Timeout() is called 2025-01-13 10:42:36 +01:00
Alexander A. Klimov
a47508b7b3 Timeout#Timeout(): drop unnecessary template parameters 2025-01-13 10:42:36 +01:00
Alexander A. Klimov
d69291739f While using Timeout, don't unnecessarily keep the strand alive via smart pointer 2025-01-13 10:42:36 +01:00
Alexander A. Klimov
f839707c4a Timeout#Timeout(): don't pass yield_context to callback
It's not used. Also, the callback shall run completely at once. This ensures that it won't (continue to) run once another coroutine on the strand calls Timeout#Cancel().
2025-01-13 10:42:36 +01:00
Yonas Habteab
a88d6988b4 JsonRpcConnection: Log message processing time stats
Co-Authored-By: Julian Brost <julian.brost@icinga.com>
2025-01-13 10:39:23 +01:00
Yonas Habteab
7225d78047 HttpServerConnection: Log noticeable CPU semaphore wait time 2025-01-13 10:39:23 +01:00
Yonas Habteab
7b30cb3431 Don't endlessly wait on writer coroutine on disconnect 2025-01-13 10:36:21 +01:00
Yonas Habteab
f2fbb61ad8 Log before & after an RPC client is disconnected 2025-01-13 10:36:21 +01:00
Yonas Habteab
7ed5c6a2c7 JsonRpcConnection: Don't drop client from cache prematurely
PR #7445 incorrectly assumed that a peer that had already disconnected
and never reconnected was due to the endpoint client being dropped after
a successful socket shutdown. However, the issue at that time was that
there was not a single timeout guard that could cancel the `async_shutdown`
call, potentially blocking indefinitely. Although removing the client from
cache early might have allowed the endpoint to reconnect, it did not
resolve the underlying problem. Now that we have a proper cancellation
timeout, we can wait until the currently used socket is fully closed
before dropping the client from our cache. When our socket termination
works reliably, the `ApiListener` reconnect timer should attempt to
reconnect this endpoint after the next tick. Additionally, we now have
logs both for before and after socket termination, which may help
identify if it is hanging somewhere in between.
2025-01-13 10:36:21 +01:00
Julian Brost
2fffb28ab0 Add comment for remaining uses of async_shutdown() why it's safe
The reason for introducing AsioTlsStream::GracefulDisconnect() was to handle
the TLS shutdown properly, guarded by a timeout. However,
the implementation of this timeout involves spawning coroutines which are
redundant in some cases. This commit adds comments to the remaining calls of
async_shutdown() stating why calling it is safe in these places.
2025-01-13 10:33:11 +01:00
Julian Brost
f99d35ed91 HttpServerConnection: use AsioTlsStream::GracefulDisconnect()
This new helper function has proper timeout handling which was missing here.
2025-01-13 10:33:11 +01:00
Julian Brost
28776cb37c JsonRpcConnection: use AsioTlsStream::GracefulDisconnect()
This new helper function allows deduplicating the timeout handling for
`async_shutdown()`.
2025-01-13 10:33:11 +01:00
Yonas Habteab
7501525550 ApiListener: Sync runtime configs in order 2025-01-13 10:25:48 +01:00
Alexander A. Klimov
c64ff492ec DependencyGraph: use ConfigObject*, not Object*
This saves dynamic_cast<ConfigObject*> + if() on every item of GetChildren().
2025-01-13 10:25:42 +01:00
Alexander A. Klimov
6aa2355427 DependencyGraph: switch "parent" and "child" terminology
The .ti files call `DependencyGraph::AddDependency(this, service.get())`. Obviously, `service.get()` is the parent and `this` (Downtime, Notification, ...) is the child. The DependencyGraph terminology should reflect this so as not to confuse its future users.
2025-01-13 10:23:28 +01:00
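The terminology in this commit can be illustrated with a small sketch: the first argument of `AddDependency` is the child and the second is the parent, and children are looked up by parent. `ObjectSketch` and `DependencyGraphSketch` are hypothetical stand-ins; the container choice and method bodies are assumptions, not the actual implementation.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a config object (assumption).
struct ObjectSketch {
    std::string name;
};

class DependencyGraphSketch {
public:
    // The child (e.g. a Downtime) depends on the parent (e.g. its Service).
    void AddDependency(ObjectSketch* child, ObjectSketch* parent) {
        assert(child && parent);
        m_Children[parent].insert(child);
    }

    void RemoveDependency(ObjectSketch* child, ObjectSketch* parent) {
        auto it = m_Children.find(parent);
        if (it == m_Children.end()) {
            return;
        }
        it->second.erase(child);
        if (it->second.empty()) {
            m_Children.erase(it);
        }
    }

    // Returns every object registered as depending on the given parent.
    std::vector<ObjectSketch*> GetChildren(ObjectSketch* parent) const {
        auto it = m_Children.find(parent);
        if (it == m_Children.end()) {
            return {};
        }
        return std::vector<ObjectSketch*>(it->second.begin(), it->second.end());
    }

private:
    std::map<ObjectSketch*, std::set<ObjectSketch*>> m_Children;
};
```

Storing one concrete pointer type in the map also means callers of `GetChildren()` need no cast per item, which is the point of the neighbouring `ConfigObject*` commit.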
Yonas Habteab
d768c90937 HttpServerConnection: Don't spawn useless coroutines
Currently, for each `Disconnect()` call, we spawn a coroutine, but all of
them except the first one are useless. However, since all
`Disconnect()` usages share the same asio strand and cannot interfere
with each other, spawning another coroutine within `Disconnect()` isn't
even necessary. When a coroutine calls `Disconnect()` now, it will
immediately initiate an async shutdown of the socket, potentially causing
the coroutine to yield and allowing the others to resume. Therefore, the
`m_ShuttingDown` flag is still required by the coroutines to be checked
regularly.
2024-11-19 16:08:37 +01:00
Yonas Habteab
eb32283751
Merge pull request #10237 from Icinga/log-connected-endpoint-connection-attempts-214
ApiListener: Log connection attempts from an already connected client
2024-11-14 12:57:41 +01:00
Yonas Habteab
8acfb9b214 ApiListener: Log connection attempts from an already connected client
Something is definitely going wrong if a client tries to reconnect to
this endpoint while it still has an active connection to that client. So
we shouldn't hide this, but at least log it at info level. Apart from
that, I've added some additional information about the currently active
client, such as when the last message was sent and received.
2024-11-14 11:09:00 +01:00
Yonas Habteab
d5051c7ea3 ApiListener: Log error context only once
When logging at the warning level, the logger will automatically look up
any registered context and append it to the log entry accordingly.
2024-11-14 11:05:53 +01:00
Yonas Habteab
d5cd5aff2c
Merge pull request #10080 from Icinga/net-stack-2.14.3
Fix network stack stability issues
2024-11-14 11:02:36 +01:00
Yonas Habteab
2854c618dd HttpServerConnection: Drop yet another superfluous CpuBoundWork usage 2024-11-13 15:36:57 +01:00
Yonas Habteab
660b82b4f9 JsonRpcConnection: Don't read any data on shutdown
When the `Disconnect()` method is called, clients are not disconnected
immediately. Instead, a new coroutine is spawned using the same strand
as the other coroutines. This coroutine calls `async_shutdown` on the
TCP socket, which might be blocking. However, in order not to block
indefinitely, the `Timeout` class cancels all operations on the socket
after `10` seconds. Though, the timeout does not trigger the handler
immediately; it spawns another coroutine using the same strand
as in the `JsonRpcConnection` class. This can cause unexpected delays if
e.g. `HandleIncomingMessages` gets resumed before the coroutine from the
timeout class. Apart from that, the coroutine for writing messages uses
the same condition, making the two symmetrical.
2024-11-13 15:35:57 +01:00
Yonas Habteab
235e4d4824
Merge pull request #10121 from Icinga/broken-runtime-config-sync-2.14
Fix broken runtime config sync
2024-09-17 15:20:56 +02:00
Yonas Habteab
b70f4da208 Don't allow to modify/create/delete an object concurrently 2024-09-17 12:33:35 +02:00
Yonas Habteab
395a1398f6 ConfigObjectUtility#CreateObject(): Use Defer for config path cleanup 2024-09-17 12:33:35 +02:00
Yonas Habteab
42891028ca ApiListener: Process cluster config updates sequentially 2024-09-17 12:33:35 +02:00
Yonas Habteab
3d5e0fef69 Introduce RAII style ObjectNameLock class 2024-09-17 12:33:35 +02:00
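A RAII-style lock keyed by object name could look roughly like the sketch below. This is an illustration of the general idea only: `NameLockSketch`, its internal `State`, and `IsLocked()` are hypothetical, and nothing here is taken from the actual `ObjectNameLock` implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Hypothetical per-name lock: the constructor blocks until no other holder
// owns the same name, the destructor releases the name again.
class NameLockSketch {
public:
    explicit NameLockSketch(std::string name) : m_Name(std::move(name)) {
        assert(!m_Name.empty());
        State& s = GetState();
        std::unique_lock<std::mutex> lock(s.mutex);
        s.cv.wait(lock, [&] { return s.locked.count(m_Name) == 0; });
        s.locked.insert(m_Name);
    }

    ~NameLockSketch() {
        State& s = GetState();
        {
            std::lock_guard<std::mutex> lock(s.mutex);
            s.locked.erase(m_Name);
        }
        s.cv.notify_all(); // wake threads waiting for this name
    }

    NameLockSketch(const NameLockSketch&) = delete;
    NameLockSketch& operator=(const NameLockSketch&) = delete;

    // Helper for illustration: reports whether a name is currently held.
    static bool IsLocked(const std::string& name) {
        State& s = GetState();
        std::lock_guard<std::mutex> lock(s.mutex);
        return s.locked.count(name) != 0;
    }

private:
    struct State {
        std::mutex mutex;
        std::condition_variable cv;
        std::set<std::string> locked;
    };

    static State& GetState() {
        static State state;
        return state;
    }

    std::string m_Name;
};
```

In this sketch, operations on different names do not block each other, while two operations on the same name are serialized. Acquiring the same name twice on one thread would deadlock.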
Yonas Habteab
cf11fe0177 ConfigObjectUtility: Use AtomicFile to store object config files 2024-09-17 12:33:35 +02:00
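The usual technique behind an atomic file write is to write to a temporary file and then rename it over the target, so readers see either the old or the new content. The sketch below shows that pattern with the standard library only; `WriteFileAtomically`, `ReadFileSketch`, and the `.tmp` suffix are hypothetical and say nothing about how `AtomicFile` itself is implemented. It assumes POSIX semantics, where `std::rename` replaces an existing target.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Writes content to "<path>.tmp" first and renames it over path afterwards.
// Returns false and removes the temporary file if any step fails.
bool WriteFileAtomically(const std::string& path, const std::string& content) {
    assert(!path.empty());
    const std::string tmp = path + ".tmp"; // hypothetical naming scheme

    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
        out << content;
        out.flush();
        if (!out) {
            out.close();
            std::remove(tmp.c_str());
            return false;
        }
    } // the stream is closed here, before the rename

    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}

// Small helper for illustration: reads a whole file into a string.
std::string ReadFileSketch(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A crash in the middle of the write leaves at most a stray temporary file behind, never a half-written object config file.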
Alexander A. Klimov
2722deb6aa /v1/debug/malloc_info: call malloc_info(3) if available
The GNU libc function malloc_info(3) provides memory allocation and usage
statistics of Icinga 2 itself.
2024-09-17 12:32:52 +02:00
Yonas Habteab
96839d829b ApiListener: Reorder logging in ApiTimerHandler() 2024-09-03 16:49:02 +02:00
Yonas Habteab
b9b3e7a925 ApiListener: Catch & suppress clients runtime errors 2024-09-03 16:49:02 +02:00
Yonas Habteab
561aedab1d JsonRpcConnection: Raise an exception when trying to send to disconnected clients 2024-09-03 16:49:02 +02:00
Julian Brost
02334c5f29 Make sure log file is reopened when ApiListener::ReplayLog() returns 2024-09-03 16:49:02 +02:00
Yonas Habteab
5f2d31bf3c Use Defer class for cleanup in ApiListener::ReplayLog() 2024-09-03 16:49:01 +02:00
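The deferred-cleanup idiom named in this commit can be sketched as a small scope guard whose destructor runs a callback on every exit path. `DeferSketch` and `ReplaySketch` are hypothetical; the actual `Defer` class and `ApiListener::ReplayLog()` may differ in detail.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs the given callback when it goes out of scope.
class DeferSketch {
public:
    explicit DeferSketch(std::function<void()> fn) : m_Fn(std::move(fn)) {
        assert(m_Fn); // a guard without a callback is a programming error
    }

    ~DeferSketch() {
        m_Fn();
    }

    DeferSketch(const DeferSketch&) = delete;
    DeferSketch& operator=(const DeferSketch&) = delete;

private:
    std::function<void()> m_Fn;
};

// Hypothetical replay routine: the log is marked as reopened no matter
// which return statement is taken.
bool ReplaySketch(bool sendFails, bool& logReopened) {
    DeferSketch reopenLog([&logReopened] { logReopened = true; });

    if (sendFails) {
        return false; // early return, cleanup still runs
    }
    return true;
}
```

The cleanup is written once at the top of the function instead of before each `return`, so newly added exit paths cannot forget it.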
Alexander A. Klimov
9a0c7d7c75 ApiListener#ReplayLog(): stop reading files ASAP on send error 2024-09-03 16:49:01 +02:00
Alexander A. Klimov
a6946f9dbf JsonRpcConnection#Send*(): discard messages ASAP once shutting down
Especially ApiListener#ReplayLog() enqueued lots of messages into
JsonRpcConnection#{m_IoStrand,m_OutgoingMessagesQueue} (RAM) even if
the connection was shut(ting) down. Now #Disconnect() takes effect ASAP.
2024-09-03 16:49:01 +02:00
Alexander A. Klimov
81da1cdb26 JsonRpcConnection#Disconnect(): spawn coroutine only if necessary
by checking the now atomic #m_ShuttingDown outside of it.
2024-09-03 16:49:01 +02:00
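Checking an atomic flag before spawning anything might be sketched as follows. `ConnectionSketch` and the `spawnShutdown` callback are hypothetical, and the use of `exchange()` to test and set the flag in one step is an assumption; the commit only states that the now atomic `m_ShuttingDown` is checked outside the coroutine.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical connection: spawnShutdown stands in for spawning the
// coroutine that performs the actual disconnect.
class ConnectionSketch {
public:
    explicit ConnectionSketch(std::function<void()> spawnShutdown)
        : m_Spawn(std::move(spawnShutdown)) {
        assert(m_Spawn);
    }

    void Disconnect() {
        // exchange() returns the previous value, so only the first caller
        // sees false and spawns the shutdown work.
        if (!m_ShuttingDown.exchange(true)) {
            m_Spawn();
        }
    }

    bool IsShuttingDown() const {
        return m_ShuttingDown.load();
    }

private:
    std::atomic<bool> m_ShuttingDown{false};
    std::function<void()> m_Spawn;
};
```

Repeated or concurrent `Disconnect()` calls then cost one atomic operation each instead of one coroutine each.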
Julian Brost
5243241b33 HttpServerConnection: use exceptions for error handling
When an HTTP connection dies prematurely while the response is sent,
`http::async_write()` sets the error code to something like broken pipe for
example. When calling `async_flush()` afterwards, it sometimes happens that
this never returns. This results in a resource leak as the coroutine isn't
cleaned up. This commit makes the individual functions throw exceptions instead
of silently ignoring the errors, resulting in the function terminating early
and an error being logged.
2024-06-10 13:19:46 +02:00
Alexander A. Klimov
8ff7121e93 ApiListener#ListenerCoroutineProc(): get remote endpoint ASAP for logging
On incoming connection timeout we log the remote endpoint which isn't
available if it was already disconnected - an exception is thrown.  Get it
as long as we're still connected not to lose it, nor to get an exception.
2024-06-10 13:19:46 +02:00
Yonas Habteab
dfffb29c81 ApiListener: Reset m_LogMessageCount when rotating
Closing and re-opening that very same log file shouldn't reset the
counter, otherwise some log files may exceed the max limit per file as
their offset indicator is reset each time they are re-opened.
2024-06-10 13:19:46 +02:00
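The counter behaviour described here can be sketched as follows: reopening the same file keeps the count, and only rotation resets it. `ReplayLogSketch` and all of its members are hypothetical; the real code deals with actual files and a differently defined limit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical replay log that rotates after maxPerFile messages.
class ReplayLogSketch {
public:
    explicit ReplayLogSketch(std::size_t maxPerFile) : m_Max(maxPerFile) {
        assert(maxPerFile > 0);
    }

    // Opening or closing the current file deliberately leaves the counter alone.
    void Open() { m_Open = true; }
    void Close() { m_Open = false; }

    // Only rotation starts a new file, so only rotation resets the counter.
    void Rotate() {
        Close();
        ++m_FileIndex;
        m_Count = 0;
        Open();
    }

    void Persist() {
        if (!m_Open) {
            Open();
        }
        ++m_Count;
        if (m_Count >= m_Max) {
            Rotate();
        }
    }

    std::size_t Count() const { return m_Count; }
    std::size_t FileIndex() const { return m_FileIndex; }

private:
    std::size_t m_Max;
    std::size_t m_Count = 0;
    std::size_t m_FileIndex = 0;
    bool m_Open = false;
};
```

If `Open()` reset the counter instead, a file that is closed and reopened repeatedly could grow past the limit without ever being rotated.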
Yonas Habteab
ed8156db28 Drop redundant CpuBoundWork usage in JsonRpcConnection::Disconnect()
Although there is locking involved here, it shouldn't take too long for
the thread to actually acquire it, since there aren't that many threads
dealing with endpoint clients concurrently. Trying to obtain a CPU slot
here just wastes time.
2024-06-10 13:19:46 +02:00
Yonas Habteab
e66f8567de HttpServerConnection: Drop superfluous CpuBoundWork usage 2024-06-10 13:19:46 +02:00
Yonas Habteab
599a54aae0 EventsHandler: Drop superfluous CpuBoundWork usage 2024-06-10 13:19:46 +02:00