6493 Commits

Author SHA1 Message Date
Yonas Habteab
9f84c1516e ApiListener: Reorder logging in ApiTimerHandler() 2024-08-28 16:53:53 +02:00
Yonas Habteab
e062ceb901 ApiListener: Catch & suppress clients runtime errors 2024-08-28 16:53:53 +02:00
Julian Brost
88e79ea41a
Merge pull request #10111 from Icinga/unregister-invalid-objects-properly
Unregister invalid config objects properly
2024-08-27 14:30:38 +02:00
Yonas Habteab
932a53449d JsonRpcConnection: Raise an exception when trying to send to disconnected clients 2024-08-27 14:23:41 +02:00
Julian Brost
9222a63ff7 Make sure log file is reopened when ApiListener::ReplayLog() returns 2024-08-27 14:23:41 +02:00
Yonas Habteab
a5a83e311a Defer: Allow empty initialization & add SetFunc() method 2024-08-27 14:23:41 +02:00
Yonas Habteab
73db30c08b Use Defer class for cleanup in ApiListener::ReplayLog() 2024-08-27 14:23:41 +02:00
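The two Defer commits above describe a scope guard that can start out empty and receive its cleanup function later. A minimal sketch of that idea follows; the class body is an illustration and not the project's actual implementation, and `Cancel()` is a hypothetical extra that the commit messages do not mention.

```cpp
#include <functional>
#include <utility>

// Illustrative scope guard: runs the stored function when it goes out of scope.
class Defer
{
public:
    Defer() = default; // empty initialization: nothing to run yet

    explicit Defer(std::function<void()> func) : m_Func(std::move(func)) { }

    Defer(const Defer&) = delete;
    Defer& operator=(const Defer&) = delete;

    ~Defer()
    {
        if (m_Func) {
            try {
                m_Func();
            } catch (...) {
                // A destructor must not let exceptions escape.
            }
        }
    }

    // Install or replace the cleanup function after construction.
    void SetFunc(std::function<void()> func) { m_Func = std::move(func); }

    // Hypothetical helper: drop the cleanup function without running it.
    void Cancel() { m_Func = nullptr; }

private:
    std::function<void()> m_Func;
};
```

A caller can declare `Defer cleanup;` at the top of a function and call `cleanup.SetFunc(...)` only once the resource that needs cleanup actually exists.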
Alexander A. Klimov
f074e24d2a ApiListener#ReplayLog(): stop reading files ASAP on send error 2024-08-27 14:23:41 +02:00
Alexander A. Klimov
b538ad2528 JsonRpcConnection#Send*(): discard messages ASAP once shutting down
Especially ApiListener#ReplayLog() enqueued lots of messages into
JsonRpcConnection#{m_IoStrand,m_OutgoingMessagesQueue} (RAM) even if
the connection was shut(ting) down. Now #Disconnect() takes effect ASAP.
2024-08-27 14:23:41 +02:00
Alexander A. Klimov
33f8ea6dcc JsonRpcConnection#Disconnect(): spawn coroutine only if necessary
by checking the now atomic #m_ShuttingDown outside of it.
2024-08-27 14:23:41 +02:00
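The two JsonRpcConnection commits above rely on an atomic shutdown flag that is checked before any work is queued. The sketch below shows the pattern in isolation. `Connection`, `QueuedCount()` and `TeardownCount()` are hypothetical names, a plain mutex-protected deque stands in for the strand and outgoing queue, and `Send()` reports failure through its return value instead of an exception.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

class Connection
{
public:
    // Discards the message right away once shutdown has begun.
    bool Send(std::string message)
    {
        if (m_ShuttingDown.load())
            return false;

        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Queue.push_back(std::move(message));
        return true;
    }

    // exchange() returns the previous value, so only the first caller
    // performs the (here: simulated) teardown work.
    void Disconnect()
    {
        if (m_ShuttingDown.exchange(true))
            return;

        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Queue.clear();
        ++m_Teardowns;
    }

    std::size_t QueuedCount()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Queue.size();
    }

    int TeardownCount()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Teardowns;
    }

private:
    std::atomic<bool> m_ShuttingDown{false};
    std::mutex m_Mutex;
    std::deque<std::string> m_Queue;
    int m_Teardowns = 0;
};
```

Checking the flag before taking the lock keeps the rejected path cheap. A production version would also have to handle a `Send()` that passes the check just before `Disconnect()` runs, which this sketch does not attempt.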
Alexander A. Klimov
f96e7c67ee On Windows, don't create C:\Program Files\Icinga2\var during MSI build 2024-08-23 12:49:09 +02:00
Julian Brost
39ae2e8ca4 Utility::FormatDateTime(): provide an overload for tm*
This allows the function to be used both with a double timestamp or a pointer
to a tm struct. With this, a similar implementation inside the tests can simply
use our regular function.
2024-08-23 12:48:50 +02:00
Julian Brost
d5b3ffaa6d Utility::FormatDateTime(): handle invalid format strings on Windows
On Windows, the strftime() function family invokes an invalid parameter handler
when the format string is invalid (see the "Remarks" section in their
documentation). std::put_time() shows the same behavior as it uses
_wcsftime_l() internally. The default invalid parameter handler may terminate
the process, which can be a problem given that the format string can be
specified by the user from the Icinga DSL.

Thus, temporarily set a thread-local no-op handler to disable the default one
allowing the program to continue. This then simply results in the function
returning an error which then results in an exception as we ask the stream to
throw one.

See also:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strftime-wcsftime-strftime-l-wcsftime-l?view=msvc-170
https://learn.microsoft.com/en-us/cpp/c-runtime-library/parameter-validation?view=msvc-170
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/set-invalid-parameter-handler-set-thread-local-invalid-parameter-handler?view=msvc-170
2024-08-23 12:48:50 +02:00
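The approach described above can be sketched as a small RAII guard around `_set_thread_local_invalid_parameter_handler()`, combined with a stream that is asked to throw on failure. `InvalidParameterGuard` and `FormatTm()` are hypothetical names chosen for this illustration; on non-Windows platforms the guard compiles to a no-op.

```cpp
#include <cstdint>
#include <cstdlib>
#include <ctime>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

#ifdef _WIN32
// No-op replacement for the CRT's default invalid parameter handler.
static void IgnoreInvalidParameter(const wchar_t*, const wchar_t*, const wchar_t*,
    unsigned int, std::uintptr_t)
{ }
#endif

// Installs the no-op handler for the current thread and restores the old one on exit.
class InvalidParameterGuard
{
public:
#ifdef _WIN32
    InvalidParameterGuard()
        : m_Old(_set_thread_local_invalid_parameter_handler(IgnoreInvalidParameter)) { }
    ~InvalidParameterGuard() { _set_thread_local_invalid_parameter_handler(m_Old); }
#else
    InvalidParameterGuard() { }
    ~InvalidParameterGuard() { }
#endif
    InvalidParameterGuard(const InvalidParameterGuard&) = delete;
    InvalidParameterGuard& operator=(const InvalidParameterGuard&) = delete;

#ifdef _WIN32
private:
    _invalid_parameter_handler m_Old;
#endif
};

std::string FormatTm(const char* format, const std::tm& t)
{
    InvalidParameterGuard guard;

    std::ostringstream stream;
    // Turn a formatting failure into an exception instead of a silent error state.
    stream.exceptions(std::ios_base::badbit | std::ios_base::failbit);
    stream << std::put_time(&t, format);
    return stream.str();
}
```

The guard only covers the thread that formats the string, so other threads keep the default handler.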
Julian Brost
0285028689 Utility::FormatDateTime(): handle errors from strftime()
So far, the return value of strftime() was simply ignored and the output buffer
passed to the icinga::String constructor. However, there are error conditions
where strftime() returns 0 to signal an error, like if the buffer was too small
for the output. In that case, there's no guarantee on the buffer contents and
reading it can result in undefined behavior. Unfortunately, returning 0 can
also indicate success and strftime() doesn't set errno, so there's no reliable
way to distinguish both situations. Thus, the implementation now returns the
empty string in both cases.

I attempted to use std::put_time() at first as that allows for better error
handling, however, there were problems with the implementation on Windows (see
inline comment), so I put that plan on hold and left strftime() there for the
time being.
2024-08-23 12:42:54 +02:00
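The return-value handling described above can be sketched as follows. `FormatWithStrftime()` and the 128-byte buffer are assumptions made for this illustration and are not taken from the actual implementation.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

std::string FormatWithStrftime(const char* format, const std::tm& t)
{
    char buffer[128];
    std::size_t length = std::strftime(buffer, sizeof(buffer), format, &t);

    // A return value of 0 means either an error (for example, the buffer was
    // too small) or a legitimately empty result. The buffer contents are not
    // reliable in the error case, so only the first `length` bytes are used,
    // which yields the empty string in both situations.
    return std::string(buffer, length);
}
```

Constructing the string from an explicit length avoids reading the buffer at all when strftime() reports 0.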
Julian Brost
c2c66908f6 Utility::FormatDateTime(): use localtime_s() on Windows
localtime() is not thread-safe as it returns a pointer to a shared tm struct.
Everywhere except on Windows, localtime_r() is used already which avoids the
problem by using a struct allocated by the caller for the output.

Windows actually has a similar function called localtime_s() which has the same
properties, just with a different name and order of arguments.
2024-08-23 12:42:32 +02:00
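The difference in name and argument order mentioned above can be hidden behind one small wrapper. `ToLocalTime()` is a hypothetical helper name used only for this sketch.

```cpp
#include <ctime>
#include <stdexcept>

// Converts a timestamp to local time using a caller-owned tm struct,
// so no shared static buffer is involved.
std::tm ToLocalTime(std::time_t timestamp)
{
    std::tm result{};

#ifdef _WIN32
    // Windows: output struct first, returns an errno-style code (0 on success).
    if (localtime_s(&result, &timestamp) != 0)
        throw std::runtime_error("localtime_s() failed");
#else
    // POSIX: timestamp first, returns nullptr on failure.
    if (localtime_r(&timestamp, &result) == nullptr)
        throw std::runtime_error("localtime_r() failed");
#endif

    return result;
}
```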
Julian Brost
704acdc698 Utility::FormatDateTime(): use boost::numeric_cast<>()
The previous implementation actually had undefined behavior when called with a
double that can't be represented as time_t. With boost::numeric_cast, there's a
convenient cast available that avoids this and throws an exception on
overflow.

It's undefined behavior ([0], where the implicit conversion rule comes into
play because the C-style cast uses static_cast [1] which in turn uses the
implicit conversion as per rule 5 of [2]):

> A prvalue of floating-point type can be converted to a prvalue of any integer
> type. The fractional part is truncated, that is, the fractional part is
> discarded.
>
> * If the truncated value cannot fit into the destination type, the behavior
>   is undefined (even when the destination type is unsigned, modulo arithmetic
>   does not apply).

Note that on Linux amd64, the undefined behavior typically manifests itself in
the result being the minimal value of time_t which then results in localtime_r
failing with EOVERFLOW.

[0]: https://en.cppreference.com/w/cpp/language/implicit_conversion#Floating.E2.80.93integral_conversions
[1]: https://en.cppreference.com/w/cpp/language/explicit_cast
[2]: https://en.cppreference.com/w/cpp/language/static_cast
2024-08-23 12:42:30 +02:00
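To illustrate the kind of range check a checked cast performs before converting, here is a hand-written sketch that compiles without Boost. It is not `boost::numeric_cast` itself: `CheckedCast()` is a hypothetical name, it throws `std::range_error` rather than Boost's exception types, and it supports signed integer targets only.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

template<typename T>
T CheckedCast(double value)
{
    static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
        "this sketch only handles signed integer targets");

    // 2^(bits-1) is exactly representable as a double, unlike the maximum of
    // a 64-bit integer, so the valid range is compared as [-limit, limit).
    const double limit = std::ldexp(1.0, std::numeric_limits<T>::digits);
    const double truncated = std::trunc(value);

    // Written as a negated conjunction so that NaN is rejected as well.
    if (!(truncated >= -limit && truncated < limit))
        throw std::range_error("value does not fit into the target type");

    return static_cast<T>(truncated);
}
```

With this check in front, the final `static_cast` only ever sees values that the target type can hold, so the undefined behavior quoted above cannot occur.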
Julian Brost
4c83d793a6
Merge pull request #9983 from Icinga/broken-timeperiod
Fix broken `TimePeriod/ScheduledDowntime`s
2024-08-20 10:05:59 +02:00
Yonas Habteab
ca7cc54438 Checkable: Don't recalculate next_check while processing remotely generated check
Currently, when processing a `CheckResult`, it will first trigger an
`OnNextCheckChanged` event, which is sent to all connected endpoints.
Then, when `Checkable::ProcessCheckResult()` returns, an `OnCheckResult`
event is fired, which is of course also sent to all connected endpoints.

Next, the other endpoints receive the `event::SetNextCheck` cluster
event followed by `event::CheckResult` and invoke
`Checkable#SetNextCheck()` and `Checkable#CheckResult()` with the newly
received check. So they also try to recalculate the next check
themselves and invalidate the previously received next check timestamp
from the source endpoint. Since each endpoint randomly initialises its
own scheduling offset, the recalculated next check will always differ by
a split second/millisecond on each of them. As a consequence, two Icinga
DB HA instances will generate two different checksums for the same state
and cause the state histories to be fully resynchronised after a
takeover/Icinga 2 reload.
2024-08-16 16:15:56 +02:00
Alexander Aleksandrovič Klimov
02ba5e4101
Merge pull request #10015 from Icinga/malloc_info
/v1/debug/malloc_info: call malloc_info(3) if available
2024-08-12 14:41:09 +02:00
Alexander A. Klimov
f3c7ac11e9 /v1/debug/malloc_info: call malloc_info(3) if available
The GNU libc function malloc_info(3) provides memory allocation and usage
statistics of Icinga 2 itself.
2024-08-09 12:59:25 +02:00
Julian Brost
2bfa1f1649
Merge pull request #10107 from Icinga/timeperiod-nth-day-of-month-off-by-one
Timeperiods: fix off by one when calculating n-th last weekday of the month
2024-08-08 14:40:18 +02:00
Julian Brost
c45829b59f Timeperiods: fix off by one when calculating n-th last weekday of the month
A day specification like "monday -1" refers to the last Monday of the month.
However, there was an off by one if the first day of the next month is the same
day of the week, i.e. a Monday in this example.

LegacyTimePeriod::FindNthWeekday() picks a day to start the search for the day
in question. When given a negative n to search for the n-th last day, it
wrongly used the first day of the following month as the start and counted it
as if it was within the current month. This resulted in a 1/7 chance that the
result was one week too late.

This is fixed by using the last day of the current month instead.
2024-08-07 12:06:05 +02:00
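The fix described above can be illustrated with a standalone calculation that starts the backwards search at the last day of the current month. This is a sketch, not the actual `LegacyTimePeriod::FindNthWeekday()` code: the signature, the helper functions and the return convention (0 when no such day exists) are assumptions.

```cpp
// Day of the week for a Gregorian date (0 = Sunday), Sakamoto's method.
int DayOfWeek(int year, int month, int day)
{
    static const int offsets[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400 + offsets[month - 1] + day) % 7;
}

int DaysInMonth(int year, int month)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return (month == 2 && leap) ? 29 : days[month - 1];
}

// Returns the day of the month of the n-th (n > 0) or n-th last (n < 0)
// given weekday, or 0 if the month has no such day.
int FindNthWeekday(int year, int month, int weekday, int n)
{
    int last = DaysInMonth(year, month);
    int day = 0;

    if (n > 0) {
        int offset = (weekday - DayOfWeek(year, month, 1) + 7) % 7;
        day = 1 + offset + (n - 1) * 7;
    } else if (n < 0) {
        // Search backwards from the last day of this month, not from the
        // first day of the next one.
        int offset = (DayOfWeek(year, month, last) - weekday + 7) % 7;
        day = last - offset - (-n - 1) * 7;
    }

    return (day >= 1 && day <= last) ? day : 0;
}
```

June 2024 is an instance of the problematic case: 1 July 2024 is a Monday, so `FindNthWeekday(2024, 6, 1, -1)` must yield 24 rather than a date one week later.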
Yonas Habteab
c4edecc1fb Unregister invalid config objects properly 2024-08-06 16:59:30 +02:00
Julian Brost
07d253009a
Merge pull request #10013 from Icinga/broken-runtime-config-sync
Fix broken runtime config sync
2024-08-06 11:57:24 +02:00
Yonas Habteab
86347013a6 Check segment start date inclusively in TimePeriod::IsInside() 2024-08-01 16:16:48 +02:00
Yonas Habteab
4daa03dc02 Fix broken timeperiods/scheduleddowntimes 2024-08-01 15:14:34 +02:00
Yonas Habteab
546dea95a2 Don't allow to modify/create/delete an object concurrently 2024-06-13 11:26:19 +02:00
Yonas Habteab
099f664ce6 ConfigObjectUtility#CreateObject(): Use Defer for config path cleanup 2024-06-13 11:26:19 +02:00
Yonas Habteab
433e2de13a ApiListener: Process cluster config updates sequentially 2024-06-13 11:26:19 +02:00
Yonas Habteab
1a55b68541 Introduce RAII style ObjectNameLock class 2024-06-13 11:26:19 +02:00
Yonas Habteab
2218ebd6b0 ConfigObjectUtility: Use AtomicFile to store object config files 2024-06-13 11:26:19 +02:00
Alexander Aleksandrovič Klimov
f1be9b73ab
Merge pull request #10060 from Icinga/IcingaDB-SerializeState-execution_time-latency
IcingaDB#SerializeState(): limit execution_time and latency to 2^32-1
2024-06-13 09:55:45 +02:00
Yonas Habteab
81a94a0759 Don't fail to remove obsolete downtimes 2024-05-23 10:09:41 +02:00
Yonas Habteab
4eeccce36c Don't lose args in recursive Downtime::RemoveDowntime() call 2024-05-23 10:09:41 +02:00
Yonas Habteab
e0fd0d3df4 Introduce & use enum DowntimeRemovalReason 2024-05-23 09:34:15 +02:00
Alexander Aleksandrovič Klimov
cc3965c3ce
Merge pull request #10065 from Icinga/heavy-update-missing-table-relations
Update `object#config_hash` after all relations queries
2024-05-22 15:38:31 +02:00
Yonas Habteab
1019398d55 Update object#config_hash after all relations queries 2024-05-22 13:39:30 +02:00
Yonas Habteab
3d64240ee3
Merge pull request #10066 from Icinga/Checkable-RemoveAllDowntimes
Remove unused Checkable#RemoveAllDowntimes()
2024-05-21 17:13:16 +02:00
Alexander A. Klimov
e2bdb8a2f1 Remove unused Checkable#RemoveAllDowntimes() 2024-05-21 14:28:39 +02:00
Alexander A. Klimov
f9adf18111 IcingaDB#SerializeState(): limit execution_time and latency to 2^32-1
not to write higher values into Redis than the Icinga DB schema can hold.
This fixes yet another potential Go daemon crash.
2024-05-15 12:55:41 +02:00
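The limit described above amounts to capping a value at the largest number an unsigned 32-bit field can hold. A minimal sketch follows, with `ClampToUint32()` as a hypothetical helper name; the unit of the real values and the place where the cap is applied are not shown here.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Caps the value at 2^32-1 (4294967295), the maximum of an unsigned 32-bit integer.
double ClampToUint32(double value)
{
    const double max = std::numeric_limits<std::uint32_t>::max();
    return std::min(value, max);
}
```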
Alexander Aleksandrovič Klimov
8c2eb3c1ed
Merge pull request #10049 from Icinga/AddDowntime-trigger_name
Downtime::AddDowntime(): NULL-check pointer before deref not to crash
2024-05-06 10:26:26 +02:00
Alexander Aleksandrovič Klimov
d8f8d64f1a
Merge pull request #10027 from macdems/master
Fix missing values in PerfData normalization
2024-04-25 19:38:21 +02:00
Maciej Dems
2bb5cc62e2 Fix missing values in PerfData normalization 2024-04-25 17:41:12 +02:00
Alexander A. Klimov
5f80ac17aa l_LegacyDowntimesCache: delete removed objects not to leak memory 2024-04-25 12:13:52 +02:00
Alexander A. Klimov
c0f87dd4c9 /v1/actions/schedule-downtime: reject request on invalid trigger_name
For this purpose look up the specified Downtime. Also pass Downtime objects,
not just names, to Downtime::AddDowntime() not to look it up twice.
2024-04-25 12:13:52 +02:00
Alexander A. Klimov
f0b5239a15 [Refactor] Downtime::GetDowntimeIDFromLegacyID(): return the Downtime itself
not just its name.
2024-04-25 12:13:52 +02:00
Alexander A. Klimov
28b0f7a48c [Refactor] l_LegacyDowntimesCache: store Downtime objects, not just their names
to avoid names of vanished objects.
2024-04-24 12:33:56 +02:00
Alexander A. Klimov
bb13e98ca5 PluginCheckTask::ProcessFinishedHandler(): warn about exit codes outside 0..3
in the plugin output as well, in addition to the warning log.
2024-04-23 17:45:31 +02:00
Alexander A. Klimov
e33befabfb Make ProcessResult#ExitStatus and CheckResult#exit_status 64-bit ints
so that they can hold Windows exit codes like 3221225477 (>2147483647).
2024-04-23 17:45:31 +02:00
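A short check shows why a 32-bit signed integer is too narrow for the exit code quoted above. `FitsInInt32()` and `kWindowsExitCode` are hypothetical names for this sketch; 3221225477 equals 0xC0000005 in hexadecimal.

```cpp
#include <cstdint>
#include <limits>

// The example exit code from the commit message, stored in a 64-bit integer.
const std::int64_t kWindowsExitCode = 3221225477;

// True if the value can be stored in a 32-bit signed integer without loss.
bool FitsInInt32(std::int64_t value)
{
    return value >= std::numeric_limits<std::int32_t>::min()
        && value <= std::numeric_limits<std::int32_t>::max();
}
```

`FitsInInt32(kWindowsExitCode)` is false, while the usual plugin exit codes 0..3 fit in either width.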
Alexander A. Klimov
5c17465a19 OpenTsdbWriter#CheckResultHandler(): skip custom tags with empty values
refs #7724
2024-04-18 11:36:21 +02:00