The point of logging to the Windows Event Log was to catch errors that happen
before the full logging configuration has been loaded and enabled. Messages
like the number of loaded objects per type just cause noise in the log and
provide little benefit. Therefore raise the required log level at this stage.
Note that this commit removes the (never documented) ability to use the -x flag
to change the level. But doing so would require patching the command line of
the service in the registry anyways.
Apparently there was a reason for making the members of generated classes
atomic. However, this was only done for some types, others were still accessed
using non-atomic operations. For members of type T::Ptr (i.e. intrusive_ptr<T>),
this can result in a double free when multiple threads access the same variable
and at least one of them writes to the variable.
This commit makes use of std::atomic<T> for more T (it removes the additional
constraint sizeof(T) <= sizeof(void*)) and uses a type including a mutex for
load and store operations as a fallback.
add_definitions would set SD_JOURNAL_SUPPRESS_LOCATION for all targets
in directory and sub-directories. However, another future target might
want the opposite, so define it as local as possible to journaldlogger.cpp.
To make this work, we must take journaldlogger.cpp out of the unity
build, because all files from a unity of share compiler definitions.
As proposed in #8857, this adds a Logger subclass that writes structured
log messages via journald's native protocol by calling sd_journal_sendv.
The feature therefore depends on the systemd library. sd_journal_sendv is
available since the early days (systemd v38), so a version check is
probably superflous.
We add the following fields to each record:
- MESSAGE: The log message
- PRIORITY (aka severity): Numeric severity as in RFC5424 section 6.2.1
- SYSLOG_FACILITY: Numeric facility as in RFC5424 section 6.2.1
- SYSLOG_IDENTIFIER: If provided, use value from configuration.
Else use systemd's default behaior, which is to determine the field
by using libc's program_invocation_short_name, resulting in "icinga2".
- ICINGA2_FACILITY: Facility as in Log::Log(..., String facility, ...),
e.g. "ApiListener"
- some more fields are added automatically by systemd
Fields are stored indexed, so we can do fast queries for certain field
values. Example:
$ journalctl -t icinga2 ICINGA2_FACILITY=ApiListener -n 5
Syslog compatiblity is ratained because good old tag, severity and facility
is stored along, and systemd can forward to syslog daemons.
See also https://systemd.io/JOURNAL_NATIVE_PROTOCOL/.
The upcoming JournaldLogger will need the same syslog validation and
conversion logic, so factor it out from SyslogLogger to make it
reusable.
Also explicitely include syslog.h, which defines the syslog()
function.
As Icinga first sends a SIGTERM to a check plugin on timeout to allow it to
terminate gracefully, this is not really part of the plugin API specification
and we cannot assume that plugins will handle this correctly and still exit
with an exit code that maps to UNKNOWN. Therefore, once Icinga decides to kill
a process, force its exit code to 128 to be sure the state will be UNKNOWN
after a timeout.
So far, the documentation has claimed that loggers have a default severity
(information for FileLogger and warning for SyslogLogger). However, this was
not the case and not setting the severity resulted in a configuration error.
This commit changes the default value to be information for all loggers.
When Icinga 2 is started as a service, the early log messages generated
until the FileLogger object is activated are lost and make it really
hard to debug issues that (only) occur when Icinga 2 reloads.
With this commit, these early log messages are written to the Windows
Event Log.
Even if a double represents an integer value, it might not be safe to cast it
to long long as it may overflow the type. Instead just use print the double
value with 0 decimals using std::setprecision.
Before:
<1> => 18446744073709551616.to_string()
"-9223372036854775808"
After:
<1> => 18446744073709551616.to_string()
"18446744073709551616"
Fixes the following build error:
/home/jbrost/dev/icinga2/lib/base/stdiostream.cpp: In member function ‘virtual size_t icinga::StdioStream::Read(void*, size_t, bool)’:
/home/jbrost/dev/icinga2/lib/base/stdiostream.cpp:28:15: error: invalid use of incomplete type ‘std::iostream’ {aka ‘class std::basic_iostream<char>’}
28 | m_InnerStream->read(static_cast<char *>(buffer), size);
| ^~
Unfortunately, the symbol resolution of boost::stacktrace is broken on
FreeBSD, therefore fall back to using backtrace_symbols() to print the
stack trace saved by Boost.
Additionally, -D_GNU_SOURCE is required on FreeBSD for the
_Unwind_Backtrace function used by boost::stacktrace.
This makes the format more similar to what the uncaught C++ and SEH
exception handlers write. Previously there was no indication in the
crash log that a SIGABRT happened.
Maybe this will save the next person who has to look at this code some
time. Please don't blame me for the implementation, I'm just trying to
reconstruct what it does.
The logic for selecting the traces to print stays the same, but there
are fewer nested ifs now. This changes the format of the returned string
a bit by adding a heading for both traces.
By default, DiagnosticInformation uses the stack trace saved when the
exception was thrown, but this mechanism is not in use on Windows.
Gathering a stacktrace in the terminate handler serves as a fallback.
On Windows, the termination handler is executed for uncaught C++
exceptions unless a SEH unhandled exception filter is also set. In this
case, this filter has to explicitly chain the default filter to keep
this behavior.
Boost.Beast changed the signature of
boost::beast::http::basic_fields::set in version 1.74 so that no longer
allows passing an icinga::String instance as value. This adds a
conversion function so that it works again.
When a CRL is specified in the ApiListener configuration, Icinga 2 only
used it when connections were established so far, but not when a
certificate is requested. This allows a node to automatically renew a
revoked certificate if it meets the other conditions for auto-renewal
(issued before 2017 or expires in less than 30 days).
The function was never used and it's implementation contains a bug where
a buffer of too small size is used as a paramter to ERR_error_string.
According to the `man 3 ERR_error_info`, the buffer has to be at least
256 bytes in size.
Also the function seems of limited use as it allows to output the tag
object used with additional error information for exceptions in Boost.
However, you boost::get_error_info<>() just returns the value type but
not the full tag object from the exception.
The scientific notation is basically allowed in our performance data
parser. In an edge case where scientific notation is delivered with
an upercas E letter our parser does not recognize it and drops it as
false performance data.
The wrong interpretation as false performance data affects all parts of
the performance data, the value itself, the warning value, the critical
value, the minimum value and the maximum value.
On Windows this terminates checks that reached the timeout with the UNKNOWN
state instead the WARNING state.
Co-authored-by: araujorm <araujorm@users.noreply.github.com>