Having the command type as part of the command ID isn't needed anywhere. Removing it simplifies the way we generate IDs in general, because we don't need Prepend() anymore.
The command type was only needed to prevent ID collisions within the command_envvar and command_argument tables. Those tables have since been separated into {check,event,notification}command_envvar and {check,event,notification}command_argument tables.
PR #9036 introduces some incompatible changes to the Redis schema, most
importantly where Icinga DB has to read the environment from: now it has to use
a new top-level key of the icinga:stats message instead of a value in the
IcingaApplication part of that message.
In order to avoid changes to the environment ID, it is now no longer derived
from the Environment constant but instead from the public key of the CA
certificate. This ensures that it is different between clusters by default, so
no additional changes have to be made to allow two clusters to use Icinga DB to
write into the same database.
To prevent the ID from changing when the CA certificate is replaced, it is also
persisted into the file /var/lib/icinga2/icingadb.env, so if that file exists,
it takes precedence over the CA certificate.
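A minimal sketch of that derivation, assuming OpenSSL and hashing the DER-encoded
public key (function name, error handling and the exact hash input are assumptions,
not the actual Icinga 2 code):

#include <openssl/pem.h>
#include <openssl/sha.h>
#include <openssl/x509.h>

#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

std::string DeriveEnvironmentId(const std::string& caCertPath,
    const std::string& envFile = "/var/lib/icinga2/icingadb.env")
{
    // A previously persisted ID takes precedence over the CA certificate.
    std::ifstream persisted(envFile);
    std::string id;
    if (persisted && std::getline(persisted, id) && !id.empty())
        return id;

    // Otherwise hash the CA certificate's public key.
    std::FILE* fp = std::fopen(caCertPath.c_str(), "r");
    if (!fp)
        throw std::runtime_error("cannot open " + caCertPath);

    X509* cert = PEM_read_X509(fp, nullptr, nullptr, nullptr);
    std::fclose(fp);
    if (!cert)
        throw std::runtime_error("cannot parse " + caCertPath);

    EVP_PKEY* pubKey = X509_get_pubkey(cert);
    unsigned char* der = nullptr;
    int derLen = pubKey ? i2d_PUBKEY(pubKey, &der) : -1;
    if (derLen <= 0)
        throw std::runtime_error("cannot extract public key"); // (cleanup omitted)

    unsigned char digest[SHA_DIGEST_LENGTH];
    SHA1(der, derLen, digest);

    OPENSSL_free(der);
    EVP_PKEY_free(pubKey);
    X509_free(cert);

    char hex[SHA_DIGEST_LENGTH * 2 + 1];
    for (int i = 0; i < SHA_DIGEST_LENGTH; ++i)
        std::snprintf(hex + i * 2, 3, "%02x", static_cast<unsigned>(digest[i]));

    // The fresh ID would then be persisted to envFile (omitted for brevity)
    // so that replacing the CA certificate later doesn't change it.
    return std::string(hex, SHA_DIGEST_LENGTH * 2);
}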
... i.e. UUID -> SHA1(env, eventType, x...) given that SHA1(env, x...) = type-specific ID.
Rationale: allow both masters to write the same history concurrently (while not
in split-brain), so that REPLACE INTO deduplicates the same events written twice.
* ack: SHA1(env, "ack_set"|"ack_clear", checkable.name, setTime)
* comment: SHA1(env, "comment_add"|"comment_remove", comment.name)
* downtime: SHA1(env, "downtime_start"|"downtime_end", downtime.name)
* flapping: SHA1(env, "flapping_start"|"flapping_end", checkable.name, startTime)
* notification: SHA1(env, "notification", notification.name, notificationType, sendTime)
* state: SHA1(env, "state_change", checkable.name, changeTime)
... i.e. UUID -> SHA1(x..., send time) given that SHA1(x...) = notification id.
Rationale: allow both masters to write the same notification history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
... i.e. UUID -> SHA1(x..., check time) given that SHA1(x...) = checkable id.
Rationale: allow both masters to write the same state history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
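For illustration, a hedged sketch of how such an ID could be computed (helper name
and the exact concatenation are assumptions, not the actual Icinga DB code); the
point is only that both masters hashing the same inputs get the same ID, so
REPLACE INTO deduplicates the rows:

#include <openssl/sha.h>

#include <cstdio>
#include <string>
#include <vector>

// SHA1 over the concatenated parts, hex-encoded.
std::string HistoryEventId(const std::vector<std::string>& parts)
{
    SHA_CTX ctx;
    SHA1_Init(&ctx);
    for (const std::string& part : parts)
        SHA1_Update(&ctx, part.data(), part.size());

    unsigned char digest[SHA_DIGEST_LENGTH];
    SHA1_Final(digest, &ctx);

    char hex[SHA_DIGEST_LENGTH * 2 + 1];
    for (int i = 0; i < SHA_DIGEST_LENGTH; ++i)
        std::snprintf(hex + i * 2, 3, "%02x", static_cast<unsigned>(digest[i]));
    return std::string(hex, SHA_DIGEST_LENGTH * 2);
}

// e.g. an acknowledgement event:
// HistoryEventId({envId, "ack_set", "myhost!myservice", std::to_string(setTime)});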
add_definitions would set SD_JOURNAL_SUPPRESS_LOCATION for all targets
in the directory and its sub-directories. However, another future target might
want the opposite, so define it as local as possible to journaldlogger.cpp.
To make this work, we must take journaldlogger.cpp out of the unity
build, because all files within a unity build share compiler definitions.
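One hedged way to keep the definition local (shown here in the source file itself;
the actual change may instead use a per-source property in CMake):

// journaldlogger.cpp (sketch): define the macro right before the systemd
// header instead of globally via add_definitions, so only this file
// suppresses the implicit CODE_FILE/CODE_LINE/CODE_FUNC fields.
#define SD_JOURNAL_SUPPRESS_LOCATION
#include <systemd/sd-journal.h>

Either way, with unity builds the definition would still be shared with every other
file compiled in the same unity source, hence the exclusion mentioned above.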
As proposed in #8857, this adds a Logger subclass that writes structured
log messages via journald's native protocol by calling sd_journal_sendv.
The feature therefore depends on the systemd library. sd_journal_sendv is
available since the early days (systemd v38), so a version check is
probably superfluous.
We add the following fields to each record:
- MESSAGE: The log message
- PRIORITY (aka severity): Numeric severity as in RFC5424 section 6.2.1
- SYSLOG_FACILITY: Numeric facility as in RFC5424 section 6.2.1
- SYSLOG_IDENTIFIER: If provided, use value from configuration.
Else use systemd's default behavior, which is to determine the field
by using libc's program_invocation_short_name, resulting in "icinga2".
- ICINGA2_FACILITY: Facility as in Log::Log(..., String facility, ...),
e.g. "ApiListener"
- some more fields are added automatically by systemd
Fields are stored indexed, so we can do fast queries for certain field
values. Example:
$ journalctl -t icinga2 ICINGA2_FACILITY=ApiListener -n 5
Syslog compatibility is retained because the good old tag, severity and facility
are stored alongside, and systemd can forward to syslog daemons.
See also https://systemd.io/JOURNAL_NATIVE_PROTOCOL/.
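Not the actual JournaldLogger code, but a minimal sketch of what such a
sd_journal_sendv call could look like with the fields listed above (helper name
and signature are made up):

#include <systemd/sd-journal.h>
#include <sys/uio.h>

#include <string>
#include <vector>

void SendToJournald(const std::string& message, int severity, int facility,
    const std::string& icinga2Facility)
{
    std::vector<std::string> fields {
        "MESSAGE=" + message,
        "PRIORITY=" + std::to_string(severity),        // RFC 5424 severity
        "SYSLOG_FACILITY=" + std::to_string(facility), // RFC 5424 facility
        "ICINGA2_FACILITY=" + icinga2Facility          // e.g. "ApiListener"
        // SYSLOG_IDENTIFIER is only added if configured; otherwise systemd
        // falls back to program_invocation_short_name ("icinga2").
    };

    std::vector<struct iovec> iov;
    iov.reserve(fields.size());
    for (const std::string& f : fields)
        iov.push_back({ const_cast<char*>(f.data()), f.size() });

    sd_journal_sendv(iov.data(), static_cast<int>(iov.size()));
}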
The upcoming JournaldLogger will need the same syslog validation and
conversion logic, so factor it out from SyslogLogger to make it
reusable.
Also explicitly include syslog.h, which defines the syslog()
function.
The loop iterated over the services of the wrong host, resulting in duplicate
downtimes being scheduled for services of the parent host instead of downtimes
for services of the child host.
IdoPgsqlConnection::Escape() internally uses PQescapeStringConn() and its
documentation states the following:
Furthermore, PQescapeStringConn does not generate the single quotes that must
surround PostgreSQL string literals; they should be provided in the SQL
command that the result is inserted into.
So it's intended to use the result in 'string' literals, not in E'string'
literals as Icinga did. This results in problems as the behavior of
PQescapeStringConn() depends on how the current connection will interpret
regular single quoted literals, namely on the value of the
standard_conforming_strings variable.
The E'string' literals were initially introduced in
ac6f3f8acf to fix #1206 where PostgreSQL started
warning about escape sequences in string literals not supported by the SQL
standard (but by PostgreSQL depending on the value of
standard_conforming_strings). In the meantime the oldest PostgreSQL version on
any platform supported by Icinga increased to 9.2 (CentOS 7) and starting with
9.1, standard_conforming_strings is enabled by default, so there will be no
warnings about escape sequences (as the warning is only issued if the escape
sequence is actually interpreted by PostgreSQL).
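To illustrate the difference, a hedged standalone libpq example (connection
string, table and column names are placeholders, not Icinga's actual queries):

#include <libpq-fe.h>

#include <iostream>
#include <string>

int main()
{
    PGconn* conn = PQconnectdb("dbname=icinga");
    if (PQstatus(conn) != CONNECTION_OK) {
        std::cerr << PQerrorMessage(conn);
        return 1;
    }

    std::string value = "it's a comment";
    std::string escaped(value.size() * 2 + 1, '\0');
    size_t len = PQescapeStringConn(conn, &escaped[0], value.c_str(), value.size(), nullptr);
    escaped.resize(len);

    // Correct: wrap the escaped value in a plain 'string' literal.
    std::string query = "INSERT INTO some_table (some_column) VALUES ('" + escaped + "')";
    // The old E'...' form made the meaning of the escaped string depend on
    // the server's standard_conforming_strings setting.

    PGresult* res = PQexec(conn, query.c_str());
    PQclear(res);
    PQfinish(conn);
}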
Icinga started the initial config sync right after the first Redis connection
was established. If any of the other connections was not yet established by the
time it was first needed, queries to it were discarded.
The loop in the connected callback of the parent connection uses m_Rcons which
previously was only initialized after that connection was already started.
This allows the callback to call RedisConnection::SetConnectedCallback() to set
another callback for this connection. This sets m_ConnectedCallback and thereby
destroys the std::function while it's running, resulting in undefined behavior.
By operating on a copy, m_ConnectedCallback can be set without affecting the
currently running callback.
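A minimal standalone sketch of the pattern (simplified class, not the actual
RedisConnection code):

#include <functional>
#include <utility>

class Connection
{
public:
    void SetConnectedCallback(std::function<void()> cb)
    {
        m_ConnectedCallback = std::move(cb);
    }

    void OnConnected()
    {
        // Invoke a copy: if the callback calls SetConnectedCallback(),
        // m_ConnectedCallback is replaced without destroying the
        // std::function that is currently executing.
        auto callback = m_ConnectedCallback;
        if (callback)
            callback();
    }

private:
    std::function<void()> m_ConnectedCallback;
};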
Icinga first sends SIGTERM to a check plugin on timeout to allow it to
terminate gracefully. However, this is not really part of the plugin API
specification, so we cannot assume that plugins handle it correctly and still
exit with an exit code that maps to UNKNOWN. Therefore, once Icinga decides to kill
a process, force its exit code to 128 to be sure the state will be UNKNOWN
after a timeout.
I.e. do the following in parallel (highest priority first):
* Stream state changes to icinga:runtime:state
* Sync config and initial state,
then let queued runtime updates to the just-synced state pass