Commit Graph

6270 Commits

Author SHA1 Message Date
Julian Brost 0ed9c09a1d
Merge pull request #9513 from Icinga/9501
Icinga DB: on every check result update state only 1x, not 3x in a row
2022-10-07 10:18:56 +02:00
Yonas Habteab 85c77bd878 IcingaDB: Cache generated object hash 2022-09-12 17:23:06 +02:00
Yonas Habteab 07e60c1961 ConfigObject: Introduce new `icingadb_identifier` attr 2022-09-12 17:22:57 +02:00
Yonas Habteab 28c29c1fbc Don't allow to change object parent,host/service_name at runtime 2022-09-09 18:26:28 +02:00
Alexander A. Klimov a6b36a2d7b Utility::ValidateUTF8(): move a string instead of copying a vector
less malloc() = more speed

Especially as JsonEncode() validates every single input string.
2022-09-09 10:50:42 +02:00
Alexander A. Klimov 22bfcf9ac5 icinga2 daemon: remove no-op SIGCHLD handling
1. Don't set a custom handler for SIGCHLD (in the umbrella process)
   as that handler doesn't actually handle SIGCHLD anymore
2. Don't reset the SIGCHLD handler (in the worker process)
   as there's nothing to reset anymore due to the above change
3. Don't block SIGCHLD across fork(2) as its handler doesn't change anymore
   due to the above changes
2022-09-07 12:12:09 +02:00
Alexander A. Klimov 3de714489c Remove unused UnixWorkerState::Failed 2022-09-07 12:08:33 +02:00
Alexander A. Klimov df9008bfc4 StartUnixWorker(): watch forked child via waitpid(), not SIGCHLD handler
Before:

On SIGCHLD from the forked worker the umbrella process sets a failure flag.
StartUnixWorker() recognises that and does waitpid(), failure message, etc..
On OpenBSD we can't tell the signal source, so we always set the failure flag.
That's not how our IPC shall work, that breaks the IPC sooner or later.

After:

No SIGCHLD handling and no failure flag setting.
Instead StartUnixWorker()'s wait loop uses waitpid(x,y,WNOHANG)
to avoid false positives while watching the forked worker.
2022-09-07 11:46:46 +02:00
Yonas Habteab 31785b48fd Expression: Decrease `frame.Depth` only when calling `IncreaseStackDepth()` succeeds
This ensures that `frame.Depth` is only decreased when preceding `frame.IncreaseStackDepth()` callee was successful.
This way, `frame.Depth` will have the same depth prior to and after evaluating a frame.
2022-09-07 09:41:16 +02:00
Alexander A. Klimov 5e9f95c007 Icinga DB: on every check result update state only 1x, not 3x in a row
Before (time: vertical, stack: horizontal):

* Checkable::ExecuteCheck
  * Checkable::UpdateNextCheck
    * IcingaDB::NextCheckChangedHandler
      * HSET icinga:host:state
      * HSET icinga:checksum:host:state
      * ZADD icinga:nextupdate:host
  * RandomCheckTask::ScriptFunc
    * Checkable::ProcessCheckResult
      * Checkable::UpdateNextCheck
        * IcingaDB::NextCheckChangedHandler
          * HSET icinga:host:state
          * HSET icinga:checksum:host:state
          * ZADD icinga:nextupdate:host
      * IcingaDB::NewCheckResultHandler
        * HSET icinga:host:state
        * HSET icinga:checksum:host:state
        * ZADD icinga:nextupdate:host
  * IcingaDB::StateChangeHandler
    * XADD icinga:runtime:state
    * IcingaDB::ForwardHistoryEntries
      * XADD icinga:history:stream:state

After:

* Checkable::ExecuteCheck
  * Checkable::UpdateNextCheck
  * RandomCheckTask::ScriptFunc
    * Checkable::ProcessCheckResult
      * Checkable::UpdateNextCheck
      * IcingaDB::NewCheckResultHandler
        * HSET icinga:host:state
        * HSET icinga:checksum:host:state
        * ZADD icinga:nextupdate:host
  * IcingaDB::StateChangeHandler
    * XADD icinga:runtime:state
    * IcingaDB::ForwardHistoryEntries
      * XADD icinga:history:stream:state

The first state + nextupdate (for overdue) update comes from next_check being
set to now + interval immediately before doing the actual check (not to trigger
it twice). This update is not only not important for the end user, but even
inappropriate. The end user SHALL see next_check being e.g. in -4s, not 5m, as
the check is running at the moment.

The second one is just redundant as IcingaDB::NewCheckResultHandler (the third
one) is called anyway and will update state + nextupdate as well.
2022-09-06 10:10:14 +02:00
Alexander A. Klimov 01bc7d4043 Application::Exit(): don't exit(), but _exit(), even in debug build mode
Case:

1. icinga2 api setup
2. icinga2 daemon -C -x debug

Before: Second commands crashes at exit.
After: No crash.

As the comment between the removed lines clearly says:
Our destructors haven't been built for static data.

This is build type independent.
2022-08-23 13:12:21 +02:00
Alexander A. Klimov 790bad9250 Fix compile error on Solaris 11.4
by not using LOG_FTP which is not defined there.
2022-08-16 12:07:05 +02:00
Julian Brost d3cca0e621
Merge pull request #9409 from Icinga/feature/disallow-empty-object-names
Disallow empty object names
2022-08-11 16:53:28 +02:00
Alexander A. Klimov a2362ebf17 IcingaDB::VersionChangedHandler(): don't handle not synced types
not to surprise (and crash) the Icinga DB daemon with unknown types.
2022-08-10 13:24:44 +02:00
Alexander A. Klimov 32871ca40c IcingaDB::SendCustomVarsChanged(): don't delete custom vars of not synced types
not to surprise (and crash) the Icinga DB daemon with unknown types.
2022-08-10 11:40:53 +02:00
Alexander A. Klimov c9d6eecc7f Dump state file atomically not to corrupt it
by using fsync(2) before close(2) and rename(2).
2022-07-28 18:00:37 +02:00
Alexander A. Klimov 600fb0e3c2 Introduce AtomicFile 2022-07-28 18:00:37 +02:00
Julian Brost 913566cbfa Windows: output useful error message for syscall errors 2022-07-28 17:00:57 +02:00
Yonas Habteab 148f5b8416 ConfigObject: Mark object names as required
This prevents a user from creating an object without a valid name such as `"", null`.
2022-07-15 15:51:58 +02:00
Julian Brost a927ba39b7 Windows: only include critical messages in early log messages
The point of logging to the Windows Event Log was to catch errors that happen
before the full logging configuration has been loaded and enabled. Messages
like the number of loaded objects per type just cause noise in the log and
provide little benefit. Therefore raise the required log level at this stage.

Note that this commit removes the (never documented) ability to use the -x flag
to change the level. But doing so would require patching the command line of
the service in the registry anyways.
2022-07-14 14:07:56 +02:00
Julian Brost bd2118c4cd
Merge pull request #9420 from Icinga/IcingaDB-soft_state
Icinga DB: icinga:*:state: rename state to soft_state
2022-06-29 12:24:52 +02:00
Alexander A. Klimov ba9a5c614c Icinga DB: icinga:*:state: rename state to soft_state 2022-06-29 11:49:06 +02:00
Julian Brost 9b24056e05
Merge pull request #9346 from Icinga/icingadb-check
Introduce Icinga DB check (like the IDO one)
2022-06-28 18:24:29 +02:00
Julian Brost 3222fab05a Icinga DB Check: don't check runtime update backlog during full sync 2022-06-28 13:33:00 +02:00
Julian Brost 4f125753bf Icinga DB Check: ignore suppressed queries in Redis backlog check
If some kind of query is not supposed to be processed at the moment, there is
little point in checking it. During a full dump, state updates are suppressed
(i.e. delayed), so when a dump takes very long, this would have resulted in a
false Redis backlog warning.
2022-06-28 13:33:00 +02:00
Julian Brost 5550fb713c Icinga DB Check: include ongoing dumps in OK message
Also use the "current" and "full dump/sync" terminology in the other messages.
2022-06-28 13:33:00 +02:00
Julian Brost 3ded7a9268 Icinga DB Check: rename dump/sync related perfdata values
Scope all values using current/last instead of takes/took.
2022-06-28 13:33:00 +02:00
Julian Brost e36bc92a2c Icinga DB Check: add unit hints to all rates 2022-06-28 13:33:00 +02:00
Julian Brost eaae7d5863 Icinga DB Check: update not connected message
The check makes no attempt to explicitly connect to Redis, it uses the
connection of the IcingaDB feature, so this message better describes the state
in this situation.
2022-06-28 13:33:00 +02:00
Julian Brost 2fafffb85f Icinga DB Check: fix race-condition with IcingaDB::Start()
IcingaDB::GetConnection() uses IcingaDB::m_Rcon which is only initialized in
IcingaDB::Start(), therefore add a nullptr check to the check command.
Additionally, as m_Rcon is potentially accessed concurrently, add a copy of the
value that is safe for concurrent use.
2022-06-28 13:33:00 +02:00
Julian Brost 953e113465 Icinga DB Check: remove markdown headings from output
icingadb-web shows multiple lines from the check output collapsed into a single
line. The lines containing just minuses make this look cluttered and making
making it a heading provides little to no benefit. Even when rendering markdown
in the check output at some point, having the lists labeled using normal
paragraphs would look just fine.
2022-06-28 13:33:00 +02:00
Julian Brost c59d44cd8b Icinga DB Check: rename perfdata values
- Add icinga2_ and icingadb_ prefixes to make clear which component is
  responsible for the value.
- Rename heartbeat_lag to heartbeat_age, describes it better in my opinion and
  sound a bit less like something that should be as close to zero as possible.
- Rename redis_dump/database_sync into full_dump/full_sync as this is how these
  operations are refered to in log messages as well.
- Rename Redis backlog into Redis query backlog, makes it a bit clearer in my
  opinion.
- Rename runtime_backlog into runtime_update_backlog, as the component in
  Icinga DB is called that way and this naming is also exposed in log messages.
- Rename dump_config/state/history into config/state/history_dump, makes it
  sound more natural.
2022-06-28 13:33:00 +02:00
Julian Brost d0382f71ab Icinga DB Check: rename variables from takes to duration
Sounds more natural in my opinion and I doubt that many users would get that
due to the difference between takes/took, this refers to ongoing dumps.
2022-06-28 13:33:00 +02:00
Julian Brost 3c29b15214 Icinga DB Check: use more natural names for sync/cleanup metrics 2022-06-28 13:33:00 +02:00
Julian Brost d70a27b982 Icinga DB Check: report history and runtime update backlog separately
Probably makes little difference for an end-user, but for support and
development it's great to know which of the two is causing problems.
2022-06-28 13:33:00 +02:00
Julian Brost 2a4605f4b7 Icinga DB Check: clearly state Icinga 2 Redis backlog
Should make it easier to understand that this refers to Redis queries issued by
Icinga 2.
2022-06-28 13:33:00 +02:00
Julian Brost 5613412b81 Icinga DB Check: replace nested calls to fmax() with std::max()
Improves readability, even more so after splitting it into separate lines.
2022-06-28 13:33:00 +02:00
Julian Brost f3f1373f83 Icinga DB Check: spell out "error" in perfdata 2022-06-28 13:33:00 +02:00
Julian Brost 31c7dfee53 Icinga DB Check: fix error message on Redis query error
Not only XREAD queries are performed, so the previous error message was incorrect.
2022-06-28 13:33:00 +02:00
Julian Brost 4f1f70f843 Icinga DB Check: remove unused includes 2022-06-28 13:33:00 +02:00
Julian Brost 2b310718e3 Icinga DB Check: rename keys in heartbeat stream
In both C++ and Go, the keys are only used as constant strings, so namespacing
them just adds clutter for the `general:*` keys, therefore remove it.
2022-06-28 13:33:00 +02:00
Julian Brost d74fbbbb82 Icinga DB Check: remove *_1sec metrics
They add no additional information compared to the *_1min values as it's always
the same value divided by 60 anyways. Adding the actual value from the last
second makes little sense for realistic values of check_interval.
2022-06-28 13:33:00 +02:00
Julian Brost 44cbd04088 Icinga DB Check: read performance data string from Redis
Use the already existing format to pass performance data to Icinga 2 rather
than some new JSON structure. Has the additional benefit of doing more things
in Go than in C++.
2022-06-28 13:33:00 +02:00
Yonas Habteab 0ffef02c1d IcingaDB: Adjust some column names according to the DB schema 2022-06-23 14:27:34 +02:00
Alexander A. Klimov e4a36bc217 Introduce Icinga DB check (like the IDO one) 2022-06-23 11:14:31 +02:00
Alexander A. Klimov 88c8d29ee6 Remove Icinga DB perfdata from Icinga check
as the Icinga DB check already yields it.
2022-06-22 13:25:29 +02:00
Julian Brost 6b4681ee9e Icinga DB: make error message more helpful if API isn't set up 2022-06-20 14:57:19 +02:00
Alexander A. Klimov 8eef51afeb Introduce IcingaDB::AddKvsToMap() 2022-06-20 13:47:39 +02:00
Alexander A. Klimov 2c3d2f8b87 RedisConnection::ReadRESP(): *-1\r\n is null, not [ ] 2022-06-20 13:47:39 +02:00
Alexander Aleksandrovič Klimov 4522522444
Merge pull request #9362 from Icinga/bugfix/remove-redundant-serialization
Remove redundant call to Serialize() in ConfigItem::Commit()
2022-06-15 09:34:38 +02:00
Julian Brost ad218c9a12 Icinga DB: initialize environment ID during config validation
IcingaDB may receive callbacks from Boost signals before being fully started.
This resulted in situations where m_EnvironmentId was used before it was
initialized properly. This is fixed by initializing it earlier (during the
config validation stage). However, at this stage, it should not yet write to
disk, therefore, persisting the environment ID to disk is delayed until later
in the startup process.

Initializing at this stage has an extra benefit: if there is an error for some
reason (possibly corrupt icingadb.env file), this now shows up as a nice error
during config validation.

Additionally, this replaces the use of std::call_once with std::mutex due to
bug in libstdc++ (see inline comment for reference).
2022-06-10 14:19:58 +02:00
Yonas Habteab 45f536ca06 Bump Redis schema version to 5 2022-06-07 12:55:12 +02:00
Yonas Habteab 92becec37f IcingaDB: Add `_name` suffix to columns referring to name 2022-05-31 16:41:40 +02:00
Eric Lippmann 18c8b4ad54
Merge pull request #9371 from Icinga/bugfix/icingadb-command-arguments-null
IcingaDB: handle null (Empty) for value/set_if/separator in command arguments
2022-05-23 16:01:49 +02:00
Julian Brost 3220fecd4c
Merge pull request #7919 from Icinga/feature/parameter-delimiters-check-execution-6277
Introduce Command#arguments[].separator
2022-05-23 13:23:36 +02:00
Julian Brost f110e26635 IcingaDB: handle null (Empty) for value/set_if/separator in command arguments
Icinga 2 treats null (Empty) as if the corresponding attribute is not
specified. However, without this commit, it would serialize the value as "null"
(i.e. type string), so that it ends up in the database as this string instead
of NULL. This commit adds handling for ValueEmpty so that is serialized as JSON
null value and ends up in the database as NULL.
2022-05-23 11:53:41 +02:00
Alexander A. Klimov 069c3968d9 Introduce Command#arguments[].sep
... for letting check commands produce argv like --key=value,
not just --key value.

refs #6277
2022-05-11 17:50:12 +02:00
Julian Brost 4184dcd62c
Merge pull request #9354 from WuerthPhoenix/feature/return-correct-status-in-process-check-result-api
Return correct status codes in process-check-result API
2022-05-05 15:30:09 +02:00
Julian Brost abe2dfa763 Replace EventuallyAtomic with AtomicOrLocked which falls back to a mutex
Apparently there was a reason for making the members of generated classes
atomic. However, this was only done for some types, others were still accessed
using non-atomic operations. For members of type T::Ptr (i.e.  intrusive_ptr<T>),
this can result in a double free when multiple threads access the same variable
and at least one of them writes to the variable.

This commit makes use of std::atomic<T> for more T (it removes the additional
constraint sizeof(T) <= sizeof(void*)) and uses a type including a mutex for
load and store operations as a fallback.
2022-05-03 12:02:46 +02:00
Julian Brost 2dcdae4470 Remove redundant call to Serialize() in ConfigItem::Commit()
The very same object is already serialized a few lines above, the result is
even stored in a variable, but that variable was not used before. Simply using
this variable results in a noticeable improvement of config validation times.
2022-04-28 17:09:16 +02:00
Damiano Chini 9d9810b44d Return correct status codes in process-check-result API 2022-04-26 13:33:59 +02:00
Julian Brost 51cd7e7b0b Take host state into account when sending suppressed notifications
Checkable::FireSuppressedNotifications() compares the time of the current
checkable with the last recovery time of parents to avoid notification right
after a parent recovered and before the current checkable was checked.

This commit makes this check also include to host if the checkable is a
service.  This makes the behavior consistent with the documentation that states
there is an implicit dependency on the host (which isn't realized as implicitly
generating a Dependency object unfortunately).
2022-04-19 16:13:15 +02:00
Julian Brost 178aaaeca9
Merge pull request #9332 from Icinga/bugfix/compare-cluster-tickets-in-constant-time
Compare cluster tickets in constant time
2022-04-11 15:32:32 +02:00
Julian Brost b24a2fa2a5
Merge pull request #9179 from Icinga/Al2Klimov-patch-3
Let new cluster certificates expire after 397 days, not 15 years
2022-04-11 15:29:05 +02:00
Julian Brost 0e880048ee
Merge pull request #7961 from Icinga/bugfix/startup-log
Place startup.log and status in /var/lib/icinga2/api, not /var/lib/icinga2/api/zones-stage
2022-04-11 14:41:07 +02:00
Alexander A. Klimov b15763bd86 Compare cluster tickets in constant time
Just to be sure.
2022-04-11 11:17:05 +02:00
Alexander A. Klimov 08a23f4035 Write also /var/lib/icinga2/api/zones-stage-startup-last-failed.log
in addition to /var/lib/icinga2/api/zones-stage-startup.log
to prevent the next success to overwrite the last failure.
2022-04-11 11:14:42 +02:00
Alexander A. Klimov c9e4c016e0 Protect ApiListener#m_SSLContext with a mutex 2022-04-11 11:02:45 +02:00
Alexander A. Klimov e490883577 Renew certificates also periodically 2022-04-11 11:02:39 +02:00
Alexander Aleksandrovič Klimov 39d642af75
Merge pull request #9321 from Icinga/perfdata-resume-signal
Perfdata writers: disconnect handlers from signals in Pause()
2022-04-07 15:51:02 +02:00
Alexander A. Klimov ce6d1b8961 Place startup.log and status in /var/lib/icinga2/api, not /var/lib/icinga2/api/zones-stage
not to loose them.
2022-04-07 11:24:24 +02:00
Alexander Aleksandrovič Klimov b29b95e882
Merge pull request #9267 from Icinga/bugfix/parallel-api-package-calls-do-not-finish-while-reload
Worker process doesn't let parallel API package stage updates to complete when terminated
2022-04-06 13:27:44 +02:00
Alexander A. Klimov 56933b8877 Perfdata writers: disconnect handlers from signals in Pause()
as they would be re-connected in Resume() (HA).

Before they were still connected during pause and connected X+1 times
after X split-brains (the same data was written X+1 times).
2022-04-06 13:09:26 +02:00
Alexander A. Klimov 3753f86c80 ApiListener#Start(): auto-renew own cert if CA owner
otherwise that particular cert would expire.
2022-04-04 12:13:31 +02:00
Alexander A. Klimov 6d470a3ca5 Introduce ApiListener#RenewCert() 2022-04-04 12:12:31 +02:00
Alexander Aleksandrovič Klimov f749c7556e
Merge pull request #9314 from Icinga/latin1
IDO MySQL: reason latin1 charset for actually UTF-8 bytes
2022-04-04 11:05:12 +02:00
Alexander A. Klimov 11b8d0f058 IDO MySQL: reason latin1 charset for actually UTF-8 bytes 2022-03-31 18:10:21 +02:00
Alexander Aleksandrovič Klimov 2fa26961ac
Merge pull request #9311 from Icinga/9308
IDO MySQL: explicitly use latin1
2022-03-31 16:44:11 +02:00
Alexander A. Klimov 245fbad1e5 IDO MySQL: explicitly use latin1
for the case the MySQL client lib is compiled with another default
not to turn Unicode chars into ??.
2022-03-31 15:04:45 +02:00
Yonas Habteab 6193a911bf ConfigStagesHandler: Don't allow concurrent package updates anymore
To prevent Icinga2 from being restarted while
one or more requests are still in progress and end up
as corrupted stages without status file and startup logs.
2022-03-30 09:42:22 +02:00
Yonas Habteab 362adcab1a ConfigPackageUtility: Don't reset ongoing package updates on config validation success and process is going to be reloaded 2022-03-30 09:42:22 +02:00
Yonas Habteab 575af4c980 Defer: Allow to cancel the callback before going out of scope 2022-03-30 09:42:22 +02:00
Alexander A. Klimov 9be2eb8e5e Introduce IsCertUptodate() 2022-03-29 16:47:23 +02:00
Alexander A. Klimov 5f2e021390 Request certificate renewal also master2->master1
not only sat->master to prevent master2's certificate from expiring.
2022-03-29 16:47:23 +02:00
Alexander A. Klimov e06b631f3a Let new cluster certificates expire after 397 days, not 15 years
https://cabforum.org/wp-content/uploads/CA-Browser-Forum-BR-1.7.3.pdf, section 6.3.2:

"Subscriber Certificates issued on or after 1 September 2020 SHOULD NOT have a Validity Period greater than 397 days and MUST NOT have a Validity Period greater than 398 days."
2022-03-29 16:47:23 +02:00
Alexander Aleksandrovič Klimov d171301b9d
Merge pull request #9298 from Icinga/bugfix/icingadb-remove-comment-history
Icinga DB: discard comment removals with missing information
2022-03-29 11:25:01 +02:00
Alexander Aleksandrovič Klimov bbc2b59b0d
Merge pull request #9287 from Icinga/9275
Icinga DB: correct ack comments' is_sticky
2022-03-28 22:42:52 +02:00
Julian Brost d139bc31c8 Icinga DB: discard comment removals with missing information
If comments get removed in unintended ways (i.e. not by expiring or by using
the remove-comment API action), the comment object misses information to create
a proper history event for Icinga DB. Therefor, discard these events.
2022-03-28 16:58:05 +02:00
Alexander A. Klimov 1220ad8a2f Icinga DB: correct ack comments' is_sticky
On ack Icinga first adds a comment, then acks the checkable
so the ack event has the comment ID.

But due to the yet missing ack the comment was missing is_sticky.
That's corrected now.
2022-03-24 16:42:18 +01:00
Alexander A. Klimov 4399e82d9d Introduce Comment#sticky
Carries whether ack was sticky for ack comments.
2022-03-24 16:42:18 +01:00
Julian Brost ba154d2a38
Merge pull request #7929 from Icinga/bugfix/override-default-template-apply-rules-7914
Apply rules: import default templates first
2022-03-23 11:30:51 +01:00
Julian Brost cfa6f1c6a9
Merge pull request #9288 from Icinga/9272
IcingaDB#SendRemovedComment(): ignore ack comments like #SendAddedComment()
2022-03-22 15:06:06 +01:00
Alexander A. Klimov 27966c3c08 IcingaDB#SendRemovedComment(): ignore ack comments like #SendAddedComment()
Icinga DB doesn't expect comment history for ack comments.

Before:

1. Acked checkable recovers
2. Icinga clears ack comments w/o setting removal time
3. Icinga DB gets neither removal time, nor expire time
4. Icinga DB falls back to NULL and violates NOT NULL constraint
2022-03-21 17:06:35 +01:00
Julian Brost 9630e86997 Add missing array locking in IcingaDB::GetArrayDeletedValues()
icinga::Array requires locking by the caller when iterating using Begin() and
End(). This is only checked in debug builds but there it makes this function
fail.
2022-03-09 14:29:44 +01:00
Julian Brost bf5b905707
Merge pull request #9250 from Icinga/feature/fix-compiler-warning-do-not-move-local-variables
Fix compiler warnings don't move local variables
2022-03-08 11:37:09 +01:00
Julian Brost 90848f602b Checkable: Add test for state notifications after a suppression ends 2022-03-03 14:25:23 +01:00
Julian Brost cbc0b21b86 Checkable: sync state_before_suppression in cluster
This ensures that in case of a failover in an HA zone, the other can take over
properly and has the required state to send the proper notifications.
2022-03-03 14:25:23 +01:00
Julian Brost 39cee3538a Checkable: improve state notifications after suppression ends
This commit changes the Checkable notification suppression logic (notifications
are currently suppressed on the Checkable if it is unreachable, in a downtime,
or acknowledged) to that after the suppression reason ends, a state
notification is sent if and only if the first hard state after is different
from the last hard state from before. If the checkable is in a soft state after
the suppression ends, the notification is further suppressed until a hard state
is reached.

To achieve this behavior, a new attribute state_before_suppression is added to
Checkable. This attribute is set to the last hard state the first time either a
PROBLEM or a RECOVERY notification is suppressed. Compared to from before,
neither of these two flags in the suppressed_notification will ever be cleared
while the supression is still ongoing but only after the suppression ended and
the current state is compared with the old state stored in
state_before_suppression.
2022-03-03 14:25:23 +01:00
Alexander A. Klimov 6b5106ffdd IcingaDB#Stop(): don't block shutdown, timeout instead 2022-03-02 16:39:44 +01:00
Alexander A. Klimov 3a8efcb4ea IcingaDB#Send*(): don't enqueue any history once stopped 2022-03-02 16:39:44 +01:00
Alexander A. Klimov cac22fe38b RedisConnection#Connect(): wait for all promises to be completed
by the read loop from the previous connection.
2022-03-02 16:39:44 +01:00
Alexander A. Klimov 9585a63fa0 Introduce IoEngine::YieldCurrentCoroutine() 2022-03-02 16:39:44 +01:00
Alexander A. Klimov 732d5c472d RedisConnection#ReadLoop(): don't crash (silently) if a promise to be set is already set 2022-03-02 16:39:37 +01:00
Alexander A. Klimov 50fee6aeb9 Icinga DB: include amount of history kept in memory in /v1/status 2022-03-02 16:39:37 +01:00
Alexander A. Klimov ad0fe764f7 Icinga DB: log amount of history kept in memory every 10s 2022-03-02 16:39:37 +01:00
Alexander A. Klimov 8ea62f7fc7 Icinga DB: keep history in memory until written to Redis
by putting the messages into a Bulker and retrying each chunk.
2022-03-02 16:39:37 +01:00
Alexander A. Klimov 9a8d388734 Introduce Bulker 2022-03-02 16:39:37 +01:00
Alexander Aleksandrovič Klimov 3fee562e7a
Merge pull request #9256 from Icinga/bugfix/add-some-missing-locks
Add some missing locks to prevent data races
2022-03-01 16:12:50 +01:00
Julian Brost 9d3eba8383
Merge pull request #9259 from Icinga/bugfix/event-handler-spamming-8704
Checkable#ExecuteEventHandler(): don't outsource event command run twice
2022-02-25 16:51:31 +01:00
Yonas Habteab f00a3c9693 ConfigObject: Initialize local static var at declaration to ensure thread safety 2022-02-25 15:23:49 +01:00
Yonas Habteab fb21345bfd ConfigItem: Use atomic variables for notified and commited items count 2022-02-25 15:17:33 +01:00
Alexander A. Klimov 74935dad7b Checkable#ExecuteEventHandler(): don't outsource event command run twice
refs #8704
2022-02-24 14:03:57 +01:00
Yonas Habteab a0607aceff Fix compiler warnings don't move local variables 2022-02-22 17:51:43 +01:00
Julian Brost 5383df3c79
Merge pull request #9212 from Icinga/bugfix/multi-ido-notification-id
IDO: fix incorrect contacts in notification history with multiple IDO instances on a single node
2022-02-21 11:40:46 +01:00
Julian Brost 8e81faf3e0
Merge pull request #9221 from Icinga/bugfix/processcheckresult-dependency-deadlock
Prevent deadlock in ProcessCheckResult
2022-02-18 14:14:46 +01:00
Julian Brost 99008755b5
Merge pull request #9213 from Icinga/feature/icingadb-add-previous_soft_state-to-host_state-and-service_state-9210
IcingaDB: Add previous_soft_state to host_state and service_state
2022-02-18 14:09:35 +01:00
Julian Brost 3bb9cdb8cc Prevent deadlock in ProcessCheckResult
Without this commit, children and parents of a checkable were rescheduled on a
state change while holding the lock for the current checkable. If both ends of
a dependency are checked at the same time and both change state, they could end
up in a deadlock waiting for each other.

This commit fixes this problem by changing the code so that other checkables
are rescheduled only after releasing the lock for the current checkable.
2022-02-17 16:13:25 +01:00
Alexander A. Klimov c613e62454 IcingaDB: Add previous_soft_state to host_state and service_state
refs #9210
2022-02-14 11:32:46 +01:00
Julian Brost 7c9d0fff01 IDO: use per-instance notification_id in history
When there are multiple active IDO instances on the same node, before this
commit, all of them would share a single DbValue object for the notification_id
column of the icinga_contactnotifications table. This resulted in the issue
that one database references the notification_id in another database.

This commit fixes this by using a separate DbValue value for each IDO instance.
This needs a new signal as the existing OnQuery and OnMultipleQueries signals
perform the same queries on all IDO instances, but different queries are needed
here per instance (they only differ in the referenced DbValue). Therefore, a
new signal OnMakeQueries is added that takes a std::function which is called
once per IDO instance and can access callbacks to perform one or multiple
queries only on this specific IDO instance.
2022-02-10 16:36:35 +01:00
Julian Brost 1b0ad099f1
Merge pull request #9154 from Icinga/bugfix/icingadb-reachabilitychangehandler-9143
Icinga DB: ensure is_reachable and severity don't miss updates
2022-02-03 14:53:51 +01:00
Alexander A. Klimov 2ef3dd6a38 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged less often
Call it only on state changes to reduce no-op Redis/IDO updates a lot.

refs #9143
2022-02-03 11:12:53 +01:00
Alexander Aleksandrovič Klimov ff712f6b23
Service#GetSeverity(): behave as the respective IDO query of Icinga Web
which doesn't include host reachability.
2022-01-27 12:21:06 +01:00
Alexander A. Klimov 4c38715ef2 Checkable#ProcessCheckResult(): call Checkable::OnReachabilityChanged last
to ensure Checkable#IsReachable() returns correctly for dependency children inside OnReachabilityChanged().
That needs the dependency parent to be already in the correct state.

refs #9143
2022-01-25 13:33:46 +01:00
Alexander A. Klimov 84d09876b4 Icinga DB: ensure is_reachable and severity don't miss updates
refs #9143
2022-01-25 13:33:46 +01:00
Julian Brost 185fab3761
Merge pull request #9144 from Icinga/bugfix/icingadb-state-history
Icinga DB: don't write state history for ack/downtime/host problem changes
2022-01-20 12:00:24 +01:00
Julian Brost 6390911262
Merge pull request #9123 from Icinga/bugfix/icinga2-crashes-when-sending-notifications-8186
Avoid "type" key in dicts being part of object state attrs
2022-01-19 11:48:40 +01:00
Julian Brost 463b159414
Merge pull request #9171 from Icinga/bugfix/icinga-db-notification-history-might-use-incorrect-previous_hard_state-9132
IcingaDB#SendSentNotification(): make stream deterministic via CheckResult#previous_hard_state
2022-01-18 16:54:16 +01:00
Julian Brost 31da6a56e6 Icinga DB: remove obsolete StateChangeHandler overload
This version of StateChangeHandler is no longer called anywhere as it was the
wrong function for all previous callers anyways.
2022-01-18 12:26:43 +01:00
Julian Brost cf73c6136b Icinga DB: make host problem change events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-01-18 12:26:43 +01:00
Julian Brost 855e342b63 Icinga DB: make acknowledgement events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-01-18 12:26:43 +01:00
Julian Brost f63268b0dd Icinga DB: make downtime events update the state tables but not write state history
StateChangeHandler() is the function used when the actual hard/soft state
changes and thus also writes state history. This is not desired in this case,
instead, a runtime update should be generated, therefore call UpdateState()
instead.

refs #9063
2022-01-18 12:26:43 +01:00
Julian Brost 447884be72 Icinga DB: don't reimplement volatile state update in SendConfigUpdate
Sending a volatile state update is already implemented in UpdateState, so just
use that function instead of generating the update queries.
2022-01-18 12:26:43 +01:00
Julian Brost a6d6cb788e Icinga DB: Merge SendStatusUpdate into UpdateState
Previously, both funktions did related operations but had unclear and confusing
naming:
- UpdateState updated the icinga:{host,service}:state Redis keys.
- SendStatusUpdate sent a runtime update for the icinga:{host,service}:state.

This commit merges both functions into one with a new mode parameter. The
following modes are now supported:
- Volatile: Update the icinga:{host,service}:state Redis key.
- Full: Perform the volatile state update and in addition send a corresponding
  runtime update so that this state update gets written through to the
  persistent database by a running icingadb process.
- RuntimeOnly: Special mode for callers that can ensure that a volatile update
  for the current state was already performed but has to be upgraded to a full
  update.

refs #9063
2022-01-18 12:26:43 +01:00
Alexander A. Klimov 1fee3f1b12 IcingaDB#SendSentNotification(): make stream deterministic via CheckResult#previous_hard_state
Now it gets everything from one source, the CheckResult.

refs #9132
2022-01-10 19:18:11 +01:00
Julian Brost 3d04b04172
Merge pull request #9138 from Icinga/bugfix/mysql-schema-versions
Make MySQL schema version in full schema file and upgrade files consistent
2022-01-10 09:54:38 +01:00
Julian Brost e518dc2436
Merge pull request #9112 from Icinga/bugfix/sync-missing-history-information
Icinga DB: ensure consistent history streams in HA setup
2022-01-07 15:14:06 +01:00
Julian Brost a99c04030c
Merge pull request #9150 from Icinga/bugfix/icingadb-cmd-arg-order-int
Icinga DB: ensure icinga:*command:argument#order is an int
2022-01-05 16:07:30 +01:00
Julian Brost 3e73a262cc Sync comment and downtime removal info for Icinga DB history
When a comment or downtime is removed manually, the name of the requestor and
timestamp have to be synced to other nodes in the cluster to allow all of them
to generate a consistent Icinga DB history stream.

refs #9101
2022-01-05 10:27:13 +01:00
Alexander Aleksandrovič Klimov 1b50d912a0
Merge pull request #9137 from Icinga/bugfix/influxdb-writer-synchronization
Fix unsafe concurrent access to m_DataBuffer in InfluxdbCommonWriter
2022-01-04 17:37:28 +01:00
Alexander A. Klimov e9e555468d Handle "type" key in dicts being part of object state attrs
i.e. the confusion of the state file deserializator with e.g. `"type":32` on startup.
That would unexpectedly restore (the now ignored) null (not `{"type":32}`) as there's no type "32".

refs #8186
2022-01-04 17:17:20 +01:00
Alexander Aleksandrovič Klimov 80663cf5e6
Merge pull request #9048 from Icinga/bugfix/timeperiod-dst-2.0
LegacyTimePeriod::ScriptFunc: fix DST edge-cases
2022-01-03 18:11:32 +01:00
Alexander A. Klimov a8c9d19dae Icinga DB: ensure icinga:*command:argument#order is an int
The config parser requires *Command#arguments#order to be a Number, i.e. 42,
4.2 or even "4.2". That's int-casted where needed, now also for Icinga DB.

Before:

```
object CheckCommand "9117" {
	command = [ "true" ]
	arguments = {
		"4.2" = { order = "4.2" }
	}
}
```

2022-01-03T13:25:07.166+0100	FATAL	icingadb	json: cannot unmarshal string into Go value of type int64
2022-01-03 13:28:19 +01:00
Julian Brost 33781496da InfluxdbCommonWriter: use atomic_size_t to data buffer size from stats function
m_DataBuffer may be modified concurrently while StatsFunc() is called, thus
it's unsafe to call size() on it. As write access to m_DataBuffer is already
synchronized by only modifying it from the single work queue thread, instead of
adding a mutex, this commit adds a new std::atomic_size_t which is additionally
updated when modifying m_DataBuffer and can safely be accessed in StatsFunc().
2022-01-03 12:24:26 +01:00
Julian Brost e6300aacf9 InfluxdbCommonWriter: only flush from work queue
There is no explicit synchronization of access to m_DataBuffer which is fine if
it is only accessed from the single-threaded work queue. However, Stop() also
called Flush() in another thread, leading to concurrent write access to
m_DataBuffer which can result in a crash due to use after free/double free.

Changes in this commit:
* Flush() is renamed to FlushWQ() to show that it should only be called from
  the work queue. Additionally, it now asserts that it is running on the work
  queue.
* Visibility of some data members is changed from protected to private. No
  other classes have to access these at the moment. By this change, accidental
  concurrent access from derived classes in the future is prevented.
* Stop() now flushes by posting FlushWQ() to the work queue and joining it.
2022-01-03 12:24:26 +01:00
Julian Brost 23693248d4 Make MySQL schema version in full schema file and upgrade files consistent
In the 2.12.6 release, the full schema file sets the version to 1.14.3, whereas
the latest available upgrade file 2.11.0.sql sets it to 1.15.0. Therefore, ship
a new upgrade file 2.12.7.sql for all users who imported their schema with
version 2.11.0 or later and never performed an upgrade since then. Their
databases incorrectly state schema version 1.14.3 and is bumped to the correct
version 1.15.0 by the upgrade.

In the 2.13.2 release, the full schema file sets the version to 1.15.0, whereas
the latest available upgrade file 2.13.0.sql sets it to 1.15.1. Therefore,
rename the incorrectly named upgrade file 2.13.1.sql (it was not shipped in
this or any other release so far) to 2.13.3.sql for users who imported their
schema with version 2.13.0 or later and never performed an upgrade since then.
Their databases incorrectly state schema version 1.15.0 and are bumped to the
correct version 1.15.1 by the upgrade.

The full schema is not touched by this commit as for the current branch, this
was already fixed by 815533b334.
2021-12-16 15:48:12 +01:00
Julian Brost 13ea635188 Don't trigger a fixed downtime like a flexible one
When creating a fixed downtime that starts immediately while the checkable is
in a non-OK state, previously the code path for flexible downtimes was used to
trigger this downtime. This is fixed by this commit which resolves two issued:

1. Missing downtime start notification: notifications work differently for
   fixed and flexible downtimes. This resulted in missing downtime start
   notifications under the conditions described above.
2. Incorrect downtime trigger time: this code path would incorrectly assume the
   timestamp of the last checkable as the trigger time which is incorrect for
   fixed downtimes.
2021-12-14 11:02:40 +01:00
Alexander A. Klimov eb71fb7529 Avoid "type" key in dicts being part of object state attrs
not to confuse the state file deserializator with e.g. `"type":32` on startup.
That would unexpectedly restore null (not `{"type":32}`) as there's no type "32".

refs #8186
2021-12-13 17:56:12 +01:00
Julian Brost c71029f2e8 Set downtime trigger time deterministically
When triggering a downtime, the time of the causing event is now passed on as
the trigger time. That time is:

* For fixed downtimes: the later one of start and entry time.
* If a check result triggers the downtime: The execution end of the check
  result.
* If another downtime triggers the downtime: The trigger time of the first
  downtime.

This is done so two nodes in a HA setup can write consistent Icinga DB downtime
history streams.

refs #9101
2021-12-08 14:15:50 +01:00
Alexander Aleksandrovič Klimov 577cf94b59
Merge pull request #8956 from Icinga/Al2Klimov-patch-3
Fix IDO MySQL schema version
2021-12-07 15:31:00 +01:00
Alexander Aleksandrovič Klimov 31c564182a
Merge pull request #8990 from Icinga/bugfix/downtime-all-services-on-child-hosts
Fix scheduling of downtimes for all services on child hosts
2021-12-07 12:48:01 +01:00
Julian Brost 596fcdc123 Downtime::DowntimesExpireTimerHandler: don't copy vector
`ConfigType::GetObjectsByType<Downtime>()` already returns a
`std::vector<Downtime::Ptr>` so there is no point in copying it into another
vector of the same type just to then iterate the copied vector instead of the
original one.
2021-12-01 13:05:23 +01:00
Yonas Habteab 361807f7a9
Adjust incosistent pki log messages (#8965) 2021-11-22 16:06:55 +01:00
Julian Brost d09925189a
Merge pull request #9037 from Icinga/Al2Klimov-patch-4
InfluxdbCommonWriter#Flush(): fix log message
2021-11-19 17:09:05 +01:00
Julian Brost 2ad0a4b8c3 Add missing include to fix non-unity builds
This commit fixes the following build error:

    [ 55%] Building CXX object lib/icinga/CMakeFiles/icinga.dir/usergroup.cpp.o
    lib/icinga/usergroup.cpp:79:24: error: incomplete type ‘icinga::Notification’ used in nested name specifier
       79 | std::set<Notification::Ptr> UserGroup::GetNotifications() const
          |                        ^~~
2021-11-17 16:11:15 +01:00
Julian Brost a740b1d66c LegacyTimePeriod::ScriptFunc: fix DST edge-cases
This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
  full days, but ScriptFunc may have called them with some random timestamp
  during the day. This is fixed by always using midnight of the day as
  reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
  On days with DST changes, this could either still be on the same day (a day
  may have 25 hours) or skip an entire day (a day may have 23 hours). This is
  fixed by using a struct tm to advance the time to the next day.
2021-11-17 13:09:10 +01:00
Noah Hilverling 4d3b1709fd
Merge pull request #9009 from Icinga/bugfix/icingadb-runtime-updates-delete-relationships
Icinga DB: Make sure object relationships are handled correctly during runtime updates
2021-11-12 17:52:59 +01:00
Julian Brost b9e6273ba0 Icinga DB: only log queries at debug level 2021-11-12 15:41:17 +01:00
Noah Hilverling 7a0796061a IcingaDB::AddObjectDataToRuntimeUpdates(): Copy data before modifying 2021-11-12 13:34:57 +01:00
Noah Hilverling 10bde2075a Dictionary: Make sure underlaying map is ordered 2021-11-12 13:34:57 +01:00
Noah Hilverling 73e0d6e61b Icinga DB: Make sure object relationships are handled correctly 2021-11-12 13:34:57 +01:00
Noah Hilverling 4e79eb080c
Merge pull request #9058 from Icinga/bugfix/icingadb-prefix-command_id
IcingaDB: Prefix command_id with command type
2021-11-11 11:50:26 +01:00
Noah Hilverling c1098bef35
Merge pull request #9061 from Icinga/add-downtime-duration-and-service-state-host-id-streams
Icinga DB: Add `downtime.duration` & `service_state.host_id` to Redis
2021-11-11 10:19:47 +01:00
Noah Hilverling a9c2304c61 IcingaDB: Prefix command_id with command type 2021-11-09 12:26:30 +01:00
Eric Lippmann 35053ac1dd Icinga DB: Sync groups earlier
Host and service groups are structural information that are used
for Web filters and should therefore be synchronized as soon as
possible.
2021-11-09 11:17:01 +01:00
Alexander A. Klimov 07c8440fd2 Icinga DB: sync checkables along with their states first
`WorkQueue#ParallelFor(x, false, y)` will enqueue x's items in FIFO order,
so x has to start with host and service.
2021-11-09 11:17:01 +01:00
Yonas Habteab fe5aa1e18d Icinga DB: Add `service_state.host_id` to Redis 2021-11-09 11:08:22 +01:00
Yonas Habteab 5dc45baebb Icinga DB: Add `downtime.duration` & `scheduled_duration` to Redis 2021-11-09 11:08:22 +01:00
Julian Brost 848f1ae167
Merge pull request #8998 from Icinga/bugfix/icingadb-program-start-milliseconds
Icinga DB: set value in milliseconds for program_start in stats/heartbeat
2021-11-08 18:18:19 +01:00
Julian Brost 524fe92a1d
Merge pull request #9028 from Icinga/bugfix/icingadb-zone-parent
IcingaDB: actually write parent to parent_id of zones
2021-11-08 18:08:48 +01:00
Julian Brost e46d83b6be Icinga DB: set value in milliseconds for program_start in stats/heartbeat 2021-11-08 14:37:08 +01:00
Noah Hilverling 0b9317a5bf IcingaDB: Remove GetObjectIdentifiersWithoutEnv()
Having the command type be a part of the command ID isn't needed anywhere. Removing this simplifies the way we generate IDs in general, because we don't need Prepend() anymore.

The command type was only needed to prevent ID collisions within the command_envvar and command_argument tables. Those tables have since been separated into {check,event,notification}command_envvar and {check,event,notification}command_argument tables.
2021-11-05 17:01:40 +01:00
Julian Brost 3c8672b4dc Icinga DB: increase Redis schema version
PR #9036 introduces some incompatible changes to the Redis schema, most
importantly where Icinga DB has to read the environment from: now it has to use
a new top-level key of the icinga:stats message instead of a value in the
IcingaApplication part of that message.
2021-11-05 14:14:37 +01:00
Julian Brost 6007848146 IcingaDB: export environment_id via API
Primarily required for Icinga DB integration tests at the moment, but could
also be helpful in other situations.
2021-11-05 14:14:37 +01:00
Julian Brost 4ade4c757b IcingaDB: write new environment to icinga:stats stream 2021-11-05 14:14:37 +01:00
Julian Brost 525dd50859 IcingaDB: introduce a new environment ID derived from the CA public key
In order to avoid changes to the environment ID, it is now no longer derived
from the Environment constant but instead from the public key of the CA
certificate. This ensures that it is different between clusters by default, so
no additional changes have to be done to allow two clusters to use Icinga DB to
write into the same database.

To prevent the ID from changing when the CA certificate is replaced, it is also
persisted into the file /var/lib/icinga2/icingadb.env, so if that file exists,
it takes precedence over the CA certificate.
2021-11-05 14:14:37 +01:00
Julian Brost 6cd3a483a0 tlsutility: move hex encoding into a separate function BinaryToHex 2021-11-05 14:14:37 +01:00
Julian Brost f976e351f4
Merge pull request #9044 from Icinga/bugfix/idb-dump-buf-lost
Icinga DB init. dump: flush both buffered states and state checksums
2021-11-04 12:26:28 +01:00
Alexander A. Klimov 0ff7d0a06e Icinga DB: raise icinga:schema 1 -> 2 2021-11-02 15:00:55 +01:00
Alexander A. Klimov b1714a10c2 Icinga DB: make icinga:history:stream:*#event_id deterministic
... i.e. UUID -> SHA1(env, eventType, x...) given that SHA1(env, x...) = type-specific ID.
Rationale: allow both masters to write the same history concurrently (while not
in split-brain), so that REPLACE INTO deduplicates the same events written twice.

* ack: SHA1(env, "ack_set"|"ack_clear", checkable.name, setTime)
* comment: SHA1(env, "comment_add"|"comment_remove", comment.name)
* downtime: SHA1(env, "downtime_start"|"downtime_end", downtime.name)
* flapping: SHA1(env, "flapping_start"|"flapping_end", checkable.name, startTime)
* notification: SHA1(env, "notification", notification.name, notificationType, sendTime)
* state: SHA1(env, "state_change", checkable.name, changeTime)
2021-11-02 15:00:03 +01:00
Alexander A. Klimov 5c44365c4e Icinga DB: make icinga:history:stream:notification#id deterministic
... i.e. UUID -> SHA1(x..., send time) given that SHA1(x...) = notification id.
Rationale: allow both masters to write the same notification history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
2021-11-02 15:00:03 +01:00
Alexander A. Klimov c2422c56fe Icinga DB: make icinga:history:stream:state#id deterministic
... i.e. UUID -> SHA1(x..., check time) given that SHA1(x...) = checkable id.
Rationale: allow both masters to write the same state history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
2021-11-02 15:00:03 +01:00
Alexander Aleksandrovič Klimov f5f8ccb1f4
Merge pull request #9020 from Icinga/feature/icingaeb-schema-version
Icinga DB: publish Redis schema version via XADD icinga:schema
2021-10-25 13:21:37 +02:00
Alexander A. Klimov d8b4768471 Icinga DB init. dump: flush both buffered states and state checksums
not to dump x states, but only x - (x % bulk) state checksums.
2021-10-21 13:49:24 +02:00
Noah Hilverling a7cbf50674
Merge pull request #9030 from Icinga/Al2Klimov-patch-1
Icinga DB: don't include checkable types in history IDs
2021-10-19 14:52:43 +02:00
Alexander A. Klimov 4b0688047e Icinga DB: stream runtime state updates only to icinga:runtime:state
... where they belong to, not to icinga:runtime.
2021-10-18 18:11:30 +02:00
Alexander Aleksandrovič Klimov 99c5c24a17
InfluxdbCommonWriter#Flush(): fix log message
s/InfluxdbWriter/Influxdb2Writer/

fixes #9035
2021-10-14 12:03:45 +02:00
Alexander Aleksandrovič Klimov e0339c387b
Icinga DB: don't include checkable types in history IDs
... as they’re unnecessary for being distinguish across types.
Services always have a ! in the name, hosts never do.
2021-10-11 16:14:30 +02:00
Alexander Aleksandrovič Klimov 30a5ba3961
Merge pull request #9002 from Icinga/feature/icingadb-remove-usernotification-stream
Icinga DB: remove usernotification history stream
2021-10-08 19:16:26 +02:00
Alexander Aleksandrovič Klimov 4190d58668
Merge pull request #9011 from Icinga/bugfix/icingadb-remove-zone-parent-key
Icinga DB: Remove unused Redis key 'icinga:zone:parent'
2021-10-08 17:19:51 +02:00
Alexander Aleksandrovič Klimov ff60c1af37
Merge pull request #8895 from Icinga/bugfix/typo-8766
Fix typo
2021-10-08 17:19:20 +02:00
Noah Hilverling 750e64b974 Icinga DB: Remove unused Redis key 'icinga:zone:parent' 2021-10-08 12:06:14 +02:00
Julian Brost df84a498f4 IcingaDB: actually write parent to parent_id of zones
This fixes that the code used the wrong variable. Previously, it was written to
Redis that each zone is its own parent (if it has a parent at all).
2021-10-08 11:15:54 +02:00
Alexander A. Klimov 3bf180a341 Fix typo
refs #8766
2021-10-08 10:27:35 +02:00
Alexander Aleksandrovič Klimov ed50a9d529
Merge pull request #9001 from Icinga/feature/icingadb-add-user-ids-to-notification-history
Icinga DB: Write IDs of notified users into notification history stream
2021-10-01 17:42:48 +02:00
Alexander Aleksandrovič Klimov 63fca8faa1
Merge pull request #9000 from haxtibal/feature/journaldlogger
JournaldLogger - log to systemd journal
2021-10-01 17:42:10 +02:00
Alexander A. Klimov 0182d793ac Icinga DB: publish Redis schema version via XADD icinga:schema
... to be able both to subscribe for its change and to just fetch it.
2021-10-01 15:58:57 +02:00
Alexander Aleksandrovič Klimov 6cf0673c11
Merge pull request #9010 from Icinga/feature/icingadb-scheduling_source
Make CheckResult#scheduling_source available to Icinga DB
2021-09-27 16:31:16 +02:00
Tobias Deiminger eb8f67335e Define SD_JOURNAL_SUPPRESS_LOCATION more locally
add_definitions would set SD_JOURNAL_SUPPRESS_LOCATION for all targets
in directory and sub-directories. However, another future target might
want the opposite, so define it as local as possible to journaldlogger.cpp.

To make this work, we must take journaldlogger.cpp out of the unity
build, because all files from a unity of share compiler definitions.
2021-09-23 16:08:39 +02:00
Tobias Deiminger 173caa42aa Add a JournaldLogger
As proposed in #8857, this adds a Logger subclass that writes structured
log messages via journald's native protocol by calling sd_journal_sendv.
The feature therefore depends on the systemd library. sd_journal_sendv is
available since the early days (systemd v38), so a version check is
probably superflous.

We add the following fields to each record:
- MESSAGE: The log message
- PRIORITY (aka severity): Numeric severity as in RFC5424 section 6.2.1
- SYSLOG_FACILITY: Numeric facility as in RFC5424 section 6.2.1
- SYSLOG_IDENTIFIER: If provided, use value from configuration.
  Else use systemd's default behaior, which is to determine the field
  by using libc's program_invocation_short_name, resulting in "icinga2".
- ICINGA2_FACILITY: Facility as in Log::Log(..., String facility, ...),
  e.g. "ApiListener"
- some more fields are added automatically by systemd

Fields are stored indexed, so we can do fast queries for certain field
values. Example:

$ journalctl -t icinga2 ICINGA2_FACILITY=ApiListener -n 5

Syslog compatiblity is ratained because good old tag, severity and facility
is stored along, and systemd can forward to syslog daemons.

See also https://systemd.io/JOURNAL_NATIVE_PROTOCOL/.
2021-09-23 16:08:11 +02:00
Alexander A. Klimov 755fc72a66 Make CheckResult#scheduling_source available to Icinga DB 2021-09-22 16:57:49 +02:00
Julian Brost 6fc15449a8
Merge pull request #8953 from Icinga/bugfix/icinga-checksum-state-growing
Icinga DB: clean up vanished objects from icinga:checksum:*:state
2021-09-17 12:04:41 +02:00
Julian Brost 130b22e939 Icinga DB: remove usernotification history stream
These will be added to the normal notification stream so there is no more need
for this extra stream.
2021-09-15 14:47:25 +02:00
Julian Brost 81e5feeb08 Icinga DB: Write IDs of notified users into notification history stream 2021-09-15 14:45:35 +02:00
Tobias Deiminger de7808e32c Make syslog facility handling reusable
The upcoming JournaldLogger will need the same syslog validation and
conversion logic, so factor it out from SyslogLogger to make it
reusable.

Also explicitely include syslog.h, which defines the syslog()
function.
2021-09-15 10:15:22 +02:00
Julian Brost bb0dcdf0b4 Prevent duplicate donwtimes when combining child_options and all_services 2021-09-03 15:44:01 +02:00
Julian Brost e556d3c489 Fix scheduling of downtimes for all services on child hosts
The loop iterated over the services of the wrong host resulting in duplicate
downtimes scheduled for services of the parent host instead of downtimes for
services of the child host.
2021-09-03 15:19:27 +02:00
Noah Hilverling 95cdc00ad4
Merge pull request from GHSA-cxfm-8j5v-5qr2
Add TLS server certificate validation to ElasticsearchWriter, GelfWriter, InfluxdbWriter and Influxdb2Writer (v2)
2021-08-19 13:52:29 +02:00
Alexander Aleksandrovič Klimov dfc633074e
Merge pull request #8966 from Icinga/feature/scheduled_by
Icinga DB: introduce icinga:history:stream:downtime#scheduled_by
2021-08-16 16:37:08 +02:00
Julian Brost cb09d6833f RedisConnection: remove now redundant setting of TLS verification parameters
This is now done in UnbufferedAsioTlsStream.
2021-08-13 17:24:24 +02:00
Julian Brost 3ab347bfd4 GelfWriter: show error message of exceptions 2021-08-13 17:24:24 +02:00
Julian Brost 8f3f692ecf InfluxdbCommonWriter: actually verify TLS server certificates
And add a new option ssl_insecure_noverify to explicitly disable it if desired.
2021-08-13 17:24:24 +02:00
Julian Brost 29e9df938c GelfWriter: actually verify TLS server certificates
And add a new option insecure_noverify to explicitly disable it if desired.
2021-08-13 17:24:24 +02:00
Julian Brost 5cada85e54 ElasticsearchWriter: actually verify TLS server certificates
And add a new option insecure_noverify to explicitly disable it if desired.
2021-08-13 17:24:24 +02:00
Julian Brost 396f003c69 Enable hostname verification in UnbufferedAsioTlsStream 2021-08-13 10:58:10 +02:00
Alexander A. Klimov 70b4558a62 Icinga DB: introduce icinga:history:stream:downtime#scheduled_by
... with the Downtime#scheduled_by attribute.
2021-08-09 20:07:38 +02:00
Alexander Aleksandrovič Klimov 852d674ec0
Merge pull request #8957 from Icinga/bugfix/apilistener-detect-ipv6-support
ApiListener: Choose bind host default based on OS IPv6 support
2021-08-09 17:32:40 +02:00
Julian Brost ec73b417f2 ApiListener: Choose bind host default based on OS IPv6 support 2021-08-06 12:19:08 +02:00
Alexander Aleksandrovič Klimov 1fbc15bebc
Provide IDO MySQL schema version fix
... for fresh 2.13 installations which didn't make use of previous schema upgrades.
2021-08-05 13:46:07 +02:00
Alexander Aleksandrovič Klimov 815533b334
Fix IDO MySQL schema version
... to match the latest upgrade script.
2021-08-05 13:17:36 +02:00
Julian Brost 782669f13b IDO PgSQL: always use regular string literals
IdoPgsqlConnection::Escape() internally uses PQescapeStringConn() and its
documentation states the following:

  Furthermore, PQescapeStringConn does not generate the single quotes that must
  surround PostgreSQL string literals; they should be provided in the SQL
  command that the result is inserted into.

So it's intended to use the result in 'string' literals, not in E'string'
literals as Icinga did. This results in problems as the behavior of
PQescapeStringConn() depends on how the current connection will interpret
regular single quoted literals, namely on the value of the
standard_conforming_strings variable.

The E'string' literals were initially introduced in
ac6f3f8acf to fix #1206 where PostgreSQL started
warning about escape sequences in string literals not supported by the SQL
standard (but by PostgreSQL depending on the value of
standard_conforming_strings). In the meantime the oldest PostgreSQL version on
any platform supported by Icinga increased to 9.2 (CentOS 7) and starting with
9.1, standard_conforming_strings is enabled by default, so there will be no
warnings about escape sequences (as the warning is only issued if the escape
sequence is actually interpreted by PostgreSQL).
2021-08-05 11:39:32 +02:00
Alexander A. Klimov e3a5d613aa Icinga DB: clean up vanished objects from icinga:checksum:*:state
... not to let it grow non-stop.
2021-08-05 11:32:47 +02:00
Alexander Aleksandrovič Klimov 3aa2289c59
Merge pull request #8946 from Icinga/bugfix/old-packages
ConfigPackageUtility::ValidatePackageName(): always tolerate already existing packages
2021-08-02 20:27:27 +02:00
Alexander A. Klimov 57df803e35 ConfigPackageUtility::ValidatePackageName(): always tolerate already existing packages
... not to require migrating invalid ones.
2021-08-02 15:40:14 +02:00
Alexander A. Klimov c1df4b70f5 ConfigPackageUtility::PackageExists(): accept invalid package names, too 2021-08-02 15:40:14 +02:00
Alexander A. Klimov c666f81361 De-couple package and stage name validation 2021-08-02 15:40:14 +02:00
Alexander Aleksandrovič Klimov 40c186515b
Merge pull request #8942 from Icinga/bugfix/idb-hashes
Icinga DB: keep state checksums consistent
2021-07-29 21:54:58 +02:00
Julian Brost 6fa44c8e4e
Merge pull request #8941 from Icinga/bugfix/icingadb-init-all-connections-before-sync
Icinga DB: ensure all connections are ready on first use
2021-07-29 17:33:29 +02:00
Alexander Aleksandrovič Klimov afca6c001e
Merge pull request #8916 from Icinga/feature/icingadb-last_comment_id
Icinga DB: introduce Checkable#last_comment_id
2021-07-29 17:29:51 +02:00
Alexander A. Klimov 8476627e91 Icinga DB: keep state checksums consistent
I.e. make hashes in hashmaps and stream the same.
2021-07-29 12:43:40 +02:00
Alexander A. Klimov 5c10fffa3b Icinga DB: introduce Checkable#last_comment_id 2021-07-29 12:22:12 +02:00
Alexander A. Klimov 173a93c487 Split IcingaDB#SendStatusUpdate(), separate stream and history 2021-07-29 12:22:12 +02:00
Alexander A. Klimov 2818245e01 Introduce Checkable#GetLastComment() 2021-07-29 12:10:42 +02:00
Alexander Aleksandrovič Klimov 5923950e61
Merge pull request #8919 from Icinga/bugfix/idb-del-state-chksm
Icinga DB: HDEL also icinga:checksum:*:state, not only icinga:*:state
2021-07-29 11:08:33 +02:00
Julian Brost 929ebd0f6c IcingaDB: start initial sync after all child connections are established
Icinga started the initial config sync right after the first Redis connection
was established. If any other connections would take longer to connect than
when it's first needed, queries were discarded.
2021-07-28 15:27:32 +02:00
Julian Brost a50120c399 IcingaDB: start parent connection after children are initialized
The loop in the connected callback of the parent connection uses m_Rcons which
previously was only initialized after that connection was already started.
2021-07-28 15:27:20 +02:00
Julian Brost f4b2bbc7af
Merge pull request #8940 from Icinga/bugfix/redisconnection-reset-callback
RedisConnection: copy callback before calling it
2021-07-28 15:19:26 +02:00
Julian Brost 9d5ae0f6fa
Merge pull request #8899 from Icinga/feature/icingadb-connect_timeout
Introduce IcingaDB#connect_timeout
2021-07-28 13:52:00 +02:00
Julian Brost cc8d3fbedd
Merge pull request #8937 from Icinga/bugfix/timeout-always-unknown
Override exit code on process timeout
2021-07-28 11:56:42 +02:00
Julian Brost 4c7199fd7d RedisConnection: copy callback before calling it
This allows the callback to call RedisConnection::SetConnectedCallback() to set
another callback for this connection. This sets m_ConnectedCallback and thereby
destroys the std::function while it's running resulting in undefined behavior.
By operating on a copy, m_ConnectedCallback can be set without affecting the
currently running callback.
2021-07-28 11:34:17 +02:00
Noah Hilverling ff2abaa687
Merge pull request #8917 from Icinga/bugfix/idb-state-del-prio
Icinga DB: HDEL from *:state with same prio as HSET
2021-07-28 11:08:10 +02:00
Alexander A. Klimov 0919df5aa1 Introduce IcingaDB#connect_timeout 2021-07-27 21:59:09 +02:00
Alexander A. Klimov 504fdda76c Introduce DEFAULT_CONNECT_TIMEOUT 2021-07-27 21:57:02 +02:00
Alexander Aleksandrovič Klimov 9169c805a8
Merge pull request #8933 from Icinga/bugfix/icinga-db-only-start-multiple-redis-connections-after-the-first-one-succeeded-8920
Icinga DB: only start multiple Redis connections after the first one succeeded
2021-07-27 21:52:21 +02:00
Alexander Aleksandrovič Klimov 4d2f694805
Merge pull request #8897 from Icinga/feature/icingadb-pass-db
RedisConnection: AUTH and SELECT
2021-07-27 21:51:46 +02:00
Alexander Aleksandrovič Klimov 2d75bbd8ed
Merge pull request #8915 from Icinga/bugfix/icingadb-prio-state
Icinga DB: priorize state > config
2021-07-27 20:22:26 +02:00
Noah Hilverling dcb5fcc7ba
Merge pull request #8923 from Icinga/bugfix/idb-del-icinga-nextupdate-
Icinga DB: DEL icinga:nextupdate:* along with the other keys to delete
2021-07-27 19:05:43 +02:00
Noah Hilverling 07cb6cd1cb
Merge pull request #8930 from Icinga/bugfix/wq-balance
WorkQueue#ParallelFor(): optionally don't pre-glue items together to chunks of different size
2021-07-27 19:05:26 +02:00
Julian Brost 42eb055c5f
Merge pull request #8921 from Icinga/bugfix/timeperiod-dst
TimePeriod/ScheduledDowntime: improve DST handling
2021-07-27 18:11:34 +02:00
Noah Hilverling 07145d2e61
Merge pull request #8913 from Icinga/feature/remove-child-downtimes
API Action "remove-downtime": Also remove child downtimes
2021-07-27 18:02:15 +02:00
Julian Brost a55939e462 Override exit code on process timeout
As Icinga first sends a SIGTERM to a check plugin on timeout to allow it to
terminate gracefully, this is not really part of the plugin API specification
and we cannot assume that plugins will handle this correctly and still exit
with an exit code that maps to UNKNOWN. Therefore, once Icinga decides to kill
a process, force its exit code to 128 to be sure the state will be UNKNOWN
after a timeout.
2021-07-27 17:57:19 +02:00