Commit Graph

5676 Commits

Author SHA1 Message Date
Edgar Fuß 3c050fcc46 Drop passive check results for unreachable hosts/services
Disregard passive check results while no active checks are being scheduled due to violated dependencies.

This copes with the fact that programs feeding passive check results into Icinga may have no notion of reachability and so drive a checkable into a HARD state even though dependencies have caused active check scheduling to be suspended. This may prevent superfluous problem notifications from being emitted during recovery.

As disable_checks defaults to false, it was regarded OK (by @Al2Klimov) to make this behaviour (which resembles the active check case) unconditional and not conditionalize it on an additional attribute.

In the description of disable_checks, note that a value of true both disables scheduling of active checks and drops passive check results.
2021-01-19 20:08:38 +01:00
Julian Brost 174f7f75a8 IcingaDB: wait for queries to be executed in initial sync
This delays the log message stating that the initial dump is done until
all queries are actually done and now logs a meaningful duration. In
addition, this delays the return of the function and therefore when
state variables are updated by the caller.
2021-01-19 17:14:42 +01:00
Alexander Aleksandrovič Klimov 3976f256a8
Merge pull request #8593 from Icinga/bugfix/activate-items-assertion
Remove incorrect assertion in ConfigItem::ActivateItems
2021-01-19 13:52:16 +01:00
Alexander Aleksandrovič Klimov cbd0d6ea6e
Merge pull request #8588 from Icinga/bugfix/concurrent-schedule-downtime-delete-host
Fix null pointer dereferences when deleting objects while scheduling downtimes
2021-01-19 13:51:08 +01:00
Julian Brost 509db4ab94 Delay start of IcingaDB until most config objects are activated
This commit sets the activation priority of IcingaDB objects to 100 (the
same value as IDO uses) so that it gets activated after most regular
config objects (hosts, services, ...).

Before (note how Icinga 2 continues to activate objects for over a minute
after IcingaDB is started and thinks the initial dump is done):

    [2021-01-19 08:33:19 +0000] information/IcingaDB: 'icingadb' started.
    [2021-01-19 08:34:02 +0000] information/IcingaDB: Initial config/status dump finished in 28.247 seconds.
    [2021-01-19 08:35:49 +0000] information/ConfigItem: Activated all objects.

After (now activation of objects is done right after IcingaDB is
started, as it's one of the last objects to be activated):

    [2021-01-19 08:39:01 +0000] information/IcingaDB: 'icingadb' started.
    [2021-01-19 08:39:02 +0000] information/ConfigItem: Activated all objects.
    [2021-01-19 08:39:38 +0000] information/IcingaDB: Initial config/status dump finished in 21.6606 seconds.
2021-01-19 09:45:47 +01:00
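The effect of the activation priority described in commit 509db4ab94 can be illustrated with a minimal sketch. `ObjectType` and `ActivationOrder` are hypothetical names, and the default priority of 0 for regular types is an assumption; the commit only states that a value of 100 makes IcingaDB activate after most regular config objects.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a config object type; not an Icinga 2 class.
struct ObjectType {
    std::string name;
    int activationPriority; // assumption: 0 for regular types, higher values activate later
};

// Returns the type names in the order in which they would be activated.
std::vector<std::string> ActivationOrder(std::vector<ObjectType> types)
{
    // stable_sort keeps the input order among types with equal priority.
    std::stable_sort(types.begin(), types.end(),
        [](const ObjectType& a, const ObjectType& b) {
            return a.activationPriority < b.activationPriority;
        });

    std::vector<std::string> order;
    for (const ObjectType& type : types)
        order.push_back(type.name);

    return order;
}
```

With the types IcingaDB (100), Host (0) and Service (0) as input, the sketch places IcingaDB last, which matches the "After" log excerpt.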
Yonas Habteab 5b0bbd6351 IcingaDB: Check whether or not cr is nullptr 2021-01-18 11:38:31 +01:00
Julian Brost e727675aaf Remove incorrect assertion in ConfigItem::ActivateItems
There is an assertion that after activating items, all these items are
active, which sounds reasonable at first. However, with concurrent API
queries, some of these could already be deleted and therefore be
deactivated again.
2021-01-15 16:40:07 +01:00
Julian Brost 88e5744d54 AddDowntime: return Downtime::Ptr instead of String containing the name
At numerous places in the code, something like this is performed:

    String name = Downtime::AddDowntime(...);
    Downtime::Ptr downtime = Downtime::GetByName(name);

However, `downtime` can be a `nullptr` after this as it is possible that
the downtime is deleted in between.

This commit changes the return type of `Downtime::AddDowntime` to return
a Downtime::Ptr instead of the full name of the downtime. `AddDowntime`
performs the very same `GetByName()` operation internally, but handles
the `nullptr` case correctly and throws an exception.
2021-01-15 16:34:48 +01:00
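The race described in commit 88e5744d54 can be sketched as follows. `DowntimeRegistry` and its methods are hypothetical, and `std::shared_ptr` stands in for `Downtime::Ptr`; this is an illustration of the idea, not the Icinga 2 implementation.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration; not the Icinga 2 classes.
struct Downtime {
    std::string name;
};

class DowntimeRegistry {
public:
    // Old style: only the name is returned, so the caller must look it up again.
    std::string AddByName(const std::string& name)
    {
        auto downtime = std::make_shared<Downtime>();
        downtime->name = name;
        items_[name] = downtime;
        return name;
    }

    // New style: the object itself is returned; a failed lookup throws
    // instead of handing a null pointer to the caller.
    std::shared_ptr<Downtime> Add(const std::string& name)
    {
        AddByName(name);
        std::shared_ptr<Downtime> found = GetByName(name);
        if (!found)
            throw std::runtime_error("downtime '" + name + "' was deleted during creation");
        return found;
    }

    std::shared_ptr<Downtime> GetByName(const std::string& name) const
    {
        auto it = items_.find(name);
        if (it == items_.end())
            return nullptr;
        return it->second;
    }

    void Remove(const std::string& name) { items_.erase(name); }

private:
    std::map<std::string, std::shared_ptr<Downtime>> items_;
};
```

A caller of `Add` holds a valid pointer even if the entry is removed from the registry afterwards, whereas a caller of `AddByName` followed by `GetByName` has to handle a null result.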
Alexander Aleksandrovič Klimov 986bedd9a0
Merge pull request #8594 from Icinga/feature/remove-upq-from-activate-items
Remove upq from ConfigItem::ActivateItems
2021-01-15 12:09:57 +01:00
Alexander Aleksandrovič Klimov 4063e39d5f
Merge pull request #8515 from Icinga/feature/update-ssl-context-after-accepting-new-connection-8501
API: Update the ssl context after each accepting incoming connection
2021-01-15 11:21:36 +01:00
Yonas Habteab d27f533e5f ApiListener: Update the ssl cont after each accepting incoming connection 2021-01-14 18:40:20 +01:00
Yonas Habteab 057254695d Utility: Introduce new helper function Utility::GetFileCreationTime() 2021-01-14 18:39:14 +01:00
Alexander Aleksandrovič Klimov d82e4987bd
Merge pull request #8427 from Icinga/bugfix/fork-exit
StartUnixWorker(): don't exit() on fork() failure
2021-01-14 17:49:41 +01:00
Alexander Aleksandrovič Klimov c549a6657e
Merge pull request #8562 from Icinga/bugfix/fix-no-renotification-for-non-ok-state-changes-8545
Fix no re-notification for non OK state changes with time delay
2021-01-14 17:49:29 +01:00
Alexander Aleksandrovič Klimov 70b438a2bf
Merge pull request #8104 from Icinga/bugfix/remove-downtime-returns-wrong-status-7408
API: Display a correct status when removing a downtime
2021-01-14 17:49:00 +01:00
Alexander A. Klimov 931b9307ae StartUnixWorker(): don't exit() on fork() failure
... but let the caller handle the failure.

This avoids stopping completely just because fork() failed during a reload.
2021-01-14 13:40:18 +01:00
Alexander A. Klimov e1bc4d474f On check_timeout first send SIGTERM
... to allow check plugins to terminate gracefully.

refs #6162
2021-01-14 12:00:11 +01:00
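The "SIGTERM first" approach from commit e1bc4d474f can be sketched with plain POSIX calls. `TerminateGracefully` and the grace period parameter are hypothetical; the commit does not state how long Icinga 2 waits or how it follows up, so the SIGKILL fallback here is an assumption.

```cpp
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends SIGTERM to a child process and waits up to `grace` for it to end.
// If it is still running after that, SIGKILL is sent (assumed fallback).
// Returns the terminating signal, 0 on a normal exit, or -1 on error.
int TerminateGracefully(pid_t pid, std::chrono::milliseconds grace)
{
    if (kill(pid, SIGTERM) != 0)
        return -1;

    auto deadline = std::chrono::steady_clock::now() + grace;
    int status = 0;

    for (;;) {
        pid_t result = waitpid(pid, &status, WNOHANG);
        if (result == pid)
            break; // child has ended
        if (result < 0)
            return -1;

        if (std::chrono::steady_clock::now() >= deadline) {
            // Grace period is over: force termination and reap the child.
            kill(pid, SIGKILL);
            if (waitpid(pid, &status, 0) != pid)
                return -1;
            break;
        }

        usleep(10 * 1000); // poll every 10 ms
    }

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A plugin that handles SIGTERM gets the chance to clean up and exit within the grace period; one that ignores it is still stopped by SIGKILL.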
Yonas Habteab 997ad86225 Fix no re-notification for non OK state changes with time delay 2021-01-14 11:54:25 +01:00
Alexander Aleksandrovič Klimov 5efe3e662c
Merge pull request #8025 from Icinga/bugfix/downtime-for-host-service-with-long-name-8022
ConfigObjectUtility::GetObjectConfigPath(): hash names of not already existing objects
2021-01-14 10:42:04 +01:00
Julian Brost db30704d14
Merge pull request #8532 from Icinga/bugfix/do-not-override-error-codes-that-are-not-200
HTTP: Do not override status codes that are not 200
2021-01-14 09:34:04 +01:00
Yonas Habteab 066db5ef60 HTTP: Don't override status codes that are not OK 2021-01-13 18:56:56 +01:00
Julian Brost f12666c166
Merge pull request #8157 from Icinga/bugfix/temporary-files-5124
Clean up temp files
2021-01-13 15:45:29 +01:00
Julian Brost 0c6abc817b Remove upq from ConfigItem::ActivateItems
Since commit d9010c7b9f, ActivateItems no
longer uses the WorkQueue upq to perform tasks but instead performs
these locally. One instance of `upq.Join()`/`upq.HasExceptions()`
remained in the function, but I believe this was just missed when
removing the `upq.Enqueue()` call just before.

This commit removes the corresponding parameter and updates all call
sites accordingly.
2021-01-13 15:19:55 +01:00
Alexander Aleksandrovič Klimov 5a0118c6d8
Merge pull request #8442 from Icinga/bugfix/close-ebadf-8437
Close FDs based on /proc/self/fd
2021-01-13 10:59:15 +01:00
Alexander A. Klimov 68a0079c26 ConfigObjectUtility::GetObjectConfigPath(): hash names of not already existing objects
... to avoid too long file names.

refs #8022
2021-01-12 18:03:22 +01:00
Alexander A. Klimov 450b2117d2 Add ".tmp" to state and modified attributes temp files
refs #5124
2021-01-12 17:35:29 +01:00
Alexander A. Klimov 18c2dae941 Clean up temp files
refs #5124
2021-01-12 17:35:29 +01:00
Alexander A. Klimov 26c944125b Close FDs based on /proc/self/fd
... not to waste time with close(2)ing RLIMIT_NOFILE-3 non-existing FDs.

Newer kernel = higher RLIMIT_NOFILE = more time wasted

refs #8437
2021-01-12 17:32:28 +01:00
Julian Brost aea06a27dc Use reference-counted pointer in notification callback
`this` could be deleted after `Notification::BeginExecuteNotification`
exited and before `Notification::ExecuteNotificationHelper` finished.
This is fixed by constructing a `Notification::Ptr` and operating on that
one as it is properly reference-counted.
2021-01-12 17:19:29 +01:00
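The lifetime problem from commit aea06a27dc can be sketched with the standard library. `Notifier` is a hypothetical class and `std::shared_ptr` stands in for `Notification::Ptr`; the sketch only shows why capturing a reference-counted pointer keeps the object alive until the callback has run.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration; std::shared_ptr stands in for Notification::Ptr.
class Notifier : public std::enable_shared_from_this<Notifier> {
public:
    explicit Notifier(std::string name) : name_(std::move(name)) {}

    // Capturing raw `this` would dangle if the object were destroyed before
    // the callback runs. Capturing a shared_ptr extends the object's lifetime
    // until the callback itself is destroyed.
    std::function<std::string()> MakeCallback()
    {
        std::shared_ptr<Notifier> self = shared_from_this();
        return [self]() { return "notified by " + self->name_; };
    }

private:
    std::string name_;
};
```

The object must already be owned by a `std::shared_ptr` when `MakeCallback` is called, for example after creating it with `std::make_shared<Notifier>("mail")`.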
Alexander Aleksandrovič Klimov a6af5406f7
Merge pull request #8083 from Icinga/feature/Implement-new-API-events-7974
Implement new API event streams response
2021-01-12 12:26:05 +01:00
Yonas Habteab 756abbb2ff ApiEvents: Implement new API event streams response 2021-01-11 14:59:48 +01:00
Alexander Aleksandrovič Klimov d996d1e201
Merge pull request #8580 from bebehei/typo
Fix typo seemless -> seamless
2021-01-11 13:45:08 +01:00
Alexander Aleksandrovič Klimov 635a8c5d4c
Merge pull request #8088 from Icinga/bugfix/log-two-nodes-run-on-different-versions-8075
Display logmessage if two nodes run on different versions
2021-01-11 12:30:30 +01:00
Alexander Aleksandrovič Klimov 862add5f3f
Merge pull request #8512 from Icinga/bugfix/zombie-processes
Revert "icinga2 daemon: reap remaining child processes after reload"
2021-01-11 11:38:20 +01:00
Julian Brost 0276c0b052 Properly handle service downtime referencing a deleted host
Only two out of three cases were handled properly by the code: host
downtimes referencing a deleted host and service downtimes referencing a
deleted service worked fine. However, if a service downtime references a
deleted host, `Host::GetByName()` returns `nullptr` which isn't
accounted for. Use `Service::GetByNamePair()` instead as this performs a
check for the host being null internally.
2021-01-08 11:12:15 +01:00
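The missing case from commit 0276c0b052 can be sketched with a lookup that checks the host before touching its services. `Host`, `Service`, `HostMap` and `GetServiceByNamePair` are hypothetical stand-ins here, not the Icinga 2 types.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; not the Icinga 2 classes.
struct Service {
    std::string name;
};

struct Host {
    std::string name;
    std::map<std::string, std::shared_ptr<Service>> services;
};

using HostMap = std::map<std::string, std::shared_ptr<Host>>;

// Looks up a service by (host name, service name).
// Returns nullptr if either the host or the service no longer exists.
std::shared_ptr<Service> GetServiceByNamePair(const HostMap& hosts,
    const std::string& hostName, const std::string& serviceName)
{
    auto hostIt = hosts.find(hostName);
    if (hostIt == hosts.end() || !hostIt->second)
        return nullptr; // host was deleted: never dereference it

    const auto& services = hostIt->second->services;
    auto serviceIt = services.find(serviceName);
    if (serviceIt == services.end())
        return nullptr; // service was deleted

    return serviceIt->second;
}
```

Because the null check lives inside the lookup, all three cases (deleted host, deleted service, service on a deleted host) lead to the same `nullptr` result for the caller to handle.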
Benedikt Heine 8a455e8150 Fix typo seemless -> seamless 2020-12-25 23:27:08 +01:00
Julian Brost 00d8703aad
Merge pull request #7847 from Icinga/feature/log-trim-trailing-newlines-7828
Log: trim trailing newlines
2020-12-23 14:20:43 +01:00
Julian Brost eab07a7318 Provide a conversion function from icinga::String to boost::string_view
Boost.Beast changed the signature of
boost::beast::http::basic_fields::set in version 1.74 so that it no longer
allows passing an icinga::String instance as value. This adds a
conversion function so that it works again.
2020-12-22 16:27:38 +01:00
Julian Brost 339b37a985 Use content_length method for setting the Content-Length header
Boost.Beast changed the signature of the previously used generic `set`
method so that it no longer accepts integer types. However, there is
already a more specific method for setting the Content-Length header, so
use this one instead.
2020-12-22 16:27:38 +01:00
Alexander A. Klimov 4051bc9c8f ConfigObjectUtility#CreateObject(): check config objects for duplicates
... so as not to delete already existing objects while attempting to re-create them.

refs #7726
2020-12-16 16:45:22 +01:00
Yonas Habteab 8eb4f2e062 ApiListener: Display log message if two nodes run on different versions 2020-12-16 16:09:28 +01:00
Noah Hilverling f7e368564f
Merge pull request from GHSA-pcmr-2p2f-r7j6
Verify certificates against CRL before renewing them (2.13)
2020-12-15 12:30:19 +01:00
Alexander Aleksandrovič Klimov 6b04ef6e5d
Merge pull request #7871 from Icinga/feature/more-uoms-for-perfdata-7225
PerfdataValue: add UoMs
2020-12-14 18:42:49 +01:00
Alexander A. Klimov 8c6bfdcf54 Revert "icinga2 daemon: reap remaining child processes after reload"
This reverts commit 91265a5b0e
which isn't needed anymore as Icinga 2 isn't PID 1 anymore.
2020-12-14 13:38:35 +01:00
Alexander A. Klimov f04387a973 FireSuppressedNotifications(const Notification::Ptr&): don't send notifications while suppressed by checkable
... e.g. if a notification enters its time period (not suppressed anymore),
but its checkable has entered a downtime (suppressed).

refs #8509
2020-12-14 13:28:53 +01:00
Alexander A. Klimov 5547488cd5 Introduce Checkable#NotificationReasonSuppressed()
refs #8509
2020-12-14 13:27:58 +01:00
Alexander Aleksandrovič Klimov 915a3c3001
Merge pull request #8436 from Icinga/bugfix/children-recover-too-late
On recovery: re-check children
2020-12-11 15:41:31 +01:00
Alexander Aleksandrovič Klimov 366a97bf19
Merge pull request #8541 from Icinga/bugfix/openssl-error-buffer
Use proper buffer size for OpenSSL error messages
2020-12-09 16:08:19 +01:00
Julian Brost 4c0247c02d Allow specifying a CRL in `icinga2 pki verify` 2020-12-09 12:12:01 +01:00
Julian Brost e86bd24348 Verify certificates against CRL before renewing them
When a CRL is specified in the ApiListener configuration, Icinga 2 so far
only used it when connections were established, but not when a
certificate was requested. This allowed a node to automatically renew a
revoked certificate if it met the other conditions for auto-renewal
(issued before 2017 or expires in less than 30 days).
2020-12-09 12:10:59 +01:00
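The renewal rule described in commit e86bd24348 can be sketched as a single predicate. `CertInfo` and `MayAutoRenew` are hypothetical, the `revoked` flag stands in for an actual CRL check, and reading "issued before 2017" as "valid from before 2017-01-01 UTC" is an assumption.

```cpp
#include <ctime>

// Hypothetical summary of a certificate; `revoked` stands in for a CRL lookup.
struct CertInfo {
    std::time_t notBefore; // start of validity (assumed to be the issue time)
    std::time_t notAfter;  // end of validity
    bool revoked;
};

// Returns true if the certificate may be renewed automatically.
bool MayAutoRenew(const CertInfo& cert, std::time_t now)
{
    // The fix: a revoked certificate is never renewed, whatever else applies.
    if (cert.revoked)
        return false;

    const std::time_t startOf2017 = 1483228800; // 2017-01-01T00:00:00Z
    const std::time_t thirtyDays = 30 * 24 * 60 * 60;

    bool issuedBefore2017 = cert.notBefore < startOf2017;
    bool expiresSoon = cert.notAfter - now < thirtyDays;

    return issuedBefore2017 || expiresSoon;
}
```

Before the fix, the equivalent of the `revoked` check was missing on the renewal path, so a revoked certificate that was close to expiry would still have qualified.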
Julian Brost bbfd1ecfc8 Use ERR_error_string_n() instead of ERR_error_string()
Explicitly pass the actual length of the buffer to avoid overflows.
2020-12-08 13:08:18 +01:00
Julian Brost c0fc9a86c5 Increase size of buffer for OpenSSL error messages
According to man 3 ERR_error_string, "buf must be at least 256 bytes
long", therefore increase the buffer size to 256 everywhere.
2020-12-08 13:08:18 +01:00
Julian Brost 61d7ec4bf7 Remove std::string to_string(const errinfo_openssl_error& e)
The function was never used and its implementation contains a bug where
a buffer of too small a size is used as a parameter to ERR_error_string.
According to `man 3 ERR_error_string`, the buffer has to be at least
256 bytes in size.

Also, the function seems of limited use as it allows outputting the tag
object used with additional error information for exceptions in Boost.
However, boost::get_error_info<>() just returns the value type but
not the full tag object from the exception.
2020-12-08 13:05:38 +01:00
Yonas Habteab dd02e3b6d8 API: Display a correct status code when removing a scheduled downtime 2020-12-07 13:19:41 +01:00
Alexander A. Klimov b8bb8cb946 Configuration.ApiBindHost: default to ::
refs #8183
2020-12-04 16:52:58 +01:00
Julian Brost f2a532de32
Merge pull request #8035 from Icinga/feature/expiry-date-comments-4663
/v1/actions/add-comment: add param expiry
2020-12-04 15:48:50 +01:00
Alexander Aleksandrovič Klimov 6f33c2f90c
Merge pull request #8314 from Icinga/feature/add-support-influxdb-basic-auth-7644
Add support Influxdb basic auth
2020-12-03 11:00:04 +01:00
Yonas Habteab 2ade57bcbb Add support influxdb basic auth
fixes #7644
2020-12-02 16:48:03 +01:00
Alexander A. Klimov 854939a8ce On recovery: re-check children 2020-12-02 12:24:40 +01:00
Alexander A. Klimov 668bf06424 Don't fire suppressed notifications if last parent recovery >= last check result 2020-12-02 12:03:19 +01:00
Alexander Aleksandrovič Klimov bee4ac7f7c
Merge pull request #8040 from Icinga/feature/v1-actions-execute-command-8034
Add API endpoint: /v1/actions/execute-command
2020-12-02 10:53:24 +01:00
Alexander Aleksandrovič Klimov 3f4b09f01c
Merge pull request #8488 from Icinga/feature/improve-config-sync-locking
Improve config sync locking
2020-11-27 17:55:15 +01:00
Alexander Aleksandrovič Klimov 81ed8d5629
Merge pull request #8321 from Icinga/bugfix/cant-create-api-user-w-password-8164
Allow to create API User w/ password
2020-11-25 15:40:07 +01:00
Alexander A. Klimov 1343fd538d Start ApiListener#SyncClient() in the thread pool
... that does not host the coroutines, so as not to block them.

Otherwise a large replay log would block message sending
until the peer disconnects us.
2020-11-24 17:25:43 +01:00
Alexander Aleksandrovič Klimov 3dcc6c32f3
Merge pull request #8479 from Icinga/bugfix/close-anonymous-connections
Close anonymous connections after 10 seconds
2020-11-24 16:44:09 +01:00
Julian Brost 2a2924855f
Merge pull request #7922 from Icinga/feature/http-status-codes-in-icinga-mainlog-7053
Include HTTP status codes in log
2020-11-24 16:35:58 +01:00
Julian Brost da407660f2
Merge pull request #8500 from Icinga/bugfix/config-sync-only-remove-files-if-timestamp-changed
Config sync: Only remove files, if timestamp changed
2020-11-24 16:34:12 +01:00
Julian Brost c154d4d50e
Merge pull request #8466 from Icinga/feature/one-connection
ApiListener#NewClientHandlerInternal(): reject connections from already connected endpoints
2020-11-24 16:33:15 +01:00
Noah Hilverling 83b4d8e69d Config sync: Only remove files, if timestamp changed 2020-11-24 10:44:38 +01:00
Alexander Aleksandrovič Klimov 39bc1590f6
Merge pull request #8440 from Icinga/bugfix/message-routing-for-global-zones
Fix cluster message routing for global zones
2020-11-24 10:41:17 +01:00
Alexander Aleksandrovič Klimov e84a4a290d
Merge pull request #8450 from Icinga/bugfix/do-not-accept-api-updates-for-unknown-zone
API: Don't accept object updates for unknown global zone
2020-11-24 10:40:20 +01:00
Alexander A. Klimov 5cfac1f643 Fix function and variable names
refs #8034
2020-11-23 16:43:47 +01:00
Alexander A. Klimov fa61711c21 Introduce ReportIdoCheck()
... for code deduplication

refs #8034
2020-11-23 16:40:32 +01:00
Alexander A. Klimov 0ad1ab20aa Fix code style
refs #8034
2020-11-23 16:39:24 +01:00
Julian Brost 3f15963651 Remove SpinLock
No longer needed as its only user now uses std::mutex.
2020-11-17 09:40:34 +01:00
Julian Brost 70c9d49ebc ApiListener: merge new config validation and activation functions
Merge AsyncTryActivateZonesStage and TryActivateZonesStageCallback and
name the result TryActivateZonesStage. The old split was a leftover from
the one being a callback function with no actual meaningful separation.
2020-11-17 09:37:13 +01:00
Noah Hilverling 2d1980c10d
Merge pull request #8476 from Icinga/docs/api-action-api-function
Clarify difference between API actions and functions in their docstrings
2020-11-17 08:17:05 +01:00
Julian Brost e4610e7dbd Use std::mutex instead of Spinlock 2020-11-16 17:38:03 +01:00
Julian Brost 74b65f1642 API filesync: wait for validation process to exit
This avoids having to pass a lock implicitly using the captured variables
of a lambda.
2020-11-16 17:10:57 +01:00
Julian Brost 4c8c4c75ec Add Process::WaitForResult to allow waiting for the process to finish 2020-11-16 17:10:26 +01:00
Alexander Aleksandrovič Klimov 5f3b5934fa
Merge pull request #8195 from Icinga/feature/terminate-pretty-json-output-w-n-8194
JsonEncode(): suffix pretty JSON w/ \n
2020-11-13 17:08:46 +01:00
Alexander Aleksandrovič Klimov 66c4dc35a8
Merge pull request #7931 from Icinga/feature/program_version-livestatus-7895
Livestatus: append app name to program_version
2020-11-13 17:08:11 +01:00
Julian Brost d1edcb909c Close anonymous connections after 10 seconds
Anonymous connections are normally only used for requesting a
certificate and are closed after this request is received. However, the
request is only sent if the child has successfully verified the
certificate of its parent so that it is an authenticated connection from
its perspective. In case this verification fails, both ends view it as
an anonymous connection and never actually use it but attempt a
reconnect after 10 seconds, leaking the connection. Therefore close it
after a timeout.
2020-11-12 18:01:11 +01:00
Alexander Aleksandrovič Klimov 8ca765d730
Merge pull request #8455 from Icinga/bugfix/replay-object-deletion
Log config object deletions to replay log
2020-11-12 15:08:55 +01:00
Julian Brost 01a278bb5e Clarify difference between API actions and functions in their docstrings 2020-11-12 14:23:41 +01:00
Noah Hilverling 5f6042d92f Fix 'emoving' typo 2020-11-09 16:35:16 +01:00
Julian Brost cb476172ec Fix cluster message routing for global zones
RelayMessageOne used to relay the message only to one other endpoint for
other zones, which is fine, as long as the target zone is a child/parent
zone but breaks if the target zone is a global one. In this case, the
message has to be forwarded within the local zone as well as to one node
in each child zone.
2020-11-09 15:43:43 +01:00
Julian Brost be53b0af9e Log config object deletions to replay log
The initial config object sync for each new connection (in
`ApiListener::SendRuntimeConfigObjects()`) only considers currently
existing objects and has no way to pass the information that objects
were deleted in the meantime.

This commit logs config object deletions to the replay log if required
so that there is a chance that it will be propagated to nodes that were
offline when the deletion happened.

Note that this can only be considered a workaround as the replay log
might be pruned or could even be completely disabled. Also, there still
seems to be a race condition between the config sync and replay log of
multiple new connections at the same time.
2020-11-09 14:09:44 +01:00
Alexander A. Klimov 29e5d7def7 Include HTTP status codes in log
refs #7053
2020-11-09 10:20:13 +01:00
Noah Hilverling 8ba5f72533 API: Don't accept object updates for unknown zone 2020-11-06 17:27:10 +01:00
Alexander Aleksandrovič Klimov 1450e1bb7f
Merge pull request #8108 from Icinga/bugfix/api-incorrect-response-header-6747
API: Send Content-Type as api response header too
2020-11-03 18:50:31 +01:00
Alexander Aleksandrovič Klimov 939f4591a4
Merge pull request #8087 from Icinga/bugfix/log-cout-permission-error-8086
Display Logmessage if a permission error occurs
2020-11-03 17:23:06 +01:00
Yonas Habteab 488e6bfb67 HTTP Request: Log an exception message if an error occurs 2020-11-02 15:01:48 +01:00
Alexander Aleksandrovič Klimov 4f6fecc74c
Merge pull request #8101 from Icinga/bugfix/timestamps-checkresult-differ-across-nodes-8092
State timestamps set by the same check result differ across nodes
2020-10-30 17:24:15 +01:00
Alexander Aleksandrovič Klimov 1b9f161aea
Merge pull request #8123 from MEschenbacher/confirmingstrings
ido_pgsql: do not set standard_conforming_strings to off
2020-10-29 17:06:46 +01:00
Alexander Aleksandrovič Klimov 9c232e942b
Merge pull request #8085 from Icinga/bugfix/not-set-lcnumeric-twice-7563
Fix LC_NUMERIC set twice and use a wrong value
2020-10-29 16:47:43 +01:00
Maximilian Eschenbacher d8089560dd ido_pgsql: do not set standard_conforming_strings to off
Before postgres 9.1, this setting defaulted to off and icinga2 code was
making heavy use of this feature. Since postgres 9.1, this setting
defaults to on. During the adoption of postgres >= 9.1, the icinga2
postgres ido code maintained compatibility by setting it to off
explicitly.

In the meantime, the postgres ido code has been converted to using the
`E'...'` escape literal syntax exclusively.

The last remaining step is now to no longer force the setting to off
because no query is using the feature any longer.

Closes GitHub issue #8122.
2020-10-29 16:28:18 +01:00
Alexander Aleksandrovič Klimov a226db4f17
Merge pull request #8245 from Ragnra/bugfix/opentsdb-tag-doublespace-8244
Fixes an issue with opentsdb and a double space
2020-10-29 16:23:01 +01:00
Alexander Aleksandrovič Klimov b5b1ee715b
Merge pull request #8184 from sbraz/boost
Fix ‘fs::copy_option’ has not been declared with boost 1.74.0
2020-10-29 16:19:07 +01:00
Alexander Aleksandrovič Klimov 1e281b060a
Merge pull request #7952 from Icinga/fix/SO_REUSEPORT-optional
apilistener: Make SO_REUSEPORT optional
2020-10-29 15:56:56 +01:00