Commit Graph

5923 Commits

Author SHA1 Message Date
Alexander Aleksandrovič Klimov 1b50d912a0
Merge pull request #9137 from Icinga/bugfix/influxdb-writer-synchronization
Fix unsafe concurrent access to m_DataBuffer in InfluxdbCommonWriter
2022-01-04 17:37:28 +01:00
Alexander Aleksandrovič Klimov 80663cf5e6
Merge pull request #9048 from Icinga/bugfix/timeperiod-dst-2.0
LegacyTimePeriod::ScriptFunc: fix DST edge-cases
2022-01-03 18:11:32 +01:00
Julian Brost 33781496da InfluxdbCommonWriter: use atomic_size_t to data buffer size from stats function
m_DataBuffer may be modified concurrently while StatsFunc() is called, thus
it's unsafe to call size() on it. As write access to m_DataBuffer is already
synchronized by only modifying it from the single work queue thread, instead of
adding a mutex, this commit adds a new std::atomic_size_t which is additionally
updated when modifying m_DataBuffer and can safely be accessed in StatsFunc().
2022-01-03 12:24:26 +01:00
Julian Brost e6300aacf9 InfluxdbCommonWriter: only flush from work queue
There is no explicit synchronization of access to m_DataBuffer which is fine if
it is only accessed from the single-threaded work queue. However, Stop() also
called Flush() in another thread, leading to concurrent write access to
m_DataBuffer which can result in a crash due to use after free/double free.

Changes in this commit:
* Flush() is renamed to FlushWQ() to show that it should only be called from
  the work queue. Additionally, it now asserts that it is running on the work
  queue.
* Visibility of some data members is changed from protected to private. No
  other classes have to access these at the moment. By this change, accidental
  concurrent access from derived classes in the future is prevented.
* Stop() now flushes by posting FlushWQ() to the work queue and joining it.
2022-01-03 12:24:26 +01:00
Julian Brost 13ea635188 Don't trigger a fixed downtime like a flexible one
When creating a fixed downtime that starts immediately while the checkable is
in a non-OK state, previously the code path for flexible downtimes was used to
trigger this downtime. This is fixed by this commit which resolves two issued:

1. Missing downtime start notification: notifications work differently for
   fixed and flexible downtimes. This resulted in missing downtime start
   notifications under the conditions described above.
2. Incorrect downtime trigger time: this code path would incorrectly assume the
   timestamp of the last checkable as the trigger time which is incorrect for
   fixed downtimes.
2021-12-14 11:02:40 +01:00
Julian Brost c71029f2e8 Set downtime trigger time deterministically
When triggering a downtime, the time of the causing event is now passed on as
the trigger time. That time is:

* For fixed downtimes: the later one of start and entry time.
* If a check result triggers the downtime: The execution end of the check
  result.
* If another downtime triggers the downtime: The trigger time of the first
  downtime.

This is done so two nodes in a HA setup can write consistent Icinga DB downtime
history streams.

refs #9101
2021-12-08 14:15:50 +01:00
Alexander Aleksandrovič Klimov 577cf94b59
Merge pull request #8956 from Icinga/Al2Klimov-patch-3
Fix IDO MySQL schema version
2021-12-07 15:31:00 +01:00
Alexander Aleksandrovič Klimov 31c564182a
Merge pull request #8990 from Icinga/bugfix/downtime-all-services-on-child-hosts
Fix scheduling of downtimes for all services on child hosts
2021-12-07 12:48:01 +01:00
Julian Brost 596fcdc123 Downtime::DowntimesExpireTimerHandler: don't copy vector
`ConfigType::GetObjectsByType<Downtime>()` already returns a
`std::vector<Downtime::Ptr>` so there is no point in copying it into another
vector of the same type just to then iterate the copied vector instead of the
original one.
2021-12-01 13:05:23 +01:00
Yonas Habteab 361807f7a9
Adjust incosistent pki log messages (#8965) 2021-11-22 16:06:55 +01:00
Julian Brost d09925189a
Merge pull request #9037 from Icinga/Al2Klimov-patch-4
InfluxdbCommonWriter#Flush(): fix log message
2021-11-19 17:09:05 +01:00
Julian Brost 2ad0a4b8c3 Add missing include to fix non-unity builds
This commit fixes the following build error:

    [ 55%] Building CXX object lib/icinga/CMakeFiles/icinga.dir/usergroup.cpp.o
    lib/icinga/usergroup.cpp:79:24: error: incomplete type ‘icinga::Notification’ used in nested name specifier
       79 | std::set<Notification::Ptr> UserGroup::GetNotifications() const
          |                        ^~~
2021-11-17 16:11:15 +01:00
Julian Brost a740b1d66c LegacyTimePeriod::ScriptFunc: fix DST edge-cases
This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
  full days, but ScriptFunc may have called them with some random timestamp
  during the day. This is fixed by always using midnight of the day as
  reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
  On days with DST changes, this could either still be on the same day (a day
  may have 25 hours) or skip an entire day (a day may have 23 hours). This is
  fixed by using a struct tm to advance the time to the next day.
2021-11-17 13:09:10 +01:00
Noah Hilverling 4d3b1709fd
Merge pull request #9009 from Icinga/bugfix/icingadb-runtime-updates-delete-relationships
Icinga DB: Make sure object relationships are handled correctly during runtime updates
2021-11-12 17:52:59 +01:00
Julian Brost b9e6273ba0 Icinga DB: only log queries at debug level 2021-11-12 15:41:17 +01:00
Noah Hilverling 7a0796061a IcingaDB::AddObjectDataToRuntimeUpdates(): Copy data before modifying 2021-11-12 13:34:57 +01:00
Noah Hilverling 10bde2075a Dictionary: Make sure underlaying map is ordered 2021-11-12 13:34:57 +01:00
Noah Hilverling 73e0d6e61b Icinga DB: Make sure object relationships are handled correctly 2021-11-12 13:34:57 +01:00
Noah Hilverling 4e79eb080c
Merge pull request #9058 from Icinga/bugfix/icingadb-prefix-command_id
IcingaDB: Prefix command_id with command type
2021-11-11 11:50:26 +01:00
Noah Hilverling c1098bef35
Merge pull request #9061 from Icinga/add-downtime-duration-and-service-state-host-id-streams
Icinga DB: Add `downtime.duration` & `service_state.host_id` to Redis
2021-11-11 10:19:47 +01:00
Noah Hilverling a9c2304c61 IcingaDB: Prefix command_id with command type 2021-11-09 12:26:30 +01:00
Eric Lippmann 35053ac1dd Icinga DB: Sync groups earlier
Host and service groups are structural information that are used
for Web filters and should therefore be synchronized as soon as
possible.
2021-11-09 11:17:01 +01:00
Alexander A. Klimov 07c8440fd2 Icinga DB: sync checkables along with their states first
`WorkQueue#ParallelFor(x, false, y)` will enqueue x's items in FIFO order,
so x has to start with host and service.
2021-11-09 11:17:01 +01:00
Yonas Habteab fe5aa1e18d Icinga DB: Add `service_state.host_id` to Redis 2021-11-09 11:08:22 +01:00
Yonas Habteab 5dc45baebb Icinga DB: Add `downtime.duration` & `scheduled_duration` to Redis 2021-11-09 11:08:22 +01:00
Julian Brost 848f1ae167
Merge pull request #8998 from Icinga/bugfix/icingadb-program-start-milliseconds
Icinga DB: set value in milliseconds for program_start in stats/heartbeat
2021-11-08 18:18:19 +01:00
Julian Brost 524fe92a1d
Merge pull request #9028 from Icinga/bugfix/icingadb-zone-parent
IcingaDB: actually write parent to parent_id of zones
2021-11-08 18:08:48 +01:00
Julian Brost e46d83b6be Icinga DB: set value in milliseconds for program_start in stats/heartbeat 2021-11-08 14:37:08 +01:00
Noah Hilverling 0b9317a5bf IcingaDB: Remove GetObjectIdentifiersWithoutEnv()
Having the command type be a part of the command ID isn't needed anywhere. Removing this simplifies the way we generate IDs in general, because we don't need Prepend() anymore.

The command type was only needed to prevent ID collisions within the command_envvar and command_argument tables. Those tables have since been separated into {check,event,notification}command_envvar and {check,event,notification}command_argument tables.
2021-11-05 17:01:40 +01:00
Julian Brost 3c8672b4dc Icinga DB: increase Redis schema version
PR #9036 introduces some incompatible changes to the Redis schema, most
importantly where Icinga DB has to read the environment from: now it has to use
a new top-level key of the icinga:stats message instead of a value in the
IcingaApplication part of that message.
2021-11-05 14:14:37 +01:00
Julian Brost 6007848146 IcingaDB: export environment_id via API
Primarily required for Icinga DB integration tests at the moment, but could
also be helpful in other situations.
2021-11-05 14:14:37 +01:00
Julian Brost 4ade4c757b IcingaDB: write new environment to icinga:stats stream 2021-11-05 14:14:37 +01:00
Julian Brost 525dd50859 IcingaDB: introduce a new environment ID derived from the CA public key
In order to avoid changes to the environment ID, it is now no longer derived
from the Environment constant but instead from the public key of the CA
certificate. This ensures that it is different between clusters by default, so
no additional changes have to be done to allow two clusters to use Icinga DB to
write into the same database.

To prevent the ID from changing when the CA certificate is replaced, it is also
persisted into the file /var/lib/icinga2/icingadb.env, so if that file exists,
it takes precedence over the CA certificate.
2021-11-05 14:14:37 +01:00
Julian Brost 6cd3a483a0 tlsutility: move hex encoding into a separate function BinaryToHex 2021-11-05 14:14:37 +01:00
Julian Brost f976e351f4
Merge pull request #9044 from Icinga/bugfix/idb-dump-buf-lost
Icinga DB init. dump: flush both buffered states and state checksums
2021-11-04 12:26:28 +01:00
Alexander A. Klimov 0ff7d0a06e Icinga DB: raise icinga:schema 1 -> 2 2021-11-02 15:00:55 +01:00
Alexander A. Klimov b1714a10c2 Icinga DB: make icinga:history:stream:*#event_id deterministic
... i.e. UUID -> SHA1(env, eventType, x...) given that SHA1(env, x...) = type-specific ID.
Rationale: allow both masters to write the same history concurrently (while not
in split-brain), so that REPLACE INTO deduplicates the same events written twice.

* ack: SHA1(env, "ack_set"|"ack_clear", checkable.name, setTime)
* comment: SHA1(env, "comment_add"|"comment_remove", comment.name)
* downtime: SHA1(env, "downtime_start"|"downtime_end", downtime.name)
* flapping: SHA1(env, "flapping_start"|"flapping_end", checkable.name, startTime)
* notification: SHA1(env, "notification", notification.name, notificationType, sendTime)
* state: SHA1(env, "state_change", checkable.name, changeTime)
2021-11-02 15:00:03 +01:00
Alexander A. Klimov 5c44365c4e Icinga DB: make icinga:history:stream:notification#id deterministic
... i.e. UUID -> SHA1(x..., send time) given that SHA1(x...) = notification id.
Rationale: allow both masters to write the same notification history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
2021-11-02 15:00:03 +01:00
Alexander A. Klimov c2422c56fe Icinga DB: make icinga:history:stream:state#id deterministic
... i.e. UUID -> SHA1(x..., check time) given that SHA1(x...) = checkable id.
Rationale: allow both masters to write the same state history concurrently (while
not in split-brain), so that REPLACE INTO deduplicates the same events written twice.
2021-11-02 15:00:03 +01:00
Alexander Aleksandrovič Klimov f5f8ccb1f4
Merge pull request #9020 from Icinga/feature/icingaeb-schema-version
Icinga DB: publish Redis schema version via XADD icinga:schema
2021-10-25 13:21:37 +02:00
Alexander A. Klimov d8b4768471 Icinga DB init. dump: flush both buffered states and state checksums
not to dump x states, but only x - (x % bulk) state checksums.
2021-10-21 13:49:24 +02:00
Noah Hilverling a7cbf50674
Merge pull request #9030 from Icinga/Al2Klimov-patch-1
Icinga DB: don't include checkable types in history IDs
2021-10-19 14:52:43 +02:00
Alexander A. Klimov 4b0688047e Icinga DB: stream runtime state updates only to icinga:runtime:state
... where they belong to, not to icinga:runtime.
2021-10-18 18:11:30 +02:00
Alexander Aleksandrovič Klimov 99c5c24a17
InfluxdbCommonWriter#Flush(): fix log message
s/InfluxdbWriter/Influxdb2Writer/

fixes #9035
2021-10-14 12:03:45 +02:00
Alexander Aleksandrovič Klimov e0339c387b
Icinga DB: don't include checkable types in history IDs
... as they’re unnecessary for being distinguish across types.
Services always have a ! in the name, hosts never do.
2021-10-11 16:14:30 +02:00
Alexander Aleksandrovič Klimov 30a5ba3961
Merge pull request #9002 from Icinga/feature/icingadb-remove-usernotification-stream
Icinga DB: remove usernotification history stream
2021-10-08 19:16:26 +02:00
Alexander Aleksandrovič Klimov 4190d58668
Merge pull request #9011 from Icinga/bugfix/icingadb-remove-zone-parent-key
Icinga DB: Remove unused Redis key 'icinga:zone:parent'
2021-10-08 17:19:51 +02:00
Alexander Aleksandrovič Klimov ff60c1af37
Merge pull request #8895 from Icinga/bugfix/typo-8766
Fix typo
2021-10-08 17:19:20 +02:00
Noah Hilverling 750e64b974 Icinga DB: Remove unused Redis key 'icinga:zone:parent' 2021-10-08 12:06:14 +02:00
Julian Brost df84a498f4 IcingaDB: actually write parent to parent_id of zones
This fixes that the code used the wrong variable. Previously, it was written to
Redis that each zone is its own parent (if it has a parent at all).
2021-10-08 11:15:54 +02:00