This change fixes two problems:
* The internal functions used by ScriptFunc more or less expect to operate on
full days, but ScriptFunc may have called them with some random timestamp
during the day. This is fixed by always using midnight of the day as
reference time.
* Previously, the code advanced a timestamp to the next day by adding 24 hours.
On days with DST changes, this could either still be on the same day (a day
may have 25 hours) or skip an entire day (a day may have 23 hours). This is
fixed by using a struct tm to advance the time to the next day.
m_DataBuffer may be modified concurrently while StatsFunc() is called, thus
it's unsafe to call size() on it. As write access to m_DataBuffer is already
synchronized by only modifying it from the single work queue thread, instead of
adding a mutex, this commit adds a new std::atomic_size_t which is additionally
updated when modifying m_DataBuffer and can safely be accessed in StatsFunc().
There is no explicit synchronization of access to m_DataBuffer which is fine if
it is only accessed from the single-threaded work queue. However, Stop() also
called Flush() in another thread, leading to concurrent write access to
m_DataBuffer which can result in a crash due to use after free/double free.
Changes in this commit:
* Flush() is renamed to FlushWQ() to show that it should only be called from
the work queue. Additionally, it now asserts that it is running on the work
queue.
* Visibility of some data members is changed from protected to private. No
other classes have to access these at the moment. By this change, accidental
concurrent access from derived classes in the future is prevented.
* Stop() now flushes by posting FlushWQ() to the work queue and joining it.
Without this commit, children and parents of a checkable were rescheduled on a
state change while holding the lock for the current checkable. If both ends of
a dependency are checked at the same time and both change state, they could end
up in a deadlock waiting for each other.
This commit fixes this problem by changing the code so that other checkables
are rescheduled only after releasing the lock for the current checkable.
When a comment or downtime is removed manually, the name of the requestor and
timestamp have to be synced to other nodes in the cluster to allow all of them
to generate a consistent Icinga DB history stream.
refs #9101
The config parser requires *Command#arguments#order to be a Number, i.e. 42,
4.2 or even "4.2". That's int-casted where needed, now also for Icinga DB.
Before:
```
object CheckCommand "9117" {
command = [ "true" ]
arguments = {
"4.2" = { order = "4.2" }
}
}
```
2022-01-03T13:25:07.166+0100 FATAL icingadb json: cannot unmarshal string into Go value of type int64
The Windows image provided by GitHub already includes most of our dependencies,
so the installation of all Chocolatey packages except winflexbison3 was
redundant. Visual Studio is provided in the Enterprise version instead of
Community, so that has to be added to the search path as well.
i.e. the confusion of the state file deserializator with e.g. `"type":32` on startup.
That would unexpectedly restore (the now ignored) null (not `{"type":32}`) as there's no type "32".
refs #8186
The loop iterated over the services of the wrong host resulting in duplicate
downtimes scheduled for services of the parent host instead of downtimes for
services of the child host.
When creating a fixed downtime that starts immediately while the checkable is
in a non-OK state, previously the code path for flexible downtimes was used to
trigger this downtime. This is fixed by this commit which resolves two issued:
1. Missing downtime start notification: notifications work differently for
fixed and flexible downtimes. This resulted in missing downtime start
notifications under the conditions described above.
2. Incorrect downtime trigger time: this code path would incorrectly assume the
timestamp of the last checkable as the trigger time which is incorrect for
fixed downtimes.
When triggering a downtime, the time of the causing event is now passed on as
the trigger time. That time is:
* For fixed downtimes: the later one of start and entry time.
* If a check result triggers the downtime: The execution end of the check
result.
* If another downtime triggers the downtime: The trigger time of the first
downtime.
This is done so two nodes in a HA setup can write consistent Icinga DB downtime
history streams.
refs #9101
In the 2.12.6 release, the full schema file sets the version to 1.14.3, whereas
the latest available upgrade file 2.11.0.sql sets it to 1.15.0. Therefore, ship
a new upgrade file 2.12.7.sql for all users who imported their schema with
version 2.11.0 or later and never performed an upgrade since then. Their
databases incorrectly state schema version 1.14.3 and is bumped to the correct
version 1.15.0 by the upgrade.
In the 2.13.2 release, the full schema file sets the version to 1.15.0, whereas
the latest available upgrade file 2.13.0.sql sets it to 1.15.1. Therefore,
rename the incorrectly named upgrade file 2.13.1.sql (it was not shipped in
this or any other release so far) to 2.13.3.sql for users who imported their
schema with version 2.13.0 or later and never performed an upgrade since then.
Their databases incorrectly state schema version 1.15.0 and are bumped to the
correct version 1.15.1 by the upgrade. Additionally, the version number in the
full schema is also bumped to the correct version 1.15.1.
not to confuse the state file deserializator with e.g. `"type":32` on startup.
That would unexpectedly restore null (not `{"type":32}`) as there's no type "32".
refs #8186
This commit fixes the following build error:
[ 55%] Building CXX object lib/icinga/CMakeFiles/icinga.dir/usergroup.cpp.o
lib/icinga/usergroup.cpp:79:24: error: incomplete type ‘icinga::Notification’ used in nested name specifier
79 | std::set<Notification::Ptr> UserGroup::GetNotifications() const
| ^~~