Commit Graph

86 Commits

Author SHA1 Message Date
Michael Friedrich d075665d1b Merge pull request #5486 from Icinga/feature/remove-deprecated-graphite-legacy-mode
Graphite: Remove deprecated legacy schema mode
2017-08-17 20:06:47 +02:00
Simon Murray abc3652b00 Fix TLS Race Connecting to InfluxDB
Rather than leaving stale connections about we tried to poll for data coming in
from InfluxDB and timeout if it didn't repond in a timely manner.  This introduced
a race where the timeout triggers, a context switch occurs where data is actually
available and the TlsStream spins trying to asynchronously notify that data is
available, but which never gets read.  Not only does this use up 100% of a core,
but it also slowly starves the system of handler threads at which point metrics
stop being delivered.

This basically removes the poll and timeout, any TLS socket erros should be
detected by TCP keep-alives.

Fixes #5460 #5469
2017-08-14 16:20:49 +01:00
Michael Friedrich eb5e299c4b Graphite: Remove deprecated legacy schema mode
This commit includes some code cleanup too.

fixes #4992
2017-08-09 18:52:35 +02:00
Michael Friedrich 2a4359d7e8 Windows build fix for InfluxdbWriter
refs #5219
fixes #5334
2017-06-07 14:16:15 +02:00
Michael Friedrich 89ac5b2fff GelfWriter: Add 'check_command' to CHECK RESULT/* NOTIFICATION/STATE CHANGE messages
This allows for much more easy filtering in Graylog web
similar to Graphite or InfluxDB and their template dashboards.
2017-06-06 20:23:26 +02:00
Michael Friedrich 41a400f552 Merge pull request #5330 from Icinga/feature/graphite-stats
GraphiteWriter: Add 'connected' to stats; fix reconnect exceptions
2017-06-06 20:13:33 +02:00
Michael Friedrich f42b820007 GraphiteWriter: Add 'connected' to stats; fix reconnect exceptions 2017-06-06 19:50:37 +02:00
Michael Friedrich f10815efa2 GelfWriter: Use async work queue and add feature metric stats
fixes #4532
2017-06-06 19:48:23 +02:00
Gunnar Beutner 1fd2695e02 Fix compiler warnings
refs #5287
2017-05-29 09:13:19 +02:00
Michael Friedrich dab2522acc InfluxDB: Optimize work queue event handling
refs #5219
2017-05-26 17:11:13 +02:00
Michael Friedrich 28395b32f0 GraphiteWriter: Use a workqueue for event processing
This also adds reconnect handling and exceptions.

refs #5132
refs #5133
refs #5280
2017-05-26 15:18:14 +02:00
Michael Friedrich 647d82094f InfluxDB: Remove obsolete logger, now implemented in WorkQueue class
refs #5280
refs #5133
2017-05-24 17:01:46 +02:00
Michael Friedrich d366a63510 Add API & Cluster stats to /v1/status & icinga check performance metrics
refs #5133
2017-05-24 16:21:05 +02:00
Michael Friedrich 52d986d02b Revert "Add LogstashWriter feature"
This reverts commit f5a971f5b0.

refs #4054
2017-05-23 12:05:01 +02:00
Michael Friedrich 4c7660190f Revert "Review LogstashWriter feature implementation"
This reverts commit bd5ff814f2.

refs #4054
2017-05-23 12:04:08 +02:00
Michael Friedrich 79dcb789c2 Move PerfdataValue() class into base library
This is required for libremote and ApiListener stats in #5133
2017-05-15 16:32:29 +02:00
Michael Friedrich 2338db5f6a Fix performance data processing in GelfWriter feature
Includes fixes for possible crashes on empty check results.

fixes #4666
2017-05-15 13:46:43 +02:00
Simon Murray fc2c2d9a29 Verbose InfluxDB Error Logging
On a non 204 response we parse the HTTP response until complete e.g. do the headers
and body, not just the header.  A new interface is added to the response to allow us
to determine the body size so that it may be read out and buffered.  The body is
parsed and any error message printed out.  In the event that the parsing fails the
raw body is dumped out; better than nothing!

fixes #4411

Signed-off-by: Michael Friedrich <michael.friedrich@icinga.com>
2017-05-11 12:13:41 +02:00
Gunnar Beutner c611a31670 Fix code style issues
refs #5219
2017-05-09 09:01:08 +02:00
Gunnar Beutner f9b34cca30 Fix compiler warning
refs #5219
2017-05-08 08:47:27 +02:00
Michael Friedrich 3649a5a0d7 InfluxdbWriter: Use a work queue for async message processing; add stats log/api 2017-05-05 17:56:51 +02:00
Michael Friedrich bd5ff814f2 Review LogstashWriter feature implementation
refs #4054
2017-03-20 14:30:03 +01:00
Kai Goller f5a971f5b0 Add LogstashWriter feature
This adds the UdpSocket class.

refs #4054
2017-03-20 14:30:03 +01:00
Michael Friedrich 3993276b74 Add a removal note for enable_legacy_mode for GraphiteWriter
refs #4992
2017-02-10 11:33:32 +01:00
Michael Friedrich e5f5284838 Add logging for started/stopped features
fixes #3557
2017-02-08 15:40:27 +01:00
Simon Murray 041772fb28 PerfData: Server Timeouts for InfluxDB Writer
Exposes the TCP socket used to communicate with the InfluxDB server.  When we are
expecing a response we can now call poll() on the socket to wait for data to become
available.  If it doesn't in a user configurable timeout period we abort the request.

fixes #4927
fixes #4941

Signed-off-by: Michael Friedrich <michael.friedrich@icinga.com>
2017-02-07 17:06:46 +01:00
Michael Friedrich b7caf0820d Ensure that *.icinga.com is used everywhere
fixes #13897
fixes #13277
2017-01-10 17:19:12 +01:00
Simon Murray 2c37a00daf InfluxDB: Always Write Out Metadata
Previously the logic would just bail out if no performance data was associated with a
check, the problem being that check metadata was skipped too.  This rearranges the code
to dump out performance metrics if they exist, then dump out metadata if requested.  This
also fixes an issue whereby metadata was being sent for every performance data in the
check result, rather than just once, so we save a bit of bandwidth as a result.

fixes #12276

Signed-off-by: Michael Friedrich <michael.friedrich@icinga.com>
2016-11-09 16:10:21 +01:00
Gunnar Beutner 30762e5330 Set versions for all internal libraries
fixes #12552
2016-08-25 17:56:18 +02:00
Gunnar Beutner 288413f046 Replace BOOST_FOREACH with range-based for loops
fixes #12538
2016-08-25 06:46:17 +02:00
Gunnar Beutner 429d11daa8 Fix compiler warnings
fixes #12534
2016-08-24 20:33:34 +02:00
Gunnar Beutner ae1ab5f865 Implement unit tests for state changes
fixes #12530
2016-08-24 19:45:52 +02:00
Rune Darrud 5c0b3c58bd Do not escape backslashes and separators twice
fixes #12227

Signed-off-by: Gunnar Beutner <gunnar.beutner@netways.de>
2016-08-17 06:10:41 +02:00
Marius Sturm 451cd73749 GelfWriter: Use raw CheckResult output for full_message attribute
fixes #10903

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2016-08-09 09:51:17 +02:00
Simon Murray 84ea0065b2 Fix InfluxdbWriter Trailing Backslash
Backslashes escape spaces or commas (and evidently equals), given tags are
separated by commas, tag keys and values are separated by equals and tags
are separated from fields by a space we need to take action when these end
in a backslash e.g. 'C:\'.  Also discovered a bug whereby the metric tag was
missing out on escaping.

fixes #12227

Signed-off-by: Gunnar Beutner <gunnar.beutner@netways.de>
2016-07-29 06:51:33 +02:00
Simon Murray c6add53152 Fix InfluxDB Writer Key Escaping
The escaping wasn't being performed on measuerments, keys or tag values.  The
escape function was returning the input and not the modified ouput, so this
has been fixed

refs #12047

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2016-06-27 12:07:07 +02:00
Michael Friedrich 7077ca1a53 Add acknowledgement type to Graphite, InfluxDB, OpenTSDB metadata
fixes #12018
2016-06-23 13:04:23 +02:00
Simon Murray 2e8c8809ea Add service metadata to InfluxDB Writer
Adds a new configuration variable in keeping with the graphite writer
which defaults to false to save network bandwidth.  All metrics currently
supported by graphite are now available to InfluxDB.  I added in some
formatting functions, to handle integers and booleans as we know and
control their types, and the supporting regexes in the sanity checker.

Updating to InfluxDB 0.13.X started giving 400 errors due to the missing
Host header in HTTP/1.1 requests.  HttpRequest has been updated to auto-
magically add the host and port to these requests if not explicitly
stated by the client code.

The exception code has been cleaned up to break out of the function
early if such a condition is raised, this avoids unnecessarily executing
code which will ultimately fail.

fixes #11912

Signed-off-by: Gunnar Beutner <gunnar.beutner@netways.de>
2016-06-08 13:23:52 +02:00
Simon Murray 899592c8ad Update InfluxDB line formatting
Fixes a couple issues to do with line formatting of influx DB data points.  All
keys and values need commas and white space escaping.  Values are also checked
for type.  If a numeric or scientific value is detected this is output as an
Influx floating point/scientific number.  Booleans are detected and output in
a canonical format.  All other values are strings, which have double quotes
escaped and the entire string is wrapped in double quotes.  The handling of
thresholds has changed before this becomes officially released.  These values
if available are passed to the accumulation function in a dictionary, said
dictionary builds a single data point with multiple fields, rather than the
existing 5 data points, thus saving bandwidth costs.

fixes #11904

Signed-off-by: Gunnar Beutner <gunnar.beutner@netways.de>
2016-06-08 11:01:23 +02:00
Gunnar Beutner bb69540b32 Fix exception in PerfdataWriter::RotateFile
fixes #11801
2016-05-18 14:01:32 +02:00
Jason Young 7dbd66535a Throw exception if PerfdataWriter::RotateFile fails to rename from host_temp_path to host_perfdata_path (and same for service)
This can happen if the two paths are not on the same mount-point.

fixes #9236

Signed-off-by: Gunnar Beutner <gunnar.beutner@netways.de>
2016-05-11 09:29:32 +02:00
Michael Friedrich ba82d2eb20 Move CalculateExecutionTime and CalculateLatency into the CheckResult class
fixes #11751
2016-05-10 12:16:49 +02:00
Simon Murray 79c1e883d1 Add InfluxDB Writer
Adds an Icinga2 object to directly interface with InfluxDB's native HTTP API.
This supports optional basic authorization, and TLS transport.  InfluxDB didn't
appear to like having the TLS stream kept open, so instead this object buffers
data points which are then flushed to InfluxDB as a batch write, either driven
by a configurable timeout or threshold.

As InfluxDB is a schema-less database the host and service templates are user
configurable allowing both the measurement field and tags to be set by the
end user via macro expansion.  This allows access to tag fields from arbitrary
data associated with host.vars or service.vars.  If a particular value is
unable to be resolved, the tag will be dropped and not transmitted to InfluxDB.

Also alters URL handling to omit array brackets when only a single value is
attached to a key, otherwise InfluxDB has a strop with non-standard syntax.

fixes #10480

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2016-05-03 14:12:51 +02:00
Marius Sturm 15cb9c1c1a Use check_result timestamp for GELF log messages
fixes #9184

Signed-off-by: Michael Friedrich <michael.friedrich@netways.de>
2016-04-11 14:48:16 +02:00
Gunnar Beutner 599929b0f6 Update copyright headers for 2016 2016-01-12 08:29:59 +01:00
Daniil Yaroslavtsev d739675799 GelfWriter: Add additional fields for 'CHECK RESULT' events
fixes #9858
2015-12-18 11:05:38 +01:00
Gunnar Beutner 050c520b2a Convert Comment/Downtime to config objects
fixes #9777
2015-10-28 17:56:29 +01:00
Gunnar Beutner b77c9edca0 Remove unnecessary default values
refs #9461
refs #8149
2015-10-20 08:06:25 +02:00
Jean-Marcel Flach 4ef9761fee Implement status api handler
Global statistics, features, etc.

fixes #10116
2015-09-23 16:59:07 +02:00
Tobias von der Krone da8613acf9 Add timestamp support for OpenTSDB
fixes #9183
2015-09-15 15:37:15 +02:00