# Distributed Monitoring with Master, Satellites and Agents <a id="distributed-monitoring"></a>

This chapter will guide you through the setup of a distributed monitoring
environment, including high-availability clustering and setup details
for Icinga masters, satellites and agents.

## Roles: Master, Satellites and Agents <a id="distributed-monitoring-roles"></a>

Icinga 2 nodes can be given names for easier understanding:

* A `master` node which is on top of the hierarchy.
* A `satellite` node which is a child of a `satellite` or `master` node.
* An `agent` node which is connected to `master` and/or `satellite` nodes.

![Icinga 2 Distributed Roles](images/distributed-monitoring/icinga2_distributed_monitoring_roles.png)

In more detail:

* A `master` node has no parent node.
* A `master` node is where you usually install Icinga Web 2.
* A `master` node can combine executed checks from child nodes into backends and notifications.
* A `satellite` node has a parent and a child node.
* A `satellite` node may execute checks on its own or delegate check execution to child nodes.
* A `satellite` node can receive configuration for hosts/services, etc. from the parent node.
* A `satellite` node continues to run even if the master node is temporarily unavailable.
* An `agent` node only has a parent node.
* An `agent` node will either run its own configured checks or receive command execution events from the parent node.

A client can be a secondary master, a satellite or an agent. It
typically requests something from the primary master or parent node.

The following sections will refer to these roles and explain the
differences and the possibilities this kind of setup offers.

> **Note**
>
> Previous versions of this documentation used the term `Icinga client`.
> This has been refined into `Icinga agent` and is visible in the docs,
> backends and web interfaces.

**Tip**: If you just want to install a single master node that monitors several hosts
(i.e. Icinga agents), continue reading -- we'll start with
simple examples.
In case you are planning a huge cluster setup with multiple levels and
lots of satellites and agents, read on -- we'll deal with these cases later on.

The installation on each system is the same: Follow the [installation instructions](02-installation.md)
for the Icinga 2 package and the required check plugins.

Most of the required configuration steps happen
on the command line. You can also [automate the setup](06-distributed-monitoring.md#distributed-monitoring-automation).

The first thing you need to learn about a distributed setup is the hierarchy of the single components.

## Zones <a id="distributed-monitoring-zones"></a>

The Icinga 2 hierarchy consists of so-called [zone](09-object-types.md#objecttype-zone) objects.
Zones depend on a parent-child relationship in order to trust each other.

![Icinga 2 Distributed Zones](images/distributed-monitoring/icinga2_distributed_monitoring_zones.png)

Have a look at this example for the `satellite` zones which have the `master` zone as a parent zone:

```
object Zone "master" {
  //...
}

object Zone "satellite region 1" {
  parent = "master"
  //...
}

object Zone "satellite region 2" {
  parent = "master"
  //...
}
```

There are certain limitations for child zones, e.g. their members are not allowed
to send configuration commands to the parent zone members. Conversely, the
trust hierarchy allows, for example, the `master` zone to send
configuration files to the `satellite` zone. Read more about this
in the [security section](06-distributed-monitoring.md#distributed-monitoring-security).

`agent` nodes also have their own unique zone. By convention you
must use the FQDN for the zone name.
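
As a minimal sketch of that convention, an agent's own zone could look like the following. The name `icinga2-agent1.localdomain` is borrowed from the examples later in this chapter, and the `parent` value assumes the agent is attached directly to the `master` zone. Endpoint objects are explained in the next section.

```
// The agent endpoint; its name is the agent's FQDN.
object Endpoint "icinga2-agent1.localdomain" {
}

// The agent zone uses the same FQDN as its only endpoint.
object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]
  parent = "master" // assumption: the agent connects directly to the master zone
}
```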

## Endpoints <a id="distributed-monitoring-endpoints"></a>

Nodes which are a member of a zone are so-called [Endpoint](09-object-types.md#objecttype-endpoint) objects.

![Icinga 2 Distributed Endpoints](images/distributed-monitoring/icinga2_distributed_monitoring_endpoints.png)

Here is an example configuration for two endpoints in different zones:

```
object Endpoint "icinga2-master1.localdomain" {
  host = "192.168.56.101"
}

object Endpoint "icinga2-satellite1.localdomain" {
  host = "192.168.56.105"
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain" ]
  parent = "master"
}
```

All endpoints in the same zone work as a high-availability setup. For
example, if you have two nodes in the `master` zone, they will load-balance the check execution.

Endpoint objects are important for specifying the connection
information, e.g. if the master should actively try to connect to an agent.

The zone membership is defined inside the `Zone` object definition using
the `endpoints` attribute with an array of `Endpoint` names.
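
To illustrate the high-availability case, the following sketch puts two endpoints into the `master` zone. The second node `icinga2-master2.localdomain` and its address `192.168.56.102` are assumptions made up for this example; they do not appear in the configuration above.

```
object Endpoint "icinga2-master1.localdomain" {
  host = "192.168.56.101"
}

// Hypothetical second master, used only for illustration.
object Endpoint "icinga2-master2.localdomain" {
  host = "192.168.56.102"
}

// Both endpoints are members of the same zone and share the check load.
object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}
```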

> **Note**
>
> There is a known [problem](https://github.com/Icinga/icinga2/issues/3533)
> with >2 endpoints in a zone and a message routing loop.
> The config validation will log a warning to let you know about this too.

If you want to check the availability (e.g. ping checks) of the node
you still need a [Host](09-object-types.md#objecttype-host) object.
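
A minimal sketch of such a Host object for the satellite endpoint shown above might look like this. It assumes the `hostalive` check command from the Icinga Template Library is available and reuses the satellite's address from the endpoint example.

```
// The Host object is separate from the Endpoint object of the same name.
object Host "icinga2-satellite1.localdomain" {
  check_command = "hostalive" // ping-based availability check (assumes the ITL is included)
  address = "192.168.56.105"
}
```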

## ApiListener <a id="distributed-monitoring-apilistener"></a>

In case you are using the CLI commands later, you don't have to write
this configuration from scratch in a text editor.
The [ApiListener](09-object-types.md#objecttype-apilistener) object is
used to load the TLS certificates and specify restrictions, e.g.
for accepting configuration commands.

It is also used for the [Icinga 2 REST API](12-icinga2-api.md#icinga2-api) which shares
the same host and port with the Icinga 2 Cluster protocol.

The object configuration is stored in the `/etc/icinga2/features-enabled/api.conf`
file. Depending on the configuration mode the attributes `accept_commands`
and `accept_config` can be configured here.

In order to use the `api` feature you need to enable it and restart Icinga 2.

```bash
icinga2 feature enable api
```
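
For orientation, an `api.conf` file with both attributes set explicitly could look roughly like the sketch below. The chosen values are illustrative assumptions, not recommendations; the CLI wizards described later write this file for you.

```
object ApiListener "api" {
  accept_config = false   // assumption: do not accept synced configuration from the parent zone
  accept_commands = true  // assumption: allow the parent zone to trigger command execution
}
```

Which combination is appropriate depends on the configuration mode you choose for the node.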

## Conventions <a id="distributed-monitoring-conventions"></a>

By convention all nodes should be configured using their FQDN.

Furthermore, you must ensure that the following names
are exactly the same in all configuration files:

* Host certificate common name (CN).
* Endpoint configuration object for the host.
* NodeName constant for the local host.

Setting this up on the command line will help you to minimize the effort.
Just keep in mind that you need to use the FQDN for endpoints and for
common names when asked.
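
The naming rule above can be sketched as follows for the master node used throughout this chapter. The split across `constants.conf` and `zones.conf` is shown as comments and is an assumption about where you keep these definitions.

```
// constants.conf: the local node name, identical to the certificate CN
const NodeName = "icinga2-master1.localdomain"

// zones.conf: the Endpoint object carries exactly the same name
object Endpoint "icinga2-master1.localdomain" {
}
```

If any of the three names differs, even only in letter case or a missing domain part, the nodes will not recognize each other.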

## Security <a id="distributed-monitoring-security"></a>

While there are certain mechanisms to ensure a secure communication between all
nodes (firewalls, policies, software hardening, etc.), Icinga 2 also provides
additional security:

* TLS v1.2+ is required.
* TLS cipher lists are hardened [by default](09-object-types.md#objecttype-apilistener).
* TLS certificates are mandatory for communication between nodes. The CLI command wizards
help you create these certificates.
* Child zones only receive updates (check results, commands, etc.) for their configured objects.
* Child zones are not allowed to push configuration updates to parent zones.
* Zones cannot interfere with other zones and influence each other. Each checkable host or service object is assigned to **one zone** only.
* All nodes in a zone trust each other.
* [Config sync](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync) and [remote command endpoint execution](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint) are disabled by default.

The underlying protocol uses JSON-RPC event notifications exchanged by nodes.
The connection is secured by TLS. The message protocol uses an internal API,
and as such message types and names may change internally and are not documented.

Zones build the trust relationship in a distributed environment. If you do not specify
a zone for an agent/satellite together with its parent zone, the parent zone's members
(e.g. the master instance) won't trust the agent/satellite.

Building this trust is key in your distributed environment. That way the parent node
knows that it is able to send messages to the child zone, e.g. configuration objects,
configuration in global zones, commands to be executed in this zone/for this endpoint.
It also receives check results from the child zone for checkable objects (host/service).

Conversely, the agent/satellite trusts the master and accepts configuration and commands if enabled
in the api feature. If an agent/satellite sends configuration to the parent zone, the parent nodes
deny it. The parent zone is the configuration entity, and does not trust agents/satellites in this matter.
Otherwise an agent/satellite could, for example, attempt to modify a different agent/satellite, or inject a check command
with malicious code.

While it may sound complicated for agent/satellite setups, it removes the problem with different roles
and configurations for a master and child nodes. Both of them work the same way, are configured
in the same way (Zone, Endpoint, ApiListener), and you can troubleshoot and debug them in just one go.

## Versions and Upgrade <a id="distributed-monitoring-versions-upgrade"></a>

It is generally advised to run the same, newest release on all instances.
Prior to upgrading, make sure to plan a maintenance window.

The Icinga project aims to allow the following compatibility:

```
master (2.11) >= satellite (2.10) >= agent (2.9)
```

Older agent versions may work, but there's no guarantee. Always keep in mind that
older versions are out of support and can contain bugs.

In terms of an upgrade, ensure that the master is upgraded first, then
the involved satellites, and last the Icinga agents. If you are on v2.10
currently, first upgrade the master instance(s) to 2.11, and then proceed
with the satellites. Things get easier with any sort of automation
tool (Puppet, Ansible, etc.).

Releases and new features may require you to upgrade master/satellite instances at once.
This is highlighted in the [upgrading docs](16-upgrading-icinga-2.md#upgrading-icinga-2) if needed.
One example is the CA Proxy and on-demand signing feature
available since v2.8 where all involved instances need this version
to function properly.

## Master Setup <a id="distributed-monitoring-setup-master"></a>

This section explains how to install a central single master node using
the `node wizard` command. If you prefer to do an automated installation, please
refer to the [automated setup](06-distributed-monitoring.md#distributed-monitoring-automation) section.

Follow the [installation instructions](02-installation.md) for the Icinga 2 package and the required
check plugins if you haven't done so already.

**Note**: Windows is not supported for a master node setup.

The next step is to run the `node wizard` CLI command. Before that,
collect the required information:

Parameter           | Description
--------------------|--------------------
Common name (CN)    | **Required.** By convention this should be the host's FQDN. Defaults to the FQDN.
Master zone name    | **Optional.** Allows you to specify the master zone name. Defaults to `master`.
Global zones        | **Optional.** Allows you to specify more global zones in addition to `global-templates` and `director-global`. Defaults to `n`.
API bind host       | **Optional.** Allows you to specify the address the ApiListener is bound to. For advanced usage only.
API bind port       | **Optional.** Allows you to specify the port the ApiListener is bound to. For advanced usage only (requires changing the default port 5665 everywhere).
Disable conf.d      | **Optional.** Allows you to disable the `include_recursive "conf.d"` directive except for the `api-users.conf` file in the `icinga2.conf` file. Defaults to `y`. Configuration on the master is discussed below.

The setup wizard will ensure that the following steps are taken:

* Enable the `api` feature.
* Generate a new certificate authority (CA) in `/var/lib/icinga2/ca` if it doesn't exist.
* Create a certificate for this node signed by the CA key.
* Update the [zones.conf](04-configuration.md#zones-conf) file with the new zone hierarchy.
* Update the [ApiListener](06-distributed-monitoring.md#distributed-monitoring-apilistener) and [constants](04-configuration.md#constants-conf) configuration.
* Update the [icinga2.conf](04-configuration.md#icinga2-conf) to disable the `conf.d` inclusion, and add the `api-users.conf` file inclusion.
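
As a rough sketch, the zone hierarchy in `zones.conf` after a master setup with the default answers could resemble the following. This is an approximation under the assumption that the defaults (`master` zone, the two default global zones) were accepted; the file actually written by the wizard may reference constants instead of literal names and can differ between versions.

```
object Endpoint "icinga2-master1.localdomain" {
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

// Default global zones offered by the wizard.
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```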

Here is an example of a master setup for the `icinga2-master1.localdomain` node on CentOS 7:

```
[root@icinga2-master1.localdomain /]# icinga2 node wizard

Welcome to the Icinga 2 Setup Wizard!

We will guide you through all required configuration details.

Please specify if this is a satellite/agent setup ('n' installs a master setup) [Y/n]: n

Starting the Master setup routine...

Please specify the common name (CN) [icinga2-master1.localdomain]: icinga2-master1.localdomain
Reconfiguring Icinga...
Checking for existing certificates for common name 'icinga2-master1.localdomain'...
Certificates not yet generated. Running 'api setup' now.
Generating master configuration for Icinga 2.
Enabling feature api. Make sure to restart Icinga 2 for these changes to take effect.

Master zone name [master]:

Default global zones: global-templates director-global
Do you want to specify additional global zones? [y/N]: N

Please specify the API bind host/port (optional):
Bind Host []:
Bind Port []:

Do you want to disable the inclusion of the conf.d directory [Y/n]:
Disabling the inclusion of the conf.d directory...
Checking if the api-users.conf file exists...

Done.

Now restart your Icinga 2 daemon to finish the installation!
```

You can verify that the CA public and private keys are stored in the `/var/lib/icinga2/ca` directory.
Keep this path secure and include it in your backups.

In case you lose the CA private key you have to generate a new CA for signing new agent/satellite
certificate requests. You then also have to re-create signed certificates for all
existing nodes.

Once the master setup is complete, you can also use this node as primary [CSR auto-signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing)
master. The following section will explain how agents and satellites use the CLI commands in order to fetch their
signed certificate from this master node.

## Signing Certificates on the Master <a id="distributed-monitoring-setup-sign-certificates-master"></a>

All certificates must be signed by the same certificate authority (CA). This ensures
that all nodes trust each other in a distributed monitoring environment.

This CA is generated during the [master setup](06-distributed-monitoring.md#distributed-monitoring-setup-master)
and should be the same on all master instances.

You can avoid signing and deploying certificates [manually](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-certificates-manual)
by using built-in methods for auto-signing certificate signing requests (CSR):

* [CSR Auto-Signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing) which uses a client (an agent or a satellite) ticket generated on the master as trust identifier.
* [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing) which allows you to sign pending certificate requests on the master.

Both methods are described in detail below.

> **Note**
>
> [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing) is available in Icinga 2 v2.8+.

### CSR Auto-Signing <a id="distributed-monitoring-setup-csr-auto-signing"></a>

A client can be a secondary master, a satellite or an agent. It sends a certificate signing request (CSR)
and must authenticate itself in a trusted way. The master generates a client ticket which is included in this request.
That way the master can verify that the request matches the previously trusted ticket
and sign the request.

> **Note**
>
> Icinga 2 v2.8 added the possibility to forward signing requests on a satellite
> to the master node. This is called `CA Proxy` in blog posts and design drafts.
> This functionality helps with the setup of [three level clusters](06-distributed-monitoring.md#distributed-monitoring-scenarios-master-satellite-agents)
> and more.

Advantages:

* Nodes (secondary master, satellites, agents) can be installed by different users who have received the client ticket.
* No manual interaction necessary on the master node.
* Automation tools like Puppet, Ansible, etc. can retrieve the pre-generated ticket in their client catalog
and run the node setup directly.

Disadvantages:

* Tickets need to be generated on the master and copied to client setup wizards.
* No central signing management.

#### CSR Auto-Signing: Preparation <a id="distributed-monitoring-setup-csr-auto-signing-preparation"></a>

Prior to using this mode, ensure that the following steps are taken on
the signing master:

* The [master setup](06-distributed-monitoring.md#distributed-monitoring-setup-master) was run successfully. This includes:

    * Generated a CA key pair
    * Generated a private ticket salt stored in the `TicketSalt` constant, set as `ticket_salt` attribute inside the [api](09-object-types.md#objecttype-apilistener) feature.

* Restart of the master instance.

#### CSR Auto-Signing: On the master <a id="distributed-monitoring-setup-csr-auto-signing-master"></a>

Setup wizards for agent/satellite nodes will ask you for this specific client ticket.

There are two possible ways to retrieve the ticket:

* [CLI command](11-cli-commands.md#cli-command-pki) executed on the master node.
* [REST API](12-icinga2-api.md#icinga2-api) request against the master node.

Required information:

Parameter           | Description
--------------------|--------------------
Common name (CN)    | **Required.** The common name for the agent/satellite. By convention this should be the FQDN.

The following example shows how to generate a ticket on the master node `icinga2-master1.localdomain` for the agent `icinga2-agent1.localdomain`:

```
[root@icinga2-master1.localdomain /]# icinga2 pki ticket --cn icinga2-agent1.localdomain
```

Querying the [Icinga 2 API](12-icinga2-api.md#icinga2-api) on the master requires an [ApiUser](12-icinga2-api.md#icinga2-api-authentication)
object with at least the `actions/generate-ticket` permission.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/conf.d/api-users.conf

object ApiUser "client-pki-ticket" {
  password = "bea11beb7b810ea9ce6ea" //change this
  permissions = [ "actions/generate-ticket" ]
}

[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Retrieve the ticket on the master node `icinga2-master1.localdomain` with `curl`, for example:

```
[root@icinga2-master1.localdomain /]# curl -k -s -u client-pki-ticket:bea11beb7b810ea9ce6ea -H 'Accept: application/json' \
 -X POST 'https://localhost:5665/v1/actions/generate-ticket' -d '{ "cn": "icinga2-agent1.localdomain" }'
```

Store that ticket number for the [agent/satellite setup](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite) below.

> **Note**
>
> Never expose the ticket salt and/or ApiUser credentials to your client nodes.
> Example: Retrieve the ticket on the Puppet master node and send the compiled catalog
> to the authorized Puppet agent node which will invoke the
> [automated setup steps](06-distributed-monitoring.md#distributed-monitoring-automation-cli-node-setup).

### On-Demand CSR Signing <a id="distributed-monitoring-setup-on-demand-csr-signing"></a>

The client can be a secondary master, satellite or agent.
It sends a certificate signing request to the specified parent node without any
ticket. The admin on the primary master is responsible for reviewing and signing the requests
with the private CA key.

The parent node could either be the master itself, or a satellite which forwards the request
to the signing master.

Advantages:

* Central certificate request signing management.
* No pre-generated ticket is required for client setups.

Disadvantages:

* Asynchronous step for automated deployments.
* Needs client verification on the master.

#### On-Demand CSR Signing: Preparation <a id="distributed-monitoring-setup-on-demand-csr-signing-preparation"></a>

Prior to using this mode, ensure that the following steps are taken on
the signing master:

* The [master setup](06-distributed-monitoring.md#distributed-monitoring-setup-master) was run successfully. This includes:

    * Generated a CA key pair

* Restart of the master instance.

#### On-Demand CSR Signing: On the master <a id="distributed-monitoring-setup-on-demand-csr-signing-master"></a>

You can list pending certificate signing requests with the `ca list` CLI command.

```
[root@icinga2-master1.localdomain /]# icinga2 ca list
Fingerprint                                                      | Timestamp           | Signed | Subject
-----------------------------------------------------------------|---------------------|--------|--------
71700c28445109416dd7102038962ac3fd421fbb349a6e7303b6033ec1772850 | 2017/09/06 17:20:02 |        | CN = icinga2-agent2.localdomain
```

In order to show all requests, use the `--all` parameter.

```
[root@icinga2-master1.localdomain /]# icinga2 ca list --all
Fingerprint                                                      | Timestamp           | Signed | Subject
-----------------------------------------------------------------|---------------------|--------|--------
403da5b228df384f07f980f45ba50202529cded7c8182abf96740660caa09727 | 2017/09/06 17:02:40 | *      | CN = icinga2-agent1.localdomain
71700c28445109416dd7102038962ac3fd421fbb349a6e7303b6033ec1772850 | 2017/09/06 17:20:02 |        | CN = icinga2-agent2.localdomain
```

**Tip**: Add `--json` to the CLI command to retrieve the details in JSON format.

If you want to sign a specific request, you need to use the `ca sign` CLI command
and pass its fingerprint as argument.

```
[root@icinga2-master1.localdomain /]# icinga2 ca sign 71700c28445109416dd7102038962ac3fd421fbb349a6e7303b6033ec1772850
information/cli: Signed certificate for 'CN = icinga2-agent2.localdomain'.
```

> **Note**
>
> `ca list` cannot be used as historical inventory. Certificate
> signing requests older than 1 week are automatically deleted.

You can also remove an undesired CSR with the `ca remove` command, using the
same syntax as the `ca sign` command.

```
[root@pym ~]# icinga2 ca remove 5c31ca0e2269c10363a97e40e3f2b2cd56493f9194d5b1852541b835970da46e
information/cli: Certificate 5c31ca0e2269c10363a97e40e3f2b2cd56493f9194d5b1852541b835970da46e removed.
```

If you want to restore a certificate you have removed, you can use `ca restore`.

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-setup-satellite-client"></a>

## Agent/Satellite Setup <a id="distributed-monitoring-setup-agent-satellite"></a>

This section describes the setup of an agent or satellite connected to an
existing master node setup. If you haven't done so already, please [run the master setup](06-distributed-monitoring.md#distributed-monitoring-setup-master).

Icinga 2 on the master node must be running and accepting connections on port `5665`.

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-setup-client-linux"></a>

### Agent/Satellite Setup on Linux <a id="distributed-monitoring-setup-agent-satellite-linux"></a>

Please ensure that you've run all the steps mentioned in the [agent/satellite section](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite).

Follow the [installation instructions](02-installation.md) for the Icinga 2 package and the required
check plugins if you haven't done so already.

The next step is to run the `node wizard` CLI command.

In this example we're generating a ticket on the master node `icinga2-master1.localdomain` for the agent `icinga2-agent1.localdomain`:

```
[root@icinga2-master1.localdomain /]# icinga2 pki ticket --cn icinga2-agent1.localdomain
4f75d2ecd253575fe9180938ebff7cbca262f96e
```

Note: You don't need this step if you have chosen to use [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing).

Start the wizard on the agent `icinga2-agent1.localdomain`:

```
[root@icinga2-agent1.localdomain /]# icinga2 node wizard

Welcome to the Icinga 2 Setup Wizard!

We will guide you through all required configuration details.
```

Press `Enter` or add `y` to start a satellite or agent setup.

```
Please specify if this is an agent/satellite setup ('n' installs a master setup) [Y/n]:
```

Press `Enter` to use the proposed name in brackets, or add a specific common name (CN). By convention
this should be the FQDN.

```
Starting the Agent/Satellite setup routine...

Please specify the common name (CN) [icinga2-agent1.localdomain]: icinga2-agent1.localdomain
```

Specify the direct parent for this node. This could be your primary master `icinga2-master1.localdomain`
or a satellite node in a multi level cluster scenario.
```
Please specify the parent endpoint(s) (master or satellite) where this node should connect to:
Master/Satellite Common Name (CN from your master/satellite node): icinga2-master1.localdomain
```
Press `Enter` or choose `y` to establish a connection to the parent node.
```
Do you want to establish a connection to the parent node from this node? [Y/n]:
```
> **Note:**
>
> If this node cannot connect to the parent node, choose `n`. The setup
> wizard will provide instructions for this scenario -- signing questions are disabled then.

Add the connection details for `icinga2-master1.localdomain`.
```
Please specify the master/satellite connection information:
Master/Satellite endpoint host (IP address or FQDN): 192.168.56.101
Master/Satellite endpoint port [5665]: 5665
```
You can add more parent nodes if necessary. Press `Enter` or choose `n`
if you don't want to add any. This comes in handy if you have more than one
parent node, e.g. two masters or two satellites.
```
Add more master/satellite endpoints? [y/N]:
```
Verify the parent node's certificate:
```
Parent certificate information:
Subject: CN = icinga2-master1.localdomain
Issuer: CN = Icinga CA
Valid From: Sep 7 13:41:24 2017 GMT
Valid Until: Sep 3 13:41:24 2032 GMT
Fingerprint: AC 99 8B 2B 3D B0 01 00 E5 21 FA 05 2E EC D5 A9 EF 9E AA E3
Is this information correct? [y/N]: y
```
The setup wizard fetches the parent node's certificate and asks
you to verify this information. This is to prevent MITM attacks or
any kind of untrusted parent relationship.

You can verify the fingerprint by running the following command on the parent node you are connecting to:

```bash
openssl x509 -noout -fingerprint -sha256 -in \
"/var/lib/icinga2/certs/$(hostname --fqdn).crt"
```
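
The wizard prints the fingerprint as space-separated hex bytes, while `openssl` prints it colon-separated after a `Fingerprint=` prefix, so the two strings cannot be compared character by character. The sketch below normalises both forms before comparing them. The name `normalize_fp` is an assumption made for illustration and is not part of Icinga 2 or OpenSSL. Both values must come from the same hash algorithm for the comparison to be meaningful.

```bash
# Hypothetical helper: strip an optional "<algo> Fingerprint=" prefix,
# remove colons and whitespace, and upper-case the remaining hex digits.
normalize_fp() {
  printf '%s' "${1#*=}" | tr -d ': \n' | tr '[:lower:]' '[:upper:]'
}

# Example comparison (wizard_fp is pasted from the wizard prompt,
# parent_fp is the output of the openssl command on the parent node):
# [ "$(normalize_fp "$wizard_fp")" = "$(normalize_fp "$parent_fp")" ] \
#   && echo "fingerprints match"
```

Only answer `y` in the wizard if the two normalised values are identical.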

Note: The certificate is not fetched if you have chosen not to connect
to the parent node.

Proceed with adding the optional client ticket for [CSR auto-signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing):

```
Please specify the request ticket generated on your Icinga 2 master (optional).
(Hint: # icinga2 pki ticket --cn 'icinga2-agent1.localdomain'):
4f75d2ecd253575fe9180938ebff7cbca262f96e
```

If you have chosen to use [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing),
you can leave the ticket question blank.
Icinga 2 then tells you to approve the request later on the master node.

```
No ticket was specified. Please approve the certificate signing request manually
on the master (see 'icinga2 ca list' and 'icinga2 ca sign --help' for details).
```
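
The wizard message above points to `icinga2 ca list` and `icinga2 ca sign` on the master. As a rough sketch, the pending requests can be filtered out of the list before signing. This assumes that `icinga2 ca list` prints a pipe-separated table with the columns fingerprint, timestamp, signed marker and subject, and that signed requests carry a `*` in the third column; check the output of your version before relying on it. The function name `pending_csr_fingerprints` is hypothetical.

```bash
# Hypothetical helper: read `icinga2 ca list` style output on stdin and print
# the fingerprint of every row whose "Signed" column is empty.
# Assumption: pipe-separated columns (fingerprint | timestamp | signed | subject).
pending_csr_fingerprints() {
  awk -F'|' '
    {
      fp = $1; signed = $3
      gsub(/[[:space:]]/, "", fp)
      gsub(/[[:space:]]/, "", signed)
      # Only rows that start with a hex fingerprint are certificate requests;
      # this skips the header and separator lines.
      if (fp ~ /^[0-9a-fA-F]+$/ && signed == "") print fp
    }'
}

# Example on the master:
# icinga2 ca list | pending_csr_fingerprints
# icinga2 ca sign <fingerprint>
```

Review each fingerprint and subject before signing instead of signing every pending request blindly.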
You can optionally specify a different bind host and/or port.
```
Please specify the API bind host/port (optional):
Bind Host []:
Bind Port []:
```
The next step asks you to accept configuration (required for [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync))
and commands (required for [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)).
```
Accept config from parent node? [y/N]: y
Accept commands from parent node? [y/N]: y
```

Next you can optionally specify the local and parent zone names. These are reflected
in the generated zone configuration file.

Set the local zone name to something else if you are installing a satellite or secondary master instance.

```
Local zone name [icinga2-agent1.localdomain]:
```

Set the parent zone name to something other than `master` if this agent connects to a satellite instance instead of the master.

```
Parent zone name [master]:
```

You can add more global zones in addition to `global-templates` and `director-global` if necessary.
Press `Enter` or choose `n` if you don't want to add any.

```
Reconfiguring Icinga...
Default global zones: global-templates director-global
Do you want to specify additional global zones? [y/N]: N
```

Last but not least, the wizard asks whether you want to disable the inclusion of the local configuration
directory `conf.d`. The default is to disable it (`Y`), because agents are either checked via command endpoint or
receive configuration synced from the parent zone.

```
Do you want to disable the inclusion of the conf.d directory [Y/n]: Y
Disabling the inclusion of the conf.d directory...
```

The wizard proceeds and you are good to go.

```
Done.
Now restart your Icinga 2 daemon to finish the installation!
```

> **Note**
>
> If you have chosen not to connect to the parent node, you cannot start
> Icinga 2 yet. The wizard asked you to manually copy the master's public
> CA certificate file into `/var/lib/icinga2/certs/ca.crt`.
>
> You need to [manually sign the CSR on the master node](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing-master).

Restart Icinga 2 as requested.

```
[root@icinga2-agent1.localdomain /]# systemctl restart icinga2
```

Here is an overview of all parameters in detail:

Parameter | Description
--------------------|--------------------
Common name (CN) | **Required.** By convention this should be the host's FQDN. Defaults to the FQDN.
Master common name | **Required.** Use the common name you've specified for your master node before.
Establish connection to the parent node | **Optional.** Whether the node should attempt to connect to the parent node or not. Defaults to `y`.
Master/Satellite endpoint host | **Required if the agent needs to connect to the master/satellite.** The parent endpoint's IP address or FQDN. This information is included in the `Endpoint` object configuration in the `zones.conf` file.
Master/Satellite endpoint port | **Optional if the agent needs to connect to the master/satellite.** The parent endpoint's listening port. This information is included in the `Endpoint` object configuration.
Add more master/satellite endpoints | **Optional.** If you have multiple master/satellite nodes configured, add them here.
Parent Certificate information | **Required.** Verify that the host you are connecting to really is the requested parent node.
Request ticket | **Optional.** Add the [ticket](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing) generated on the master.
API bind host | **Optional.** Specifies the address the ApiListener is bound to. For advanced usage only.
API bind port | **Optional.** Specifies the port the ApiListener is bound to. For advanced usage only (requires changing the default port 5665 everywhere).
Accept config | **Optional.** Whether this node accepts configuration sync from the master node (required for [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync)). For [security reasons](06-distributed-monitoring.md#distributed-monitoring-security) this defaults to `n`.
Accept commands | **Optional.** Whether this node accepts command execution messages from the master node (required for [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)). For [security reasons](06-distributed-monitoring.md#distributed-monitoring-security) this defaults to `n`.
Local zone name | **Optional.** Specifies the name for the local zone. This comes in handy when this instance is a satellite, not an agent. Defaults to the FQDN.
Parent zone name | **Optional.** Specifies the name for the parent zone. This is important if the agent has a satellite instance as parent, not the master. Defaults to `master`.
Global zones | **Optional.** Specifies more global zones in addition to `global-templates` and `director-global`. Defaults to `n`.
Disable conf.d | **Optional.** Disables the inclusion of the `conf.d` directory which holds local example configuration. Clients should retrieve their configuration from the parent node, or act as command endpoint execution bridge. Defaults to `y`.


The setup wizard will ensure that the following steps are taken:

* Enable the `api` feature.
* Create a certificate signing request (CSR) for the local node.
* Request a signed certificate from the master node (optionally with the provided ticket number).
* Let you verify the parent node's certificate.
* Store the signed agent/satellite certificate and `ca.crt` in `/var/lib/icinga2/certs`.
* Update the `zones.conf` file with the new zone hierarchy.
* Update `/etc/icinga2/features-enabled/api.conf` (`accept_config`, `accept_commands`) and `constants.conf`.
* Update `/etc/icinga2/icinga2.conf` and comment out `include_recursive "conf.d"`.
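
The steps above can also be scripted with the non-interactive `icinga2 node setup` CLI command. The following sketch only assembles the argument list so that it can be reviewed before anything is executed. The function name `node_setup_args` is hypothetical, the zone names follow this chapter's example, and the trusted certificate path is an assumption: the parent's certificate must already have been saved there (for example with `icinga2 pki save-cert`). Check `icinga2 node setup --help` for the options available in your version.

```bash
# Hypothetical helper: print the arguments for `icinga2 node setup`,
# one per line, for an agent that connects directly to the master zone.
# Assumptions: default port 5665, parent zone "master", and a previously
# saved parent certificate at the --trustedcert path.
node_setup_args() {
  local cn="$1" parent_cn="$2" parent_host="$3" ticket="$4"
  printf '%s\n' \
    --cn "$cn" \
    --zone "$cn" \
    --endpoint "${parent_cn},${parent_host},5665" \
    --parent_host "$parent_host" \
    --parent_zone master \
    --trustedcert /var/lib/icinga2/certs/trusted-parent.crt \
    --ticket "$ticket" \
    --accept-config --accept-commands \
    --disable-confd
}

# Example (Bash 4+): review the arguments, then run the setup.
# mapfile -t args < <(node_setup_args icinga2-agent1.localdomain \
#   icinga2-master1.localdomain 192.168.56.101 "$ticket")
# icinga2 node setup "${args[@]}"
```

Printing the arguments first keeps the sketch free of side effects; the actual `icinga2 node setup` call changes the local configuration and should be followed by a restart, as with the wizard.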

You can verify that the certificate files are stored in the `/var/lib/icinga2/certs` directory.
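
A quick way to do this check is sketched below. It assumes the certificate directory contains `ca.crt` plus a `<CN>.crt` and `<CN>.key` pair named after the node's common name; the `.key` file name and the function name `check_cert_files` are assumptions made for this example.

```bash
# Hypothetical helper: verify that the CA certificate and the node's
# certificate/key pair exist and are not empty.
# Assumption: files are named ca.crt, <cn>.crt and <cn>.key.
check_cert_files() {
  local dir="$1" cn="$2" f missing=0
  for f in ca.crt "${cn}.crt" "${cn}.key"; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${dir}/${f}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example on the agent:
# check_cert_files /var/lib/icinga2/certs "$(hostname --fqdn)"
```

The check only looks at file presence. With On-Demand CSR Signing the signed certificate arrives after the request has been approved on the master.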

> **Note**
>
> If the agent is not directly connected to the certificate signing master,
> signing requests and responses might need some minutes to fully update the agent certificates.
>
> If you have chosen to use [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing)
> certificates need to be signed on the master first. Ticket-less setups require Icinga 2 v2.8 or later on all involved instances.

Now that you've successfully installed a Linux/Unix agent/satellite instance, please proceed to
the [configuration modes](06-distributed-monitoring.md#distributed-monitoring-configuration-modes).

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-setup-client-windows"></a>

### Agent Setup on Windows <a id="distributed-monitoring-setup-agent-windows"></a>

The supported Windows agent versions are listed [here](https://icinga.com/subscription/support-details/).

Requirements:

* [Microsoft .NET Framework 4.6](https://www.microsoft.com/en-US/download/details.aspx?id=53344) or higher. This is the default on Windows Server 2016 or later.
* [Universal C Runtime for Windows](https://support.microsoft.com/en-us/help/2999226/update-for-universal-c-runtime-in-windows) for Windows Server 2012 and older.

#### Agent Setup on Windows: Installer <a id="distributed-monitoring-setup-agent-windows-installer"></a>

Download the MSI-Installer package from [https://packages.icinga.com/windows/](https://packages.icinga.com/windows/).
The preferred flavor is `x86_64` for modern Windows systems.

The Windows package provides native [monitoring plugin binaries](06-distributed-monitoring.md#distributed-monitoring-windows-plugins)
to get you started more easily.
The installer package also includes the [NSClient++](https://www.nsclient.org/) package
to allow using its built-in plugins. You can find more details in
[this chapter](06-distributed-monitoring.md#distributed-monitoring-windows-nscp).

> **Note**
>
> Icinga 2 was designed to run as a light-weight agent on Windows.
> There is no support for satellite instances.

Run the MSI-Installer package and follow the instructions shown in the screenshots.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_installer_01.png)
![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_installer_02.png)
![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_installer_03.png)
![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_installer_04.png)
![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_installer_05.png)

The graphical installer offers to run the [Icinga Agent setup wizard](06-distributed-monitoring.md#distributed-monitoring-setup-agent-windows-configuration-wizard)
after the installation. Select the check box to proceed.

> **Tip**
>
> You can also run the Icinga agent setup wizard from the Start menu later.

#### Agent Setup on Windows: Configuration Wizard <a id="distributed-monitoring-setup-agent-windows-configuration-wizard"></a>

On a fresh installation the setup wizard guides you through the initial configuration.
It also provides a mechanism to send a certificate request to the [CSR signing master](06-distributed-monitoring.md#distributed-monitoring-setup-sign-certificates-master).

The following configuration details are required:

Parameter | Description
--------------------|--------------------
Instance name | **Required.** By convention this should be the host's FQDN. Defaults to the FQDN.
Setup ticket | **Optional.** Paste the previously generated [ticket number](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing). If left blank, the certificate request must be [signed on the master node](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing).

Fill in the required information and click `Add` to add a new master connection.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_01.png)

Add the following details:

Parameter | Description
-------------------------------|-------------------------------
Instance name | **Required.** The name of the master/satellite endpoint that this agent is a direct child of.
Master/Satellite endpoint host | **Required.** The master or satellite's IP address or FQDN. This information is included in the `Endpoint` object configuration in the `zones.conf` file.
Master/Satellite endpoint port | **Optional.** The master or satellite's listening port. This information is included in the `Endpoint` object configuration.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_02.png)

When needed you can add an additional global zone (the zones `global-templates` and `director-global` are added by default):

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_02_global_zone.png)

Optionally enable the following settings:

Parameter | Description
--------------------------------------------------------|----------------------------------
Accept commands from master/satellite instance(s) | **Optional.** Whether this node accepts command execution messages from the master node (required for [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)). For [security reasons](06-distributed-monitoring.md#distributed-monitoring-security) this is disabled by default.
Accept config updates from master/satellite instance(s) | **Optional.** Whether this node accepts configuration sync from the master node (required for [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync)). For [security reasons](06-distributed-monitoring.md#distributed-monitoring-security) this is disabled by default.
Run Icinga 2 service as this user | **Optional.** Specify a different Windows user. This defaults to `NT AUTHORITY\Network Service`; a different user is required for more privileged service checks.
Install/Update bundled NSClient++ | **Optional.** The Windows installer bundles the NSClient++ installer for additional [plugin checks](06-distributed-monitoring.md#distributed-monitoring-windows-nscp).
Disable including local 'conf.d' directory | **Optional.** Disables the `include_recursive "conf.d"` directive in the `icinga2.conf` file, except for the `api-users.conf` file. Defaults to `true`.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_03.png)

Verify the certificate of the master/satellite instance that this node should connect to.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_04.png)

#### Bundled NSClient++ Setup <a id="distributed-monitoring-setup-agent-windows-nsclient"></a>

If you have chosen to install/update the NSClient++ package, the Icinga 2 setup wizard now asks
you to run its installer.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_01.png)

Choose the `Generic` setup.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_02.png)

Choose the `Custom` setup type.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_03.png)

NSClient++ does not install a sample configuration by default. Change this as shown in the screenshot.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_04.png)

Generate a secure password and enable the web server module. **Note**: The web server module is
available starting with NSClient++ 0.5.0. Icinga 2 v2.6 or later is required, as it includes this version.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_05.png)

Finish the installation.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_06.png)

Open a web browser and navigate to `https://localhost:8443`. Enter the password you've configured
during the setup. If you have lost it, look it up in the `C:\Program Files\NSClient++\nsclient.ini`
configuration file.

![Icinga 2 Windows Setup NSClient++](images/distributed-monitoring/icinga2_windows_setup_wizard_05_nsclient_07.png)

The NSClient++ REST API can be used to query metrics. [check_nscp_api](06-distributed-monitoring.md#distributed-monitoring-windows-nscp-check-api)
uses this transport method.

#### Finish Windows Agent Setup <a id="distributed-monitoring-setup-agent-windows-finish"></a>

Finish the Windows setup wizard.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_06_finish_with_ticket.png)

If you did not provide a setup ticket, you need to sign the certificate request on the master.
The setup wizard tells you to do so. The Icinga 2 service is already running at this point
and will automatically receive and update a signed client certificate.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_06_finish_no_ticket.png)

Icinga 2 is automatically started as a Windows service.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_running_service.png)

The Icinga 2 configuration is stored inside the `C:\ProgramData\icinga2` directory.
Click `Examine Config` in the setup wizard to open a new Explorer window.

![Icinga 2 Windows Setup](images/distributed-monitoring/icinga2_windows_setup_wizard_examine_config.png)

The configuration files can be modified with your favorite editor, e.g. Notepad++ or vim in PowerShell (via Chocolatey).

To use the [top down](06-distributed-monitoring.md#distributed-monitoring-top-down) agent
configuration, complete the following steps.

You don't need any local configuration on the agent except for
CheckCommand definitions which can be synced using the global zone
above. Therefore disable the inclusion of the `conf.d` directory
in the `icinga2.conf` file.

Navigate to `C:\ProgramData\icinga2\etc\icinga2` and open
the `icinga2.conf` file in your preferred editor. Remove or comment out (`//`)
the following line:

```
// Commented out, not required on an agent with top down mode
//include_recursive "conf.d"
```

> **Note**
>
> Packages >= 2.9 provide an option in the setup wizard to disable this.
> Defaults to disabled.

To validate the configuration on Windows, open an administrative PowerShell
and run the following commands:

```
C:\> cd "C:\Program Files\ICINGA2\sbin"
C:\Program Files\ICINGA2\sbin> .\icinga2.exe daemon -C
```

**Note**: You have to run this command in a shell with `administrator` privileges.

Now you need to restart the Icinga 2 service. Run `services.msc` from the start menu and restart the `icinga2` service.
Alternatively, open an administrative PowerShell and run the following commands:

```
C:\> Restart-Service icinga2
C:\> Get-Service icinga2
```

Now that you've successfully installed a Windows agent, please proceed to
the [detailed configuration modes](06-distributed-monitoring.md#distributed-monitoring-configuration-modes).

## Configuration Modes <a id="distributed-monitoring-configuration-modes"></a>

There are different ways to ensure that the Icinga 2 cluster nodes execute
checks, send notifications, etc.

The preferred method is to configure monitoring objects on the master
and distribute the configuration to satellites and agents.

The following chapters explain this in detail with hands-on manual configuration
examples. You should test and implement this once to fully understand how it works.

Once you are familiar with Icinga 2 and distributed monitoring, you
can start with additional integrations to manage and deploy your
configuration:

* [Icinga Director](https://icinga.com/docs/director/latest/) provides a web interface to manage configuration and can also sync imported resources (CMDB, PuppetDB, etc.)
* [Ansible Roles](https://icinga.com/products/integrations/)
* [Puppet Module](https://icinga.com/products/integrations/puppet/)
* [Chef Cookbook](https://icinga.com/products/integrations/chef/)

More details can be found [here](13-addons.md#configuration-tools).

### Top Down <a id="distributed-monitoring-top-down"></a>

There are two different behaviors with check execution:

* Send a command execution event remotely: The scheduler still runs on the parent node.
* Sync the host/service objects directly to the child node: Checks are executed locally.

Again, technically it does not matter whether this is an `agent` or a `satellite`
which is receiving configuration or command execution events.

### Top Down Command Endpoint <a id="distributed-monitoring-top-down-command-endpoint"></a>

This mode forces the Icinga 2 node to execute commands remotely on a specified endpoint.
The host/service object configuration is located on the master/satellite and the agent only
needs the CheckCommand object definitions available.

Every endpoint has its own remote check queue. The number of checks executed simultaneously
can be limited on the endpoint with the `MaxConcurrentChecks` constant defined in [constants.conf](04-configuration.md#constants-conf). Icinga 2 may discard check requests
if the remote check queue is full.

![Icinga 2 Distributed Top Down Command Endpoint](images/distributed-monitoring/icinga2_distributed_monitoring_agent_checks_command_endpoint.png)

Advantages:

* No local checks need to be defined on the child node (agent).
* Light-weight remote check execution (asynchronous events).
* No [replay log](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-command-endpoint-log-duration) is necessary for the child node.
* Pin checks to specific endpoints (if the child zone consists of 2 endpoints).

Disadvantages:

* If the child node is not connected, no more checks are executed.
* Requires an additional configuration attribute (`command_endpoint`) specified in host/service objects.
* Requires local `CheckCommand` object configuration. Best practice is to use a [global config zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync).

To make sure that all nodes involved will accept configuration and/or
commands, you need to configure the `Zone` and `Endpoint` hierarchy
on all nodes.

* `icinga2-master1.localdomain` is the configuration master in this scenario.
* `icinga2-agent1.localdomain` acts as an agent which receives command execution messages via command endpoint from the master. In addition, it receives the global check command configuration from the master.

Include the endpoint and zone configuration on **both** nodes in the file `/etc/icinga2/zones.conf`.

The endpoint configuration could look like this, for example:

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  host = "192.168.56.101"
}

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111"
  log_duration = 0 // Disable the replay log for command endpoint agents
}
```

Next, you need to define two zones. There is no naming convention, but best practice is to either use `master`, `satellite`/`agent-fqdn`, or to choose region names, for example `Europe`, `USA` and `Asia`.

**Note**: Each agent requires its own zone and endpoint configuration. Best practice
is to use the agent's FQDN for all object names.

The `master` zone is a parent of the `icinga2-agent1.localdomain` zone:

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ] //array with endpoint names
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master" //establish zone hierarchy
}
```
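
Because every agent needs its own `Endpoint` and `Zone` objects, the same block is repeated with different names and addresses for each agent. The following sketch renders that block from four parameters. The function name `render_agent_zones` is hypothetical, and the output mirrors the example objects above rather than any official template.

```bash
# Hypothetical helper: print the Endpoint and Zone objects for one agent
# that is a direct child of the "master" zone.
render_agent_zones() {
  local agent_fqdn="$1" agent_ip="$2" master_fqdn="$3" master_ip="$4"
  cat <<EOF
object Endpoint "${master_fqdn}" {
  host = "${master_ip}"
}

object Endpoint "${agent_fqdn}" {
  host = "${agent_ip}"
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "master" {
  endpoints = [ "${master_fqdn}" ]
}

object Zone "${agent_fqdn}" {
  endpoints = [ "${agent_fqdn}" ]
  parent = "master"
}
EOF
}

# Example: review the output before writing it to /etc/icinga2/zones.conf.
# render_agent_zones icinga2-agent1.localdomain 192.168.56.111 \
#   icinga2-master1.localdomain 192.168.56.101
```

On the master, `zones.conf` holds the objects for all agents, so only the agent's `Endpoint` and `Zone` objects would be appended there for each additional agent.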

You don't need any local configuration on the agent except for
CheckCommand definitions which can be synced using the global zone
above. Therefore disable the inclusion of the `conf.d` directory
in `/etc/icinga2/icinga2.conf`.

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/icinga2.conf

// Commented out, not required on an agent as command endpoint
//include_recursive "conf.d"
```

> **Note**
>
> Packages >= 2.9 provide an option in the setup wizard to disable this.
> Defaults to disabled.
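
If you prefer to make this change from a script instead of an editor, a small `sed` call can comment out the line. This is a sketch under the assumption that the directive appears on a line of its own; the function name `disable_confd` is made up for this example. A backup copy with the suffix `.bak` is kept next to the file.

```bash
# Hypothetical helper: comment out the active include_recursive "conf.d"
# directive in the given configuration file.
# Lines that are already commented out do not match and stay untouched.
disable_confd() {
  local conf="$1"
  sed -i.bak -E \
    's|^([[:space:]]*)(include_recursive[[:space:]]+"conf\.d".*)$|\1//\2|' \
    "$conf"
}

# Example on the agent:
# disable_confd /etc/icinga2/icinga2.conf
```

Running the function a second time leaves the file unchanged, so it is safe to use in repeated provisioning runs.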

Now it is time to validate the configuration and to restart the Icinga 2 daemon
on both nodes.

Example on CentOS 7:

```
[root@icinga2-agent1.localdomain /]# icinga2 daemon -C
[root@icinga2-agent1.localdomain /]# systemctl restart icinga2

[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```
2016-08-13 15:59:06 +02:00
Once the agents have successfully connected, you are ready for the next step: **execute
a remote check on the agent using the command endpoint**.

Include the host and service object configuration in the `master` zone
-- this makes it easier to add a secondary master for high availability later.

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/master
```

Add the host and service objects you want to monitor. There are
no restrictions on files and directories -- best practice is to
sort things by type.

By convention, a master/satellite/agent host object should use the same name as the endpoint object.
You can also add multiple hosts which execute checks against remote services/agents.

The following example adds the `agent_endpoint` custom variable to the
host and stores its name (FQDN). _Versions older than 2.11
used the `client_endpoint` custom variable._

This custom variable serves two purposes: 1) Service apply rules can match against it.
2) Apply rules can retrieve its value and assign it to the `command_endpoint` attribute.

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive" //check is executed on the master
  address = "192.168.56.111"

  vars.agent_endpoint = name //follows the convention that host name == endpoint name
}
```

Given that you are monitoring a Linux agent, add a remote [disk](10-icinga-template-library.md#plugin-check-command-disk)
check.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "disk" {
  check_command = "disk"

  // Specify the remote agent as command execution endpoint, fetch the host custom variable
  command_endpoint = host.vars.agent_endpoint

  // Only assign where a host is marked as agent endpoint
  assign where host.vars.agent_endpoint
}
```
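
The same pattern should extend to other plugin checks. The following is a minimal sketch, assuming the `load` CheckCommand from the Icinga Template Library and its plugin are available on the agent; it is an illustration, not part of the required setup.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "load" {
  check_command = "load"

  // Same approach as "disk": execute on the agent named in the host custom variable
  command_endpoint = host.vars.agent_endpoint

  // Only assign where a host is marked as agent endpoint
  assign where host.vars.agent_endpoint
}
```

Since both rules match on `host.vars.agent_endpoint`, every host that sets this custom variable would receive both services.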

If you have your own custom `CheckCommand` definition, add it to the global zone:

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/global-templates
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-templates/commands.conf

object CheckCommand "my-cmd" {
  //...
}
```
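
To give an idea of what such a definition could contain, here is a hedged sketch of a separate, purely hypothetical command. The object name `my-cmd-example`, the plugin file `check_my_example`, its `-w`/`-c` options and the custom variable names are assumptions made for illustration; only `PluginDir` is a constant that Icinga 2 ships.

```
object CheckCommand "my-cmd-example" {
  // Hypothetical plugin; the file must already exist on the agent that executes it
  command = [ PluginDir + "/check_my_example" ]

  arguments = {
    "-w" = "$my_example_warning$"   // assumed plugin option
    "-c" = "$my_example_critical$"  // assumed plugin option
  }

  // Defaults which hosts or services may override
  vars.my_example_warning = 80
  vars.my_example_critical = 90
}
```

Only the command definition is distributed through the global zone. The plugin itself has to be installed on the executing node by other means, e.g. packages.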

Save the changes and validate the configuration on the master node:

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
```

Restart the Icinga 2 daemon (example for CentOS 7):

```
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

The following steps will happen:

* Icinga 2 validates the configuration on `icinga2-master1.localdomain` and restarts.
* The `icinga2-master1.localdomain` node schedules and executes the checks.
* The `icinga2-agent1.localdomain` node receives the execute command event with additional command parameters.
* The `icinga2-agent1.localdomain` node maps the command parameters to the local check command, executes the check locally, and sends back the check result message.

As you can see, no interaction from your side is required on the agent itself, and it's not necessary to reload the Icinga 2 service on the agent.

You have learned the basics about command endpoint checks. Proceed with
the [scenarios](06-distributed-monitoring.md#distributed-monitoring-scenarios)
section where you can find detailed information on extending the setup.

### Top Down Config Sync <a id="distributed-monitoring-top-down-config-sync"></a>

This mode syncs the object configuration files within specified zones.
It comes in handy if you want to configure everything on the master node
and sync the satellite checks (disk, memory, etc.). The satellites run their
own local scheduler and will send the check result messages back to the master.

![Icinga 2 Distributed Top Down Config Sync](images/distributed-monitoring/icinga2_distributed_monitoring_satellite_config_sync.png)

Advantages:

* Sync the configuration files from the parent zone to the child zones.
* No manual restart is required on the child nodes, as syncing, validation, and restarts happen automatically.
* Execute checks directly on the child node's scheduler.
* Replay log if the connection drops (important for keeping the check history in sync, e.g. for SLA reports).
* Use a global zone for syncing templates, groups, etc.

Disadvantages:

* Requires a config directory on the master node with the zone name underneath `/etc/icinga2/zones.d`.
* Additional zone and endpoint configuration needed.
* Replay log is replicated on reconnect after connection loss. This might increase the data transfer and create an overload on the connection.

> **Note**
>
> This mode only supports **configuration text files** for Icinga. Do not abuse
> this for syncing binaries; this is not supported and may harm your production
> environment. The config sync uses checksums to detect changes, and binaries may
> trigger reload loops.
>
> This is a fair warning. If you want to deploy plugin binaries, create
> packages for dependency management and use infrastructure lifecycle tools
> such as Foreman, Puppet, Ansible, etc.

To make sure that all involved nodes accept configuration and/or
commands, you need to configure the `Zone` and `Endpoint` hierarchy
on all nodes.

* `icinga2-master1.localdomain` is the configuration master in this scenario.
* `icinga2-satellite1.localdomain` acts as satellite which receives configuration from the master. Checks are scheduled locally.

Include the endpoint and zone configuration on **both** nodes in the file `/etc/icinga2/zones.conf`.

The endpoint configuration could look like this:

```
[root@icinga2-satellite1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  host = "192.168.56.101"
}

object Endpoint "icinga2-satellite1.localdomain" {
  host = "192.168.56.105"
}
```

Next, you need to define two zones. There is no naming convention, but best practice is to use either `master`, `satellite`/`agent-fqdn`, or region names such as `Europe`, `USA` and `Asia`.

The `master` zone is a parent of the `satellite` zone:

```
[root@icinga2-satellite1.localdomain /]# vim /etc/icinga2/zones.conf

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ] //array with endpoint names
}

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain" ]

  parent = "master" //establish zone hierarchy
}
```

Edit the `api` feature on the satellite `icinga2-satellite1.localdomain` in
the `/etc/icinga2/features-enabled/api.conf` file and set
`accept_config` to `true`.

```
[root@icinga2-satellite1.localdomain /]# vim /etc/icinga2/features-enabled/api.conf

object ApiListener "api" {
  //...
  accept_config = true
}
```

Now it is time to validate the configuration and to restart the Icinga 2 daemon
on both nodes.

Example on CentOS 7:

```
[root@icinga2-satellite1.localdomain /]# icinga2 daemon -C
[root@icinga2-satellite1.localdomain /]# systemctl restart icinga2

[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

**Tip**: Best practice is to use a [global zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync)
for common configuration items (check commands, templates, groups, etc.).

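
As a rough sketch of that tip: the global zone is declared in `zones.conf` on every node that should receive it, and the shared objects live in a directory of the same name below `zones.d` on the config master. The template name `generic-service-example`, the file name `templates.conf` and the interval values are assumptions chosen for illustration.

```
// /etc/icinga2/zones.conf on the master and on the satellite
object Zone "global-templates" {
  global = true
}

// /etc/icinga2/zones.d/global-templates/templates.conf on the master only
template Service "generic-service-example" {
  max_check_attempts = 3
  check_interval = 1m
  retry_interval = 30s
}
```

Services in the satellite zone could then use `import "generic-service-example"` without the template being defined separately in each zone directory.
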
Once the satellite(s) have connected successfully, it's time for the next step: **execute
a local check on the satellite using the configuration sync**.

Navigate to `/etc/icinga2/zones.d` on your master node
`icinga2-master1.localdomain` and create a new directory with the same
name as your satellite/agent zone name:

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/satellite
```

Add the host and service objects you want to monitor. There are
no restrictions on files and directories -- best practice is to
sort things by type.

By convention, a master/satellite/agent host object should use the same name as the endpoint object.
You can also add multiple hosts which execute checks against remote services/agents via [command endpoint](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
checks.

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/satellite
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim hosts.conf

object Host "icinga2-satellite1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.105"
  zone = "master" //optional trick: sync the required host object to the satellite, but enforce the "master" zone to execute the check
}
```

Given that you are monitoring a Linux satellite, add a local [disk](10-icinga-template-library.md#plugin-check-command-disk)
check.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim services.conf

object Service "disk" {
  host_name = "icinga2-satellite1.localdomain"

  check_command = "disk"
}
```

Save the changes and validate the configuration on the master node:

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
```

Restart the Icinga 2 daemon (example for CentOS 7):

```
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

The following steps will happen:

* Icinga 2 validates the configuration on `icinga2-master1.localdomain`.
* Icinga 2 copies the configuration into its zone config store in `/var/lib/icinga2/api/zones`.
* The `icinga2-master1.localdomain` node sends a config update event to all endpoints in the same or direct child zones.
* The `icinga2-satellite1.localdomain` node accepts config and populates the local zone config store with the received config files.
* The `icinga2-satellite1.localdomain` node validates the configuration and automatically restarts.

Again, there is no interaction required on the satellite itself.

You can also use the config sync inside a high-availability zone to
ensure that all config objects are synced among zone members.

**Note**: You can only have one so-called "config master" in a zone which stores
the configuration in the `zones.d` directory.
Multiple nodes with configuration files in the `zones.d` directory are
**not supported**.

Now that you've learned the basics about the configuration sync, proceed with
the [scenarios](06-distributed-monitoring.md#distributed-monitoring-scenarios)
section where you can find detailed information on extending the setup.

If you would rather start fresh, take a look at the
[Icinga Director](https://icinga.com/docs/director/latest/).

## Scenarios <a id="distributed-monitoring-scenarios"></a>

The following examples should give you an idea of how to build your own
distributed monitoring environment. We've seen them all in production
environments and received feedback from our [community](https://community.icinga.com/)
and [partner support](https://icinga.com/support/) channels:

* [Single master with agents](06-distributed-monitoring.md#distributed-monitoring-master-agents)
* [HA master with agents as command endpoint](06-distributed-monitoring.md#distributed-monitoring-scenarios-ha-master-agents)
* [Three level cluster](06-distributed-monitoring.md#distributed-monitoring-scenarios-master-satellite-agents) with config HA masters, satellites receiving config sync, and agents checked using command endpoint

You can also extend the cluster tree depth to four levels, e.g. with two satellite levels.
Just keep in mind that multiple levels become harder to debug in case of errors.

You can also start with a single master setup, and later add a secondary
master endpoint. This requires an extra step with the [initial sync](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-initial-sync)
for cloning the runtime state. This is described in detail [here](06-distributed-monitoring.md#distributed-monitoring-scenarios-ha-master-agents).
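
To make the four-level layout mentioned above more concrete, the zone hierarchy could be sketched as follows. The zone names `satellite-region1` and `satellite-site1` and their endpoint names are hypothetical, and the matching `Endpoint` objects are left out for brevity.

```
object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

// First satellite level (hypothetical name)
object Zone "satellite-region1" {
  endpoints = [ "icinga2-satellite-region1.localdomain" ]

  parent = "master"
}

// Second satellite level (hypothetical name)
object Zone "satellite-site1" {
  endpoints = [ "icinga2-satellite-site1.localdomain" ]

  parent = "satellite-region1"
}

// Agent zone below the second satellite level
object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "satellite-site1"
}
```

Each `parent` attribute adds one hop that check results and configuration have to pass, which is why deeper trees take more effort to debug.
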

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-master-clients"></a>

### Master with Agents <a id="distributed-monitoring-master-agents"></a>

In this scenario, a single master node runs the check scheduler, notifications
and IDO database backend and uses the [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
to execute checks on the remote agents.

![Icinga 2 Distributed Master with Agents](images/distributed-monitoring/icinga2_distributed_monitoring_scenarios_master_with_agents.png)

* `icinga2-master1.localdomain` is the primary master node.
* `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` are two child nodes as agents.

Setup requirements:

* Set up `icinga2-master1.localdomain` as [master](06-distributed-monitoring.md#distributed-monitoring-setup-master).
* Set up `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` as [agent](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite).

Edit the `zones.conf` configuration file on the master:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // That's us
}

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Endpoint "icinga2-agent2.localdomain" {
  host = "192.168.56.112" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master"
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

The two agent nodes do not need to know about each other. The only important thing
is that they know about the parent zone and their endpoint members (and optionally the global zone).

If you specify the `host` attribute in the `icinga2-master1.localdomain` endpoint object,
the agent will actively try to connect to the master node. Since you've already specified the
`host` attribute of the agent endpoints on the master node, you don't want the agents to connect to the
master. **Choose one [connection direction](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-connection-direction).**

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent1.localdomain" {
  // That's us
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

```
[root@icinga2-agent2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent2.localdomain" {
  // That's us
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain" ]
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

Now it is time to define the two agent hosts and apply service checks using
the command endpoint execution method on them. Note: You can also use the
config sync mode here.

Create a new configuration directory on the master node:

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/master
```

Add the two agent nodes as host objects:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"
  vars.agent_endpoint = name //follows the convention that host name == endpoint name
}

object Host "icinga2-agent2.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.112"
  vars.agent_endpoint = name //follows the convention that host name == endpoint name
}
```

Add services using command endpoint checks:

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "ping4" {
  check_command = "ping4"

  //check is executed on the master node
  assign where host.address
}

apply Service "disk" {
  check_command = "disk"

  // Execute the check on the remote command endpoint
  command_endpoint = host.vars.agent_endpoint

  // Assign the service onto an agent
  assign where host.vars.agent_endpoint
}
```

Validate the configuration and restart Icinga 2 on the master node `icinga2-master1.localdomain`.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Open Icinga Web 2 and check the two newly created agent hosts with two new services
-- one executed locally (`ping4`) and one using command endpoint (`disk`).

> **Note**
>
> You don't necessarily need to add the agent endpoint/zone configuration objects
> into the master's `zones.conf` file. Instead, you can put them into `/etc/icinga2/zones.d/master`
> either in `hosts.conf` shown above, or in a new file called `agents.conf`.
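
A minimal sketch of the `agents.conf` variant, reusing the values from the master's `zones.conf` in this scenario. It assumes that the same `Endpoint` and `Zone` objects are removed from `zones.conf`, since an object must only be defined once.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/agents.conf

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master"
}
```

The second agent would follow the same pattern with its own name and address.
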
> **Tip**:
>
> It's a good idea to add [health checks](06-distributed-monitoring.md#distributed-monitoring-health-checks)
> to make sure that your cluster notifies you in case of failure.

In terms of health checks, consider adding the following for this scenario:

- Master node(s) check the connection to the agents
- Optional: Add dependencies for the agent host to prevent unwanted notifications when agents are unreachable

Proceed in [this chapter](06-distributed-monitoring.md#distributed-monitoring-health-checks-master-agents).
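
The two points above could look roughly like the following sketch. It assumes the `cluster-zone` CheckCommand from the Icinga Template Library and the convention used in this scenario that the agent zone name equals the host name; the service and dependency names are illustrative.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim health.conf

apply Service "agent-health" {
  check_command = "cluster-zone"

  // Relies on the convention that agent zone name == host object name
  vars.cluster_zone = host.name

  assign where host.vars.agent_endpoint
}

apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service is not OK
  disable_notifications = true

  assign where host.vars.agent_endpoint
  ignore where service.name == "agent-health" // Avoid a self reference
}
```

With the dependency in place, the command endpoint services on an unreachable agent should stay quiet, while `agent-health` itself reports the lost connection.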

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-scenarios-ha-master-clients"></a>

### High-Availability Master with Agents <a id="distributed-monitoring-scenarios-ha-master-agents"></a>

This scenario is similar to the one in the [previous section](06-distributed-monitoring.md#distributed-monitoring-master-agents). The only difference is that we will now set up two master nodes in a high-availability setup.
These nodes must be configured as zone and endpoint objects.

![Icinga 2 Distributed High Availability Master with Agents](images/distributed-monitoring/icinga2_distributed_monitoring_scenario_ha_masters_with_agents.png)

The setup uses the capabilities of the Icinga 2 cluster. All zone members
replicate cluster events between each other. In addition to that, several Icinga 2
features can enable [HA functionality](06-distributed-monitoring.md#distributed-monitoring-high-availability-features).

Best practice is to run the database backend on a dedicated server/cluster and
only expose a virtual IP address to Icinga and the IDO feature. By default, only one
endpoint then actively writes to the backend. Typical setups for MySQL clusters
involve Master-Master-Replication (Master-Slave-Replication in both directions) or Galera.
More tips can be found on our [community forums](https://community.icinga.com/).
The IDO object must have the same `instance_name` on all master nodes.

**Note**: All nodes in the same zone require that you enable the same features for high-availability (HA).

Overview:

* `icinga2-master1.localdomain` is the config master node.
* `icinga2-master2.localdomain` is the secondary master node without config in `zones.d`.
* `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` are two child nodes as agents.

Setup requirements:

* Set up `icinga2-master1.localdomain` as [master](06-distributed-monitoring.md#distributed-monitoring-setup-master).
* Set up `icinga2-master2.localdomain` as [satellite](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite) (**we will modify the generated configuration**).
* Set up `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` as [agents](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite) (when asked for adding multiple masters, set to `y` and add the secondary master `icinga2-master2.localdomain`).

In case you don't want to use the CLI commands, you can also manually create and sync the
required TLS certificates. We will modify and discuss all the details of the automatically generated configuration here.

Since there are now two nodes in the same zone, we must consider the
[high-availability features](06-distributed-monitoring.md#distributed-monitoring-high-availability-features).

* Checks and notifications are balanced between the two master nodes. That's fine, but it requires check plugins and notification scripts to exist on both nodes.
* The IDO feature will only be active on one node by default. Since all events are replicated between both nodes, it is easier to just have one central database.

One possibility is to use a dedicated MySQL cluster VIP (external application cluster)
and leave the IDO feature with enabled HA capabilities. Alternatively,
you can disable the HA feature and write to a local database on each node.
Both methods require that you configure Icinga Web 2 accordingly (monitoring
backend, IDO database, used transports, etc.).
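
For the first possibility, the IDO feature on both masters might be configured along these lines. This is only a sketch: the virtual IP `192.168.56.200`, the credentials and the `instance_name` value are placeholders, not values taken from this setup.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/features-enabled/ido-mysql.conf

object IdoMysqlConnection "ido-mysql" {
  host = "192.168.56.200" // Placeholder: virtual IP of the external MySQL cluster
  database = "icinga"
  user = "icinga"
  password = "CHANGEME" // Placeholder

  enable_ha = true // Only one master actively writes at a time
  instance_name = "icinga2-ha-example" // Placeholder: must be the same on all master nodes
}
```

The same object would be placed on `icinga2-master2.localdomain`. Setting `enable_ha = false` instead corresponds to the alternative where each node writes to its own local database.
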
2016-08-13 15:59:06 +02:00
2019-05-08 18:16:54 +02:00
> **Note**
>
2019-07-20 12:36:24 +02:00
> You can also start with a single master shown [here](06-distributed-monitoring.md#distributed-monitoring-master-agents) and later add
2019-05-08 18:16:54 +02:00
> the second master. This requires an extra step with the [initial sync](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-initial-sync)
> for cloning the runtime state after done. Once done, proceed here.
2019-07-20 12:36:24 +02:00
In this scenario, we are not adding the agent configuration immediately
to the `zones.conf` file but will establish the hierarchy later.
The first master looks like this:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // That's us
}

object Endpoint "icinga2-master2.localdomain" {
  host = "192.168.56.102" // Actively connect to the secondary master
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

The secondary master waits for connection attempts from the first master,
and therefore does not try to connect to it again.

```
[root@icinga2-master2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // The first master already connects to us
}

object Endpoint "icinga2-master2.localdomain" {
  // That's us
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

Restart both masters and ensure that the initial connection and TLS handshake work.

The two agent nodes do not need to know about each other. The only important thing
is that they know about the parent zone and its endpoint members (and optionally about the global zones).

If you specify the `host` attribute in the `icinga2-master1.localdomain` and `icinga2-master2.localdomain`
endpoint objects, the agent will actively try to connect to the master nodes. Since this scenario sets the
`host` attribute for the agent endpoints on the master nodes, we don't want the agent to connect to the
master nodes as well. **Choose one [connection direction](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-connection-direction).**

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-master2.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent1.localdomain" {
  // That's us
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

```
[root@icinga2-agent2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-master2.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent2.localdomain" {
  // That's us
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```
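
For comparison, the opposite connection direction could look like the following sketch for an agent's `zones.conf`. It assumes that the agent endpoint objects on the masters are then defined without the `host` attribute. The address `192.168.56.101` for the first master is an assumption that follows the example network.

```
// Sketch: the agent actively connects to both masters
object Endpoint "icinga2-master1.localdomain" {
  host = "192.168.56.101" // Assumed address of the first master
}

object Endpoint "icinga2-master2.localdomain" {
  host = "192.168.56.102"
}
```

The remaining endpoint and zone objects stay as shown. Use this only instead of, not in addition to, the `host` attribute for the agent endpoints on the masters.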

Now it is time to define the two agent hosts and apply service checks using
the command endpoint execution method.

Create a new configuration directory on the master node `icinga2-master1.localdomain`.
**Note**: The secondary master node `icinga2-master2.localdomain` receives the
configuration using the [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync).

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/master
```

Add the two agent nodes with their zone/endpoint and host object configuration.

> **Note**
>
> In order to keep things in sync between the two HA masters,
> keep the `zones.conf` file as small as possible.
>
> You can create the agent zone and endpoint objects inside the
> master zone and have them synced to the secondary master.
> The cluster config sync enforces a reload allowing the secondary
> master to connect to the agents as well.

Edit the `zones.conf` file and ensure that the agent zone/endpoint objects
are **not** specified in there.

Then navigate into `/etc/icinga2/zones.d/master` and create a new file `agents.conf`.

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim agents.conf

//-----------------------------------------------
// Endpoints

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Endpoint "icinga2-agent2.localdomain" {
  host = "192.168.56.112" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

//-----------------------------------------------
// Zones

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "master"
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "master"
}
```

Whenever you need to add another agent, edit the files shown in this section.

Next, create the corresponding host objects for the agents. Use the same names
for host and endpoint objects.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"

  vars.agent_endpoint = name // Follows the convention that host name == endpoint name
}

object Host "icinga2-agent2.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.112"

  vars.agent_endpoint = name // Follows the convention that host name == endpoint name
}
```
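
Adding another agent follows the same pattern. The following sketch uses a hypothetical third agent `icinga2-agent3.localdomain` with the assumed address `192.168.56.113`; neither is part of this scenario.

```
// agents.conf: endpoint and zone for the additional (hypothetical) agent
object Endpoint "icinga2-agent3.localdomain" {
  host = "192.168.56.113" // Assumed address
  log_duration = 0
}

object Zone "icinga2-agent3.localdomain" {
  endpoints = [ "icinga2-agent3.localdomain" ]

  parent = "master"
}

// hosts.conf: host object using the same name as the endpoint
object Host "icinga2-agent3.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.113"

  vars.agent_endpoint = name
}
```

The new agent still needs its own local `zones.conf`, analogous to the ones shown for the first two agents.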

Add services using command endpoint checks:

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "ping4" {
  check_command = "ping4"

  // Check is executed on the master node
  assign where host.address
}

apply Service "disk" {
  check_command = "disk"

  // Check is executed on the remote command endpoint
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.agent_endpoint
}
```

Validate the configuration and restart Icinga 2 on the master node `icinga2-master1.localdomain`.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Open Icinga Web 2 and check the two newly created agent hosts with two new services
-- one executed locally (`ping4`) and one using command endpoint (`disk`).

> **Tip**:
>
> It's a good idea to add [health checks](06-distributed-monitoring.md#distributed-monitoring-health-checks)
> to make sure that your cluster notifies you in case of failure.

In terms of health checks, consider adding the following for this scenario:

- Master node(s) check the connection to the agents
- Optional: Add dependencies for the agent host to prevent unwanted notifications when agents are unreachable

Proceed in [this chapter](06-distributed-monitoring.md#distributed-monitoring-health-checks-master-agents).
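
As a rough sketch of the two items above, the following uses the `cluster-zone` check command and relies on the convention from this scenario that the agent zone name equals the host name. The object names `agent-health` and `agent-health-check` and the file name are illustrative; the linked chapter remains the authoritative reference.

```
// zones.d/master/health.conf (hypothetical file name)
apply Service "agent-health" {
  check_command = "cluster-zone"
  display_name = "cluster-health-" + host.name

  // Assumes the agent zone name equals the host object name
  vars.cluster_zone = host.name

  assign where host.vars.agent_endpoint
}

// Optional: suppress notifications for agent services while the agent is unreachable
apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // The dependency fails once the parent service is not OK
  disable_notifications = true

  assign where host.vars.agent_endpoint
  ignore where service.name == "agent-health" // Avoid a self reference
}
```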

<!-- Keep this for compatibility -->
<a id="distributed-monitoring-scenarios-master-satellite-client"></a>

### Three Levels with Masters, Satellites and Agents <a id="distributed-monitoring-scenarios-master-satellite-agents"></a>

This scenario combines everything you've learned so far: High-availability masters,
satellites receiving their configuration from the master zone, and agents checked via command
endpoint from the satellite zones.

![Icinga 2 Distributed Master and Satellites with Agents](images/distributed-monitoring/icinga2_distributed_monitoring_scenarios_master_satellites_agents.png)

> **Tip**:
>
> It can get complicated, so grab a pen and paper and bring your thoughts to life.
> Play around with a test setup before using it in a production environment!

There are various reasons why you might want to have satellites in your environment. The following list explains the more common ones.

* Monitor remote locations. Besides reducing connections and traffic between different locations, this setup also helps when the network connection to the remote network is lost. Satellites will keep checking and collecting data on their own and will send their check results when the connection is restored.
* Reduce connections between security zones. Satellites in a different zone (e.g. DMZ) than your masters will help reduce connections through firewalls.
* Offload resource-hungry checks to other hosts. In very big setups, running lots of plugins on your masters or satellites might have a significant impact on the performance during times of high load. You can introduce another level of satellites just to run these plugins and send their results to the upstream hosts.

Best practice is to run the database backend on a dedicated server/cluster and
only expose a virtual IP address to Icinga and the IDO feature. By default, only one
endpoint will actively write to the backend then. Typical setups for MySQL clusters
involve Master-Master-Replication (Master-Slave-Replication in both directions) or Galera.
More tips can be found on our [community forums](https://community.icinga.com/).

Overview:

* `icinga2-master1.localdomain` is the configuration master node.
* `icinga2-master2.localdomain` is the secondary master node without configuration in `zones.d`.
* `icinga2-satellite1.localdomain` and `icinga2-satellite2.localdomain` are satellite nodes in a `master` child zone. They forward CSR signing requests to the master zone.
* `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` are two child nodes as agents.

Setup requirements:

* Set up `icinga2-master1.localdomain` as [master](06-distributed-monitoring.md#distributed-monitoring-setup-master).
* Set up `icinga2-master2.localdomain`, `icinga2-satellite1.localdomain` and `icinga2-satellite2.localdomain` with the [agent/satellite setup](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite) (we will modify the generated configuration).
* Set up `icinga2-agent1.localdomain` and `icinga2-agent2.localdomain` as [agents](06-distributed-monitoring.md#distributed-monitoring-setup-agent-satellite).

When asked for the parent endpoint providing CSR auto-signing capabilities,
please add one of the satellite nodes. **Note**: This requires Icinga 2 v2.8+
and the `CA Proxy` on all master, satellite and agent nodes.

Example for `icinga2-agent1.localdomain`:

```
Please specify the parent endpoint(s) (master or satellite) where this node should connect to:
```
Parent endpoint is the first satellite `icinga2-satellite1.localdomain`:
```
Master/Satellite Common Name (CN from your master/satellite node): icinga2-satellite1.localdomain
Do you want to establish a connection to the parent node from this node? [Y/n]: y
Please specify the master/satellite connection information:
Master/Satellite endpoint host (IP address or FQDN): 192.168.56.105
Master/Satellite endpoint port [5665]: 5665
```
Add the second satellite `icinga2-satellite2.localdomain` as parent:
```
Add more master/satellite endpoints? [y/N]: y
Master/Satellite Common Name (CN from your master/satellite node): icinga2-satellite2.localdomain
Do you want to establish a connection to the parent node from this node? [Y/n]: y
Please specify the master/satellite connection information:
Master/Satellite endpoint host (IP address or FQDN): 192.168.56.106
Master/Satellite endpoint port [5665]: 5665
Add more master/satellite endpoints? [y/N]: n
```

The specified parent nodes will forward the CSR signing request to the master instances.

Proceed with adding the optional client ticket for [CSR auto-signing](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing):

```
Please specify the request ticket generated on your Icinga 2 master (optional).
(Hint: # icinga2 pki ticket --cn 'icinga2-agent1.localdomain'):
4f75d2ecd253575fe9180938ebff7cbca262f96e
```

In case you've chosen to use [On-Demand CSR Signing](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing)
you can leave the ticket question blank.

Instead, Icinga 2 tells you to approve the request later on the master node.
```
No ticket was specified. Please approve the certificate signing request manually
on the master (see 'icinga2 ca list' and 'icinga2 ca sign --help' for details).
```

You can optionally specify a different bind host and/or port.

```
Please specify the API bind host/port (optional):
Bind Host []:
Bind Port []:
```

The next step asks you to accept configuration (required for [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync))
and commands (required for [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)).

```
Accept config from parent node? [y/N]: y
Accept commands from parent node? [y/N]: y
```
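
These two answers map to attributes of the `ApiListener` object. A sketch of the relevant part of the `api` feature configuration, leaving out any other attributes the wizard may write:

```
object ApiListener "api" {
  accept_config = true   // Required for the config sync mode
  accept_commands = true // Required for the command endpoint mode
}
```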

Next you can optionally specify the local and parent zone names. This will be reflected
in the generated zone configuration file.

```
Local zone name [icinga2-agent1.localdomain]: icinga2-agent1.localdomain
```

Set the parent zone name to `satellite` for this agent.

```
Parent zone name [master]: satellite
```

You can add more global zones in addition to `global-templates` and `director-global` if necessary.
Press `Enter` or choose `n` if you don't want to add any additional zones.

```
Reconfiguring Icinga...

Default global zones: global-templates director-global
Do you want to specify additional global zones? [y/N]: N
```
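
An additional global zone is a plain `Zone` object with `global = true`. In this sketch, the zone name `global-commands` is a made-up example; the object needs to be known on every node that should receive the zone's configuration.

```
object Zone "global-commands" {
  global = true
}
```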

Last but not least, the wizard asks you whether you want to disable the inclusion of the local configuration
directory in `conf.d`. This defaults to disabled, since agents are checked via command endpoint and the example
configuration would collide with this mode.

```
Do you want to disable the inclusion of the conf.d directory [Y/n]: Y
Disabling the inclusion of the conf.d directory...
```

**We'll discuss the details of the required configuration below. Most of this
configuration can be rendered by the setup wizards.**

The zone hierarchy can look like this. We'll define only the directly connected zones here.

The master instances should actively connect to the satellite instances, therefore
the configuration on `icinga2-master1.localdomain` and `icinga2-master2.localdomain`
must include the `host` attribute for the satellite endpoints:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // That's us
}

object Endpoint "icinga2-master2.localdomain" {
  host = "192.168.56.102" // Actively connect to the second master.
}

object Endpoint "icinga2-satellite1.localdomain" {
  host = "192.168.56.105" // Actively connect to the satellites.
}

object Endpoint "icinga2-satellite2.localdomain" {
  host = "192.168.56.106" // Actively connect to the satellites.
}

object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}
```

The endpoint configuration on the secondary master looks similar,
but changes the connection attributes: the first master already
tries to connect, so there is no need for a second attempt.

```
[root@icinga2-master2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // First master already connects to us
}

object Endpoint "icinga2-master2.localdomain" {
  // That's us
}

object Endpoint "icinga2-satellite1.localdomain" {
  host = "192.168.56.105" // Actively connect to the satellites.
}

object Endpoint "icinga2-satellite2.localdomain" {
  host = "192.168.56.106" // Actively connect to the satellites.
}
```

The zone configuration on both masters looks the same. Add this
to the corresponding `zones.conf` entries for the endpoints.

```
object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

In contrast to that, the satellite instances `icinga2-satellite1.localdomain`
and `icinga2-satellite2.localdomain` should not actively connect to the master
instances.

```
[root@icinga2-satellite1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // This endpoint will connect to us
}

object Endpoint "icinga2-master2.localdomain" {
  // This endpoint will connect to us
}

object Endpoint "icinga2-satellite1.localdomain" {
  // That's us
}

object Endpoint "icinga2-satellite2.localdomain" {
  host = "192.168.56.106" // Actively connect to the secondary satellite
}
```

Again, only one side is required to establish the connection inside the HA zone.
Since satellite1 already connects to satellite2, leave out the `host` attribute
for `icinga2-satellite1.localdomain` on satellite2.

```
[root@icinga2-satellite2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-master1.localdomain" {
  // This endpoint will connect to us
}

object Endpoint "icinga2-master2.localdomain" {
  // This endpoint will connect to us
}

object Endpoint "icinga2-satellite1.localdomain" {
  // First satellite already connects to us
}

object Endpoint "icinga2-satellite2.localdomain" {
  // That's us
}
```

The zone configuration on both satellites looks the same. Add this
to the corresponding `zones.conf` entries for the endpoints.

```
object Zone "master" {
  endpoints = [ "icinga2-master1.localdomain", "icinga2-master2.localdomain" ]
}

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]

  parent = "master"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

Keep in mind to control the endpoint [connection direction](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-connection-direction)
using the `host` attribute, also for other endpoints in the same zone.

Since we want to use [top down command endpoint](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint) checks,
we must configure the agent endpoint and zone objects.

In order to minimize the effort, we'll sync the agent zone and endpoint configuration to the
satellites where the connection information is needed as well. Note: This only works with satellites
and agents, since there already is a trust relationship between the master and the satellite zone.
The cluster config sync to the satellite invokes an automated reload causing the agent connection attempts.

`icinga2-master1.localdomain` is the configuration master where everything is stored:

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/{master,satellite,global-templates}
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/satellite

[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim icinga2-agent1.localdomain.conf

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The satellite actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "satellite"
}

[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim icinga2-agent2.localdomain.conf

object Endpoint "icinga2-agent2.localdomain" {
  host = "192.168.56.112" // The satellite actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "satellite"
}
```

The two agent nodes do not need to know about each other. The only important thing
is that they know about the parent zone (the satellite) and its endpoint members (and optionally the global zones).

> **Tip**
>
> In the example above we've specified the `host` attribute in the agent endpoint configuration. In this mode,
> the satellites actively connect to the agents. This costs some resources on the satellite -- if you prefer to
> offload the connection attempts to the agent, or your DMZ requires this, you can also change the **[connection direction](06-distributed-monitoring.md#distributed-monitoring-advanced-hints-connection-direction).**
>
> 1) Don't set the `host` attribute for the agent endpoints put into `zones.d/satellite`.
> 2) Modify each agent's `zones.conf` file and add the `host` attribute to all parent satellites. You can automate this using the `node wizard/setup` CLI commands.

The agents are waiting for the satellites to connect, therefore they don't specify
the `host` attribute in the endpoint objects locally.

Example for `icinga2-agent1.localdomain`:

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-satellite1.localdomain" {
  // Do not actively connect to the satellite by leaving out the 'host' attribute
}

object Endpoint "icinga2-satellite2.localdomain" {
  // Do not actively connect to the satellite by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent1.localdomain" {
  // That's us
}

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]

  parent = "satellite"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```

Example for `icinga2-agent2.localdomain`:

```
[root@icinga2-agent2.localdomain /]# vim /etc/icinga2/zones.conf

object Endpoint "icinga2-satellite1.localdomain" {
  // Do not actively connect to the satellite by leaving out the 'host' attribute
}

object Endpoint "icinga2-satellite2.localdomain" {
  // Do not actively connect to the satellite by leaving out the 'host' attribute
}

object Endpoint "icinga2-agent2.localdomain" {
  // That's us
}

object Zone "satellite" {
  endpoints = [ "icinga2-satellite1.localdomain", "icinga2-satellite2.localdomain" ]
}

object Zone "icinga2-agent2.localdomain" {
  endpoints = [ "icinga2-agent2.localdomain" ]

  parent = "satellite"
}

/* sync global commands */
object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
```
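
If you follow the two steps from the tip above and let the agents connect to the satellites instead, the relevant objects could look like this sketch for the first agent. Only the `host` attributes move; the zone objects stay as shown.

```
// 1) On the config master in zones.d/satellite: no 'host' attribute for the agent endpoint
object Endpoint "icinga2-agent1.localdomain" {
  log_duration = 0
}

// 2) In the agent's local zones.conf: 'host' attribute for all parent satellites
object Endpoint "icinga2-satellite1.localdomain" {
  host = "192.168.56.105"
}

object Endpoint "icinga2-satellite2.localdomain" {
  host = "192.168.56.106"
}
```

Note that the two parts belong to different files on different nodes and are only shown together for comparison.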

Now it is time to define the two agent hosts on the master, sync them to the satellites
and apply service checks using the command endpoint execution method to them.
Add the two agent nodes as host objects to the `satellite` zone.

We've already created the directories in `/etc/icinga2/zones.d` including the files for the
zone and endpoint configuration for the agents.

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/satellite
```

Add the host object configuration for the `icinga2-agent1.localdomain` agent. You should
have created the configuration file in the previous steps and it should contain the endpoint
and zone object configuration already.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim icinga2-agent1.localdomain.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"

  vars.agent_endpoint = name // Follows the convention that host name == endpoint name
}
```

Add the host object configuration to the `icinga2-agent2.localdomain` agent configuration file:

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim icinga2-agent2.localdomain.conf

object Host "icinga2-agent2.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.112"

  vars.agent_endpoint = name // Follows the convention that host name == endpoint name
}
```

Add a service object which is executed on the satellite nodes (e.g. `ping4`). Pin the apply rule to the `satellite` zone only.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim services.conf

apply Service "ping4" {
  check_command = "ping4"

  // Check is executed on the satellite node
  assign where host.zone == "satellite" && host.address
}
```

Add services using command endpoint checks. Pin the apply rules to the `satellite` zone only.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim services.conf

apply Service "disk" {
  check_command = "disk"

  // Execute the check on the remote command endpoint
  command_endpoint = host.vars.agent_endpoint

  assign where host.zone == "satellite" && host.vars.agent_endpoint
}
```

Validate the configuration and restart Icinga 2 on the master node `icinga2-master1.localdomain`.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Open Icinga Web 2 and check the two newly created agent hosts with two new services
-- one executed locally (`ping4`) and one using command endpoint (`disk`).

> **Tip**:
>
> It's a good idea to add [health checks](06-distributed-monitoring.md#distributed-monitoring-health-checks)
> to make sure that your cluster notifies you in case of failure.

In terms of health checks, consider adding the following for this scenario:

- Master nodes check whether the satellite zone is connected
- Satellite nodes check the connection to the agents
- Optional: Add dependencies for the agent host to prevent unwanted notifications when agents are unreachable

Proceed in [this chapter](06-distributed-monitoring.md#distributed-monitoring-health-checks-master-satellite-agent).

## Best Practice <a id="distributed-monitoring-best-practice"></a>

We've put together a collection of configuration examples from community feedback.
If you would like to share your tips and tricks with us, please join the [community channels](https://icinga.com/community/)!

### Global Zone for Config Sync <a id="distributed-monitoring-global-zone-config-sync"></a>

Global zones can be used to sync generic configuration objects
to all nodes depending on them. Common examples are:

* Templates which are imported into zone specific objects.
* Command objects referenced by Host, Service, Notification objects.
* Apply rules for services, notifications and dependencies.
* User objects referenced in notifications.
* Group objects.
* TimePeriod objects.

Plugin scripts and binaries must not be synced; this is for Icinga 2
configuration files only. Use your preferred package repository
and/or configuration management tool (Puppet, Ansible, Chef, etc.)
for keeping packages and scripts up to date.

**Note**: Checkable objects (hosts and services) cannot be put into a global
zone. The configuration validation will terminate with an error. Apply rules
work as they are evaluated locally on each endpoint.
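
As a minimal sketch of the note above: a service apply rule placed in a global zone is synced to every node and evaluated there, so each endpoint only creates the service for matching hosts it knows about. The file name and the `host.vars.os` custom variable are assumptions for illustration; `ssh` is a CheckCommand shipped with the Icinga Template Library.

```
// Hypothetical file: /etc/icinga2/zones.d/global-templates/services.conf

apply Service "ssh" {
  check_command = "ssh"

  // Evaluated locally on each endpoint against its own host objects.
  // host.vars.os is an assumed custom variable set on your hosts.
  assign where host.vars.os == "Linux"
}
```

The host objects themselves still need to live in a regular (non-global) zone directory.
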

The zone object configuration must be deployed on all nodes which should receive
the global configuration files:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

object Zone "global-commands" {
  global = true
}
```

The default global zones generated by the setup wizards are called `global-templates` and `director-global`.

While you can and should use `global-templates` for your global configuration, `director-global` is reserved for use
by [Icinga Director](https://icinga.com/docs/director/latest/). Please don't
place any configuration in it manually.

Similar to the zone configuration sync you'll need to create a new directory in
`/etc/icinga2/zones.d`:

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/global-commands
```

Next, add a new check command, for example:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-commands/web.conf

object CheckCommand "webinject" {
  //...
}
```
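
For illustration, a fuller command object that could live in the same global zone might look like the sketch below. The file name, the command name `my-http-content` and the `my_http_*` custom variables are hypothetical; the sketch wraps the `check_http` plugin, which has to be installed on every node that executes the check, since plugins are not synced.

```
// Hypothetical file: /etc/icinga2/zones.d/global-commands/my-http.conf

object CheckCommand "my-http-content" {
  command = [ PluginDir + "/check_http" ]

  arguments = {
    "-H" = "$my_http_vhost$"   // host name to request
    "-u" = "$my_http_uri$"     // URI path to fetch
    "-s" = "$my_http_expect$"  // string expected in the response; omitted if unset
  }

  // Defaults, can be overridden per host or service
  vars.my_http_vhost = "$address$"
  vars.my_http_uri = "/"
}
```

A service in any zone that receives the global zone can then reference it with `check_command = "my-http-content"`.
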

Restart the endpoint(s) which should receive the global zone
before restarting the parent master/satellite nodes.

Then validate the configuration on the master node and restart Icinga 2.

**Tip**: You can copy the example configuration files located in `/etc/icinga2/conf.d`
into the default global zone `global-templates`.

Example:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/conf.d
[root@icinga2-master1.localdomain /etc/icinga2/conf.d]# cp {commands,groups,notifications,services,templates,timeperiods,users}.conf /etc/icinga2/zones.d/global-templates
```
### Health Checks <a id="distributed-monitoring-health-checks"></a>

In case of network failures or other problems, your monitoring might
either have late check results or just send out mass alarms for unknown
checks.

In order to minimize the problems caused by this, you should configure
additional health checks.

#### cluster-zone with Masters and Agents <a id="distributed-monitoring-health-checks-master-agents"></a>

The `cluster-zone` check tests whether the configured target zone is currently
connected or not. This example adds a health check for the [HA master with agents scenario](06-distributed-monitoring.md#distributed-monitoring-scenarios-ha-master-agents).

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/services.conf

apply Service "agent-health" {
  check_command = "cluster-zone"

  display_name = "cluster-health-" + host.name

  /* This follows the convention that the agent zone name is the FQDN which is the same as the host object name. */
  vars.cluster_zone = host.name

  assign where host.vars.agent_endpoint
}
```
In order to prevent unwanted notifications, add a service dependency which gets applied to
all services using the command endpoint mode.
```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/dependencies.conf

apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.vars.agent_endpoint // Automatically assigns all agent endpoint checks as child services on the matched host
  ignore where service.name == "agent-health" // Avoid a self reference from child to parent
}
```
#### cluster-zone with Masters, Satellites and Agents <a id="distributed-monitoring-health-checks-master-satellite-agent"></a>

This example adds health checks for the [master, satellites and agents scenario](06-distributed-monitoring.md#distributed-monitoring-scenarios-master-satellite-agents).

Whenever the connection between the master and satellite zone breaks,
you may encounter late check results in Icinga Web. In order to view
this failure and also send notifications, add the following configuration:

First, add the two masters as host objects to the master zone, if they do not
already exist.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/hosts.conf

object Host "icinga2-master1.localdomain" {
  check_command = "hostalive"

  address = "192.168.56.101"
}

object Host "icinga2-master2.localdomain" {
  check_command = "hostalive"

  address = "192.168.56.102"
}
```

Add service health checks against the satellite zone.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/health.conf

apply Service "satellite-zone-health" {
  check_command = "cluster-zone"
  check_interval = 30s
  retry_interval = 10s

  vars.cluster_zone = "satellite"

  assign where match("icinga2-master*.localdomain", host.name)
}
```

**Don't forget to create notification apply rules for these services.**
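
A notification apply rule for the satellite zone health check could be sketched as follows. This is only an illustration: it assumes that the `mail-service-notification` template and the `icingaadmin` user from the example configuration are available on the masters (for instance via the `global-templates` zone), and the notification name and file name are hypothetical.

```
// Hypothetical file: /etc/icinga2/zones.d/master/notifications.conf

apply Notification "satellite-zone-health-mail" to Service {
  // Assumed template; it provides the notification command, states and types
  import "mail-service-notification"

  // Assumed User object
  users = [ "icingaadmin" ]

  assign where service.name == "satellite-zone-health"
}
```
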

Next are health checks for agents connected to the satellite zone.
Navigate into the satellite directory in `zones.d`:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/satellite
```

You should already have configured agent host objects following [the master, satellite, agents scenario](06-distributed-monitoring.md#distributed-monitoring-scenarios-master-satellite-agents).
Add a new configuration file where all the health checks are defined.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim health.conf

apply Service "agent-health" {
  check_command = "cluster-zone"

  display_name = "agent-health-" + host.name

  // This follows the convention that the agent zone name is the FQDN which is the same as the host object name.
  vars.cluster_zone = host.name

  // Create this health check for agent hosts in the satellite zone
  assign where host.zone == "satellite" && host.vars.agent_endpoint
}
```

In order to prevent unwanted notifications, add a service dependency which gets applied to
all services using the command endpoint mode.

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/satellite]# vim health.conf

apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.zone == "satellite" && host.vars.agent_endpoint // Automatically assigns all agent endpoint checks as child services on the matched host
  ignore where service.name == "agent-health" // Avoid a self reference from child to parent
}
```

This is all done on the configuration master, and requires the scenario to be fully up and running.

#### Cluster Check

The `cluster` check tests whether all endpoints in the current zone and the directly
connected zones are working properly. The disadvantage of using this check is that
you cannot monitor 3 or more cluster levels with it.

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-master1.localdomain.conf

object Host "icinga2-master1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.101"
}

[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/cluster.conf

object Service "cluster" {
  check_command = "cluster"
  check_interval = 5s
  retry_interval = 1s

  host_name = "icinga2-master1.localdomain"
}
```
### Pin Checks in a Zone <a id="distributed-monitoring-pin-checks-zone"></a>

In case you want to pin specific checks to their endpoints in a given zone you'll need to use
the `command_endpoint` attribute. This is useful, for example, if you want to
execute a local disk check in the `master` zone on a specific endpoint.

```
[root@icinga2-master1.localdomain /]# mkdir -p /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-master1.localdomain.conf

object Host "icinga2-master1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.101"
}

[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/services.conf

apply Service "disk" {
  check_command = "disk"

  command_endpoint = host.name // requires a host object matching the endpoint object name e.g. icinga2-master1.localdomain

  assign where host.zone == "master" && match("icinga2-master*", host.name)
}
```

The `host.zone` attribute check inside the expression ensures that
the service object is only created for host objects inside the `master`
zone. In addition to that, the [match](18-library-reference.md#global-functions-match)
function ensures that services are only created for the master nodes.
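
As a variation, and only as a sketch, `command_endpoint` can also be set to a literal endpoint name instead of being derived from the host name. The service name `disk-master2` is hypothetical, and the endpoint `icinga2-master2.localdomain` is assumed to exist in the `master` zone together with a host object of the same name.

```
apply Service "disk-master2" {
  check_command = "disk"

  // Always execute this check on one specific endpoint of the master zone
  command_endpoint = "icinga2-master2.localdomain"

  assign where host.zone == "master" && host.name == "icinga2-master2.localdomain"
}
```

Deriving the endpoint from `host.name` scales better when there are several endpoints; the literal form is mainly useful for one-off pinning.
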
### Windows Firewall <a id="distributed-monitoring-windows-firewall"></a>
#### ICMP Requests <a id="distributed-monitoring-windows-firewall-icmp"></a>

By default ICMP requests are disabled in the Windows firewall. You can
change that by [adding a new rule](https://support.microsoft.com/en-us/kb/947709).

```
C:\> netsh advfirewall firewall add rule name="ICMP Allow incoming V4 echo request" protocol=icmpv4:8,any dir=in action=allow
```

#### Icinga 2 <a id="distributed-monitoring-windows-firewall-icinga2"></a>

If your master/satellite nodes should actively connect to the Windows agent
you'll also need to ensure that port `5665` is enabled.

```
C:\> netsh advfirewall firewall add rule name="Open port 5665 (Icinga 2)" dir=in action=allow protocol=TCP localport=5665
```

#### NSClient++ API <a id="distributed-monitoring-windows-firewall-nsclient-api"></a>

If the [check_nscp_api](06-distributed-monitoring.md#distributed-monitoring-windows-nscp-check-api)
plugin is used to query NSClient++, you need to ensure that its port is enabled.

```
C:\> netsh advfirewall firewall add rule name="Open port 8443 (NSClient++ API)" dir=in action=allow protocol=TCP localport=8443
```

For security reasons, it is advised to enable the NSClient++ HTTP API for local
connections from the Icinga agent only. Remote connections to the HTTP API
are not recommended when using the legacy HTTP API.

### Windows Agent and Plugins <a id="distributed-monitoring-windows-plugins"></a>

The Icinga 2 package on Windows already provides several plugins.
Detailed [documentation](10-icinga-template-library.md#windows-plugins) is available for all check command definitions.

Based on the [master with agents](06-distributed-monitoring.md#distributed-monitoring-master-agents)
scenario we'll now add a local disk check.

First, add the agent node as host object:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent2.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.112"
  vars.agent_endpoint = name // follows the convention that host name == endpoint name
  vars.os_type = "windows"
}
```

Next, add the disk check using command endpoint checks (details in the
[disk-windows](10-icinga-template-library.md#windows-plugins-disk-windows) documentation):

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "disk C:" {
  check_command = "disk-windows"

  vars.disk_win_path = "C:"

  // specify where the check is executed
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.os_type == "windows" && host.vars.agent_endpoint
}
```
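
If a Windows agent has more than one drive, the same check can be generated per drive with an apply for rule. The following is a sketch only: it assumes that each Windows host object additionally carries a `vars.drives` array (e.g. `[ "C:", "D:" ]`), which is not part of the host definition above, and the `disk-win-` name prefix is hypothetical.

```
apply Service "disk-win-" for (drive in host.vars.drives) {
  check_command = "disk-windows"

  // One service per array entry, e.g. "disk-win-C:" and "disk-win-D:"
  vars.disk_win_path = drive

  // Execute on the agent, as in the single-drive example
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.os_type == "windows" && host.vars.agent_endpoint
}
```
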

Validate the configuration and restart Icinga 2.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Open Icinga Web 2 and check your newly added Windows disk check :)

![Icinga Windows Agent](images/distributed-monitoring/icinga2_distributed_windows_client_disk_icingaweb2.png)

If you want to add your own plugins please check [this chapter](05-service-monitoring.md#service-monitoring-requirements)
for the requirements.

### Windows Agent and NSClient++ <a id="distributed-monitoring-windows-nscp"></a>

There are two methods available for querying NSClient++:

* Query the [HTTP API](06-distributed-monitoring.md#distributed-monitoring-windows-nscp-check-api) locally from an Icinga agent (requires a running NSClient++ service)
* Run a [local CLI check](06-distributed-monitoring.md#distributed-monitoring-windows-nscp-check-local) (does not require NSClient++ as a service)

Both methods have their advantages and disadvantages. One thing to
note: If you rely on performance counter delta calculations such as
CPU utilization, please use the HTTP API instead of the CLI sample call.

#### NSClient++ with check_nscp_api <a id="distributed-monitoring-windows-nscp-check-api"></a>

The [Windows setup](06-distributed-monitoring.md#distributed-monitoring-setup-agent-windows) already allows
you to install the NSClient++ package. In addition to the Windows plugins you can
use the [nscp_api command](10-icinga-template-library.md#nscp-check-api) provided by the Icinga Template Library (ITL).

The initial setup for the NSClient++ API and the required arguments
are described in the ITL chapter for the [nscp_api](10-icinga-template-library.md#nscp-check-api) CheckCommand.

Based on the [master with agents](06-distributed-monitoring.md#distributed-monitoring-master-agents)
scenario we'll now add a local nscp check which queries the NSClient++ API to check the free disk space.

Define a host object called `icinga2-agent1.localdomain` on the master. Add the `nscp_api_password`
custom variable and specify the drives to check.

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"

  vars.agent_endpoint = name // follows the convention that host name == endpoint name
  vars.os_type = "Windows"
  vars.nscp_api_password = "icinga"
  vars.drives = [ "C:", "D:" ]
}
```

The service checks are generated using an [apply for](03-monitoring-basics.md#using-apply-for)
rule based on `host.vars.drives`:

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "nscp-api-" for (drive in host.vars.drives) {
  import "generic-service"

  check_command = "nscp_api"
  command_endpoint = host.vars.agent_endpoint

  //display_name = "nscp-drive-" + drive

  vars.nscp_api_host = "localhost"
  vars.nscp_api_query = "check_drivesize"
  vars.nscp_api_password = host.vars.nscp_api_password
  vars.nscp_api_arguments = [ "drive=" + drive ]

  ignore where host.vars.os_type != "Windows"
}
```

Validate the configuration and restart Icinga 2.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Two new services ("nscp-api-C:" and "nscp-api-D:") will be visible in Icinga Web 2.
If you enable the commented `display_name` attribute, they are shown as "nscp-drive-C:" and "nscp-drive-D:".

![Icinga 2 Distributed Monitoring Windows Agent with NSClient++ nscp-api](images/distributed-monitoring/icinga2_distributed_windows_nscp_api_drivesize_icingaweb2.png)

Note: You can also omit the `command_endpoint` configuration to execute
the command on the master. This also requires a different value for `nscp_api_host`
which defaults to `host.address`.

```
  //command_endpoint = host.vars.agent_endpoint

  //vars.nscp_api_host = "localhost"
```
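
Written out as a complete rule, the variant from the note above might look like this sketch. The `nscp-api-remote-` name prefix is hypothetical. The rule assumes that the NSClient++ API port on the Windows host is reachable from the node executing the check, which the firewall section above advises against for the legacy HTTP API.

```
apply Service "nscp-api-remote-" for (drive in host.vars.drives) {
  import "generic-service"

  check_command = "nscp_api"

  // No command_endpoint: the check is not executed on the agent.
  // vars.nscp_api_host is left unset, so the default (the host address) is used.

  vars.nscp_api_query = "check_drivesize"
  vars.nscp_api_password = host.vars.nscp_api_password
  vars.nscp_api_arguments = [ "drive=" + drive ]

  ignore where host.vars.os_type != "Windows"
}
```
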

You can verify the check execution by looking at the `Check Source` attribute
in Icinga Web 2 or the REST API.

If you want to monitor specific Windows services, you could use the following example:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"

  vars.agent_endpoint = name // follows the convention that host name == endpoint name
  vars.os_type = "Windows"
  vars.nscp_api_password = "icinga"
  vars.services = [ "Windows Update", "wscsvc" ]
}

[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "nscp-api-" for (svc in host.vars.services) {
  import "generic-service"

  check_command = "nscp_api"
  command_endpoint = host.vars.agent_endpoint

  //display_name = "nscp-service-" + svc

  vars.nscp_api_host = "localhost"
  vars.nscp_api_query = "check_service"
  vars.nscp_api_password = host.vars.nscp_api_password
  vars.nscp_api_arguments = [ "service=" + svc ]

  ignore where host.vars.os_type != "Windows"
}
```

#### NSClient++ with nscp-local <a id="distributed-monitoring-windows-nscp-check-local"></a>

The [Windows setup](06-distributed-monitoring.md#distributed-monitoring-setup-agent-windows) allows
you to install the bundled NSClient++ package. In addition to the Windows plugins you can
use the [nscp-local commands](10-icinga-template-library.md#nscp-plugin-check-commands)
provided by the Icinga Template Library (ITL).

Add the following `include` statement on all your nodes (master, satellite, agent):

```
vim /etc/icinga2/icinga2.conf

include <nscp>
```

The CheckCommand definitions will automatically determine the installed path
to the `nscp.exe` binary.

Based on the [master with agents](06-distributed-monitoring.md#distributed-monitoring-master-agents)
scenario we'll now add a local nscp check querying a given performance counter.

First, add the agent node as host object:

```
[root@icinga2-master1.localdomain /]# cd /etc/icinga2/zones.d/master
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim hosts.conf

object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.168.56.111"
  vars.agent_endpoint = name // follows the convention that host name == endpoint name
  vars.os_type = "windows"
}
```

Next, add a performance counter check using command endpoint checks (details in the
[nscp-local-counter](10-icinga-template-library.md#nscp-check-local-counter) documentation):

```
[root@icinga2-master1.localdomain /etc/icinga2/zones.d/master]# vim services.conf

apply Service "nscp-local-counter-cpu" {
  check_command = "nscp-local-counter"
  command_endpoint = host.vars.agent_endpoint

  vars.nscp_counter_name = "\\Processor(_total)\\% Processor Time"
  vars.nscp_counter_perfsyntax = "Total Processor Time"
  vars.nscp_counter_warning = 1
  vars.nscp_counter_critical = 5

  vars.nscp_counter_showall = true

  assign where host.vars.os_type == "windows" && host.vars.agent_endpoint
}
```
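
Other `nscp-local` commands from the ITL follow the same pattern. As a hedged sketch, the rule below uses the `nscp-local-uptime` CheckCommand with its ITL defaults; it assumes the same `os_type` and `agent_endpoint` host variables as above and that `include <nscp>` is in place on all involved nodes.

```
apply Service "nscp-local-uptime" {
  check_command = "nscp-local-uptime"

  // Execute on the Windows agent where nscp.exe is installed
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.os_type == "windows" && host.vars.agent_endpoint
}
```
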

Validate the configuration and restart Icinga 2.

```
[root@icinga2-master1.localdomain /]# icinga2 daemon -C
[root@icinga2-master1.localdomain /]# systemctl restart icinga2
```

Open Icinga Web 2 and check your newly added Windows NSClient++ check :)

![Icinga 2 Distributed Monitoring Windows Agent with NSClient++ nscp-local](images/distributed-monitoring/icinga2_distributed_windows_nscp_counter_icingaweb2.png)

> **Tip**
>
> In order to measure CPU load, you'll need a running NSClient++ service.
> Therefore it is advised to use a local [nscp-api](06-distributed-monitoring.md#distributed-monitoring-windows-nscp-check-api)
> check against its REST API.

## Advanced Hints <a id="distributed-monitoring-advanced-hints"></a>

You can find additional hints in this section if you prefer to go your own route
with automating setups (setup, certificates, configuration).

### Certificate Auto-Renewal <a id="distributed-monitoring-certificate-auto-renewal"></a>

As of Icinga 2 v2.8, nodes can request certificate updates
on their own. If a certificate is close to its expiration date, the node automatically
renews its already signed certificate by sending a signing request to the
parent node. You'll also see a message in the logs if certificate renewal
isn't necessary.

### High-Availability for Icinga 2 Features <a id="distributed-monitoring-high-availability-features"></a>
All nodes in the same zone require that you enable the same features for high-availability (HA).

By default, the following features provide advanced HA functionality:

* [Checks](06-distributed-monitoring.md#distributed-monitoring-high-availability-checks) (load balanced, automated failover).
* [Notifications](06-distributed-monitoring.md#distributed-monitoring-high-availability-notifications) (load balanced, automated failover).
* [DB IDO](06-distributed-monitoring.md#distributed-monitoring-high-availability-db-ido) (Run-Once, automated failover).
* [Elasticsearch](09-object-types.md#objecttype-elasticsearchwriter)
* [Gelf](09-object-types.md#objecttype-gelfwriter)
* [Graphite](09-object-types.md#objecttype-graphitewriter)
* [InfluxDB](09-object-types.md#objecttype-influxdb2writer) (v1 and v2)
* [OpenTsdb](09-object-types.md#objecttype-opentsdbwriter)
* [Perfdata](09-object-types.md#objecttype-perfdatawriter) (for PNP)
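
Since all nodes in a zone need the same features enabled, it can help to compare the feature lists of two endpoints. The sketch below assumes that you saved the output of `icinga2 feature list` from each node into a file and that the output contains a line starting with `Enabled features:`. The function names are hypothetical.

```bash
#!/bin/sh
# Print the enabled features, one per line and sorted, from
# `icinga2 feature list` output read on stdin.
enabled_features() {
    sed -n 's/^Enabled features: *//p' | tr ' ' '\n' | sed '/^$/d' | sort
}

# Compare two saved outputs; prints a diff and returns non-zero on mismatch.
compare_features() {
    a=$(mktemp) && b=$(mktemp) || return 2
    enabled_features <"$1" >"$a"
    enabled_features <"$2" >"$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Usage:
#   icinga2 feature list > master1.txt    # run on icinga2-master1.localdomain
#   icinga2 feature list > master2.txt    # run on icinga2-master2.localdomain
#   compare_features master1.txt master2.txt && echo "features match"
```
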
#### High-Availability with Checks <a id="distributed-monitoring-high-availability-checks"></a>
All instances within the same zone (e.g. the `master` zone as HA cluster) must
have the `checker` feature enabled.

Example:

```bash
icinga2 feature enable checker
```
All nodes in the same zone load-balance the check execution. If one instance shuts down,
the other nodes will automatically take over the remaining checks.
#### High-Availability with Notifications <a id="distributed-monitoring-high-availability-notifications"></a>
All instances within the same zone (e.g. the `master` zone as HA cluster) must
have the `notification` feature enabled.

Example:

```bash
icinga2 feature enable notification
```
Notifications are load-balanced amongst all nodes in a zone. By default this functionality
is enabled.
If your nodes should send out notifications independently from any other nodes (this will cause
duplicated notifications if not properly handled!), you can set `enable_ha = false`
in the [NotificationComponent](09-object-types.md#objecttype-notificationcomponent) feature.
#### High-Availability with DB IDO <a id="distributed-monitoring-high-availability-db-ido"></a>
All instances within the same zone (e.g. the `master` zone as HA cluster) must
have the DB IDO feature enabled.

Example DB IDO MySQL:

```bash
icinga2 feature enable ido-mysql
```
By default the DB IDO feature only runs on one node. All other nodes in the same zone disable
the active IDO database connection at runtime. The node with the active DB IDO connection is
not necessarily the zone master.

**Note**: The DB IDO HA feature can be disabled by setting the `enable_ha` attribute to `false`
for the [IdoMysqlConnection](09-object-types.md#objecttype-idomysqlconnection) or
[IdoPgsqlConnection](09-object-types.md#objecttype-idopgsqlconnection) object on **all** nodes in the
**same** zone.
All endpoints will then enable the DB IDO feature, connect to the configured
database and dump configuration, status and historical data on their own.

If the instance with the active DB IDO connection dies, the HA functionality will
automatically elect a new DB IDO master.

The DB IDO feature will try to determine which cluster endpoint is currently writing
to the database and bail out if another endpoint is active. You can manually verify that
by running the following query command:

```
icinga=> SELECT status_update_time, endpoint_name FROM icinga_programstatus;
status_update_time | endpoint_name
------------------------+---------------
2016-08-15 15:52:26+02 | icinga2-master1.localdomain
(1 row)
```
This is useful when the cluster connection between endpoints breaks, and prevents
data duplication in split-brain scenarios. The failover timeout can be set with the
`failover_timeout` attribute, but not lower than 60 seconds.
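
As an illustration of the two attributes mentioned above, the following sketch prints an `IdoMysqlConnection` object with explicit HA settings. The helper name `ido_ha_config`, the object name `ido-mysql` and the `120s` value are assumptions for this example; review the output and merge it into your existing feature configuration by hand.

```bash
#!/bin/sh
# Print an IdoMysqlConnection object with explicit HA settings.
# $1: enable_ha value (true or false), $2: failover timeout (example: 120s)
ido_ha_config() {
    cat <<EOF
object IdoMysqlConnection "ido-mysql" {
  // Database connection attributes (host, user, password, database)
  // are omitted here; keep the ones from your existing feature file.
  enable_ha = ${1:-true}
  failover_timeout = ${2:-120s}
}
EOF
}

# Usage: ido_ha_config true 120s
```
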
### Endpoint Connection Direction <a id="distributed-monitoring-advanced-hints-connection-direction"></a>
An endpoint attempts to connect to another endpoint when its local [Endpoint](09-object-types.md#objecttype-endpoint) object
configuration specifies a valid `host` attribute (FQDN or IP address).

Example for the master node `icinga2-master1.localdomain` actively connecting
to the agent node `icinga2-agent1.localdomain`:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

//...

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The master actively tries to connect to the agent
  log_duration = 0 // Disable the replay log for command endpoint agents
}
```

Example for the agent node `icinga2-agent1.localdomain` not actively
connecting to the master node `icinga2-master1.localdomain`:

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

//...

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
  log_duration = 0 // Disable the replay log for command endpoint agents
}
```

It is not necessary for the master and the agent node to each establish
a connection to the other. Icinga 2 only uses one connection
and closes the second connection if it is established. The redundant connection
wastes CPU cycles and blocks resources when the connection times out.

**Tip**: Choose either to let master/satellite nodes connect to agent nodes
or vice versa.

### Disable Log Duration for Command Endpoints <a id="distributed-monitoring-advanced-hints-command-endpoint-log-duration"></a>
The replay log is a built-in mechanism to ensure that nodes in a distributed setup
keep the same history (check results, notifications, etc.) when nodes are temporarily
disconnected and then reconnect.

This functionality is not needed when a master/satellite node is sending check
execution events to an agent which is configured as [command endpoint](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
for check execution.

The [Endpoint](09-object-types.md#objecttype-endpoint) object attribute `log_duration` can
be lowered or set to 0 to fully disable any log replay updates when the
agent is not connected.

Configuration on the master node `icinga2-master1.localdomain`:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.conf

//...

object Endpoint "icinga2-agent1.localdomain" {
  host = "192.168.56.111" // The master actively tries to connect to the agent
  log_duration = 0
}

object Endpoint "icinga2-agent2.localdomain" {
  host = "192.168.56.112" // The master actively tries to connect to the agent
  log_duration = 0
}
```

Configuration on the agent `icinga2-agent1.localdomain`:

```
[root@icinga2-agent1.localdomain /]# vim /etc/icinga2/zones.conf

//...

object Endpoint "icinga2-master1.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
  log_duration = 0
}

object Endpoint "icinga2-master2.localdomain" {
  // Do not actively connect to the master by leaving out the 'host' attribute
  log_duration = 0
}
```

### Initial Sync for new Endpoints in a Zone <a id="distributed-monitoring-advanced-hints-initial-sync"></a>
> **Note**
>
> This is required if you decide to change an already running single endpoint production
> environment into a HA-enabled cluster zone with two endpoints.
> The [initial setup](06-distributed-monitoring.md#distributed-monitoring-scenarios-ha-master-clients)
> with 2 HA masters doesn't require this step.

In order to make sure that all of your zone endpoints have the same state you need
to pick the authoritative running one and copy the following content:

* State file from `/var/lib/icinga2/icinga2.state`
* Internal config package for runtime created objects (downtimes, comments, hosts, etc.) at `/var/lib/icinga2/api/packages/_api`

If you need already deployed config packages from the Director, or synced cluster zones,
you can also sync the entire `/var/lib/icinga2/api/packages` directory. This directory should also be
included in your backup strategy.

Do **not** sync `/var/lib/icinga2/api/zones*` manually - this is an internal directory
and handled by the Icinga cluster config sync itself.

> **Note**
>
> Ensure that all endpoints are shut down during this procedure. Once you have
> synced the cached files, proceed with configuring the remaining endpoints
> to let them know about the new master/satellite node (zones.conf).
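
The copy step can be sketched as a small helper that only prints the commands for review. Using `rsync` over SSH as `root` and the function name `initial_sync_plan` are assumptions; any other copy method works as long as the paths listed above end up on the new endpoint.

```bash
#!/bin/sh
# Print (not execute) the copy commands for an initial sync to a new endpoint.
# $1: hostname of the new endpoint; rsync over SSH as root is an assumption.
initial_sync_plan() {
    target="$1"
    for path in /var/lib/icinga2/icinga2.state \
                /var/lib/icinga2/api/packages/_api; do
        # No trailing slash on the source: directories are copied as a whole.
        echo "rsync -a ${path} root@${target}:$(dirname "$path")/"
    done
}

# Usage (run on the authoritative endpoint while Icinga 2 is stopped everywhere):
#   initial_sync_plan icinga2-master2.localdomain
```

Printing the commands first makes it easier to confirm that `/var/lib/icinga2/api/zones*` is not part of the sync.
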
### Manual Certificate Creation <a id="distributed-monitoring-advanced-hints-certificates-manual"></a>
#### Create CA on the Master <a id="distributed-monitoring-advanced-hints-certificates-manual-ca"></a>
Choose the host which should store the certificate authority (one of the master nodes).
The first step is the creation of the certificate authority (CA) by running the following command
as root user:
```
[root@icinga2-master1.localdomain /root]# icinga2 pki new-ca
```
#### Create CSR and Certificate <a id="distributed-monitoring-advanced-hints-certificates-manual-create"></a>
Create a certificate signing request (CSR) for the local instance:
```
[root@icinga2-master1.localdomain /root]# icinga2 pki new-cert --cn icinga2-master1.localdomain \
--key icinga2-master1.localdomain.key \
--csr icinga2-master1.localdomain.csr
```
Sign the CSR with the previously created CA:

```
[root@icinga2-master1.localdomain /root]# icinga2 pki sign-csr --csr icinga2-master1.localdomain.csr --cert icinga2-master1.localdomain.crt
```
Repeat the steps for all instances in your setup.
#### Copy Certificates <a id="distributed-monitoring-advanced-hints-certificates-manual-copy"></a>
Copy the host's certificate files and the public CA certificate to `/var/lib/icinga2/certs`:

```
[root@icinga2-master1.localdomain /root]# mkdir -p /var/lib/icinga2/certs
[root@icinga2-master1.localdomain /root]# cp icinga2-master1.localdomain.{crt,key} /var/lib/icinga2/certs
[root@icinga2-master1.localdomain /root]# cp /var/lib/icinga2/ca/ca.crt /var/lib/icinga2/certs
```
Ensure that proper permissions are set (replace `icinga` with the Icinga 2 daemon user):
```
[root@icinga2-master1.localdomain /root]# chown -R icinga:icinga /var/lib/icinga2/certs
[root@icinga2-master1.localdomain /root]# chmod 600 /var/lib/icinga2/certs/*.key
[root@icinga2-master1.localdomain /root]# chmod 644 /var/lib/icinga2/certs/*.crt
```
The CA public and private key are stored in the `/var/lib/icinga2/ca` directory. Keep this path secure and include
it in your backups.
#### Create Multiple Certificates <a id="distributed-monitoring-advanced-hints-certificates-manual-multiple"></a>
Use your preferred method to automate the certificate generation process.
```
[root@icinga2-master1.localdomain /var/lib/icinga2/certs]# for node in icinga2-master1.localdomain icinga2-master2.localdomain icinga2-satellite1.localdomain; do icinga2 pki new-cert --cn $node --csr $node.csr --key $node.key; done
information/base: Writing private key to 'icinga2-master1.localdomain.key'.
information/base: Writing certificate signing request to 'icinga2-master1.localdomain.csr'.
information/base: Writing private key to 'icinga2-master2.localdomain.key'.
information/base: Writing certificate signing request to 'icinga2-master2.localdomain.csr'.
information/base: Writing private key to 'icinga2-satellite1.localdomain.key'.
information/base: Writing certificate signing request to 'icinga2-satellite1.localdomain.csr'.
[root@icinga2-master1.localdomain /var/lib/icinga2/certs]# for node in icinga2-master1.localdomain icinga2-master2.localdomain icinga2-satellite1.localdomain; do sudo icinga2 pki sign-csr --csr $node.csr --cert $node.crt; done
information/pki: Writing certificate to file 'icinga2-master1.localdomain.crt'.
information/pki: Writing certificate to file 'icinga2-master2.localdomain.crt'.
information/pki: Writing certificate to file 'icinga2-satellite1.localdomain.crt'.
```
Copy and move these certificates to the respective instances e.g. with SSH/SCP.
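
A possible way to script the copy step is sketched below. It assumes that the generated `<node>.crt` and `<node>.key` files are in the current directory, that SSH access as `root` is available, and that the target directory already exists. The function name `distribute_certs` and the `DRY_RUN` switch are hypothetical.

```bash
#!/bin/sh
# Copy each node's key pair plus the public CA certificate to that node.
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_certs() {
    for node in "$@"; do
        ${DRY_RUN:+echo} scp "${node}.crt" "${node}.key" /var/lib/icinga2/ca/ca.crt \
            "root@${node}:/var/lib/icinga2/certs/"
    done
}

# Usage:
#   DRY_RUN=1 distribute_certs icinga2-master2.localdomain icinga2-satellite1.localdomain
```

Remember to set ownership and permissions on the target nodes afterwards, as shown for the master above.
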
## Automation <a id="distributed-monitoring-automation"></a>
These hints should get you started with your own automation tools (Puppet, Ansible, Chef, Salt, etc.)
or custom scripts for automated setup.

These are collected best practices from various community channels.

* [Silent Windows setup](06-distributed-monitoring.md#distributed-monitoring-automation-windows-silent)
* [Node Setup CLI command](06-distributed-monitoring.md#distributed-monitoring-automation-cli-node-setup) with parameters

If you prefer an alternate method, we still recommend leaving all the Icinga 2 features intact (e.g. `icinga2 feature enable api`).
You should also use well known and documented default configuration file locations (e.g. `zones.conf`).
This will tremendously help when someone is trying to help in the [community channels](https://icinga.com/community/).

### Silent Windows Setup <a id="distributed-monitoring-automation-windows-silent"></a>
If you want to install the agent silently/unattended, use the `/qn` modifier. The
installation should not trigger a restart, but if you want to be completely sure, you can use the `/norestart` modifier.

```
C:> msiexec /i C:\Icinga2-v2.5.0-x86.msi /qn /norestart
```
Once the setup is completed you can use the `node setup` CLI command too.

### Node Setup using CLI Parameters <a id="distributed-monitoring-automation-cli-node-setup"></a>
Instead of using the `node wizard` CLI command, there is an alternative `node setup`
command available which has some prerequisites.

**Note**: The CLI command can be used on Linux/Unix and Windows operating systems.
The graphical Windows setup wizard actively uses these CLI commands.

#### Node Setup on the Master Node <a id="distributed-monitoring-automation-cli-node-setup-master"></a>
In case you want to set up a master node you must add the `--master` parameter
to the `node setup` CLI command. In addition to that the `--cn` can optionally
be passed (defaults to the FQDN).

Parameter           | Description
--------------------|--------------------
`--cn`              | **Optional.** Common name (CN). By convention this should be the host's FQDN. Defaults to the FQDN.
`--zone`            | **Optional.** Zone name. Defaults to `master`.
`--listen`          | **Optional.** Address to listen on. Syntax is `host,port`.
`--disable-confd`   | **Optional.** If provided, this disables the `include_recursive "conf.d"` directive and adds the `api-users.conf` file inclusion to `icinga2.conf`. Available since v2.9+. Not set by default for compatibility reasons with Puppet, Ansible, Chef, etc.

Example:

```
[root@icinga2-master1.localdomain /]# icinga2 node setup --master
```
In case you want to bind the `ApiListener` object to a specific
host/port you can specify it like this:

```
--listen 192.168.56.101,5665
```

In case you don't need anything in `conf.d`, use the following command line:
```
[root@icinga2-master1.localdomain /]# icinga2 node setup --master --disable-confd
```
<!-- Keep this for compatibility -->
<a id="distributed-monitoring-automation-cli-node-setup-satellite-client"></a>

#### Node Setup with Agents/Satellites <a id="distributed-monitoring-automation-cli-node-setup-agent-satellite"></a>
##### Preparations
Make sure that the `/var/lib/icinga2/certs` directory exists and is owned by the `icinga`
user (or the user Icinga 2 is running as).

```
[root@icinga2-agent1.localdomain /]# mkdir -p /var/lib/icinga2/certs
[root@icinga2-agent1.localdomain /]# chown -R icinga:icinga /var/lib/icinga2/certs
```

First you'll need to generate a new local self-signed certificate.
Pass the following details to the `pki new-cert` CLI command:

Parameter           | Description
--------------------|--------------------
`--cn`              | **Required.** Common name (CN). By convention this should be the host's FQDN.
`--key`, `--cert`   | **Required.** Client certificate files. These generated files will be put into the specified location. By convention this should be using `/var/lib/icinga2/certs` as directory.

Example:

```
[root@icinga2-agent1.localdomain /]# icinga2 pki new-cert --cn icinga2-agent1.localdomain \
    --key /var/lib/icinga2/certs/icinga2-agent1.localdomain.key \
    --cert /var/lib/icinga2/certs/icinga2-agent1.localdomain.crt
```

##### Verify Parent Connection
In order to verify the parent connection and avoid man-in-the-middle attacks,
fetch the parent instance's certificate and verify that it matches the connection.
The `trusted-parent.crt` file is a temporary file passed to `node setup` in the
next step and does not need to be stored for later usage.

Pass the following details to the `pki save-cert` CLI command:

Parameter           | Description
--------------------|--------------------
`--trustedcert`     | **Required.** Store the parent's certificate file. Manually verify that you're trusting it.
`--host`            | **Required.** FQDN or IP address of the parent host.

Request the master certificate from the master host (`icinga2-master1.localdomain`)
and store it as `trusted-parent.crt`. Review it and continue.

```
[root@icinga2-agent1.localdomain /]# icinga2 pki save-cert \
    --trustedcert /var/lib/icinga2/certs/trusted-parent.crt \
    --host icinga2-master1.localdomain
information/cli: Retrieving TLS certificate for 'icinga2-master1.localdomain:5665'.
Subject: CN = icinga2-master1.localdomain
Issuer: CN = icinga2-master1.localdomain
Valid From: Feb 4 08:59:05 2020 GMT
Valid Until: Jan 31 08:59:05 2035 GMT
Fingerprint: B4 90 DE 46 81 DD 2E BF EE 9D D5 47 61 43 EF C6 6D 86 A6 CC
***
*** You have to ensure that this certificate actually matches the parent
*** instance's certificate in order to avoid man-in-the-middle attacks.
***
information/pki: Writing certificate to file '/var/lib/icinga2/certs/trusted-parent.crt'.
```

##### Node Setup
Continue with the additional `node setup` step. Specify a local endpoint and zone name (`icinga2-agent1.localdomain`)
and set the master host (`icinga2-master1.localdomain`) as parent zone configuration. Specify the path to
the previously stored trusted parent certificate (`trusted-parent.crt`).

Pass the following details to the `node setup` CLI command:

Parameter           | Description
--------------------|--------------------
`--cn`              | **Optional.** Common name (CN). By convention this should be the host's FQDN.
`--ticket`          | **Required.** Request ticket. Add the previously generated [ticket number](06-distributed-monitoring.md#distributed-monitoring-setup-csr-auto-signing).
`--trustedcert`     | **Required.** Trusted parent certificate file as connection verification (received via 'pki save-cert').
`--parent_host`     | **Optional.** FQDN or IP address of the parent host. This is where the command connects for CSR signing. If not specified, you need to manually copy the parent's public CA certificate file into `/var/lib/icinga2/certs/ca.crt` in order to start Icinga 2.
`--endpoint`        | **Required.** Specifies the parent's endpoint name.
`--zone`            | **Required.** Specifies the agent/satellite zone name.
`--parent_zone`     | **Optional.** Specifies the parent's zone name.
`--accept-config`   | **Optional.** Whether this node accepts configuration sync from the master node (required for [config sync mode](06-distributed-monitoring.md#distributed-monitoring-top-down-config-sync)).
`--accept-commands` | **Optional.** Whether this node accepts command execution messages from the master node (required for [command endpoint mode](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)).
`--global_zones`    | **Optional.** Allows you to specify more global zones in addition to `global-templates` and `director-global`.
`--disable-confd`   | **Optional.** If provided, this disables the `include_recursive "conf.d"` directive in `icinga2.conf`. Available since v2.9+. Not set by default for compatibility reasons with Puppet, Ansible, Chef, etc.

> **Note**
>
> The `--master_host` parameter is deprecated and will be removed. Please use `--parent_host` instead.

Example:

```
[root@icinga2-agent1.localdomain /]# icinga2 node setup --ticket ead2d570e18c78abf285d6b85524970a0f69c22d \
    --cn icinga2-agent1.localdomain \
    --endpoint icinga2-master1.localdomain \
    --zone icinga2-agent1.localdomain \
    --parent_zone master \
    --parent_host icinga2-master1.localdomain \
    --trustedcert /var/lib/icinga2/certs/trusted-parent.crt \
    --accept-commands --accept-config \
    --disable-confd
```

In case the agent/satellite should connect to the master node, you'll
need to modify the `--endpoint` parameter using the format `cn,host,port`:

```
--endpoint icinga2-master1.localdomain,192.168.56.101,5665
```

Specify the parent zone using the `--parent_zone` parameter. This is useful
if the agent connects to a satellite, not the master instance.

```
--parent_zone satellite
```

In case the agent should know the additional global zone `linux-templates`, you'll
need to set the `--global_zones` parameter.

```
--global_zones linux-templates
```

The `--parent_host` parameter is optional since v2.9 and allows you to perform a connection-less setup.
In that case you cannot restart Icinga 2 right away: the CLI command asks you to manually copy the parent's public CA
certificate file to `/var/lib/icinga2/certs/ca.crt`. Once Icinga 2 is started, it sends
a ticket signing request to the parent node. If you have provided a ticket, the master node
signs the request and sends it back to the agent/satellite which performs a certificate update in-memory.

In case you did not provide a ticket, you need to [manually sign the CSR on the master node](06-distributed-monitoring.md#distributed-monitoring-setup-on-demand-csr-signing-master)
which holds the CA's key pair.
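
For automation it can be handy to assemble the `node setup` call from a few variables. The sketch below only prints the command line and uses the parameters documented in the table above; the helper name `node_setup_command` and its argument order are assumptions for this example.

```bash
#!/bin/sh
# Print the `icinga2 node setup` command line for one agent.
# $1: agent FQDN, $2: parent endpoint FQDN, $3: parent zone, $4: ticket
node_setup_command() {
    agent="$1"; parent="$2"; parent_zone="$3"; ticket="$4"
    printf '%s' "icinga2 node setup --ticket ${ticket}"
    # printf repeats the format string once per remaining argument.
    printf ' %s' \
        "--cn ${agent}" \
        "--endpoint ${parent}" \
        "--zone ${agent}" \
        "--parent_zone ${parent_zone}" \
        "--parent_host ${parent}" \
        "--trustedcert /var/lib/icinga2/certs/trusted-parent.crt" \
        "--accept-commands" "--accept-config"
    printf '\n'
}

# Usage:
#   node_setup_command icinga2-agent1.localdomain icinga2-master1.localdomain master "$TICKET"
```

Printing the command before running it keeps the ticket and host names reviewable in your automation logs.
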
**You can find additional best practices below.**
If this agent node is configured as [remote command endpoint execution](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint)
you can safely disable the `checker` feature. The `node setup` CLI command already disabled the `notification` feature.

```
[root@icinga2-agent1.localdomain /]# icinga2 feature disable checker
```

**Optional**: Add an ApiUser object configuration for remote troubleshooting.

```
[root@icinga2-agent1.localdomain /]# cat <<EOF >/etc/icinga2/conf.d/api-users.conf
object ApiUser "root" {
  password = "agentsupersecretpassword"
  permissions = ["*"]
}
EOF
```

Finally restart Icinga 2.

```
[root@icinga2-agent1.localdomain /]# systemctl restart icinga2
```

Your automation tool must then configure the master node in the meantime.

```
[root@icinga2-master1.localdomain /]# cat <<EOF >>/etc/icinga2/zones.conf
object Endpoint "icinga2-agent1.localdomain" {
  // Agent connects itself
}

object Zone "icinga2-agent1.localdomain" {
  endpoints = [ "icinga2-agent1.localdomain" ]
  parent = "master"
}

EOF
```
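
When many agents are added, the same stanza can be generated per agent. This is a sketch under assumptions: the function name `agent_zone_stanza` is hypothetical, and the `log_duration = 0` line follows the command endpoint hint from earlier in this chapter rather than the example directly above.

```bash
#!/bin/sh
# Print the Endpoint and Zone objects for one agent.
# $1: agent FQDN, $2: parent zone name (defaults to "master")
agent_zone_stanza() {
    cat <<EOF
object Endpoint "$1" {
  // Agent connects itself
  log_duration = 0 // Disable the replay log for command endpoint agents
}

object Zone "$1" {
  endpoints = [ "$1" ]
  parent = "${2:-master}"
}

EOF
}

# Usage on the master:
#   agent_zone_stanza icinga2-agent2.localdomain >>/etc/icinga2/zones.conf
#   icinga2 daemon -C && systemctl reload icinga2
```

Validating the configuration with `icinga2 daemon -C` before the reload catches typos in generated object names.
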
## Using Multiple Environments <a id="distributed-monitoring-environments"></a>
> **Note**
>
> This documentation only covers the basics. Full functionality requires a not yet released addon.

In some cases it can be desirable to run multiple Icinga instances on the same host.
Two potential scenarios include:

* Different versions of the same monitoring configuration (e.g. production and testing)
* Disparate sets of checks for entirely unrelated monitoring environments (e.g. infrastructure and applications)

The configuration is done with the global constants `ApiBindHost` and `ApiBindPort`
or the `bind_host` and `bind_port` attributes of the
[ApiListener](09-object-types.md#objecttype-apilistener) object.

The environment must be set with the global constant `Environment` or as object attribute
of the [IcingaApplication](09-object-types.md#objecttype-icingaapplication) object.

In any case the constant is the default value for the attribute, and direct configuration in the objects
takes precedence. The constants have been added to allow the values to be set from the CLI on startup.

When Icinga establishes a TLS connection to another cluster instance it automatically uses the [SNI extension](https://en.wikipedia.org/wiki/Server_Name_Indication)
to signal which endpoint it is attempting to connect to. On its own this can already be used to position multiple
Icinga instances behind a load balancer.

SNI example: `icinga2-agent1.localdomain`

However, if the environment is configured to `production`, Icinga appends the environment name to the SNI hostname like this:

SNI example with environment: `icinga2-agent1.localdomain:production`

Middleware like load balancers or TLS proxies can read the SNI header and route the connection to the appropriate target.
I.e., it uses a single externally-visible TCP port (usually 5665) and forwards connections to one or more Icinga
instances which are bound to a local TCP port. It does so by inspecting the environment name that is sent as part of the
SNI extension.
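
When preparing routing rules for such a proxy, the expected SNI names can be derived from the endpoint and environment names. The helper below is a minimal sketch of the naming rule described above; `sni_name` is a hypothetical function and not part of Icinga 2.

```bash
#!/bin/sh
# Print the SNI name for an endpoint, with the environment appended if set.
# $1: endpoint name, $2: environment name (optional)
sni_name() {
    endpoint="$1"; environment="$2"
    if [ -n "$environment" ]; then
        echo "${endpoint}:${environment}"
    else
        echo "$endpoint"
    fi
}

# Usage:
#   sni_name icinga2-agent1.localdomain production
```
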