mirror of https://github.com/Icinga/icinga2.git
Docs: Cluster naming convention for clients, troubleshooting for overdue check results
fixes #10216 fixes #10207
parent 497bce34b3
commit 4c87f62db2
@ -25,6 +25,13 @@ monitoring and high-availability, please continue reading in
* Clients as [command execution bridge](10-icinga2-client.md#icinga2-client-configuration-command-bridge) without local configuration
* Clients receive their configuration from the master ([Cluster config sync](10-icinga2-client.md#icinga2-client-configuration-master-config-sync))

Keep the [naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention) for nodes in mind.

> **Tip**
>
> If you're troubleshooting client problems, check the general
> [cluster troubleshooting](17-troubleshooting.md#troubleshooting-cluster) section.

### <a id="icinga2-client-configuration-combined-scenarios"></a> Combined Client Scenarios

If your setup consists of remote clients with local configuration but also command execution bridges
@ -51,13 +58,22 @@ If you are planning to use the Icinga 2 client inside a distributed setup, refer
### <a id="icinga2-client-installation-firewall"></a> Configure the Firewall

Icinga 2 master, satellite and client instances communicate using the default tcp
port `5665`. The communication is bi-directional and the first node opening the
connection "wins" if there are both connection ways enabled in your firewall policies.
port `5665`.

Communication between zones requires one of these connection directions:

* The parent zone nodes are able to connect to the child zone nodes (`parent => child`).
* The child zone nodes are able to connect to the parent zone nodes (`parent <= child`).
* Both connection directions work.

If you are going to use CSR-Autosigning, you must (temporarily) allow the client
to connect to the master instance and open the firewall port. Once the client install is done,
you can close the port and use a different communication direction (master-to-client).

In case of a [multiple hierarchy setup](13-distributed-monitoring-ha.md#cluster-scenarios-master-satellite-clients)
(master, satellite, client) you will need to manually deploy your [client certificates](11-icinga2-client.md#certificates-manual-creation)
and zone configuration.

### <a id="icinga2-client-installation-master-setup"></a> Setup the Master for Remote Clients

If you are planning to use the [remote Icinga 2 clients](10-icinga2-client.md#icinga2-client)
@ -170,7 +186,10 @@ First you'll need to define a secure ticket salt in the [constants.conf](4-confi
The [setup wizard for the master setup](10-icinga2-client.md#icinga2-client-installation-master-setup) will create
one for you already.

    # grep TicketSalt /etc/icinga2/constants.conf

> **Note**
>
> **Never** expose the ticket salt to your clients. This is the master's private key
> and must remain on the master providing the CSR Auto-Signing functionality for security reasons.

The client setup wizard will ask you to generate a valid ticket number using its CN.
If you already know your remote client's Common Names (CNs) - usually the FQDN - you
@ -184,12 +203,6 @@ Example for a client:
    # icinga2 pki ticket --cn icinga2-node2.localdomain


> **Note**
>
> You can omit the `--salt` parameter using the `TicketSalt` constant from
> [constants.conf](4-configuring-icinga-2.md#constants-conf) if already defined and Icinga 2 was
> reloaded after the master setup.
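
When the Common Names of several clients are known in advance, the `icinga2 pki ticket --cn` call shown above can be wrapped in a loop. The following is only a minimal sketch: the function name `generate_tickets`, the `ICINGA2_BIN` override variable, and the `CN ticket` output layout are assumptions made for illustration, not something Icinga 2 provides.

```shell
# Print one "CN ticket" line per Common Name given as an argument.
# ICINGA2_BIN is a hypothetical override for the icinga2 binary.
generate_tickets() {
    for cn in "$@"; do
        # Same call as in the single-client example above.
        ticket=$("${ICINGA2_BIN:-icinga2}" pki ticket --cn "$cn") || return 1
        printf '%s %s\n' "$cn" "$ticket"
    done
}
```

Run on the master, `generate_tickets icinga2-node2.localdomain` would print the CN followed by whatever ticket the CLI returns for it.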

### <a id="certificates-manual-creation"></a> Manual SSL Certificate Generation

This is described separately in the [cluster setup chapter](12-distributed-monitoring-ha.md#manual-certificate-generation).

@ -19,13 +19,25 @@ is in effect - all alive instances continue to do their job, and history will be

Before you start deploying, keep the following things in mind:

* Your [SSL CA and certificates](12-distributed-monitoring-ha.md#manual-certificate-generation) are mandatory for secure communication
* Get pen and paper or a drawing board and design your nodes and zones!
* all nodes in a cluster zone are providing high availability functionality and trust each other
* cluster zones can be built in a Top-Down-design where the child trusts the parent
* communication between zones happens bi-directional which means that a DMZ-located node can still reach the master node, or vice versa
* Update firewall rules and ACLs
* Decide whether to use the built-in [configuration synchronization](12-distributed-monitoring-ha.md#cluster-zone-config-sync) or use an external tool (Puppet, Ansible, Chef, Salt, etc) to manage the configuration deployment
Your [SSL CA and certificates](13-distributed-monitoring-ha.md#manual-certificate-generation) are mandatory for secure communication.

Communication between zones requires one of these connection directions:

* The parent zone nodes are able to connect to the child zone nodes (`parent => child`).
* The child zone nodes are able to connect to the parent zone nodes (`parent <= child`).
* Both connection directions work.

Update firewall rules and ACLs.

* Icinga 2 master, satellite and client instances communicate using the default tcp port `5665`.

Get pen and paper or a drawing board and design your nodes and zones!

* Keep the [naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention) for nodes in mind.
* All nodes (endpoints) in a cluster zone provide high availability functionality and trust each other.
* Cluster zones can be built in a Top-Down-design where the child trusts the parent.

Decide whether to use the built-in [configuration synchronization](13-distributed-monitoring-ha.md#cluster-zone-config-sync) or use an external tool (Puppet, Ansible, Chef, Salt, etc) to manage the configuration deployment.


> **Tip**
@ -86,17 +98,19 @@ If you're planning to use your existing CA and certificates please note that you
use wildcard certificates. The common name (CN) is mandatory for the cluster communication and
therefore must be unique for each connecting instance.

### <a id="cluster-naming-convention"></a> Cluster Naming Convention
## <a id="cluster-naming-convention"></a> Cluster Naming Convention

The SSL certificate common name (CN) will be used by the [ApiListener](6-object-types.md#objecttype-apilistener)
object to determine the local authority. This name must match the local [Endpoint](6-object-types.md#objecttype-endpoint)
object name.

Example:
Certificate generation for host with the FQDN `icinga2a`:

    # icinga2 pki new-cert --cn icinga2a --key icinga2a.key --csr icinga2a.csr
    # icinga2 pki sign-csr --csr icinga2a.csr --cert icinga2a.crt

Add a new `Endpoint` object named `icinga2a`:

    # vim zones.conf

    object Endpoint "icinga2a" {
@ -119,6 +133,8 @@ the same name as used for the endpoint name and common name above. If not set, t

    const NodeName = "icinga2a"

If you're using the host's FQDN everywhere, you're on the safe side. The setup wizards
will do the very same.
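
The configuration side of the naming convention can be checked with a small shell helper. This is a minimal sketch and not part of Icinga 2: the function name `check_node_name` is hypothetical, the default path for `zones.conf` is an assumption, and the match expects the single-space `object Endpoint "name"` layout shown above. It only compares the `NodeName` constant against the `Endpoint` object names and does not inspect the certificate's CN.

```shell
# Check that the NodeName constant has a matching Endpoint object.
# Arguments: path to constants.conf, path to zones.conf (defaults are assumptions).
check_node_name() {
    constants="${1:-/etc/icinga2/constants.conf}"
    zones="${2:-/etc/icinga2/zones.conf}"

    # Extract the value from a line like: const NodeName = "icinga2a"
    node_name=$(sed -n -E 's/^[[:space:]]*const[[:space:]]+NodeName[[:space:]]*=[[:space:]]*"([^"]+)".*/\1/p' "$constants" | head -n 1)
    if [ -z "$node_name" ]; then
        echo "NodeName is not set in $constants"
        return 2
    fi

    # Fixed-string search for: object Endpoint "<NodeName>"
    if grep -Fq "object Endpoint \"$node_name\"" "$zones"; then
        echo "OK: Endpoint \"$node_name\" found"
    else
        echo "MISMATCH: no Endpoint named \"$node_name\" in $zones"
        return 1
    fi
}
```

Calling `check_node_name` without arguments uses the default paths; a non-zero return code signals a missing constant or a missing endpoint.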

## <a id="cluster-configuration"></a> Cluster Configuration

@ -558,8 +574,6 @@ You'll need to think about the following:
* Combine that with command execution bridges on remote clients and also satellites




### <a id="cluster-scenarios-security"></a> Security in Cluster Scenarios

While there are certain capabilities to ensure the safe communication between all

@ -310,10 +310,41 @@ If the cluster zones do not sync their configuration, make sure to check the fol
* The `icinga2.log` log file in `/var/log/icinga2` will indicate whether this ApiListener
[accepts config](12-distributed-monitoring-ha.md#zone-config-sync-permissions), or not.

### <a id="troubleshooting-cluster-replay-log"></a> Cluster Troubleshooting Overdue Check Results

If your master does not receive check results (or any other events) from the child zones
(satellite, clients, etc) make sure to check whether the client sending in events
is allowed to do so.

The [cluster naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention)
applies so if there's a mismatch between your client node's endpoint name and its provided
certificate's CN, the master will deny all events.

> **Tip**
>
> [Icinga Web 2](2-getting-started.md#setting-up-the-user-interface) provides a dashboard view
> for overdue check results.

Enable the [debug log](17-troubleshooting.md#troubleshooting-enable-debug-output) on the master
for more verbose insights.

If the client cannot authenticate, it's a more general [problem](17-troubleshooting.md#troubleshooting-cluster-unauthenticated-clients).

The client's endpoint is not configured on nor trusted by the master node:

    Discarding 'check result' message from 'icinga2b': Invalid endpoint origin (client not allowed).

The check result message sent by the client does not belong to the zone the checkable object is
in on the master:

    Discarding 'check result' message from 'icinga2b': Unauthorized access.

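
To see at a glance which endpoints are affected, the two messages above can be counted per endpoint and reason. This is a minimal sketch under assumptions: the function name `summarize_discarded` is hypothetical, the debug log path in the usage comment is an assumption, and the pattern relies only on the message text quoted above, ignoring whatever prefix the log line carries.

```shell
# Count discarded check result messages per endpoint and reason.
# Usage: summarize_discarded /var/log/icinga2/debug.log   (path is an assumption)
summarize_discarded() {
    # Keep only the discard messages, reduce each to "endpoint: reason",
    # then count identical lines.
    grep -F "Discarding 'check result' message from" "$1" \
        | sed -E "s/.*message from '([^']+)': (.*)$/\1: \2/" \
        | sort | uniq -c
}
```

Each output line holds a count followed by the endpoint name and the reason, so a client with a name mismatch shows up as a growing count for `Invalid endpoint origin`.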

### <a id="troubleshooting-cluster-replay-log"></a> Cluster Troubleshooting Replay Log

If your `/var/lib/icinga2/api/log` directory grows, it generally means that your cluster
cannot replay the log on connection loss and re-establishment.
cannot replay the log on connection loss and re-establishment. A master node for example
will store all events for not connected endpoints in the same and child zones.

Check the following:

@ -1,8 +1,8 @@
# <a id="getting-started"></a> Getting Started

This tutorial is a step-by-step introduction to installing Icinga 2 and
Icinga Web 2. It assumes that you are familiar with the operating system
you're using to install Icinga 2.
This tutorial is a step-by-step introduction to installing [Icinga 2](2-getting-started.md#setting-up-icinga2)
and [Icinga Web 2](2-getting-started.md#setting-up-the-user-interface).
It assumes that you are familiar with the operating system you're using to install Icinga 2.

## <a id="setting-up-icinga2"></a> Setting up Icinga 2