Docs: Cluster naming convention for clients, troubleshooting for overdue check results

fixes #10216
fixes #10207
Michael Friedrich 2015-09-25 11:32:34 +02:00
parent 497bce34b3
commit 4c87f62db2
4 changed files with 82 additions and 24 deletions


@ -25,6 +25,13 @@ monitoring and high-availability, please continue reading in
* Clients as [command execution bridge](10-icinga2-client.md#icinga2-client-configuration-command-bridge) without local configuration
* Clients receive their configuration from the master ([Cluster config sync](10-icinga2-client.md#icinga2-client-configuration-master-config-sync))
Keep the [naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention) for nodes in mind.
> **Tip**
>
> If you need to troubleshoot client problems, check the general
> [cluster troubleshooting](17-troubleshooting.md#troubleshooting-cluster) section.
### <a id="icinga2-client-configuration-combined-scenarios"></a> Combined Client Scenarios
If your setup consists of remote clients with local configuration but also command execution bridges
@ -51,13 +58,22 @@ If you are planning to use the Icinga 2 client inside a distributed setup, refer
### <a id="icinga2-client-installation-firewall"></a> Configure the Firewall
Icinga 2 master, satellite and client instances communicate using the default tcp
port `5665`.
Communication between zones requires one of these connection directions:
* The parent zone nodes are able to connect to the child zone nodes (`parent => child`).
* The child zone nodes are able to connect to the parent zone nodes (`parent <= child`).
* Both connection directions work.
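For example, if you rely on the `parent => child` direction, the child node's firewall must accept
incoming connections on TCP port 5665. A minimal sketch using iptables (adapt this to your
distribution's firewall tooling) could look like this:

    # iptables -A INPUT -p tcp --dport 5665 -j ACCEPT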
If you are going to use CSR Auto-Signing, you must (temporarily) allow the client
to connect to the master instance and open the firewall port. Once the client installation is done,
you can close the port and use a different communication direction (master-to-client).
In case of a [multiple hierarchy setup](13-distributed-monitoring-ha.md#cluster-scenarios-master-satellite-clients)
(master, satellite, client) you will need to manually deploy your [client certificates](11-icinga2-client.md#certificates-manual-creation)
and zone configuration.
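As a rough sketch, the trust chain for such a hierarchy is expressed through `Zone` objects with a
`parent` attribute; the endpoint and zone names below are placeholders for your own setup:

    object Endpoint "icinga2-master.localdomain" {
    }

    object Endpoint "icinga2-satellite.localdomain" {
    }

    object Zone "master" {
      endpoints = [ "icinga2-master.localdomain" ]
    }

    object Zone "satellite" {
      endpoints = [ "icinga2-satellite.localdomain" ]
      parent = "master"
    }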
### <a id="icinga2-client-installation-master-setup"></a> Setup the Master for Remote Clients
If you are planning to use the [remote Icinga 2 clients](10-icinga2-client.md#icinga2-client)
@ -170,7 +186,10 @@ First you'll need to define a secure ticket salt in the [constants.conf](4-confi
The [setup wizard for the master setup](10-icinga2-client.md#icinga2-client-installation-master-setup) will create
one for you already.
    # grep TicketSalt /etc/icinga2/constants.conf
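If the wizard did its job, the grep returns a line with a non-empty random string; the value below
is only a placeholder:

    const TicketSalt = "5ff7e59dd1ab4bb5b52be4a0f4ff1b4d"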
> **Note**
>
> **Never** expose the ticket salt to your clients. This is the master's private key
> and must remain on the master providing the CSR Auto-Signing functionality for security reasons.
The client setup wizard will ask you to generate a valid ticket number using its CN.
If you already know your remote client's Common Names (CNs) - usually the FQDN - you
@ -184,12 +203,6 @@ Example for a client:
    # icinga2 pki ticket --cn icinga2-node2.localdomain
> **Note**
>
> You can omit the `--salt` parameter using the `TicketSalt` constant from
> [constants.conf](4-configuring-icinga-2.md#constants-conf) if already defined and Icinga 2 was
> reloaded after the master setup.
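If you have already collected several client CNs, a small shell loop can pre-generate all tickets
at once; the hostnames below are examples only:

    # for cn in icinga2-node2.localdomain icinga2-node3.localdomain; do icinga2 pki ticket --cn "$cn"; done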
### <a id="certificates-manual-creation"></a> Manual SSL Certificate Generation
This is described separately in the [cluster setup chapter](12-distributed-monitoring-ha.md#manual-certificate-generation).


@ -19,13 +19,25 @@ is in effect - all alive instances continue to do their job, and history will be
Before you start deploying, keep the following things in mind:
Your [SSL CA and certificates](13-distributed-monitoring-ha.md#manual-certificate-generation) are mandatory for secure communication.
Communication between zones requires one of these connection directions:
* The parent zone nodes are able to connect to the child zone nodes (`parent => child`).
* The child zone nodes are able to connect to the parent zone nodes (`parent <= child`).
* Both connection directions work.
Update firewall rules and ACLs.
* Icinga 2 master, satellite and client instances communicate using the default tcp port `5665`.
Get pen and paper or a drawing board and design your nodes and zones!
* Keep the [naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention) for nodes in mind.
* All nodes (endpoints) in a cluster zone provide high availability functionality and trust each other.
* Cluster zones can be built in a Top-Down-design where the child trusts the parent.
Decide whether to use the built-in [configuration synchronization](13-distributed-monitoring-ha.md#cluster-zone-config-sync) or use an external tool (Puppet, Ansible, Chef, Salt, etc.) to manage the configuration deployment.
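If you go with the built-in config sync, the configuration for each zone lives in a per-zone
directory below `/etc/icinga2/zones.d/` on the config master; a possible layout (zone names are
examples only) looks like this:

    /etc/icinga2/zones.d/
      master/
        health.conf
      satellite/
        hosts.conf
      global-templates/
        templates.conf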
> **Tip**
@ -86,17 +98,19 @@ If you're planning to use your existing CA and certificates please note that you
use wildcard certificates. The common name (CN) is mandatory for the cluster communication and
therefore must be unique for each connecting instance.
### <a id="cluster-naming-convention"></a> Cluster Naming Convention
## <a id="cluster-naming-convention"></a> Cluster Naming Convention
The SSL certificate common name (CN) will be used by the [ApiListener](6-object-types.md#objecttype-apilistener)
object to determine the local authority. This name must match the local [Endpoint](6-object-types.md#objecttype-endpoint)
object name.
Example:
Certificate generation for host with the FQDN `icinga2a`:
    # icinga2 pki new-cert --cn icinga2a --key icinga2a.key --csr icinga2a.csr
    # icinga2 pki sign-csr --csr icinga2a.csr --cert icinga2a.crt
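The generated key and certificate, together with the CA certificate, then need to be deployed to
that node. Assuming the default certificate location `/etc/icinga2/pki/` and the CA living in
`/var/lib/icinga2/ca/`, the copy step could look like this:

    # cp icinga2a.key icinga2a.crt /etc/icinga2/pki/
    # cp /var/lib/icinga2/ca/ca.crt /etc/icinga2/pki/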
Add a new `Endpoint` object named `icinga2a`:
    # vim zones.conf

    object Endpoint "icinga2a" {
@ -119,6 +133,8 @@ the same name as used for the endpoint name and common name above. If not set, t
    const NodeName = "icinga2a"
If you're using the host's FQDN everywhere, you're on the safe side. The setup wizards
do the very same.
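To double-check which `NodeName` is currently defined, you can simply grep the constants file:

    # grep NodeName /etc/icinga2/constants.conf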
## <a id="cluster-configuration"></a> Cluster Configuration
@ -558,8 +574,6 @@ You'll need to think about the following:
* Combine that with command execution bridges on remote clients and satellites
### <a id="cluster-scenarios-security"></a> Security in Cluster Scenarios
While there are certain capabilities to ensure the safe communication between all


@ -310,10 +310,41 @@ If the cluster zones do not sync their configuration, make sure to check the fol
* The `icinga2.log` log file in `/var/log/icinga2` will indicate whether this ApiListener
[accepts config](12-distributed-monitoring-ha.md#zone-config-sync-permissions) or not.
### <a id="troubleshooting-cluster-replay-log"></a> Cluster Troubleshooting Overdue Check Results
If your master does not receive check results (or any other events) from the child zones
(satellites, clients, etc.), make sure to check whether the client sending in the events
is allowed to do so.
The [cluster naming convention](13-distributed-monitoring-ha.md#cluster-naming-convention)
applies, so if there's a mismatch between your client node's endpoint name and the CN of its
certificate, the master will deny all events.
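You can compare the CN presented by the client certificate with the `Endpoint` object name
configured on the master; assuming the default certificate directory and the `icinga2b` example,
the check could look like this:

    # openssl x509 -in /etc/icinga2/pki/icinga2b.crt -noout -subject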
> **Tip**
>
> [Icinga Web 2](2-getting-started.md#setting-up-the-user-interface) provides a dashboard view
> for overdue check results.
Enable the [debug log](17-troubleshooting.md#troubleshooting-enable-debug-output) on the master
for more verbose insights.
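Enabling the debug log works through the feature CLI, followed by a restart (the service command
may differ on your distribution):

    # icinga2 feature enable debuglog
    # service icinga2 restart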
If the client cannot authenticate, it's a more general [problem](17-troubleshooting.md#troubleshooting-cluster-unauthenticated-clients).
The client's endpoint is not configured on nor trusted by the master node:
    Discarding 'check result' message from 'icinga2b': Invalid endpoint origin (client not allowed).
The check result message sent by the client does not belong to the zone the checkable object is
in on the master:
    Discarding 'check result' message from 'icinga2b': Unauthorized access.
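Both messages typically mean the master is missing a matching `Endpoint`/`Zone` definition for the
client, or the checkable objects are not assigned to the client's zone. A minimal sketch for the
`icinga2b` example (the zone layout is an assumption) would be:

    object Endpoint "icinga2b" {
    }

    object Zone "icinga2b" {
      endpoints = [ "icinga2b" ]
      parent = "master"
    }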
### <a id="troubleshooting-cluster-replay-log"></a> Cluster Troubleshooting Replay Log
If your `/var/lib/icinga2/api/log` directory grows, it generally means that your cluster
cannot replay the log on connection loss and re-establishment. A master node, for example,
will store all events for endpoints in the same zone and in child zones which are not currently connected.
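You can check how much replay log data has accumulated, for example with:

    # du -sh /var/lib/icinga2/api/log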
Check the following:


@ -1,8 +1,8 @@
# <a id="getting-started"></a> Getting Started
This tutorial is a step-by-step introduction to installing [Icinga 2](2-getting-started.md#setting-up-icinga2)
and [Icinga Web 2](2-getting-started.md#setting-up-the-user-interface).
It assumes that you are familiar with the operating system you're using to install Icinga 2.
## <a id="setting-up-icinga2"></a> Setting up Icinga 2