Documentation: Update cluster zones from latest feedback

This is partly related to refs #6703.
Michael Friedrich 2014-08-01 16:18:30 +02:00
parent 3f647bb779
commit 2b91b3124d


@ -30,7 +30,7 @@ have the `snmp-community` custom attribute.
check_command = "snmp"
vars.snmp_oid = "1.3.6.1.2.1.1.3.0"
assign where host.vars.snmp_community != ""
}
@ -75,7 +75,7 @@ Example:
object Service "users" {
import "generic-service"
host_name = "remote-nrpe-host"
check_command = "nrpe"
@ -100,7 +100,7 @@ Example:
object Service "disk" {
import "generic-service"
host_name = "remote-windows-host"
check_command = "nscp"
@ -123,6 +123,11 @@ immediate replacement, but without any local configuration - or pushing
their standalone configuration back to the master node including their check
result messages.
> **Note**
>
> Remote checker instances are independent Icinga 2 instances which schedule
> their checks and just synchronize them back to the defined master zone.
### <a id="agent-based-checks-snmp-traps"></a> Passive Check Results and SNMP Traps
SNMP Traps can be received and filtered by using [SNMPTT](http://snmptt.sourceforge.net/) and specific trap handlers
@ -177,6 +182,15 @@ Now create a certificate and key file for each node running the following comman
Repeat the step for all nodes in your cluster scenario. Save the CA key in case
you want to set up certificates for additional nodes at a later time.
Each node requires the following files in `/etc/icinga2/pki` (replace `fqdn-nodename` with
the host's FQDN):
* ca.crt
* <fqdn-nodename>.crt
* <fqdn-nodename>.key
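For a node named `icinga2a` (a hypothetical example name), the directory would then contain:

    # ls /etc/icinga2/pki/
    ca.crt  icinga2a.crt  icinga2a.key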
### <a id="configure-nodename"></a> Configure the Icinga Node Name
Instead of using the default FQDN as node name you can optionally set
@ -225,7 +239,7 @@ Specifying the local node name using the [NodeName](#global-constants) variable
the same name as used for the endpoint name and common name above. If not set, the FQDN is used.
const NodeName = "icinga2a"
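The endpoint object and the certificate common name must use the same name. As a sketch, with a hypothetical address:

    object Endpoint "icinga2a" {
      host = "icinga2a.localdomain"
    }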
### <a id="configure-clusterlistener-object"></a> Configure the ApiListener Object
@ -246,10 +260,12 @@ You can simply enable the `api` feature using
# icinga2-enable-feature api
Edit `/etc/icinga2/features-enabled/api.conf` if you require the configuration
synchronisation enabled for this node.
> **Note**
>
> The certificate files must be readable by the user Icinga 2 is running as. Also,
> the private key file must not be world-readable.
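As a sketch, the resulting `/etc/icinga2/features-enabled/api.conf` could look like this, assuming the default certificate paths; set `accept_config = true` only on nodes which should accept synced configuration:

    object ApiListener "api" {
      cert_path = SysconfDir + "/icinga2/pki/" + NodeName + ".crt"
      key_path = SysconfDir + "/icinga2/pki/" + NodeName + ".key"
      ca_path = SysconfDir + "/icinga2/pki/ca.crt"
      accept_config = true
    }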
### <a id="configure-cluster-endpoints"></a> Configure Cluster Endpoints
@ -309,6 +325,21 @@ By default all objects for specific zones should be organized in
/etc/icinga2/zones.d/<zonename>
on the configuration master. You should remove the sample configuration included in
`conf.d` by commenting out the `include_recursive` statement in [icinga2.conf](#icinga2-conf):
//include_recursive "conf.d"
It is better to use a dedicated directory name like `cluster` (or similar), and include that
one only if your nodes require local configuration which should not be synced to other nodes.
That is useful for local [health checks](#cluster-health-check), for example, as sketched below.
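A sketch for [icinga2.conf](#icinga2-conf), assuming node-local objects live in `/etc/icinga2/cluster`:

    //include_recursive "conf.d"
    include_recursive "cluster"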
> **Note**
>
> In a [high availability](#cluster-scenarios-high-availability)
> setup only one assigned node can act as configuration master. All other zone
> member nodes must not have the `/etc/icinga2/zones.d` directory populated.
These zone packages are then distributed to all nodes in the same zone, and
to their respective target zone instances.
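On the receiving nodes the synced zone packages are typically stored below `/var/lib/icinga2/api/zones/`, for example (a sketch using the zone names from the scenarios below):

    # ls /var/lib/icinga2/api/zones/
    checker/  global-templates/  master/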
@ -340,15 +371,15 @@ process.
> determines the required include directory. This can be overridden using the
> [global constant](#global-constants) `ZonesDir`.
#### <a id="zone-synchronisation-permissions"></a> Global Configuration Zone
If your zone configuration setup shares the same templates, groups, commands, timeperiods, etc.,
you would have to duplicate quite a lot of configuration objects while keeping the object names
in the merged configuration on your configuration master unique.
This duplication is not necessary if you define a global zone shipping all those templates. By setting
`global = true` you ensure that this zone serving common configuration templates will be
synchronized to all involved nodes (only if they accept configuration though).
/etc/icinga2/zones.d
global-templates/
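The corresponding zone object must be known to all involved nodes, for example in their `zones.conf`:

    object Zone "global-templates" {
      global = true
    }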
@ -417,6 +448,21 @@ Each cluster node should execute its own local cluster health check to
get an idea about network related connection problems from different
points of view.
Additionally you can monitor the connection from the local zone to the remote
connected zones.
Example for the `checker` zone checking the connection to the `master` zone:
apply Service "cluster-zone-master" {
check_command = "cluster-zone"
check_interval = 5s
retry_interval = 1s
vars.cluster_zone = "master"
assign where host.name == "icinga2b"
}
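The local health check mentioned above could look similar, as a sketch using the built-in `cluster` check and the same example host:

    apply Service "cluster" {
      check_command = "cluster"
      check_interval = 5s
      retry_interval = 1s
      assign where host.name == "icinga2b"
    }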
### <a id="host-multiple-cluster-nodes"></a> Host With Multiple Cluster Nodes
Special scenarios might require multiple cluster nodes running on a single host.
@ -431,11 +477,22 @@ the Icinga 2 daemon.
### <a id="cluster-scenarios"></a> Cluster Scenarios
All cluster nodes are full-featured Icinga 2 instances. You only need to enable
the features for their role (for example, a `Checker` node only requires the `checker`
feature enabled, but not the `notification` or `ido-mysql` features).
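For example, a dedicated checker node might only need the following features toggled (a sketch; adjust to the features actually installed):

    # icinga2-enable-feature checker
    # icinga2-disable-feature notification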
Each instance has its own event scheduler and does not depend on a centralized master
coordinating and distributing the events. In case of a cluster failure, all nodes
continue to run independently. Be aware that when your cluster fails and a split-brain
scenario is in effect, all alive instances continue to do their job, and their histories will begin to differ.
#### <a id="cluster-scenarios-features"></a> Features in Cluster Zones
Each cluster zone may use all available features. If you have multiple locations
or departments, they may write to their local database, or populate graphite.
Furthermore, all commands are distributed amongst connected nodes. For example, you could
reschedule a check or acknowledge a problem on the master, and the command is replicated to the
actual slave checker node.
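As a sketch, assuming the `command` feature (external command pipe) is enabled on the master and a `disk` service exists on `remote-host`, such a rescheduled check would be written to the pipe and replicated to the responsible node:

    # echo "[`date +%s`] SCHEDULE_FORCED_SVC_CHECK;remote-host;disk;`date +%s`" \
      >> /var/run/icinga2/cmd/icinga2.cmd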
DB IDO on the left, graphite on the right side - works.
Icinga Web 2 on the left, checker and notifications on the right side - works too.
@ -449,17 +506,18 @@ to a central instance. Their network connection only works towards the central m
(or the master is able to connect, depending on firewall policies) which means
remote instances won't see or connect to each other.
All events (check results, downtimes, comments, etc) are synced to the central node,
but the remote nodes can still run local features such as a web interface, reporting,
graphing, etc. in their own specified zone.
Imagine the following example with a central node in Nuremberg, and two remote DMZ
based instances in Berlin and Vienna. The configuration tree on the central instance
could look like this:
    zones.d
      global-templates/
        templates.conf
        groups.conf
      nuremberg/
        local.conf
      berlin/
@ -502,6 +560,10 @@ The zones would look like:
parent = "nuremberg"
}
object Zone "global-templates" {
global = true
}
The `nuremberg-master` zone will only execute local checks, and receive
check results from the satellite nodes in the zones `berlin` and `vienna`.
@ -515,13 +577,12 @@ you can achieve that by:
* Let Icinga 2 distribute the load amongst all available nodes.
That way all remote check instances will receive the same configuration
but only execute their part. The central instance located in the `master` zone
can also execute checks, but you may also disable the `Checker` feature.
    zones.d/
      global-templates/
      master/
      checker/
If you are planning to have some checks executed by a specific set of checker nodes
@ -544,7 +605,7 @@ Endpoints:
Zones:
object Zone "central" {
object Zone "master" {
endpoints = [ "central-node" ]
}
@ -553,6 +614,10 @@ Zones:
parent = "central"
}
object Zone "global-templates" {
global = true
}
#### <a id="cluster-scenarios-high-availability"></a> High Availability
@ -573,6 +638,12 @@ commands, etc.
Two or more nodes in a high availability setup require an [initial cluster sync](#initial-cluster-sync).
> **Note**
>
> Keep in mind that only one node can act as configuration master having the
> configuration files in the `zones.d` directory. All other nodes must not
> have that directory populated. Details can be found in the [Configuration Sync Chapter](#cluster-zone-config-sync).
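As a minimal sketch, high availability simply means listing more than one endpoint in the same zone (names are hypothetical):

    object Zone "ha-master" {
      endpoints = [ "icinga2a", "icinga2b" ]
    }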
#### <a id="cluster-scenarios-multiple-hierachies"></a> Multiple Hierachies
@ -597,9 +668,3 @@ department instances. Furthermore the central NOC is able to see what's going on
The instances in the departments will serve a local interface, and allow the administrators
to reschedule checks or acknowledge problems for their services.