You can either set that variable as constant configuration
definition in [icinga2.conf](#icinga2-conf) or pass it as a runtime variable to
the Icinga 2 daemon.

    # icinga2 -c /etc/icinga2/node1/icinga2.conf -DIcingaLocalStateDir=/opt/node1/var

### <a id="cluster-scenarios"></a> Cluster Scenarios

#### <a id="cluster-scenarios-features"></a> Features in Cluster

Each cluster instance may use available features. If you have multiple locations
or departments, they may write to their local database, or populate Graphite.
Furthermore, all commands are distributed (unless prohibited using [Domains](#domains)).

DB IDO on the left, Graphite on the right side - works.
Icinga Web 2 on the left, checker and notifications on the right side - works too.
Everything on the left and on the right side - make sure to deal with duplicated
notifications and automated check distribution.

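For example, writing check results to a local database means enabling the DB IDO
feature on that instance only. A minimal sketch, assuming the MySQL IDO backend;
the library name and connection credentials are illustrative and must be checked
against the DB IDO documentation for your release:

    library "ido_mysql"

    object IdoMysqlConnection "local-ido-mysql" {
      # illustrative credentials - adjust to the local database setup
      host = "127.0.0.1",
      port = 3306,
      user = "icinga",
      password = "icinga",
      database = "icinga"
    }
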
#### <a id="cluster-scenarios-location-based"></a> Location Based Cluster

This scenario fits if your instances are spread across the globe and they all report
to a central instance. Their network connection only works towards the central master
(or the master is able to connect, depending on firewall policies) which means
remote instances won't see or connect to each other.

All events are synced to the central node, but the remote nodes can still run
local features such as a web interface, reporting, graphing, etc.

Imagine the following example with a central node in Nuremberg, and two remote DMZ
based instances in Berlin and Vienna. The configuration tree on the central instance
could look like this:

    conf.d/
      templates/
      germany/
        nuremberg/
          hosts.conf
        berlin/
          hosts.conf
      austria/
        vienna/
          hosts.conf

The configuration deployment should look like:

* The node `nuremberg` sends `conf.d/germany/berlin` to the `berlin` node.
* The node `nuremberg` sends `conf.d/austria/vienna` to the `vienna` node.

`conf.d/templates` is shared on all nodes.

The endpoint configuration on the `nuremberg` node would look like:

    object Endpoint "nuremberg" {
      host = "nuremberg.icinga.org",
      port = 8888,
    }

    object Endpoint "berlin" {
      host = "berlin.icinga.org",
      port = 8888,
      config_files_recursive = [ "/etc/icinga2/conf.d/templates",
                                 "/etc/icinga2/conf.d/germany/berlin" ]
    }

    object Endpoint "vienna" {
      host = "vienna.icinga.org",
      port = 8888,
      config_files_recursive = [ "/etc/icinga2/conf.d/templates",
                                 "/etc/icinga2/conf.d/austria/vienna" ]
    }

Each remote node will only peer with the central `nuremberg` node. Therefore
only two endpoints are required for the cluster connection. Furthermore the remote
node must include the configuration received via the cluster functionality.

Example for the configuration on the `berlin` node:

    object Endpoint "nuremberg" {
      host = "nuremberg.icinga.org",
      port = 8888,
    }

    object Endpoint "berlin" {
      host = "berlin.icinga.org",
      port = 8888,
      accept_config = [ "nuremberg" ]
    }

    include_recursive IcingaLocalStateDir + "/lib/icinga2/cluster/config"

Depending on the network connectivity the connections can be either
established by the remote node or the central node.

Example for the `berlin` node connecting to the central `nuremberg` node:

library "cluster"
|
||||||
|
|
||||||
|
object ClusterListener "berlin-cluster" {
|
||||||
|
ca_path = "/etc/icinga2/ca/ca.crt",
|
||||||
|
cert_path = "/etc/icinga2/ca/berlin.crt",
|
||||||
|
key_path = "/etc/icinga2/ca/berlin.key",
|
||||||
|
bind_port = 8888,
|
||||||
|
peers = [ "nuremberg" ]
|
||||||
|
}
|
||||||
|
|
||||||
|
Example for the central `nuremberg` node connecting to the remote nodes:

library "cluster"
|
||||||
|
|
||||||
|
object ClusterListener "nuremberg-cluster" {
|
||||||
|
ca_path = "/etc/icinga2/ca/ca.crt",
|
||||||
|
cert_path = "/etc/icinga2/ca/nuremberg.crt",
|
||||||
|
key_path = "/etc/icinga2/ca/nuremberg.key",
|
||||||
|
bind_port = 8888,
|
||||||
|
peers = [ "berlin", "vienna" ]
|
||||||
|
}
|
||||||
|
|
||||||
|
The central node should not do any checks by itself. There are two possibilities to
achieve that:

* Disable the `checker` feature.
* Pin the service object configuration to the remote endpoints using the
[authorities](#assign-services-to-cluster-nodes) attribute (see the sketch below).

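A minimal sketch of such pinning, assuming the inline service definition syntax of
this release; the host and service names are purely illustrative:

    object Host "berlin-www" {
      services["http"] = {
        templates = [ "generic-service" ],
        # assumption: pin check execution to the remote endpoint only
        authorities = [ "berlin" ]
      }
    }
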
#### <a id="cluster-scenarios-load-distribution"></a> Load Distribution

If you are planning to off-load the checks to a defined set of remote workers
you can achieve that by:

* Deploying the configuration on all nodes.
* Letting Icinga 2 distribute the load amongst all available nodes.

That way all remote check instances will receive the same configuration
but only execute their part. The central instance can execute checks as well,
or you may disable the `checker` feature there.

    conf.d/
      templates/
      many/

If you are planning to have some checks executed by a specific set of checker nodes,
just pin them using the [authorities](#assign-services-to-cluster-nodes) attribute,
as shown below.

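A minimal sketch, again assuming the inline service definition syntax and
illustrative names; the service is only executed by the listed checker endpoints:

    object Host "web-server" {
      services["load"] = {
        templates = [ "generic-service" ],
        authorities = [ "checker1", "checker2" ]
      }
    }
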
Example on the `central` node:

    object Endpoint "central" {
      host = "central.icinga.org",
      port = 8888,
    }

    object Endpoint "checker1" {
      host = "checker1.icinga.org",
      port = 8888,
      config_files_recursive = [ "/etc/icinga2/conf.d" ]
    }

    object Endpoint "checker2" {
      host = "checker2.icinga.org",
      port = 8888,
      config_files_recursive = [ "/etc/icinga2/conf.d" ]
    }

    object ClusterListener "central-cluster" {
      ca_path = "/etc/icinga2/ca/ca.crt",
      cert_path = "/etc/icinga2/ca/central.crt",
      key_path = "/etc/icinga2/ca/central.key",
      bind_port = 8888,
      peers = [ "checker1", "checker2" ]
    }

Example on the `checker1` node:

    object Endpoint "central" {
      host = "central.icinga.org",
      port = 8888,
    }

    object Endpoint "checker1" {
      host = "checker1.icinga.org",
      port = 8888,
      accept_config = [ "central" ]
    }

    object Endpoint "checker2" {
      host = "checker2.icinga.org",
      port = 8888,
      accept_config = [ "central" ]
    }

    object ClusterListener "checker1-cluster" {
      ca_path = "/etc/icinga2/ca/ca.crt",
      cert_path = "/etc/icinga2/ca/checker1.crt",
      key_path = "/etc/icinga2/ca/checker1.key",
      bind_port = 8888
    }

#### <a id="cluster-scenarios-high-availability"></a> High Availability

Two nodes in a high availability setup require an [initial cluster sync](#initial-cluster-sync).
Furthermore the active master node should deploy the configuration to the
second node, if that does not already happen through your provisioning tool. This
primarily depends on which features are enabled/used. It is still required that some
failover mechanism detects, for example, which instance will be the notification "master".

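A minimal sketch of the peering for the first node, using the same constructs as
the scenarios above with illustrative host names `master1` and `master2`; the
second node mirrors this configuration with the names swapped and
`accept_config = [ "master1" ]` set on its own endpoint:

    library "cluster"

    object Endpoint "master1" {
      host = "master1.icinga.org",
      port = 8888,
    }

    object Endpoint "master2" {
      host = "master2.icinga.org",
      port = 8888,
      # deploy the configuration from the active master to the second node
      config_files_recursive = [ "/etc/icinga2/conf.d" ]
    }

    object ClusterListener "master1-cluster" {
      ca_path = "/etc/icinga2/ca/ca.crt",
      cert_path = "/etc/icinga2/ca/master1.crt",
      key_path = "/etc/icinga2/ca/master1.key",
      bind_port = 8888,
      peers = [ "master2" ]
    }
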
#### <a id="cluster-scenarios-multiple-hierachies"></a> Multiple Hierarchies

Your central instance collects all check results for reporting and graphing and also
does some sort of additional notifications.
The customers have their own instances in their local DMZs. They are limited to
reading/writing only their own services, but replicate all events back to the central
instance.
Within each DMZ there are additional check instances also serving interfaces for local
departments. The customers' instances will collect all results, but also send them back
to your central instance.
Additionally the customers' instance on the second level in the middle prohibits you
from sending commands to the department nodes below. You are only allowed to receive
the results, and a subset of each customer's configuration too.

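Schematically, such a hierarchy might look like this (all names are purely illustrative):

    central                       (reporting, graphing, notifications)
      customer1                   (config master, local dashboard)
        department-a              (local interface, checks)
        department-b              (local interface, checks)
      customer2
        department-c
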
Your central instance will generate global reports, aggregate alert notifications and
check additional dependencies (for example, the customers' internet uplink and
bandwidth usage).

The customers' instance will only check a subset of local services and delegate the
rest to each department. It also acts as the configuration master with a central
dashboard for all departments, managing their configuration tree which is then
deployed to all department instances. Furthermore the central NOC is able to see
what's going on.

The instances in the departments will serve a local interface, and allow the
administrators to reschedule checks or acknowledge problems for their services.