<?xml version="1.0" encoding="utf-8"?>
<chapter>
  <title>Pandora FMS advanced section</title>
  <sect1><title>Pandora FMS High Availability features</title>
  <para>
    You can set up Pandora FMS for high availability (HA) in several scenarios:
    <itemizedlist mark='bullet'>
      <listitem>
	<para>
	  <emphasis>Database Clustering for HA</emphasis>. You need to
	  set up a MySQL5 Cluster. The support forums and wiki explain
	  how to do this; you only need to convert the DB schema to
	  MySQL Cluster compatible tables. This scenario has been
	  tested and works fine, but it requires some advanced
	  knowledge of MySQL Cluster administration.
	</para>
      </listitem>
      <listitem>
	<para>
	  <emphasis>Multiple Pandora Consoles</emphasis>. This is easy:
	  you only need to set up another console. No locking problems
	  or incompatibilities have been detected in several Pandora
	  FMS deployments.
	</para>
      </listitem>
      <listitem>
	<para>
	  <emphasis>Multiple Pandora Data Servers for HA
	  </emphasis>. This is the most complex scenario. It requires
	  no special knowledge of Pandora Server setup, but you need
	  another tool, such as VRRP or Keepalive, to implement network
	  HA. Set up two identical Pandora Data Server machines with
	  the same public keys for all agents connecting to the server,
	  and duplicate the server SSH host key. Then configure network
	  HA to point to one of them. If that machine fails, VRRP or
	  Keepalive "promotes" the other server, and Pandora Agents
	  connect to it for the next data packets. Nothing else needs
	  to change on either Pandora Data Server; just make sure that
	  the Pandora Server name is the same on both machines (in the
	  Pandora Server setup, not in the system hostname).
	</para>
      </listitem>
      <listitem>
	<para>
	  <emphasis>Multiple Pandora Network Servers for HA
	  </emphasis>. This is easier. Set up multiple network servers
	  on several machines across your network (or all of them in
	  the same segment), and assign modules to the same server. If
	  this server fails and other active network servers are
	  marked as "primary", the first available "primary" server
	  automatically launches the network module query. If several
	  servers are marked as "primary", any of them may launch the
	  query.
	</para>
      </listitem>
      <listitem>
	<para>
	  <emphasis>Multiple Pandora Network Servers for load
	  balancing</emphasis>. Set up multiple network servers on
	  several machines across your network (or all of them in the
	  same segment), and assign agents and modules to different
	  servers, balancing the load across all available servers.
	</para>
      </listitem>
    </itemizedlist>
  </para>
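  <para>
    The failover rule for network servers described above can be
    sketched as follows. This is only an illustration under stated
    assumptions: the NetworkServer record, its field names and the
    pick_server function are hypothetical and are not taken from the
    Pandora code or database schema.
  </para>
<programlisting language="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NetworkServer:
    # Hypothetical record; field names are illustrative, not the Pandora schema.
    name: str
    primary: bool
    alive: bool


def pick_server(assigned: str, servers: Sequence[NetworkServer]) -> Optional[str]:
    """Return the name of the server that should run a module's query."""
    by_name = {s.name: s for s in servers}
    owner = by_name.get(assigned)
    if owner is not None and owner.alive:
        return owner.name
    # Assigned server is down: fall back to the first live "primary" server.
    for server in servers:
        if server.primary and server.alive:
            return server.name
    return None


servers = [
    NetworkServer("net-a", primary=False, alive=False),
    NetworkServer("net-b", primary=True, alive=True),
    NetworkServer("net-c", primary=True, alive=True),
]
print(pick_server("net-a", servers))
```
]]></programlisting>
  <para>
    In this sketch the module assigned to the failed server "net-a" is
    picked up by "net-b", the first live server marked as primary.
  </para>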
  </sect1>
  <sect1><title>Pandora virtual servers</title>
  <para>
    A special case for adding processing power is to run "virtual"
    servers. A virtual server is another instance of the same server
    on the same machine, used when Pandora Server cannot process all
    information without too much delay. Pandora 1.2 uses a limited
    number of threads to process information (this will change in
    future versions), so you can install another instance of Pandora
    Network or Pandora Data server (with a different data_in
    directory!) to process more information on the same machine.
  </para>
  </sect1>

  <sect1><title>Pandora Database design (and redesign from 1.1)</title>
  <para>
    The first Pandora versions, from 0.83 to 1.1, were based on a
    simple idea: one data item, one database insertion. This was very
    easy to develop and made searches, insertions and other
    operations easy to program.
  </para>
  <para>
    This had many advantages and one big problem: scalability. The
    system could only handle a limited number of modules in an "easy"
    way; beyond that number, management became too slow.
  </para>
  <para>
    Solutions based on MySQL Cluster were difficult, came with some
    problems of their own, and did not offer a long-term solution
    either.
  </para>
  <para>
    Data compression based on interpolation, together with data
    purging, made the database smaller, but this was not enough.
    Production systems had a practical limit of about 100 agents with
    about ten modules each, which is too low for large environments.
  </para>
  <para>
    This problem was very important for the future of Pandora, so we
    changed the way Pandora stores its data. The new data management
    system stores only "new" data: if a duplicate value enters the
    system, it is not stored in the database. This is very useful for
    keeping the database small. It works for all Pandora data module
    types: numerical, incremental, boolean and string.
  </para>
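  <para>
    A minimal sketch of the "store only new data" idea follows. It
    assumes that "duplicate" means "equal to the last stored value of
    the same module"; the ChangeOnlyStore class and the "cpu_load"
    module name are hypothetical and only stand in for the real
    database layer.
  </para>
<programlisting language="python"><![CDATA[
```python
from typing import Dict, List, Tuple

Sample = Tuple[int, str]  # (timestamp in seconds, value as received)


class ChangeOnlyStore:
    """Keeps a sample only when its value differs from the last stored one."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[Sample]] = {}

    def insert(self, module: str, timestamp: int, value: str) -> bool:
        history = self.rows.setdefault(module, [])
        if history and history[-1][1] == value:
            return False  # same as the last stored value: skip the insertion
        history.append((timestamp, value))
        return True


store = ChangeOnlyStore()
for ts, value in [(0, "10"), (180, "10"), (360, "12"), (540, "12"), (720, "10")]:
    store.insert("cpu_load", ts, value)
print(store.rows["cpu_load"])
```
]]></programlisting>
  <para>
    Of the five samples received in this sketch, only the three that
    change the value are kept.
  </para>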
  <para>
    This solves part of the scalability problem by reducing database
    usage considerably, by about 40%-70%. We also have another
    solution for scalability problems: total segregation of Pandora
    components and a built-in method for implementing High
    Availability solutions on them. You may have many Pandora
    servers (network, data or SNMP), Pandora Consoles, and a Pandora
    Database (in a MySQL5 Cluster setup).
  </para>
  <para>
    These changes also affect how data should be read. In the new
    version, if an agent cannot communicate with Pandora and Pandora
    Server receives no data from it, this "no data" has no graphical
    representation: the module graph shows no changes, only a
    perfectly horizontal line. When Pandora receives no new values,
    it assumes that nothing has changed and that everything is as it
    was at the last notification.
  </para>
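  <para>
    The horizontal line follows from how a graph can be rebuilt from
    change-only data: the last known value is carried forward until a
    new one arrives. The sketch below illustrates this under that
    assumption; step_series is a hypothetical helper, not a Pandora
    function, and it expects the changes sorted by timestamp.
  </para>
<programlisting language="python"><![CDATA[
```python
from typing import List, Optional, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value)


def step_series(changes: List[Point], start: int, end: int, step: int) -> List[Point]:
    """Expand change-only samples into one point per interval,
    carrying the last known value forward."""
    points: List[Point] = []
    index = 0
    current: Optional[float] = None
    for t in range(start, end + 1, step):
        # Consume every stored change that happened up to time t.
        while index < len(changes) and changes[index][0] <= t:
            current = changes[index][1]
            index += 1
        if current is not None:
            points.append((t, current))
    return points


changes = [(0, 5.0), (360, 7.0)]
print(step_series(changes, 0, 720, 180))
```
]]></programlisting>
  <para>
    After the last stored change at 360 seconds, every later point
    repeats the value 7.0, whether the value really stayed the same
    or the agent simply stopped reporting.
  </para>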
  <para>
    This graph, for example, shows a change for each data value,
    received every 180 seconds.

    <graphic fileref="images/module_graph_full.jpg" scale="60" align="center"/>

    This would be the equivalent graph for the same data, but with a
    connection failure from approximately 05:55 to 15:29.

    <graphic fileref="images/module_graph_peak.jpg" scale="60" align="center"/>
  </para>
  <para>
    Pandora 1.2 introduces a new general agent graph that shows
    connectivity. It reflects the accesses made by modules to this
    agent. This graph complements all the other graphs by showing
    when the agent is active and sending data. This is an example of
    an agent connecting regularly to the server:
    <graphic fileref="images/access_graph_full.jpg" scale="65" align="center"/>
    If this graph shows dips, there may be problems or slow
    connections between the Pandora Agent and the Pandora Server. For
    the previous example, this graph could look similar to this:
    <graphic fileref="images/access_graph_peak.jpg" scale="65" align="center"/>
  </para>
  </sect1>
  <sect1><title>Programmers guide to Pandora architecture</title>
  <para>
    <graphic fileref="images/Pandora_NetworkServer_Diagram.png" scale="65" align="center"/>
    <graphic fileref="images/Pandora_DataServer_Diagram.png" scale="65" align="center"/>
    <graphic fileref="images/Pandora_SNMP_Diagram.png" scale="55" align="center"/>
    <graphic fileref="images/pandora_ER.png" scale="50" align="center"/>
  </para>
  </sect1>
</chapter>