<?xml version="1.0" encoding="utf-8"?>
<chapter>
<title>Pandora FMS advanced section</title>
<sect1><title>Pandora FMS High Availability features</title>
<para>
You can set up Pandora FMS for high availability (HA) in several scenarios:
<itemizedlist mark='bullet'>
<listitem>
<para>
<emphasis>Database clustering for HA</emphasis>. You need to
set up a MySQL5 Cluster. The support forums and the wiki explain
how to do this; the only required change is converting the DB
schema to MySQL Cluster compatible tables. This scenario has been
tested and works fine, but it requires some advanced knowledge
of MySQL Cluster administration.
</para>
</listitem>
<listitem>
<para>
<emphasis>Multiple Pandora Consoles</emphasis>. This is easy:
you only need to set up another console. No locking problems or
incompatibilities have been detected in several Pandora FMS
deployments.
</para>
</listitem>
<listitem>
<para>
<emphasis>Multiple Pandora Data Servers for HA
</emphasis>. This is the most complex scenario: you
don't need to know anything special about Pandora Server
setup, but you do need another tool to implement
network HA, such as VRRP or Keepalive. For the Pandora Data Server
you need to set up two identical machines that hold the same
public keys for all agents connecting to the server and share a
duplicated SSH host key. Then set up network HA to
point to one of them. If it fails, VRRP or Keepalive promotes
the other server, and Pandora Agents will connect to it for
the next data packets. Nothing else needs to change on
either Pandora Data Server; you only need to ensure that the
Pandora Server name is the same on both machines (in the Pandora
Server setup, not in the system hostname).
</para>
</listitem>
<listitem>
<para>
<emphasis>Multiple Pandora Network Servers for HA
</emphasis>. This is easier. Set up multiple
network servers on several machines across your network (or
all of them in the same segment) and assign the modules to the
same server. If that server fails and other active Network
Servers are marked as "primary", the first available server
marked as "primary" automatically launches the network module
query. If several servers are marked as "primary", any of them
may launch the query.
</para>
</listitem>
<listitem>
<para>
<emphasis>Multiple Pandora Network Servers for load
balancing</emphasis>. Set up multiple network
servers on several machines across your network (or all of
them in the same segment) and assign agents/modules to
different servers, balancing the load across all
available servers.
</para>
</listitem>
</itemizedlist>
</para>
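<para>
The two Network Server scenarios above can be sketched in a few lines
of Python. This is only an illustration of the described behaviour,
not Pandora's actual code: the <literal>NetworkServer</literal> class
and the <literal>pick_server</literal> and <literal>balance</literal>
functions are hypothetical names, and round-robin is just one possible
way to spread modules across servers.
</para>
<programlisting><![CDATA[
```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class NetworkServer:
    # Hypothetical model of a Network Server, for illustration only.
    name: str
    alive: bool = True
    primary: bool = False


def pick_server(assigned, servers):
    """Return the server that should run a module's query.

    The assigned server is used while it is alive. Otherwise the first
    alive server marked as primary takes over. Returns None when no
    candidate is available.
    """
    if assigned.alive:
        return assigned
    for server in servers:
        if server.alive and server.primary:
            return server
    return None


def balance(modules, servers):
    """Spread modules over servers in round-robin order (an assumption)."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return {module: next(rotation).name for module in modules}
```
]]></programlisting>
<para>
With this sketch, a module assigned to a failed server is picked up by
the first live server marked as primary, and
<literal>balance</literal> returns a module-to-server mapping that
alternates between the available servers.
</para>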
</sect1>
<sect1><title>Pandora virtual servers</title>
<para>
A special case for adding processing power is
to run "virtual" servers. A virtual server
(another instance of the same server on the same machine) is useful
when Pandora Server cannot process all incoming information without
excessive delay. Pandora 1.2 uses a limited number of threads to
process information (this will change in future versions), so you
can install another instance of Pandora Network Server or Pandora Data
Server (with a different data_in directory!) to process
more information on the same machine.
</para>
</sect1>
<sect1><title>Pandora Database design (and redesign from 1.1)</title>
<para>
The first Pandora versions, from 0.83 to 1.1, were based on a simple
idea: one data item, one database insertion. This was very easy to
develop and made searches, insertions and other
operations easy to program.
</para>
<para>
This had many advantages and one big problem:
scalability. The design limited the maximum number of
modules that could be handled in an "easy" way; beyond that number,
management became too slow.
</para>
<para>
Solutions based on MySQL Cluster were difficult, came with some
problems of their own, and did not offer a long-term solution
either.
</para>
<para>
Data compression based on interpolation, together with data purging,
made the database smaller, but this was not enough. Production systems
had a limit of about 100 agents with about ten modules each. This
was not a high limit for large environments.
</para>
<para>
This problem was very important for the future of Pandora, so we
changed the way Pandora stores its data. The new data management system
stores only "new" data. If a duplicate value enters the system, it
is not stored in the database. This is very useful for keeping the
database small. It works for all Pandora data modules: numerical,
incremental, boolean and string.
</para>
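<para>
A minimal sketch of this "store only new data" idea follows. It is not
the Pandora Server implementation: the
<literal>ChangeOnlyStore</literal> class is a hypothetical in-memory
stand-in for the database, and it assumes that "duplicate" means
"equal to the last value stored for the same module".
</para>
<programlisting><![CDATA[
```python
class ChangeOnlyStore:
    """Keeps a row only when a module's value differs from its last stored one."""

    def __init__(self):
        self.rows = []   # (module_id, timestamp, value) tuples, stands in for the table
        self._last = {}  # module_id -> last stored value

    def insert(self, module_id, timestamp, value):
        """Store the sample if it is new; return True when a row was written."""
        if module_id in self._last and self._last[module_id] == value:
            return False  # duplicate of the last stored value, skip it
        self.rows.append((module_id, timestamp, value))
        self._last[module_id] = value
        return True


store = ChangeOnlyStore()
# Five samples taken every 180 seconds; only two distinct runs of values.
samples = [(0, 10), (180, 10), (360, 10), (540, 12), (720, 12)]
written = [store.insert("cpu_load", ts, value) for ts, value in samples]
print(written)     # [True, False, False, True, False]
print(store.rows)  # [('cpu_load', 0, 10), ('cpu_load', 540, 12)]
```
]]></programlisting>
<para>
In this toy run only two of the five samples become rows. The actual
saving in a real deployment depends on how often module values change.
</para>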
<para>
This solves part of the scalability problem by considerably reducing
database usage, by about 40%-70%. We also have another solution
for scalability problems: total segregation of Pandora
components and a built-in method to implement High Availability
solutions on them. You may have many Pandora
servers (network, data or SNMP), many Pandora Consoles, and a Pandora
Database in a MySQL5 Cluster setup.
</para>
<para>
These changes affect how data should be read. With the new
version, if an agent cannot communicate with Pandora and Pandora
Server receives no data from it, this "no data" period has no
graphical representation of its own: the module graph shows no
changes, only a perfectly horizontal
line. When Pandora receives no new values, it assumes that nothing
has changed and that everything is as it was in the last
notification.
</para>
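<para>
The flat line can be reproduced with a small sketch that rebuilds a
graph series from the stored change points by carrying the last known
value forward. The <literal>rebuild_series</literal> function is a
hypothetical helper written for this example; how the Pandora Console
actually draws its graphs is not described here.
</para>
<programlisting><![CDATA[
```python
def rebuild_series(changes, start, end, interval):
    """Expand stored change points into one value per interval.

    `changes` is a list of (timestamp, value) pairs sorted by timestamp.
    Each point in the result carries the most recent stored value, so a
    period with no stored rows appears as a flat line.
    """
    series = []
    index = 0
    current = None  # no value is known before the first stored row
    for ts in range(start, end + 1, interval):
        # Consume every stored change up to and including this instant.
        while index < len(changes) and changes[index][0] <= ts:
            current = changes[index][1]
            index += 1
        series.append((ts, current))
    return series


changes = [(0, 10), (540, 12)]
print(rebuild_series(changes, 0, 900, 180))
# [(0, 10), (180, 10), (360, 10), (540, 12), (720, 12), (900, 12)]
```
]]></programlisting>
<para>
The series looks the same whether the agent kept reporting the value
12 after second 540 or stopped reporting altogether, which is why the
module graph alone cannot distinguish "no change" from "no data".
</para>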
<para>
This graph, for example, shows the changes in data received every
180 seconds.
<graphic fileref="images/module_graph_full.jpg" scale="60" align="center"/>
This would be the equivalent graph for the same data, except for a
connection failure from approximately 05:55 to 15:29.
<graphic fileref="images/module_graph_peak.jpg" scale="60" align="center"/>
</para>
<para>
Pandora 1.2 introduces a new general agent graph that shows
connectivity. It reflects accesses from modules to the agent. This
graph complements all other graphs by showing when the agent is active
and data is being received. This is an example of an agent connecting
regularly to the server:
<graphic fileref="images/access_graph_full.jpg" scale="65" align="center"/>
If this graph shows drops (low peaks), there may be problems or slow
connections between Pandora Agent and Pandora Server. For the
previous example, this graph could look similar to this:
<graphic fileref="images/access_graph_peak.jpg" scale="65" align="center"/>
</para>
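<para>
One way to think about the access graph is as a count of agent
contacts per time bucket. The sketch below assumes that model; the
<literal>access_counts</literal> function, the bucket size and the
sample timestamps are all hypothetical, and the real graph may be
computed differently.
</para>
<programlisting><![CDATA[
```python
from collections import Counter


def access_counts(contact_times, start, end, bucket):
    """Count agent contacts per bucket; buckets with no contact stay at 0."""
    counts = Counter(
        (ts - start) // bucket for ts in contact_times if start <= ts < end
    )
    n_buckets = (end - start + bucket - 1) // bucket  # round up
    return [counts.get(i, 0) for i in range(n_buckets)]


# An agent reporting every 180 seconds, silent between seconds 1200 and 2400.
contacts = [t for t in range(0, 3600, 180) if not 1200 <= t < 2400]
print(access_counts(contacts, 0, 3600, 600))
# [4, 3, 0, 0, 3, 3]
```
]]></programlisting>
<para>
The two zero buckets mark the period without contact. In the module
graph the same period would only appear as a flat line.
</para>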
</sect1>
<sect1><title>Programmers guide to Pandora architecture</title>
<para>
<graphic fileref="images/Pandora_NetworkServer_Diagram.png" scale="65" align="center"/>
<graphic fileref="images/Pandora_DataServer_Diagram.png" scale="65" align="center"/>
<graphic fileref="images/Pandora_SNMP_Diagram.png" scale="55" align="center"/>
<graphic fileref="images/pandora_ER.png" scale="50" align="center"/>
</para>
</sect1>
</chapter>