Squid Analyzer parses Squid proxy access log and reports general statistics about hits, bytes, users, networks, top URLs, and top second level domains. Statistic reports are oriented toward user and bandwidth control.
Go to file
Darold Gilles 0db63c377a Fix bug in last/first hour of a day data storage that mixed collected data over the two days and stored false first and last visit time. 2013-05-27 23:37:21 +02:00
doc Fix bug in last/first hour of a day data storage that mixed collected data over the two days and stored false first and last visit time. 2013-05-27 23:37:21 +02:00
etc New feature to show the top N users that look at an url or a domain. Add a new configuration directive, TopUrlUser, to control the number of users to show. Thanks to Mr-Ed for the feature resquest. 2013-05-26 17:20:55 +02:00
lang Fix translation for user and count. 2013-05-27 22:27:59 +02:00
packaging/RPM Update ChangeLog and upgrade version to 5.1. 2013-01-30 00:19:44 +01:00
resources New feature to show the top N users that look at an url or a domain. Add a new configuration directive, TopUrlUser, to control the number of users to show. Thanks to Mr-Ed for the feature resquest. 2013-05-26 17:20:55 +02:00
ChangeLog Update ChangeLog with release informations. 2013-01-30 09:24:33 +01:00
INSTALL Update INSTALL file about FreeBSD issue. 2013-01-29 23:54:42 +01:00
MANIFEST Fix several issues in design, moved png images in a dedicated repository, fix french translation encoding and bug fixes. Thanks to Nathanael Bonin for the patches. 2012-12-05 01:34:03 +01:00
META.yml Update ChangeLog and upgrade version to 5.1. 2013-01-30 00:19:44 +01:00
Makefile.PL Fix install of configuration files. Thanks to David Wasler for the patch. 2013-02-03 15:27:24 +01:00
README Add -b | --build_date command line option to limit rebuilt to the given date, the format of its value can be yyyy-mm-dd, yyyy-mm or yyyy. This should be used to prevent rebuilding all html file from all data file. 2013-05-27 22:27:18 +02:00
SquidAnalyzer.pm Fix bug in last/first hour of a day data storage that mixed collected data over the two days and stored false first and last visit time. 2013-05-27 23:37:21 +02:00
TODO Initial git commit 2012-04-06 10:47:56 +02:00
squid-analyzer Add -b | --build_date command line option to limit rebuilt to the given date, the format of its value can be yyyy-mm-dd, yyyy-mm or yyyy. This should be used to prevent rebuilding all html file from all data file. 2013-05-27 22:27:18 +02:00

README

NAME
    SquidAnalyzer - Squid access log report generation tool

DESCRIPTION
    SquidAnalyzer parse native access log format of the Squid proxy and
    generate general statistics about hits, bytes, users, networks, top url
    and top second level domain.

    Statistic reports are oriented to user and bandwidth control, this is
    not a pure cache statistics generator. SquidAnalyzer use flat files to
    store data and don't need any SQL, SQL Lite or Berkeley databases.

    This analyzer is incremental so it should be run in a daily cron. Take
    care if you have rotate log enable to run it before rotation is done.

REQUIREMENT
    Nothing is required than a modern perl version 5.8 or higher. Graphics
    are based on the Flotr2 Javascript library so they are drawn at your
    browser side without extra installation required.

INSTALLATION
  Generic install
    If you want the package to be intalled into the Perl distribution just
    do the following:

        perl Makefile.PL
        make
        make install

    Follow the instruction given at the end of install. With this default
    install everything configurable will be installed under
    /etc/squidanalyzer. The Perl library SquidAnalyzer.pm will be installed
    under your site_perl directory and the squid-analyzer Perl script will
    be copied under /usr/local/bin.

    The default output directory for html reports will be
    /var/www/squidanalyzer/.

    On FreeBSD, if make install is freezing and you have the following
    messages:

            FreeBSD: Registering installation in the package database
            FreeBSD: Cannot determine short module description
            FreeBSD: Cannot determine module description

    please proceed as follow:

            perl Makefile.PL INSTALLDIRS=site
            make
            make install

    as the issue is related to an install into the default Perl vendor
    installdirs it will then use Perl site installdirs.

  Custom install
    You can create your fully customized SquidAnalyzer installation by using
    the Makefile.PL Perl script. Here is a sample:

            perl Makefile.PL \
                    LOGFILE=/var/log/squid3/access.log \
                    BINDIR=/usr/bin \
                    CONFDIR=/etc \
                    HTMLDIR=/var/www/squidreport \
                    BASEURL=/squidreport \
                    MANDIR=/usr/man/man3 \
                    DOCDIR=/usr/share/doc/squidanalyzer

    If you want to build a distro package, there are two other options that
    you may use. The QUIET option is to tell to Makefile.PL to not show the
    default post install README. The DESTDIR is to create and install all
    files in a package build base directory. For example for Fedora RPM,
    thing may look like that:

            # Make Perl and SendmailAnalyzer distrib files
            %{__perl} Makefile.PL \
                INSTALLDIRS=vendor \
                QUIET=1 \
                LOGFILE=/var/log/squid/access.log \
                BINDIR=%{_bindir} \
                CONFDIR=%{_sysconfdir} \
                BASEDIR=%{_localstatedir}/lib/%{uname} \
                HTMLDIR=%{webdir} \
                MANDIR=%{_mandir}/man3 \
                DOCDIR=%{_docdir}/%{uname}-%{version} \
                DESTDIR=%{buildroot} < /dev/null

    See spec file in packaging/RPM for full RPM build script.

  Local install
    You can also have a custom installation. Just copy the SquidAnalyzer.pm
    and the squid-analyzer perl script into a directory, copy and modify the
    configuration file and run the script from here with the -c option.

    Then copy files sorttable.js, squidanalyzer.css and
    logo-squidanalyzer.png into the output directory.

  Post installation
    1. Modify your httpd.conf to allow access to HTML output like follow:

            Alias /squidreport /var/www/squidanalyzer
            <Directory /var/www/squidanalyzer>
                Options -Indexes FollowSymLinks MultiViews
                AllowOverride None
                Order deny,allow
                Deny from all
                Allow from 127.0.0.1
            </Directory>

    2. If necessary, give additional host access to SquidAnalyzer in
    httpd.conf. Restart and ensure that httpd is running.

    3. Browse to http://my.host.dom/squidreport/ to ensure that things are
    working properly.

    4. Setup a cronjob to run squid-analyzer daily or more often:

         # SquidAnalyzer log reporting daily
         0 2 * * * /usr/local/bin/squid-analyzer > /dev/null 2>&1

    or run it manually. For more information, see README file.

    You can use network name instead of network ip addresses by using the
    network-aliases file. Also if you don't have authentication enable and
    want to replace client ip addresses by some know user or computer you
    can use the user-aliases file to do so.

    See the file squidanalyzer.conf to customized your output statistics and
    match your network and file system configuration.

USAGE
    SquidAnalyzer can be run manually or by cron job using the
    squid-analyzer Perl script. Here are authorized usage:

        Usage: squid-analyzer [ -c squidanalyzer.conf ] [-l logfile]

            -c | --configfile filename : path to the SquidAnalyzer configuration file.
                                         By default: /etc/squidanalyzer.conf
            -b | --build_date date     : set the day to be rebuilt, format: yyyy-mm-dd,
                                         yyyy-mm or yyyy. Used with -r or --rebuild.
            -d | --debug               : show debug informations.
            -h | --help                : show this message and exit.
            -l | --logfile filename    : path to the Squid logfile to parse.
                                         By default: /var/log/squid/access.log
            -p | --preserve number     : used to set the statistic obsolescence in
                                         number of month. Older stats will be removed.
            -r | --rebuild             : use this option to rebuild all html and graphs
                                         output from all data files.
            -v | version               : show version and exit.

    There is special options like --rebuild that force SquidAnalyzer to
    rebuild all HTML reports, useful after an new feature or a bug fix. If
    you want to limit the rebuild to a single day, a single month or year,
    you can use the --build_date option by specifying the date part to
    rebuild, format: yyyy-mm-dd, yyyy-mm or yyyy.

    The --preserve option should be used if you want to rotate your
    statistics and data. The value is the number of months to keep, older
    reports and data will be removed from the filesystem. Useful to preserve
    space, for example:

            squid-analyzer -p 6 -c /etc/squidanalyzer/squidanalyzer.conf

    will only preserve six month of statistics from the last run of
    squidanalyzer.

CONFIGURATION
    Unless previous version customization of SquidAnalyzer is now done by a
    single configuration file squidanalyzer.conf.

    Here follow the configuration directives used by Squid Analyzer.

    Output output_directory
        Where SquidAnalyzer should dump all HTML, data and images files. You
        should give a path that can be read by a Web browser.

    WebUrl
        The URL of the SquidAnalyzer javascript, HTML and images files.
        Default: /squidreport

    LogFile squid_access_log_file
        Set the path to the Squid log file.

    NetworkAlias network-aliases_file
        Set path to the file containing network alias name. Network are show
        as Ip addresses so if you want to display name instead create a file
        with this format:

            LOCATION_NAME IP_NETWORK_ADDRESS

        Separator must be a tabulation.

        You can use regex to match and group some network addresses. See
        network-aliases file for examples.

    UserAlias user-aliases_file
        Set path to the file containing user alias name. If you don't have
        auth_proxy enable users are seen as ip addresses. So if you want to
        show username or computer name instead, create a file with this
        format:

            FULL_USERNAME IP_ADDRESS

        If you have auth_proxy enable but want to replace login name by full
        user name, create a file with this format:

            FULL_USERNAME LOGIN_NAME

        Separator for both must be a tabulation.

        You can use regex to match and group some user login or ip
        addresses. See user-aliases file for examples.

    AnonymizeLogin 0
        Set this to 1 if you want to anonymize all user login. The username
        will be replaced by an unique id that change at each squid-analyzer
        run. Default disable.

    OrderNetwork bytes|hits|duration
    OrderUser bytes|hits|duration
    OrderUrl bytes|hits|duration
        Used to set how SquidAnalyzer sort Network, User and User detailed
        Urls reports screen. Value can be: bytes, hits or duration. Default
        is bytes. Note that OrderUrl is limited to User detailed Urls
        reports and does not apply to Top Url and Top domain report where
        there is three reports each already ordered.

    OrderMime bytes|hits
        Used to set how SquidAnalyzer sort Mime types report screen Value
        can be: bytes or hits. Default is bytes.

    UrlReport 0|1
        Should SquidAnalyzer display user details. This will show all URL
        read by user. Take care to have enougth space disk for large user.
        Default is 0, no url detail report.

    QuietMode 0|1
        Run in quiet mode for batch processing or print debug information.
        Default is 0, verbose mode.

    CostPrice price/Mb
        Used to set a cost of the bandwith per Mb. If you want to generate
        invoice per Mb for bandwith traffic this can help you. Value 0 mean
        no cost, this is the default value, the "Cost" column is not
        displayed

    Currency currency_abreviation
        Used to set the currency of the bandwith cost. Preferably the html
        special character. Default is &euro;

    TopNumber number
        Used to set the number of top url and second level domain to show.
        Default is top 100.

    TopUrlUser Use this directive to show the top N users that look at an
    URL or a domain. Set it to 0 to disable this feature. Default is top 10.
    Exclude exclusion_file
        Used to set client ip addresses, network addresses, auth login or
        uri to exclude from report.

        You can define one by line exclusion by specifying first the type of
        the exclusion (USER, CLIENT or URI) and a space separated list of
        valid regex.

        You can also use the NETWORK type to define network address with
        netmask using the CIDR notation: xxx.xxx.xxx.xxx/n

        See example bellow:

                NETWORK        192.168.1.0/24 10.10.0.0/16
                CLIENT         192\.168\.1\.2 
                CLIENT         10\.169\.1\.\d+ 192\.168\.10\..*
                USER           myloginstr
                USER           guestlogin\d+ guestdemo
                URI            http:\/\/myinternetdomain.dom.*
                URI            .*\.webmail\.com\/.*\/login\.php.*

        you can have multiple line of the same exclusion type.

    Lang language_file
        Used to set the translation file to be used. Value must be set to a
        file containing all string translated. See the lang directory for
        translation files. Default is defined internally in English.

    DateFormat
        Date format used to display date (year = %y, month = %m and day =
        %d) You can also use %M to replace month by its 3 letters
        abbreviation. Default: %y-%m-%d

    SiblingHit
        Adds peer cache hit (CD_SIBLING_HIT) to be taken has local cache
        hit. Enabled by default, you must disabled it if you don't want to
        report peer cache hit onto your stats.

    TransfertUnit
        Allow to change the default unit used to display transfert size.
        Default is BYTES, other possible values are KB, MB and GB.

    MinPie
        Minimum percentage of data in pie's graphs to not be placed in the
        others item. Lower values will be summarized into the others item.

    Locale
        Set this to your locale to display generated date in your language.
        Default is to use the current locale of the system. If you want date
        in German for example, set it to de_DE.

                Rapport généré le mardi 11 décembre 2012, 15:13:09 (UTC+0100).

        with a Locale set to fr_FR.

AUTHOR
    Gilles DAROLD <gilles@darold.net>

COPYRIGHT
    Copyright (c) 2001-2013 Gilles DAROLD

    This package is free software and published under the GPL v3 or above
    license.