=head1 NAME

SquidAnalyzer v4.4 - Squid access log report generation tool

=head1 DESCRIPTION

SquidAnalyzer parses the native access log format of the Squid proxy and
generates general statistics about hits, bytes, users, networks, top URLs and
top second-level domains. The reports are oriented toward user and bandwidth
control; this is not a pure cache statistics generator.

SquidAnalyzer uses flat files to store data and does not need any SQL, SQLite
or Berkeley database.

The analyzer is incremental, so it should be run from a daily cron job. If log
rotation is enabled, take care to run it before the rotation is done.

=head1 REQUIREMENT

perl 5.005_03 or higher and the following Perl modules:

    GD
    GD::Graph
    GD::TextUtil
    GD::Graph::bars3d

See the GD and GD::Graph requirements for other needed libraries. If they are
not yet included in your OS distribution you can always find them at
http://search.cpan.org/

If you have Internet access from your server, you can execute the following
command to install GD::Graph::bars3d and all its dependencies:

    perl -MCPAN -e 'install GD::Graph::bars3d'

The image output format is PNG, so libgd must be compiled with libpng.

=head1 INSTALLATION

=head2 Generic install

If you want the package to be installed into the Perl distribution just do the
following:

    perl Makefile.PL
    make
    make install

Follow the instructions given at the end of the install. With this default
install everything configurable will be installed under /etc/squidanalyzer.
The Perl library SquidAnalyzer.pm will be installed under your site_perl
directory and the squid-analyzer Perl script will be copied under
/usr/local/bin. The default output directory for HTML reports will be
/var/www/squidanalyzer/.

=head2 Custom install

You can create a fully customized SquidAnalyzer installation by using the
Makefile.PL Perl script.
Here is a sample:

    perl Makefile.PL \
        LOGFILE=/var/log/squid3/access.log \
        BINDIR=/usr/bin \
        CONFDIR=/etc \
        HTMLDIR=/var/www/squidreport \
        BASEURL=/squidreport \
        MANDIR=/usr/man/man3 \
        DOCDIR=/usr/share/doc/squidanalyzer

If you want to build a distro package, there are two other options that you
may use. The QUIET option tells Makefile.PL not to show the default
post-install README. The DESTDIR option creates and installs all files under a
package build base directory. For example, for a Fedora RPM, things may look
like this:

    # Make Perl and SquidAnalyzer distrib files
    %{__perl} Makefile.PL \
        INSTALLDIRS=vendor \
        QUIET=1 \
        LOGFILE=/var/log/squid/access.log \
        BINDIR=%{_bindir} \
        CONFDIR=%{_sysconfdir} \
        BASEDIR=%{_localstatedir}/lib/%{uname} \
        HTMLDIR=%{webdir} \
        MANDIR=%{_mandir}/man3 \
        DOCDIR=%{_docdir}/%{uname}-%{version} \
        DESTDIR=%{buildroot} < /dev/null

See the spec file in packaging/RPM for the full RPM build script.

=head2 Local install

You can also have a custom local installation. Just copy SquidAnalyzer.pm and
the squid-analyzer Perl script into a directory, copy and modify the
configuration file, and run the script from there with the -c option.

Then copy the files sorttable.js, squidanalyzer.css and
logo-squidanalyzer.png into the output directory.

=head2 Post installation

1. Modify your httpd.conf to allow access to the HTML output as follows:

    Alias /squidreport /var/www/squidanalyzer
    <Directory /var/www/squidanalyzer>
        Options -Indexes FollowSymLinks MultiViews
        AllowOverride None
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
    </Directory>

2. If necessary, give additional hosts access to SquidAnalyzer in httpd.conf.
Restart httpd and ensure that it is running.

3. Browse to http://my.host.dom/squidreport/ to ensure that things are working
properly.

4. Set up a cron job to run squid-analyzer daily or more often:

    # SquidAnalyzer log reporting daily
    0 2 * * * /usr/local/bin/squid-analyzer > /dev/null 2>&1

or run it manually. For more information, see the README file.
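
As a rough sketch of the manual run mentioned in step 4, the following shell
function checks that the program and its configuration file are present before
invoking squid-analyzer with the -c option. The paths are the generic install
defaults, and the configuration file is assumed to be squidanalyzer.conf under
/etc/squidanalyzer. The function name run_analyzer and the ANALYZER and CONF
variables are illustrative only and are not part of SquidAnalyzer.

```shell
#!/bin/sh
# Assumed locations from the generic install; override them in the
# environment for a custom or local install.
ANALYZER="${ANALYZER:-/usr/local/bin/squid-analyzer}"
CONF="${CONF:-/etc/squidanalyzer/squidanalyzer.conf}"

# Hypothetical helper: run the analyzer once against the given configuration.
run_analyzer() {
    # Stop early if the script is missing or not executable.
    if [ ! -x "$ANALYZER" ]; then
        echo "not executable: $ANALYZER" >&2
        return 1
    fi
    # Stop early if the configuration file cannot be read.
    if [ ! -r "$CONF" ]; then
        echo "cannot read: $CONF" >&2
        return 1
    fi
    "$ANALYZER" -c "$CONF"
}
```

Call run_analyzer from an interactive shell or a small wrapper script. Since
the analyzer is incremental, a manual run between two cron runs should simply
pick up the newer log entries.
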

You can use network names instead of network IP addresses by using the
network-aliases file. Also, if you don't have authentication enabled and want
to replace client IP addresses with known user or computer names, you can use
the user-aliases file to do so.

See the file squidanalyzer.conf to customize your output statistics and match
your network and file system configuration.

=head1 CONFIGURATION

Unlike previous versions, customization of SquidAnalyzer is now done through a
single configuration file, squidanalyzer.conf.

Here are the configuration directives used by SquidAnalyzer.

=over 4

=item Output output_directory

Where SquidAnalyzer should dump all HTML, data and image files. You should
give a path that can be read by a Web browser.

=item LogFile squid_access_log_file

Set the path to the Squid log file.

=item NetworkAlias network-aliases_file

Set the path to the file containing network alias names. Networks are shown as
IP addresses, so if you want to display names instead, create a file with this
format:

    LOCATION_NAME    IP_NETWORK_ADDRESS

The separator must be a tabulation. You can use regexes to match and group
several network addresses. See the network-aliases file for examples.

=item UserAlias user-aliases_file

Set the path to the file containing user alias names. If you don't have
auth_proxy enabled, users are seen as IP addresses. If you want to show a user
name or computer name instead, create a file with this format:

    FULL_USERNAME    IP_ADDRESS

If you have auth_proxy enabled but want to replace login names with full user
names, create a file with this format:

    FULL_USERNAME    LOGIN_NAME

The separator in both cases must be a tabulation. You can use regexes to match
and group several user logins or IP addresses. See the user-aliases file for
examples.

=item AnonymizeLogin 0

Set this to 1 if you want to anonymize all user logins. Each username will be
replaced by a unique id that changes at each squid-analyzer run. Disabled by
default.


=item OrderNetwork bytes|hits|duration

=item OrderUser bytes|hits|duration

=item OrderUrl bytes|hits|duration

Used to set how SquidAnalyzer sorts the Network, User and Url report screens.
Value can be: bytes, hits or duration. Default is bytes.

=item OrderMime bytes|hits

Used to set how SquidAnalyzer sorts the Mime types report screen. Value can
be: bytes or hits. Default is bytes.

=item UrlReport 0|1

Should SquidAnalyzer display user details. This will show all URLs read by
each user. Make sure you have enough disk space for heavy users. Default is 0,
no URL detail report.

=item QuietMode 0|1

Run in quiet mode for batch processing or print debug information. Default is
0, verbose mode.

=item CostPrice price/Mb

Used to set the cost of bandwidth per Mb. This can help if you want to
generate invoices per Mb of bandwidth traffic. A value of 0 means no cost;
this is the default, and in that case the "Cost" column is not displayed.

=item Currency currency_abbreviation

Used to set the currency of the bandwidth cost. Preferably use the HTML
special character. Default is &euro;

=item TopNumber number

Used to set the number of top URLs and second-level domains to show. Default
is top 10.

=item Exclude exclusion_file

Used to set the client IP addresses, network addresses, auth logins or URIs to
exclude from the report.

You can define one exclusion per line by specifying first the type of the
exclusion (USER, CLIENT or URI) followed by a space-separated list of valid
regexes. See the example below:

    CLIENT 192\.168\.1\.2
    CLIENT 10\.169\.1\.\d+ 192\.168\.10\..*
    USER myloginstr
    USER guestlogin\d+ guestdemo
    URI http:\/\/myinternetdomain.dom.*
    URI .*\.webmail\.com\/.*\/login\.php.*

You can have multiple lines of the same exclusion type.

=item Lang language_file

Used to set the translation file to be used. The value must be the path to a
file containing all translated strings. See the lang directory for translation
files. The default is defined internally, in English.

=item HeaderFile custom_header_file

Custom header.
Must be a path to a text file containing HTML code that will be placed just
after the body tag and just before the program name and version. A default
defined internally is used if this directive is not set to a valid file.

=item FooterFile custom_footer_file

Custom footer. Must be a path to a text file containing HTML code that will be
placed at the bottom of each page, just before the closing body tag. A default
defined internally is used if this directive is not set to a valid file.

=item SiblingHit

Counts peer cache hits (CD_SIBLING_HIT) as local cache hits. Enabled by
default; you must disable it if you don't want peer cache hits reported in
your statistics.

=back

=head1 AUTHOR

Gilles DAROLD

=head1 COPYRIGHT

Copyright (c) 2001-2012 Gilles DAROLD

This package is free software published under the GPL v3 or above license.