Displaying web access statistics using AWStats
How to configure AWStats to monitor web access statistics on a Gentoo Linux server
Business decision makers like to base their judgments on statistics. There are plenty of services that gather web access statistics and present them in a pleasant and, most importantly, easy to understand way. The most popular ones, like Google Analytics, rely mostly on a JavaScript API embedded in the web page, which makes them extremely easy for a site visitor to block. In fact, most browser security plug-ins block them by default. So how do you see the real access statistics of your web page? Rely on your web server logs. The information kept in a server log cannot be blocked: if someone downloads something from your site, the request is recorded in the logs no matter what security plug-ins the visitor is using. On my servers I use AWStats, an excellent web server log parser that presents web site access statistics in a very pleasant way.
You will find a lot of articles covering AWStats configuration on various Linux distributions, but unfortunately most of the information provided there did not work for me. I managed to configure AWStats the way I wanted by combining pieces of information from various sources, and I want to share my configuration in the hope that someone else will find it useful. Most of the information below is specific to the Gentoo Linux distribution, but some of it should be applicable to any Linux or POSIX-type system.
I will start by defining the objectives of my setup. As I run a few sites on my servers, I would like to have one separate virtual host presenting statistics for all sites published on the same server. Presenting statistics in a separate location for every single site is not a bad idea; you can access them, for example, at a URL similar to http://www.mysiteaddress.org/awstats. However, some of the sites I take care of are built using frameworks like Symfony or Ruby on Rails, and the framework routing engine would make it really hard for me to access the statistics this way. It is much easier for me to create a separate web page with links to all of the statistics I need. I would also like to limit access to my web statistics page to logged-in users and serve the site over a secure SSL connection.
Most of the resources I found on the Internet were far from complete. I want to make sure my information is complete and reliable (at least for Gentoo Linux users), so I will start with the required Apache server compilation, then go through AWStats installation and configuration, and end with virtual host and web page configuration.
The first thing is the Apache server configuration. In Gentoo Linux, Apache modules should be defined in the /etc/make.conf file. If you want to make sure you will be able to use AWStats, I recommend adding an APACHE2_MODULES definition similar to this one:
APACHE2_MODULES="ssl alias log_config mime mime_magic unique_id vhost_alias threads authz_host auth_basic auth_default rewrite dir cgid"
This definition is typical for a multithreaded Apache configuration. If you know that your Apache build is not running on a threaded multiprocessing module such as:
APACHE2_MPMS="event"
or:
APACHE2_MPMS="worker"
and you are using this multiprocessing module instead:
APACHE2_MPMS="prefork"
then make sure you replace the cgid module with the cgi one. This is an absolutely key module, as the AWStats web page is generated by a Perl CGI script.
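The rule above can be written down as a small shell helper. This is only a sketch: the function name cgi_module_for_mpm is made up for illustration and is not part of Gentoo or Apache.

```shell
#!/bin/sh
# Hypothetical helper: print the CGI module that matches a given Apache MPM.
# Threaded MPMs (event, worker) go with cgid; prefork goes with cgi.
cgi_module_for_mpm() {
    case "$1" in
        event|worker) echo "cgid" ;;
        prefork)      echo "cgi" ;;
        *)            echo "unknown MPM: $1" >&2; return 1 ;;
    esac
}
```

Calling it with the value you have in APACHE2_MPMS tells you which of the two names belongs in APACHE2_MODULES.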
The above configuration is very minimalistic. To tell the truth, I use more modules on my servers, but the other ones (proxy_balancer, for example) do not affect my AWStats configuration. You can tune this configuration to match your needs; for example, you can replace auth_basic with authnz_ldap to use LDAP authentication instead of the basic authentication I use (this also requires Apache to be compiled with LDAP support). If your knowledge of Apache modules is limited, consult the Apache module documentation to understand what every single module stands for.
Next comes Apache compilation. I had to make sure my Apache would be compiled with threads and ssl support. This should be enabled by default for every server profile, but if you are unsure you can always add these flags to /etc/make.conf or add them to /etc/portage/package.use like this:
echo "www-servers/apache ssl threads" >> /etc/portage/package.use
or run the emerge command like this:
USE="threads ssl" emerge www-servers/apache
If you were using a different Apache build before, make sure you run:
etc-update
or
dispatch-conf
to update your previously created configuration files.
You should also enable the needed Apache options in the /etc/conf.d/apache2 configuration file. My configuration looks similar to this:
APACHE2_OPTS="-D DEFAULT_VHOST -D SSL -D SSL_DEFAULT_VHOST"
This lets us use SSL and the default vhost configuration. This is almost everything I had to do to make my Apache ready for AWStats.
The last thing to do is changing the default Apache log format to a format that AWStats is able to parse. Instead of the combined format suggested by the AWStats documentation, I use the same approach as described on the Gentoo Linux Wiki (take a look at the sources list at the bottom of the article): the vhost format. This will of course affect the AWStats configuration. For every site available via the HTTP protocol I added the following log configuration to the vhost definition:
ErrorLog /var/log/apache2/my_site_address-error_log
CustomLog /var/log/apache2/my_site_address-access_log vhost
For every site available via the HTTPS protocol I added the following log configuration to the vhost definition:
ErrorLog /var/log/apache2/my_site_address-ssl-error_log
CustomLog /var/log/apache2/my_site_address-access_log vhost
CustomLog /var/log/apache2/my_site_address-ssl-request_log "%t %h %{HTTPS}x %{SSL_PROTOCOL}x %{SSL_CIPHER}x %{SSL_CIPHER_USEKEYSIZE}x %{SSL_CLIENT_VERIFY}x \"%r\" %b"
If a site can be accessed using both HTTP and HTTPS, you should use one access_log file in both configurations. Make sure you use separate log files for separate web sites. You can delete or rotate your current logs at this point, as AWStats will not be able to parse them (they were most likely created in an incompatible format). Then restart Apache by running the following command:
/etc/init.d/apache2 restart
This way Apache will create new log files that can be parsed by AWStats. Visit your sites to make sure some information gets stored in the logs. This is all that has to be done in the Apache web server configuration.
The next step is AWStats installation. The latest version available in Portage is AWStats 7.0. To use this version we have to unmask it by running the following command:
echo "www-misc/awstats" >> /etc/portage/package.keywords
The next thing to do is choosing USE flags for AWStats and its dependencies. AWStats comes with apache2, vhost, geoip and ipv6 support. I added the USE flag configuration by running the following commands:
echo "dev-libs/geoip perl-geoipupdate" >> /etc/portage/package.use
echo "www-misc/awstats -ipv6 geoip apache2 vhost" >> /etc/portage/package.use
As you can see, I am not using IPv6 on my servers, but I do use GeoIP. This way AWStats is able to identify the country of each site visitor. You can install AWStats by running the following command:
emerge www-misc/awstats
Now we can create a new virtual host for our AWStats statistics. To do it in Gentoo we will use a great script called webapp-config. I installed AWStats in a separate virtual host by running the following command:
webapp-config -I -h awstats_host_name awstats 7.0
This creates a new virtual host directory in /var/www/ and copies all needed files into it. You can visit this directory and see for yourself that there are a few CGI scripts inside the cgi-bin directory and some files in the htdocs directory. The next thing is creating an AWStats configuration file for every site you want to monitor.
To do it, simply copy the sample configuration file under a new name that matches your main site address. You can do it by running the following command:
cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.my_site_address.conf
Now edit the file, providing the needed configuration options. My configuration looks similar to this (I am showing only the options I changed):
LogFile="/var/log/apache2/my_site_address-access_log"
LogFormat="%virtualname %host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %other"
SiteDomain="my_site_address"
HostAliases="IP_address my_site_address REGEX[my_site_address\.pl$]"
DirData="../datadir"
DirCgi="/var/www/awstats_host_name/cgi-bin"
DirIcons="/awstatsicons"
BuildReportFormat=xhtml
#Depending on your site technology use index.php or index.html
DefaultFile="index.php"
SkipHosts="127.0.0.1 localhost"
Lang="pl"
StyleSheet="/awstatscss/awstats_bw.css"
LoadPlugin="geoip GEOIP_STANDARD /usr/share/GeoIP/GeoIP.dat"
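With several sites, the two options that always differ (LogFile and SiteDomain) can be filled in by a script instead of by hand. The helper below is a minimal sketch, not part of AWStats: the name render_awstats_conf is hypothetical, and it assumes the log file naming used in this article and site names without "|" or "&" characters.

```shell
#!/bin/sh
# Hypothetical helper: read a model config on stdin and write a per-site
# config on stdout, replacing only the LogFile and SiteDomain lines.
# Assumes site names contain no "|" or "&" (both are special to sed here).
render_awstats_conf() {
    site="$1"
    sed -e "s|^LogFile=.*|LogFile=\"/var/log/apache2/${site}-access_log\"|" \
        -e "s|^SiteDomain=.*|SiteDomain=\"${site}\"|"
}
```

It could be used like this: render_awstats_conf my_site_address < /etc/awstats/awstats.model.conf > /etc/awstats/awstats.my_site_address.conf. The remaining options still have to be reviewed manually.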
We should be able to test our configuration by entering the virtual host cgi-bin directory and running awstats like this:
cd /var/www/awstats_host_name/cgi-bin
./awstats.pl -config=my_site_address -update
If you see output similar to this:
Create/Update database for config "/etc/awstats/awstats.my_site_address.conf" by AWStats version 7.0 (build 1.970)
From data in log file "/var/log/apache2/my_site_address-access_log"...
Phase 1 : First bypass old records, searching new record...
Direct access to last remembered record is out of file.
So searching it from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 92
 Found 0 dropped records,
 Found 0 comments,
 Found 0 blank records,
 Found 0 corrupted records,
 Found 0 old records,
 Found 92 new qualified records.
then AWStats is configured correctly and you can go on. If you run into problems, make sure your Apache log format matches the AWStats LogFormat setting.
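A quick way to check for a format mismatch is to test one line of the access log against the expected field layout. The sketch below assumes that the vhost format writes the virtual host name first, followed by the usual client, ident, user, time, request, status, size, referer and user agent fields. The function name looks_like_vhost_line is made up for illustration, and the pattern is only a rough check (for example, it does not handle escaped quotes inside the request).

```shell
#!/bin/sh
# Hypothetical helper: succeed if a log line starts with the fields
# <vhost> <client> <ident> <user> [<time>] "<request>" <status> <bytes> "<referer>" "<agent>"
looks_like_vhost_line() {
    printf '%s\n' "$1" | grep -Eq \
        '^[^ ]+ [^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"'
}
```

For example, looks_like_vhost_line "$(head -n 1 /var/log/apache2/my_site_address-access_log)" && echo "format looks OK". A line written in the plain combined format has no leading virtual host name and fails the check.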
Now we should make sure our statistics are updated periodically. I did it by creating a simple script named awstats in /etc/cron.hourly/:
#!/bin/sh
cd /var/www/awstats_host_name/cgi-bin
./awstats.pl -config=my_site_address -update > /dev/null 2>&1
Add a similar line for every site you want to monitor. Do not forget to make this script executable by running the following command:
chmod +x /etc/cron.hourly/awstats
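When the number of sites grows, the hourly script can loop over a list of configuration names instead of repeating the same line. The following is a sketch under assumptions: SITES, AWSTATS_CGI_DIR, DRY_RUN and update_all_sites are names invented for this example, and my_other_site_address stands for a second, hypothetical site.

```shell
#!/bin/sh
# Sketch of a multi-site /etc/cron.hourly/awstats.
# SITES is a hypothetical space-separated list of AWStats config names.
SITES="my_site_address my_other_site_address"
AWSTATS_CGI_DIR="/var/www/awstats_host_name/cgi-bin"

update_all_sites() {
    for site in $SITES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only print the command that would be executed.
            echo "./awstats.pl -config=$site -update"
        else
            # Run in a subshell so the cd does not leak, and ignore a
            # failure of one site so the remaining sites are still updated.
            (cd "$AWSTATS_CGI_DIR" && ./awstats.pl -config="$site" -update) \
                > /dev/null 2>&1 || true
        fi
    done
}
```

In the real cron script, end the file with a line that calls update_all_sites. Setting DRY_RUN=1 beforehand prints the commands without running them, which is handy when checking the list of sites.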
You should also make sure that logs are parsed shortly before every log rotation. To do that, add a prerotate command to your log rotation configuration. Here is my log rotation configuration for Apache access_log files, stored in the /etc/logrotate.d/apache2 configuration file:
/var/log/apache2/*access_log {
    daily
    missingok
    notifempty
    rotate 365
    dateext
    olddir /var/log/old/apache2
    sharedscripts
    nocompress
    nocreate
    prerotate
        /etc/cron.hourly/awstats
    endscript
    postrotate
        /etc/init.d/apache2 reload > /dev/null 2>&1 || true
    endscript
}
Now we are close to finishing the configuration. We just have to make our stats appear on a web page. To do it, we will first create a simple HTML page with links to our statistics. Create an index.html file in /var/www/awstats_host_name/htdocs with contents similar to this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>AWSTAT Domain List</title>
</head>
<body>
AWSTATS Page.<br />
<ul>
<li><a href="/awstats/awstats.pl?config=my_site_address">My_Site_Address</a></li>
</ul>
</body>
</html>
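The list items can also be generated from the names of the configuration files in /etc/awstats/, so the page stays in sync with the monitored sites. This is a minimal sketch: awstats_links is a hypothetical helper, it assumes files named awstats.SITE.conf, and it does not HTML-escape the site names.

```shell
#!/bin/sh
# Hypothetical helper: print one <li> link per AWStats config file path
# given as an argument, skipping the awstats.model.conf template.
awstats_links() {
    for conf in "$@"; do
        name=${conf##*/}       # drop the directory part
        name=${name#awstats.}  # drop the "awstats." prefix
        name=${name%.conf}     # drop the ".conf" suffix
        if [ "$name" != "model" ]; then
            printf '<li><a href="/awstats/awstats.pl?config=%s">%s</a></li>\n' "$name" "$name"
        fi
    done
}
```

Running awstats_links /etc/awstats/awstats.*.conf prints lines that can be pasted between the ul tags of the page above.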
Add a separate link for every site you are monitoring. Now the last thing to do is making this page appear and forcing our awstats.pl file to work as a CGI script. To do it we need to create a virtual host configuration for the Apache server. In Gentoo, virtual host configuration files are kept in the /etc/apache2/vhosts.d/ directory. As I pointed out at the beginning, my configuration should force users to visit this site over a secure SSL connection. The first step is creating a simple virtual host configuration with a rewrite rule that redirects everyone to the SSL connection. This configuration should be similar to this one:
<VirtualHost *:80>
    ServerAdmin admin@my_domain.org
    DocumentRoot /var/www/awstats_host_name/htdocs/
    ServerName awstats_host_name.my_domain.org
    RewriteEngine On
    RewriteCond %{HTTPS} !=on
    RewriteRule ^/(.*) https://%{SERVER_NAME}/$1 [R,L]
    ErrorLog /var/log/apache2/awstats-error_log
    CustomLog /var/log/apache2/awstats-access_log vhost
</VirtualHost>
Next we need to create a proper configuration for the SSL virtual host with all the parameters that let us use CGI scripts. This configuration should be similar to this one:
<VirtualHost *:443>
    ServerAdmin admin@my_domain.org
    ServerName awstats_host_name.my_domain.org
    UseCanonicalName On
    SSLEngine on
    SSLOptions StrictRequire
    SSLCertificateFile /etc/ssl/apache2/server.crt
    SSLCertificateKeyFile /etc/ssl/apache2/server.key
    SSLProtocol all -SSLv2
    DocumentRoot /var/www/awstats_host_name/htdocs
    Alias /awstatsclasses "/var/www/awstats_host_name/htdocs/classes/"
    Alias /awstatscss "/var/www/awstats_host_name/htdocs/css/"
    Alias /awstatsicons "/var/www/awstats_host_name/htdocs/icon/"
    ScriptAlias /awstats "/var/www/awstats_host_name/cgi-bin/"
    <Directory "/var/www/awstats_host_name/htdocs">
        Options -Indexes FollowSymLinks
        AllowOverride All
        AuthType Basic
        AuthName "AWStats Admin Access Required"
        AuthUserFile /etc/awstats/.htpasswd
        require valid-user
        Order allow,deny
        Allow from all
        SSLRequireSSL
    </Directory>
    <Directory "/var/www/awstats_host_name/cgi-bin">
        Options ExecCGI -Indexes FollowSymLinks
        SetHandler cgi-script
        Order allow,deny
        Allow from all
        SSLRequireSSL
    </Directory>
    <Location /awstats>
        AuthType Basic
        AuthName "AWStats Admin Access Required"
        AuthUserFile /etc/awstats/.htpasswd
        require valid-user
        SSLRequireSSL
    </Location>
    ErrorLog /var/log/apache2/awstats-ssl-error_log
    CustomLog /var/log/apache2/awstats-access_log vhost
    CustomLog /var/log/apache2/awstats-ssl-request_log "%t %h %{HTTPS}x %{SSL_PROTOCOL}x %{SSL_CIPHER}x %{SSL_CIPHER_USEKEYSIZE}x %{SSL_CLIENT_VERIFY}x \"%r\" %b"
</VirtualHost>
The last thing to do is generating the password file. You can create this file and add the first user by running the following command:
htpasswd2 -c /etc/awstats/.htpasswd my_user_name
and add new users to the existing file by running the following command:
htpasswd2 /etc/awstats/.htpasswd my_2nd_user_name
And this is it. You can enjoy your statistics and see what browsers and operating systems your visitors are using, what content they are accessing, or analyze the traffic.
This information comes straight from the server logs, so visitors cannot block it. You cannot say the same about JavaScript-based tracking services.
Sources:
- AWStats documentation
- AWStats Gentoo Linux Wiki
- AWStats HowTo on Gentoo Linux Forums
GNU Free Documentation License or Creative Commons Share Alike