I've implemented AWStats to do enterprise statistics processing for a content delivery system with over 1000 virtual hosts. The trick was creating a separate awstats.conf file for each virtual host and running awstats on each one. I had to hack the Perl scripts a bit and write a sentry app that searches through directories and launches awstats on each conf file it finds. Coincidentally, my company moved away from a similar Webalizer implementation for the same reason: it had not been supported for 3 years.
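A rough sketch of what such a sentry could look like, written in Python here rather than Perl. It assumes the per-vhost files follow the usual awstats.<name>.conf naming; the function names (find_configs, build_command, run_all) and any paths you pass in are made up for illustration, and this is not the actual sentry app described above. The -config and -update options are standard awstats.pl options; -configdir should be checked against your AWStats version.

```python
import os
import re

# Assumption: each vhost has a config file named awstats.<name>.conf.
CONF_PATTERN = re.compile(r"^awstats\.(?P<name>.+)\.conf$")


def find_configs(root):
    """Yield (directory, config_name) for every awstats.<name>.conf under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for filename in sorted(filenames):
            match = CONF_PATTERN.match(filename)
            if match:
                yield dirpath, match.group("name")


def build_command(awstats_pl, conf_dir, name):
    """Build the awstats.pl command line that updates stats for one config."""
    return [
        "perl", awstats_pl,
        "-config=" + name,
        "-configdir=" + conf_dir,  # verify your AWStats version supports this
        "-update",
    ]


def run_all(root, awstats_pl, runner):
    """Run awstats once per config found; runner is e.g. subprocess.call."""
    results = {}
    for conf_dir, name in find_configs(root):
        results[name] = runner(build_command(awstats_pl, conf_dir, name))
    return results
```

Calling run_all(config_root, path_to_awstats_pl, subprocess.call) from cron would then process every vhost in turn and return the exit code for each config name. Passing the runner in as a parameter keeps the directory scan easy to dry-run, for example with print instead of subprocess.call.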
You can use a shared logfile location (the access.log coming from Squid, obviously) and set up each awstats.conf to disregard everything that doesn't match the relevant vhost being processed. It may not perform like a dream (depending on how many vhosts you're talking about), but it will work. Look through the AWStats documentation; your solution should be in the construction of the awstats.conf files.
Btw, to answer some of your other questions: yes, your reasoning is correct. You'll want to use Squid's logs to get accurate stats, because Apache's logs will contain only a fraction of the actual hits. You don't need the custom log format patch: awstats can handle both Squid's native format and the common log format (CLF) that Squid can emulate (emulate_httpd_log in squid.conf).
If it's too complicated to get awstats working with a shared log, then grep the entries on the fly (as logs are rotated, perhaps) into separate log files for each vhost. Better yet, use Perl to do it. I love Perl. Hope that helps somewhat.
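For what it's worth, the log-splitting idea might be sketched like this (in Python here, though Perl would do the job just as well). It assumes Squid's native access.log layout, where the seventh whitespace-separated field is the requested URL, and it assumes that URL actually carries the vhost's hostname, which depends on how the accelerator is configured. The function names and the <host>.log output naming are invented for the example.

```python
import os
from urllib.parse import urlsplit


def vhost_of(line):
    """Return the lowercased hostname from the URL field of a native-format line."""
    fields = line.split()
    if len(fields) < 7:
        return None  # truncated or malformed line
    try:
        return urlsplit(fields[6]).hostname
    except ValueError:
        return None  # URL that urlsplit refuses to parse


def split_by_vhost(lines):
    """Group log lines by vhost; lines without a usable hostname are skipped."""
    buckets = {}
    for line in lines:
        host = vhost_of(line)
        if host is not None:
            buckets.setdefault(host, []).append(line)
    return buckets


def write_buckets(buckets, out_dir):
    """Append each vhost's lines to <out_dir>/<host>.log (hypothetical naming)."""
    for host, lines in buckets.items():
        path = os.path.join(out_dir, host + ".log")
        with open(path, "a") as handle:
            for line in lines:
                handle.write(line if line.endswith("\n") else line + "\n")
```

Run against each rotated access.log, this would leave one log file per vhost that an ordinary per-vhost awstats.conf can point at. Lines such as CONNECT requests, whose URL field has no scheme, yield no hostname and are dropped.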
- Gregori
-----Original Message-----
From: Maciej Zięba [mailto:finch@pf.pl] 
Sent: Wednesday, February 15, 2006 10:54 AM
To: squid-users@squid-cache.org
Subject: [squid-users] An access analyzer that works with Squid
Hi :)
I'm looking for a (log) analyzer that can give me the access/traffic
statistics of an Apache webserver that Squid is accelerating. More
precisely - I need separate stats for each virtual host that runs on the
webserver.
I think it would be best if I describe the situation in more detail... :)
I have an Apache webserver running on port 81 which has a couple of
vhosts (some of them are Zope instances, but I don't think that it
matters) and it is accelerated by Squid running on the same machine, on
port 80. As I've already said - I need statistics for each vhost and not
for all of them.
As I understand it, I cannot use the main Apache access_log or the vhost
logs because not all requests reach them (Squid caches and accelerates
content that hasn't changed), so I'm left with Squid's access.log (all
traffic passes through Squid). Is my reasoning correct?
Anyhow, previously (before we used Squid) my company used Webalizer
to parse Apache's logs (main and vhosts'), but it hasn't been
developed for over 3 years and, as I've mentioned, we can't use those logs
anymore...
I've come across AWStats and thought it would be a good choice.
Unfortunately, all I can get is stats for the entire Apache and not
the vhosts. That's because it's impossible for it to get the virtual host's
name from Squid's access.log (neither in native nor in common format).
I've found this patch that would solve my problem by enabling custom
logformat:
http://devel.squid-cache.org/customlog/
but I cannot use it and I cannot install the development Squid 3 - it's
the company's semi-production machine and has to be stable :(
Is there some other way I can get AWStats running?
Or maybe you could recommend some other good tool for generating
statistics (HTML with things like graphs, most visited sites, etc.) from
squid's logs?
I'm sorry for the lengthy e-mail. I hope someone can give me pointers -
I'll be very grateful for any...
Umm... And please excuse my not-so-good English :|
Best regards,
Maciej Zieba
Received on Wed Feb 15 2006 - 12:17:10 MST