RE: [squid-users] realtime reports of squid3

From: James Zuelow <James_Zuelow_at_ci.juneau.ak.us>
Date: Fri, 3 Apr 2009 09:25:51 -0800

> -----Original Message-----
> From: sameer shinde [mailto:s9sameer_at_gmail.com]
> Sent: Friday, 03 April, 2009 02:11
> To: Squid Users
>
> Hi James,
>
> Thanks for exposing one more software for me. I'll surely try this
> out, but not now. A little later.
> I checked the demo of this site. but not satisfied with the kind of
> reports it shows.
> Whatever reports it shows are good, I'm not doubting on it. But We
> need some more detail reports,
> like which file downloads, which user has downloaded it? The graphical
> representation of the user status
> These things are good in sarg.
>
> My only problem in the sarg is, I've to run the sarg manually
> every-time to update the database.
> If I can automate this process, with crontab, which I'll do in any
> way, for a specific time intervals
> most of my problems get resolved. At present I'm trying to schedule it
> for 5 min interval, which later I'll
> reduce to 1 min or even lesser. :)
>
>
> ~~~~~~~~~~~~~~
> Sameer Shinde.
> M:- +91 98204 61580
> Think Pink.... Live Green..!
>
Hi Sameer --

MySAR is probably not for everyone, but I think it is still what you're looking for, especially with the database back-end.

At first I didn't particularly think that MySAR's reports were very informative either, but they are a little deceptive. There is a "details" link on most of them that will drill down into the data you're used to seeing with the Sarg reports.

And the stock MySAR web pages, at least on our system, aren't at all fast. The initial greeting page takes a few minutes to generate. I am not helping things by keeping 365 days of data, so my database is still growing. MySQL complains and says it wants 3GB RAM just for the indexes on the traffic table. :)

My solution was to bang together a couple of very simple Perl CGI scripts. The first one asks for a username, search term for the URL, and a starting date and ending date. Then it passes the form info to a second CGI that parses the form info, queries the database for the search terms, and then presents the data in an HTML table. Since all of the data lives in a MySQL database, your possibilities are endless. Pie charts with GD::Graph, whatever you feel like putting together.
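
To give a rough idea of what the second CGI does, here is a minimal sketch in Python rather than the Perl scripts described above. The table and column names (traffic, username, url, bytes, day) and the helper names are invented for the example and are not MySAR's actual schema. An in-memory SQLite database stands in for MySQL so the sketch runs on its own.

```python
import sqlite3
from html import escape


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical 'traffic' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE traffic (username TEXT, url TEXT, bytes INTEGER, day TEXT)"
    )
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?, ?)",
        [
            ("alice", "http://www.pandora.com/radio", 5000, "2009-03-30"),
            ("alice", "http://example.org/index.html", 1200, "2009-04-01"),
            ("bob", "http://www.pandora.com/radio", 7000, "2009-04-01"),
            ("alice", "http://www.pandora.com/song", 3000, "2009-04-02"),
        ],
    )
    return conn


def search_traffic(conn, username, url_term, start_day, end_day):
    """Return (day, url, bytes) rows for one user, a URL substring and a date range."""
    query = (
        "SELECT day, url, bytes FROM traffic "
        "WHERE username = ? AND url LIKE ? AND day BETWEEN ? AND ? "
        "ORDER BY day, url"
    )
    # Placeholders keep the form input out of the SQL text itself.
    params = (username, f"%{url_term}%", start_day, end_day)
    return conn.execute(query, params).fetchall()


def rows_to_html_table(rows):
    """Render the result rows as a plain HTML table, escaping the text cells."""
    lines = ["<table>", "<tr><th>Date</th><th>URL</th><th>Bytes</th></tr>"]
    for day, url, nbytes in rows:
        lines.append(
            f"<tr><td>{escape(day)}</td><td>{escape(url)}</td><td>{nbytes}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    db = make_demo_db()
    print(rows_to_html_table(search_traffic(db, "alice", "pandora", "2009-04-01", "2009-04-03")))
```

Against a real MySQL server the same parameterised query would go through a MySQL driver instead of sqlite3; the form handling is left out here.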

The scripts I use are very basic, but they serve my needs. We very rarely have to actually look at a particular user's browsing habits. Mostly I'm just watching how much bandwidth Pandora is using, and what percentage of traffic is being cached. For that purpose the built-in MySAR reports work just fine, and there is no urgency so I can let the browser sit while MySAR builds its queries and tables.
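
If you wanted those two numbers straight from the database, the queries could look something like the Python sketch below. This is only an illustration: the traffic table with site, bytes and cache_hit columns is a made-up layout, not MySAR's real one, and SQLite is used in place of MySQL so it can run standalone.

```python
import sqlite3


def make_cache_demo_db():
    """Build an in-memory database with a hypothetical per-request traffic table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE traffic (site TEXT, bytes INTEGER, cache_hit INTEGER)")
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?)",
        [
            ("pandora.com", 6000, 0),
            ("pandora.com", 2000, 1),
            ("example.org", 1500, 1),
            ("example.org", 500, 0),
        ],
    )
    return conn


def bytes_by_site(conn):
    """Total bytes per site, largest consumer first."""
    query = (
        "SELECT site, SUM(bytes) AS total FROM traffic "
        "GROUP BY site ORDER BY total DESC, site"
    )
    return conn.execute(query).fetchall()


def cached_percentage(conn):
    """Percentage of all bytes that were served from the cache."""
    total, cached = conn.execute(
        "SELECT SUM(bytes), "
        "SUM(CASE WHEN cache_hit = 1 THEN bytes ELSE 0 END) FROM traffic"
    ).fetchone()
    if not total:
        return 0.0  # empty table: nothing to report
    return 100.0 * cached / total


if __name__ == "__main__":
    db = make_cache_demo_db()
    print(bytes_by_site(db))
    print(cached_percentage(db))
```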

We switched to MySAR because there's no way to ask Sarg "What did user X do between Wednesday and Friday of last week" unless you want to drill down to the various daily reports, or wade through the Monday and Tuesday data mixed in on the weekly report. And in order to get weekly and monthly reports for Sarg, I had to rotate logs on a monthly basis. By the end of the month the logs would be several gigs in size. Sarg's daily report could take up to four hours to generate, and the monthly report would take up to nine (!) hours to generate. There is just no way I could get Sarg to generate reports every minute.

Some of that slowness is hardware related -- Our Squid server runs on a dual 1GHz PIII IBM 346 with 4GB of RAM, and the MySQL server is also on the same machine. MySQL wasn't installed when we were using Sarg, but even with the database it just keeps chugging along. Except for MySAR's nightly database maintenance the load is usually about 0.4 -- so MySAR is easily able to keep up with our peak use during the day (1800 req/minute, which means 1800 database entries/min).

So MySAR was faster for us, gave us up to the minute stats, and once you start poking around the database and writing CGI it turns out to be much more flexible than the static Sarg HTML pages were. Instead of looking at a bunch of static HTML daily or weekly pages to gather information we can now just enter a range of dates, or if we're really concentrating on something we can enter a range of times on one day for a very specific report without a lot of extraneous stuff included. It's very nice to be able to ask for a five minute window of a user's activity in near real time when our anti-virus system starts kicking out e-mail messages.
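
A five minute window like that comes down to one query with a time range. Here is a minimal Python sketch of the idea, assuming a hypothetical traffic table with username, url and ts (timestamp) columns; MySAR's real tables are laid out differently, and SQLite is used here only so the example is self-contained.

```python
import sqlite3
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # sortable text timestamps


def make_window_demo_db():
    """Build an in-memory database with a hypothetical timestamped traffic table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE traffic (username TEXT, url TEXT, ts TEXT)")
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?)",
        [
            ("alice", "http://example.org/a", "2009-04-03 09:00:10"),
            ("alice", "http://example.org/b", "2009-04-03 09:03:45"),
            ("alice", "http://example.org/c", "2009-04-03 09:06:00"),
            ("bob", "http://example.org/d", "2009-04-03 09:02:00"),
        ],
    )
    return conn


def activity_window(conn, username, end_time, minutes=5):
    """Return (ts, url) rows for one user in the `minutes` leading up to end_time."""
    start_time = end_time - timedelta(minutes=minutes)
    query = (
        "SELECT ts, url FROM traffic "
        "WHERE username = ? AND ts >= ? AND ts <= ? ORDER BY ts"
    )
    params = (
        username,
        start_time.strftime(TS_FORMAT),
        end_time.strftime(TS_FORMAT),
    )
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    db = make_window_demo_db()
    for ts, url in activity_window(db, "alice", datetime(2009, 4, 3, 9, 5, 0)):
        print(ts, url)
```

Passing the time the anti-virus alert arrived as end_time gives the user's requests in the minutes before it.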

James Zuelow....................CBJ MIS (907)586-0236
Network Specialist...Registered Linux User No. 186591
Received on Fri Apr 03 2009 - 17:25:58 MDT
