Re: [squid-users] Logfile analyzing

From: Merton Campbell Crockett <mcc@dont-contact.us>
Date: Sat, 22 Jan 2005 17:41:12 -0800 (PST)

On Sat, 22 Jan 2005, airplays55@yahoo.com wrote:

> I'd like to see this in the logfile:
>
> http://www.microsoft.com
> http://www.microsoft.com/products
> http://www.microsoft.com/products/visual_studio
>
> This is a theoretical example as if those are the
> actual URL locations typed into the address bar, or
> clicked via hyperlink.
>
> I don't see how the access.log can be used to provide
> this kind of report.
>
> For example, if I simply type microsoft.com in my
> address bar and click on "office" in the left pane,
> then check my access.log, I see 35 entries have been
> added just by clicking the "office" link once. I
> understand that there is a separate entry for each
> HTTP GET that the webpage calls for, but the
> access.log doesn't seem to differentiate between what
> the user clicked, and what the webpage requested to
> display the whole page correctly.
>
> More specifically, the first 3 entries say:
>
> 127.0.0.1 - - [22/Jan/2005:15:56:31 -0500] "GET
> http://g.microsoft.com/mh_mshp/2 HTTP/1.1" 301 538
> TCP_MISS:DIRECT
> 127.0.0.1 - - [22/Jan/2005:15:56:32 -0500] "GET
> http://office.microsoft.com/home/default.aspx
> HTTP/1.1" 301 467 TCP_MISS:DIRECT
> 127.0.0.1 - - [22/Jan/2005:15:56:32 -0500] "GET
> http://office.microsoft.com/en-us/default.aspx
> HTTP/1.1" 200 52134 TCP_MISS:DIRECT
>
> How is ANY logfile analyzer going to tell the
> difference between the first entry (which the user
> clicked on) and the second/third entries (which were
> requested by the html from the first entry)?

The few analysers that I've used pay attention to the status codes. In
the above example, the page displayed in the browser was the last one,
where a 200 status was returned. The two 301 entries are redirects that
send the browser to the new location of the requested page; they were not
requested by the HTML of the first entry.
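
A minimal sketch of that status-code approach, in Python. It assumes the
httpd-style log lines quoted above, with each entry on a single line
(unwrapped). The log does not record the Location header, so the sketch
also assumes that the next request from the same client follows the
redirect. The names parse() and displayed_pages() are hypothetical and are
not part of Squid or of any particular analyser.

```python
import re

# Matches the httpd-style entries quoted above, one entry per line.
LINE_RE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) (?P<result>\S+)'
)

REDIRECTS = {301, 302, 303, 307}


def parse(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def displayed_pages(lines):
    """Yield (requested_url, displayed_url) pairs, collapsing redirect chains.

    Heuristic (an assumption): after a 3xx entry, the next entry from the
    same client is treated as the redirect target.
    """
    pending = {}  # client -> first URL of a redirect chain in progress
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue
        client = entry["client"]
        if entry["status"] in REDIRECTS:
            # Remember only the URL that started the chain.
            pending.setdefault(client, entry["url"])
        elif entry["status"] == 200:
            start = pending.pop(client, entry["url"])
            yield start, entry["url"]
        else:
            # Errors and other statuses end any chain in progress.
            pending.pop(client, None)


SAMPLE = [
    '127.0.0.1 - - [22/Jan/2005:15:56:31 -0500] '
    '"GET http://g.microsoft.com/mh_mshp/2 HTTP/1.1" 301 538 TCP_MISS:DIRECT',
    '127.0.0.1 - - [22/Jan/2005:15:56:32 -0500] '
    '"GET http://office.microsoft.com/home/default.aspx HTTP/1.1" '
    '301 467 TCP_MISS:DIRECT',
    '127.0.0.1 - - [22/Jan/2005:15:56:32 -0500] '
    '"GET http://office.microsoft.com/en-us/default.aspx HTTP/1.1" '
    '200 52134 TCP_MISS:DIRECT',
]

if __name__ == "__main__":
    for requested, displayed in displayed_pages(SAMPLE):
        print(requested, "->", displayed)
```

Run against the three quoted entries, this reports one page view: the
g.microsoft.com URL that was clicked, paired with the en-us/default.aspx
page that was finally displayed. It does not separate a clicked page from
the objects embedded in it; that needs the kind of content filtering
described next.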

They also allowed me to select what I was interested in having reported.
If I'm not interested in the graphical content on a page, I can tell the
analyser to suppress images.
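
As a rough illustration of suppressing images, the Python sketch below
drops URLs whose path ends in a common image extension. The extension
list and the names is_image() and suppress_images() are assumptions made
for this example; a real analyser may well use the MIME type instead,
which the log format quoted above does not show.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed set of image extensions; adjust to taste.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".ico", ".bmp"}


def is_image(url):
    """Return True if the URL path ends in a known image extension."""
    path = urlsplit(url).path  # ignores any query string or fragment
    ext = posixpath.splitext(path)[1].lower()
    return ext in IMAGE_EXTENSIONS


def suppress_images(urls):
    """Return the URLs that do not look like image requests."""
    return [u for u in urls if not is_image(u)]
```

Filtering by extension at report time leaves the access.log itself
complete, so the full record is still there when a page fails to display
correctly.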
 
> Is there a squid configuration parameter that will
> allow the logs to be filtered appropriately?

Why would you want to do this? If you have users complaining that a page
doesn't display correctly, how would you identify the cause of the problem
if you don't record what happened in the log?

Merton Campbell Crockett

Received on Sat Jan 22 2005 - 18:43:06 MST
