Re: [squid-users] Re: Real hit count of a user? Can it be really found?

From: Amos Jeffries <squid3@dont-contact.us>
Date: Sat, 08 Mar 2008 11:23:00 +1300

Aykut Demirkol wrote:
>> Ahmet wrote:
>>> Hi,
>>>
>>> I am trying to count web hits of users. Using a proxy, it seems
>>> to be easy (I tried combinations of squid, tiny, privoxy in
>>> transparent modes).
>>> But it is obvious that the hits in the logs are not purely the hits
>>> that users intended to make.
>>> For example, when a user goes to cnn.com, cnn.com calls other ad pages
>>> or non-ad pages, and each is seen as a user hit in the logs. So for a
>>> real hit count, an analysis must be made on the logs.
>>> Do you know any tool, proxy that can help in such analysis?
>>>
>>> The second choice is writing my own tool that parses the logs and
>>> analyzes the referer field. But depending solely on the referer
>>> can cause false positive results for users clicking on a link on a page.
>>> To further investigate the issue I captured (with ethereal) the
>>> outgoing packets for usual user behavior (clicking on a link) and for
>>> pages calling pages. In the request packets they all seem to have the
>>> same headers and similar header values. So I got stuck and could not
>>> find any piece of evidence to track and distinguish the hits.
>>> Is there a known theoretical or practical way of distinguishing these
>>> behaviours?
>>>
>>>
>> No.
>>
>> We use http://awstats.sourceforge.net/ because we need some kind of
>> count, but there's no way to count "people seeing the page".
>>
>> You've got robots of various kinds and user agents that lie and all
>> kinds of other stuff that makes the data "fuzzy". The best you can
>> get is a general increasing or decreasing trend.
>>
>> For all you know, someone's using a script to launch a browser to hit
>> your page as a diagnostic poll to see if their connection is still up.
>>
>> Having said all that, awstats will probably do what you want it to.
>>
>> Cheers!
>>
> Daniel, thanks for the reply.
> I need to learn the exact action taken in the client browser, such as
> whether the user clicked a link or the incoming page is calling another
> page. So parsing and analyzing proxy logs or apache logs does not seem
> to help in that case.

Such information is usually given in the HTTP Referer: header.

Any finer grain than is available in those headers and logs requires
complete control of every machine on the Internet connecting to your page.
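
As a rough illustration of how far the Referer: header and the logged
content type can take you, here is a minimal sketch. It assumes the log
has already been parsed into (client, url, referer, mime_type) tuples;
that tuple layout and the name classify_hits are hypothetical, not a
Squid log format or API. Note that an HTML request carrying a Referer
may be a clicked link or a frame pulled in by the page, so it can only
be labelled "ambiguous".

```python
from collections import Counter


def classify_hits(records):
    """Bucket requests per client using only Referer and MIME type.

    records: iterable of (client, url, referer, mime_type) tuples.
    This layout is an assumption for illustration, not a real log format.
    """
    counts = {}
    for client, url, referer, mime_type in records:
        per_client = counts.setdefault(client, Counter())
        if not mime_type.startswith("text/html"):
            # Images, scripts, stylesheets: fetched by a page, not clicked.
            per_client["embedded"] += 1
        elif referer in ("", "-"):
            # HTML with no Referer: typed URL, bookmark, or stripped header.
            per_client["direct"] += 1
        else:
            # HTML with a Referer: a clicked link OR a frame; cannot tell.
            per_client["ambiguous"] += 1
    return counts


sample = [
    ("10.0.0.1", "http://example.com/", "-", "text/html"),
    ("10.0.0.1", "http://example.com/logo.png", "http://example.com/", "image/png"),
    ("10.0.0.1", "http://ads.example.net/banner.js", "http://example.com/",
     "application/x-javascript"),
    ("10.0.0.1", "http://example.com/news.html", "http://example.com/",
     "text/html; charset=utf-8"),
    ("10.0.0.2", "http://example.org/", "", "text/html"),
]

result = classify_hits(sample)
for client, buckets in sorted(result.items()):
    print(client, dict(buckets))
```

The "ambiguous" bucket is exactly the part that cannot be resolved from
the proxy side.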

Amos

-- 
Please use Squid 2.6STABLE17+ or 3.0STABLE1+
There are serious security advisories out on all earlier releases.
Received on Fri Mar 07 2008 - 15:22:27 MST