Re: [RFC] byte hit ratio

From: Henrik Nordström <henrik_at_henriknordstrom.net>
Date: Tue, 07 Feb 2012 09:40:40 +0100

On Tue, 2012-02-07 at 14:01 +1300, Amos Jeffries wrote:
> We have a long history of questions and bugs mentioning negative
> numbers in the byte hit ratio.
>
> I've always thought it was a bug we had not tracked down, but the FAQ
> says it is correct.
> http://wiki.squid-cache.org/SquidFaq/InnerWorkings#Why_do_I_see_negative_byte_hit_ratio.3F

Yes. It's based on the difference between traffic squid<-servers and
clients<-squid. This can be negative (more traffic squid<-servers than
clients<-squid) in some situations:

  - retried requests
  - range retrieval being processed by Squid
  - continued download after client disconnects (quick_abort_...)
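
As a rough illustration of the calculation described above, here is a
minimal sketch. It is not Squid's actual code; the function and parameter
names are hypothetical and the byte counts are made up for the example.

```python
def byte_hit_ratio(bytes_to_clients, bytes_from_servers):
    """Net byte gain as a fraction of the bytes delivered to clients.

    bytes_to_clients   -- traffic clients<-squid (hypothetical counter)
    bytes_from_servers -- traffic squid<-servers (hypothetical counter)
    """
    if bytes_to_clients == 0:
        # Nothing delivered yet, so no meaningful ratio; report zero.
        return 0.0
    return (bytes_to_clients - bytes_from_servers) / bytes_to_clients


# Most bytes are served from cache: the ratio is positive.
print(byte_hit_ratio(1000, 400))   # 0.6

# The client disconnects early but the download continues
# (as with quick_abort_...): the ratio goes negative.
print(byte_hit_ratio(200, 1000))   # -4.0
```

A negative result simply means that more bytes were fetched from servers
than were delivered to clients during the period measured.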

> I've discussed this with a professional statistician I work with and
> she agrees the algorithm is not calculating hit ratio as per our
> definition of what a HIT is. What it does seem to be calculating is a
> net traffic GAIN ratio.

Yes.

> What I propose is to make the numbers reported as HIT ratios use the same
> algorithm, the current request ratio one, and to add alongside this a
> record for Gain/Loss Ratio as output by this byte calculation.

Why is it interesting to calculate a nicer but very inaccurate number?
To hide that the proxy cache may actually cause higher bandwidth usage
than not having the proxy cache?

I would argue that the request hit ratio calculation is the broken one
from a statistical point of view.

Regards
Henrik
Received on Tue Feb 07 2012 - 08:38:01 MST
