[squid-users] is there a squid "cache rank" value available for statistics?

From: Gavin McCullagh <gavin.mccullagh_at_gcd.ie>
Date: Sun, 19 Apr 2009 19:03:36 +0100

Hi,

I'm wondering about ways to measure the optimum size for a cache, in terms
of the "value" you gain from each GB of cache space. If you've got a 400GB
cache and 99% of your hits come from the first 350GB, there's probably
no point looking for a larger cache. If only 80% come from the first
350GB, then a bigger cache might well be useful.

I realise there are rules of thumb for cache size, but it would be
interesting to be able to analyse a particular squid installation.

Squid obviously removes objects from its cache based on the chosen
cache_replacement_policy. It appears from the comments in squid.conf that
in the case of the LRU policy, this is implemented as a list, presumably a
queue of pointers to objects in the cache. Objects which come to the head
of the queue are presumably next for removal. I guess if an object in the
cache gets used it goes back to the tail of the queue. I suppose this
process must involve linearly traversing the queue to find the object and
remove it, which is presumably why heap-based policies are available.

I wonder if it would be feasible to calculate a "cache rank", which
indicates the position an object held within the queue at the time of the
hit. So, perhaps 0 means at the tail of the queue, 1 means at the head.
If this could be reported alongside each hit in the access.log, one could
draw stats on the amount of hits served by each portion of the queue and
therefore determine the value of expanding or contracting your cache.
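
To make the kind of stats I have in mind concrete, here is a rough Python
sketch. It assumes the ranks have already been pulled out of the log somehow
(no such access.log field exists as far as I know, so that part is purely
hypothetical), with 0 meaning the tail and 1 the head as above. The function
names are my own.

```python
def hits_by_portion(ranks, buckets=10):
    """Count hits per equal-width slice of the queue.

    ranks: iterable of floats in [0, 1], where 0 is the tail of the
    queue and 1 is the head (next for removal). Hypothetical input.
    """
    counts = [0] * buckets
    for r in ranks:
        # A rank of exactly 1.0 is clamped into the last bucket.
        idx = min(int(r * buckets), buckets - 1)
        counts[idx] += 1
    return counts


def cumulative_share(counts):
    """Fraction of all hits served by the first k slices, for each k."""
    total = sum(counts)
    shares, running = [], 0
    for c in counts:
        running += c
        shares.append(running / total if total else 0.0)
    return shares
```

If the cumulative share is already close to 1.0 well before the last slice,
the slices nearest the head are serving few hits, which would suggest a
larger cache buys little.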

In the case of simple LRU, if the queue must be traversed to find each
element and requeue it (perhaps this isn't the case?), I suppose one could
count the position in the queue and divide by the total length.
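
As a toy illustration of that calculation (not a claim about how Squid
actually stores its LRU list), here is a small Python sketch. The class name
and the choice to divide by length minus one, so that the head comes out as
exactly 1, are my own assumptions.

```python
from collections import OrderedDict


class RankedLRU:
    """Toy LRU cache that reports a rank for every hit.

    Illustration only; not Squid's data structure.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = OrderedDict()  # first key = head (next for removal)

    def access(self, key):
        """Return the rank in [0, 1] on a hit, or None on a miss."""
        if key in self.queue:
            n = len(self.queue)
            # Linear walk from the head to find the object's position.
            pos_from_head = next(
                i for i, k in enumerate(self.queue) if k == key
            )
            pos_from_tail = (n - 1) - pos_from_head
            rank = pos_from_tail / (n - 1) if n > 1 else 0.0
            self.queue.move_to_end(key)  # requeue at the tail
            return rank
        if len(self.queue) >= self.capacity:
            self.queue.popitem(last=False)  # evict from the head
        self.queue[key] = None
        return None
```

A hit on the object about to be evicted returns 1.0, and a repeat hit on the
most recently used object returns 0.0.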

With a heap, things are more complex. I guess you could give an indication
of the depth in the heap, but with so many objects on the lowest levels I
don't suppose this would be a great guide. Is there some better value
available, such as the key used in the heap maybe?
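
One way the key might be turned into a rank, sketched in Python: take the
fraction of cached objects whose key is greater than that of the object
being hit. This assumes the object with the lowest key is removed first and
that a sorted list of all current keys is available, which a heap alone does
not provide, so it is only a thought experiment. The function name is
hypothetical.

```python
from bisect import bisect_right


def key_rank(sorted_keys, key):
    """Rank in [0, 1] of an object by its heap key.

    sorted_keys: ascending list of the keys of all cached objects
    (assumed to be available; a heap does not keep this order).
    Returns 1.0 for the lowest key (assumed next for removal) and
    0.0 for the highest key.
    """
    n = len(sorted_keys)
    if n <= 1:
        return 0.0
    # Number of objects whose key is strictly greater than this one.
    greater = n - bisect_right(sorted_keys, key)
    return greater / (n - 1)
```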

Or perhaps the whole idea is flawed somehow?

Comments, criticisms, explanations, rebukes all welcome.
Gavin
Received on Sun Apr 19 2009 - 18:03:40 MDT