Re: Cache Digests

From: Niall Doherty <ndoherty@dont-contact.us>
Date: Tue, 06 Oct 1998 19:21:23 +0100

Hi Alex,

Thanks a lot for the answers. I'm afraid they've just made me more
curious.

> 1) Entries: Suppose your cache has 1000 objects (entries: count). When
> building a local (store) digest, Squid guessed that allocating space for 500
> cache objects would be enough (entries: capacity). "Entries: utilization"
> shows 200%. That's OK if "bits: utilization" is normal, see below.

I'm not really clear on a couple of issues (lots actually :-)

When does it decide how much space to allocate? And where does it
allocate the space (disk or RAM)? Couldn't it guess very accurately,
because it *knows* how many items it has? Even if the count is
changing, building the digest doesn't take very long, so the final
value wouldn't have changed much?

Does it change the size of the digest each time it is created?
It creates it every hour, doesn't it? Or does (will?) it work out
how often would be best to create it? And if it assigns an expiry
time, does that expiry time have to match the rebuild period above?

My cache (the one you saw the stats for) is only ~10% full at the
moment, so I'm wondering when it made the guess, and why it got it
wrong, given that the count is not going to change much over the
couple of minutes it takes to build the digest?!

(I just spotted that util = count*100/capacity - I'm slowly getting
there :-)
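
The relation I spotted can be checked against the stats quoted further
down in this message. This is only a sketch: the function name is made
up, and rounding to the nearest whole percent is an assumption that
happens to reproduce the reported figures, not something confirmed
from the Squid source.

```python
def entries_utilization(count: int, capacity: int) -> int:
    """Entries utilization as a whole percentage: count*100/capacity.

    Rounding to the nearest integer is an assumption, not Squid's
    documented behaviour.
    """
    return round(count * 100 / capacity)


# Figures taken from the digest stats quoted in this message.
print(entries_utilization(281373, 88517))   # 318
print(entries_utilization(6301, 18663))     # 34
print(entries_utilization(70233, 1260307))  # 6
```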

> 2) Bits: For each entry in "entries: capacity" Squid allocates bits_per_entry

Will bits_per_entry be configurable? Or is there no need for that:
have you found 5 to be a "good enough" value for all circumstances?

> bits in the digest (bits: capacity). When digesting entries, some bits are
> turned on (bits: on). Ideally, the bits utilization (bits on/bits capacity)
> should be close to 50%.
That's because the algorithm is "supposed" to be tuned so that about
50% of the bits end up on, right? So that this minimises the chance of
collisions in the table, i.e. more than one URL mapping to the same
entry?
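
To make the "bits" line concrete, here is a small sketch of how the
numbers appear to fit together. It assumes that bits capacity is
entries capacity times bits_per_entry, rounded up to a whole number of
bytes; that rounding is inferred from the stats quoted in this message,
not taken from the Squid source, and the function names are invented
for illustration.

```python
def digest_size_bytes(capacity: int, bits_per_entry: int = 5) -> int:
    """Bytes needed for capacity * bits_per_entry bits, rounded up.

    The round-up-to-bytes step is an assumption inferred from the
    reported sizes.
    """
    return (capacity * bits_per_entry + 7) // 8


def bits_utilization(bits_on: int, size_bytes: int) -> int:
    """Bits utilization as a whole percentage: bits on / bits capacity."""
    return round(bits_on * 100 / (size_bytes * 8))


size = digest_size_bytes(88517)        # entries capacity from the stats
print(size, size * 8)                  # 55324 442592
print(bits_utilization(216176, size))  # 49
```

With these assumptions the sketch reproduces the reported size (55324
bytes), bits capacity (442592) and bits utilization (49%) of the
squid.eei.ericsson.se digest.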

> Looking at your stats, Squid guessed the capacity just right!
>
> > What about the bit and bit-seq lines ?
>
> "bit:" see (2) above.
>
> "bit-seq" is "bit sequence", i.e. an uninterrupted sequence of bits with the
> same value ('0' or '1'). This line gives some insight on the quality of the
> digest hashing function. Extreme values (e.g. very long average sequence
> length) may indicate a problem even if bit utilization is 50%.

What would count as a very long average sequence length? Do you have
an example?
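
For reference, a rough way to think about the bit-seq numbers. The
sketch below counts runs of identical bits and, separately, estimates
the average run length one would expect if every bit were set
independently with the same probability. That independence is a
modelling assumption, and the function names are hypothetical; none of
this is taken from Squid itself.

```python
from itertools import groupby


def bit_sequences(bits):
    """Return (number of runs, average run length) for a bit sequence."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    return len(runs), sum(runs) / len(runs)


def expected_avg_len(p_on: float) -> float:
    """Expected average run length for independent bits.

    Two neighbouring bits differ with probability 2*p*(1-p), so the
    average run length is roughly the inverse of that.
    """
    return 1 / (2 * p_on * (1 - p_on))


print(bit_sequences([0, 0, 1, 1, 1, 0]))  # (3, 2.0)
print(expected_avg_len(0.5))              # 2.0
```

Under this model a digest at 50% bit utilization would average about
2 bits per sequence, and the estimate for a sparse digest such as X3
below (274671 of 6301536 bits on) comes out close to its reported
avg.len of 12.00, so a long average at low utilization is not by
itself a sign of trouble.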

> > squid.eei.ericsson.se digest: size: 55324 bytes
> > entries: count: 281373 capacity: 88517 util: 318%
Still wondering why this is at 318%...

> > bits: per entry: 5 on: 216176 capacity: 442592 util: 49%
OK - so that looks good.

Now - would you be able to comment on the following ? These
are cache digests from peer servers in SE, DK and DE (all of them
are filling up at the moment):

I assume it's pure coincidence that the following two caches have
identical utilisation values!

X1 digest: size: 11665 bytes
         entries: count: 6301 capacity: 18663 util: 34%
         deletion attempts: 0
         bits: per entry: 5 on: 22042 capacity: 93320 util: 24%
         bit-seq: count: 33752 avg.len: 2.76

X2 digest: size: 88574 bytes
         entries: count: 47826 capacity: 141718 util: 34%
         deletion attempts: 0
         bits: per entry: 5 on: 167581 capacity: 708592 util: 24%
         bit-seq: count: 256269 avg.len: 2.77

What would cause the following values to be so low?

X3 digest: size: 787692 bytes
         entries: count: 70233 capacity: 1260307 util: 6%
         deletion attempts: 0
         bits: per entry: 5 on: 274671 capacity: 6301536 util: 4%
         bit-seq: count: 525332 avg.len: 12.00
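
One way to look at the three digests above is through the textbook
Bloom filter estimate for the fraction of bits set after inserting n
entries into m bits with k hash functions: 1 - exp(-k*n/m). The sketch
below uses k = 4, which is an assumption chosen because it reproduces
the quoted figures, not a value confirmed here; the function name is
hypothetical.

```python
import math


def expected_bits_on_fraction(count: int, bits_capacity: int,
                              hashes: int = 4) -> float:
    """Textbook Bloom filter fill estimate: 1 - exp(-k*n/m).

    hashes=4 is an assumption, not a confirmed Squid parameter.
    """
    return 1 - math.exp(-hashes * count / bits_capacity)


for name, count, bits in [("X1", 6301, 93320),
                          ("X2", 47826, 708592),
                          ("X3", 70233, 6301536)]:
    pct = round(100 * expected_bits_on_fraction(count, bits))
    print(name, pct)
```

Under this model the bits utilization depends only on the ratio of
entries to bits, so two digests with the same entries utilization
(X1 and X2 at 34%) would be expected to show the same bits utilization,
and a digest that is only 6% full (X3) would naturally have few bits
on. The same estimate gives a much higher figure than the reported 49%
for the squid.eei.ericsson.se digest when fed its count of 281373, so
it should be treated as a rough model only.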

Thanks,
Niall (trying hard to understand)

-- 
Niall Doherty          | mailto:ndoherty@eei.ericsson.se
Systems Engineer       | http://www.ericsson.ie
Voice: +353 1 207 7506 | Ericsson Systems Expertise Ltd.,
Fax:   +353 1 207 7115 | Beech Hill, Clonskeagh, Dublin 4, Ireland.
Received on Tue Oct 06 1998 - 11:22:21 MDT
