Re: [squid-users] accelerator farm: optimizing the sibbling_hit

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Mon, 3 Mar 2003 22:26:51 +0100

On Monday 03 March 2003 18.29, Ard van Breemen wrote:

> > To allow refreshes via siblings you must also change Squid to not
> > use "only-if-cached" when requesting the object from the sibling,
> > or else the request will be rejected by the sibling.
>
> But doesn't the sibling only answer UDP_HIT when it has an
> object cached which is not stale? (icp_stale_hit is off...)
> It is already configured to allow fetches for other squids, so
> even if the object had been stale, it would have been fetched
> anyway.

It will, but your Squid will instruct it via the Cache-Control header
that it does not want the sibling to try to fetch the object, and only
wants it from the sibling if it is fresh in the cache.
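
To make that concrete, the sibling fetch carries the only-if-cached
directive, roughly like this (URL and host are placeholder examples):

  GET /page.html HTTP/1.0
  Host: www.example.com
  Cache-Control: only-if-cached

Per RFC 2616, a cache that cannot satisfy an only-if-cached request
from its own store is supposed to answer 504 (Gateway Timeout) rather
than contact the origin server.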

> Hmmmm, the idea is to have one request to the web server, and
> have that cached (using ICP) by the complete farm. So we cannot
> add fuzziness to the web server. I therefore wanted to add the
> fuzziness to the accelerator. Of course, the most important thing
> is that we want the multiple cache revalidations collapsed into
> one, but I mean across the caches of all peers. The problem I
> face is that at almost exactly the same moment we get multiple
> requests for the same URL at different peers. By adding
> calculated fuzziness into the farm, the URL will probably be
> refreshed DIRECT by only one peer. The others will of course
> have a SIBLING_HIT or a DIRECT_TIMEOUT.
> Of course it would be beautiful if only one peer would say
> something like UDP_MISS_WILL_FETCH... which would make all my
> hackorish plans obsolete.

For your situation and most accelerator farms I think the following
configuration should be optimal:

1. Use smart request routing within the array of Squids to make sure
that for each URL one of the Squids is denoted "master", for example
by using the CARP hash algorithm or a manual division using
cache_peer_access (a rough squid.conf sketch of points 1-3 follows
after point 5). This gets rid of ICP while at the same time
preserving cache redundancy (assuming clients hit all your servers
"randomly").

2. prefer_direct off (as ICP is not used).

3. A very short peer connect timeout for quick recovery in case of
peer failure, to compensate for the lack of ICP in determining peer
availability.

4. Squid modified to collapse refreshes of the same cached object into
one request, to avoid storms of refreshes when an object expires.
Normally Squid will issue a refresh for each new request received
after the object has expired until a new reply has been received,
which may cause a bit of grief in a heavily hit accelerator setup.

5. Maybe also collapse requests for the same URL into one even if the
object is not already cached, but this is riskier and may cause high
latency for dynamic information that is not cached.
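
As promised under point 1, here is a rough squid.conf sketch of
points 1-3 for one member of a two-member farm. The peer name, the
a-m URL split and the 2 second timeout are made-up examples, and a
real setup would more likely let CARP do the division instead of the
hand-written acl; treat it as a sketch, not a drop-in configuration:

  # Point 1: manual division of the URL space. This box is the
  # denoted master for paths starting with a-m, peer1 masters the
  # rest. no-query turns off ICP to the peer.
  acl master_here urlpath_regex ^/[a-m]
  cache_peer peer1.example.com parent 80 0 no-query
  cache_peer_access peer1.example.com deny master_here
  cache_peer_access peer1.example.com allow all

  # Point 2: do not prefer going direct; try the denoted master
  # first for the URLs it is responsible for.
  prefer_direct off

  # Point 3: give up on a dead peer quickly and go direct instead.
  peer_connect_timeout 2 seconds

The mirror image on peer1 would deny its peer for the URLs peer1
masters and allow the rest.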

The drawback of such a design is that the more members your
accelerator array has, the closer the average cache hop count for
cache misses approaches 2, as with more members it becomes less
likely that the client request hits the denoted "master" cache for
that URL.
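
To put a number on it: with N equally loaded members and clients
spread evenly over them, a miss lands directly on the denoted master
with probability 1/N (one hop) and otherwise has to be forwarded to
it (two hops), so the average is 1*(1/N) + 2*(1-1/N) = 2 - 1/N; with
four members that is already 1.75 hops, and it only creeps closer to
2 from there.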

Regards
Henrik