Re: [squid-users] cache tree structure

From: Joe Cooper <joe@dont-contact.us>
Date: Wed, 29 May 2002 12:30:06 -0500

Michael R. Wayne wrote:
> Joe Cooper wrote:
>
>>For example, don't do Squid 3 -> Squid 2 -> Squid 1 -> Internet. This
>>is a badly implemented cache topology, and latency will suffer because
>>of it.
>
>
> Henrik Nordström wrote:
>
>>In a sibling relation the request will only be sent to the sibling if the
>>sibling claims it has the response cached.
>>
>>In a parent relation the request is always sent there, assuming the parent
>>will find the requested data if not available in its cache.
>
>
> Given a star of decent speed WAN links which are not saturated and
> cache servers at both the remote as well as central locations, it
> seems that using a sibling relationship, rather than parent/child
> is being recommended.
>
> As this feels counterintuitive, I'm just looking for confirmation.

No, Henrik has suggested that for a number of WANs, it does make sense
to use a parent at each WAN-connected location, with siblings beneath
it. With a deployment as large as the one in question here, it makes
sense to have a parent cache at every WAN point--which may in turn have
a parent that is a sibling at the central location, or may be a direct
child of the central parent cache. He also mentioned that CARP could
potentially be used to avoid cache redundancy at each site.

If we were speaking of a smaller LAN with a single internet uplink, then
a sibling mesh with a single parent is a good choice. The goal is to
balance cache hits against cache proximity to the client and the
smallest possible number of hops. Every cache peer interaction (parent
or sibling) adds latency, so the goal is to minimize the ICP
interactions required to answer any request. But if taken to an extreme,
every cache would go direct to the internet and no cache sharing would
happen--that's no good either (since overall latency then increases
because of the reduced hit ratio). Confused yet?
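
To put rough numbers on that trade-off, here is a toy model in Python.
It is only a sketch: the function name, the hit fractions, and every
millisecond figure are invented for illustration, and it assumes each
peer hop costs the same and that a peer hit is only found after all
hops have been crossed.

```python
def mean_latency_ms(local_hit, peer_hit, peer_hops,
                    local_ms=5.0, hop_ms=10.0, origin_ms=200.0):
    """Expected latency per request, in milliseconds, for a toy model.

    local_hit: fraction of requests answered by the client's own cache
    peer_hit:  fraction answered by some peer cache (parent or sibling)
    peer_hops: peer interactions a request crosses after a local miss
    """
    if not 0.0 <= local_hit + peer_hit <= 1.0:
        raise ValueError("hit fractions must sum to at most 1")
    miss = 1.0 - local_hit - peer_hit
    peer_cost = local_ms + peer_hops * hop_ms   # every hop adds latency
    origin_cost = peer_cost + origin_ms         # a full miss also pays the origin fetch
    return local_hit * local_ms + peer_hit * peer_cost + miss * origin_cost


# All figures below are invented for illustration.
# No peers at all: nothing is shared, so every local miss goes to the origin.
direct = mean_latency_ms(local_hit=0.3, peer_hit=0.0, peer_hops=0)
# One parent: one extra hop, but some local misses become peer hits.
one_parent = mean_latency_ms(local_hit=0.3, peer_hit=0.2, peer_hops=1)
# Squid 3 -> Squid 2 -> Squid 1: same sharing, but two peer hops per miss.
chain = mean_latency_ms(local_hit=0.3, peer_hit=0.2, peer_hops=2)

for name, value in [("direct", direct), ("one parent", one_parent),
                    ("chain", chain)]:
    print(f"{name:>10}: {value:6.1f} ms")
```

With these made-up inputs the single parent comes out lowest, the chain
sits in the middle, and going direct is highest. Real numbers depend
entirely on your hit ratios and link latencies.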

So the ideal model for hit ratio is One-Parent-Many-Siblings, because
the parent will always hit, if the item was cacheable (assuming a large
enough cache), but the siblings can offload some of the work and bring
the content a little closer to users. However, this doesn't work so
well when the sibling interaction is across WAN links because the
latency is too high. So in this case one has a parent at each site,
with any number of siblings. The parent then handles the WAN link, and
caches everything that comes down to the local site.
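
The sibling/parent rule Henrik described (quoted above) can be sketched
as a small routing function. This is a toy model and not Squid's actual
code: the Peer class, the choose_upstream name, and the peer names are
all hypothetical, and the "cached" set simply stands in for a peer
answering an ICP query with a hit.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    cached: set = field(default_factory=set)  # URLs this peer would claim as hits


def choose_upstream(url, siblings, parent):
    """Return the name of the peer a local miss is forwarded to."""
    for sib in siblings:
        # A sibling is only used when it claims to have the response cached.
        if url in sib.cached:
            return sib.name
    # Otherwise the request always goes to the parent, which is expected
    # to fetch the object itself if it does not hold it.
    return parent.name


# Hypothetical site layout: two siblings on the LAN, one parent on the WAN link.
siblings = [Peer("sib-a", {"http://example.com/a"}),
            Peer("sib-b", {"http://example.com/b"})]
parent = Peer("site-parent")

print(choose_upstream("http://example.com/b", siblings, parent))
print(choose_upstream("http://example.com/other", siblings, parent))
```

The first lookup is answered by a sibling on the local LAN; the second
falls through to the site parent, which is the only cache that has to
touch the WAN link.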

So, it depends on your environment. One smallish LAN only needs one
parent with a few siblings. A big LANs+WAN environment needs more
distribution of data to bring cache content closer to users.

-- 
Joe Cooper <joe@swelltech.com>
Web caching appliances and support.
http://www.swelltech.com
Received on Wed May 29 2002 - 11:30:57 MDT
