Cache Digests vs ICP (was Re: Squid-FS)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Fri, 24 Apr 1998 14:09:04 +0300 (EETDST)


On 24 Apr 98, at 9:28, Dancer <dancer@brisnet.org.au> wrote:

> Henrik Nordstrom wrote:
> >
> > The basic Mirror-Image design seems like a feasible way to implement a
> > hierarchy. The frontend caches are the ones that contact the origin
> > servers, effectively distributing the load in a scalable way, and only
> > inter-cache traffic goes through the backend servers. But I don't agree on
> > the idea of "Terabyte-Servers".
> >
> > My current idea of what a good cache design looks like: (farm model)
> > * Any number of frontend caches, which do the actual work.
> > * 1 (or 2 to provide redundancy) "grand centrals" that keep track of
> > the whole cache.
> > * The grand central keeps track of local servers as well, signalling "go
> > direct" for local resources when peering with remote caches/farms.
> > * Frontend caches send updates to the grand central using cache digests.
> > * Frontend caches query the grand central using a variation of ICP or
> > a similar protocol, to determine if the object is already cached.
> > * Clients use a PAC with a primary frontend cache based on location
> > (source IP) and fallback frontends if the primary dies/fails.
> >
> > This design should be scalable on all factors, no matter what kind of
> > hardware you build with (more powerful == less boxes, less powerful ==
> > more boxes).
> >
> > The service continues to function even if the grand central dies. The
> > impact is a lower hit ratio, as each frontend cache then runs
> > standalone, without knowledge of the other caches.
> >
> > Peering between caches is done at the grand-central level, using cache
> > digests.
>
> I've been playing with theoretical large cache-farm designs on paper,
> for future setups, and (so far) a three-layer design is the best I've
> been able to come up with. At least, I don't see any serious cons.
>
> * Layer 1: Talks to the net. Fetches from origin servers. Small cache.
> * Layer 2: Main cache boxes. Large cache.
> * Layer 3: Talks to groups of customers. Customers are grouped by
> philosophy, pattern of use, or physical region. Small cache.
>
> Layer 3 machines don't have a sibling relationship. They're all children
> of layer 2 boxes, which are siblings of one another and children of layer 1.
>
> If I've thought this through correctly, most objects should accumulate
> on the disks in layer 2. Layer 1 just fetches them in, and then goes off
> to grab the next thing. Small, frequently used items would aggregate on
> the layer-3 machines (with an appropriate policy).

 I would argue against such a layered scheme. Parent relations are not very
 lightweight, and I can't see much sense in using L1 parents as mere
 proxies. A request from a client has to traverse all three layers before
 hitting the actual source, which adds a lot of delay and a lot of failure
 points. IMHO, anything that cannot hold a useful amount of cached objects
 on its disks should NOT stand in the way of those that can. For me, this
 means cutting out the L1 caches.
 The L2 caches in your scheme are the only ones that are actually needed; L3
 caches would be dictated by practical needs, such as firewalls. In any case,
 I would not point efficient caches at a parent if at all avoidable.

 I'd also propose a sort of central index server:
 - a stripped-down ICP server sits on top of some collection of caches.
 - all caches send ICP updates to this server, which sorts, expires, etc. the index.
 - all caches query _only_ this ICP server, getting in response an ICP message
   listing 0, 1..n caches that have the object, ideally with timestamps. If done
   in an MD5 -> (IP-address/timestamp)[1..n] manner, the ICP messages would be
   very small (identify any single cache by a single IP address, not a name;
   see the message sketch after this list).
   Keeping the ICP server close to the L2 caches allows fast response times.
 - caches themselves do not peer.
 - ICP servers from different admin domains can peer and exchange ICP messages
   to build up wider knowledge of object locations.
 - ICP servers can get ICP updates from caches that fetch a newer version of
   some object that is also held by other caches. The ICP server could send a
   notice to those caches and let them drop the old version. The same info can
   be propagated to peering ICP servers.
 - ICP servers, being managed by the same admin domain as the L2 caches, would
   also be a good place to keep other vital info for coordinating caches, like
   access-lists and refresh-rules.
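
 As a rough illustration of how compact such a reply could be, here is a
 sketch of a possible message layout in C. All names, field sizes and the
 location cap are my own assumptions for illustration, not an existing ICP
 extension:

     #include <stdint.h>

     #define IDX_MAX_LOCATIONS 8      /* arbitrary cap on locations per reply */

     /* one known location of the object: which cache holds it and how old
        its copy is */
     struct idx_location {
         uint32_t cache_ip;           /* IPv4 address of the cache */
         uint32_t timestamp;          /* when that copy was stored (epoch secs) */
     };

     /* reply from the central index server: MD5 key of the URL plus
        0..n locations; count == 0 means "not known to be cached anywhere" */
     struct idx_reply {
         unsigned char md5[16];       /* MD5 of the requested URL */
         uint8_t count;               /* number of valid entries in loc[] */
         struct idx_location loc[IDX_MAX_LOCATIONS];
     };

 At 8 bytes per location, a reply naming three caches is on the order of 40
 bytes of payload plus the usual UDP/ICP header.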
 
 Overall, this
 - would make the main squid process much simpler compared to the currently
   proposed digest exchange method. It would also be much more lightweight,
   stricter AND more up to date.
 - would make the main squid more static, that is, move much of the ICP handling
   to the ICP servers, making it more reliable and letting us focus more deeply
   on performance issues.
 - allows playing with different methods in the ICP servers without affecting the
   main squid - squid and the ICP servers exchange only ICP messages, so what goes
   on inside the ICP server does not affect the main squid design much (e.g.
   whether the ICP index is implemented as a classic MD5 list or with Bloom
   filters; see the sketch after this list).
 - the ICP server can keep track of dead peers and filter them out of responses.
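
 For the Bloom-filter variant mentioned in the list above, a minimal sketch of
 what the ICP server could keep is shown below. The filter size and the trick
 of slicing the MD5 into four hash values are arbitrary assumptions, just to
 show the idea:

     #include <stdint.h>
     #include <string.h>

     #define BLOOM_BITS (1 << 20)              /* 1 Mbit filter = 128 KB */

     static unsigned char bloom[BLOOM_BITS / 8];

     /* derive the k-th bit position (k = 0..3) from the 16-byte MD5 of a URL */
     static uint32_t bloom_pos(const unsigned char md5[16], int k)
     {
         uint32_t v;
         memcpy(&v, md5 + 4 * k, sizeof(v));
         return v % BLOOM_BITS;
     }

     /* record that an object with this MD5 key is cached somewhere */
     void bloom_add(const unsigned char md5[16])
     {
         for (int k = 0; k < 4; k++) {
             uint32_t p = bloom_pos(md5, k);
             bloom[p / 8] |= 1 << (p % 8);
         }
     }

     /* 1 = the object *may* be cached (false positives possible),
        0 = it definitely is not */
     int bloom_maybe(const unsigned char md5[16])
     {
         for (int k = 0; k < 4; k++) {
             uint32_t p = bloom_pos(md5, k);
             if (!(bloom[p / 8] & (1 << (p % 8))))
                 return 0;
         }
         return 1;
     }

 A zero answer is a guaranteed miss; a one may still be a false positive,
 which is exactly the weakness of digests discussed further below.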

   I can see central ICP server(s) being useful for hierarchies of 100-200+
 cooperating caches (as they may filter incoming updates based on timestamps,
 too many caches holding the same objects, remote cache metrics, preferences,
 etc.) while still being resource-efficient.
   But I cannot see any way to peer the same number of caches using digests
 implemented in the main squid processes. Mirroring the indexes of 100-200
 caches in every peering box creates too much redundant data and makes it too
 complex to keep current.
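
 As a back-of-envelope illustration (all figures are assumptions, not
 measurements): if each of 200 peers holds about a million objects and a
 digest costs on the order of 5 bits per object, every peering box ends up
 carrying roughly 200 x 1M x 5 bits, i.e. somewhere around 125 MB of other
 people's index data, all of which goes stale and has to be re-fetched to
 stay current.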

 Also, digests, with their high false-positive rate, make it almost impossible
 to establish peerings between competing ISPs, unless squid can cope with
 siblings denying the resulting misses; but then the digests lose much of their
 value, as TCP connections that turn out to be misses waste bandwidth and time.

 What's more, if the L3 caches use the same ICP server, they can also be
 included in hit-resolving paths, even if they are pretty small boxes. That is,
 parent caches can know about and make use of objects stored on their children.

 In fact, such ICP traffic is very reminiscent of the dynamic routing protocols
 used in routers, with networks being MD5 hashes, gateways being cache boxes and
 metrics being preferences/load metrics... (Maybe it would even be possible to
 reuse the most popular/efficient routing algorithms.)

   Currently, ICP asks "do _you_ have this object?" and the response is either
 yes or no. IMHO we need an ICP that can ask something more like "_where_ do
 you think this object might be cached?", with a response like "dunno" or
 "you can try this and that cache" (perhaps with an added "they have these
 old copies of it").
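
 A hypothetical exchange (the message names and format are invented here
 purely for illustration) might look like:

     cache -> index server:  WHERE-IS  md5("http://example.com/logo.gif")
     index server -> cache:  TRY  10.1.1.5 (stored Apr 20),
                                  10.1.2.9 (stored Apr 18)
                      or:    DUNNO
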
   IMHO, we'd want to implement the digests or central ICP lists in a separate
 daemon and develop it somewhat separately from the main squid...

 ----------------------------------------------------------------------
  Andres Kroonmaa                    mail: andre@online.ee
  Network Manager
  Organization: MicroLink Online     Tel: 6308 909
  Tallinn, Sakala 19                 Pho: +372 6308 909
  Estonia, EE0001                    Fax: +372 6308 901
  http://www.online.ee
 ----------------------------------------------------------------------
