Re: how hint caching works from Jon Kay on 2001-11-21 (squid-dev)

From: Jon Kay <jkay@dont-contact.us>
Date: Wed, 21 Nov 2001 12:01:59 -0600

Roger Venning suggested:
> Andres Kroonmaa wrote:
> > It seems to me that it would be more memory efficient to use a single
> > central "hint-cache" that does nothing else but centralise hint data
> > and provide it to every box in the cloud via some ICP-like protocol.
> > Have you considered such an approach?
> >
> > Otherwise it's interesting.
> >
> Firstly a thought from left field: which would be easier, updating
> pushcache to HEAD or modifying the design assumptions of cache-digests?

That's a real good question. At first, I was in the latter camp. Or
at least trying to rewrite/redesign hints to be like cache digests so
we could reuse lots of cache digest code. Unfortunately, that turned
out to be tougher than hoped, largely because cache-digests has no
metadata, just a single bit per hashed URL. There will still be some
sharing, but a lot less than I thought, alas.
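
To make the "single bit per hashed URL" point concrete, here is a rough
sketch of the difference. It is not Squid or pushcache code: the class
names, the single SHA-256-derived hash, and the peer names are all
assumptions made up for illustration. A digest-style summary can only
answer "might be cached", while a hint-style table has to carry a record
per URL (here, just the name of the peer thought to hold it).

```python
import hashlib


def url_key(url: str) -> int:
    """Hash a URL to a 64-bit integer (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(url.encode()).digest()[:8], "big")


class BitDigest:
    """Digest-style summary: one bit per hashed URL, no room for metadata."""

    def __init__(self, nbits: int = 1 << 16):
        self.nbits = nbits
        self.bits = bytearray(nbits // 8)

    def add(self, url: str) -> None:
        i = url_key(url) % self.nbits
        self.bits[i // 8] |= 1 << (i % 8)

    def might_contain(self, url: str) -> bool:
        # True means "possibly cached"; collisions can give false positives.
        i = url_key(url) % self.nbits
        return bool(self.bits[i // 8] & (1 << (i % 8)))


class HintTable:
    """Hint-style summary: each hashed URL maps to a record (hypothetical:
    only the name of the peer believed to hold a copy)."""

    def __init__(self):
        self.hints = {}

    def add(self, url: str, peer: str) -> None:
        self.hints[url_key(url)] = peer

    def lookup(self, url: str):
        # Returns the peer name, or None when no hint is known.
        return self.hints.get(url_key(url))


digest = BitDigest()
digest.add("http://example.com/a")

hints = HintTable()
hints.add("http://example.com/a", "cache-b")
```

The digest can say only yes or no for a URL; the hint table can say where
to go, which is why the two do not share as much code as one might hope.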

> Secondly, on the topic of Andres' thoughts regarding a 'central
> hint-cache', this is exactly what 'central-squid-server'
> (http://www.senet.com.au/css) was about. One needs to be clear about the
> design goals though: minimising traffic, or minimising latency for cache
> users.

Our major goal is minimizing latency. After all, that is what caches
exist for, at the end of the day: making life faster for the users.

We recognize that for many countries on limited bandwidth, minimizing
latency and bandwidth usage are one and the same, but in places with
plentiful bandwidth, there is a divergence. Permit me to guess that you
guys probably are OK on local bandwidth, even though your bandwidth
to anywhere else must needs travel more thousands of miles than we like
to think about and thus must cost accordingly?

Hint Cache is similar to CSS in that it tries to avoid sending actual
data through intermediate hops.

> I was setting about 'proving' how this central squid server would
> actually assist in traffic savings for a large loose confederation of
> users such as the Australian universities network (www.aarnet.edu.au).
> CSS can operate in a hierarchical model where a single CSS is local to a
> cluster of caches, but also peers with a hierarchy of CSS servers. I
> recall that ICP was routed around this hierarchy until an actual
> destination cache was determined and returned. Anyway, I recall that I
> wasn't actually able to convince myself of bandwidth savings. The only
> thing that really made a difference was differential tariffs, ie.
> overseas bandwidth being more expensive than national bandwidth, and
> thus not all bytes were equal. This is one of the reasons that I kind of
> let things slip I guess eventually, although perhaps it is deserving of
> some more in-depth modelling. One of the issues was the diminishing
> returns I expected to be gained from increasing the cluster size.
> Perhaps a more factual-based revisit might find that the locality of
> reference gains from higher request rates to the 'distributed caching
> cluster' might bring more hit-rate improvements than I expected.

The benefit wouldn't so much be from hit rates as from lower latency -
you would go directly to whoever has a copy of the actual data to get
it, rather than having to wait for it to be cached at higher nodes.
CSS is like hint cache in that.
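
As a rough sketch of that "go directly to the holder" decision (the
function name, the plain dict of hints, and the peer names are all
hypothetical, and stale hints are not modeled):

```python
def choose_source(url, local_store, hints):
    """Decide where to fetch a URL from.

    local_store: set of URLs this cache already holds.
    hints: dict mapping URL -> name of a peer believed to hold a copy.
    Returns a (kind, peer) tuple; peer is None unless kind == "peer".
    """
    if url in local_store:
        return ("local", None)
    peer = hints.get(url)
    if peer is not None:
        # Fetch straight from the holder, skipping intermediate hops.
        return ("peer", peer)
    # No hint known: fall back to the origin server.
    return ("origin", None)


local_store = {"http://example.com/a"}
hints = {"http://example.com/b": "cache-b"}
```

With a hint in hand, the request never has to wait for the object to be
pulled up through higher nodes in a hierarchy.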
--
Jon Kay pushcache.com jkay@pushcache.com
http://www.pushcache.com/ (512) 420-9025
Squid consulting 'push done right.'
Received on Wed Nov 21 2001 - 11:04:43 MST
