Re: distributed caching

From: Brian Degenhardt <bmd@dont-contact.us>
Date: Fri, 23 Mar 2001 10:23:04 -0800

Running squid peers as proxy-only does not exactly make them all have unique
caches. From experience, you can implement your heuristic of having popular
objects on all peers and less popular objects proxied from one peer. Here's
the deal:

In a normal peer configuration with every peer set as proxy-only, squid does
not report an ICP hit for an object it holds if that object is stale.
Therefore, extremely popular objects can end up on almost all peers, because
a higher request frequency increases the chance of a request arriving at a
time when the object is stale on all peers that have it. YMMV with request
frequency and refresh pattern, but my experience suggests that a linear
increase in request frequency translates to a logarithmic increase in the
number of peers in a cluster that have the object.
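
The mechanism above can be sketched as a toy simulation. This is only an
illustration under simplifying assumptions, not a model of squid itself:
requests arrive as a Poisson process, each request lands on a random peer,
a copy is fresh for a fixed ttl, nothing is ever evicted, and a peer stores
the object only when no peer in the cluster holds a fresh copy. The function
name and all parameters are hypothetical. It does not try to reproduce the
logarithmic relationship, only the way stale copies let an object spread.

```python
import random


def simulate(num_peers, request_rate, ttl, duration, seed=0):
    """Return how many peers hold a copy of one object after `duration`.

    request_rate: requests per time unit across the whole cluster (assumed).
    ttl: how long a copy stays fresh after being fetched (assumed fixed).
    """
    rng = random.Random(seed)
    fetched_at = {}  # peer index -> time that peer last fetched from origin
    now = 0.0
    while True:
        now += rng.expovariate(request_rate)  # next Poisson arrival
        if now > duration:
            break
        peer = rng.randrange(num_peers)
        any_fresh = any(now - t < ttl for t in fetched_at.values())
        if any_fresh:
            # Either a local hit, or a sibling reports an ICP hit and the
            # request is proxied through it; proxy-only means no new copy.
            continue
        # Every existing copy is stale, so no peer reports a hit: this peer
        # goes to the origin and stores the object itself.
        fetched_at[peer] = now
    return len(fetched_at)


if __name__ == "__main__":
    for rate in (0.01, 0.1, 1.0, 10.0):
        holders = simulate(num_peers=16, request_rate=rate, ttl=10.0,
                           duration=500.0)
        print(f"rate={rate:>5}: {holders} of 16 peers hold the object")
```

Running it for a few request rates shows how the number of holders changes
with popularity in this toy model. Real numbers depend on refresh patterns
and cache replacement, neither of which is modelled here.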

In order to actually have peers be proxy-only with unique caches, you have
to make them all parents of each other, or use the 'allow-miss' option for
cache_peer in 2.4 and up.

However, if you want something that sits between totally unique caches and
standard, non proxy-only peers, I would suggest that proxy-only is the way
to go.

cheers

-bmd

On Fri, Mar 23, 2001 at 11:41:45PM -0500, Roger Venning wrote:
>
> I've been thinking just a tiny bit about distributed caching again. Some of
> you might have seen the central squid server concept that SE Net of Adelaide
> had supported work on (http://www.senet.com.au/css). This was essentially
> a centralized cache digest aggregation point, queriable via ICP. I'm not
> sure whether the cost of having a separate well memory-resourced box is
> worth the benefits of cache-digests (although of course memory has now
> dropped below $1AU per MB... I'm young but can remember when even disk was
> more expensive than that).
>
> Essentially for a loose confederation of organisations that are prepared
> to act as siblings the problems can be in my (largely uninformed,
> _correction and additions desired_) opinion:
>
> o benefits of having a large distributed cache are largely negated by the
> fact that no-one is prepared to run in 'proxy-only' mode, and so all caches
> move to a state where they hold the same objects
>
> o ICP traffic between siblings is n^2, although multicast helps by halving
> this (they all have to reply right?). Unreachable peers impose performance
> penalties on your own clients (admittedly minimised). Slow peers
> continually impact performance. If your siblings aren't running well
> dimensioned links... Of course how many people have got multicast going?
>
> o Cache digests solve most of the above problems, but suffer from becoming
> outdated, and issues of accuracy, due to the update interval/size/bandwidth
> saving tradeoffs.
>
> In order to overcome the first problem, I think that a method of running a
> cache in an intermediate state between 'proxy-only' and the normal 'cache
> those objects that are cacheable' mode might be useful. I suggest that this
> could be done by using past popularity as an indication of future
> popularity, and that 'highly popular' objects could migrate into multiple
> positions in a distributed cache, while unpopular objects are left on a
> single cache.
>
> This could be done by keeping popularity state, 'inferring' from last access
> time, or done stochastically(?) by simply assigning a 'proxy-only
> probability' - but the number of requests for a single object will normally
> be too low for this last idea to work very successfully as far as I can
> tell.
>
> I think there are elements of the Central Squid Server (CSS) that attack
> the last two points, especially the fact that CSSs could be themselves
> formed into a hierarchy, so a CSS could be kept regionally. The 'proxy-only
> probability' idea could be implemented separately.
>
> Finally, all of ICP, cache digests and CSS are based around HTTP/1.0 style
> objects, as recognised by HTCP. Does anyone have estimates of what
> percentage of objects are unable to be located by ICP? (Does this make
> sense?)
>
> Roger.
>
>
> -------------------------------------------------------------
> Roger Venning \ Do not go gentle into that good night
> Melbourne \ Rage, rage against the dying of the light.
> Australia <r.venning@bipond.com> Dylan Thomas
>
Received on Fri Mar 23 2001 - 11:23:06 MST
