Re: Suggestion - cache_peer and proxy-only

From: Q <q@dont-contact.us>
Date: Tue, 20 Oct 1998 13:49:46 +1000 (EST)

On Tue, 20 Oct 1998, Mark Reynolds wrote:

>
> In a situation where many Squids are neighbouring with
> each other, there would probably be many objects stored
> multiple times on multiple caches.
>
> If all caches had high bandwidth links to each other (like at
> a peering point, or Internet Exchange point), then this wouldn't
> seem to be ideal.
>
> Popular objects will never be purged, and remain stored on
> multiple caches. If objects were stored once (or maybe even twice)
> only, across all caches, then you would end up with a larger
> overall cache, which benefits all those in the mesh.

It depends on what you're trying to achieve at your peering point. From a
speed point of view, it would be better to have very popular objects
stored at most of the peers, and not just one or two. The peers would then
be able to serve popular objects quickly and resort to sibling fetches
for less frequently accessed items.
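
For illustration, here is a minimal cache_peer sketch (the hostnames are
placeholders and 3128/3130 are just the usual HTTP/ICP ports). Leaving
'proxy-only' off a sibling line means objects fetched via that sibling are
also stored locally, so the popular stuff ends up on every peer; adding
'proxy-only' passes sibling hits through without keeping a local copy.

    # sibling hits are copied into the local cache as well
    cache_peer peer1.example.net sibling 3128 3130

    # sibling hits are passed through without being stored locally
    cache_peer peer2.example.net sibling 3128 3130 proxy-only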

The other thing you need to consider is just what percentage of the
objects in your cache actually generate hits. There isn't much to be
gained in having a 'proxy-only' induced combined peer storage capacity of
100Gig if only 5 or 6Gig of all the objects in the cache will ever
generate a hit before they expire. (Only a guesstimate.)

It would be better to have as many of these hot objects as possible stored
where they're needed. In most cases I have seen, similarly sized and
utilised caches will have a fairly low peer hit rate. Increasing the cache
swap size will increase the hit rate, but by an exponentially decreasing
amount. The point of convergence, of course, is where your cache is able
to store every requested object for its designated lifespan.

> Is an expiration or purging policy which looks for common
> objects across all peers a good or bad idea?

It sounds expensive. Intentionally removing objects because they are too
popular doesn't make sense. I believe it would be better to leave the
objects there. If your cache swap becomes full it will remove old or
expired objects, which in most cases are the objects that never generate a
hit. (Assuming your cache isn't undersized for the amount of traffic it is
handling.)
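
For what it's worth, the point at which the swap counts as "full" is
tunable via the swap watermarks: replacement starts once usage passes the
low mark and gets more aggressive as it approaches the high mark. (The
values below are just the usual defaults, not a recommendation.)

    # begin replacing objects above 90% swap usage,
    # replace more aggressively approaching 95%
    cache_swap_low  90
    cache_swap_high 95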

The whole point of the cache is not to store the maximum number of unique
objects but to store those objects that are going to be requested more
than once. Having those objects at a peer is good; having them local is
better.

> Of course, you wouldn't want all common objects purged at the
> same time. Nor would you want it running as a high priority task.
> Maybe, purge objects from your local cache if more
> than 2 copies exist on your current cache_peers ?
>
> Just an idea. Any thoughts on the effectiveness would be welcome.

I also thought cache peering would benefit from this sort of 'maximum
available space' setup until I started to spend a lot of time watching the
behaviour and performance of my own caches.

My recommendation is that you shouldn't be too concerned about the hit
rate between peers, unless they are of significantly different sizes. The
bottom line is always the overall HITs you get. It doesn't matter whether
the hits come from local or peer caches; the more hits you get, the less
it will cost you in traffic charges.

Seeya...Q

               -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                        Quinton Dolan - q@fan.net.au
                        Systems Administrator
                        Fast Access Network
                        Gold Coast, QLD, Australia
                        Ph: +61 7 5574 1050
                        SAGE-AU Member