Re: Cache Digests from Alex Rousskov on 1998-09-11 (squid-users)

From: Alex Rousskov <rousskov@dont-contact.us>
Date: Fri, 11 Sep 1998 16:36:05 -0600 (MDT)

On Sat, 12 Sep 1998, Stephen Baxter wrote:

> Wouldn't it be better for the digest to be used as a really good guess
> mechanism for ICP? So an object is looked up in all of the digests and
> found that it may be in squid1, squid5 and squid7 - not bad if there are
> 20 caches in the mesh - heaps less ICP.

This is already implemented. We scan all the peers and select those with a
HIT reported by their digest. Then we apply time measurements from NetDB to
select the best peer if several had HITs in their digests.
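
A minimal sketch of the selection just described, in Python rather than
Squid's C. The names Peer, rtt_ms and select_peer are hypothetical, a plain
set stands in for the real digest, and rtt_ms stands in for a NetDB
measurement; this is an illustration, not the actual Squid code.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    rtt_ms: float  # assumed stand-in for a NetDB round-trip measurement
    digest: set = field(default_factory=set)  # assumed stand-in for a cache digest

    def digest_hit(self, url):
        # A real digest is lossy and can report false hits; a set cannot.
        return url in self.digest


def select_peer(peers, url):
    """Return the lowest-RTT peer whose digest reports a HIT, or None."""
    candidates = [p for p in peers if p.digest_hit(url)]
    if not candidates:
        return None  # all-MISS case
    return min(candidates, key=lambda p: p.rtt_ms)
```

With three peers where squid1 and squid5 both report a HIT, the call returns
whichever of the two has the smaller rtt_ms.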

The problem arises when there are MISSes in all peer digests. With high
probability, the object is simply not out there. However, the current code
does ICP queries in this all-MISS case, just to be sure. Eventually, the
order and applicability of peer selection algorithms should be configurable,
I guess.
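
One way a configurable order could look, as a rough sketch only. The
function choose_source, the method names "digest" and "icp", and the two
callables are all hypothetical; nothing here reflects an existing Squid
option.

```python
def choose_source(url, digest_lookup, icp_query, methods=("digest", "icp")):
    """Try each configured peer selection method in order.

    digest_lookup and icp_query are assumed callables that return a peer
    name, or None when that method finds no peer holding the object.
    Returns (peer, method); (None, "direct") means go to the origin server.
    """
    lookups = {"digest": digest_lookup, "icp": icp_query}
    for method in methods:
        if method not in lookups:
            raise ValueError("unknown selection method: " + method)
        peer = lookups[method](url)
        if peer is not None:
            return peer, method
    return None, "direct"
```

The default order mirrors the current behaviour described above: digests
first, then ICP in the all-MISS case. Passing methods=("digest",) would skip
the ICP queries entirely.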

> Instead of using a lossy mechanism such as digests for absolute resolution
> of an object location, let ICP kick in and finish the job for you. The
> result is fewer and better targeted ICP packets.

Why would you want to double-check the digest's guess? In most cases, you end
up with the same result, and you do not get 100% insurance against false hits
anyway. Plus you pay for the ICP round trip time (at least!), which we were
actually trying to avoid...

> Just an idea - it is the same way we are implementing the smart neighbour
> - in order to get a real hit on the location of the object.
> This would instantly remove the need to mod squid for false hits!

There is NO algorithm that guarantees the absence of false hits in a
distributed environment. Some algorithms have better false-hit ratios, some
worse, that's all. Squid must handle false hits. Also note that a
false-hit-like situation arises when your peer is temporarily down or cancels
your request for some reason.
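
A rough sketch of what handling a false hit could look like, again in Python
for illustration. PeerMiss, fetch and the two fetch callables are
hypothetical names, not Squid internals. The point is that a false hit, a
dead peer and a timed-out request all take the same recovery path.

```python
class PeerMiss(Exception):
    """Hypothetical error: the peer did not have the object (a false hit)."""


def fetch(url, fetch_from_peer, fetch_from_origin):
    """Try the selected peer first; fall back to the origin on any failure.

    Both callables are assumptions: each takes a URL and returns the object
    body, and fetch_from_peer may raise PeerMiss or a connection error.
    Returns (body, source) where source is "peer" or "origin".
    """
    try:
        return fetch_from_peer(url), "peer"
    except (PeerMiss, ConnectionError, TimeoutError):
        # False hit, peer down, or request dropped: same recovery for all.
        return fetch_from_origin(url), "origin"
```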

I strongly believe that instead of designing complex and heavy bullet-proof
algorithms, we should use lightweight techniques and then handle 1-5% of
exceptional situations in a robust way. Let's optimize for the common case.

Alex.
Received on Fri Sep 11 1998 - 15:37:35 MDT
