Re: HTCP versus ICP (fwd on behalf of Paul Vixie)

From: Dax Kelson <dkelson@dont-contact.us>
Date: Wed, 1 Jul 1998 13:18:23 -0600 (MDT)

Since he isn't a list subscriber yet, his message bounced.

---------- Forwarded message ----------
Date: Wed, 01 Jul 1998 12:00:19 -0700
From: Paul A Vixie <vixie@vix.com>
To: squid-users@ircache.net
Cc: Dax Kelson <dkelson@inconnect.com>
Subject: Re: HTCP versus ICP (fwd)

Dax Kelson <dkelson@inconnect.com> asked me to comment on this, since I'm
not a regular member of the squid-users list. In fact, I subscribed to the
"icp" list and hoped to foster some HTCP discussion there, but that never
quite happened. Anyway, please CC me if you want me to see your reply.

> > http://www.vix.com/ietf/htcp.txt
> > Is HTCP a "good" thing?
>
> HTCP is essentially ICP except that HTTP headers are included with ICP_HIT
> reply.

No. HTCP also includes the HTTP headers in the _request_, which lets a
server avoid false hits: matches that look valid when the URL is the only
key but turn out to be wrong once more headers are known. For good or ill,
many servers now give different answers for the same URL based only on
differences in the User-Agent field. ICP cannot cache correctly in this
case, even if Vary is used, which it usually is not.
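
The false-hit problem can be illustrated with a minimal sketch. The class
and function names below are hypothetical, and treating User-Agent as the
one significant header is an assumption made for the example; none of this
is taken from the HTCP draft or from Squid.

```python
# Hypothetical sketch: a URL-only key versus a URL-plus-headers key.

def url_key(url, headers):
    """ICP-style key: the URL alone identifies the cached object."""
    return url


def header_key(url, headers, significant=("user-agent",)):
    """HTCP-style key: the URL plus the request headers assumed to matter."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return (url,) + tuple(lowered.get(name, "") for name in significant)


class Cache:
    def __init__(self, key_func):
        self.key_func = key_func
        self.store = {}

    def put(self, url, headers, body):
        self.store[self.key_func(url, headers)] = body

    def lookup(self, url, headers):
        # Returns the cached body, or None on a miss.
        return self.store.get(self.key_func(url, headers))


url = "http://example.com/"
browser_a = {"User-Agent": "BrowserA/1.0"}
browser_b = {"User-Agent": "BrowserB/4.0"}

# URL-only key: browser B is handed the page that was built for browser A.
icp_like = Cache(url_key)
icp_like.put(url, browser_a, "page for A")
false_hit = icp_like.lookup(url, browser_b)

# URL-plus-headers key: the same lookup is correctly reported as a miss.
htcp_like = Cache(header_key)
htcp_like.put(url, browser_a, "page for A")
miss = htcp_like.lookup(url, browser_b)
```

Here `false_hit` holds the page meant for the other browser, while `miss`
is `None`, so the second cache would go on to fetch the right variant.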

And perhaps more importantly, HTCP allows third party replies of the form "I
don't have it but I know who does", which can be used to automatically build
caching hierarchies without static configuration.

What HTCP was actually built for is monitoring another cache's additions
and deletions, so that read-behind mirroring can be used in place of
read-ahead cut-through when there's a primary cache miss. This results in
much better time-to-first-byte for users of the cache. Now, I'll admit that
this optimization helps more with transparent caching than with normal
browser-configured caching, but it's still a boon to the scalability of any
given cache hierarchy. Basically, the primary cache does the fetch itself
rather than waiting for the secondary cache to do it, so there's no "peer"
or "parent" dichotomy as in ICP. The secondary cache fetches the object
from the primary cache if, while monitoring the primary cache, it detects
a cache addition that looks interesting.
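
As a rough sketch of that data flow: the primary cache fetches from the
origin on a miss, and a monitoring secondary cache copies additions it
finds interesting. All names are hypothetical, only additions (not
deletions) are modeled, and the notification is a plain synchronous
callback where a real deployment would send it over the network.

```python
# Hypothetical sketch of read-behind mirroring via monitoring.

class PrimaryCache:
    def __init__(self, fetch_origin):
        self.fetch_origin = fetch_origin  # callable: url -> body
        self.store = {}
        self.monitors = []                # callbacks invoked on each addition

    def get(self, url):
        if url not in self.store:
            # Primary miss: fetch directly from the origin instead of
            # waiting on a secondary cache.
            self.store[url] = self.fetch_origin(url)
            for notify in self.monitors:
                notify(url)
        return self.store[url]


class SecondaryCache:
    def __init__(self, primary, is_interesting):
        self.primary = primary
        self.is_interesting = is_interesting  # callable: url -> bool
        self.store = {}
        primary.monitors.append(self.on_addition)

    def on_addition(self, url):
        # Read-behind: copy the object from the primary, not the origin.
        if self.is_interesting(url):
            self.store[url] = self.primary.store[url]


origin_fetches = []

def fetch_origin(url):
    origin_fetches.append(url)
    return "body of " + url

primary = PrimaryCache(fetch_origin)
secondary = SecondaryCache(primary, lambda url: url.endswith(".html"))

primary.get("http://example.com/index.html")
primary.get("http://example.com/logo.gif")
primary.get("http://example.com/index.html")  # served from the primary store
```

The origin is contacted once per object, and the secondary ends up holding
only the object its (assumed) interest test selected.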

> HTCP does not solve scalability problems of ICP, as far as I can see. A
> cache has to wait for miss replies from all peers before going through a
> parent or direct.

I suppose I could have written a better "applicability statement". HTCP *is*
more scalable than ICP since the monitoring relationships and third party
responses allow a primary cache to have a single "query neighbor" and to then
get an authoritative response _from_that_neighbor_ which describes the object's
location _anywhere_in_the_hierarchy_ (or describes it as not present anywhere
in the hierarchy).
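
A minimal sketch of the single-query-neighbor idea, assuming the neighbor
has built a directory of who holds what from its monitoring relationships.
The class, method names, and reply labels are hypothetical and are not the
HTCP wire format; the answer is only as authoritative as that directory is
complete.

```python
# Hypothetical sketch of a third-party reply from a single query neighbor.

class QueryNeighbor:
    def __init__(self, name):
        self.name = name
        self.local = set()    # URLs this cache holds itself
        self.directory = {}   # url -> name of another cache known to hold it

    def note_addition(self, cache_name, url):
        # Learned by monitoring another cache in the hierarchy.
        self.directory[url] = cache_name

    def note_deletion(self, cache_name, url):
        if self.directory.get(url) == cache_name:
            del self.directory[url]

    def query(self, url):
        if url in self.local:
            return ("have-it", self.name)
        holder = self.directory.get(url)
        if holder is not None:
            # "I don't have it but I know who does."
            return ("held-by", holder)
        return ("not-in-hierarchy", None)


neighbor = QueryNeighbor("cache-b")
neighbor.local.add("http://example.com/a")
neighbor.note_addition("cache-c", "http://example.com/b")

reply_local = neighbor.query("http://example.com/a")
reply_third_party = neighbor.query("http://example.com/b")
reply_absent = neighbor.query("http://example.com/c")
```

The primary cache sends one query and gets one of three answers, rather
than waiting for miss replies from every peer.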

So while HTCP can be used as a better ICP, its fundamental design premise is
inverted from ICP's, and if you use every protocol feature contained in HTCP,
your data will flow in opposite directions from ICP's design but you'll have
a higher "system aggregate" cache hit rate, lower bandwidth utilization, and
better scalability.

> > Will it be implemented in Squid 1.2?
>
> Initial implementation of HTCP got stuck mainly for performance reasons.
> HTCP, among other things, would require disk I/Os to swap in object headers
> for every ICP_HIT reply because Squid does not keep all headers in memory.

It was a huge performance hit for us, too. But URL-as-key is just broken.
An end user who gets the wrong page (what you call a "low probability false
hit") does not care that the cause was Squid's memory usage profile.

> This does not answer your question though.
>
> Regardless of HTCP status, _if_ most of us consider a small false hit ratio
> acceptable, Cache Digests are the way to go. They do not introduce any
> query/response delays and, thus, scale well with the number of peers. The
> price is more RAM, but with less than a 1MB digest per 16GB peer and cheap
> memory, it's not a big deal.
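
The "1MB digest per 16GB peer" figure can be turned into bits per cached
object with some back-of-the-envelope arithmetic. The 13 KB average object
size and the four hash functions are assumptions made for this sketch, as
is treating the digest as a Bloom filter; the thread does not describe the
digest's internals.

```python
import math

KB, MB, GB = 2 ** 10, 2 ** 20, 2 ** 30


def digest_bits_per_object(cache_bytes, digest_bytes, avg_object_bytes):
    """Digest bits available per cached object."""
    objects = cache_bytes / avg_object_bytes
    return digest_bytes * 8 / objects


def bloom_false_hit_rate(bits_per_object, hash_count):
    """Standard Bloom filter false-positive estimate: (1 - e^(-k/b))^k."""
    return (1.0 - math.exp(-hash_count / bits_per_object)) ** hash_count


# Assumed values: 13 KB average object size, 4 hash functions.
bits = digest_bits_per_object(16 * GB, 1 * MB, 13 * KB)
rate = bloom_false_hit_rate(bits, 4)
```

Under these assumptions the digest has about 6.5 bits per object and the
estimated false hit rate is a few percent; a smaller average object size
would leave fewer bits per object and raise that rate.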

I'd like to learn more about Cache Digests. Is there an Internet Draft?
Received on Wed Jul 01 1998 - 12:26:46 MDT
