Re: [squid-users] Re: ICP and HTCP and StoreID

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 13 Feb 2014 22:22:19 -0700

On 02/13/2014 06:05 PM, Nikolai Gorchilov wrote:
> On Thu, Feb 13, 2014 at 10:04 PM, Alex Rousskov wrote:
>> AFAICT, if Squid always uses URLs for anything
>> outside internal storage, everything would work correctly and all use
>> cases would be supported well, without any additional options.

> I believe this optimization covers the most common scenario when using
> cache peers and StoreID at the same time.

Yes, but I suspect that important scenario is relatively rare. And even
if it were common, we should not break a protocol and an interface
design principle to optimize one important use case, especially when
that case can be optimized by other means.
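
For illustration, the underlying mismatch looks roughly like this: the
local cache stores the object under a key derived from the normalized
Store ID, while the ICP query from a peer carries the original URL, so a
lookup keyed on that raw URL reports a false miss. A minimal, purely
hypothetical C sketch (the real rewriter is an external store_id helper
and real keys are MD5 hashes, but the shape of the problem is the same):

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical normalization; stands in for an external store_id helper. */
    static const char *store_id(const char *url)
    {
        if (strstr(url, "/videoplayback?id=42"))
            return "store-id://example-cdn/videoplayback/42";
        return url; /* no rewrite */
    }

    int main(void)
    {
        const char *stored_key = store_id("http://cache03.cdn.example.com/videoplayback?id=42");
        const char *queried_url = "http://cache07.cdn.example.com/videoplayback?id=42";

        /* Lookup by the raw queried URL: false UDP_MISS although the object is cached. */
        printf("%s\n", strcmp(stored_key, queried_url) ? "UDP_MISS" : "UDP_HIT");

        /* Normalizing the queried URL the same way restores the UDP_HIT. */
        printf("%s\n", strcmp(stored_key, store_id(queried_url)) ? "UDP_MISS" : "UDP_HIT");
        return 0;
    }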

If you want to add an option to use the received ICP reqnum field as a
public cache key for lookup, you should be allowed to do that, IMO. If
you want to add an option to include the Store ID in ICP and HTCP
requests, you should be allowed to do that too. AFAICT, either one will
give you the performance optimization you want without violating
protocols and interfaces.
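
For reference, an ICP query on the wire (RFC 2186) is a small fixed
header followed by the requester address and the null-terminated URL.
Roughly, in C (the trailing Store-ID field is purely hypothetical, only
to show where the second option could piggyback):

    #include <stdint.h>

    /* ICP v2 message layout per RFC 2186 (all fields in network byte order). */
    struct icp_message {
        uint8_t  opcode;       /* ICP_OP_QUERY, ICP_OP_HIT, ICP_OP_MISS, ...   */
        uint8_t  version;      /* 2                                            */
        uint16_t length;       /* total message length, header + payload       */
        uint32_t reqnum;       /* request number; pairs a reply with its query */
        uint32_t options;
        uint32_t option_data;
        uint32_t sender_addr;
        /* Payload: requester host address (queries only), then the URL,
         * null-terminated.  A hypothetical optional Store-ID field could be
         * appended after the URL; whether existing parsers tolerate trailing
         * data is one of the things that would need checking. */
    };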

> There's almost no practical sense in having different cache peers use
> different StoreID logic. They either use the same rewriter or no
> rewriter at all. Seems like common sense to me.

Sure! Or one of them uses a rewriter and the other does not (or is not
even running Squid). Or both of them were using a rewriter yesterday,
but one of them was changed to use no rewriter today. Or there is now a
load balancer or traffic-auditing hop that blocks or complains about
ICP/HTCP requests with bogus URLs.

Using internal StoreIDs instead of URLs for proxy-to-proxy communication
introduces too many problems to be a viable general solution. Yes, in a
tightly controlled cache hierarchy it is technically possible to throw
all those considerations away to shave off a few processing
milliseconds, but that alone is not reason enough to support it as a
general mechanism. And, again, there may be two ways to save those
milliseconds without introducing serious problems.

> Maybe I'm wrong, but AFAIK Squid never uses "slow" processing methods
> on incoming ICP/HTCP requests. Passing the incoming ICP/HTCP request
> URL through the StoreID helper would change this design principle.

Lack of async code is not really a design principle, and I am guessing
that HTCP is already async by the very nature of TCP message processing
(i.e., Squid may read a partial message). It is just that the code never
needed an async step [badly enough]. However, with both of the solutions
I am suggesting above, that async step is still not needed!

>> If somebody wants to extend ICP/HTCP to include StoreId in the request
>> (as an optional additional field), they may do so, but that optional
>> optimization does not change the overall design principle: StoreId for
>> the internal storage; URL for everything else.
>
> Let's put it another way: if we need correct UDP_HIT/UDP_MISS
> responses between peers using StoreID, we have to compromise on one
> of the following design principles:
> - "Squid always uses URLs for anything outside internal storage"
> - "Squid never uses slow processing on UDP requests"
>
> Please correct me if I'm wrong.

Using the ICP reqnum as a cache key or adding StoreID to ICP/HTCP
requests compromises neither principle, AFAICT.
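
A rough sketch of why not: whichever key the receiver ends up using (an
explicit Store ID carried in the query, a key recovered via reqnum, or
the plain URL), the lookup can stay a synchronous in-memory check, with
no call out to a slow helper. Hypothetical names throughout, not Squid's
actual code:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* Toy index of keys this peer has cached; real Squid uses MD5 store keys. */
    static const char *cached_keys[] = { "store-id://example-cdn/videoplayback/42", NULL };

    static bool index_has(const char *key)      /* synchronous, in-memory lookup */
    {
        for (const char **k = cached_keys; *k; ++k)
            if (strcmp(*k, key) == 0)
                return true;
        return false;
    }

    /* Answer an ICP query: true = UDP_HIT, false = UDP_MISS.  No StoreID helper
     * call, and therefore no async step, is needed on this path. */
    static bool answer_icp_query(const char *url, const char *store_id /* may be NULL */)
    {
        return index_has(store_id ? store_id : url);
    }

    int main(void)
    {
        const char *url = "http://cache07.cdn.example.com/videoplayback?id=42";
        /* Raw URL only: miss.  With the Store ID supplied by the sender: hit. */
        printf("%s\n", answer_icp_query(url, NULL) ? "UDP_HIT" : "UDP_MISS");
        printf("%s\n", answer_icp_query(url, "store-id://example-cdn/videoplayback/42")
                           ? "UDP_HIT" : "UDP_MISS");
        return 0;
    }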

> What is important for me is to be able to properly answer incoming UDP
> requests that require StoreID normalization (UDP_HIT/UDP_MISS), and
> later, when the actual HTTP request comes, to be able to refresh the
> object if refresh logic requires it.

Would using the ICP reqnum field as a cache key or adding StoreID to
ICP/HTCP requests work for your use cases? I have not fully checked
whether the former is possible, but I think it is. The latter is
possible but more difficult to implement (and would bump into UDP
packet size limits more often?).
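
For rough scale: per RFC 2186, an ICP query spends 20 bytes on the fixed
header plus 4 bytes on the requester address before the null-terminated
URL, and a typical 1500-byte Ethernet MTU leaves about 1472 bytes of UDP
payload after the IP and UDP headers, so roughly 1448 bytes remain for
the URL today. Carrying a Store ID as well would have to fit both
strings into that same budget to avoid fragmentation.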

> Current implementation prevents the refresh.

We know that the current implementation is broken, no question about
it! IIRC, the developer responsible for that breakage promised to fix it
when the code was committed with that known breakage, but even if he
does not, Amos, I, or others will [eventually].

Cheers,

Alex.