Re: [PATCH] experiment in private caching

From: Alex Rousskov <rousskov@measurement-factory.com>
Date: Tue, 13 May 2014 10:18:53 -0600

On 05/13/2014 04:39 AM, Amos Jeffries wrote:
> On 13/05/2014 9:04 a.m., Alex Rousskov wrote:
>> Performance concerns: To determine whether the requested object is in
>> the cache, SMP Store requires semi-expensive lookups (one for each SMP
>> cache_dir and one for the shared cache_mem). AFAICT, this patch doubles
>> the number of such lookups per typical miss response due to a new
>> http.cc loop. The implementation using existing Store APIs (suggested
>> below) avoids this additional overhead.

> There are also large(r) performance issues with the StoreID approach.

The "large(r)" part is debatable and probably depends on the traffic mix
and the environment: The proposed changes slightly penalize nearly all
miss transactions but greatly benefit the "broken" ones.

> Using StoreID it becomes impossible for the unverified requests to ever
> HIT on any existing content cached from verified sources. This actively
> reduces both the HIT ratio

Yes, as I disclosed.

> and safety of responses. HIT being safer than MISS in these cases.

Not sure what you mean by "safer" in this context, but whether HIT or
MISS is correct (from the client and server points of view) depends on
the nature of the brokenness, does it not? If the Squid cache is stale
and the client is connecting to the right/fresh address, then MISS is
actually the correct action (i.e., it will serve the intended content)
even though it is more "expensive".

> In theory the one I picked has a higher overall HIT ratio possible.

Agreed. I do not think "highest hit ratio possible" trumps other
concerns in this case.

>> Reusing Store ID will not allow requests that fail validation to hit
>> cached entries that correspond to successfully validated requests, but
>> (a) that could be a good thing in some cases and (b) I wonder whether
>> that is a small price to pay for avoiding new uncertainties related to a
>> new way to generate a store key?
>
> Regarding (a) - which cases?

The client destination is actually correct, but Squid's DNS is
inaccurate/misconfigured/etc., and the Squid cache has wrong/stale
content (from the server's point of view). The client should go where
it wants to go instead of getting stale content from the cache.

> I know it can work. It is just being driven by the "higher HIT ratio"
> group. So options that decrease the HIT ratio are not accepted, and we
> may as well do nothing if those are the only options available. I
> personally am happy not adding any of this patch.

Let's not add it then! If you want, you can still offer the "group" many
other options, including:

1a) An option to disable Host validation in their environment.
    Can be ACL-driven.

1b) An option not to treat requests that failed validation specially
    as far as caching is concerned (allow HITs and caching as usual).
    Can be ACL-driven (see the rough sketch after this list).

2) A StoreID-based implementation with lower HIT ratio than (1a) or
    (1b) but increased protection from malicious requests.
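
As a rough sketch, (1a) and (1b) could look something like this in
squid.conf. The directives below are hypothetical and do not exist
today (the closest existing knob, host_verify_strict, is a plain
on/off toggle):

    acl intranet src 10.0.0.0/8

    # (1a) hypothetical: skip Host header validation for matching requests
    host_verify_bypass allow intranet

    # (1b) hypothetical: keep caching and HITs as usual even when
    # Host validation fails
    host_verify_allow_caching allow intranet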

In the future, if you think the patch should not be added or are happy
about it not being added, please say so explicitly. Since you are
wearing both developer and committer hats, you have to play devil's
advocate (well, every developer should, but committers are obligated to
do that). I would not have spent hours on the review had I known that
you were happy to abandon the patch!

>> BTW, are you sure that serving content cached by regular requests to
>> requests that fail validation is always desirable? Is it possible that
>> some failed-validation requests will actually get to a different/newer
>> version of the site (e.g., when DNS changed but Squid does not know
>> about that yet)? If you are not sure, please note that the patch will
>> introduce breakage in these cases that does not exist today. Using the
>> Store ID approach avoids that AFAICT.
>
> It is possible without this patch. If it is a problem then the site
> cache controls are broken or Squid is explicitly configured with
> refresh_pattern abuses.

Not necessarily, but this is not a question of "fresh content" from the
HTTP rules' point of view. It is a question of the client getting what
it wants.

The site servers may have moved and updated their content after the
move. In the worst case, imagine Squid trying to validate the cached
content and failing because the server has changed its IP address but
Squid's DNS does not have access to the new address yet. Squid will
then respond with an error or serve stale content. Neither is what the
client (or server) wants or deserves in this case.

You may argue that the server is at fault in that case, but that does
not help the clients or the Squid admins serving them.

>> Request adaptation happens after host validation. Request adaptation may
>> change the request URI and/or the Host header. However, storeSecurityKey
>> will remain set if it was set before adaptation and will not be set if
>> the new request actually becomes invalid. Are both behaviors desirable?
>>
>
> That we need to discuss some more.
>
> The adaptation output is created on the assumption that the URL
> generated from Host header is correct/valid.

There is no general assumption that the Host header (or any other
request aspect) has been validated prior to adaptation IMO. In fact, the
adaptation layer can be used to perform such validations.

> Is the adaptation
> result any more trustable than the original Host header? (GIGO etc.)

The adaptation trust is in the hands of the Squid admin. If they do not
trust the service, they can stop using it or ask us to add knobs to
restrict service adaptations to a "safe" (from the admin point of view)
subset.
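
For example, AFAIK the existing adaptation_access directive can already
gate a service with ACLs, something like:

    # real directives; the service and ACL names are made up for this example
    icap_service myReqmod reqmod_precache icap://127.0.0.1:1344/reqmod
    acl fromTrusted src 10.0.0.0/8
    adaptation_access myReqmod allow fromTrusted
    adaptation_access myReqmod deny all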

>> How will this change work with cache peering? My understanding is that
>> peer queries from requests that failed Host validation may result in
>> HITs and may be cached by the requesting peer, right? Will the
>> requesting peer use the special key (with storeSecurityKey appended) to
>> cache the response from the remote peer?
>
> No changes to the peer situation from this.

I will take that as a "yes" answer. This means that, in peering setups,
a cache entry keyed with storeSecurityKey may actually store the
content of an entry cached without storeSecurityKey. This is not a
problem as such, but perhaps the "highest hit ratio possible" patch
code should be changed to delete storeSecurityKey when caching a
from-peer response (because that response is known to come from a
storeSecurityKey-less entry).
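
To make that idea concrete, here is a toy C++ sketch of the key
handling (all names and the hash below are made up for illustration;
the real Store code derives MD5-based keys, and this is not the patch
code):

    #include <functional>
    #include <iostream>
    #include <string>

    // Toy stand-in for Squid's store key derivation: a request that
    // failed Host validation gets a security marker appended, so it
    // can never HIT entries cached by verified requests.
    static std::string storeKey(const std::string &method,
                                const std::string &uri,
                                const std::string &securityKey) {
        return std::to_string(
            std::hash<std::string>{}(method + uri + securityKey));
    }

    int main() {
        const std::string method = "GET", uri = "http://example.com/";

        const auto verifiedKey = storeKey(method, uri, "");
        const auto unverifiedKey = storeKey(method, uri, "security-marker");
        std::cout << std::boolalpha
                  << "keys differ: " << (verifiedKey != unverifiedKey) << "\n";

        // The suggested tweak: a from-peer response is known to come
        // from a marker-less entry, so drop the marker before caching.
        const bool fromPeer = true;
        const auto cacheKey =
            storeKey(method, uri, fromPeer ? "" : "security-marker");
        std::cout << "cached under the verified key: "
                  << (cacheKey == verifiedKey) << "\n";
    }

The point is just that dropping the suffix before caching would make
the from-peer copy shareable under the regular public key.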

Cheers,

Alex.