Re: What is The logic of Vary Headers cachiness?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 25 Jul 2013 17:13:42 +1200

On 25/07/2013 11:55 a.m., Henrik Nordström wrote:
> On Wed 2013-07-24 at 10:01 -0600, Alex Rousskov wrote:
>
>> That is what Squid does today, bugs notwithstanding. If the found store
>> entry is the special "Vary" entry, then Squid does another lookup, with
>> the appropriate header values added to the store key hash.
> Yes, but that is only part of what is needed for correct Vary support.
>
> The full logics (still missing from squid-3) involves an n-m map of
> requests to response variants.
>
>> Instead of returning a special Vary object during this first lookup,
>> Squid could return a regular cached object (one of the Vary variants),
>> with some special Vary flag set,
> Yes, ideally the vary lookup logics should fit in the store, and not
> upper layers.
>
> Was not really an option when the Vary logics was originally designed in
> Squid-2.

I was intending to work up a body payload for the x-vary-marker object
which contained that N-M map you speak of above. With the pending Key
header feature this makes even more sense.
The redirected lookup can of course be done internally by the store
overall, but it can't be all in one dir class due to size differences.
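
For readers following along, the two-step lookup being discussed can be sketched roughly as below. This is only an illustration in Python, not Squid code: store_key(), VaryMarker, selected_values() and lookup() are hypothetical names, the store is a plain dict, and the hash input layout is an assumption.

```python
import hashlib


def store_key(method, url, vary_values=()):
    """Hypothetical key: digest of method, URL and selected header values."""
    h = hashlib.sha1()
    for part in (method, url):
        h.update(part.encode() + b"\0")
    for name, value in vary_values:
        h.update(name.encode() + b"=" + value.encode() + b"\0")
    return h.hexdigest()


class VaryMarker:
    """Stand-in for the special "Vary" entry stored under the base key."""

    def __init__(self, vary_headers):
        self.vary_headers = sorted(h.lower() for h in vary_headers)


def selected_values(marker, request_headers):
    """Pick the request header values named by the marker's Vary list."""
    req = {k.lower(): v for k, v in request_headers.items()}
    return [(name, req.get(name, "")) for name in marker.vary_headers]


def lookup(store, method, url, request_headers):
    """First lookup by base key; a marker hit triggers the second lookup."""
    entry = store.get(store_key(method, url))
    if not isinstance(entry, VaryMarker):
        return entry  # plain hit, or None on a miss
    # Second lookup: same URL, key extended with the varying header values.
    values = selected_values(entry, request_headers)
    return store.get(store_key(method, url, values))
```

In this sketch the marker carries only the list of header names; the proposal above would have it carry the whole N-M map as its payload.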

>
>>> From my point of view on the code and after coding StoreID I know that
>>> there are two lookups for HEAD and GET and they are not the same object
>>> So why not just use 3 object level checking??
> GET and HEAD are handled as different objects due to Squid design. It's
> not meant to from HTTP point of view.
>
>> I am afraid I do not know what you mean by a "3 object level checking".
> Neither do I.
>
>> IIRC Squid does not lookup HEAD and GET for the same request under
>> normal conditions. Lookups with multiple methods happen for HEAD
>> requests and for purging. Both categories are relatively rare.
> There is some cross-magics between HEAD and GET to try to map Squid
> store semantics to what HTTP expects, but far from complete.

And IMHO unnecessary complexity.

>
>> Finally, as we are migrating to per-cache store indexes, more store
>> lookups should be avoided when possible because the number of mandatory
>> lookups has to be multiplied by (the number of cache_dirs plus one for
>> the memory cache index) to check all the indexes.
> There is a design decision that needs to be taken here.. should it be
> possible to have different responses for the same Vary:ing URL to be
> stored in different stores, or should they all need to go into the same
> store?

Making them all use the same store will in some cases violate the
min-size/max-size limitations. It also restricts us from having
something like a special index for looking up the x-vary-marker objects,
one which does the sub-lookup logic and protects against duplicating that
logic in any storage media with unusual index searching (i.e. the remote
DB cloudy stores being played with nowadays).
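
To make the size-limit point concrete, here is a small Python sketch. The store names, the dict layout, the pick_store() helper and the inclusive limits are all assumptions for illustration; only the min-size/max-size idea comes from the discussion.

```python
def pick_store(stores, size):
    """Return the name of the first store whose size limits admit the object.

    Each store is a dict with hypothetical keys "name", "min_size" and
    "max_size" (None meaning no upper limit). Limits are treated as inclusive.
    """
    for s in stores:
        upper_ok = s["max_size"] is None or size <= s["max_size"]
        if s["min_size"] <= size and upper_ok:
            return s["name"]
    return None  # no store accepts an object of this size


# Two variants of one URL: a small compressed body and a large identity body.
stores = [
    {"name": "small-objects", "min_size": 0, "max_size": 32768},
    {"name": "large-objects", "min_size": 32769, "max_size": None},
]
gzip_variant_store = pick_store(stores, 20000)
identity_variant_store = pick_store(stores, 400000)
```

Here the two variants land in different stores, so forcing them into one would break a limit on one side or the other.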

> This has direct impact on where Vary logics can be performed.
>
> Full Vary support requires the following store operations
>
> * Which responses (ETag values) is known for given URL? This list is
> needed to construct a If-None-Match validation request as needed to ask
> upstream which variant is the correct response.

The x-vary-marker payload being the N-M index voids this impact. The ETag
can be metadata stored there, avoiding sub-lookups entirely in many
cases.
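
As a rough Python sketch of that, assuming the marker payload is a list of per-variant records with an optional "etag" field (the record layout and the if_none_match() name are hypothetical), the validation header can be built without touching the variants themselves:

```python
def if_none_match(variants):
    """Build an If-None-Match field value from ETags kept in the marker.

    `variants` is the hypothetical marker payload: a list of dicts, each
    optionally carrying the variant's ETag exactly as received (quotes and
    any W/ prefix included). Duplicates are dropped, order is preserved.
    """
    etags = []
    for variant in variants:
        etag = variant.get("etag")
        if etag and etag not in etags:
            etags.append(etag)
    return ", ".join(etags)
```

A variant stored without an ETag simply contributes nothing to the list.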

> * Add mapping of request headers to specific response variant
> (identified by ETag). Or better yet allow the same response body to be
> referenced by multiple responses (simplifies validation logics and aging
> considerably). Response headers is the original response updated with
> header data from the 304 response in response to If-None-Match.

The x-vary-marker being an N-M index again voids this impact. Using the
response field-values instead of the full request header field covers
this. Again no sub-lookup required.

>
> * Lookup the response matching this specific request.
>
> To complicate matters further different responses MAY have different
> Vary header value. But it is not very likely.

With the Key header this becomes a whole lot more likely. In fact there
is a relatively strong case for most responses having both a Vary and a
Key header, making four identifier patterns per response variant (ETag,
Vary, Key, and Digest hash when we get there).

If the x-vary-marker payload is the N-M map, in the form of a list of
variant pattern matches, *all* of which are applied to the request and
then selected from (in direct accordance with the HTTP specs' algorithm),
we can fully identify the set of available variants before the second
lookup pulls out a specific one. Working with sets like this is exactly
how the specs are phrased.
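
A minimal Python sketch of that set-based matching, covering only the Vary pattern: the payload layout, normalize() and matching_variants() are hypothetical, and the comparison is plain string equality after whitespace collapsing, which is an assumption rather than anything a spec mandates.

```python
def normalize(value):
    """Collapse whitespace so trivially different field values compare equal."""
    return " ".join(value.split())


def matching_variants(marker_payload, request_headers):
    """Apply every variant pattern to the request and return the matching set.

    Each payload record is a dict with "etag" and "vary", where "vary" maps
    a header name to the request value the variant was stored for. Different
    records may name different headers, since Vary can differ per response.
    """
    req = {k.lower(): normalize(v) for k, v in request_headers.items()}
    matches = []
    for variant in marker_payload:
        pattern = variant["vary"]
        if all(req.get(name.lower(), "") == normalize(value)
               for name, value in pattern.items()):
            matches.append(variant)
    return matches
```

Only after the full set is known would the second lookup fetch one variant; how to choose among several matches is left out of this sketch.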

Amos
Received on Thu Jul 25 2013 - 05:13:48 MDT
