Re: [PATCH] support parameters for no-cache and private

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 19 Feb 2013 16:20:10 +1300

On 19/02/2013 6:48 a.m., Alex Rousskov wrote:
> On 02/16/2013 06:51 PM, Amos Jeffries wrote:
>
>> Since we now support HTTP/1.1 storage and revalidation of
>> Cache-Control:no-cache it is important that we at least detect the cases
>> where no-cache= and private= contain parameters.
> The patch appears to do more than just detection -- Squid starts to
> ignore CC:no-cache and, in some cases, CC:private (with or without
> parameters, depending on circumstances).
>
>
>> AFAIK these are still rare occurances due to the historic lack of
>> support. So for now Squid just detects and exempts these responses from
>> the caching performed. The basic framework for adding future support of
>> these HTTP/1.1 features is made available
>> + } else if (/* p &&*/ !httpHeaderParseQuotedString(p, (ilen-nlen-1), &private_)) {
>> + debugs(65, 2, "cc: invalid private= specs near '" << item << "'");
>> + clearPrivate();
>> + } else {
> It feels wrong to clear and ignore a private CC directive that we know
> is there, even though we cannot parse its value. Should we be
> conservative and treat this as private without parameters instead of
> possibly violating HTTP by ignoring CC:private (e.g., if our parser is
> wrong)?

Yes. This is undefined behaviour, but to be on the safe side this
defaults again to "CC:private".

> Also, if this code stays, please reshape it to handle the successful
> parsing case (with side-effect of setting private_) first, for clarity:
>
> if (!p) {
> private_.clean(); // last CC:private wins
> setMask(true, true);
> } else if (parse(private_)) {
> setMask(true, true); // the above [re]sets private_
> } else {
> clearPrivate(); // ignore this and previous CC:private
> }
>
>
> Same concerns for no-cache, of course.

They are formed that way to ft in with the surrounding code. In
particular the parsing stye of the other 5 options with parameters.

The HTTP semantic meaning of these three cases is very different between
themselves but the same syntax<->meaning pattern for both private and
no-cache.

CC:private ... means *whole* response is private, MUST NOT cache.

CC:private="foo" ... means the specific listed *headers* are private,
response itself MAY be cached.

CC:private=^%%$ ... (or other breakage) means invalid options and SHOULD
be ignored.

CC:private, private="foo" ... is treated in HTTP as an explicit hack by
the sender - such that caches coping with private="foo" will obey
CC:private="foo" and older 1.0 caches will do "CC:private" handling.

This patch is supposed to pull the parameters into a string for later
re-use and treat it as CC:private (MUST NOT cache). If the admin uses
ignore-private, then we require a revalidate to ensure correct response
to later clients, which should hopefully avoid cache poisoning in a lot
of cases.

>
>> + void Private(String &v) {setMask(CC_PRIVATE,true); private_.append(v);} // uses append for multi-line headers
> Would not append() merge two header names together without delimiters if
> the directive is split between two header fields? For example, would not
> the private_ value become "foobar" in the example below?
>
> Foo: 1
> Cache-Control: private="foo", private="bar"
> Bar: 2

Fixed.

>
> For the future code to work, the values from multiple directives should
> be concatenated using ",". However, it may be better to store them in a
> list or map of some kind instead (since that future code will have to
> check individual header names for a match anyway). Thus, the best
> structure for this may be rather different. This doubles my concern that
> we are adding code that should be essentially unused (IMO), that adds
> overhead, but that may have to be fixed/rewritten further when it is used.

Yes. SBufList was the idea in future. But we are not there yet. At
present we are forced to use String by the current HttpHeader mechanisms.

> Also, I think the code will mishandle this case:
>
> Foo: 1
> Cache-Control: private
> Cache-Control: private="bar"
> Bar: 2
>
> because it will treat the above as the following less restrictive case:
>
> Foo: 1
> Cache-Control: private="bar"
> Bar: 2
>
> instead of treating it as
>
> Foo: 1
> Cache-Control: private
> Bar: 2
>
> Does HTTP say that [later] less restrictive CC directives overwrite
> [earlier] more restrictive ones?

Yes. These two directives directly contradict each other. See above for
the semantics. The resulting behaviour is undefined in the spec, but
given that Squid treats both forms as non-cacheable I don't think it is
a problem.

>
>> + // CC:no-cache (not only if there are no parameters)
>> + const bool CcNoCacheNoParams = (rep->cache_control->hasNoCache() && rep->cache_control->noCache().undefined());
> The comment says "not only if there are no parameters" but the boolean
> condition says "only if there are no parameters".

Dropped the 'not'.

>> + // CC:private (if stored)
>> + const bool CcPrivate = rep->cache_control->hasPrivate();
> Does "if stored" apply to all other CC directives in this context as well?

Yes this is a HIT pathway. Dropped.

> Please start local variable names such as CcPrivate with a small letter.

Done.

>
>> // NP: request CC:no-cache only means cache READ is forbidden. STORE is permitted.
>> + if (rep->cache_control->hasNoCache() && rep->cache_control->noCache().defined()) {
>> + /* TODO: we are allowed to cache when no-cache= has parameters.
>> + * Provided we strip away any of the listed headers unless they are revalidated
>> + * successfully (ie, must revalidate AND these headers are prohibited on stale replies).
>> + * That is a bit tricky for squid right now so we avoid caching entirely.
>> + */
>> + debugs(22, 3, HERE << "NO because server reply Cache-Control:no-cache has parameters");
>> + return 0;
>> + }
> The new comment and behavior seems inconsistent, especially in light of
> the old "NP" comment that is left intact: It seems like we should store
> (or not) regardless of the header names presence. Why does presence of
> header names make a difference in this case?

The above matches the RFC no-cache definition...

  Response CC:no-cache="foo" requires header manipulation Squid cannot
do yet. ==> Do not cache.

The if() and TODO is about that missing functionality

  Response CC:no-cache requires validation, Squid does this. ==> Cacheable.

The NP left in the code later on is about that caching.

And yes the whole option is very counter-intuitive.

>
>> // NP: given the must-revalidate exception we should also be able to exempt no-cache.
>> // HTTPbis WG verdict on this is that it is omitted from the spec due to being 'unexpected' by
>> // some. The caching+revalidate is not exactly unsafe though with Squids interpretation of no-cache
>> - // as equivalent to must-revalidate in the reply.
>> - } else if (rep->cache_control->noCache() && !REFRESH_OVERRIDE(ignore_must_revalidate)) {
>> + // (without parameters) as equivalent to must-revalidate in the reply.
>> + } else if (rep->cache_control->hasNoCache() && !rep->cache_control->noCache().defined() && !REFRESH_OVERRIDE(ignore_must_revalidate)) {
>
> Same concern here. Perhaps I am missing the meaning of including those
> header names, but I think if we cannot handle them properly, we should
> ignore their presence instead of ignoring the presence of the entire
> CC:no-cache directive!

You are missing the meaning. Their presence alters the entire no-cache
from being scoped to the whole response to be scoped to those headers
alone. Handling them properly involves a lot of tricky if-exists
post-processing on the old reply and revalidation response - whih at the
time Squid has both those sets of details one is locked away behind a
const barrier I could not unwind easily. It is far easier to simply
treat no-cache="foo" as a no-store option and make this a "Precondition
failed" as Co-Advisor says instead of "HTTP violation".

RFC specification has an "unless" clause on both these response
directives. This whole patch is about making that "unless" case become
non-cacheable again. As Dmitry pointed out it was broken by the earlier
no-cache storge update.
  For the no-cache option we are reverting that behaviour change in the
specific case of parameters being present.
  For the private option we are storing the parameters in a String, but
otherwise not changing the current behaviour.

>
>> - debugs(22, 3, HERE << " Authenticated but server reply Cache-Control:s-maxage");
>> + debugs(22, 3, HERE << "Authenticated but server reply Cache-Control:s-maxage");
> Please remove this non-change.

Done. Applied separately.

Amos
Received on Tue Feb 19 2013 - 03:20:17 MST

This archive was generated by hypermail 2.2.0 : Tue Feb 19 2013 - 12:00:06 MST