Re: [PATCH] HTTP/1.1 response caching upgrade

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 17 Oct 2012 09:42:33 -0600

On 10/16/2012 03:24 AM, Amos Jeffries wrote:
> Updated patch with requested changes.

Looks OK to me although I have not verified each individual CC handling
line.

> + if ((mayStore |= rep->cache_control->Public())) {
> + } else if ((mayStore |= (rep->cache_control->mustRevalidate() && !REFRESH_OVERRIDE(ignore_must_revalidate)) )) {
> + } else if ((mayStore |= (rep->cache_control->noCache() && !REFRESH_OVERRIDE(ignore_no_cache)) )) {
> + } else if ((mayStore |= rep->cache_control->sMaxAge())) {
> + }

If mayStore starts as false and is modified only once, I would replace
"|=" cleverness with a simple assignment.

> + debugs(22, 3, HERE << "YES because Authenticated and server reply Cache-Control:public");
> + debugs(22, 3, HERE << "YES because Authenticated and server reply Cache-Control:public");
> + debugs(22, 3, HERE << "YES because Authenticated and server reply Cache-Control:no-cache");
> + debugs(22, 3, HERE << "YES because Authenticated and server reply Cache-Control:s-maxage");

These should not be "YES" because we may end up returning 0 or -1 later
anyway. We are already using "MAYBE" for -1 case. I suggest replacing
"YES because Authenticated and ..." text with "Authenticated but ..." text.

These polishing touches do not require another round of review as far as
I am concerned.

Do you want us to run this by Co-Advisor before the commit?

Thank you,

Alex.

> On 16/10/2012 12:31 p.m., Amos Jeffries wrote:
>> On 16.10.2012 09:56, Alex Rousskov wrote:
>>> On 10/14/2012 01:17 AM, Amos Jeffries wrote:
>>>> On prompting from bug 3670 which outlines how Auth transactions are now
>>>> wrongly non-cacheable. I have performed a small audit of the
>>>> HttpStateData::cacheableReply() method and this patch contains the
>>>> result.
>>>>
>>>> trunk rev 11361 converted Cache-Control header from using a single mask
>>>> bitmap (shared by request and response) to separate CC header
>>>> objects in
>>>> the request response. This conversion contained several regressions
>>>> like
>>>> the one bug 3670 reports.
>>>>
>>>> This patch (on trunk rev.12394):
>>>> * documents HttpStateData::cacheableReply() clarifying the overall
>>>> method action and what each individual check it doing.
>>>> * resolves several visible regressions, including bug 3670.
>>>> * extends the caching to handle the "no-cache" controls as per
>>>> HTTP/1.1
>>>> (MAY store, but MUST revalidate before use).
>>>> * extends the caching for several lesser known cases of "MAY store"
>>>> exemptions handling authenticated transactions.
>>>>
>>>> One side effect of now caching transactions utilizing "no-cache" is
>>>> that
>>>> hacks around Pragma:no-cache are reduced to only having any effect when
>>>> Cache-Control is absent. Reducing their performance cost.
>>>>
>>>> This should give Squid a major boost to both its caching compliance and
>>>> HIT ratios.
>>>>
>>>> The patch is build tested, but not yet run-time tested or HTTP/1.1
>>>> compliance tested.
>>>
>>>
>>> Hi Amos,
>>>
>>> Thank you for fixing this. I wish you did not re-indented the old
>>> checks in the patch though -- it complicates reviewing logic changes.
>>>
>>>
>>>> HttpReply const *rep = finalReply();
>>>> +
>>>> + // non-existent replies are not cacheable.
>>>> + if (!rep)
>>>> + return 0;
>>>
>>> ServerStateData::finalReply() is guaranteed to return a non-nil pointer.
>>> We have other code relying on that guarantee. There is no need to double
>>> check IMO.
>>>
>>
>> Removed.
>>
>>>
>>>> + if (!ignoreCacheControl) {
>>> ...
>>>> + // RFC 2068, sec 14.9.4 - MUST NOT cache any response with
>>>> Authentication UNLESS certain CC controls are present
>>>> + // allow HTTP violations to IGNORE those controls (ie
>>>> re-block caching Auth)
>>>> + if (request && (request->flags.auth ||
>>>> request->flags.authSent) && !REFRESH_OVERRIDE(ignore_auth)) {
>>>> + if (!rep->cache_control)
>>>> + return 0;
>>>
>>> Guarding authentication check with ignoreCacheControl looks odd to me.
>>> Should not the check be moved outside the guard, while possibly ignoring
>>> cache_control inside the check? That is:
>>>
>>> // RFC ... MUST NOT cache any response with Authentication
>>> if (response with authentication) {
>>> // UNLESS certain CC controls are present
>>> if (!rep->cache_control || ignoreCacheControl)
>>> return 0;
>>> ... see if cache controls allow caching ...
>>> }
>>>
>>> if (!ignoreCacheControl) {
>>> ...
>>>
>>> The authentication blob can be moved below the ignoreCacheControl blob
>>> if we think authentication is, on average, more rare than conditions
>>> tested by other CC checks. Since both blobs never return 1, it is OK to
>>> order them as we please.
>>
>> You are right. Fixed.
>>
>>>
>>>
>>>> + String s = rep->header.getList(HDR_PRAGMA);
>>>> + const int no_cache = strListIsMember(&s,
>>>> "no-cache", ',');
>>>> + s.clean();
>>>> +
>>>> + if (no_cache)
>>>> + EBIT_SET(entry->flags, ENTRY_REVALIDATE);
>>>
>>> Consider using HttpHeader::hasListMember() instead of getList() plus
>>> strListIsMember(). If you have to use strListIsMember(), please note
>>> that String cleans up itself during destruction so the above can be
>>> simplified:
>>>
>>> const String s = rep->header.getList(HDR_PRAGMA);
>>> if (strListIsMember(&s, "no-cache", ','))
>>> EBIT_SET(entry->flags, ENTRY_REVALIDATE);
>>>
>>
>> Done.
>>
>>>
>>> Finally, if you are rewriting so much code anyway, consider returning a
>>> "const char *" reason string and make the caller log it with debugs(). A
>>> nil result would indicate a cachable response. The source code comments
>>> are great, but it is often difficult to understand the decision when
>>> looking at a debugging log. Polishing this may be viewed outside this
>>> patch scope though (and I would probably not even suggest it if you kept
>>> the old indentation for the review :-).
>>
>> We can't return a char* because this function returns 0/1/-1 cases.
>> I've threaded debugs() before each return statement though for the
>> same effect.
>>
>>
>> New patch coming after new build tests.
>>
>> Amos
>
Received on Wed Oct 17 2012 - 15:42:41 MDT

This archive was generated by hypermail 2.2.0 : Thu Oct 18 2012 - 12:00:06 MDT