Re: [PATCH] reply_from_cache and reply_to_cache

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 15 Oct 2013 18:03:06 +1300

On 15/10/2013 5:42 p.m., Amos Jeffries wrote:
> On 15/10/2013 5:09 p.m., Alex Rousskov wrote:
>> On 10/14/2013 07:06 PM, Alex Rousskov wrote:
>>> On 10/11/2013 08:55 PM, Amos Jeffries wrote:
>>>> On 12/10/2013 11:38 a.m., Alex Rousskov wrote:
>>>>
>>>>> The attached patch adds reply_from_cache and reply_to_cache
>>>>> squid.conf directives to control caching of responses using response
>>>>> info.
>>>>>
>>>>> The reply_from_cache directive can prevent serving of HITs while
>>>>> reply_to_cache can prevent storage of MISSes. The two can be combined
>>>>> or used independently.
>>>>>
>>>>> As you know, the existing "cache" directive does both at the same
>>>>> time. However, the "cache" directive is checked before Squid has
>>>>> access to the response and, hence, could not use response-based ACLs
>>>>> such as http_status. Response-based ACLs may be essential when
>>>>> fine-tuning caching. Squid Bug 3937 (StoreID can lead to 302 infinite
>>>>> loop) is a good use case.
>>>
>>>> I have been considering a way to make the "cache" directive the top
>>>> level of a set of caching configuration options, similar to how
>>>> auth_param is the top level of most auth scheme options.
>>>>
>>>> Would you be able to make the "cache" directive accept two alternative
>>>> parameters alongside allow/deny as the first field and then process
>>>> the rest of the line according to that field?
>>>> I would suggest "store-miss" and "send-hit" for those parameters.
>>
>>
>>> We could do that, but I doubt that the advantages of that approach
>>> outweigh its drawbacks:
>>>
>>> Since each option is applied independently from others, it may only
>>> confuse folks that are used to our usual "first matching ACL rule wins"
>>> approach:
>>>
>>> cache store-miss allow foo
>>> cache send-hit allow foo
>>> cache early deny foo
>>>
>>> The last rule actually wins here, but the above configuration seems to
>>> imply the opposite to folks used to looking at Squid ACLs. I know we
>>> cannot use the "first matching rule wins" approach for some existing
>>> directives, but are you sure this approach works better for the two new
>>> directives we are discussing here?
>>
>>
>> I forgot to mention that we can also try to do here what we did for
>> ssl_bump. That is, enlarging the set of actions from the default
>> allow/deny to allow/deny/ignore-miss/ignore-hit/store-miss/send-hit:
>>
>> cache deny foo # same as cache deny foo
>> cache send-hit baz # same as reply_from_cache allow baz
>> cache ignore-miss bar # same as reply_to_cache deny bar
>> ...
>>
>> I think that might be better than "store-miss allow" and friends because
>> this scheme follows the traditional "first matching rule wins" approach,
>> but I am not sure it is better than reply_to/from_cache. The problem
>> here is that, unlike ssl_bump, these directive(s) have to be checked
>> multiple times and some of the options do not make sense at some of the
>> decision points.
>
> It does not solve the issue of using reply details in the ACLs though.
> That is the most important goal here.
>
>
>
>> Current decision points are:
>>
>> * before hit/miss is detected (the current cache directive)
>> * when a hit is detected (proposed reply_from_cache)
>> * when a miss is being received (proposed reply_to_cache)
>
> How do you see 1 and 2 on that list being different?
>
> The old cache directive makes no decision about whether the stores are
> involved or not. It just determines whether to convert a HIT into a
> MISS, and (wrongly) causes invalidation of any stored content.
>
> We need to change that decision point to being a decision whether
> store is not-involved or is-involved.
> * If the store is not-involved, no HIT is possible, but also
> invalidation and revalidation do not take place on already stored
> content.
> * If the store is involved, it may HIT, revalidate or invalidate
> stored content.
>
> NP: this decision may be made irrelevant by HTTP protocol settings
> from the client. Forced is-involved by CC:only-if-cached. Forced
> not-involved by CC:no-cache.
>
>
> The 3rd decision point, which is the only completely new one here, is
> to make a local store behave as if CC:no-store was received from the
> server.
> * if the store writing is denied CC:no-store makes no statement about
> existing content (invalidation does not have to happen).
> * if the store write is allowed, then existing content gets
> invalidated/revalidated as per HTTP normal requirements.
>
> NP: we have been talking in terms of HIT/MISS so far, but for the MISS
> checks we also need to consider REFRESH/revalidate backend requests.
> ** In the event that Squid is performing a REFRESH to the server, do
> we want the store-write denial case to prevent updating of the cached
> content? Or to treat that somehow differently?
>

Looking at the latest HTTPbis documents this second decision point seems
to be equivalent to client request CC:no-store operations.

>
> Overall I am inclined to scope these ACL access checks in terms of
> read/write access to the store rather than HIT/MISS on stored
> contents. Doing so makes the criteria much simpler:
> * First decision point is simply whether to involve store lookups
> yes/no. This can only be made on request details (current "cache"
> decision point with new semantics)
> * Second decision point being whether to write any new information
> found (regardless of MISS/REFRESH states) back to cache. This can be
> made after receiving reply.
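
To make the two decision points concrete, here is a rough Python sketch of
the read/write scoping. It is only an illustration, not Squid code: the
names StorePolicy, may_lookup and may_write are hypothetical, rules are
modelled as (allow, predicate) pairs using the usual "first matching rule
wins" evaluation, and "allow when nothing matches" is an assumption made
for brevity rather than a statement about how Squid picks its default.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical model: a rule is (allow?, predicate over message details).
Rule = Tuple[bool, Callable[[dict], bool]]


def first_match(rules: List[Rule], info: dict, default: bool = True) -> bool:
    """Return the action of the first rule whose predicate matches."""
    for allow, matches in rules:
        if matches(info):
            return allow
    return default  # assumption: allow when no rule matches


@dataclass
class StorePolicy:
    lookup_rules: List[Rule]  # first decision point: request details only
    write_rules: List[Rule]   # second decision point: reply is available

    def may_lookup(self, request: dict) -> bool:
        """Decide whether the store is involved at all for this request."""
        return first_match(self.lookup_rules, request)

    def may_write(self, request: dict, reply: dict) -> bool:
        """Decide whether new information may be written back to the store,
        regardless of whether it arrived via a MISS or a REFRESH."""
        return first_match(self.write_rules, {**request, **reply})


# Example: skip store lookups for one (made-up) host, and never store 302s.
policy = StorePolicy(
    lookup_rules=[(False, lambda r: r["url"].startswith("http://nocache.example/"))],
    write_rules=[(False, lambda r: r["status"] == 302)],
)
```

The point of the split is that only may_write() sees reply details such as
the status code, which is what the Bug 3937 style of configuration needs.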

Amos
Received on Tue Oct 15 2013 - 05:03:16 MDT