Re: [RFC] If-Not-Digest and duplicate transfer suppression

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 20 Oct 2011 17:15:18 -0600

On 10/19/2011 01:57 PM, Henrik Nordström wrote:
> On Fri 2011-10-14 at 13:34 -0600, Alex Rousskov wrote:
>
>> Executive summary: The second version of the specs uses instance
>> digests, relies on RFC 3230 Digest and Have-Digest mechanisms, and uses
>> a new If-Not-Digest conditional request with a standard 304 response.
>
> Looks fine to me.

>> a) handling of Range requests (I think we can use similar 304
>> responses there);
>
> Is this for cache validations where the requesting client requested
> only a range?

Right. A child Squid has a stale instance cached. The client requests a
range of that cached instance. The child Squid forwards a Range request
to the parent, adding an If-Not-Digest header. The parent Squid may
respond with a 304 if the instance digest has not really changed (either
because the parent has a fresh instance cached or because it got a fresh
instance from the server and computed the same digest).
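
To illustrate the parent-side decision described above, here is a minimal
sketch. It assumes that If-Not-Digest carries a value in the same
"SHA=<base64>" form as the RFC 3230 Digest header (the exact syntax is not
settled here), and the function names are hypothetical, not Squid code. The
digest covers the whole instance, not the requested range.

```python
import base64
import hashlib


def instance_digest(body: bytes) -> str:
    """RFC 3230-style instance digest: 'SHA=' plus base64 of the SHA-1 hash."""
    return "SHA=" + base64.b64encode(hashlib.sha1(body).digest()).decode("ascii")


def parent_status(request_headers: dict, fresh_body: bytes) -> int:
    """Pick the status code the parent would send (hypothetical helper).

    Assumption: If-Not-Digest holds the digest of the child's stale instance.
    If it matches the digest of the parent's fresh instance, the body has not
    really changed and a 304 is enough, even for a Range request.
    """
    offered = request_headers.get("If-Not-Digest")
    if offered is not None and offered.strip() == instance_digest(fresh_body):
        return 304
    # Digest differs (or was not sent): answer the request normally.
    return 206 if "Range" in request_headers else 200
```

With a matching digest, a Range request gets a 304 and the child serves the
range from its own cached copy; otherwise the parent answers as usual.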

>> b) optimal place to compute digests (it is the parent for the specific
>> use case I am addressing, but other use cases may place all the burden
>> on the child or share the burden);
>
> I'd say whoever stores objects received without a digest. This may be
> both parent & child on cache misses.

There are many possible variables here. For example, an adaptation
service may compute a digest "for free" while performing adaptations or
a resource-limited child Squid may want to rely on the parent Squid
computing digests.

>> c) use of trailers to send Digest headers (with just-computed digests);
>
> Optimization to avoid the double calculation mentioned above.

Yep.
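
As a rough illustration of item (c), the sketch below computes the digest
while the body is being forwarded and sends it in a chunked-encoding trailer,
so the receiver does not have to hash the body again. The function name is
hypothetical, and the sender is assumed to have announced "Trailer: Digest"
in the message headers.

```python
import base64
import hashlib


def chunked_with_digest_trailer(chunks):
    """Yield an HTTP/1.1 chunked body followed by a Digest trailer."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body prematurely
        hasher.update(chunk)
        yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    digest = base64.b64encode(hasher.digest())
    # Last chunk, then the trailer header, then the terminating empty line.
    yield b"0\r\nDigest: SHA=" + digest + b"\r\n\r\n"
```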

>> d) inclusion of modified headers in 304 responses (to update cached
>> entity)
>
> How do we tell which headers have been modified compared to the child
> cache?

Do we have to guess? Can the parent cache simply include _all_ response
headers, allowing the child cache to update what has changed?
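
A minimal sketch of that idea on the child side, assuming the parent sends
all of its response headers in the 304. The helper name is hypothetical and
header names are compared case-insensitively. Whether headers absent from
the 304 should be dropped from the cached entry is left open; this sketch
keeps them.

```python
def apply_304_headers(cached: dict, from_304: dict) -> dict:
    """Return the cached headers updated with everything the 304 carried."""
    updated = {name.lower(): value for name, value in cached.items()}
    for name, value in from_304.items():
        updated[name.lower()] = value  # new or changed headers win
    return updated
```

The child does not need to know which headers changed: unchanged headers are
simply overwritten with the same value.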

>> e) how the optimization is enabled/configured (e.g., response size or
>> digest computation time limits)
>
> Digest computation can be done while storing objects, and is then not a
> major performance issue. But it requires store changes to be able to
> attach the digest (or any other trailer header) to the response after
> storing the body.

I do not know how computing, say, a SHA1 digest would affect Squid
storing [large] responses. I would not be surprised if the overheads
were significant, especially as we try to remove extra memory copies
of body chunks.
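
For reference, computing the digest "while storing" could look roughly like
the sketch below, where each body chunk is fed to the hash as it is written.
The names are hypothetical and the store is modeled as a plain list. The
sketch says nothing about the actual cost inside Squid, which would have to
be measured.

```python
import base64
import hashlib


def store_with_digest(chunks, store: list) -> str:
    """Append each chunk to the store and hash it in the same pass."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        hasher.update(chunk)  # one extra read of each chunk, no extra copy
        store.append(chunk)
    # The digest is only known here, after the whole body has been stored.
    return "SHA=" + base64.b64encode(hasher.digest()).decode("ascii")
```

Because the digest becomes available only after the last chunk, the store
has to allow attaching it to an entry whose body is already written.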

>> f) dealing with a combination of If-Not-Digest and other conditional
>> headers or cache control directives as well as ETags (e.g., the cache
>> admin may want to add If-Not-Digest to client's "reload" requests).
>
> Don't really see much of a conflict there. Any specific cases you think
> about where this may conflict?

Not at this time, but I remember there were a few places in RFC 2616
where the conflicts between standard conditional headers were studied,
and they are painful. I doubt we can avoid all that pain with the
introduction of a new conditional header.

>> Question: Can we accept a quality implementation of the above
>> optimization into Squid?
>
> Yes.

Thank you,

Alex.
Received on Thu Oct 20 2011 - 23:15:48 MDT
