Re: [RFC] cache architecture

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 25 Jan 2012 21:47:19 -0700

On 01/25/2012 06:53 PM, Amos Jeffries wrote:

>> We created workers as an internal performance optimization that has
>> nothing to do with HTTP. It is our responsibility to make sure that
>> optimization stays internal. If caches are not synchronized, the
>> optimization may negatively affect external HTTP agents.
>
> I see you arguing that IPC messages about purges is a requirement we
> imposed on ourselves. I agree, and focus on IPC so that admin who
> disable ICP/HTCP/PURGE are not causing problems.

I am not talking about any specific technology to enforce cache
synchronization, just the assertion that either the internal caches are
synchronized or they are violating HTTP.

> I see no evidence that sharing an IP is any more (or less) of a
> violation than each worker having a unique IP and same FQDN. We haven't
> gone around claiming that sibling relationships or popular CDN
> hierarchies are all violating HTTP, though they hit sync problems too.

If they have sync problems, they may violate HTTP. I am just trying my
best to stay focused on the [local] cache architecture topic; I do not
want to get into a discussion about distributed hierarchies.

>> Again, if HTTP has no text defining when two cooperating caches must be
>> in sync, then it would be difficult to decide which interpretation of
>> the HTTP spirit is "correct".
>
> The new wording for HTTPbis part 6 draft -18 section 2.5 about
> PUT/POST/DELETE/unknown explicitly clarifies the spirit with:
> "
> Note that this does not guarantee that all appropriate responses are
> invalidated. For example, the request that caused the change at the
> origin server might not have gone through the cache where a response
> is stored.
> "

IMO, this just warns the implementer that network complications
(policy routing, load balancing, cache hierarchies, etc.) may cause HTTP
violations. It does not permit those violations any more than a warning
of a possible DDoS attack makes that attack benign. It just says "this
MUST/SHOULD cannot guarantee anything because it applies to the caches
that received the request and not to the other caches that did not; your
next request may go through those other caches".
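
To make that scenario concrete, here is a minimal sketch of two
independent caches in front of one origin. All class and variable names
are hypothetical, this is not Squid code, and freshness/expiry rules are
ignored entirely. The unsafe request passes through cache A only, so A
invalidates its entry while B keeps serving the old response:

```python
class Origin:
    """Hypothetical origin server: one representation per URL."""
    def __init__(self):
        self.resources = {}

    def get(self, url):
        return self.resources.get(url)

    def put(self, url, body):
        self.resources[url] = body


class Cache:
    """Hypothetical standalone cache that knows nothing about other caches."""
    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, url):
        # Serve from the store; fetch from the origin on a miss.
        if url not in self.store:
            self.store[url] = self.origin.get(url)
        return self.store[url]

    def put(self, url, body):
        # Unsafe request: forward it and invalidate our own stored response.
        self.origin.put(url, body)
        self.store.pop(url, None)


origin = Origin()
origin.put("/x", "v1")
a, b = Cache(origin), Cache(origin)
a.get("/x")
b.get("/x")            # both caches now store "v1"
a.put("/x", "v2")      # only cache A sees the unsafe request
fresh = a.get("/x")    # A invalidated its entry, so it refetches
stale = b.get("/x")    # B never saw the PUT and still has the old entry
```

Cache A did what the invalidation requirement asks of it; the stale
answer comes from cache B, which the requirement never reached.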

> section 2.2 on what responses can be served uses the wording
> "
> Also, note that unsafe requests might invalidate already stored
> responses; see Section 2.5.
> "
> *might* invalidate.

I think this just means that while an unsafe request MUST invalidate the
corresponding stored response, there may be no such corresponding
responses stored.

> Two giant loopholes to walk through. Invalidation MUST is a best-effort
> benefit for a hierarchy, not a guarantee of removal.

Clearly, we interpret the same specs differently. You see loopholes
negating explicit MUSTs. I see a non-normative explanation of why
somebody cannot rely on the request path staying the same and a
non-normative reference to a MUST.

> With Squid SMP mode design being an entire hierarchy inside one box we
> have to adjust our viewpoint to that of hierarchy compliance. The
> workers are as compliant as ever individually. We have raised awareness
> of the hierarchy level interaction problems and need to fix it above and
> beyond the specs. They in word and spirit focus on requirements of
> individual cache instances, not distributed or hierarchy.
>
> What we do by fixing the problem is improve the
> friendliness/predictability and usefulness of Squid responses. Not the
> compliance level.

I understand your point of view, but I have not seen anything in HTTP
that supports a definition of compliance for a hierarchy of caches (with
a single entry point) that differs from compliance of a single cache. I
have not read the entire HTTPbis yet, so it is possible that I will find it.

However, since we seem to agree that worker caches must be in sync, we
can build the implementation on that consensus!
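
As a rough illustration of what "in sync" could mean for invalidation,
and without committing to any particular mechanism, here is a minimal
sketch. The names are hypothetical, and the shared Python list merely
stands in for whatever channel the workers would really use; it is not
a proposal for the actual design:

```python
class Worker:
    """Hypothetical worker with its own private cache store."""
    def __init__(self, name, peers):
        self.name = name
        self.store = {}
        self.peers = peers      # shared registry of all workers (assumption)
        peers.append(self)

    def store_response(self, url, body):
        self.store[url] = body

    def lookup(self, url):
        return self.store.get(url)

    def handle_unsafe(self, url):
        # The worker that receives the unsafe request tells every worker,
        # itself included, to drop the stored response for that URL.
        for worker in self.peers:
            worker.store.pop(url, None)


workers = []
w1 = Worker("w1", workers)
w2 = Worker("w2", workers)
w1.store_response("/x", "v1")
w2.store_response("/x", "v1")

w1.handle_unsafe("/x")          # e.g. a PUT for /x arrives at worker 1
after_w1 = w1.lookup("/x")      # entry gone in the receiving worker
after_w2 = w2.lookup("/x")      # entry gone in the other worker as well
```

With this property, it does not matter which worker the next request
lands on: none of them can answer from the invalidated entry.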

It may be a good idea to polish HTTPbis if it indeed allows both your
point of view and mine to coexist even though they contradict each other.

Thank you,

Alex.
Received on Thu Jan 26 2012 - 04:47:33 MST
