Re: gzip support

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sun, 15 Dec 2002 22:49:05 +0100

On Sunday 15 December 2002 21.48, Stephen Sprunk wrote:

> Section 14.11 states:
> However, a non-transparent proxy MAY modify the content-coding
> if the new coding is known to be acceptable to the recipient,
> unless the "no-transform" cache-control directive is present in the
> message.
>
> I don't see any indication that modification of content-[en]coding
> must be off by default. One can presume that it MUST NOT be done
> by transparent proxies, but that's not explicitly stated. I can't
> think of any problems either case would cause.

It is explicitly stated.

For a hint see the definitions of "semantically transparent" and
"proxy".

For the explicit statement see first and last paragraphs of section
13.5.2 and section 13.5.1.

Short story:

* Content-Encoding is an end-to-end header (first paragraph 13.5.1) and
a proxy SHOULD NOT modify end-to-end headers (13.5.2).

* A transparent proxy MUST preserve the length of the message body
(last paragraph 13.5.2), which pretty much rules out any
modifications and certainly rules out gzip as a "transparent"
operation.

* A non-transparent proxy MAY modify Content-Encoding or Content-Type
in order to provide some added service to the user agent unless told
not to do so by the user agent (no-transform), but when doing so it
MUST include an appropriate Warning header telling the user that the
response is modified. (13.5.2)
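
As a rough illustration of the last point, here is a minimal Python
sketch of the decision a non-transparent proxy would have to make
before gzipping a response. The function name maybe_gzip, the agent
name "example-proxy" and the use of plain dicts with exact-case
header names are assumptions made for the example, not anything
Squid provides. Accept-Encoding q-values are ignored for brevity.

```python
import gzip


def _tokens(value):
    """Split a comma-separated header value into lowercase tokens."""
    return [t.split(";")[0].strip().lower() for t in value.split(",") if t.strip()]


def maybe_gzip(request_headers, response_headers, body, agent="example-proxy"):
    """Return (headers, body), gzip-coded only when the rules allow it.

    Hypothetical helper: headers are plain dicts with exact-case names.
    """
    # no-transform in either message forbids changing Content-Encoding.
    for headers in (request_headers, response_headers):
        if "no-transform" in _tokens(headers.get("Cache-Control", "")):
            return response_headers, body

    # The new coding must be known to be acceptable to the recipient.
    # Simplification: q-values (e.g. "gzip;q=0") are not evaluated.
    if "gzip" not in _tokens(request_headers.get("Accept-Encoding", "")):
        return response_headers, body

    # Leave responses alone that already carry a content-coding.
    if "Content-Encoding" in response_headers:
        return response_headers, body

    new_body = gzip.compress(body)
    new_headers = dict(response_headers)
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(new_body))
    # Warning 214 tells the recipient that a transformation was applied.
    new_headers["Warning"] = f'214 {agent} "Transformation applied"'
    return new_headers, new_body
```

Note that the sketch deliberately leaves ETag untouched, and that the
changed Content-Length is exactly what makes this a non-transparent
operation.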

The whole field of non-transparent proxies is quite vaguely specified
in the RFC, and there are numerous pitfalls which the RFC does not
touch on. One very obvious such pitfall is Range requests and the
relation of ETag to modified responses, but this is only one of many
cases. The only indication of a non-transparent proxy is the
Warning: 214, but nowhere is it stated what effect this has on caching
and merging of responses. Careful reading of the RFC makes it quite
clear that one should not attempt to merge responses with a Warning:
214, but nowhere is this stated explicitly.
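
A cache that wants to be careful here could refuse to merge whenever
either response carries a 214 warning. The following Python sketch
shows one way to express that guard. The names
has_transformation_warning and may_merge are made up for the
example, headers are assumed to be plain dicts, and the Warning
parsing is simplified (it splits on commas and so does not handle
commas inside the quoted warn-text).

```python
def has_transformation_warning(headers):
    """Return True if any Warning value starts with warn-code 214."""
    value = headers.get("Warning", "")
    for warning in value.split(","):
        parts = warning.strip().split(" ", 1)
        if parts[0] == "214":
            return True
    return False


def may_merge(cached_headers, new_headers):
    """Conservative guard: never merge if either side was transformed."""
    return not (
        has_transformation_warning(cached_headers)
        or has_transformation_warning(new_headers)
    )
```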

When implementing non-transparent proxies you are pretty much on your
own when it comes to unexpected side effects of your transformations.

I have to correct my earlier message here. A non-transparent proxy
applying content-encoding transformations SHOULD NOT modify ETag,
even though modifying it would make the HTTP protocol work more
reliably for the clients if they ever connect via other means not
using the exact same transformations.

Note: A server MUST use a different ETag for each different
Content-Encoding, or use only weak ETag values.
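
One way a server could satisfy this is to derive the strong ETag from
both the content and the coding, as in the Python sketch below. The
helper names and the "digest-coding" tag layout are assumptions for
illustration only; the RFC does not prescribe any ETag format.

```python
import hashlib


def strong_etag(identity_body, content_encoding="identity"):
    """Strong validator: differs for each Content-Encoding of a resource."""
    digest = hashlib.sha1(identity_body).hexdigest()[:16]
    return f'"{digest}-{content_encoding}"'


def weak_etag(identity_body):
    """Weak validator: may be shared by all codings of the same content."""
    digest = hashlib.sha1(identity_body).hexdigest()[:16]
    return f'W/"{digest}"'
```

The digest is taken over the unencoded body so that the tag does not
depend on compressor details such as timestamps in the gzip header.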

> I'm still a bit fuzzy on the RFC's terminology... Does the gzipped
> content qualify as an entity or message body?

In the case of Content-Encoding, each Content-Encoding represents an
entity of its own, with its own entity-headers and entity-body.
Content-Encoding is an entity-header.

In the case of Transfer-Encoding, gzip is one of the transfer
encodings that may be applied to entity-bodies while in transit.
Transfer-Encoding is just encoding for the purpose of transfer and
does not modify the entity.

Byte ranges apply to entities, not to transferred data.
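
The difference can be seen in a small Python sketch. The sample body
and the helper byte_range are made up for the example. With
Content-Encoding the gzip-coded bytes are the entity, so a range
selects compressed bytes. With Transfer-Encoding the range is
selected from the entity first, and the coding is then applied only
for the transfer.

```python
import gzip

body = b"Hello, world! " * 20       # hypothetical identity entity-body
encoded = gzip.compress(body)       # entity-body under Content-Encoding: gzip


def byte_range(entity_body, first, last):
    """Select bytes first..last (inclusive), as in "Range: bytes=first-last"."""
    return entity_body[first:last + 1]


# Content-Encoding: gzip. The range is cut out of the compressed
# entity and is not independently decodable text.
range_of_gzip_entity = byte_range(encoded, 0, 9)

# Transfer-Encoding: gzip. The range is cut out of the identity
# entity; the transfer coding is applied to that part afterwards and
# removed again by the recipient.
range_of_identity_entity = byte_range(body, 0, 9)
in_transit = gzip.compress(range_of_identity_entity)
received = gzip.decompress(in_transit)
```

This is also why a proxy that changes Content-Encoding on the fly
gets into trouble with Range requests: the byte offsets refer to a
different entity.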

> I think it would be extremely useful for forward proxies as well,
> but should probably be turned off by default since the compression
> will eat CPU power and the required Warning may surprise people.

And it will for sure make the cache a semantically non-transparent
cache, where the primary goal of Squid is (or at least has been) to
provide a semantically transparent cache.

Regards
Henrik
Received on Sun Dec 15 2002 - 14:48:38 MST