Re: next version of content-encoding / gzip design doc from Henrik Nordstrom on 2004-03-03 (squid-dev)

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Wed, 3 Mar 2004 10:11:25 +0100 (CET)

On Wed, 3 Mar 2004, Jon Kay wrote:

> Because current browser implementations treat Content-Encoding much as
> though it was Transfer-Encoding, we will implement Content-Encoding and
> Accept-Encoding as though they were actually the Transfer-Encoding and
> TE described in the HTTP specifications.

This part I do not understand.

Content-Encoding and Transfer-Encoding are fundamentally different in
their operation, far beyond the hop-by-hop vs end-to-end difference. You
cannot interchange one for the other.

It is not safe to assume a client accepts gzip TE just because it
accepts gzip content-encoding. For one thing the message format is
completely different.
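
To illustrate the point, here is a minimal sketch (not Squid code; the
function names are made up for this example) that treats the two request
headers separately: Accept-Encoding only says something about
Content-Encoding, and TE only about Transfer-Encoding.

```python
def parse_codings(header_value):
    """Parse a comma-separated coding list, dropping entries with q=0."""
    codings = set()
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        name = name.strip().lower()
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        if q > 0:
            codings.add(name)
    return codings


def allowed_gzip(request_headers):
    """Report, per header, which kind of gzip coding the client announced."""
    headers = {k.lower(): v for k, v in request_headers.items()}
    return {
        # end-to-end: may be answered with "Content-Encoding: gzip"
        "content_encoding": "gzip" in parse_codings(headers.get("accept-encoding", "")),
        # hop-by-hop: may be answered with "Transfer-Encoding: gzip, chunked"
        "transfer_encoding": "gzip" in parse_codings(headers.get("te", "")),
    }
```

A client sending only "Accept-Encoding: gzip" gets content_encoding set
and transfer_encoding unset, which is the case where sending a gzip
transfer-coding would not be safe.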

> Etags of replies encoded by Squid will be modified to turn them into
> weak tags if they are not already so.

Why do you oppose creating new unique ETags?
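
One possible way to derive such a tag, as a sketch only: append the
coding name to the opaque part of the original ETag and keep its
weak/strong status. The function name and the "-coding" suffix are
assumptions for this example, and the result is only unique as long as
the origin server does not itself issue tags ending in that suffix.

```python
def etag_for_encoding(etag, coding):
    """Derive a per-encoding ETag from the origin's ETag (hypothetical scheme)."""
    weak = etag.startswith("W/")
    opaque = etag[2:] if weak else etag
    if not (len(opaque) >= 2 and opaque.startswith('"') and opaque.endswith('"')):
        raise ValueError("malformed ETag: %r" % etag)
    # Insert the coding name inside the quotes so the result is still a valid ETag.
    derived = '"%s-%s"' % (opaque[1:-1], coding)
    return ("W/" + derived) if weak else derived
```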

> There will be a configuration option to turn off content-encoding.

Granted, and this will default to off in the standard distribution, like
any other option which violates the semantically transparent HTTP proxy
requirements.

> Content-Encoding Implementation

No comments there.

> Objects will be stored both in unencoded and encoded formats. An object
> will stay in the format in which Squid receives it until requested by a
> client requesting a different Content-Encoding which Squid supports
> (this could be immediate). Once this happens, the object will be
> streamed coded into a different StoreEntry and on to the client.

Ok.

> A new store_dup module will be created to manage dup store_entries and
> make sure duplicate entries are invalidated when a new version of an
> object is read. It consists of a circular list of StoreEntry pointers
> named "dupnext" and "dupprev". When a new duplicate encoding (or
> decoding) of an object is created, it's added to the list. When any
> StoreEntry is invalidated or updated, all dups are invalidated.

Looks a little too complex to me.

Wouldn't something simpler like the following work:

Modify the store key to account for content encoding.

Add an internal meta object listing the known content encodings of a
given object. When a new encoding is added, rewrite this object to add
the new encoding name.

On cache hits, iterate over the known acceptable encodings until a match
is found in the cache.

In recoded objects include a meta header indicating the identity of the
original object and disregard the recoded object on a cache hit if it no
longer matches the original.
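
Roughly, the four steps above could look like this. It is a toy
in-memory sketch under my own assumptions, not Squid's store API: the
class, the (url, coding) tuple key, and the "identity" / "derived_from"
fields are all invented for illustration.

```python
class EncodingStore:
    """Toy cache keyed by (url, coding), with a per-URL list of known codings."""

    def __init__(self):
        self.objects = {}    # (url, coding) -> entry dict
        self.encodings = {}  # url -> list of codings (the "meta object")

    def put(self, url, coding, body, identity, derived_from=None):
        # derived_from is (original_coding, original_identity) for recoded
        # objects, and None for objects stored as received.
        self.objects[(url, coding)] = {
            "body": body,
            "identity": identity,
            "derived_from": derived_from,
        }
        known = self.encodings.setdefault(url, [])
        if coding not in known:
            known.append(coding)  # "rewrite" the meta object

    def lookup(self, url, acceptable):
        """Return (coding, body) for the first acceptable coding that hits."""
        known = self.encodings.get(url, [])
        for coding in acceptable:
            if coding not in known:
                continue
            entry = self.objects.get((url, coding))
            if entry is None:
                continue
            if entry["derived_from"] is not None:
                src_coding, src_identity = entry["derived_from"]
                original = self.objects.get((url, src_coding))
                if original is None or original["identity"] != src_identity:
                    continue  # recoded copy no longer matches the original
            return coding, entry["body"]
        return None


store = EncodingStore()
url = "http://example.com/"
store.put(url, "identity", b"hello", identity="v1")
store.put(url, "gzip", b"<gzip of v1>", identity="v1-gzip",
          derived_from=("identity", "v1"))
hit_before = store.lookup(url, ["gzip", "identity"])
store.put(url, "identity", b"hello again", identity="v2")  # original updated
hit_after = store.lookup(url, ["gzip", "identity"])
```

Note there is no linked list of duplicates to maintain: once the
original is replaced, the stale gzip copy is simply skipped on the next
hit.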

From what I can tell the above would also work for adding server-driven
Content-Encoding negotiation to the proxy to complement the use of Vary
(which most mod_gzip servers do not support btw).

Regards
Henrik
Received on Wed Mar 03 2004 - 02:11:28 MST
