next version of content-encoding / gzip design doc

From: Jon Kay <jkay@dont-contact.us>
Date: Wed, 03 Mar 2004 00:31:13 -0600

Here's a new version of the design document that incorporates your
suggestions.
I hope this is better...

Jon

                    Gzip Content-Encoding in Squid Design

Version Choice

The goal will be to get these changes into Squid3 HEAD.

Content-Encoding Protocol

Because current browser implementations treat Content-Encoding much as
though it were Transfer-Encoding, we will implement Content-Encoding and
Accept-Encoding as though they were actually the Transfer-Encoding and
TE headers described in the HTTP specifications.

ETags of replies encoded by Squid will be modified to turn them into
weak tags if they are not already so.
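
As a rough illustration of the weakening rule, here is a minimal Python
sketch. The function name weaken_etag is hypothetical and not part of
Squid; it only assumes the HTTP convention that a weak entity tag is
the quoted tag prefixed with W/.

```python
def weaken_etag(etag: str) -> str:
    """Return the weak form of an entity tag; weak tags pass through."""
    etag = etag.strip()
    if etag.startswith("W/"):
        # Already weak, so leave it alone.
        return etag
    # A strong tag such as "abc" becomes W/"abc".
    return "W/" + etag
```

For example, weaken_etag('"abc"') gives W/"abc", and applying it a
second time changes nothing.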

There will be a configuration option to turn off content-encoding.

Content-Encoding Implementation

New HttpHdrContCode module that parses related HTTP headers and
arranges for encoding or decoding appropriately. It includes the
following functions:

   * codeParseRequest(): Called from client_side:parseHttpRequest()
     after clientStreamInit() call. Checks for and parses
     Accept-Encoding headers. Instantiates content_coding appropriately,
     and calls codeClientStreamInit().
   * codeClientStreamInit(): Adds a new node to clientStream with
     codeStreamRead(), codeStreamCallback(), and codeStreamStatus()
     functions.
   * codeStreamCallback(): set up encoding/decoding state depending on
     combination of Content-Encoding and Accept-Encoding fields seen.
   * codeStreamRead(): call HttpContentCoder transformation functions
     appropriately.
   * codeStreamStatus(): report status to stream.
   * codeDupNode(): Alloc new store_entry and insert new clientStream
     dup node (see below) to (v?)copy data to store_entry as well as
     reply.
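
The header check that codeParseRequest() performs could look roughly
like the Python sketch below. It is only an illustration of
Accept-Encoding q-value handling; the function name accepts_coding is
an assumption, and the real parsing would live in Squid's own header
code.

```python
def accepts_coding(accept_encoding: str, coding: str = "gzip") -> bool:
    """Return True if an Accept-Encoding value permits the given coding."""
    explicit = None  # q-value given for the coding itself, if any
    wildcard = None  # q-value given for "*", if any
    for item in accept_encoding.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        name = name.strip().lower()
        q = 1.0  # a coding listed without q is fully acceptable
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        if name == coding.lower():
            explicit = q
        elif name == "*":
            wildcard = q
    # An explicit entry for the coding wins over the wildcard.
    if explicit is not None:
        return explicit > 0
    if wildcard is not None:
        return wildcard > 0
    return False
```

A request carrying "gzip, deflate" would be eligible for gzip encoding,
while "gzip;q=0" or a missing header would not.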

New HttpContentCoder abstract type, with functions:

   * encodeStart()
   * encodeEnd()
   * encodeChunk()
   * decodeStart()
   * decodeEnd()
   * decodeChunk()

New per-coded-object ContentCoderState, to handle coding state. It'll be
referenced from the clientStream, and include fields:

   * HttpContentCoder *coder
   * off_t codedOffset
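
To make the shape of the coder interface and its per-object state more
concrete, here is a small Python sketch. It is not Squid code: the
snake_case method names, the IdentityCoder class, and the
encode_stream() helper are hypothetical, and treating codedOffset as
"bytes of coded output produced so far" is an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable


class HttpContentCoder(ABC):
    """Abstract coder; one method per function in the list above."""

    @abstractmethod
    def encode_start(self) -> bytes: ...

    @abstractmethod
    def encode_chunk(self, chunk: bytes) -> bytes: ...

    @abstractmethod
    def encode_end(self) -> bytes: ...

    @abstractmethod
    def decode_start(self) -> bytes: ...

    @abstractmethod
    def decode_chunk(self, chunk: bytes) -> bytes: ...

    @abstractmethod
    def decode_end(self) -> bytes: ...


class IdentityCoder(HttpContentCoder):
    """Trivial coder that passes data through unchanged."""

    def encode_start(self) -> bytes:
        return b""

    def encode_chunk(self, chunk: bytes) -> bytes:
        return chunk

    def encode_end(self) -> bytes:
        return b""

    def decode_start(self) -> bytes:
        return b""

    def decode_chunk(self, chunk: bytes) -> bytes:
        return chunk

    def decode_end(self) -> bytes:
        return b""


@dataclass
class ContentCoderState:
    coder: HttpContentCoder
    coded_offset: int = 0  # assumed meaning: coded bytes produced so far


def encode_stream(state: ContentCoderState, chunks: Iterable[bytes]) -> bytes:
    """Run start/chunk/end over the input and advance the coded offset."""
    pieces = [state.coder.encode_start()]
    pieces.extend(state.coder.encode_chunk(c) for c in chunks)
    pieces.append(state.coder.encode_end())
    out = b"".join(pieces)
    state.coded_offset += len(out)
    return out
```

A concrete coder such as gzip would subclass HttpContentCoder in the
same way IdentityCoder does here.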

Objects will be stored both in unencoded and encoded formats. An object
will stay in the format in which Squid receives it until a client
requests it in a different Content-Encoding which Squid supports (this
could be immediate). Once this happens, the object will be streamed
coded into a different StoreEntry and on to the client.

A new store_dup module will be created to manage dup store_entries and
make sure duplicate entries are invalidated when a new version of an
object is read. It consists of a circular list linked through StoreEntry
pointers named "dupnext" and "dupprev". When a new duplicate encoding
(or decoding) of an object is created, it's added to the list. When any
StoreEntry is invalidated or updated, all dups are invalidated.
Functions:

   * storeNewDup(): called from codeDupNode(), above, and creates new
     node with the dup'ed node attached via the dup list.
   * storeDupClientStreamInit(): called from codeDupNode(), and adds
     new clientStreamNode to copy off encoded data to new node as well
     as reply.
   * storeDupClientStreamRead(): does copying off.
   * storeDupClientStreamCallback(): null function
   * storeDupClientStreamStatus(): returns status
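
The dup list bookkeeping can be sketched as follows. This is a toy
Python model, not Squid's StoreEntry: the key, encoding, and valid
fields and the two function names are hypothetical, and it assumes that
invalidating any entry invalidates every entry on the ring, including
itself.

```python
class StoreEntry:
    """Toy stand-in for a store entry carrying the dup ring pointers."""

    def __init__(self, key: str, encoding: str):
        self.key = key
        self.encoding = encoding
        self.valid = True
        # A lone entry forms a ring of one.
        self.dupnext = self
        self.dupprev = self


def store_new_dup(original: StoreEntry, encoding: str) -> StoreEntry:
    """Create a dup of `original` and splice it in right after it."""
    dup = StoreEntry(original.key, encoding)
    dup.dupprev = original
    dup.dupnext = original.dupnext
    original.dupnext.dupprev = dup
    original.dupnext = dup
    return dup


def invalidate_with_dups(entry: StoreEntry) -> int:
    """Mark every entry on the ring invalid; return how many were hit."""
    count = 0
    node = entry
    while True:
        node.valid = False
        count += 1
        node = node.dupnext
        if node is entry:
            break
    return count
```

Because the list is circular, invalidation can start from whichever
encoding happens to be refreshed and still reach all the others.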

Other changes needed:

   * Add new content_coding field to HttpReply.
   * New httpHeaderGetContentEncoding(HttpReply *) function in
     HttpHeader.cc.
   * HttpReply:httpReplySetHeaders will weaken the etag if appropriate.
   * A new configuration flag to turn content-encoding off, if desired.

Gzip

A new GzipContentCoder module, which will be an instance of
HttpContentCoder.

Data encoding will be handled by the gzip.org zlib library.

Functions:

   * gzEncodeStart(): call deflateInit2(), write header
   * gzEncodeEnd(): write trailer
   * gzEncodeChunk(): call deflate()
   * gzDecodeStart(): call inflateInit2(), read and verify header
   * gzDecodeEnd(): verify trailer
   * gzDecodeChunk(): call inflate()
   * gzDoSaveEncoded(): true
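
For a feel of the chunked gzip flow, here is a hedged Python sketch
using the standard zlib module, which wraps the same zlib library. In
zlib, deflate compresses and inflate decompresses. Unlike the design
above, this sketch lets zlib produce and check the gzip header and
trailer itself by adding 16 to the window bits; the class and method
names are illustrative only.

```python
import zlib

# Adding 16 to the window bits asks zlib to use gzip framing
# (header and CRC/length trailer) instead of the zlib wrapper.
GZIP_WBITS = 16 + zlib.MAX_WBITS


class GzipContentCoder:
    """Chunk-at-a-time gzip encoder/decoder built on zlib."""

    def encode_start(self) -> bytes:
        self._enc = zlib.compressobj(
            zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, GZIP_WBITS
        )
        return b""  # zlib emits the header with the first output

    def encode_chunk(self, chunk: bytes) -> bytes:
        return self._enc.compress(chunk)

    def encode_end(self) -> bytes:
        return self._enc.flush()  # remaining data plus the trailer

    def decode_start(self) -> bytes:
        self._dec = zlib.decompressobj(GZIP_WBITS)
        return b""

    def decode_chunk(self, chunk: bytes) -> bytes:
        return self._dec.decompress(chunk)

    def decode_end(self) -> bytes:
        tail = self._dec.flush()
        if not self._dec.eof:
            # The trailer was never seen, so the stream is incomplete.
            raise ValueError("truncated gzip stream")
        return tail
```

Output produced this way should be readable by gunzip, and the decoder
rejects a stream whose trailer is missing.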

Test Strategy

Must pass the test suite.

Must add appropriate tests, including sending gzipped content to oneself
successfully.

Will also test against the Apache mod_gzip implementation, and maybe
even gunzip.
Received on Tue Mar 02 2004 - 23:32:57 MST
