http response headers from Dancer on 1998-04-20 (squid-dev)

From: Dancer <dancer@dont-contact.us>
Date: Tue, 21 Apr 1998 14:08:54 +1000

Okay. Here's a daft idea that I've been fiddling with. I've had my
coffee, so I've got an approximation of rational thought. I've had a
hard time getting this beyond about half-formed, so bear with me, if you
would, while I gallop madly in all directions.

1) We want to keep object data (content), response-headers and squid's
meta-data all in one file. That's a given. We don't want to waste inodes
unnecessarily.

2) HTTP response headers may change independently of the object. Not
much (usually), it's true. An IMS request may return response headers
with different last-modified-times (but that still gives us a
not-modified response).

3) We might choose to do something to the object subsequently: Calculate
an MD5sum to share with our peers, to prevent duplication of objects.
Maybe.

Whatever...the current storage format of metadata/header/object is
somewhat constraining in that we can't realistically change the headers
without completely rewriting the stored object. This is assuming we
might want to change the headers in future, which I can anticipate us
possibly wanting to.

So what makes more sense to me (and it surprised me somewhat that it
wasn't done this way) is to write metadata/offset/object/headers. Most
of the metadata fields are fixed length, except for the URL. The object
isn't going to change without this storage unit getting flushed and/or
replaced. That leaves the headers on the end, where they can be
truncated/rewritten/axe-murdered/set-alight or whatever.
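
To make the metadata/offset/object/headers layout concrete, here is a
minimal sketch in Python. It is not Squid's actual swap file format: the
PREFIX record (URL length, object length, header offset) and the function
names are assumptions made up for illustration. The point is only that
with the headers last, replacing them means truncating and rewriting the
tail of the file.

```python
import io
import struct

# Hypothetical fixed-length prefix: URL length, object length, header offset.
# This is an illustrative layout, not Squid's real on-disk metadata.
PREFIX = struct.Struct("<HQQ")

def write_entry(f, url: bytes, body: bytes, headers: bytes) -> None:
    """Lay the entry out as metadata / offset / object / headers."""
    header_offset = PREFIX.size + len(url) + len(body)
    f.seek(0)
    f.write(PREFIX.pack(len(url), len(body), header_offset))
    f.write(url)
    f.write(body)
    f.write(headers)
    f.truncate()  # drop anything left over from a longer previous entry

def read_headers(f) -> bytes:
    """Follow the stored offset and return everything from there to EOF."""
    f.seek(0)
    _, _, header_offset = PREFIX.unpack(f.read(PREFIX.size))
    f.seek(header_offset)
    return f.read()

def replace_headers(f, headers: bytes) -> None:
    """Rewrite only the tail; the metadata and object bytes are untouched."""
    f.seek(0)
    _, _, header_offset = PREFIX.unpack(f.read(PREFIX.size))
    f.seek(header_offset)
    f.truncate()  # cut the old headers off at the offset
    f.write(headers)
```

Any seekable binary file object works here, for example one opened with
open(path, "r+b") or an io.BytesIO for experiments.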

Obvious problem:

Orphaned headers. Since the headers go in at the end, the only way to
get them on disk (assuming the object is not being cached in memory
during the transfer) for another request to reference, is to know the
size of the object in advance (not always true) and be assured that it
is correct, write the offset, seek forward, write the headers, seek
back, and write the object as it comes in. Unpretty.

Alternatively, the headers can be held in memory, and reference-accessed
by subsequent requests until the object fetch is complete. This still
doesn't evade the basic issue that the headers and the object are
somewhat divorced from each other (well, a trial-separation, but they
get back together for the sake of the children), and are ultimately
stored ass-backwards (if you'll pardon the quaint colloquialism).
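
One possible reading of the hold-the-headers-in-memory option, sketched
in Python under the same made-up assumptions (a hypothetical PREFIX
record of URL length, object length and header offset; none of this is
Squid code): write a placeholder prefix, stream the object as it
arrives, append the headers once the size is known, then seek back once
to patch the real length and offset in.

```python
import io
import struct

# Hypothetical prefix: URL length, object length, header offset.
PREFIX = struct.Struct("<HQQ")

def store_streaming(f, url: bytes, headers: bytes, chunks) -> None:
    """Store an object of unknown size; headers go on disk last."""
    # The headers live only in memory (the `headers` argument) until the
    # fetch completes, so other requests would have to read them from there.
    f.seek(0)
    f.write(PREFIX.pack(len(url), 0, 0))  # placeholder length and offset
    f.write(url)
    body_len = 0
    for chunk in chunks:                  # object data as it comes in
        f.write(chunk)
        body_len += len(chunk)
    header_offset = f.tell()
    f.write(headers)                      # headers land at the end
    f.seek(0)                             # one seek back to patch the prefix
    f.write(PREFIX.pack(len(url), body_len, header_offset))
```

Until the final patch is written the on-disk prefix is a placeholder, so
the entry is not usable from disk while the fetch is in flight, which is
exactly the divorce described above.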

OTOH, this setup gives you almost complete freedom to update the object
response-headers stored in the cache, without rewriting the whole
storage file. It also gives you the freedom to write arbitrary meta-data
to the object, pretty much at will:

X-Squid-MD5: E7BD48A8
X-Rating-Module: MA18 (adult)
X-Squid-Storage-Algorithm: algorithm=4, parms="0000c0"

Or just about anything else. IMO, it would contribute to (not instantly
instantiate) a more modular design where code could be inserted at
predefined slots in the LRU and cache-maintenance, neighbour selection,
and so forth, and the modules could use the header space to generate,
store, or perhaps transmit custom metadata to peers.

As I write this, I can see other ways, without disrupting the existing
storage format. Data could be appended to the object as further-headers,
or as just about anything... The problem with that would be the same as
with rearranging the storage format: telling the delivery code when to
stop (it ain't just EOF any more).
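
The "when to stop" problem can be sketched as follows, again in Python
and again as an assumption rather than anything Squid does: if the
object's start and length are recorded somewhere in the metadata (the
body_start and body_len parameters here are hypothetical), delivery
reads exactly that many bytes and never looks at whatever has been
appended after the object.

```python
import io

def deliver(f, body_start: int, body_len: int, chunk_size: int = 4096):
    """Yield exactly body_len bytes from body_start, ignoring any tail."""
    f.seek(body_start)
    remaining = body_len
    while remaining > 0:
        chunk = f.read(min(chunk_size, remaining))
        if not chunk:
            break  # file is shorter than the recorded length
        remaining -= len(chunk)
        yield chunk
```

For example, with a file containing 4 bytes of metadata, a 12-byte
object and some appended headers, deliver(f, 4, 12) yields only the 12
object bytes.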


Received on Tue Jul 29 2003 - 13:15:47 MDT
