Antwort: [Mod_gzip] Re: Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re: Antwort: [Mod _gzip] Vary: header and mod_gzip

From: <Michael.Schroepl@dont-contact.us>
Date: Thu, 29 Aug 2002 19:16:24 +0200

Hi Hendrik,

>> After all, the Apache admin running mod_gzip has the
>> option whether he should rather
>> a) use the "reqheader" directive to support the broken
>> UserAgents by sending them uncompressed content or
>> b) ignore the broken UserAgents and serve content that
>> works best for caching proxies, with a short "Vary"
>> header, thus supporting the correctly working User-
>> Agents.
>> The fewer broken browsers are around, the more likely
>> it will be that scenario b) will work best.
> And it will take considerable amount of time before we
> are there, making it very likely there is ETag support
> in Squid to address the redundant caching issue.

But aren't we already there for the moment, if
mod_gzip sent a "Vary:" header containing the
complete list of relevant HTTP headers?

I am losing track of the many open ends of this thread.
We should try to draw up something like a requirements
list for mod_gzip to best support the work of
Squid 2.4, 2.5 and 2.6. That list may well contain three
different extension levels of mod_gzip (of which
1.3.19.1b, with or without "UserAgent", could be the
first one, but would be improved on by a successor
providing dynamic computation of this header list).
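
To make "dynamic computation of this header list" a bit more
concrete, here is a minimal sketch in Python (not mod_gzip code).
It assumes the module can tell which request headers its rules
actually inspected; the function name build_vary_header and the
input list are hypothetical.

```python
def build_vary_header(consulted_headers):
    """Build a Vary value from the request headers the rules looked at.

    `consulted_headers` is an assumed input: the names of all request
    headers that influenced the compress / don't-compress decision.
    """
    seen = set()
    names = []
    for name in consulted_headers:
        # Normalise spelling, e.g. "accept-encoding" -> "Accept-Encoding".
        canonical = "-".join(part.capitalize() for part in name.strip().split("-"))
        key = canonical.lower()
        if key and key not in seen:  # skip empty names and duplicates
            seen.add(key)
            names.append(canonical)
    return ", ".join(names)


# Scenario b): only content negotiation on Accept-Encoding.
short_vary = build_vary_header(["Accept-Encoding"])
# Scenario a): a "reqheader" rule also looked at User-Agent.
long_vary = build_vary_header(["accept-encoding", "User-Agent", "user-agent"])
```

The shorter the resulting list, the fewer distinct copies a cache
has to keep per URI, which is the trade-off between a) and b) above.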

Of course such a requirements list would not mean
that there are C programmers at hand (and willing)
to implement all these changes, although it would
mean that the coders would at least know where to
start if they wanted to do it.
But at least it would mean that the Squid crew
could read each forthcoming mod_gzip CHANGES list
and easily check whether something significant has
happened regarding the level of communication
conformity between Squid and mod_gzip. So it might at
least influence your blacklist content ... ;-)

>> If I get it right, then the requirements made by Squid
>> 2.6 for mod_gzip and Apache are of the type that if
>> compression via negotiation is in use, so that a URI
>> will generally be mapped to a set of (exactly two)
>> entities with different ETags, then the whole HTTP
>> handling of the Apache would have to take that into
>> consideration.
> Quite likely.

I haven't yet read all your other mails, but I hope
there is a way to compute this ETag efficiently ...

>> Doing any "If-None-Match" processing seems to first
>> of all require Apache to compute these two possible
>> ETag values and check which of the possible cases in
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
>> applies. This would have to be done even before the
>> request would have to be handled, whose output would
>> much later become an input into mod_gzip, so that it
>> can inspect whether it should be compressed or not.
> Yes. Interesting problem to solve. But for Squid's
> purposes it is sufficient if mod_gzip detects the
> If-None-Match and if a match is found with the current
> etag (after mod_gzip transformation has been accounted
> for) change the reply to a 304. Sure, it is not the most
> optimal approach to If-None-Match in the presence of
> transformations such as mod_gzip, but most likely very easy
> to implement.

Again, I am missing some prerequisite knowledge.

If ETag computation required access to the result
of the request, this would be a problem.
If mod_gzip DECLINEs in rules evaluation phase 1,
it will not be invoked again after the results are
there, given its current architecture.

If computing the ETag required having the compressed
version available, only to then decide to throw it
away and send HTTP 304 instead, performance could
become worse than it is now.
Running mod_gzip compression just for the sake of ETag
computation could easily be a performance killer if
large responses are to be handled - especially those
where compression would be rejected because of file
size, but a copy resulting from some former mod_gzip
configuration variant is still stored in some cache
which now sends its ETag ...

So I really hope there is a way to compute ETags
without having to compress the content, as DECLINEing
compression for requests that must not or should not
be compressed is a valuable means to save CPU load.
(Taking today's numbers, my machine is compressing
about 35% of all requests, which saves 56% of all
traffic including HTTP and TCP headers, or 69% of the
HTTP content without headers. Not a good day, there
have been better numbers already ...)

> More technically problematic to deal with is the If-Match
> header, which will probably render mod_gzip more or less
> unusable on WebDAV-enabled servers until solved. But
> this too is easy to solve by enforcing an Apache standard
> for how the ETag should be transformed when the reply entity is.

I didn't fully understand this point.
Are you saying there should be a way to "transform" an
ETag for some uncompressed variant so that it would
represent the compressed variant?
This would then of course not require the content to
be compressed, and maybe save the day (see below).
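
If I read the idea correctly, such a transformation could be a
purely textual one. The following Python sketch is only an
illustration under that assumption; the ";gzip" suffix scheme and
the name variant_etag are made up here and are not an existing
Apache or mod_gzip convention.

```python
def variant_etag(base_etag, encoding):
    """Derive the ETag of an encoded variant from the unencoded entity's ETag.

    Assumed scheme: append ";<encoding>" inside the quotes. No compression
    is needed to compute it, only the original ETag.
    """
    if not encoding or encoding == "identity":
        return base_etag  # the uncompressed variant keeps its ETag
    weak = base_etag.startswith("W/")
    opaque = base_etag[2:] if weak else base_etag
    if not (len(opaque) >= 2 and opaque.startswith('"') and opaque.endswith('"')):
        raise ValueError("not a quoted entity tag: %r" % base_etag)
    derived = '"%s;%s"' % (opaque[1:-1], encoding)
    return ("W/" if weak else "") + derived


plain = variant_etag('"3f80-1b6"', "identity")
gzipped = variant_etag('"3f80-1b6"', "gzip")
```

With a rule like this, both possible ETags of a URI are known as
soon as the ETag of the uncompressed entity is known.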

And how should some 3rd-party module enforce some
"Apache standard" for ETag transformation? Are we
back in the "this needs some Apache core patch" area?

>> So if I didn't miss some important point, my conclusion
>> would be: If the correct use of Content-Encoding
>> according to HTTP/1.1 would require Apache to change its
>> basic HTTP behaviour, like now handling "If-None-Match"
>> headers differently and being aware of two separate
>> entities with separate ETags, then it might be rather
>> difficult to do this outside the Apache core by some
>> add-on module.
> If-None-Match can most likely be processed again by the
> add-on module. If-None-Match is by nature a fall-thru
> condition, so if no match was found in the initial
> check it can be reevaluated later.

I can only hope that the Apache architecture handles
it this way, i.e. that it does nothing special when
it sees an ETag it doesn't know (because mod_gzip
was the originator).
We ought to have someone here with a lot more knowledge
about Apache architecture ...

>> Or it might imply that some future mod_gzip version
>> trying to meet the requirements of Squid 2.6 would
>> need to patch the Apache core.
> I see it more likely that this future mod_gzip version
> may need to duplicate some of the functions also performed
> by the core to compensate for the fact that the conditions
> have changed.

This would seem to be only a minor problem, provided
that mod_gzip is invoked in all relevant scenarios
and early enough to prevent request evaluation when
there should be none.
mod_gzip will surely be able to compensate for missing
operations, but it is very unlikely to be able to undo
things that have already happened.

>> I really _hope_ that I have failed to understand
>> something on the way to this point ...
> I am afraid not..

Which has at least the advantage that I am not
totally useless in this discussion. ;-)

> Applying entity transformations has implications on ETag,
> and therefore has direct implications on HTTP preconditions,
> resulting in a deadlock as mod_gzip cannot correctly evaluate
> the response until the response exists, and the action
> resulting in the response is not allowed to be taken before
> the preconditions have been evaluated.

mod_gzip can of course wait until the response arrives,
then do the precondition checking, possibly throw away
the results of whatever has happened until then, and
send HTTP 304 or anything else. This would surely not be
very efficient, but it would not yet constitute a deadlock.
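
As a rough illustration of this "check late, then throw away"
approach, here is a Python sketch of a response filter. It is not
how mod_gzip works today; the function names, the plain dicts used
for headers (with exactly this spelling of the header names) and
the list of headers kept on a 304 are all assumptions made for the
example. It only covers GET and HEAD, where weak comparison of
entity tags is allowed.

```python
import re

# Matches entity tags such as "abc" or W/"abc" in a header value.
_ETAG_RE = re.compile(r'(?:W/)?"[^"]*"')


def strip_weak(tag):
    """Drop the W/ prefix so tags can be compared weakly."""
    return tag[2:] if tag.startswith("W/") else tag


def none_match_hits(if_none_match, current_etag):
    """True if the If-None-Match value names the ETag of the final response."""
    value = if_none_match.strip()
    if value == "*":
        return True  # matches any existing entity
    wanted = {strip_weak(tag) for tag in _ETAG_RE.findall(value)}
    return strip_weak(current_etag) in wanted


def finish_response(method, request_headers, status, headers, body):
    """Re-check If-None-Match after the (possibly compressed) response exists."""
    condition = request_headers.get("If-None-Match")
    etag = headers.get("ETag")
    if (method in ("GET", "HEAD") and status == 200
            and condition is not None and etag is not None
            and none_match_hits(condition, etag)):
        # Discard the body that was already produced and answer 304.
        keep = ("ETag", "Vary", "Cache-Control", "Expires", "Date")
        kept = {name: value for name, value in headers.items() if name in keep}
        return 304, kept, b""
    return status, headers, body
```

The work done to produce the body is wasted in the 304 case, which
is exactly the inefficiency described above.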

But it cannot undo events that were caused by the
evaluation of HTTP requests (like server-side
applications being executed and changing database
contents) that shouldn't even have been executed.

So the problem seems to be a circular dependency
between the HTTP request handling and the ETag
computation, if there is one ... is it?

This wouldn't be a problem for static content.
mod_gzip is already able to negotiate between
uncompressed and static precompressed variants
(http://www.schroepl.net/projekte/mod_gzip/config.htm#responsibilities),
and it could possibly even provide a cache of compressed
variants (as gzip_cnc does, thus saving gzip
executions) and check whether they are outdated
(http://www.schroepl.net/projekte/mod_gzip/enhancements.htm#detect_outdated).
If so, then for static content both variants would be
at hand nearly all the time, if this were a requirement
for ETag computation.
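
The "outdated" check could be as simple as comparing modification
times. The Python sketch below assumes the compressed variant is
stored as a separate file next to the original; the ".gz" naming
and the function name are hypothetical and only serve the example.

```python
import os
import tempfile


def compressed_variant_is_current(source_path, compressed_path):
    """True if a stored compressed variant exists and is not older than its source."""
    try:
        compressed_mtime = os.path.getmtime(compressed_path)
    except OSError:
        return False  # no stored variant (or not readable): treat as outdated
    return compressed_mtime >= os.path.getmtime(source_path)


# Small self-contained demonstration with temporary files.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "index.html")
    packed = source + ".gz"
    for path in (source, packed):
        with open(path, "wb") as handle:
            handle.write(b"placeholder")
    os.utime(source, (1000, 1000))
    os.utime(packed, (2000, 2000))
    fresh = compressed_variant_is_current(source, packed)      # variant is newer
    os.utime(source, (3000, 3000))
    stale = not compressed_variant_is_current(source, packed)  # source changed later
    os.remove(packed)
    missing = not compressed_variant_is_current(source, packed)
```

Only when the check fails would gzip have to run again for that file.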

But compressing dynamic content (like CGI etc.) is another
story, as caching makes no sense most of the time.
(Personally, I have CGI scripts that send "Expires:"
headers with dates in the future, but they are special
cases anyway.)

> But as I said above it isn't this bad for the If-None-Match
> condition. It can most likely be dealt with in a reasonable
> manner anyway, assuming a response filter such as mod_gzip
> is allowed to replace the response by another response.

Replacing content isn't the problem - this is what mod_gzip
is doing already.

Greetings, Michael
Received on Thu Aug 29 2002 - 11:29:06 MDT
