Re: Antwort: [Mod_gzip] Vary: header and mod_gzip

From: Robert Collins <robertc@dont-contact.us>
Date: 27 Aug 2002 00:16:36 +1000

On Mon, 2002-08-26 at 23:35, Michael.Schroepl@telekurs.de wrote:
>
> Hi Robert,
>
>
> > I'm one of the squid (www.squid-cache.org) proxy & cache developers.
>
> I am very happy that someone like you is showing up here.
> Obviously the problem you mention is around and there should
> be some cooperation and information exchange to solve it,
> even beyond simply adding the missing "Vary:" header to the
> mod_gzip output.

Cool. The mod_gzip author, Kevin Kiley, also replied to me off-list. A
synopsis of his email is
"rather than destroy cache efficiency, mod_gzip removed Vary support
during beta testing". Anil Madhavapeddy has also replied on-list with a
patch - which is excellent!

> I am not using Squid myself but have read some parts of the
> manuals which told me that from some Squid version on (2.4?)
> Squid decided to not cache content with a "Vary:" header.
> This would solve the data integrity problem but disable the
> caching feature of Squid and thus cause a severe performance
> problem for high-traffic sites. So there might be more to do
> than just disable caching compressed content.

There is, and 'we' are on it. Actually Henrik Nordstrom is doing almost
all of the Vary support work. Squid 2.5 will cache Varied objects, and
2.6 will support If-None-Match requests.
 
> You may be the perfect guy to proof-read
> http://www.schroepl.net/projekte/mod_gzip/cache.htm
> (which is now - finally! - completely translated to English,
> sigh) and tell me whether I might have understood the caching
> problem and all of its aspects and which parts of my page
> should be corrected and/or improved.

I will do this. Chalk up a bug for Galeon - it failed to negotiate
English for me, but IE negotiated it fine. Sigh.
 
> The second user may even use the best browser in the world
> and will still not be able to understand the response in
> some cases.

Yep.

> So the problem is even worse than you describe it, because
> many M$IE users may have their Internet Options set to
> HTTP/1.0, possibly even without knowing or by decision of
> their company admins.

Or if they want to work well with 1.0 caches - such as squid. (Actually
squid is kinda 1.0.5 :}. It supports many 1.1 features.)
 
> I myself use "mod_headers" to explicitly send the "Vary:"
> header in these cases as a workaround, and additionally send
> "Cache-Control: Private", both of these in combination with
...

You should not need the Cache-Control: Private directive. The Vary
directive will 'fix' the data integrity issue for all squid 2.4+ users
(and no-one should be using an older version due to security bugs in
them). And given squid's market share that is probably enough to
dramatically reduce any issues - whilst still allowing Vary supporting
caches (squid 2.5 and above) to cache the response.
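
To make the two behaviours concrete, here is a toy model of the store
decision. It is only a sketch of the behaviour described above, not
Squid's actual code; the function name and the supports_vary flag are
made up for illustration, and header lookup assumes the exact key "Vary".

```python
def may_store(response_headers, supports_vary):
    """Toy model: may a shared cache store this response?

    supports_vary=False models a cache that refuses anything carrying a
    Vary header (the squid 2.4 behaviour described in this thread);
    supports_vary=True models a cache that keeps one copy per variant.
    """
    vary = response_headers.get("Vary", "").strip()
    if not vary:
        # No negotiation declared: stored as a single object.
        return True
    if vary == "*":
        # "Vary: *" means the variants cannot be told apart from headers.
        return False
    return supports_vary
```

Either way the client never receives an encoding it did not ask for; the
only difference is whether the response is cached at all.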
 
> > (See RFC 2616 section 14.44, and 14.3). Setting 'Vary: Accept-Encoding'
> > is a SHOULD requirement of RFC 2616 when using Server driven negotiation.
>
> Yes, unfortunately.
> I wish it were a MUST because then mod_gzip would simply
...

Well, it does drop a mod_gzip apache to conditionally compliant status,
not unconditional compliance.
 
> But the real problem seems to be that a correct "Vary:" header
> should contain _all_ the HTTP headers that are subject to
> negotiation, which would then enable Squid to even do the
> negotiation itself in some cases.

Exactly.

> Unfortunately, the structure of mod_gzip doesn't allow for this,
> as it is doing negotiation on things that are _not_ part of the
> HTTP header, so it seems to be impossible to express in the
> "Vary:" header what mod_gzip _really_ did.

Hmm? If it looks at 2 headers - say Accept-Encoding: and User-Agent:,
then the Vary: header should be
Vary: Accept-Encoding, User-Agent
What does mod_gzip look at? (BTW: I've had a look at your caching page.
Can you enlarge on the meaning of the four non-HTTP header variables
used for deciding whether to encode?). I don't see how they will
influence the semantics of the client-proxy-origin-proxy chain. They may
cause unpredictable content to be served - but so does a webcam :}. As
long as the client gets something it thinks it requested, the client
should behave correctly - no more binary data surprising users.
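
A rough sketch of why listing the headers is enough for a cache: the
request headers named in Vary become part of the cache key, so a gzip
variant is never handed to a client that sent a different
Accept-Encoding. This is an illustration only, not how Squid implements
it; variant_key and its arguments are hypothetical names.

```python
def variant_key(url, vary_value, request_headers):
    """Build a cache key from the URL plus the request headers named in Vary."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    # Vary carries a comma-separated list of header field names.
    names = sorted(n.strip().lower() for n in vary_value.split(",") if n.strip())
    # A header the client did not send is recorded as an empty value.
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


vary = "Accept-Encoding, User-Agent"
gzip_client = variant_key("/index.html", vary,
                          {"Accept-Encoding": "gzip", "User-Agent": "A"})
plain_client = variant_key("/index.html", vary, {"User-Agent": "A"})
# The two requests map to different keys, hence different cached variants.
```

Keying on the raw header values is deliberately naive: two User-Agent
strings that would get the same response still produce separate entries.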
 
> Only some days ago we had a discussion here where I tried to
> outline a concept where mod_gzip compression and Squid caching
> could even be used in combination - I will forward the mail
> to you separately. My idea would have required a third piece
> of software, but maybe this one could even be a part of Squid
> itself - at least we should look at it together more closely.

I.e. content-coding compression within squid? There are *serious*
architecture issues in squid that we are *currently* resolving that
would make this quite feasible. I've also implemented Transfer-Encoding
with on-the-fly-gzip support for squid in the past, but those same
architectural issues made it unable to be production-quality. I'd be
interested in your idea all the same - as we get these architecture
issues out of the way, things should be *much* more feasible.
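
For reference, the content-coding case boils down to something like the
sketch below. It assumes a whole response body held in memory and a
hypothetical encode_response helper; it ignores q-values and is not
related to the Squid code mentioned above (which dealt with
Transfer-Encoding).

```python
import gzip


def encode_response(body, request_headers):
    """Return (headers, body), gzip-compressing when the client lists gzip."""
    accept = request_headers.get("Accept-Encoding", "")
    # Naive token check: q-values are ignored, so even "gzip;q=0"
    # would be treated as acceptable in this sketch.
    tokens = [t.split(";")[0].strip().lower() for t in accept.split(",")]
    # Vary is sent on every response, compressed or not, because the
    # choice depended on the request's Accept-Encoding header.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in tokens:
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    headers["Content-Length"] = str(len(body))
    return headers, body
```

The point of the example is the Vary line: it goes out with the
uncompressed variant too, otherwise a cache could store the plain copy
and never learn that a compressed one exists.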
 
> > Is there somewhere for the users of mod_gzip to lodge a bug report on
> > this? Or is it trivial enough for a mod_gzip hacker to fix that I'll
> > get a reply "Hey, we overlooked this, it's fixed now"? :}.
>
> I already have this on my list, and I am optimistic that there
> will be a solution to it in the near future. But sadly, the
> number of "mod_gzip hackers" is very close to zero these days.

To summarize what I've learnt about this:
* There is an extant patch from the OpenBSD folk, and indeed from the
early mod_gzip versions, to generate a Vary: header.
* The author doesn't want to include this at the moment - until Vary
caching proxies exist.
* It's causing ?numerous? incidents on the mod_gzip list where users are
complaining about the effects of the missing Vary: header.
* It's causing occasional reports on the squid lists (growing in
frequency).

What I'd love to see is mod_gzip distributed with Vary: header
support enabled, with an (optional) clearly labelled (*this will
cause X, Y, and Z problems) option to disable Vary support.

If that causes all the old commercial implementations of HTTP/1.0 to
stop caching everything... well so what :}? They should support their
paying customers...

Rob

Received on Mon Aug 26 2002 - 08:16:25 MDT
