[Fwd: Re: [Mod_gzip] Vary: header and mod_gzip]

From: Robert Collins <robertc@dont-contact.us>
Date: 26 Aug 2002 17:38:41 +1000

-----Forwarded Message-----

> From: TOKILEY@aol.com
> To: robertc@squid-cache.org
> Cc: michael@schroepl.net, cranstone1@attbi.com, TOKILEY@aol.com
> Subject: Re: [Mod_gzip] Vary: header and mod_gzip
> Date: 26 Aug 2002 03:28:56 -0400
>
>
> Hi Rob...
> This is Kevin Kiley, the author of mod_gzip.
>
> The fact that (some) version(s) of mod_gzip are not
> always 'automatically' adding a 'Vary:' header of any
> kind to responses is NOT a 'bug' nor was it something
> that has 'been overlooked'.
>
> It was a conscious choice and is completely consistent
> with RFC 2616. The addition of a 'Vary:' header is still not
> considered a 'MUST' item for any Content Origin Server.
>
> It is still considered 'optional' and for a very good reason.
> There are times when it HURTS more than HELPS.
>
> A little history is in order here...
>
> The original version(s) of mod_gzip were coded almost
> 3 years ago and did, in fact, contain code that would
> automatically add a 'Vary:' header. Whether or not that
> 'Vary:' header should be for just the 'Accept-encoding:'
> field or for the full range of fields that are involved in
> determining if a User-Agent can really accept compressed
> data or not is a moot point. Different combinations were
> in use during Beta testing but it was the Beta tests
> themselves that revealed why the 'Vary:' header itself
> was actually doing more HARM than GOOD.
>
> At the time mod_gzip was released... even your own
> SQUID code was completely and totally Non-RFC
> compliant and there was no actual support for the
> 'Vary:' header at all. SQUID would simply do the
> Proxy-cache equivalent of 'throw it on the floor' if
> there was any 'Vary:' header appearing in any
> response and SQUID would make no attempt to
> even cache the response at all ( A complete
> violation of RFC 2616 ).
>
> The story was the same with all the other major Proxies
> of the day. No one was making any attempt to actually
> fulfill the requirements of RFC 2616 with regards to
> negotiated Server content.
>
> Given that reality... the choice was made to not use
> the 'Vary:' header at all ( at least not in the versions that
> were leaving my desk ) until there was some indication
> that even just one major proxy Server was going to make
> any attempt to correctly support it.
>
> Why?... simple... because the reality of the day was
> that the use of the 'Vary:' header was no different from
> simply using 'Expires: -1' or 'pragma: no-cache' and there
> was no chance at all of anyone seeing any benefit from
> caching for any document that contains a 'Vary' header.
>
> This simply did not ( and does not ) meet my 'sanity' test
> since MOST modern browsers and User-Agents DID ( and DO )
> support the particular variant in question ( Content-encoding: gzip ).
>
> To force MOST people to suffer the (total) loss of caching benefits
> just because a FEW people are still using inferior
> and/or hopelessly out-dated user-agents is, in my opinion,
> simply an absurd point-of-view.
>
> Especially when, for those few people, the solution(s) are actually
> quite simple and are listed here in my own personal order
> of preference...
>
> 1. Join the 21st century and get an HTTP/1.1 compliant browser.
> They are free.
>
> 2. If you ever happen to get a page that your ( legacy ) Non-HTTP/1.1
> compliant browser is unable to display and you suspect it's
> because of the behavior of an equally Non-HTTP/1.1 compliant
> inline cache then just hit CTRL-R and be sure to get a fresh
> copy of the page from the origin server. No big deal.
>
> 3. Turn off your local cache in your browser. That way you
> are assured that the user-agent will always send a 'no-cache'
> request and you are bound to get fresh content negotiated
> at the origin server. That way, if there is a caching penalty
> to be paid... then it is being paid locally ( by this one
> particular user ) and is not affecting everyone using the
> intermediate cache.
>
> 4. Tell whoever is involved in the delivery of the pages to make
> sure they are using a copy of mod_gzip that DOES send
> a 'Vary:' header and let everyone using that cache suffer the
> loss of caching benefits. There are copies of mod_gzip
> around that have the 1 line addition (patch) added and they
> should be easy to find if that's really the way you want to go.
>
> 5. Wait until all inline Proxies actually DO support the 'Vary:'
> header correctly and then make sure whatever upstream
> copy of mod_gzip is in use is one of the versions that
> sends the 'Vary:' header. No predictable timeline here.
>
> 6. Just add 'mod_headers' to the Apache configuration
> and tell it to add whatever response headers you like
> that don't seem to come from anywhere else. That's
> its entire function in life... to solve these kinds of
> things with run-time 'configuration' parameters.
>
> As far as anyone turning to you and asking you to
> analyze all of this as a 'bug' ( it is not ) please read on...
>
> Regardless of SQUID's inability to correctly process a
> 'Vary:' header... with regards to 'Content-encoding: gzip'
> discrepancies between requests the current behavior of
> SQUID is not actually wrong according to RFC 2616.
>
> Section 14.3 says, in no uncertain terms...
>
> "If no Accept-Encoding field is present in a request, the
> server MAY assume that the client will accept any
> content coding."
>
> There are no 'ifs/ands/or buts' associated with this
> statement in the RFC and there is most certainly no
> caveat that says 'Unless, of course, it's something so
> old that it can't support 'gzip' decompression'.
>
> That is exactly what SQUID is (currently) doing.
>
> It is 'assuming' that, even though there is an obvious discrepancy
> between one request that contains 'Accept-Encoding: gzip' and
> another request that does not... it is still OK to 'assume' that
> the requestor can handle the currently cached response
> with the currently applied 'Content-encoding:'.
>
> And this is as it should be. I am not sure you are aware of it
> but there are, in fact, a number of major HTTP 1.0 legacy
> browsers out there that pre-date the requirement for an
> 'Accept-encoding:' field but they are, in fact, perfectly capable
> of decompressing any response that has 'Content-encoding: gzip'.
> Some old versions of UNIX command line LYNX browser come to mind.
>
> According to RFC 2616... only a specific 'Accept-encoding: None'
> header or an 'Accept-encoding: xxxx' string with encoding name
> 'xxxx' having a value of ZERO will ever specifically indicate that
> a requestor should NOT receive a certain encoding.
>
> Whether or not this point-of-view reflects the reality of legacy
> browsers is not really the point. It's what the protocol standards
> documents say is 'correct behavior'.
>
> The authors of the HTTP/1.1 standards document have openly
> acknowledged that there comes a time when you simply have
> to stop caring about (broken) legacy implementations and the
> fact that even HTTP 0.9 user-agents were SUPPOSED to have
> support for gzip inflation built in puts this particular encoding
> capability well into the 'if it isn't there by now then there's no
> use catering to that old, broken agent' category.
>
> All of the above being said... it is simply a no-brainer
> for anyone who sees the necessity of using the 'Vary:'
> header even when it is not supported by the inline
> cache(s) to go ahead and make the one-line change
> to mod_gzip.
>
> Just add a line that adds any of the following to
> the response header and you will achieve the
> same desired effect with proxies that don't actually
> support RFC2616 negotiation schema and are
> going to simply refuse to cache the response
> at all...
>
> Vary: Accept-encoding
> Vary: User-Agent
> Vary: *
> Expires: -1
> etc...
>
> Now it's my turn to ask a question...
>
> Any idea WHEN SQUID will (fully) support 'Vary:' and
> can correctly cache/deliver Server negotiated
> content and/or store/forward non-expired multi-variants
> of the same page?
>
> That's what has to happen first.
>
> Yours...
> Kevin
>
> In a message dated 8/25/2002 9:35:21 AM Central Daylight Time,
> robertc@squid-cache.org writes:
>
>
> > Hi all,
> > I'm one of the squid (www.squid-cache.org) proxy & cache developers.
> > We've had a couple of bugs now reported to us which are actually caused
> > by mod_gzip.
> >
> > This is the scenario: A user behind a proxy requests a page from a
> > mod_gzip enabled server. The proxy caches the result according to RFC
> > 2616 rules. A second user, using a non-compress enabled browser (i.e.
> > older netscapes, older text browsers, or apparently recent Mac browsers)
> > requests the same document, and the proxy (again in conformance with RFC
> > 2616) returns the cached object - which the client can not interpret.
> >
> > This occurs because mod_gzip is not setting a Vary: header in the reply.
> > (See RFC 2616 section 14.44, and 14.3). Setting 'Vary: Accept-Encoding'
> > is a SHOULD requirement of RFC 2616 when using Server driven
> > negotiation.
> >
> > I've grepped the mod_gzip sources, but could not see any reference to
> > Vary.
> >
> > Is there somewhere for the users of mod_gzip to lodge a bug report on
> > this? Or is it trivial enough for a mod_gzip hacker to fix that I'll get
> > a reply "Hey, we overlooked this, it's fixed now"? :}.
> >
> > Cheers,
> > Rob
> >
> >
>
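As a rough illustration of the Section 14.3 rules argued over in the reply above, here is a minimal Python sketch of how a server or cache might decide whether a gzip-coded response is acceptable for a given request. The function name `gzip_acceptable` is hypothetical and is not taken from mod_gzip or Squid. It assumes the raw header value is passed in, or `None` when the header is absent. RFC 2616 expresses refusal through a qvalue of 0 and through the `identity` token rather than a literal `None` value, so that is what the sketch checks.

```python
def gzip_acceptable(accept_encoding):
    """Return True if a gzip-coded response may be sent for this request.

    accept_encoding: raw Accept-Encoding header value, or None when the
    request carried no such header. (Hypothetical helper for illustration.)
    """
    if accept_encoding is None:
        # RFC 2616 14.3: with no Accept-Encoding field, the server MAY
        # assume that the client will accept any content coding.
        return True

    qvalues = {}
    for item in accept_encoding.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        params = params.replace(" ", "").lower()
        q = 1.0  # a coding listed without a qvalue is fully acceptable
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable qvalue as a refusal
        qvalues[name.strip().lower()] = q

    if "gzip" in qvalues:
        return qvalues["gzip"] > 0   # listed explicitly; q=0 means "not acceptable"
    if "*" in qvalues:
        return qvalues["*"] > 0      # wildcard covers codings not listed
    return False                     # header present but gzip not covered
```

Under this reading a request with no header at all is treated as accepting gzip, which is exactly the case the two sides of this thread disagree about in practice.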

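The caching scenario in Rob's original message can be sketched with a toy cache that uses the response's Vary header to build a secondary key. This is only a sketch under stated assumptions: `VaryCache` and its methods are hypothetical names, header dictionaries are assumed to use lower-cased keys, and expiry and revalidation are ignored. It does not describe how Squid or any other proxy is actually implemented.

```python
class VaryCache:
    """Toy cache keyed by URL plus the request headers named in Vary."""

    def __init__(self):
        self._vary = {}   # url -> tuple of lower-cased header names from Vary
        self._store = {}  # (url, variant key) -> (response headers, body)

    @staticmethod
    def _variant_key(names, request_headers):
        # One (name, value) pair per header listed in Vary; a missing
        # request header is recorded as the empty string.
        return tuple((n, request_headers.get(n, "")) for n in names)

    def put(self, url, request_headers, response_headers, body):
        vary = response_headers.get("vary", "")
        if vary.strip() == "*":
            return False  # 'Vary: *' cannot be matched from request headers
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self._vary[url] = names
        key = (url, self._variant_key(names, request_headers))
        self._store[key] = (response_headers, body)
        return True

    def get(self, url, request_headers):
        names = self._vary.get(url)
        if names is None:
            return None  # nothing cached for this URL
        return self._store.get((url, self._variant_key(names, request_headers)))


gzip_request = {"accept-encoding": "gzip"}
plain_request = {}  # e.g. an older browser that sends no Accept-Encoding

# Without Vary, the compressed entity is handed to every later requester.
no_vary = VaryCache()
no_vary.put("/page", gzip_request, {"content-encoding": "gzip"}, b"<gzip bytes>")
hit_without_vary = no_vary.get("/page", plain_request)

# With 'Vary: Accept-Encoding', the second request misses and goes upstream.
with_vary = VaryCache()
with_vary.put("/page", gzip_request,
              {"content-encoding": "gzip", "vary": "Accept-Encoding"},
              b"<gzip bytes>")
miss_with_vary = with_vary.get("/page", plain_request)
```

Here `hit_without_vary` is the gzip-coded entry, which is the behavior reported as a bug, while `miss_with_vary` is `None`, so the plain request would be forwarded to the origin server.
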
Received on Mon Aug 26 2002 - 01:38:32 MDT
