Re: [Mod_gzip] Vary: header and mod_gzip

From: Robert Collins <robertc@dont-contact.us>
Date: 26 Aug 2002 18:10:13 +1000

Hi Kevin,
 great to get an authoritative response.

On Mon, 2002-08-26 at 17:28, TOKILEY@aol.com wrote:
> The fact that (some) version(s) of mod_gzip are not
> always 'automatically' adding a 'Vary:' header of any
> kind to responses is NOT a 'bug' nor was it something
> that has 'been overlooked'.

OK, that's good to know.
 
> It was a conscious choice and is completely consistent
> with RFC 2616. The addition of a 'Vary:' header is still not
> considered a 'MUST' item for any Content Origin Server.

Sure. It is a SHOULD however.
 
> It is still considered 'optional' and for a very good reason.
> There are times when it HURTS more than HELPS.

...

> At the time mod_gzip was released... even your own
> SQUID code was completely and totally Non-RFC
> compliant and there was no actual support for the
> 'Vary:' header at all. SQUID would simply do the
> Proxy-cache equivalent of 'throw it on the floor' if
> there was any 'Vary:' header appearing in any
> response and SQUID would make no attempt to
> even cache the response at all ( A complete
> violation of RFC 2616 ).

I should note here - squid is an HTTP/1.0 proxy, and makes no pretense at
being other than that. We are working on full 1.1 support, but all the
requests issued by squid are downgraded to HTTP/1.0 and its responses
are sent via HTTP/1.0.
 
> This simply did not ( and does not ) meet my 'sanity' test
> since MOST modern browsers and User-Agents DID ( and DO )
> support the particular variant in question ( Content-encoding: gzip ).
>
> To force MOST people to suffer the (total) loss of caching benefits
> just because a FEW people are still using inferior
> and/or hopelessly out-dated user-agents is, in my opinion,
> simply an absurd point-of-view.

I can understand your logic path here, but I don't agree with it.
 
> Especially when, for those few people, the solution(s) are actually
> quite simple and are listed here in my own personal order
> of preference...
>
> 1. Join the 21st century and get an HTTP/1.1 compliant browser.
> They are free.

The most recent bug report is from (IIRC) MSIE 6.0 on the Mac. That's
rumoured to be HTTP/1.1 compliant - and gzip is optional.
 
> 2. If you ever happen to get a page that your ( legacy ) Non-HTTP/1.1
> compliant browser is unable to display and you suspect it's
> because of the behavior of an equally Non-HTTP/1.1 compliant
> inline cache then just hit CTRL-R and be sure to get a fresh
> copy of the page from the origin server. No big deal.

An HTTP/1.1 compliant browser can be confused by the response from the
cache. Whether the browser is HTTP/1.1 compliant is orthogonal to the issue.
 
> 3. Turn off your local cache in your browser. That way you
> are assured that the user-agent will always send a 'no-cache'
> request and you are bound to get fresh content negotiated
> at the origin server. That way, if there is a caching penalty
> to be paid... then it is being paid locally ( by this one
> particular user ) and is not affecting everyone using the
> intermediate cache.

This is an implementation specific approach - there's no guarantee that
disabling browser caching will cause a no-cache request to be sent. It
may for some browsers, sure.
 
> 4. Tell whoever is involved in the delivery of the pages to make
> sure they are using a copy of mod_gzip that DOES send
> a 'Vary:' header and let everyone using that cache suffer the
> loss of caching benefits. There are copies of mod_gzip
> around that have the 1 line addition (patch) added and they
> should be easy to find if that's really the way you want to go.

I'd like to put a copy of that patch on the squid website, yes. It's
certainly the recommended way according to the RFC.
 
> 5. Wait until all inline Proxies actually DO support the 'Vary:'
> header correctly and then make sure whatever upstream
> copy of mod_gzip is in use is one of the versions that
> sends the 'Vary:' header. No predictable timeline here.

There's no need to wait for all proxies to support Vary correctly. The
failure mode for proxies in the middle will be as if Vary had not been
issued (except for some not caching as you noted). Open source proxies
will get pressure from the users to support Vary, and commercial proxies
already have fiscal reasons to support Vary correctly.
 
> As far as anyone turning to you and asking you to
> analyze all of this as a 'bug' ( it is not ) please read on...
>
> Regardless of SQUID's inability to correctly process a
> 'Vary:' header... with regards to 'Content-encoding: gzip'
> discrepancies between requests the current behavior of
> SQUID is not actually wrong according to RFC 2616.

Squid will support Vary - at least to the extent of correctly caching
multiple objects with the same URL and a Vary: in the response - in 2.5
(very very close to release).
 
> Section 14.3 says, in no uncertain terms...
>
> "If no Accept-Encoding field is present in a request, the
> server MAY assume that the client will accept any
> content coding."
>
> There are no 'ifs/ands/or buts' associated with this
> statement in the RFC and there is most certainly no
> caveat that says 'Unless, of course, it's something so
> old that it can't support 'gzip' decompression'.
>
> That is exactly what SQUID is (currently) doing.

Actually, it's NOT what squid is doing. Squid is passing on to the 2nd
user, a *negotiated* response from an origin server. When an HTTP/1.1
client requests something from squid with:
Accept-Encoding: identity
the cached, gzipped response will be sent - because squid hasn't been
told that the response varied according to the acceptable encodings.

One possible way we could work around this is to check that the candidate
cached object satisfies the client's Accept-Encoding request header -
but that is *exactly* what Vary: was designed to do.
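
To illustrate the difference, here is a minimal sketch in Python of a cache
that keys only on the URL versus one that honours Vary. It is not Squid code:
TinyCache and vary_key are hypothetical names, headers are assumed to be plain
dicts with lower-cased names, and header values are compared as raw strings,
which a real cache would normalise.

```python
def vary_key(url, request_headers, vary_value):
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = [n.strip().lower() for n in vary_value.split(",") if n.strip()]
    selected = tuple((n, request_headers.get(n, "")) for n in sorted(names))
    return (url, selected)


class TinyCache:
    """Toy in-memory cache (hypothetical, for illustration only)."""

    def __init__(self):
        self.vary_by_url = {}  # url -> Vary value of the last stored response
        self.objects = {}      # (url, selected request headers) -> body

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("vary", "")
        self.vary_by_url[url] = vary
        self.objects[vary_key(url, request_headers, vary)] = body

    def lookup(self, url, request_headers):
        if url not in self.vary_by_url:
            return None
        vary = self.vary_by_url[url]
        if vary.strip() == "*":
            return None  # "Vary: *" can never be matched from the request alone
        return self.objects.get(vary_key(url, request_headers, vary))


# Without Vary, the gzipped body is handed to a client asking for identity.
no_vary = TinyCache()
no_vary.store("/page", {"accept-encoding": "gzip"}, {}, b"GZIPPED")
print(no_vary.lookup("/page", {"accept-encoding": "identity"}))

# With Vary: Accept-Encoding, the same request is a miss and goes upstream.
with_vary = TinyCache()
with_vary.store("/page", {"accept-encoding": "gzip"},
                {"vary": "Accept-Encoding"}, b"GZIPPED")
print(with_vary.lookup("/page", {"accept-encoding": "identity"}))
```

The first lookup returns the gzipped body to the wrong client; the second
returns None, so the request would be forwarded to the origin server.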
 
> And this is as it should be. I am not sure you are aware of it
> but there are, in fact, a number of major HTTP 1.0 legacy
> browsers out there that pre-date the requirement for an
> 'Accept-encoding:' field but they are, in fact, perfectly capable
> of decompressing any response that has 'Content-encoding: gzip'.
> Some old versions of UNIX command line LYNX browser come to mind.

Thanks, but I am aware of that.
 
> According to RFC 2616... only a specific 'Accept-encoding: None'
> header or an 'Accept-encoding: xxxx' string with encoding name
> 'xxxx' having a value of ZERO will ever specifically indicate that
> a requestor should NOT receive a certain encoding.

There is no content-coding of 'None' in the initial registry. Secondly,
if the server can't send a response compatible with the client's
Accept-Encoding: header, it SHOULD send a 406 response. Yes, nothing is
guaranteed, but all the same, we developers should be paying attention
to the SHOULD requirements, not just the MUST and REQUIRED ones.
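
As a rough sketch of that check, assuming my reading of the RFC 2616 section
14.3 rules is right, the Python below decides whether a given content-coding
is acceptable for a request. The function names are hypothetical, and q-value
parsing is deliberately simplified. A server for which no available coding
passes this check is the case where the 406 SHOULD applies.

```python
def parse_accept_encoding(value):
    """Return {coding: qvalue} for an Accept-Encoding header value."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[coding.strip().lower()] = q
    return prefs


def is_acceptable(content_coding, accept_encoding):
    """Check one content-coding against Accept-Encoding (None = header absent)."""
    if accept_encoding is None:
        return True  # no header: the server MAY assume any coding is accepted
    prefs = parse_accept_encoding(accept_encoding)
    coding = content_coding.lower()
    if coding in prefs:
        return prefs[coding] > 0
    if "*" in prefs:
        return prefs["*"] > 0
    # Not listed: only identity stays acceptable unless explicitly excluded.
    return coding == "identity"


print(is_acceptable("gzip", None))
print(is_acceptable("gzip", "identity"))
print(is_acceptable("gzip", "gzip;q=0"))
```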
 
> All of the above being said... it is simply a no-brainer
> for anyone who sees the necessity of the 'Vary:' to
> be used even when it is not supported by the inline
> cache(s) to go ahead and make the one-line change
> to mod_gzip.
>
> Just add a line that adds any of the following to
> the response header and you will achieve the
> same desired effect with proxies that don't actually
> support RFC2616 negotiation schema and are
> going to simply refuse to cache the response
> at all...
>
> Vary: Accept-encoding
> Vary: User-Agent
> Vary: *
> Expires: -1
> etc...

These are different (but their impact on old implementations may well be
equivalent). From what I understand of mod_gzip,
Vary: Accept-Encoding, User-Agent
is the most appropriate header?
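
Since Vary carries a comma-separated list of request header names (or a lone
"*"), a filter adding its own entries should merge them with whatever is
already set. A small Python sketch of that merge follows; add_vary is a
hypothetical helper, not part of mod_gzip or any real API.

```python
def add_vary(existing, *fields):
    """Merge field names into an existing Vary value without duplicates."""
    names = [n.strip() for n in (existing or "").split(",") if n.strip()]
    if "*" in names:
        return "*"  # "*" already covers everything
    seen = {n.lower() for n in names}  # header names are case-insensitive
    for field in fields:
        if field.lower() not in seen:
            names.append(field)
            seen.add(field.lower())
    return ", ".join(names)


print(add_vary(None, "Accept-Encoding", "User-Agent"))
print(add_vary("Accept-Language", "Accept-Encoding"))
```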
 
> Now it's my turn to ask a question...
>
> Any idea WHEN SQUID will (fully) support 'Vary:' and
> can correctly cache/deliver Server negotiated
> content and/or store/forward non-expired multi-variants
> of the same page?

The 2.5 release will fully support caching and delivering of Server
Negotiated content and will store/forward non-expired multi-variants of
the same object. Henrik is working on the next step - ETag support, to
make that caching even more efficient, and closer to the HTTP/1.1 design
goals.
 
> That's what has to happen first.

Well, it's happened :}.

Rob

Received on Mon Aug 26 2002 - 02:10:04 MDT
