Re: Some Squid 1.NOVM.18 oddities [extra information]

From: WWW server manager <webadm@dont-contact.us>
Date: Thu, 18 Dec 1997 21:58:20 +0000 (GMT)

Yesterday, I wrote:
>
> Some Squid 1.NOVM.18 oddities (running on a Sun SPARCserver 20/151
> with Solaris 2.5, compiled with Sun's cc). Partly just to comment on various
> points I've noticed, though including some apparent bugs and places where
> there's room for improvement.
>...
> [I was also surprised that Squid was caching the problem document, since it
> lacked last modification and expiry timestamps, and didn't specify content
> length, but I haven't looked more closely yet to decide whether I think
> there's anything genuinely odd there.]

I've now had a closer look, and found that the unexpected caching was for
304 "not modified" responses, when neither the headers of the 304 nor of the
original (full, 200-response) document gave any strong support for caching
(no Last-Modified, Expires, or other expiry related headers).

The doc/HTTP-codes.txt included with 1.NOVM.18 states that 304 responses are
not cached, but in reality they are cached (at least for the URL I was
trying, http://www.macfixit.com/, which does not supply a Last-Modified
header or any other caching-related headers with 200 or 304 responses).

I can't see *why* it is happening, but it looks as though the response is
being negatively cached by Squid: 200 responses for the document (e.g. after
a forced reload, method = GET) give cache/log entries like

00000009 34998afc fffffffe 34998afc 17942 http://www.macfixit.com/

with last-modified set (artificially) to the Date: timestamp (not supplied
by the server). That appears to mean that it is always deemed stale (as a
special case), though I had to dig around in the code to establish that:
the part of the Squid 1.1 release notes documenting the refresh algorithm
does not say what happens if there's no Last-Modified header when LM-AGE
needs to be checked. A subsequent unconditional GET request certainly seems
to prompt Squid to send a get-if-modified to its parent. Though I was
initially surprised at Squid caching a superficially non-cacheable document,
on reflection it seems sensible - if it's truly dynamic, a subsequent
conditional GET will get a 200 response, but if it's static and the origin
server just doesn't bother with Last-Modified, it will get a 304 and you've
saved some bandwidth.

The bug or feature then arises when the parent (or, presumably, the origin
server) passes back a 304 response; the cache/log entry looks like

00000014 349988d7 34998b24 349988d7 91 http://www.macfixit.com/

with an expiry timestamp in the future, apparently due to Squid
deciding to negatively cache it; the entire response (from a later test,
timestamps won't match!) is

=====
HTTP/1.0 304 Not Modified
Date: Thu, 18 Dec 1997 20:46:41 GMT
Content-type: text/html

=====

That seems like a mistake when the response is for a document which was
itself flagged as implicitly stale because it lacked both last-modified and
expires headers (and any HTTP 1.1 headers relating to caching). The
corresponding entries in store.log show both the last-modification and
expiry timestamps as -2 (unset).
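
As an aid to reading the two cache/log entries quoted above, here is a
minimal Python sketch that decodes them. The field order (file number,
timestamp, expiry, last-modified, size, URL) is my inference from the
description in this message, not something taken from the Squid
documentation, and the names LogEntry, signed32 and parse_entry are purely
illustrative.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    # Field order is an assumption inferred from the message text.
    file_number: int
    timestamp: int
    expires: int
    last_modified: int
    size: int
    url: str


def signed32(hex_text: str) -> int:
    """Interpret a hex string as a signed 32-bit integer (fffffffe -> -2)."""
    value = int(hex_text, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def parse_entry(line: str) -> LogEntry:
    """Split one cache/log line into its six whitespace-separated fields."""
    fileno, stamp, expires, lastmod, size, url = line.split(None, 5)
    return LogEntry(
        int(fileno, 16),
        signed32(stamp),
        signed32(expires),
        signed32(lastmod),
        int(size),
        url.strip(),
    )


full = parse_entry(
    "00000009 34998afc fffffffe 34998afc 17942 http://www.macfixit.com/"
)
not_modified = parse_entry(
    "00000014 349988d7 34998b24 349988d7 91 http://www.macfixit.com/"
)

for entry in (full, not_modified):
    print(entry)
```

Under that assumed field order, the 200 entry decodes to an expiry of -2
(unset) with last-modified equal to the timestamp, while the 304 entry has
an expiry 589 seconds later than its timestamp.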

In reality, it probably doesn't matter too much as long as the information
is not changing rapidly. It could be seriously irritating if (hypothetical
example) a script returns a response which would change every minute and
would return 304 responses if a get-if-modified request quoted a timestamp
in the current minute, with last-modified and expires unset to discourage
caching. Or whatever. I don't happen to know of any "serious" examples, but
they could well exist (a server status report updated each minute,
perhaps?).

In that situation, if you requested the document via a Squid cache then
waited a minute and requested it again, you'd get a new copy. If you
misjudged and requested the second copy while the old document was current,
you'd then be stuck with a cached copy for the Squid negative expiry interval
(typically several minutes) unless you sent a Pragma: no-cache request.
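
The hypothetical scenario can be illustrated with a toy model in Python.
This is only a sketch of the behaviour as I have described it, not Squid's
actual code: the origin function, the ToyCache class and the 300-second
NEGATIVE_TTL are all assumptions made up for the illustration.

```python
NEGATIVE_TTL = 300  # assumed negative-caching interval, in seconds


def origin(now, if_modified_since=None):
    """Hypothetical script: content changes each minute, and a conditional
    request quoting a timestamp in the current minute gets a 304."""
    if if_modified_since is not None and if_modified_since // 60 == now // 60:
        return 304, None
    return 200, f"status at minute {now // 60}"


class ToyCache:
    """Toy cache: the object has no Last-Modified or Expires, so it is
    revalidated on every request unless a 304 was negatively cached."""

    def __init__(self):
        self.body = None
        self.fetched_at = None   # used as the If-Modified-Since value
        self.fresh_until = None  # set only when a 304 is negatively cached

    def get(self, now):
        if (self.body is not None and self.fresh_until is not None
                and now < self.fresh_until):
            return self.body  # served from cache, origin not contacted
        status, body = origin(now, self.fetched_at)
        if status == 200:
            self.body, self.fetched_at, self.fresh_until = body, now, None
        else:
            # The questionable step: the 304 gets a future expiry time.
            self.fresh_until = now + NEGATIVE_TTL
        return self.body


# Waiting more than a minute: the second request gets a new copy.
patient = ToyCache()
first_patient = patient.get(0)
second_patient = patient.get(70)

# Requesting again within the same minute: stuck with the old copy.
hasty = ToyCache()
first_hasty = hasty.get(0)
same_minute = hasty.get(30)    # origin answers 304, which is cached
two_minutes = hasty.get(120)   # still the minute-0 copy
after_ttl = hasty.get(330)     # negative-cache interval over, new copy

print(second_patient, "|", two_minutes, "|", after_ttl)
```

In this model the hasty client keeps receiving the minute-0 document until
330 seconds have passed, although the origin would have returned fresh
content from the 60-second mark onwards.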

In the case that prompted this investigation, a no-cache request was not
something the user understood how to fudge, given that the cached copy was
broken; Netscape never let it become the current document and hence it could
not be shift-RELOAD-ed... Of course, the same situation could arise with a
legitimately cacheable document, but especially where neither the original
document nor the 304 response has headers allowing caching, it seems wrong
to negatively-cache the 304 response!

Anyway: I'd be interested to know whether the negative caching of 304
responses (as seen in the case described above, or in any other situation)
is a bug or a feature. If a feature, it should be noted that
doc/HTTP-codes.txt claims that 304's are not cached. Additionally, the 1.1
release notes do not document how freshness calculations are handled in the
case where a document does not have a Last-Modified header (or any other
headers that would take precedence).

                                John Line

-- 
University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
Received on Thu Dec 18 1997 - 14:07:31 MST
