[squid-users] Too many refresh hits on old object

From: Benno Blumenthal <benno@dont-contact.us>
Date: Wed, 23 Apr 2008 11:32:18 -0400

Hello All,

I have an oldish popular object which is getting requested through a
chain of squid servers: a proxy server is talking to a reverse proxy
which asks a reverse proxy which asks the server. My problem is that
the squids rarely respond with HIT/NONE -- every request seems to go
through the whole chain to see if the file is updated. I think the
problem is some confusion about how the Age: and max-age headers
interact (I know I am confused; I worry about squid)

Here is the request for the file with headers:

wget -S http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml
--11:04:12-- http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml
           => `thredds.xml.1'
Resolving iridl.ldeo.columbia.edu... 129.236.110.69
Connecting to iridl.ldeo.columbia.edu|129.236.110.69|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 200 OK
  Date: Thu, 10 Apr 2008 22:57:15 GMT
  Server: Ingrid 0.9
  Mime-Version: 1.0
  Cache-Control: public
  Content-Type: text/xml
  Last-Modified: Fri, 14 Mar 2008 21:44:28 GMT
  X-Cache: MISS from iridlc6.ldeo.columbia.edu
  X-Cache-Lookup: MISS from iridlc6.ldeo.columbia.edu:80
  Age: 1094817
  Content-Length: 23219
  X-Cache: HIT from iridl2p.ldeo.columbia.edu
  X-Cache-Lookup: HIT from iridl2p.ldeo.columbia.edu:80
  Via: 1.0 iridl2p.ldeo.columbia.edu:80 (squid/2.6.STABLE12)
  Connection: keep-alive
Length: 23,219 (23K) [text/xml]

And here is a portion of the front cache's log:

1208962913.226 5 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc6 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc6.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc6.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962913.286 3 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc5 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc5.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc5.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962913.674 3 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc6 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc6.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc6.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962913.725 5 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc6 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc6.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc6.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962914.167 4 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc6 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc6.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc6.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962914.176 3 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc6 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc6.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc6.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962914.642 3 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml -
ROUNDROBIN_PARENT/ingridc5 - [If-Modified-Since: Fri, 14 Mar 2008
21:44:28 GMT\r\nUser-Agent: Ingrid 0.9\r\nVia: 1.0
kage.ldgo.columbia.edu:3128 (squid/2.6.STABLE17)\r\nX-Forwarded-For:
127.0.0.1\r\nHost: iridl.ldeo.columbia.edu\r\nCache-Control:
max-age=259200\r\nConnection: keep-alive\r\n] [HTTP/1.0 304 Not
Modified\r\nX-Cache: MISS from
iridlc5.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc5.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]
1208962914.642 1 129.236.21.133 TCP_REFRESH_HIT/304 315 GET
http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml - NONE/- -
[If-Modified-Since: Fri, 14 Mar 2008 21:44:28 GMT\r\nUser-Agent: Ingrid
0.9\r\nVia: 1.0 kage.ldgo.columbia.edu:3128
(squid/2.6.STABLE17)\r\nX-Forwarded-For: 127.0.0.1\r\nHost:
iridl.ldeo.columbia.edu\r\nCache-Control: max-age=259200\r\nConnection:
keep-alive\r\n] [HTTP/1.0 304 Not Modified\r\nX-Cache: MISS from
iridlc5.ldeo.columbia.edu\r\nX-Cache-Lookup: HIT from
iridlc5.ldeo.columbia.edu:80\r\nConnection: keep-alive\r\n\r]

Note that the TCP_REFRESH_HIT/NONE is from a pair of simultaneous
requests, so otherwise, it always goes back to the server to ask.

I see that the age of the object is much greater than the max-age of the
request, which may be the problem. But I am confused:

1) I am pretty sure that squid inserted the max-age header -- I wrote
the client (though I am not running it) and my code certainly did not
have that header in the request.
2) I am confused as to how this is supposed to work: Is it the age of
the object, or the age of the cache of the object? I could see having
a max-age for the cache copy, (in which case I don't understand why
squid does not reset the age once it checks with the server that it is
unchanged), but I don't see why there should be a limit on the age of an
object (because then old objects get constantly checked).

The reverse caches are squid/2.6STABLE12, the proxy cache is
squid/2.6STABLE17.

Thanks for your time,

Benno
Received on Wed Apr 23 2008 - 15:32:22 MDT

This archive was generated by hypermail 2.2.0 : Thu May 01 2008 - 12:00:04 MDT