Re: [squid-users] intermittent TCP_MISS on file specified in refresh_pattern

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 01 Nov 2012 12:46:18 +1300

On 01.11.2012 04:46, Mike Pentecost wrote:
> We are running Squid 3.1 on Debian Squeeze. We are using squid as a
> reverse proxy/cache for a Django backend.
>
> The cache is working well, but there is one file that keeps getting
> by. It has a "?" in its URL, which is needed because it has a license
> key parameter in it. I put a refresh pattern to try to catch it, but
> it is not cached in a consistent manner. It returns a HIT maybe 50%
> of the time, sometimes requests seconds apart will show different
> results.

You mean the URL has parameters which are not shown in your log?

Or do you mean that you append '?' without anything following to the
URL in order to make the network infrastructure treat it as dynamic
content? (default for dynamic content in a lot of places is not to
cache, or not for long)

The difference is important. It could be correct behaviour, or not.

  * When a URL parameter changes by even a single byte it is a whole
different URL. A MISS is expected for any URL that is not already cached.

versus

  * Depending on URL octets to determine traffic caching behaviour is a
major FAIL.
   - Squid's old behaviour of not caching URLs with '?' was solely due
to a default config workaround for old broken CGI scripts. Many
non-Squid caches never followed it, and Squid-3 no longer follows it
either.
   - The only thing you can rely on is the above detail about URLs with
different exact-string values being considered different URLs by
HTTP-compliant caches.
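
As a rough illustration of that last point, here is a toy cache keyed on
the exact URL string. It is only a sketch, not how Squid's store actually
works, and the "key=..." parameter values are made up for the example:

```python
# Toy cache keyed on the exact URL string (illustration only, not Squid's store).
cache = {}

def lookup(url):
    """Return 'HIT' if this exact URL string was seen before; otherwise store it and return 'MISS'."""
    if url in cache:
        return "HIT"
    cache[url] = "response body placeholder"
    return "MISS"

base = "http://foo.bar/static/floatbox/options.js"

# The 'key=...' values below are hypothetical; the original post does not show them.
print(lookup(base + "?key=abc"))  # first request for this string
print(lookup(base + "?key=abc"))  # identical string
print(lookup(base + "?key=abd"))  # one byte differs
print(lookup(base + "?"))         # a bare '?' is yet another URL
```

Only the second lookup finds an entry. Every other string is a new key, so
it has to go to the backend once before it can be served from cache.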

>
> Here is the refresh pattern that I was hoping would catch it, this is
> above any other patterns (this is a static file, and we want squid to
> cache it for at least an hour):
>
> refresh_pattern -i http://foo.bar/static/floatbox/options.js? 60 100% 60 override-expire override-lastmod
>

That is supposed to be a regular expression pattern.

  '\.' and '\?' are required to match literal '.' and '?' characters in
the input value. Unescaped, '.' matches any single character and '?'
makes the preceding 's' optional.
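
To see what the escaping changes, here is a small sketch using Python's
re module as a stand-in. Squid uses its own regex library, so treat this
as an approximation; the meaning of '.' and '?' is the same in both. The
"look-alike" URL is invented purely to show the over-matching:

```python
import re

url = "http://foo.bar/static/floatbox/options.js?"

# Pattern as written in the config: '.' and '?' keep their regex meaning.
unescaped = re.compile(r"http://foo.bar/static/floatbox/options.js?", re.IGNORECASE)

# Escaped pattern: '\.' and '\?' match only the literal characters.
escaped = re.compile(r"http://foo\.bar/static/floatbox/options\.js\?", re.IGNORECASE)

# Hypothetical look-alike URL, used only to show the over-matching.
lookalike = "http://fooXbar/static/floatbox/optionsXj"

print(bool(unescaped.search(url)))        # True
print(bool(unescaped.search(lookalike)))  # True: '.' matched 'X', 's?' matched nothing
print(bool(escaped.search(url)))          # True
print(bool(escaped.search(lookalike)))    # False
```

Note that the unescaped pattern still matches the intended URL; it is just
looser than intended. In the config line the fix would be to write
foo\.bar and options\.js\? in the pattern.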

> Here are some logs showing the weird caching:
>
> 62.189.22.21 [31/Oct/2012:06:15:29 -0700] "GET http://foo.bar/static/floatbox/options.js? HTTP/1.1" 200 TCP_MEM_HIT:NONE
> 208.101.141.24 [31/Oct/2012:06:15:55 -0700] "GET http://foo.bar/static/floatbox/options.js? HTTP/1.1" 200 TCP_MISS:FIRST_UP_PARENT
>
> I was hoping it was a staleness issue, and setting the override-expire
> and lastmod options would help enforce the min/max in the refresh
> pattern. I'm sure I have missed something.

You mentioned that a license key is transferred, in which case you
absolutely do not want to override those two cache controls. Occasional
unnecessary checks with the backend are better than serving stale
'allow' type responses for obsolete security/license keys.

Speaking of headers, what *are* the response headers being produced by
the backend server for Squid to work with?
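
If it helps to collect them, something like the following could pull out
the headers a cache usually cares about. It is a minimal sketch: the
header list is my own selection, the function names are made up, and the
sample values at the bottom are placeholders, not what your backend sends:

```python
import urllib.request

# Headers that commonly influence whether and how long a cache stores a response.
CACHE_HEADERS = ("Cache-Control", "Expires", "Last-Modified", "ETag",
                 "Vary", "Pragma", "Date", "Age", "Set-Cookie")

def cache_relevant(headers):
    """Return the (name, value) pairs whose name is cache-related, matched case-insensitively."""
    wanted = {name.lower() for name in CACHE_HEADERS}
    return [(k, v) for k, v in headers if k.lower() in wanted]

def fetch_headers(url):
    """Fetch `url` and return its response headers as a list of (name, value) pairs."""
    with urllib.request.urlopen(url) as resp:
        return list(resp.headers.items())

# Placeholder headers for illustration only; real values must come from the backend.
sample = [("Content-Type", "application/javascript"),
          ("cache-control", "max-age=0"),
          ("Vary", "Cookie")]
print(cache_relevant(sample))
```

Against the real server it would be called as
cache_relevant(fetch_headers(url)), with url pointing directly at the
backend rather than at Squid.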

Amos
Received on Wed Oct 31 2012 - 23:46:21 MDT
