Re: [squid-users] Caching behavior

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 08 Feb 2012 23:44:20 +1300

On 8/02/2012 10:13 p.m., FredB wrote:
> Hi,
>
> I took some time to test the cache behavior with different versions.
> I just fetch a picture and refresh.
>
> Squid 3.0 STABLE 25
>
> 10.1.1.1 - - [07/Feb/2012:14:52:48 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 200 225918 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:52:51 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 282 TCP_REFRESH_UNMODIFIED:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:52:57 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 282 TCP_REFRESH_UNMODIFIED:DIRECT
>
> Squid 3.1.18
>
> 10.1.1.1 - - [07/Feb/2012:14:50:29 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 0 0 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:50:47 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 200 225909 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:51:04 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 275 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:51:16 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 275 TCP_MISS:DIRECT
>
> Squid 3.2.0.15
>
> 10.1.1.1 - - [07/Feb/2012:14:49:40 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 277 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:49:47 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 277 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:49:59 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 277 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:50:00 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 277 TCP_MISS:HIER_DIRECT
>
> Squid 3.1.12.1-20110523
>
> 10.1.1.1 - - [07/Feb/2012:15:04:35 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 286 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:15:04:39 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 286 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:15:04:44 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 0 0 TCP_MISS:DIRECT
>
> Identical configuration, except squid 3.2 with workers
>
> acl QUERY urlpath_regex cgi-bin ig? /ig/ -> no change without this

You mean removing it has no effect on the HIT/MISS result and status?
You should expect just about everything to be a MISS when that QUERY
setup is used.

That whole QUERY definition is a redundant pattern.
  => the second value 'ig?' is a complex way of writing 'i'.
  => the letter 'i' by itself is a sub-pattern of all the other patterns
in that ACL. So they do not add anything by their presence.

Meaning ... every URL which contains the letter 'i' anywhere in its path
section will *not* be cached or served from cache. 100% of your test
URLs match that pattern.
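
A quick way to check this outside Squid is to run the same patterns as
unanchored searches against the URL path. This is only a rough sketch:
it uses Python's regex engine rather than Squid's, which should behave
the same for patterns this simple, and the helper names are made up for
the illustration.

```python
import re
from urllib.parse import urlsplit

# Patterns copied from the quoted "acl QUERY urlpath_regex" line.
QUERY_PATTERNS = ["cgi-bin", "ig?", "/ig/"]


def url_path(url):
    """Return the path (plus query string, if any) of a URL."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def matches_query_acl(url):
    """True if any pattern is found anywhere in the URL path (unanchored)."""
    path = url_path(url)
    return any(re.search(pattern, path) for pattern in QUERY_PATTERNS)


# The two URLs used in the tests quoted in this thread.
TEST_URLS = [
    "http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg",
    "http://images.nationalgeographic.com/wpf/media-live/photos/000/183/cache/transparent-squid-newbert_18395_600x450.jpg",
]

for url in TEST_URLS:
    print(matches_query_acl(url), url)

# 'ig?' means "an 'i', optionally followed by a 'g'", so on its own it
# matches exactly the strings that contain an 'i'.
for sample in ["/ig/", "/cgi-bin/x", "/pictures/a.jpg", "/robots.txt"]:
    assert bool(re.search("ig?", sample)) == ("i" in sample)
```

Both test URLs match, so both fall under the deny rule. A path without
an 'i', such as /robots.txt, would not match.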

> no_cache deny QUERY

Remove the "no_" portion of that. It was deprecated back in squid-2.2
IIRC and has only ever confused people.

> cache_mem 500 MB
> connect_timeout 5 minutes
> dns_retransmit_interval 5 seconds
> dns_timeout 1 minutes
> persistent_request_timeout 2 minutes
> request_timeout 60 seconds
> maximum_object_size_in_memory 2000 KB
> maximum_object_size 800 MB
>
> I tried smaller pictures, with no more success
>
> 10.1.1.1 - - [07/Feb/2012:15:08:05 +0100] "GET
> http://images.nationalgeographic.com/wpf/media-live/photos/000/183/cache/transparent-squid-newbert_18395_600x450.jpg
> HTTP/1.1" 304 347 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:15:08:13 +0100] "GET
> http://images.nationalgeographic.com/wpf/media-live/photos/000/183/cache/transparent-squid-newbert_18395_600x450.jpg
> HTTP/1.1" 304 347 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:15:08:14 +0100] "GET
> http://images.nationalgeographic.com/wpf/media-live/photos/000/183/cache/transparent-squid-newbert_18395_600x450.jpg
> HTTP/1.1" 304 347 TCP_MISS:HIER_DIRECT
> 10.1.1.1 - - [07/Feb/2012:15:08:15 +0100] "GET
> http://images.nationalgeographic.com/wpf/media-live/photos/000/183/cache/transparent-squid-newbert_18395_600x450.jpg
> HTTP/1.1" 304 347 TCP_MISS:HIER_DIRECT
>
> The next day, I cleared my web browser's cache and retried; there is a TCP_HIT:NONE only with 3.0 (although there are many HITs in access.log with 3.1 and 3.2). If this is not a bug, how can such behavior be explained?

The status is 304. Those responses do not have any body attached, so
there is nothing for Squid to collect and cache as it passes through.

The 3.0 trace shows a 200 followed by IMS revalidations from the client
which get relayed through to the server.
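
To compare the traces side by side it can help to reduce each one to
its sequence of status, size and result code. The sketch below does
that for the 3.0 trace as quoted above. The regular expression is only
an assumption fitted to the log lines shown in this thread, not a
general access.log parser.

```python
import re

# Fitted to the quoted log lines: "METHOD URL HTTP/x.y" STATUS BYTES RESULT:HIER
ENTRY = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+" '
    r"(?P<status>\d+) (?P<bytes>\d+) (?P<result>[A-Z_]+):(?P<hier>[A-Z_]+)"
)

SAMPLE_3_0 = """\
> 10.1.1.1 - - [07/Feb/2012:14:52:48 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 200 225918 TCP_MISS:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:52:51 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 282 TCP_REFRESH_UNMODIFIED:DIRECT
> 10.1.1.1 - - [07/Feb/2012:14:52:57 +0100] "GET
> http://animals.nationalgeographic.com/staticfiles/NGS/Shared/StaticFiles/animals/images/1024/giant-squid2-lw.jpg
> HTTP/1.1" 304 282 TCP_REFRESH_UNMODIFIED:DIRECT
"""


def parse_entries(text):
    """Return (status, bytes, result) for each log entry in the text."""
    # Drop the mail quote markers, then undo the line wrapping.
    unquoted = re.sub(r"^>\s?", "", text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return [
        (int(m["status"]), int(m["bytes"]), m["result"])
        for m in ENTRY.finditer(flat)
    ]


def full_responses(entries):
    """Entries with a 200 status, i.e. the ones that carried a body."""
    return [entry for entry in entries if entry[0] == 200]


entries = parse_entries(SAMPLE_3_0)
for status, size, result in entries:
    print(status, size, result)
print("responses with a body:", len(full_responses(entries)))
```

Run against the other traces, the same helper makes it easy to see
which of them contain no 200 at all, and so never gave Squid a body to
store.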

For the others it looks a bit weird, but is possible under some
conditions. Without the sequence of full headers it's impossible to say
what *should* be happening, or what actually is happening, in the
other cases.

For testing try using squidclient in a shell script to make a series of
controlled calls. That way you can have a series of actions and know in
advance what the Squid behaviour is supposed to be at each point of the
test. Then compare what happens and what gets logged against the
expected results.
  If you have a trace of real traffic headers you can tune the scripted
calls around what those do and compare the versions under identical
conditions.
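
A scripted series along those lines might look something like the
sketch below. It is written in Python rather than shell and simply
wraps the squidclient command line. The proxy address and port, the
If-Modified-Since date and the "what to look for" notes are all
assumptions for illustration, and what Squid really does depends on the
response headers and your config. By default it only prints the
commands; pass --run to execute them.

```python
import shlex
import subprocess
import sys

PROXY_HOST = "127.0.0.1"  # assumption: Squid runs on this machine
PROXY_PORT = 3128         # assumption: Squid's default http_port
URL = (
    "http://images.nationalgeographic.com/wpf/media-live/photos/000/183/"
    "cache/transparent-squid-newbert_18395_600x450.jpg"
)

# (label, extra request headers, force reload?, what to look for)
PLAN = [
    ("first fetch", [], False, "a MISS with a 200 and the full body"),
    ("repeat fetch", [], False, "a HIT variant, if the object was cacheable"),
    ("conditional fetch",
     ["If-Modified-Since: Tue, 07 Feb 2012 14:00:00 GMT"], False,
     "a 304, if the object is older than that date"),
    ("forced reload", [], True, "a fresh fetch from the origin"),
]


def squidclient_cmd(url, extra_headers=(), force_reload=False):
    """Build the squidclient argument list for one request."""
    cmd = ["squidclient", "-h", PROXY_HOST, "-p", str(PROXY_PORT)]
    if force_reload:
        cmd.append("-r")  # ask the cache to reload the URL
    if extra_headers:
        # All extra headers go into one -H string, each ended by a
        # literal backslash-n, which squidclient turns into a newline.
        cmd += ["-H", "".join(header + "\\n" for header in extra_headers)]
    cmd.append(url)
    return cmd


def status_line(cmd):
    """Run one squidclient call and return the first line of its output."""
    result = subprocess.run(cmd, capture_output=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0].decode("latin-1") if lines else "(no output)"


def main(argv):
    run = "--run" in argv
    for label, headers, force_reload, expected in PLAN:
        cmd = squidclient_cmd(URL, headers, force_reload)
        print(f"{label}: expect {expected}")
        print("  " + shlex.join(cmd))
        if run:
            print("  got: " + status_line(cmd))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running the same plan against each Squid version, then lining the
output up with access.log, gives a like-for-like comparison.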

Amos
Received on Wed Feb 08 2012 - 10:44:28 MST
