Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server

From: Daniele Segato <daniele.segato_at_gmail.com>
Date: Sat, 31 Mar 2012 17:01:34 +0200

On 03/31/2012 10:13 AM, Amos Jeffries wrote:
> On 30/03/2012 12:47 p.m., Daniele Segato wrote:
>> Instead squid is ALWAYS requesting the resource from the server:
>> $ curl -v -H 'If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT'
>> 'http://localhost:280/alfresco/service/catalog/products'
>>
>> * About to connect() to localhost port 280 (#0)
>> * Trying 127.0.0.1...
>> * connected
>> * Connected to localhost (127.0.0.1) port 280 (#0)
>>> GET /alfresco/service/catalog/products HTTP/1.1
>>> User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0
>> OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
>>> Host: localhost:280
>>> Accept: */*
>>> If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT
>>>
>> * additional stuff not fine transfer.c:1037: 0 0
>> * HTTP 1.0, assume close after body
>> < HTTP/1.0 304 Not Modified
>> < Date: Thu, 29 Mar 2012 23:27:57 GMT
>> < Cache-Control: public, max-age=3600
>> < Last-Modified: Thu, 29 Mar 2012 22:14:20 GMT
>
> "
>
> max-age
>
> The max-age response directive indicates that the response is to
> be considered stale after its age is greater than the specified
> number of seconds.
>
> "
>
> The logic goes like this:
>
> Object modified ... 22:14:20
> Valid +3600
> ==> fresh until 23:14:20
> Current time: 23:27:57
>
> 23:14:20 < 23:27:57 ==> currently stale. must revalidate.
>
> The Expires header can be used to set an absolute time for invalidation.
> max-age is relative to age.

Hi Amos,

My content was last modified at 22:14:20,
but I made two successive requests: one at 23:27:00, one at 23:27:20.

The first one (23:27:00) was a cache miss;
the second is what you see above.

You are saying that max-age is added to the "last modified date",
but that doesn't make much sense to me.

If the server (parent cache) returns the content at 23:27:00 with
max-age=3600, I would expect the 3600 seconds to start from "now".
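
To show what I mean, here is a minimal sketch of the freshness rule as I
understand it (my assumption of how it should work, not Squid's actual code):
max-age is counted from the moment the response was generated, not from
Last-Modified.

// Minimal sketch (an assumption, not Squid's real implementation):
// a cached response is fresh while (now - response Date) < max-age,
// regardless of Last-Modified.
public class FreshnessSketch {

    static boolean isFresh(long responseDateEpochSec, long maxAgeSec, long nowEpochSec) {
        long currentAge = nowEpochSec - responseDateEpochSec; // age of the cached copy
        return currentAge < maxAgeSec;
    }

    public static void main(String[] args) {
        long firstResponse = 0;   // the MISS response, generated at 23:27:00 (t = 0)
        long secondRequest = 20;  // my second request, 20 seconds later (23:27:20)
        // With max-age=3600 I would expect this to still be fresh -> a cache HIT:
        System.out.println(isFresh(firstResponse, 3600, secondRequest)); // prints true
    }
}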

Anyway, I thought about this before, so I also tried modifying the
content and then immediately sending two requests to squid.

This time, suppose:

   Object modified ... 00:00:00
   Valid +3600
    ==> fresh until 01:00:00
   Current time: 00:05:00

    01:00:00 > 00:05:00 ==> currently fresh. shouldn't bother the server.

Instead, what actually happens is that squid still makes a request to my
server; it's only headers, but it still makes the request.

To compute the "Last-Modified" date, my server has to do all the work of
collecting the data, looping over each data element to extract its last
modified date, and then taking the most recent one. It builds a "model"
that is then rendered; the response is pretty small anyway, since it's
gzipped text.

So the expensive part for my server is collecting the data, and it has to
do that whether you send a GET or a HEAD request.
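
Roughly, the shape of that work is the following (hypothetical names, just
to illustrate, not my real code):

import java.util.Date;
import java.util.List;

// Hypothetical sketch of what the service has to do for BOTH GET and HEAD:
// walk every catalog element just to find the newest modification date.
class CatalogService {

    // CatalogItem is a placeholder for my real data type.
    interface CatalogItem {
        Date getModifiedDate();
    }

    Date lastModifiedOf(List<CatalogItem> items) {
        Date newest = new Date(0);
        for (CatalogItem item : items) {      // the expensive part: touches every element
            Date d = item.getModifiedDate();
            if (d.after(newest)) {
                newest = d;
            }
        }
        return newest;
    }
}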

I would like squid to revalidate with my server every minute or so (even
every 10 seconds is fine), but it shouldn't revalidate on every single
request it receives.
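
In HTTP terms, I suppose what I'm after is equivalent to the origin
sending a shorter max-age, something like the following (a hypothetical
servlet, only to express the intent, not what I actually run):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical handler: with max-age=60 I would expect Squid to answer from
// its cache for up to a minute and revalidate with the origin at most once
// per minute (this is my assumption of how it should behave).
public class CatalogServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setHeader("Cache-Control", "public, max-age=60"); // fresh for 60s, then revalidate
        resp.setContentType("application/json;charset=UTF-8");
        resp.getWriter().write("{\"products\":[]}");           // placeholder body
    }
}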

I hope I made my point.

I wanted to give you an example, but now squid is always giving me a TCP_MISS:

# squid3 -k debug && curl -v
'http://localhost:280/alfresco/service/catalog/products'; squid3 -k debug
* About to connect() to localhost port 280 (#0)
* Trying 127.0.0.1...
* connected
* Connected to localhost (127.0.0.1) port 280 (#0)
> GET /alfresco/service/catalog/products HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0
OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
> Host: localhost:280
> Accept: */*
>
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Sat, 31 Mar 2012 14:53:51 GMT
< Content-Language: en_US
< Cache-Control: public, max-age=3600
< Last-Modified: Sat, 31 Mar 2012 14:03:55 +0000
< Vary: Accept, Accept-Language
< Content-Type: application/json;charset=UTF-8
< Content-Length: 1668
< Server: Jetty(6.1.21)
< X-Cache: MISS from localhost
< X-Cache-Lookup: MISS from localhost:280
< Via: 1.0 localhost (squid/3.1.19)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive

in the debug log I see:

2012/03/31 16:53:51.696| getDefaultParent: returning localhost
2012/03/31 16:53:51.696| peerAddFwdServer: adding localhost DEFAULT_PARENT
2012/03/31 16:53:51.696| peerSelectCallback:
http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696| fwdStartComplete:
http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696| fwdConnectStart:
http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696|
PconnPool::key(flexformAccel,8080,localhost,[::]is
{flexformAccel:8080/localhost}
2012/03/31 16:53:51.696| PconnPool::pop: found
myfAccel:8080/localhost(to use)

[...]

2012/03/31 16:53:52.159| mem_hdr::write: [249,251) object end 249
2012/03/31 16:53:52.159| storeSwapOut:
http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:52.159| storeSwapOut: store_status = STORE_PENDING
2012/03/31 16:53:52.159| store_swapout.cc(190) swapOut: storeSwapOut:
mem->inmem_lo = 0
2012/03/31 16:53:52.159| store_swapout.cc(191) swapOut: storeSwapOut:
mem->endOffset() = 251
2012/03/31 16:53:52.159| store_swapout.cc(192) swapOut: storeSwapOut:
swapout.queue_offset = 0
2012/03/31 16:53:52.159| store_swapout.cc(203) swapOut: storeSwapOut:
lowest_offset = 0
2012/03/31 16:53:52.159| MemObject::isContiguous: Returning true
2012/03/31 16:53:52.159| storeSwapOut: swapout_size = 251
2012/03/31 16:53:52.159| InvokeHandlers: 36A4E18536613102118E64F9C24B04A1
2012/03/31 16:53:52.159| StoreEntry::InvokeHandlers: checking client #0
2012/03/31 16:53:52.159| storeClientCopy2: returning because
ENTRY_FWD_HDR_WAIT set
2012/03/31 16:53:52.159| ctx: enter level 0:
'http://localhost/alfresco/service/catalog/products'
2012/03/31 16:53:52.159| haveParsedReplyHeaders: HTTP CODE: 200
2012/03/31 16:53:52.159| storeGet: looking up
84C0C508214C0BF21C5F311EE274EA0A
2012/03/31 16:53:52.159| storeGet: looking up
480FC351B11ECDB2420E2FAC5F72DF45
2012/03/31 16:53:52.159| 0x1663600: joined for id 59: Accept,
Accept-Language
2012/03/31 16:53:52.160| 0x161f430: joined for id 0: */*
2012/03/31 16:53:52.160| httpMakeVaryMark: accept="*%2F*", accept-language
2012/03/31 16:53:52.160| refreshCheck:
'http://localhost/alfresco/service/catalog/products'
2012/03/31 16:53:52.160| refreshCheck: Matched '. 0 20%% 259200'
2012/03/31 16:53:52.160| age: 61
2012/03/31 16:53:52.160| check_time: Sat, 31 Mar 2012 14:54:52 GMT
2012/03/31 16:53:52.160| entry->timestamp: Sat, 31 Mar 2012 14:53:51 GMT
2012/03/31 16:53:52.160| STALE: expires 1333205632 < check_time 1333205692
2012/03/31 16:53:52.160| Staleness = 60
2012/03/31 16:53:52.160| refreshIsCachable() returned non-cacheable..
2012/03/31 16:53:52.160| StoreEntry::expireNow:
'36A4E18536613102118E64F9C24B04A1'
2012/03/31 16:53:52.160| StoreEntry::setReleaseFlag:
'36A4E18536613102118E64F9C24B04A1'
2012/03/31 16:53:52.160| ctx: exit level 0

It decides the content shouldn't be kept in cache ("non-cacheable"), but I do
not understand why... and the weird thing is that I have not modified the
squid configuration since yesterday, so I don't know what changed.

Summary:
1) Can I make squid3 revalidate its cache with my server every minute or
so (at most), but otherwise serve from its cache without bothering the
server (not even for headers)? How?

2) Why is it now always giving TCP_MISS?

3) What is the best way to debug why squid3 decides to keep a cache
entry or to contact the server? Looking at the huge debug log is not
very simple :) maybe some log option to filter it down to only the
cache-decision information would help.

thanks
Received on Sat Mar 31 2012 - 14:57:24 MDT
