Re: [squid-users] cache dynamically generated images

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 19 Feb 2011 11:47:07 +1300

On 19/02/11 04:53, Charles Galpin wrote:
> Hi All
>
> I'm having no luck doing something which I expected to be very simple,
> yet has me completely stumped. I'll give you an overview of what I am
> trying to do and then my problems. Any help is appreciated.
>
> In short, I want to put squid in front of apache which is serving images
> dynamically (ie generating them from a script) and cache them for the
> period specified by the script as it will vary per image. I am going to
> have different lifespans for the images and I am testing with a max-age
> of 10 minutes but in production it could be as little as 2 seconds. At
> this point I'm hoping you're thinking "A basic http accelerator, ok".
>

Yep.

Make sure your config has had these changes:
   http://wiki.squid-cache.org/ConfigExamples/DynamicContent

which let Squid cache objects with a query string (?) in the URL.

> So one of my questions might not really be squid specific but if anyone
> knows, you will. Squid is not caching the images and I suspect it's due to
> the headers being generated. I get TCP_REFRESH_MISS/200 every time, even
> with a 10 minute age. The browser is sending an "If-Modified-Since"
> header as well as a "Cache-Control: max-age=0" and I suspect this is
> part of the problem.

Very likely.

TCP_REFRESH_MISS means Squid tried to revalidate/refresh its existing
copy and the backend answered with a new, changed copy.

max-age=0 means: revalidate that it has not changed before sending anything.

> I have set an Expires, Etag, "Cache-Control:
> max-age=600, s-max-age=600, must-revalidate", "Content-Length and

must-revalidate from the server is essentially the same as max-age=0
from the client. It will also lead to TCP_REFRESH_MISS.

BUT, these controls are only what is making the problem visible. The
server logic itself is the actual problem.

> Last-Modified headers in the response to no avail. Can you advise on
> how to set the headers properly to get squid to cache them? Here is an
> example request/response. My test script is pretty dumb so I can't keep
> track of a last-modified date across sessions, but if I do not send a
> last modified header in the response the browser does not use an
> if-modified-since header in the subsequent request, but there is still
> no caching.
>
> Request headers:
>
> Host: 192.168.13.31
> User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-us,en;q=0.5
> Accept-Encoding: gzip,deflate
> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
> Keep-Alive: 115
> Connection: keep-alive
> If-Modified-Since: Fri, 18 Feb 2011 15:42:29 GMT
> If-None-Match: 6382.jpg
> Cache-Control: max-age=0
>
> Response headers:
>
> Date: Fri, 18 Feb 2011 15:42:33 GMT
> Server: Apache/2.2.14 (Win32) SVN/1.6.15 mod_python/3.3.2-dev-20080819 Python/2.6.4 DAV/2 mod_perl/2.0.4-dev Perl/v5.10.1
> Expires: Fri, 18 Feb 2011 15:52:35 GMT
> Etag: 6382.jpg

ETag should be the MD5 checksum of the file or something similarly
unique. It is used alongside the URL to guarantee version differences
are kept separate.
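
As a rough illustration of the MD5 suggestion, here is a minimal Python
sketch. The function name make_etag is made up for this example, and it
assumes the script has the generated image bytes in hand. HTTP expects
the entity tag to be a quoted string, so the digest is wrapped in double
quotes.

```python
import hashlib


def make_etag(image_bytes: bytes) -> str:
    """Return an ETag value: the MD5 hex digest of the body, in double quotes.

    make_etag is a hypothetical helper name, not part of Squid or Apache.
    """
    return '"' + hashlib.md5(image_bytes).hexdigest() + '"'
```

Two images with different bytes then get different tags, while an
unchanged image keeps the same tag across requests.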

> Cache-Control: max-age=600, s-max-age=600, must-revalidate
> Content-Length: 15965
> Last-Modified: Fri, 18 Feb 2011 15:42:35 GMT
> Content-Type: image/jpg
> X-Cache: MISS from PowerSpecG158
> X-Cache-Lookup: HIT from PowerSpecG158:80
> Via: 1.1 PowerSpecG158:80 (squid/2.7.STABLE8)
> Connection: keep-alive
>
>
> If you think this approach is bad and you have a better way, I am all
> ears. I have tried a mod_rewrite rule to check for the existence of a
> static file and serve it if it exists, otherwise call the script which
> generates both the static file and returns the content, but the files
> would need to be purged when they are stale and at rapid refresh rates I
> found I could easily get timing issues with the check for the static
> file succeeding, but then apache not being able to find the file when it
> goes to serve it.

Your approach is reasonable for your needs. But the backend server
system is letting you down by sending back a new copy on every validation.
If you can get it to send 304 Not Modified responses between file
update times, this will work as intended.

This would mean implementing some extra logic in the script to handle the
If-Modified-Since, If-Unmodified-Since, If-None-Match and If-Match headers.
The script itself needs to be in control of whether a local static
duplicate is used; Apache does not have enough info to do it, as you
noticed. Most CMS call this server-side caching.

>
> Lastly, I was unable to set up squid on an alternate port - say 8081, and
> use an existing apache on port 80, both on the same box. This is for
> testing so I can run squid in parallel with the existing service without
> changing the port it is on. Squid seems to want to use the same port
> for the origin server as itself and I can't figure out how to say
> "listen on 8081 but send requests to port 80 of the origin server". Any
> thoughts on this? I am using another server right now to get around
> this, but it would be more convenient to use the same box.

cache_peer parameter #3 is the port number on the origin server to send
HTTP requests to.

Also, to make the Host: header and URL contain the right port number
when crossing ports like this, you need to set the http_port vport=X
option to the port the backend server is using. Otherwise Squid will
place its public-facing port number in the Host: header to inform the
backend what the client's real URL was.
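
Put together, a squid.conf fragment for the 8081-in-front-of-80 case
might look roughly like the following. This is a sketch, not a tested
config: the defaultsite value, the 127.0.0.1 peer address and the
name=apache_backend label are placeholders for this example, and any
access rules (http_access, cache_peer_access) are left out.

```
# Squid listens on 8081 as an accelerator. vport=80 makes the Host: header
# and URL passed to the backend carry port 80 instead of 8081.
http_port 8081 accel defaultsite=192.168.13.31 vport=80

# The third parameter (80) is the origin server's HTTP port. The fourth (0)
# is the ICP port, unused here because of no-query.
cache_peer 127.0.0.1 parent 80 0 no-query originserver name=apache_backend
```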

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.11
   Beta testers wanted for 3.2.0.5
Received on Fri Feb 18 2011 - 22:47:13 MST
