Re: [squid-users] cache dynamically generated images

From: Charles Galpin <cgalpin_at_lhsw.com>
Date: Tue, 22 Feb 2011 11:36:07 -0500

Hi Amos, thanks so much for the help. More questions and clarification needed, please.

On Feb 18, 2011, at 5:47 PM, Amos Jeffries wrote:
>
> Make sure your config has had these changes:
> http://wiki.squid-cache.org/ConfigExamples/DynamicContent
>
> which allows Squid to play with query-string (?) objects properly.

Yes these were default settings for me. I don't think this is
necessarily an issue for me though since I am sending URLs that look
like static image requests, but converting them via mod_rewrite in
apache to call my script.

> TCP_REFRESH_MISS means the backend sent a new changed copy while
> revalidating/refreshing its existing copy.
>
> max-age=0 means revalidate that it has not changed before sending anything.
>
>> I have set an Expires, Etag, "Cache-Control:
>> max-age=600, s-max-age=600, must-revalidate", "Content-Length

>
> must-revalidate from the server is essentially the same as max-age=0
> from the client. It will also lead to TCP_REFRESH_MISS.

I'll admit I threw in the must-revalidate as part of my increasingly
desperate attempts to get things behaving the way I wanted, and didn't
fully understand its ramifications, nor the client-side max-age=0
implications, but your explanation helps!

> BUT, these controls are only what is making the problem visible. The
> server logic itself is the actual problem.

Agreed!

> ETag should be the MD5 checksum of the file or something similarly
> unique. It is used alongside the URL to guarantee version differences
> are kept separate.
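
A minimal sketch of the ETag advice above, written in Python for brevity (the test script in this thread is Perl and the real implementation is planned in Java). The function name compute_etag is hypothetical, and hashing the full image bytes with MD5 is only one way to get a value that changes whenever the content changes.

```python
import hashlib


def compute_etag(content: bytes) -> str:
    """Return an entity tag derived from the response body.

    The tag is the MD5 hex digest of the bytes, wrapped in double
    quotes as the ETag header syntax requires.
    """
    return '"' + hashlib.md5(content).hexdigest() + '"'
```

The returned string would be sent as the value of the ETag response header. Two different versions of the same image URL then get different tags, which is what lets a cache keep them apart.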

Yes, this was another desperate attempt to force caching to occur, and I
will implement something more sane for the actual app. But this should
have helped, shouldn't it? For my testing this should have uniquely
identified this image, right?

I guess I have a fundamental misunderstanding, but my assumption was
all these directives were ways to tell squid to not keep asking the
origin, but serve from the cache until the age expired and at that
point check if it changed. I totally didn't expect it to check every
time, and this still doesn't sit well with me. Should it really check
every time? I know a check is faster than an actual GET but it still
seems more than necessary if caching parameters have been specified.

> Your approach is reasonable for your needs. But the backend server
> system is letting you down by sending back a new copy every validation.
> If you can get it to present 304 not-modified responses between file
> update times this will work as intended.
>
> This would mean implementing some extra logic in the script to handle
> If-Modified-Since, If-Unmodified-Since, If-None-Match and If-Match
> headers.
> The script itself needs to be in control of whether a local static
> duplicate is used, apache does not have enough info to do it as you
> noticed. Most CMS call this server-side caching.
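
The extra logic described above could look roughly like the following Python sketch. It is an illustration under assumptions, not the actual script: the function name is_not_modified and its parameters are hypothetical, it only covers If-None-Match and If-Modified-Since (the two headers that lead to a 304), and it leaves out If-Match and If-Unmodified-Since.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    """Drop a leading W/ so tags can be compared weakly."""
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if a 304 Not Modified may be sent instead of the body.

    request_headers: dict mapping request header names to values
    etag: current entity tag of the image, including its quotes
    last_modified: timezone-aware datetime of the last change
    """
    headers = {name.lower(): value for name, value in request_headers.items()}

    # If-None-Match takes precedence over If-Modified-Since when both are sent.
    if_none_match = headers.get("if-none-match")
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates:
            return True
        return any(_strip_weak(c) == _strip_weak(etag) for c in candidates)

    if_modified_since = headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: ignore the header, send the body
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        return last_modified.replace(microsecond=0) <= since

    # No validators in the request: send the full response.
    return False
```

When the function returns True, the script would answer with status 304 and no body (still sending the ETag header); otherwise it would send the image with status 200 as before.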

OK, I can return 304 and it gets a cache hit as expected, so this is
great. I am not sure I'll spend any time making my test script any
smarter, as it's just a simple Perl script and the actual implementation
will be in Java and will be able to make these determinations. One of
the things that has been throwing me off, though, is that I see no sign
in the Apache logs of a HEAD request; they all show up as GETs. I assume
this is my mod_rewrite rule, but I also tried with a direct URL to the
script and am not getting the If-Modified-Since header, for example (the
only one I know off the top of my head is set by the CGI module).

But either way, this confirms it's just my dumb script to blame :)

>>
>> Lastly, I was unable to set up squid on an alternate port - say 8081, and
>> use an existing apache on port 80, both on the same box. This is for
>> testing so I can run squid in parallel with the existing service without
>> changing the port it is on. Squid seems to want to use the same port
>> for the origin server as itself and I can't figure out how to say
>> "listen on 8081 but send requests to port 80 of the origin server". Any
>> thoughts on this? I am using another server right now to get around
>> this, but it would be more convenient to use the same box.
>
> cache_peer parameter #3 is the port number on the origin server to
> send HTTP requests to.
>
> Also, to make the Host: header and URL contain the right port number
> when crossing ports like this you need to set the http_port vport=X
> option to the port the backend-server is using. Otherwise Squid will
> place its public-facing port number in the Host: header to inform the
> backend what the client's real URL was.

Yes, I have this, but it's still not working. Below are all uncommented
lines in my squid.conf - can you see anything I have that's messing this
up? The imageserver.my.org is an apache virtual host if it matters. With
this, if I go to http://imageserver.my.org:8081/my/image/path.jpg ,
squid calls http://imageserver.my.org:8081/my/image/path.jpg instead of
http://imageserver.my.org:80/my/image/path.jpg

acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
acl http8081 port 8081
acl local-servers dstdomain .my.org
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 8081 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access allow http8081
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 8081 vhost vport=80 defaultsite=imageserver.my.org
cache_peer imageserver.my.org parent 80 0 no-query originserver default
hierarchy_stoplist cgi-bin ?
access_log c:/squid/var/logs/access.log squid
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
acl shoutcast rep_header X-HTTP09-First-Line ^ICY.[0-9]
upgrade_http0.9 deny shoutcast
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
always_direct allow all
always_direct allow local-servers
coredump_dir c:/squid/var/cache

thanks,
charles
Received on Tue Feb 22 2011 - 16:36:13 MST
