RE: [squid-users] Caching a Script

From: David Groden <squid-cache@dont-contact.us>
Date: Fri, 19 Jul 2002 17:40:28 -0500

Did anyone else see my last message and have any comments, please? I know
it's a pretty long and broad set of questions, but surely some people have
something to say about it.

Duane, again, the "Expires" header must not be included in squid's response,
and "that particular script" is the only file on my website, so it's the
only one I'm talking about.

Thank you,
David Groden

-----Original Message-----
From: Duane Wessels [mailto:wessels@squid-cache.org]
Sent: Thursday, July 18, 2002 11:45 PM
To: David Groden
Cc: squid-users@squid-cache.org
Subject: Re: [squid-users] Caching a Script

On Thu, 18 Jul 2002, David Groden wrote:

> The only way I managed to get a
> "TCP_HIT" for my script from squid was to drop a couple of lines in
> httpd.conf that made apache send out an "Expires: (now plus 1 month)"
> header with the script. As soon as I did that, it was all "TCP_HIT"'s.
> Now, squid

This is the right thing to do, IMO.

> is successfully returning the cached version of hits to my script, but the
> side effect of the "Expires" header, which is to cache the hit at every
> cache along the way as well as the client's browser (and browser memory!),
> is unacceptable.

You can put the "Expires: (now plus 1 month)" directive in an .htaccess
file, or maybe inside a <Location> wrapper in httpd.conf so it
only applies to that particular script.

-----Original Message-----
From: David Groden [mailto:squid-cache@dgroden.com]
Sent: Thursday, July 18, 2002 10:46 PM
To: squid-users@squid-cache.org
Subject: [squid-users] Caching a Script

Packages (RedHat 7.3):
squid 2.4.STABLE6-6.7.3
squirm 1.23-7
apache 1.3.23-14

Need:
My website is only one script (http://xxx/xxx.xxx). The querystring or post
data determines what content is displayed. To save loading time and
processor usage, I want to cache certain hits to my script that match
certain querystrings.

Changes I've made in squid.conf:
1. I want to run apache on another port and squid on port 80 in "httpd
accelerator" mode so I:
   changed some acl's under "ACCESS CONTROLS"
   changed "http_port" from "3128" to "80"
   changed "icp_port" from "3130" to "0"
   changed "httpd_accel_port" from "80" to "virtual"
   changed "httpd_accel_uses_host_header" from "off" to "on"

2. I need it to cache hits with querystrings so I:
   commented out the line "acl QUERY urlpath_regex cgi-bin \?"
   commented out the line "no_cache deny QUERY"
   changed "hierarchy_stoplist" from "cgi-bin ?" to "cgi-bin"

3. I'm using squirm so I:
   added the line "redirect_program /usr/lib/squid/squirm"
   changed "redirect_rewrites_host_header" from "on" to "off"
   changed "redirect_children" from "5" to "10"
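
With the QUERY acl from step 2 out of the way, every querystring becomes a
candidate for caching, so the remaining job is picking which ones qualify. The
sketch below only illustrates that selection logic in Python; it is not squid
configuration. The parameter names ("page", "report") and the patterns are
hypothetical, since the thread does not give the real querystrings. In
squid.conf the analogous tool would presumably be a "urlpath_regex" acl paired
with "no_cache", like the two lines commented out above.

```python
import re

# Hypothetical patterns: the real querystring names are not given in the
# thread, so "page" and "report" are placeholders for illustration only.
CACHEABLE_PATTERNS = [
    re.compile(r"^/xxx\.xxx\?page=(home|about)(&|$)"),
    re.compile(r"^/xxx\.xxx\?report=\d+$"),
]


def is_cacheable(url_path: str) -> bool:
    """Return True if the path plus querystring matches a cacheable pattern."""
    return any(p.search(url_path) for p in CACHEABLE_PATTERNS)


if __name__ == "__main__":
    for path in ("/xxx.xxx?page=home", "/xxx.xxx?page=cart", "/xxx.xxx?report=42"):
        print(path, is_cacheable(path))
```

Because the script always emits its parameters in the same order, anchored
patterns like these should be enough; no normalisation of the querystring is
attempted.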

So what are the problems?
1. How do I make squid decide whether or not to cache a hit by examining the
querystring? Is that part of squid, squirm, both, or neither? How so? Do you
have any examples? I plan on exploring this more if I can ever get past my
next problem, which is...

2. Squid kept generating a "TCP_MISS" for hits to my script and grabbing a
fresh copy from apache. I AM ASSUMING <--red flag? :) that this was because
the "Last-Modified" header returned with my script by apache changes every
time to the current time, causing squid to "miss" on the IMS
(If-Modified-Since) check. Am I right? The only way I managed to get a
"TCP_HIT" for my script from squid was to drop a couple of lines in
httpd.conf that made apache send out an "Expires: (now plus 1 month)" header
with the script. As soon as I did that, it was all "TCP_HIT"'s. Now, squid
is successfully returning the cached version of hits to my script, but the
side effect of the "Expires" header, which is to cache the hit at every
cache along the way as well as the client's browser (and browser memory!),
is unacceptable. I see that squid can remove headers from outgoing requests
with the "anonymize_headers" option, but it can't remove or modify the
headers of outgoing responses. Is there a patch or feature available that
allows squid to do this? ...Or am I just going about this all wrong
(wouldn't be surprising)? I'm assuming that the only reason my script wasn't
being returned from cache was because of the "Last-Modified" header, and the
"Expires" header is overriding the IMS check and generating a "TCP_HIT". Is
this correct? If so, maybe I'm supposed to make apache always return a
specific "Last-Modified" header for my script instead of generating a new
one each time?
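
To make the Expires versus Last-Modified question concrete, here is a
simplified model of the freshness rules as the comments for "refresh_pattern"
in squid.conf describe them. It is a sketch in Python, not squid's source, and
the thread does not confirm that this is the actual cause of the TCP_MISS
entries. The function name and arguments are made up for illustration. The
defaults assume the stock "refresh_pattern . 0 20% 4320" line (min and max in
minutes, converted to seconds here).

```python
from typing import Optional


def is_fresh(
    now: int,
    fetched: int,
    expires: Optional[int],
    last_modified: Optional[int],
    min_age: int = 0,           # refresh_pattern "min", in seconds
    percent: float = 0.20,      # refresh_pattern "percent"
    max_age: int = 4320 * 60,   # refresh_pattern "max", in seconds
) -> bool:
    """Simplified freshness check; all times are Unix seconds."""
    age = now - fetched
    # An explicit Expires header decides the matter on its own.
    if expires is not None:
        return expires > now
    # Older than max: always stale.
    if age > max_age:
        return False
    # LM-factor rule: fresh while age is a small fraction of how old the
    # object already was when it was fetched.
    if last_modified is not None and fetched > last_modified:
        return age < (fetched - last_modified) * percent
    # No usable Last-Modified: fall back to the minimum age.
    return age < min_age


if __name__ == "__main__":
    # Last-Modified stamped "now" on every response, no Expires.
    print(is_fresh(now=1001, fetched=1000, expires=None, last_modified=1000))
    # Same response with Expires set 30 days ahead.
    print(is_fresh(now=1001, fetched=1000, expires=1000 + 30 * 86400,
                   last_modified=1000))
```

In this model a response whose Last-Modified equals its fetch time gets no
heuristic lifetime at all, so it is stale one second later unless an Expires
header is present or the minimum age is non-zero. That is consistent with what
I observed, but it is only a model.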

What you need to know before replying: :)
The website is very complex with all sorts of different areas and screens
and involves a couple hundred databases. Please understand that the script
is not flexible, and I am only asking questions about the use and
functionality of squid itself. Also, yes, I understand that different orders
of the values in the querystring will create redundant files in the cache.
It won't be a problem. The only links in the world to my script containing a
querystring are generated by the script itself and are always in the same
order. I know I'm probably like the ten millionth person to want to use a
cache as an accelerator to speed up those (less than) dynamic pages that
normally take 20 seconds or more to process. Every one of those ridiculously
priced website accelerators out there tout this as one of their "most
exciting" features. Yet, the only web pages and discussions I've found that
concern using squid in this manner seem to revolve around arguing about the
principle of caching cgi's, like "fix your slow script" or "make your script
generate a hard coded page, blah blah blah." ...and other such gibberish.
Whether it's possible with squid, and how, still seems to be unanswered. At
least I haven't found the whole answer in one place yet, though I have seen
the question posed about a hundred times. Could some nice person please
reveal the secret few steps it takes to selectively cache a script using
querystrings with squid to myself and the rest of the people out there who
don't want to spend $10,000 on a $400 server running some "proprietary
software" that looks suspiciously like squid?

Thank you,
David
Received on Fri Jul 19 2002 - 16:40:44 MDT
