Re: [squid-users] Accelerating proxy not matching cgi files

From: Amos Jeffries <>
Date: Fri, 26 Aug 2011 02:22:34 +1200

On 25/08/11 21:38, Mateusz Buc wrote:
> 2011/8/24 Amos Jeffries<>:
>> Maybe. We would need to see the HTTP headers produced by gen.cgi to be sure.
>> From the description of how index.cgi/gen.cgi interact I think it highly
>> likely the lack of Cache-Control and Last-Modified information from gen.cgi
>> is causing the cache algorithms to determine it's unsafe to store.
> I gained access to the code of gen.cgi and made a few changes:
> printf("Cache-Control: max-age=600, s-maxage=300\n");
> printf("Last-Modified: %s\n",mdate);
> It now fetches timestamp from the URL, parses it to appropriate format
> and then outputs as Last-Modified header. Plus I added Cache-Control.
> Results are noticeable - now I get mostly TCP_REFRESH_UNMODIFIED/304
> on my test page (gen.cgi links don't change there, so all timestamps
> remain the same all the time).
> Thank you a lot for these suggestions!
> However, I still can't make these URLs/images cached on my squid. Is
> there any chance they can be served directly from squid cache when
> they do not change? Right now I have reduced network bandwidth
> obviously, but not sure about CPU load - it still takes almost the
> same time to load URL (about 8 seconds).

Halfway there. Stage 1 complete after a fashion.

  - TCP_ = TCP transport used
  - REFRESH = If-Modified-Since sent to origin (aka gen.cgi)
  - UNMODIFIED = full object came back. Headers + body apparently
identical to the known cached copy.
  - /304 = converted to a 304 "no change" response for the client half
of the transaction.

The 304 portion going across client<->Squid is where you are getting
*all* the bandwidth savings right now.

As I said earlier:

>> At this point incoming requests will either be requesting brand new content or
>> have an If-Modified-Since: header containing the cached object's Last-Modified: timestamp.
>> NOTE: You will not _yet_ see any reduction in the 200 requests. Potentially you might
>> actually see an increase as "must-revalidate" causes middleware caches to start working better.

The difference between what you are seeing and what I predicted is caused
by your use of max-age instead of must-revalidate.

  max-age allows the browsers to cache the graphs for 600 seconds. So
you will get _zero_ repeat traffic for that duration. The exact opposite
of what must-revalidate will do for you.
  On top of that you cannot see Squid serving HIT requests because of
s-maxage. It's set at 300, so the Squid copy will expire before the browser
cache does. When the browser _does_ send an IMS request, the Squid copy has
already expired, which forces a contact to gen.cgi to check for updates.
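
The timing above can be sketched as a small freshness check. This is only an illustration of the general HTTP rule that s-maxage overrides max-age for shared caches, not Squid's actual code; the function name and the use of a negative value to mean "s-maxage absent" are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical helper: may a shared cache (such as Squid) answer from
 * its stored copy without contacting the origin?
 * Pass s_maxage < 0 to mean "no s-maxage directive was sent". */
bool shared_cache_is_fresh(long age_seconds, long max_age, long s_maxage)
{
    /* For shared caches s-maxage, when present, overrides max-age. */
    long lifetime = (s_maxage >= 0) ? s_maxage : max_age;
    return age_seconds < lifetime;
}
```

With max-age=600 and s-maxage=300, a browser that comes back once its own 600 seconds have run out finds a Squid copy that is at least 600 seconds old, so the check fails and Squid has to revalidate with gen.cgi.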

Okay fine, use max-age and s-maxage. To get HITs under the current
circumstances set s-maxage larger than max-age. Or omit it and have
Squid cache for the same length of time as any browser. Squid's cache is
shared by all clients, so you will get some more HITs, but not a lot more.
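
As a rough sketch of that change on the gen.cgi side, the two header lines could be produced by a helper like the one below. The function name is hypothetical, the value 900 used in the usage note is just an example of "larger than 600", and mdate is assumed to already hold a correctly formatted HTTP date, as in your existing code.

```c
#include <stdio.h>

/* Hypothetical helper: format the caching headers into buf.
 * Returns the snprintf() result (number of characters needed). */
int format_cache_headers(char *buf, size_t size,
                         long max_age, long s_maxage, const char *mdate)
{
    return snprintf(buf, size,
                    "Cache-Control: max-age=%ld, s-maxage=%ld\n"
                    "Last-Modified: %s\n",
                    max_age, s_maxage, mdate);
}
```

Calling it with max_age=600 and s_maxage=900 gives "Cache-Control: max-age=600, s-maxage=900", so the Squid copy is still fresh when a browser's own 600 seconds run out.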

> Do you have any further tips?

Just this: Keep going.

  You are roughly up to the end of Step 1 of my earlier instructions.
Step 2 is where the CPU benefits start appearing.

  Every time gen.cgi can decide that If-Modified-Since is not older than
the graph data, it saves all the graph production CPU time AND the graph
size worth of bandwidth.
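
A minimal sketch of what that check might look like in gen.cgi follows. It assumes the dates are in the usual "Thu, 25 Aug 2011 14:22:34 GMT" form, that the web server passes the request header to the CGI in the HTTP_IF_MODIFIED_SINCE environment variable, and that a "Status:" line is how the CGI signals the 304. All function names are made up for the example; I have not seen your code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Turn "Thu, 25 Aug 2011 14:22:34 GMT" into a sortable number
 * YYYYMMDDhhmmss. Returns -1 if the string does not parse. */
static long long http_date_key(const char *s)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char wday[4], mon[4], tz[4];
    int day, year, hh, mm, ss;
    const char *p;

    if (s == NULL)
        return -1;
    if (sscanf(s, "%3s, %d %3s %d %d:%d:%d %3s",
               wday, &day, mon, &year, &hh, &mm, &ss, tz) != 8)
        return -1;
    if (strcmp(tz, "GMT") != 0 || strlen(mon) != 3)
        return -1;
    p = strstr(months, mon);
    if (p == NULL || (p - months) % 3 != 0)
        return -1;                       /* not a real month name */
    int month = (int)((p - months) / 3) + 1;
    return ((((year * 100LL + month) * 100 + day) * 100 + hh) * 100 + mm)
           * 100 + ss;
}

/* True when the client's copy is at least as new as the graph data. */
bool not_modified(const char *if_modified_since, const char *last_modified)
{
    long long ims = http_date_key(if_modified_since);
    long long lm = http_date_key(last_modified);
    return ims >= 0 && lm >= 0 && ims >= lm;
}

/* Writes a bodyless 304 and returns 1 when graph generation can be
 * skipped; returns 0 when the full graph must be produced. */
int try_reply_not_modified(FILE *out, const char *ims, const char *mdate)
{
    if (!not_modified(ims, mdate))
        return 0;
    fprintf(out, "Status: 304 Not Modified\n");
    fprintf(out, "Last-Modified: %s\n\n", mdate);
    return 1;
}
```

In gen.cgi you would call something like try_reply_not_modified(stdout, getenv("HTTP_IF_MODIFIED_SINCE"), mdate) before any graph work starts, and exit straight away when it returns 1. A missing or unparseable header falls through to the normal full response.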


Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.14
   Beta testers wanted for
Received on Thu Aug 25 2011 - 14:22:47 MDT
