Re: [squid-users] Problem with HTTP Headers

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 13 Nov 2011 13:26:03 +1300

On 13/11/2011 12:15 p.m., Ghassan Gharabli wrote:
> Hello Amos,
>
> I understand what you wrote to me but I really do not have any rule
> that tells squid to cache the www.facebook.com headers.

According to http://redbot.org/?uri=http%3A%2F%2Fwww.facebook.com%2F

FB front page has Expires, no-store, private, and must-revalidate. Squid
should not be caching it at all unless somebody has maliciously erased
those control headers, or your Squid has ignore-* and override-*
refresh_patterns for them (I did not see any in your config, which is good).

Can you use:
    squidclient -m HEAD http://www.facebook.com/

to see if those headers you get match the ones apparently being sent by
the FB server.
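
For reference, based on what redbot reports, the reply headers should
look roughly like this (a sketch; exact values will differ):

    HTTP/1.1 200 OK
    Cache-Control: private, no-store, must-revalidate
    Expires: Sat, 01 Jan 2000 00:00:00 GMT

If squidclient shows the reply without those control headers, something
between your Squid and FB is stripping them.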

>
> I only used refresh_pattern to match Pictures, Videos & certain
> extensions by using ignore-must-revalidate, ignore-no-store,
> ignore-no-cache, store-stale, etc.
>
> and how come this rule doesn't work?
>
> refresh_pattern -i \.(htm|html|jhtml|mhtml|php)(\?.*|$) 0 0% 0
>
> This rule tells squid not to cache these extensions, whether the URL
> is static or dynamic.

The refresh_pattern algorithm only gets used *if* there are no Expires
or Cache-Control headers stating specific freshness information, such as
"private", "no-store", or "Expires: Sat, 01 Jan 2000 00:00:00 GMT".

>
> As I noticed, every time you open a website, for example
> www.mtv.com.lb, then try to open it again the next day, you get the
> same (yesterday's) news. That confused me and led me to think that
> maybe Squid ignores all headers related to a website once you have
> cached, for example, its pictures and multimedia objects. That is why
> I was asking which rule might be affecting websites.
>
> I can't spend my time adding websites to a "cache deny" list as they
> turn up cached, so I thought of just removing the rule that caused
> squid to cache websites.
>
> How do I make Squid not cache www.facebook.com, while at the same
> time caching pictures, FLV videos, CSS and JS, but not the main page
> itself (HTML/PHP)?

With this config:
    acl facebook dstdomain .facebook.com
    acl facebookPages urlpath_regex -i \.([jm]?htm[l]?|php)(\?.*|$)
    acl facebookPages urlpath_regex -i /(\?.*|$)
    cache deny facebook facebookPages

and remove all the refresh_patterns you had about FB content.

This will cause any FB HTML objects which *might* have been cacheable
to be skipped by your Squid cache.
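
To verify the deny is working you could repeat the earlier check
(assuming the default X-Cache reply header from Squid is enabled):

    squidclient -m HEAD http://www.facebook.com/
    # repeat a few times; X-Cache should keep saying
    # "MISS from <your proxy>" for the front page, never a HIT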

Note that FLV videos on FB often come directly from YouTube, so they
are not easily cached. The JS and CSS will retain the static/dynamic
properties FB assigns them. You have generic refresh_pattern rules
later on in your config which extend their normal storage times a lot.

>
> refresh_pattern ^http:\/\/www\.facebook\.com$ 0 0% 0
>
> I tried to use "$" after .com as I only wanted to avoid caching the
> main page of Facebook; I still want to cache pictures and videos on
> Facebook, and likewise on other websites.

And as I said, the main page is not "http://www.facebook.com" but
"http://www.facebook.com/", so you should have added "/$" instead of
just "$".
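
That is, something like this (a sketch; it still only matters for
replies without explicit Expires/Cache-Control headers):

    refresh_pattern ^http:\/\/www\.facebook\.com\/$ 0 0% 0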

BUT, using "cache deny" as above, this is no longer relevant anyway.

Amos