Re: [squid-users] Problem with HTTP Headers

From: Ghassan Gharabli <sounarose_at_googlemail.com>
Date: Wed, 16 Nov 2011 14:26:17 +0200

Hello again,

Sorry, I replied too quickly before, without noticing whether your rule
included the "/" or not. At first I did not want to ignore "/?" because I
am caching several websites with URLs like name.flv/?.* so now I am using:

acl ExceptExt urlpath_regex -i (mp(3|4)|flv)/(\?.*)
acl facebook dstdomain .facebook.com
acl facebookPages urlpath_regex -i \.([jm]?htm[l]?|php)(\?.*|$)
acl facebookPages urlpath_regex -i /(\?.*|$)
cache deny facebook facebookPages !ExceptExt
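
To sanity-check these rules outside Squid, here is a rough Python sketch of
how I understand the "cache deny" line to combine the ACLs (all three
conditions must hold). It assumes that urlpath_regex is tested against the
path plus query string and that Python's "re" behaves closely enough to
Squid's regex library for these patterns. The helper names (url_path,
is_facebook, cache_denied) are my own and are not part of Squid.

```python
import re
from urllib.parse import urlsplit

# Patterns copied from the ACL lines above; "-i" maps to re.IGNORECASE.
EXCEPT_EXT = re.compile(r"(mp(3|4)|flv)/(\?.*)", re.I)
FACEBOOK_PAGES = [
    re.compile(r"\.([jm]?htm[l]?|php)(\?.*|$)", re.I),
    re.compile(r"/(\?.*|$)", re.I),
]


def url_path(url):
    """Return path plus query string (assumed input of urlpath_regex)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def is_facebook(host):
    """Rough stand-in for 'dstdomain .facebook.com'."""
    host = host.lower()
    return host == "facebook.com" or host.endswith(".facebook.com")


def cache_denied(url):
    """Mimic 'cache deny facebook facebookPages !ExceptExt'."""
    host = urlsplit(url).hostname or ""
    path = url_path(url)
    # Several lines with the same ACL name are OR-ed together.
    in_pages = any(p.search(path) for p in FACEBOOK_PAGES)
    excepted = EXCEPT_EXT.search(path) is not None
    # ACLs on one access line are AND-ed; "!" negates.
    return is_facebook(host) and in_pages and not excepted
```

With this sketch, cache_denied("http://www.facebook.com/?v=1244&y=n") is
true, while a name.flv/?... URL on the same domain is let through by
ExceptExt.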

Actually, I started to see Facebook.com in the cache since they changed
to https://www.facebook.com. So far, all servers that have the same
settings are no longer caching the Facebook main page header, except one
server. Maybe one of the clients is infected with malware!

It is only being cached when one of the clients opens Facebook, because
I already opened Facebook myself and it was not cached on this server!

> As you wish. I added that line because I noticed the front page for FB that
> you wanted to exclude from the cache has a URL path starting with the two
> characters "/?" instead of ending in .html or .php.
>

How can I debug or trace a URL path that starts with "/?", and how did
you notice that the FB front page URL includes the two characters "/?"?
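
For anyone following along, one possible way to trace this is to scan
access.log for requests whose path is just "/" followed by a query string.
This is only a sketch: it assumes the default native "squid" log format,
where the method is the 6th whitespace-separated field and the URL is the
7th, and the function name slash_query_urls is my own.

```python
import sys
from urllib.parse import urlsplit


def slash_query_urls(lines):
    """Yield (method, url) for requests whose URL path starts with "/?"."""
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete native-format log line
        method, url = fields[5], fields[6]
        # Path is exactly "/" and a query separator is present.
        if urlsplit(url).path == "/" and "?" in url:
            yield method, url


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py <path to access.log>
    with open(sys.argv[1], errors="replace") as log:
        for method, url in slash_query_urls(log):
            print(method, url)
```

Run it with the path to your access.log as the only argument. If a custom
logformat is configured, the field positions will need adjusting.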

BTW, I am trying my best to tune the Perl script that I wrote, and yes, I
am gaining much more traffic and performance by reducing the number of
rules and trying to match the targeted URLs with fewer lines.

I am still studying regex, and honestly Squid has saved me 54% of traffic,
and I can get more than that.

You have saved me a lot of time debugging it.

Thank you again.

Ghassan

On Mon, Nov 14, 2011 at 12:57 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> On Sun, 13 Nov 2011 19:14:48 +0200, Ghassan Gharabli wrote:
>>
>> Dear Amos,
>>
>> After allowing access to the "HEAD" method in the Squid config,
>>
>> I deleted www.facebook.com from the cache and then tried executing
>>
>> squidclient -m head http://www.facebook.com
>>
>> Results :
>>
>> HTTP/1.0 302 Moved Temporarily
>> Location: http://www.facebook.com/common/browser.php
>> P3P: CP="Facebook does not have a P3P policy. Learn why here:
>> http://fb.me/p3p"
>> Set-Cookie: datr=hfW_TtrAQmi_2SxwAUY4EjPH; expires=Tue, 12-Nov-2013
>> 16:51:17 GMT
>> ; path=/; domain=.facebook.com; httponly
>> Content-Type: text/html; charset=utf-8
>> X-FB-Server: 10.53.10.59
>> X-Cnection: close
>> Content-Length: 0
>> Date: Sun, 13 Nov 2011 16:51:17 GMT
>> X-Cache: MISS from Peer6.skydsl.net
>> X-Cache-Lookup: MISS from Peer6.skydsl.net:3128
>> Connection: close
>>
>> I am not seeing any Pragma, Cache-Control or Expires headers! But redbot
>> shows the correct info there.
>
> Ah, your squidclient is not sending a user-agent header. You will need to
> add -H "user-Agent: foo"
>
>>
>> BTW, I am also using store_url, but I'm sure nothing is wrong there. I
>> am only rewriting dynamic URLs for picture and video extensions, so I
>> have only one thing left to try, which I would rather not do:
>>
>> acl facebookPages urlpath_regex -i /(\?.*|$)
>>
>> First does this rule affect store_url?
>
> This is just a pattern definition. It only has effect where and when the ACL
> is used. The config I gave you only used it in the "cache deny" access line.
>
> That said, "cache deny" prevents things going to the cache, where storeurl*
> happens.
>
>
>>
>> For example when we have url like
>>
>> http://www.example.com/1.gif?v=1244&y=n
>>
>> I can see that urlpath_regex requires Full URL which means this rule
>> matches :
>>
>> http://www.example.com/stat?date=11
>
> The pattern begins with '/' and the "cache" access line I gave you included
> another ACL, which tested that the domain name was *.facebook.com.
>
> It will match things like:
>  http://www.facebook.com/?v=1244&y=n
>
> but *not* match things like:
>  http://www.example.com/1.gif?v=1244&y=n
>
>>
>> I will try to ignore this rule and let me focus on facebook problem
>> since we have more than 60% traffic on Facebook.
>>
>
> As you wish. I added that line because I noticed the front page for FB that
> you wanted to exclude from the cache has a URL path starting with the two
> characters "/?" instead of ending in .html or .php.
>
> Amos
>
>
Received on Wed Nov 16 2011 - 12:26:24 MST
