Re: [squid-users] reverse proxy filtering?

From: Jeff Sadowski <jeff.sadowski_at_gmail.com>
Date: Sun, 19 Apr 2009 00:29:00 -0600

On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> Jeff Sadowski wrote:
>>
>> On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries <squid3_at_treenet.co.nz>
>> wrote:
>>>
>>> Jeff Sadowski wrote:
>>>>
>>>> I'm new to trying to use squid as a reverse proxy.
>>>>
>>>> I would like to filter out certain pages and if possible certain words.
>>>> I installed perl so that I can use it to rebuild pages if that is
>>>> possible?
>>>>
>>>> My squid.conf looks like so
>>>> <==== start
>>>> acl all src all
>>>> http_port 80 accel defaultsite=outside.com
>>>> cache_peer inside parent 80 0 no-query originserver name=myAccel
>>>> acl our_sites dstdomain outside.com
>>>
>>> aha, aha, ..
>>>
>>>> http_access allow all
>>>
>>> eeek!!
>>
>> I want everyone on the outside to see the inside server minus one or
>> two pages. Is that not what I set up?
>
> By lucky chance of some background defaults only, and assuming that the web
> server is highly secure on its own.
>
> If you have a small set of sites, such as those listed in "our_sites", then
> it's best to be certain and use that ACL for the allow as well.
>
>  http_access allow our_sites
>  http_access deny all
>
> ... same on the cache_peer_access below.
>
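Ok, so if I follow, the access part of my config should end up looking
something like this (just my reading of your suggestion, not tested yet):

  acl our_sites dstdomain outside.com

  http_access allow our_sites
  http_access deny all

  cache_peer_access myAccel allow our_sites
  cache_peer_access myAccel deny all
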
>>
>>>> cache_peer_access myAccel allow all
>>>> <==== end
>>>>
>>>> how would I add it so that for example
>>>>
>>>> http://inside/protect.html
>>>>
>>>> is blocked?
>>>
>>> http://wiki.squid-cache.org/SquidFaq/SquidAcl
>>
>> so I want redirector_access?
>> Is there an example line of this in a file?
>>
>> I tried using
>>
>> url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl
>>
>> but I guess that requires more to use it? An acl?
>> Should "acl all src all" be "acl all redirect all"?
>
> No to all three. The line you mention trying is all that's needed, together
> with:
>
>  url_rewrite_access allow all
>
> but the above should be the default when a url_rewrite_program is set.

so how do you tell it to use the url_rewrite_program with the inside site?
Or does it use the script on all pages passing through the proxy?
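(Guessing from the docs, url_rewrite_access with an acl is probably what
scopes it, e.g. something like this, untested, reusing the our_sites acl
from above:

  url_rewrite_access allow our_sites
  url_rewrite_access deny all
)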

Is this only a rewrite of the URL requested by the web browser?
Ahh, that might answer some of my earlier questions. I never tried
clicking the link after implementing the rewrite script; I was only
hovering over the URL and seeing that it was still the same.
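
Also, re blocking http://inside/protect.html: from that FAQ it looks like I
don't need the rewriter for that at all, just an acl that gets denied before
the allow line, something like (untested):

  acl protected urlpath_regex ^/protect\.html$
  http_access deny protected

since http_access rules are checked in order and the first match wins.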

>
> What is making you think it's not working? And what do the logs say about it?
> Also what is the c:/replace.pl code?
>

<=== start
#!c:\perl\bin\perl.exe
$| = 1;
$replace="<a href=http://inside/login.html.*?</a>";
$with="no login";
while ($INPUT = <>) {
    $INPUT =~ s/$replace/$with/gi;
    print $INPUT;
}
<=== end

I think I see the problem now. I guess I am looking for something else
besides url_rewrite, maybe a full text replacement :-/
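
For the record, my understanding now is that a rewriter only ever sees the
request line, something like "URL client_ip/fqdn ident method" on stdin, and
answers with either a blank line (leave it alone) or a replacement URL. So if
all I wanted was to bounce the protected page somewhere else, the helper would
look more like this (untested guess; the blocked.html target is made up):

<=== start
#!c:\perl\bin\perl.exe
# Sketch of a url_rewrite helper: one request per input line,
# answer with a blank line (no change) or a replacement URL.
$| = 1;                               # unbuffered so squid gets answers right away
while (my $line = <STDIN>) {
    chomp $line;
    my ($url) = split /\s+/, $line;   # first field is the requested URL
    if ($url =~ m{/protect\.html$}i) {
        print "http://outside.com/blocked.html\n";  # made-up replacement page
    } else {
        print "\n";                                 # leave the URL unchanged
    }
}
<=== end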

>
>>
>>>> and is it possible to filter/replace certain words on the site
>>>>
>>>> like replace "Albuquerque" with "Duke City" for an example on all pages?
>>>
>>> No. no. no. Welcome to copyright violation hell.
>>
>> This was an example. I have full permission to do the real translations.
>> I am told to remove certain links/buttons to login pages, thus I
>> replace "<a href=inside>button</a>" with "". Currently I have a
>> pathetic perl script that doesn't support cookies and is going through
>> each set of previous pages to bring up the content. I was hoping squid
>> would greatly simplify this.
>> I was using WWW::Mechanize; I know this isn't the best way, but they
>> just need a fast and dirty way.
>
> Ah, okay. Well, the only ways squid has of doing content alteration are also
> far too heavyweight for that use (coding up an ICAP server and processing
> rules, or a full eCAP adaptor plugin).
>
> IMO you need to kick the webapp developers to make their app do the removal
> under the right conditions. It would solve many more problems than having
> different copies of a page available with identical identifiers.
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
>  Current Beta Squid 3.1.0.7
>
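Thanks. If we do end up needing real content editing I'll look at the ICAP
route; as far as I can tell the squid side would be wired up roughly like this
(Squid 3.0 directive names, untested, and the localhost service URL is made
up), with all the actual HTML rewriting living in the ICAP server itself:

  # service/class names here are arbitrary
  icap_enable on
  icap_service editor respmod_precache 0 icap://127.0.0.1:1344/respmod
  icap_class editor_class editor
  icap_access editor_class allow all
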
Received on Sun Apr 19 2009 - 06:29:08 MDT
