Re: [squid-users] reverse proxy filtering?

From: Jeff Sadowski <jeff.sadowski_at_gmail.com>
Date: Sun, 19 Apr 2009 00:37:22 -0600

On Sun, Apr 19, 2009 at 12:29 AM, Jeff Sadowski <jeff.sadowski_at_gmail.com> wrote:
> On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
>> Jeff Sadowski wrote:
>>>
>>> On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries <squid3_at_treenet.co.nz>
>>> wrote:
>>>>
>>>> Jeff Sadowski wrote:
>>>>>
>>>>> I'm new to trying to use squid as a reverse proxy.
>>>>>
>>>>> I would like to filter out certain pages and if possible certain words.
>>>>> I installed perl so that I can use it to rebuild pages if that is
>>>>> possible?
>>>>>
>>>>> My squid.conf looks like so
>>>>> <==== start
>>>>> acl all src all
>>>>> http_port 80 accel defaultsite=outside.com
>>>>> cache_peer inside parent 80 0 no-query originserver name=myAccel
>>>>> acl our_sites dstdomain outside.com
>>>>
>>>> aha, aha, ..
>>>>
>>>>> http_access allow all
>>>>
>>>> eeek!!
>>>
>>> I want everyone on the outside to see the inside server minus one or
>>> two pages. Is that not what I set up?
>>
>> Only by lucky chance of some background defaults, and assuming that the web
>> server is highly secure on its own.
>>
>> If you have a small set of sites, such as those listed in "our_sites", then
>> it's best to be certain and use that ACL for the allow as well.
>>
>>  http_access allow our_sites
>>  http_access deny all
>>
>> ... same on the cache_peer_access below.
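A minimal sketch of what that tightened setup might look like, reusing the names from the original config posted below (note that cache_peer_access also takes an allow or deny keyword before the ACL, and the peer name there has to match the name= given on cache_peer, so the stray "l" in myAccell would need fixing):

  http_port 80 accel defaultsite=outside.com
  cache_peer inside parent 80 0 no-query originserver name=myAccel
  acl our_sites dstdomain outside.com

  # accelerate only the published site, refuse everything else
  http_access allow our_sites
  http_access deny all

  # same restriction on which requests may be forwarded to the origin peer
  cache_peer_access myAccel allow our_sites
  cache_peer_access myAccel deny all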
>>
>>>
>>>>> cache_peer_access myAccell all
>>>>> <==== end
>>>>>
>>>>> how would I add it so that for example
>>>>>
>>>>> http://inside/protect.html
>>>>>
>>>>> is blocked?
>>>>
>>>> http://wiki.squid-cache.org/SquidFaq/SquidAcl
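The FAQ covers the general pattern; one possible sketch for this specific case, assuming the page is published as /protect.html on the accelerated site and the peer is named myAccel as above, is a urlpath_regex ACL that is denied:

  acl protected urlpath_regex ^/protect\.html$
  http_access deny protected
  cache_peer_access myAccel deny protected

Since http_access rules are matched top to bottom, these deny lines need to sit above the general "http_access allow our_sites" line.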
>>>
>>> So I want redirector_access?
>>> Is there an example line of this in a file?
>>>
>>> I tried using
>>>
>>> url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl
>>>
>>> but I guess that requires more to use it? An acl?
>>> Should "acl all src all" be "acl all redirect all"?
>>
>> No to all three. The line you mention trying is all that's needed.
>>
>>  url_rewrite_access allow all
>>
>> but the above should be the default when a url_rewrite_program  is set.
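So, with the Windows paths from the attempt above, the whole helper hookup reduces to roughly this (url_rewrite_children is optional tuning and 5 is only a sketch value, not a recommendation; the explicit url_rewrite_access line just spells out the default Amos mentions):

  url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl
  url_rewrite_children 5
  url_rewrite_access allow all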
>
> so how do you tell it to use the url_rewrite_program with the inside site?
> Or does it use the script on all pages passing through the proxy?
>
> Is this only a rewrite on the requested URL from the web browser?
> Ah, that might answer some of my earlier questions. I never tried
> clicking on it after implementing the rewrite script; I was only
> hovering over the URL and seeing that it was still the same.
>
>>
>> What is making you think it's not working? And what do the logs say about it?
>> Also, what is the c:/replace.pl code?
>>
>
> <=== start
> #!c:\perl\bin\perl.exe
> $| = 1;
> $replace = "<a href=http://inside/login.html.*?</a>";
> $with = "no login";
> while ($INPUT = <>) {
>     $INPUT =~ s/$replace/$with/gi;
>     print $INPUT;
> }
> <=== end
>
> I think I see the problem now. I guess I am looking for something else
> besides url_rewrite, maybe a full text replacement :-/
>
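That is exactly the limitation: a url_rewrite helper never sees page bodies at all. Squid hands it one request per line (the URL followed by client and method fields) and expects back either a replacement URL or an empty line meaning "leave it alone". A minimal sketch of that interface, where the /login.html pattern and the outside.com redirect target are assumptions for illustration (the 302: prefix, per the 2.7/3.0 helper conventions, asks Squid to send the browser an HTTP redirect instead of silently fetching the other URL):

  #!c:\perl\bin\perl.exe
  # Sketch of a Squid url_rewrite (redirector) helper: one request per line in,
  # one answer per line out.  It rewrites URLs, never page content.
  $| = 1;                                    # unbuffered, so Squid is not left waiting
  while (my $line = <STDIN>) {
      my ($url) = split ' ', $line;          # first field is the request URL
      if ($url =~ m{/login\.html}i) {
          print "302:http://outside.com/\n"; # bounce login requests elsewhere
      } else {
          print "\n";                        # empty line = leave the URL unchanged
      }
  }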
>>
>>>
>>>>> and is it possible to filter/replace certain words on the site
>>>>>
>>>>> like replace "Albuquerque" with "Duke City" for an example on all pages?
>>>>
>>>> No. no. no. Welcome to copyright violation hell.
>>>
>>> This was an example. I have full permission to do the real translations.
>>> I am told to remove certain links/buttons to login pages, thus I
>>> replace "<a href=inside>button</a>" with "". Currently I have a
>>> pathetic perl script that doesn't support cookies and is going through
>>> each set of previous pages to bring up the content. I was hoping squid
>>> would greatly simplify this.
>>> I was using WWW::Mechanize; I know this isn't the best way, but they
>>> just need a fast and dirty way.
>>
>> Ah, okay. Well, the only ways squid has for doing content alteration are
>> also far too heavyweight for that use (coding up an ICAP server and
>> processing rules, or a full eCAP adaptor plugin).
>>
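For the record, the ICAP route on a 3.0-era Squid looks roughly like the sketch below (3.1 renames the last two directives to the adaptation_* family). The icap://127.0.0.1:1344/respmod service is an assumption: it stands in for a separate ICAP server you would still have to write, and that server is where the actual HTML editing would happen.

  icap_enable on
  # respmod_precache: pass origin-server responses to the ICAP service before Squid uses them
  icap_service html_edit respmod_precache 0 icap://127.0.0.1:1344/respmod
  icap_class edit_class html_edit
  icap_access edit_class allow all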

One more thing for the night: if squid is written in C, I think I can
easily modify it to do what I want. The problem then becomes compiling
it for Windows. Can I just use Cygwin?
I'm thinking I can have an external program run on the page before
handing it off to the web client, no?

>> IMO you need to kick the webapp developers to make their app do the removal
>> under the right conditions. It would solve many more problems than having
>> different copies of a page available with identical identifiers.
>>
>> Amos
>> --
>> Please be using
>>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
>>  Current Beta Squid 3.1.0.7
>>
>
Received on Sun Apr 19 2009 - 06:37:29 MDT
