Re: [squid-users] url_regex problem

From: Henrik Nordstrom <henrik_at_henriknordstrom.net>
Date: Fri, 18 Jul 2008 23:46:18 +0200

On Sat, 2008-07-19 at 00:54 +1200, Amos Jeffries wrote:
> Best option is to give up early and take other easier paths, like
> teaching each and every client not to browse porn in the first place. Or
> just blocking all non-plaintext traffic unless its pre-vetted.

Using a soft block has proven to be quite effective (soft == the user is
told the content may be inappropriate according to the policy of use and
then gets the choice to continue if he insists, knowing that the traffic
is logged and inspected..).

In such setups the filter does not need to be very good, or even
accurate. Its job is solely to remind people that there is a policy of
use they need to follow and that their Internet use is monitored for
abuse.
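
For reference, a minimal sketch of such a setup in squid.conf could look
like the following (the ACL name, pattern file path and warning-page URL
are placeholders; the "continue anyway" step usually lives on the warning
page itself, e.g. via a second ACL or an external helper, and is not
shown here):

  # soft block: matching requests get a 302 redirect to a local
  # policy-warning page instead of a bare access-denied error
  acl badwords url_regex -i "/etc/squid/badwords.regex"
  deny_info http://proxy.example.com/policy-warning.html badwords
  http_access deny badwords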

> The full-word pattern looks like this:
> [^a-zA-Z]([a-zA-Z]+)[^a-zA-Z]
> substitute your word for the bracketed part.

There are also regex word boundary conditions which help reduce the
above..

\bbadword\b

works on most regex libraries (certainly anything derived from or
related to GNU regex).

but then you need to think about what a word is in this context... most
likely you will find that nearly none of your patterns is a word in the
context, just partial words, because they are actually concatenated with
some other word without any form of separator..
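
As an illustration, a pattern file such as the /etc/squid/badwords.regex
assumed above (entries are placeholders) might contain boundary-anchored
patterns like:

  \bbadword\b
  \botherword\b

Since \b only matches where a word character ([A-Za-z0-9_]) meets a
non-word character, the first entry matches ".../badword.html" but not a
concatenated hostname like badwordpics.example.com. Dropping the trailing
\b (i.e. \bbadword) would catch such concatenated forms, at the cost of
more false positives.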

> *** Note this is ONLY US-ASCII english words. Mostly useless now that
> URI have been internationalized.

And what a crap way that has been done... but I guess the porn industry
loves it for its obscurity.

Regards
Henrik
