Re: [squid-users] Squid content filtering and redirection

From: Alex Crow <alex_at_nanogherkin.com>
Date: Wed, 10 Nov 2010 17:19:37 +0000

> [please don't top post, please don't fullquote - thank you]
>
> One possible way:
>
> In "squid.conf"
>
> # Schmuddelfilter (smut filter)
> include /etc/squid/conf.d/schmuddel.conf
>
>
> with the file "/etc/squid/conf.d/schmuddel.conf"
>
> # Schmuddelfilter (smut filter)
> acl verboten url_regex "/etc/squid/schmuddel"
> acl ausnahme url_regex "/etc/squid/whitelist"
> http_access allow ausnahme
> http_access deny verboten
>
> and the wordlist files "/etc/squid/schmuddel" and "/etc/squid/whitelist",
> one entry per line.
>
> As far as I know, "squid" can't check the contents; it can only check
> URLs etc.
>
> Best regards!
> Helmut

Dear Toth,

There is not much point in having Squid inspect page contents, as it is
essentially a cache, i.e. it is there to speed things up. That's why you
normally plug in other software to examine or modify page content.
However, as mentioned above, url_regex is ideal and very powerful for
matching on URLs.

For instance, we found that people were searching Google for PHP proxies
and then using them to access Facebook.

Rules such as these:
acl all_disallowed2 url_regex -i proxy\.php
acl all_disallowed2 url_regex -i cgiproxy
acl all_disallowed2 url_regex -i proxy\.cgi
acl all_disallowed2 url_regex -i nph-info\.pl
acl all_disallowed2 url_regex -i proxy\.pl
acl all_disallowed2 url_regex -i facebook
acl all_disallowed2 url_regex -i myspace
acl all_disallowed2 url_regex -i youtube
acl all_disallowed2 url_regex -i proxy_sites
acl all_disallowed2 url_regex -i unblock
acl all_disallowed2 url_regex -i proxies
acl all_disallowed2 url_regex -i webproxy
acl all_disallowed2 url_regex -i hidemy
acl all_disallowed2 url_regex -i google.*\?q\=proxy
acl all_disallowed2 url_regex -i google.*\?q\=proxies
acl all_disallowed2 url_regex -i google.*\?q\=facebook
acl all_disallowed2 url_regex -i google.*\?q\=myspace
acl all_disallowed2 url_regex -i yahoo.*search.*\?p\=proxy
acl all_disallowed2 url_regex -i yahoo.*search.*\?p\=proxies
acl all_disallowed2 url_regex -i ask\.com.*\?q\=proxy
acl all_disallowed2 url_regex -i ask\.com.*\?q\=proxies
acl all_disallowed2 url_regex -i browse\.php\?
acl all_disallowed2 url_regex -i button\.php\?

help stop most of that stuff from being accessed, or even googled for, at
least on a casual basis.
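
Note that the acl lines only define the list; nothing is blocked until an
http_access rule refers to it. A minimal sketch, assuming the deny goes
ahead of whatever allow rules you already have ("localnet" is an assumed
name for the ACL describing your clients, not something from our config):

```
# Deny anything matched by the all_disallowed2 url_regex list.
# http_access rules are checked top to bottom and the first match wins,
# so this has to come before the allow rule for the client network.
http_access deny all_disallowed2

# "localnet" is an assumed ACL name for the local client network.
http_access allow localnet
http_access deny all
```

If the deny were placed after the allow, the allow would match first and
the regex list would never be consulted.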

NB: be very careful with url_regex. The pattern matches anywhere in the
URL, so if you just had, say, "proxy" as a regex, you would not be able to
access a huge amount of content served via CDNs whose hostnames or paths
happen to contain that word. BBC News is one that comes to mind.
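
One way to narrow such a pattern, sketched here under assumptions rather
than taken from our setup: tie the word to a script filename instead of
matching it anywhere, and exempt destinations you know are fine. The ACL
names "proxy_scripts" and "trusted_sites" and the domain entry are purely
illustrative.

```
# Match "proxy" only as a script name at the start of a path segment,
# rather than anywhere in the URL.
acl proxy_scripts url_regex -i /proxy\.(php|cgi|pl)

# Illustrative exception list; with dstdomain a leading dot also
# matches subdomains of the listed domain.
acl trusted_sites dstdomain .bbc.co.uk

# Block the scripts unless the destination is on the exception list.
http_access deny proxy_scripts !trusted_sites
```

You would need to work out which domains actually belong in the exception
list by watching access.log for denied requests.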

Cheers

Alex
Received on Wed Nov 10 2010 - 17:19:45 MST
