Re: [squid-users] Squid Blocking non-listed websites

From: Amos Jeffries <squid3@dont-contact.us>
Date: Sun, 03 Feb 2008 16:17:35 +1300

Go Wow wrote:
> I have attached 3 files named squid.conf sites.txt (whitelist) and
> blacklisted_sites.txt (Blacklist)
>
> okay i will paste my squid.conf here to will be helpful for others
>
> auth_param basic children 5
> auth_param basic realm Squid proxy-caching web server
> auth_param basic credentialsttl 2 hours
> auth_param basic casesensitive off

This looks like auth setup, but it is never activated or used: the config
you pasted has no 'auth_param basic program' line and no proxy_auth ACL.
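
For reference, a minimal sketch of what activating it would involve. The
helper and password file paths are assumptions based on your
/usr/local/squid prefix, and authed_users is just an illustrative ACL
name. Note that proxy authentication cannot work on a port that
intercepts traffic ('transparent'), so this only applies if browsers are
configured to use the proxy explicitly.

```
# Assumed helper and password file locations; adjust to your install.
auth_param basic program /usr/local/squid/libexec/ncsa_auth /usr/local/squid/etc/passwd
# Matches any request that carries valid credentials.
acl authed_users proxy_auth REQUIRED
# Must be placed among your other http_access lines, before the final deny.
http_access allow authed_users
```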

> acl all src 0.0.0.0/0.0.0.0
> acl manager proto cache_object
> acl localhost src 127.0.0.1/255.255.255.255
> acl to_localhost dst 127.0.0.0/8
> acl SSL_ports port 443
> acl Safe_ports port 80 # http
> acl Safe_ports port 21 # ftp
> acl Safe_ports port 443 # https
> acl Safe_ports port 70 # gopher
> acl Safe_ports port 210 # wais
> acl Safe_ports port 1025-65535 # unregistered ports
> acl Safe_ports port 280 # http-mgmt
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 591 # filemaker
> acl Safe_ports port 777 # multiling http
> acl CONNECT method CONNECT
> acl BadSites url_regex -i "/etc/blacklisted_sites.txt"
> acl GoodSites url_regex -i "/etc/sites.txt"
> acl lpo_BadSites url_regex -i "/etc/lpo_blacklisted_sites.txt"

Your first problem is these regexes.

If a pattern matches ANYWHERE in the URI, the request is allowed or
denied on that basis. You have some very short patterns in there that
could be doing strange things to sites passing query parameters in the
URI.

Secondly, regex is slow and will never scale to anything large enough to
do a content-filter job properly. A dstdomain list would be better in
some of those cases: it matches just the requested domain name against
the ACL, and some of your patterns are really domain names anyway.

Thirdly, their position at the top of the access list means that the
allow GoodSites line leaves your proxy open to abuse by anyone who
can/wants to fiddle the URIs or use your permitted sites, e.g. to check
their gmail through you.
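
To illustrate the match-anywhere behaviour, here is a rough sketch. The
'gmail' entry and the example URLs are hypothetical, not taken from your
attached files.

```
# Hypothetical unanchored entry in /etc/sites.txt:
#   gmail
# With "acl GoodSites url_regex -i" both of these URLs would match it:
#   http://www.gmail.com/
#   http://other.example.net/page?ref=gmail
#
# An anchored entry limits the match to the host part of the URL:
#   ^http://([^/]+\.)?gmail\.com(/|$)
acl GoodSites url_regex -i "/etc/sites.txt"
```

Anchoring helps, but once a pattern is really just a host name,
dstdomain is the simpler tool.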

Consider: does it really matter that people can get to porn through you?
If you are trying to make a child protection filter there is much better
software out there to do it with than squid.

> acl home_network src 192.168.10.10-192.168.10.120
> acl lpo_network src 192.168.10.60-192.168.10.72

Right at the top you are missing a:
     http_access deny !home_network

that will stop most people getting to your proxy to abuse it.
The rest should be about stopping internal abuses.

> http_access deny lpo_BadSites lpo_network
> http_access deny lpo_network

Okay, so the lpo part of the network is not allowed to use the proxy at
all. In that case checking lpo_BadSites specially here is a waste of
time. They never get through to any of those sites anyway.
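
In config terms, a sketch of the two readings. The second one is only a
guess at what you might have intended.

```
# If lpo machines may not use the proxy at all, one rule is enough:
http_access deny lpo_network

# If instead lpo machines should be blocked only from lpo_BadSites,
# drop the rule above and use this one (they then fall through to the
# later "allow home_network", since their range lies inside it):
# http_access deny lpo_network lpo_BadSites
```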

> http_access allow GoodSites

Now anyone can get in if they are visiting a site you like (e.g. gmail)
or are willing to fiddle the URI so that it contains certain easily
detected strings.

> http_access deny BadSites

And here the effect of short, unanchored, case-insensitive URI matches
makes itself felt...

Turning BadSites into a list of bad domains and a dstdomain ACL would do
a lot here.
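
A minimal sketch of that change, assuming a new file
/etc/blacklisted_domains.txt (hypothetical name) with one domain per
line. A leading dot makes the entry cover the domain and all of its
subdomains.

```
# /etc/blacklisted_domains.txt would contain lines such as:
#   .badsite.example
#   .another-bad.example
acl BadDomains dstdomain "/etc/blacklisted_domains.txt"
http_access deny BadDomains
```

Because only the host name is compared, a word appearing in the path or
query string can no longer trigger the block.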

> http_access allow home_network

This should be the second-to-LAST thing you do.

> http_access allow manager localhost
> http_access deny manager
> http_access deny !Safe_ports
> http_access deny CONNECT !SSL_ports

The denies above are really useless unless they come before the allows
that override them.

The next line will be doing a global deny anyway if they get this far.

> http_access deny all
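
Putting the ordering comments together, the http_access section could
look roughly like this. It reuses your existing ACL names. Placing the
manager rules first is my assumption here, since localhost is outside
home_network and would otherwise be caught by the network check.

```
# Cache manager access from the proxy itself only.
http_access allow manager localhost
http_access deny manager
# Nobody outside the LAN gets any further.
http_access deny !home_network
# Port sanity checks, before any allow can override them.
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
# lpo machines may not use the proxy.
http_access deny lpo_network
# Whitelist overrides blacklist, but only for LAN clients by now.
http_access allow GoodSites
http_access deny BadSites
# Second-to-last: everything else from the LAN is allowed.
http_access allow home_network
# Last: global deny.
http_access deny all
```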

> http_reply_access allow all
> icp_access allow all
> http_port 192.168.10.1:3128 transparent
> hierarchy_stoplist cgi-bin ?
> acl QUERY urlpath_regex cgi-bin \?
> cache deny QUERY
> cache_mem 16 MB
> access_log /usr/local/squid/var/logs/access.log squid
> cache_log /usr/local/squid/var/logs/cache.log
> cache_store_log /usr/local/squid/var/logs/store.log

This log is not that useful unless you are debugging storage.
You can gain a bit by setting:
   cache_store_log none

> mime_table /usr/local/squid/etc/mime.conf
> pid_filename /usr/local/squid/var/logs/squid.pid
> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440
> refresh_pattern . 0 20% 4320
> acl apache rep_header Server ^Apache
> broken_vary_encoding allow apache
> cache_effective_user squid
> cache_effective_user squid

twice?

> icon_directory /usr/local/squid/share/icons
> error_directory /usr/local/squid/share/errors/English
> hosts_file /etc/hosts
> dns_testnames google.com
> coredump_dir /usr/local/squid/var/cache
>
> If you see my whitelist (sites.txt) I need to add words like
> "examination" to access them although examination word is not added to
> my blacklist. Let me know the problematic thing plz.

I'd say the sites are matching something in the BadSites list, probably
'ass' or 'tit' (e.g. examination ASSignment? embASSy? clASS examination?
examination TITle?). The bad list has some very short words that are
commonly part of other, innocent words.
This is the universal problem with content filters.

You would probably have a better success rate finding a list of porn
websites and blocking those by IP address (dst) or domain name
(dstdomain) than with short regexes.
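
For the IP address variant, a small sketch. The ACL name is illustrative
and the addresses are placeholders from the documentation range, not
real sites.

```
# Placeholder destination addresses; replace with entries from your list.
acl BadAddrs dst 192.0.2.10 192.0.2.128/25
http_access deny BadAddrs
```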

Amos

-- 
Please use Squid 2.6STABLE17+ or 3.0STABLE1+
There are serious security advisories out on all earlier releases.
Received on Sat Feb 02 2008 - 20:17:31 MST
