Re: [squid-users] Removing overlapping subdomains from blacklists

From: Marcus Kool <Marcus.Kool_at_urlfilterdb.com>
Date: Wed, 21 Aug 2013 21:48:10 +0200

On Wed, Aug 21, 2013 at 05:27:55PM +0100, Andrew Wood wrote:
> Hi
>
> Can someone please help me work out an algorithm to remove overlapping
> subdomains from a blacklist such as shallalist to prevent errors such as:
>
> ERROR: 'interracialcandy.tumblr.com' is a subdomain of '.tumblr.com'
> 2013/08/21 17:18:41| ERROR: because of this '.tumblr.com' is ignored to
> keep splay tree searching predictable
> 2013/08/21 17:18:41| ERROR: You should remove
> 'interracialcandy.tumblr.com' from the ACL named 'ProhibitedSitesDomains'

Is it your intention to block tumblr.com and all of its subdomains?
And are .tumblr.com (non-adult) and interracialcandy.tumblr.com (adult)
both in the same list?

You can use ufdbguard, a URL filter for Squid.
The ufdbguard software suite has a utility called ufdbGenTable that
converts text files with domains and URLs to a database table. During
this conversion it emits similar errors but behaves differently
from Squid: if both subdomain.example.com and example.com are in a
list, ufdbGenTable puts example.com in the table, which effectively blocks
example.com and all subdomains.
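
If you would rather clean the list yourself before handing it to Squid,
the same "keep the parent, drop the subdomain" rule can be sketched in
Python. This is only a minimal sketch, not how ufdbGenTable is implemented:
the function name remove_covered_subdomains is made up for illustration,
and it assumes one domain per entry and that every entry is meant to cover
its subdomains (so a leading dot is simply stripped). It needs no knowledge
of TLDs or of two-level suffixes such as .co.uk, because it only checks
whether a parent of an entry is itself in the list.

```python
def remove_covered_subdomains(domains):
    """Return the entries that have no parent domain elsewhere in the list.

    Hypothetical helper: every entry is assumed to cover its subdomains,
    so 'sub.example.com' is redundant when 'example.com' is also listed.
    """
    # Normalise: lower-case, strip whitespace and leading/trailing dots.
    names = {d.strip().lower().strip(".") for d in domains}
    names.discard("")  # ignore blank lines

    kept = []
    for name in sorted(names):
        labels = name.split(".")
        # Parents of a.b.example.com are b.example.com, example.com and com.
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        if not any(parent in names for parent in parents):
            kept.append(name)
    return kept


if __name__ == "__main__":
    sample = [
        "interracialcandy.tumblr.com",
        ".tumblr.com",
        "www.example.co.uk",
        "example.co.uk",
    ]
    for domain in remove_covered_subdomains(sample):
        # Write the leading dot back so Squid matches subdomains too.
        print("." + domain)
```

Each lookup is a set membership test, so the cost grows with the number of
entries times the number of labels per name. Whether dropping the subdomain
is the right outcome depends on the answer to the question above: if only
the subdomain should be blocked, it is the parent entry that has to go.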

Marcus

> The problem is that TLDs like .com or .net are easy, but some domains
> have two 'tlds' such as .co.uk (yes, I know strictly that's not a tld but
> you know what I mean!), and there are so many different country domains,
> some with two levels and the possibility of more in the future. How can I
> make it future proof?
>
> I'm sure I'm not the only one to tear my hair out on this but I can't find
> a solution anywhere.
> Perhaps we can collaborate on here to produce a Perl or Python script
> which anyone can use?
>
> Thanks
> Andrew
>
>
Received on Wed Aug 21 2013 - 19:48:18 MDT
