RE: [squid-users] Regex url lists and DNS blacklist acls

From: Henrik Nordstrom <henrik@dont-contact.us>
Date: Fri, 01 Sep 2006 09:04:02 +0200

On Fri 2006-09-01 at 08:22 +0200, Thomas Nilsen wrote:

> As utils like squidguard/dansguardian are able to handle regex files
> with good performance, I was hoping to achieve the same with asqredir or
> similar light tools.

squidguard doesn't handle large regex lists any better than Squid. The
problem with large regex lists is not the tool used, but the fact that
it's a large list of patterns, each of which takes time to match.

> I assume Squid caches any external regex_url file?

If you mean acl xxx url_regex "/path/to/file" then this is the same as
having all the patterns inside squid.conf. It's read into memory and
compiled on startup/reconfigure.

The problem with regex lists is the evaluation of the acl on each request.
As regex patterns cannot be sorted, Squid (or any other url regex based
acl lookup) has to walk the complete list of patterns on each request,
testing whether the request matches each pattern. Because of this, lookup
time in a regex list is linear in the number of patterns in the list,
while lookup time in most other acl types is nearly constant, independent
of the acl size.
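
The difference can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not Squid's or squidGuard's actual
code: the function names, the example domains and the list size are all
hypothetical, and a Python set merely stands in for any indexed structure
used by non-regex acl types.

```python
import re


def regex_list_match(patterns, url):
    """Test patterns in order; return (matched, patterns_tested)."""
    tested = 0
    for pattern in patterns:
        tested += 1
        if pattern.search(url):
            return True, tested
    # No pattern matched, so every pattern in the list was tested.
    return False, tested


def domain_set_match(domains, host):
    """One indexed lookup; the cost does not grow with the list size."""
    return host in domains


# Hypothetical blacklist of 1000 entries, built both ways.
names = ["blocked%d.example" % i for i in range(1000)]
patterns = [re.compile(r"^http://" + re.escape(n) + "/") for n in names]
domains = set(names)

# A request that is NOT on the list is the common and most expensive case
# for the regex list: all 1000 patterns are evaluated before giving up.
matched, tested = regex_list_match(patterns, "http://allowed.example/index.html")
print(matched, tested)

# The indexed lookup answers the same question in a single step.
print(domain_set_match(domains, "allowed.example"))
```

Doubling the number of patterns doubles the work for every request that
does not match, which is typically most requests, while the indexed
lookup is largely unaffected.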

From the squidGuard documentation:

  * While the size of the domain and urllists only has marginal
    influence on the performance, too many large or complex expressions
    will quickly degrade the performance of squidGuard. Though it may
    depend heavily on the performance of the regex library you link
    with.

And it's exactly the same for Squid, except that we don't have a close
equivalent of urllists.

Regards
Henrik

Received on Fri Sep 01 2006 - 01:04:07 MDT
