Re: dstdomain and url_regex implementation from Henrik Nordstrom on 2006-06-20 (squid-dev)

From: Henrik Nordstrom <henrik@dont-contact.us>
Date: Tue, 20 Jun 2006 21:40:19 +0200

On Tue, 2006-06-20 at 13:47 -0400, Jean-Francois Levesque wrote:
> Hi all,
>
> How are the dstdomain and url_regex ACLs implemented?

dstdomain is implemented using a splay tree.

url_regex is a linear list of regex patterns. Regexes have no sorting
property, so there isn't much else we can do about them.

> Is it
> possible to pass a hashed Berkeley DB to squid as dstdomains?

It should be relatively trivial to add a dstdomains_db acl implementing
this. It would require at most N_dots lookups in the DB to determine
whether the domain is there or not.
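
A minimal sketch of the suffix lookup described above, in Python rather
than Squid's own code. The function name domain_in_db and the sample
entries are hypothetical, and a plain dict stands in for the hashed
Berkeley DB. The host name is probed once per dot-separated suffix, so
the number of probes grows with the number of dots in the name.

```python
def domain_in_db(host, db):
    """Return the stored suffix of `host` found in `db`, or None.

    `db` is any mapping keyed by domain name; here a dict is used as a
    stand-in for a hashed database (an assumption for illustration).
    """
    host = host.lower().rstrip(".")
    labels = host.split(".")
    # Probe the full name first, then drop one leading label per step.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in db:
            return candidate
    return None


# Hypothetical contents of the domain database.
blocked = {"example.com": 1, "ads.example.net": 1}

print(domain_in_db("www.example.com", blocked))
print(domain_in_db("example.net", blocked))
```

Each probe is an exact-key lookup, which is what a hashed database is
good at; no ordering of the keys is needed.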

> For the url_regex, is each regex executed on every URL?

Yes, until a match is found or the list of patterns is exhausted. The
same holds for all regex-based matches, due to the unstructured nature
of regex patterns.
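
The scan described above can be sketched as follows. This is an
illustration in Python, not Squid's implementation; the function name
first_match and the sample patterns are hypothetical.

```python
import re


def first_match(url, patterns):
    """Return the index of the first pattern matching `url`, or None.

    Patterns are tried in list order and the scan stops at the first
    hit, so a URL that matches nothing costs one regex execution per
    pattern.
    """
    for index, pattern in enumerate(patterns):
        if pattern.search(url):
            return index
    return None


# Hypothetical pattern list, compiled once up front.
patterns = [re.compile(p) for p in (r"\.exe$", r"^http://ads\.", r"/banner/")]

print(first_match("http://ads.example.com/x", patterns))
print(first_match("http://example.com/index.html", patterns))
```

Because the cost of a miss is proportional to the list length, putting
the patterns most likely to match near the front is the only ordering
that helps.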

In theory we could compile all the regexes into a single regex pattern
to allow the regex compiler to maybe (but most likely not) build some
intelligent pattern out of it, but there are very strict limitations on
regex pattern size and complexity in most regex implementations, so this
cannot be done easily.
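
For illustration, combining a pattern list into one alternation looks
like this in Python. The helper name combine and the sample patterns are
hypothetical. Note that a backtracking engine such as Python's re
generally still tries the alternatives in turn, so this mainly changes
how the patterns are packaged rather than how much work a miss costs.

```python
import re


def combine(sources):
    """Join pattern sources into one alternation and compile it.

    Each source is wrapped in a non-capturing group so that its own
    '|' operators and anchors stay local to that alternative.
    """
    return re.compile("|".join("(?:%s)" % s for s in sources))


# Hypothetical pattern sources.
sources = [r"\.exe$", r"^http://ads\.", r"/banner/"]
combined = combine(sources)

print(bool(combined.search("http://ads.example.com/")))
print(bool(combined.search("http://example.com/index.html")))
```

The combined pattern only reports that something matched, not which
original entry did, unless each alternative is given its own capturing
group.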

Regards
Henrik

application/pgp-signature attachment: This is a digitally signed message part

Received on Tue Jun 20 2006 - 13:40:23 MDT
