Re: Initial patch for file suffix acl

From: Robert Collins <robertc@dont-contact.us>
Date: Tue, 04 Mar 2008 04:37:43 +1100

On Mon, 2008-03-03 at 14:31 -0300, Lucas Brasilino wrote:
> Hi Robert:
>
> >
> > Its probably better to do this as a combined regex:
> >
> > acl file_suffix exts .foo .bar .baz
> >
> > creating a regexp based acl .*(\.foo|\.bar|\.baz)$
> >
> > any decent regexp engine will be better at this than your linear search.
>
> Do you think compiling a regex (ok, it's made once) and matching it to
> an url (maybe _huges_ urls, maybe many hundreds times per second) are
> cheapper than this file_suffix implementation ?

for sure. you have a linear scan, comparing all suffixes on all urls.

if you assume a normal distribution, of the urls that fail, you will
have compared all extensions
and urls that match you will have compared 50% of the extensions on
average

checking a single extension requires checking (say) 4 characters
if you have 20 extensions, thats 40 char checks (plus the memory hit to
get the list)

the same 20 extensions, with a good regex compiler should boil down to a
single check for '.' at offset -4, then a single check for the number of
unique next letters etc. basically a perfect tree rather than linear
scan.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Received on Tue Mar 04 2008 - 02:22:56 MST

This archive was generated by hypermail pre-2.1.9 : Tue Apr 01 2008 - 13:00:10 MDT