Re: Bug in url_regex pattern matching?

From: Bernd P. Ziller <bziller@dont-contact.us>
Date: Mon, 26 Jan 1998 16:16:58 +0100 (MET)

> From: "Nick O'Brien" <N.OBrien@canterbury.ac.uk>

> Now in the banned-list file I had the line:
>
> http://www.sex*
>
> which I expected would mean that accesses to any URL with http://www.sex in
> it would be denied. However I discovered that sites like
> http://www.sedon.co.uk/ were being denied as well. I know that it was this
> line as after I removed it, and restarted Squid - I was then able to
> access the above site.
>
> Is this a bug in the url_regex pattern matching mechanism or simply some
> misunderstanding on my part about how it should work?

It's only a misunderstanding on your part.

The meaning of the '*' is "zero or more occurrences of the preceding
character".

So 'http://www.sex*' will match anything from 'http://www.se' to
'http://www.sexxxx..', which is why 'http://www.sedon.co.uk/' was denied.

It will also match something like:
'http://some.site/path/http://www.se/etc'
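If you want to see this for yourself, here is a small sketch in Python.
Note the assumptions: Python's re module is not the regex library Squid
uses, and I am assuming url_regex searches anywhere in the URL (like
re.search). For a pattern this simple the two should behave the same.
The 'sexxxx.example' and 'www.example.org' URLs are made up for
illustration.

```python
import re

# The pattern from the banned-list file, taken literally as a regexp.
# Assumption: the match is unanchored, so we use search(), not match().
pattern = re.compile(r"http://www.sex*")

urls = [
    "http://www.sedon.co.uk/",                  # the site that was denied
    "http://www.sexxxx.example/",               # hypothetical intended target
    "http://some.site/path/http://www.se/etc",  # pattern found mid-URL
    "http://www.example.org/",                  # hypothetical unrelated site
]

# True means the pattern matched somewhere in the URL.
matches = {url: pattern.search(url) is not None for url in urls}

for url, hit in matches.items():
    print("DENY " if hit else "allow", url)
```

Only the last URL gets through, because 'x*' is happy with zero x's and
everything up to 'http://www.se' is enough for a match.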

What you want is '^http://www.sex': any URL starting with
'http://www.sex' will match this regexp. No need to add any '.*'.
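The anchored pattern can be checked the same way. Again this is only a
sketch using Python's re module as a stand-in for Squid's regex library;
is_banned() is a hypothetical helper and the '.example' URL is made up.
The second pattern, with the dots escaped, is my own stricter variant
and not something the original line needs in order to work.

```python
import re

# The suggested pattern: '^' ties the match to the start of the URL.
anchored = re.compile(r"^http://www.sex")

# Optional stricter variant (an addition for illustration): an unescaped
# '.' matches any character, so '\.' restricts it to a literal dot.
strict = re.compile(r"^http://www\.sex")


def is_banned(url, regex=anchored):
    """Return True if the URL matches the given banned-list regexp."""
    return regex.search(url) is not None


for url in (
    "http://www.sex.example/",                   # hypothetical target
    "http://www.sedon.co.uk/",                   # no longer matched
    "http://some.site/path/http://www.sex/etc",  # not at the start
):
    print("DENY " if is_banned(url) else "allow", url)
```

With the anchor in place only the first URL is denied. The difference
between the two patterns only shows up on a URL like
'http://wwwxsex.example/', which the unescaped version still matches.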

For more info on regexps see 'man 5 regexp'.

Regards,

-- 
   Bernd                  bziller@ba-stuttgart.de
-------------------------------------------------
      http://www.ba-stuttgart.de/~bziller/
  Perry Rhodan - Blind Guardian - Morwen   Pages
Received on Mon Jan 26 1998 - 07:31:20 MST
