Re: Bug in url_regex pattern matching?

From: Michael Pelletier <mikep@dont-contact.us>
Date: Mon, 26 Jan 1998 10:16:52 -0500 (EST)

On Mon, 26 Jan 1998, Nick O'Brien wrote:

> Now in the banned-list file I had the line:
>
> http://www.sex*
>
> which I expected would mean that accesses to any URL with http://www.sex in
> it would be denied. However I discovered that sites like
> http://www.sedon.co.uk/ were being denied as well. I know that it was this
> line as after I removed it, and restarted Squid - I was then able to
> access the above site.

That pattern should be pronounced:

> The string "http://www", located anywhere in the URL, followed by any
> one character, followed by "se", followed by zero or more "x" characters.

The "www.sedon.co.uk" URL fits this criterion because it starts with
"http://www.se" followed by zero "x" characters. You should use the
following instead:

^http://www\.sex

This means:

> The string "http://www.sex" located at the beginning of the URL.
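The difference between the two patterns can be checked with a short
script. This is only a sketch in Python: its "re" module is not the regex
library Squid uses, though for patterns this simple the two should agree.
The helper is_blocked() and the second test URL are made up for
illustration, and re.search() is assumed to mirror the "match anywhere in
the URL" behaviour described above.

```python
import re

# Patterns from this thread: the original banned-list entry and the suggested fix.
original = re.compile(r"http://www.sex*")
fixed = re.compile(r"^http://www\.sex")

urls = [
    "http://www.sedon.co.uk/",   # the innocent site from the report
    "http://www.sex.example/",   # hypothetical URL the rule was meant to catch
]

def is_blocked(pattern, url):
    # search() looks for a match anywhere in the string, so only an
    # explicit "^" ties the pattern to the start of the URL.
    return pattern.search(url) is not None

for url in urls:
    print(url, "original:", is_blocked(original, url), "fixed:", is_blocked(fixed, url))
```

With the original pattern both URLs are reported as blocked; with the
anchored, escaped pattern only the second one is.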

The unescaped "." in an RE means any character, and the "*" means zero or
more instances of the preceding character or regular expression token.
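
Both rules are easy to see in isolation. The following is a small sketch
using Python's "re" module as a stand-in (an assumption; it is not the
regex library Squid is built against), with made-up test strings and
variable names:

```python
import re

# Unescaped "." matches any single character, so "X" is accepted here.
dot_matches_any = re.search(r"www.se", "http://wwwXse/") is not None

# Escaped "\." matches only a literal dot, so the same string is rejected.
escaped_dot_matches = re.search(r"www\.se", "http://wwwXse/") is not None

# "*" applies only to the token just before it (the "x"), and allows
# zero or more repetitions of that token.
star = re.compile(r"sex*")
star_matches = [star.search(s).group(0) for s in ("se", "sex", "sexxx")]

print(dot_matches_any)       # True
print(escaped_dot_matches)   # False
print(star_matches)          # ['se', 'sex', 'sexxx']
```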

        -Mike Pelletier.
Received on Mon Jan 26 1998 - 07:28:19 MST
