Re: [squid-users] How are regular expressions handled in SQUIDs acls?

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Tue, 17 Dec 2002 20:51:48 +0100

anton wrote:

> URLs, but unfortunately I can't understand how the regexps work or
> how to create an acl for all these files.
> It should look something like this:
> acl url-regex (.*avi$)|(.*mpeg$)|(.*mp3$)
> - but how exactly???
> Thanks in advance.
> Anton.

Squid uses "extended regexp", i.e. the same as egrep.

What you want is to match .avi, .mpeg and .mp3, right? In that case the
following regexp list will do the job nicely:

   \.avi$ \.mpeg$ \.mp3$

And you quite likely want to make the match case-insensitive (the -i flag).
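
As a rough illustration of how such a list behaves, here is a small Python sketch. Python's re module is only a stand-in for the extended regexps Squid uses, re.IGNORECASE plays the role of -i, and the helper name matches_any and the example.com URLs are made up for the example. The list matches when any one of its patterns matches.

```python
import re

# The three patterns from the list above; re.IGNORECASE stands in for -i.
PATTERNS = [re.compile(p, re.IGNORECASE)
            for p in (r"\.avi$", r"\.mpeg$", r"\.mp3$")]

def matches_any(url: str) -> bool:
    """Return True if at least one pattern matches somewhere in url."""
    return any(p.search(url) for p in PATTERNS)

print(matches_any("http://example.com/movie.AVI"))  # True
print(matches_any("http://example.com/song.mp3"))   # True
print(matches_any("http://example.com/mp3.html"))   # False
```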

You can also combine the regexes into a single larger regex if you want,
but this is not a requirement to make it work. Combining multiple
regexes into a larger one is sometimes a little faster to compute, but makes
your expressions a little harder to maintain:

   \.avi$|\.mpeg$|\.mp3$
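
To check that the combined form accepts exactly the same strings as the list, a quick comparison can be sketched in Python (again only a stand-in for Squid's regex engine; the sample paths and function names are invented for the test):

```python
import re

SEPARATE = [r"\.avi$", r"\.mpeg$", r"\.mp3$"]
COMBINED = r"\.avi$|\.mpeg$|\.mp3$"

def list_match(text: str) -> bool:
    """Match if any pattern of the list matches."""
    return any(re.search(p, text) for p in SEPARATE)

def combined_match(text: str) -> bool:
    """Match using the single combined pattern."""
    return re.search(COMBINED, text) is not None

SAMPLES = ["/a/clip.avi", "/a/clip.mpeg", "/a/song.mp3", "/a/index.html", "/a/avi"]

for s in SAMPLES:
    # Both forms should always agree.
    print(s, list_match(s), combined_match(s))
```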

To explain the above:

  \. matches a dot

  avi matches.. well.. the three letters "a" "v" "i" after each other.

  $ matches "the end".

What this means is that "\.avi$" matches ".avi" at the end of
something.
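
A short Python sketch of what "at the end" means in practice (Python's re is used as a stand-in, and the paths and the helper name are only examples). Anything after ".avi", even a query string, stops the pattern from matching:

```python
import re

ENDS_WITH_AVI = re.compile(r"\.avi$")

def ends_in_avi(text: str) -> bool:
    """True only when the text ends in a literal ".avi"."""
    return ENDS_WITH_AVI.search(text) is not None

print(ends_in_avi("/video/clip.avi"))             # True
print(ends_in_avi("/video/clip.avi.txt"))         # False
print(ends_in_avi("/video/clip.avi?download=1"))  # False
```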

| is a special thing. regex1|regex2 matches regex1 OR regex2. Note
that | has very low precedence in the regex language, so you do not
need to use () to group things because of | unless you want to have the
"OR" in the middle of something else, such as in
"squid is (free|open)", which matches both the sequence "squid is free"
and "squid is open" but not "squid is for free".
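
The effect of the low precedence can be seen by comparing the grouped and ungrouped forms. This is a Python sketch (a stand-in for Squid's regex engine); the ungrouped pattern and the "open source" test string are added here purely for contrast:

```python
import re

GROUPED = re.compile(r"squid is (free|open)")
# Without (), the | splits the whole pattern: "squid is free" OR "open".
UNGROUPED = re.compile(r"squid is free|open")

def hit(pattern: re.Pattern, text: str) -> bool:
    return pattern.search(text) is not None

print(hit(GROUPED, "squid is free"))      # True
print(hit(GROUPED, "squid is open"))      # True
print(hit(GROUPED, "squid is for free"))  # False
print(hit(GROUPED, "open source"))        # False
print(hit(UNGROUPED, "open source"))      # True
```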

. is also a special thing unless escaped with \. A dot matches any
character.
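
This is why the dot in "\.avi$" needs the backslash. A minimal Python sketch (a stand-in for Squid's regex engine, with a made-up path and helper name) shows the unescaped form matching more than intended:

```python
import re

def hit(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

# Unescaped dot: any character followed by "avi" at the end.
print(hit(r".avi$", "/files/xavi"))       # True
# Escaped dot: a literal "." is required.
print(hit(r"\.avi$", "/files/xavi"))      # False
print(hit(r"\.avi$", "/files/clip.avi"))  # True
```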

? is another interesting regex character. It makes the previous regex
atom (single character or () grouped expression) optional.
"squid is( for)? free" matches both "squid is free" and
"squid is for free".
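
The optional group can be tried out with a small Python sketch (Python's re as a stand-in; the third test string and the helper name are added for contrast):

```python
import re

OPTIONAL_FOR = re.compile(r"squid is( for)? free")

def hit(text: str) -> bool:
    return OPTIONAL_FOR.search(text) is not None

print(hit("squid is free"))      # True
print(hit("squid is for free"))  # True
print(hit("squid is not free"))  # False
```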

* is quite interesting, and is usually combined with the dot. * makes the
previous regex atom repeat 0 or more times. When combined with . into .*
this matches "anything", i.e. the same as * in the shell. You can
combine * with other regex atoms, but this is rarely done, probably
because one does not often want to look for repetitive patterns,
and for repetitive matches {} is usually more suitable.
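
Both uses of * can be sketched in Python (a stand-in for Squid's regex engine; the patterns "ab*c" and "squid.*free" and the helper name are invented for illustration):

```python
import re

def hit(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

# * on a single character: zero or more "b" between "a" and "c".
print(hit(r"ab*c", "ac"))     # True
print(hit(r"ab*c", "abbbc"))  # True
print(hit(r"ab*c", "adc"))    # False

# .* matches anything, including nothing at all.
print(hit(r"squid.*free", "squid is for free"))  # True
print(hit(r"squid.*free", "squidfree"))          # True
print(hit(r"squid.*free", "free squid"))         # False
```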

{} allows you to specify a repetition range: <regexatom>{min,max}
matches the atom at least min and at most max times.
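
A range can be tried out with a Python sketch (a stand-in for Squid's regex engine; the pattern "lo{2,3}k" and the helper name are made up for the example). The letter "o" must appear two or three times between "l" and "k":

```python
import re

TWO_OR_THREE_O = re.compile(r"lo{2,3}k")

def hit(text: str) -> bool:
    return TWO_OR_THREE_O.search(text) is not None

print(hit("lok"))     # False (only one "o")
print(hit("look"))    # True
print(hit("loook"))   # True
print(hit("looook"))  # False (four "o")
```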

Regards
Henrik Nordström
MARA Systems AB, Sweden
http://www.marasystems.com/
Received on Tue Dec 17 2002 - 14:42:44 MST