Re: Squid 3.4.0.1 configurator problems from Amos Jeffries on 2013-10-01 (squid-dev)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 02 Oct 2013 14:28:08 +1300

On 2/10/2013 5:12 a.m., Tsantilas Christos wrote:
> On 09/27/2013 08:47 PM, Alex Rousskov wrote:
>> On 09/27/2013 09:39 AM, Amos Jeffries wrote:
>>> On 28/09/2013 3:18 a.m., Tsantilas Christos wrote:
>>>> On 09/27/2013 08:23 AM, Alex Rousskov wrote:
>>>>> Using approach (2) with flexible RE delimiter, we could write
>>>>>
>>>>> acl foo url_regex /ends[) (]/
>>>>> or
>>>>> acl foo url_regex {ends[) (]}
>>>>> or
>>>>> acl foo url_regex @ends[) (]@
>>>>>
>>>>> and it will all work without double escaping.
>>>>
>>>> Alex, in the "Revised approach to fixing configuration syntax" mail
>>>> thread you are proposing to use "regex::" prefix for regular
>>>> expressions. This is required for grammar consistency.
>>>> This means that the regex should look like:
>>>>
>>>> acl foo url_regex regex::/ends[) (]/
>>>> or
>>>> acl foo url_regex regex::{ends[) (]}
>>>> or
>>>> acl foo url_regex regex::@ends[) (]@
>> Yes, IF that syntax is adopted.
>>
>>
>>> Okay Alex I think we can agree on that flexible-delimiter syntax to
>>> avoid escaping.
>>>
>>> I also agree with that regex:: prefix.
>>>
>>> Is there anything else we have been disagreeing on?
>>
>> As far as REs are concerned, we need to decide
>>
>> 1) Whether we want to support the new regex:: syntax at all or keep
>> using spaceless REs as before (at least for now) while reserving the
>> regex:: prefix.

What benefit would be gained from not using it?

>>
>>
>> 2) If we want to support the new regex:: syntax:
>>
>> 2a) What characters do we allow as RE delimiters? Perl allows virtually
>> any non-whitespace character, even #, but we probably want to be more
>> restrictive.
> Any non-whitespace character is a good choice, I think. Otherwise, any
> non-alphanumeric, non-whitespace character.

Any ASCII character 33 through 126 should be okay. The alphanumerical
ones make little sense though. I have no objection if you want to narrow
it down to punctuation characters. You may find it a bit difficult or a
waste of CPU to test for complex boundaries in the character set when
validating the delimiter start byte though.
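
To illustrate the cost concern, here is a minimal Python sketch (not Squid
code) of the narrowed rule: ASCII 33 through 126, minus the alphanumerics.
The names DELIM_OK and is_valid_delimiter are hypothetical. Precomputing a
256-entry table is one possible way to keep the per-byte check to a single
index lookup, however complex the allowed set becomes.

```python
# Hypothetical sketch: one boolean per byte value, built once at startup.
# Allowed: printable ASCII 33..126 that is not a letter or digit.
DELIM_OK = [33 <= b <= 126 and not chr(b).isalnum() for b in range(256)]


def is_valid_delimiter(byte: int) -> bool:
    """Return True if this byte value may open a regex:: pattern."""
    return 0 <= byte < 256 and DELIM_OK[byte]
```

For example, is_valid_delimiter(ord("/")) and is_valid_delimiter(ord("{"))
are true, while a space, a letter, or DEL (127) is rejected.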

>
>> 2b) Do we add support for escaping sequences? As discussed a few emails
>> back, that support is necessary if we want to support arbitrary REs,
>> which is somewhat important for automated config generators. It is also
>> needed for (2c).
> Escaping is important. The user will select the delimiter that
> requires the least escaping, but may not be able to avoid it entirely,
> e.g. select this one:
> regex::#A/test/with/one\#and/many/#
> instead of this:
> regex::/A\/test\/with\/one#and\/many\//

If we allow the entire range of 33-126 characters it will be an
exceedingly rare case where this is necessary.
With regex one can always use . in place of a difficult character in the
pattern.
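
For reference, the flexible-delimiter form with optional escaping could be
parsed roughly as in the Python sketch below. This is an illustration under
assumptions, not the agreed grammar: the function name parse_regex_token,
the CLOSERS pairing for bracket-style delimiters (as in regex::{...}), the
lack of nesting, and the rule that backslash+delimiter yields a literal
delimiter are all hypothetical choices.

```python
from typing import Tuple

PREFIX = "regex::"
# Assumed pairing for bracket-style openers; any other delimiter closes itself.
CLOSERS = {"{": "}", "(": ")", "[": "]", "<": ">"}


def parse_regex_token(text: str) -> Tuple[str, str]:
    """Split 'regex::<d>pattern<d>rest' into (pattern, rest)."""
    if not text.startswith(PREFIX) or len(text) == len(PREFIX):
        raise ValueError("expected regex:: followed by a delimiter")
    opener = text[len(PREFIX)]
    if not 33 <= ord(opener) <= 126 or opener == "\\":
        raise ValueError("invalid delimiter")
    closer = CLOSERS.get(opener, opener)
    out = []
    i = len(PREFIX) + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # An escaped delimiter becomes a literal; every other escape is
            # passed through untouched for the regex library to interpret.
            out.append(nxt if nxt == closer else ch + nxt)
            i += 2
        elif ch == closer:
            return "".join(out), text[i + 1:]
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated pattern")
```

With this sketch, both of the forms Christos quotes yield the same pattern,
A/test/with/one#and/many/, and regex::{ends[) (]} yields ends[) (] with no
escaping at all. One caveat of this particular escaping rule: if the chosen
delimiter is itself a regex metacharacter, unescaping it changes its meaning
in the pattern.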

>
>> 2c) Do we add support for character sequences so that one can add
>> special characters and such? This also requires a form of escaping. For
>> example, here are some of the sequences supported by Perl (we do not
>> support all of them immediately, of course, but we need to reserve
>> \-escape if we want them in the future):
>>
>>> Sequence     Description
>>> \t           tab (HT, TAB)
>>> \n           newline (NL)
>>> \r           return (CR)
>>> \f           form feed (FF)
>>> \b           backspace (BS)
>>> \a           alarm (bell) (BEL)
>>> \e           escape (ESC)
>>> \x{263A}     hex char (example: SMILEY)
>>> \x1b         restricted range hex char (example: ESC)
>>> \N{name}     named Unicode character or character sequence
>>> \N{U+263D}   Unicode character (example: FIRST QUARTER MOON)
>>> \c[          control char (example: chr(27))
>>> \o{23072}    octal char (example: SMILEY)
>>> \033         restricted range octal char (example: ESC)
>> We could also try to abuse existing character class [[:class:]] syntax
>> for those. For example, we can find and replace [[:squid::octal(32):]]
>> sequences with a space character.
> Supporting Perl syntax looks like a good idea to me ...

No. Remember we are not parsing and compiling this pattern ourselves.
There is a library behind it all for the pattern compilation. Those
libraries support these things already in a far better way than we can, and
there is no reason for us to allow these control codes as syntax
delimiters in squid.conf.

Yes, it would be a good idea to support other syntax patterns. But for
that we can change the regex:: token to a new name for a new pattern syntax.
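
As a rough illustration of selecting the pattern syntax by prefix token,
here is a Python sketch. Everything in it is hypothetical: the SYNTAXES
registry, the compile_token name, and especially the "glob" prefix, which
nobody in this thread has proposed and which only stands in for "some other
syntax". Python's re and fnmatch modules stand in for whatever library
does the real compilation, and delimiter handling is left out for brevity.

```python
import fnmatch
import re

# Hypothetical registry: prefix name -> function compiling the token body.
SYNTAXES = {
    "regex": re.compile,
    # Invented example of a second syntax, compiled via fnmatch.translate().
    "glob": lambda body: re.compile(fnmatch.translate(body)),
}


def compile_token(token: str):
    """Pick a compiler from the 'name::' prefix and apply it to the body."""
    name, sep, body = token.partition("::")
    if not sep or name not in SYNTAXES:
        raise ValueError("unknown pattern syntax: %r" % token)
    return SYNTAXES[name](body)
```

The point is only that a new prefix adds a new entry; tokens already
written with regex:: keep their meaning.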

Amos
Received on Wed Oct 02 2013 - 01:28:17 MDT
