Re: [PATCH] Unknown cfg function

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 31 Jul 2013 06:17:58 +1200

On 31/07/2013 5:46 a.m., Tsantilas Christos wrote:
> On 07/30/2013 08:03 PM, Alex Rousskov wrote:
>> On 07/30/2013 10:04 AM, Amos Jeffries wrote:
>>> On 31/07/2013 3:03 a.m., Tsantilas Christos wrote:
>>>> 2) If configuration_includes_quoted_values is set to "on" (new style
>>>> enabled) and token ends on a ( character, consider it as function name.
>>>> If the token is "parameters" return FunctionParameters type else return
>>>> FunctionNameUnknown type and parsing fails
>>
>>> This above logic still means that we are requiring all regex patterns to
>>> be quoted strings whenever they contain () brackets around the final
>>> segment of pattern.
>> To be more accurate with regard to intent, the above implies that folks
>> should not use new syntax (configuration_includes_quoted_values on) with
>> regular expressions for now [until we add proper RE support].
>>
>> Quoting REs using regular strings is too painful and error prone. It is
>> not recommended.
>>
>>
>>> => If we are going to make any regex require special quoting, then I
>>> think it worthwhile being consistent and making them all require quoting.
>>> Are we agreed that making regex "-quoted is a good idea?
>> We need special syntax for RE IMO. I like what Perl does because it is
>> m/simple/ and m_at_flexible@, but perhaps there is a better way.
>>
>>
>>> The '\' escape is already used internally by regex and layering multiple
>>> levels of \-escaping gets nasty quite quickly.
> This is true...
> Maybe we can limit escaping to a set of characters.
> For example if single quotes used, escape only the ' character and if
> double quotes used allow escaping " % and $ characters
>
>
>
>> Agreed.
>>
>>
>>
>>>> 3) If configuration_includes_quoted_values is set to "off" (new style
>>>> enabled) and token ends on a (, if the token is "parameters" return
>>>> FunctionParameters type, else do not end the token on '(' but to the
>>>> next whitespace character.
>>>> For example for the string "test(test) more" will return as token the
>>>> "test(test)"
>>> config set to "off" is legacy parser. Typo in your description?
> Yes!
>
>>> If quoted-values is OFF. I would be expecting to get the token...
>>> "test(test) ... notice the missing end-quote since start-quote should
>>> have been ignored and treated as part of the single word token.
>> I agree with Amos here. Legacy parsers should continue function as
>> before quoted strings changes if possible. We do not even need to (and
>> perhaps should not) support parameters() syntax in a legacy parser. Is
>> it possible to give folks true legacy behavior?
> In my example the quotes should not exist. What I want to say in my
> example is that for the following line:
> test(test) more
> The NextToken will return test(test).
>
> However Amos refers to an other case. For the following line:
> "Simple Tokens"
> we may want to retrieve the token "Simple
>
> Do we have any example where this is required? (Not for regex, for regex
> we have an exception...)

In most places where we accept header values in the legacy parser.

I have seen a few people who have things like "quoted string" for auth
realm, and since the old squid.conf sent the quotes out as-is they ended
up with helpers and browsers security records using the quoted form.
There might be any amount of unknown pain ahead if we suddenly start
stripping quotes off strings in the backward compatible parse mode.

>
>>
>>>> For the new style if someone want to use '(' character on a regex for
>>>> example, he should use quotes:
>>>> 'test(.*/)'
>>>> "test(.*/)"
>> Double quotes do not work well because REs use backslashes of their own.
>> It becomes too messy as Amos noted above. I think we should not support
>> REs in a new parser until we add Perl-like RE quoting (or better).
> OK. This is something we can do it.
> We can add a method ConfigParser::NextWord which will return a raw word
> which will consist by any non whitespace character. Whitespaces can be
> escaped.
> Or use an On/Off global variable which enables/disables for the next
> token this behaviour....

Hmm. Might work.

I have in mind to shuffle the ConfigParser methods into groups. One for
the parser mechanics and one group for the type-specific GetX() methods.
Some of those methods are provided in ConfigParser already, the rest are
mixed up with src/Parsing.cc.

A ConfigParser::RegexPattern() method should be perfectly fine to add.

Amos
Received on Tue Jul 30 2013 - 18:18:04 MDT

This archive was generated by hypermail 2.2.0 : Wed Jul 31 2013 - 12:00:07 MDT