Re: [PATCH] Unknown cfg function

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 29 Jul 2013 16:45:57 -0600

On 07/29/2013 02:13 PM, Amos Jeffries wrote:
> On 30/07/2013 6:40 a.m., Alex Rousskov wrote:
>> After this patch, if I type parametres(foo) instead of parameters(foo),
>> will Squid think that I am defining a regular expression instead of
>> importing foo where my true regular expressions are stored?

> Yes. And the only way to avoid that is to either prohibit blah(foo) in
> places where regex is expected, or to ignore such typos at this level
> and rely on the upper level data validation.

I disagree that this is the only way to do it. There are two other ways:

2) Treat regular expressions as any other parameter, requiring them to
be simple tokens or quoted strings. This is how C++ and many other
languages without native regular expression support handle REs:
compileRe("this is \"my\" RE")

3) Add a special syntax for regular expressions and require all regular
expressions to use that syntax. That is how Perl and other languages
with native regular expression support handle REs.
/this is "my" RE/

Long-term, I prefer #3 because it makes life much easier for humans.

Both #2 and #3 break old configurations, of course, but both stop
proliferation of madness and allow for proper squid.conf validation
without guesses.
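
To illustrate what "validation without guesses" could look like under #3,
here is a minimal Python sketch, not Squid code. The classify() helper,
the KNOWN_FUNCTIONS set, and the exact token shapes are all hypothetical;
the only point is that once REs have their own syntax, a misspelled
function name can be rejected instead of being mistaken for an RE.

```python
import re

# Assumption: the set of functions the config parser knows about.
KNOWN_FUNCTIONS = {"parameters"}

# Hypothetical shape of a function call token: name(argument)
FUNC_CALL = re.compile(r'^([A-Za-z_]\w*)\((.*)\)$')

def classify(token):
    """Return (kind, value) for one config token, or raise ValueError."""
    # Option #3: only /.../ tokens are regular expressions.
    if len(token) >= 2 and token[0] == "/" and token[-1] == "/":
        return ("regex", token[1:-1])
    # Quoted strings are opaque values.
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return ("string", token[1:-1])
    # Anything shaped like name(...) must name a known function.
    m = FUNC_CALL.match(token)
    if m:
        name, arg = m.groups()
        if name not in KNOWN_FUNCTIONS:
            raise ValueError("unknown function: " + name)
        return ("function", (name, arg))
    # Everything else is a simple token.
    return ("token", token)
```

With this sketch, classify("parametres(foo)") raises an error, while
classify("/(foo|bar)/") is accepted as an RE even though it contains
parentheses.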

Even if we manage to agree on something like #3 as the long-term goal,
we still need to come up with a good transition plan.

> The core problem here is that '(' is treated as a primary delimiter just
> like whitespace outside of quoted strings. So *any* use of brackets
> inside a token shifts to the filename loading logic. You suggested on
> IRC that we use perl syntax s/(foo)/ patterns. There is still a '('
> present inside there which will be detected as end of opaque element
> "s/" -> invalid function name -> self_destruct().

Not really. If m/foo/ REs are introduced (which is #3 above), the foo
expression will either not be parsed for functions or an escape
mechanism will be used to support them safely. This is no different from
regular expressions in Perl and other languages with native RE support.
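
A rough Python sketch of the idea, assuming an m/.../ syntax with a
backslash escape for the delimiter. The read_delimited_regex() name and
the escape rule are my assumptions for illustration, not an agreed
design. Once the scanner sees "m/", it reads up to the closing "/" and
never treats "(" as a delimiter.

```python
def read_delimited_regex(text, start=0):
    """Read m/.../ beginning at text[start]; return (pattern, next_index)."""
    if not text.startswith("m/", start):
        raise ValueError("expected m/ at position %d" % start)
    i = start + 2
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # "\/" yields a literal slash; any other escape is passed
            # through unchanged for the regex engine to interpret.
            out.append("/" if nxt == "/" else ch + nxt)
            i += 2
        elif ch == "/":
            # Unescaped delimiter: the expression ends here.
            return ("".join(out), i + 1)
        else:
            # "(" and every other character are ordinary inside m/.../
            out.append(ch)
            i += 1
    raise ValueError("unterminated regular expression")
```

For example, read_delimited_regex("m/(foo)/ rest") yields the pattern
"(foo)" and leaves the rest of the line for normal tokenization.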

> Regex are inherently dangerous to a very high degree simply from their
> nature and operation. Overall I don't see typos in regex fields as being
> made any worse or more problematic by the existence of "parameters("
> tokens. Other non-regex config details should be able to apply, and
> should actively be applying, more targeted token validation on top to
> detect these typos with a better context-aware message - which may or
> may not result in self_destruct() depending on that extra context.

I am sure it would be useful for regular expressions to contain %macros.
We can certainly exclude support for %macros from regular expressions,
but it does not really buy us anything except backward compatibility. It
does create or sustain problems though (because it keeps configuration
syntax dependent on the acl type name). Same for functions().

Cheers,

Alex.
Received on Mon Jul 29 2013 - 22:46:12 MDT
