Re: Revised approach to fixing configuration syntax

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 27 Sep 2013 17:43:49 -0600

On 09/27/2013 08:13 AM, Tsantilas Christos wrote:

> My sense is that it is not possible to have backward compatibility. We
> had discuss in "[PATCH] Unknown cfg function" mail thread many cases
> where the quoted tokens can not be backward compatible.

Yes, we cannot have 100% backward compatibility, but I think we can
achieve 98% compatibility (so to speak), and that would be worth the
pain. The earlier attempt was closer to 1%, which was unacceptable, and
even after fixing a few obvious cases such as percents and REs, it is
still probably below 75%. I think we can do a lot better.

In other words, we should not break common cases at all but should not
worry about breaking a few exceptional cases.

I started cataloging breakages at
http://wiki.squid-cache.org/ConfigSyntax

> If the default quoted tokens mode enabled by default a lot of
> administrator will not have desired behaviour in their systems.

I think we should enumerate all backward incompatible cases that we know
about. I hope they will not apply to "a lot of admins". I hope that we
can be smart about which options should get quoted string parsing by
default and which should assume they are include files.

> For example in legacy mode the token "AToken" includes the quotes but in
> quoted tokens mode the quotes are not included.

Yes, but there should not be many configurations using such things. Is
there some common need for them in authentication or something? If yes,
perhaps we can treat those cases specially.

I looked at all squid.conf options with the TYPE set to string. I did
not find any of them that are remotely likely to use quoted values
without spaces.

There are a couple of options like cache_mgr that set email addresses. I
have a hard time imagining a valid email address that would contain a
quote but will not contain a space. If I am wrong about this and such
cases are not uncommon, let's parse those options in legacy mode and
offer similar workarounds as (1) and (2) discussed below.

> This is not the only problem. In legacy mode, quotes may have different
> meaning in configuration parameters, somewhere supported, in other cases
> are not supported and in acls used to load external files.

Yes, but the revised approach forces the caller (the specific directive
parsing code) to tell the low-level parser which flavor of the value
expression to extract (relaxed, strict, strict-if-possible-or-relaxed,
etc.).

Parsers of configuration options that already support quoted strings can
tell the low-level parser to extract a strict token if possible. Parsers
for conflicting legacy options such as logformat will tell the low-level
parser to extract a legacy value. New options should all use strict syntax.

Furthermore, an http_access parser, for example, does not need to
support quoted strings at all. Thus, it can request that the low-level
parser treats any quoted string as a legacy include statement (with a
warning). Same for "src" ACL parser and, I bet, many others.

Finally, all v3.3 options that use strtokFile() should interpret "quoted
strings" the old way (including a file with values after issuing a
warning). If an admin wants to use an ACL value with a space, they can:

1) Use a 'single quoted string'. This is the recommended approach.

2) Use squid::"double quoted string". This is only needed if they want
Squid to interpret special escape sequences inside that quoted value.

This will solve all the problems with the included parameters, right?!

> Maybe we can do the transition in steps. For example:
>
> - 3.4 release: Add quoted tokens, legacy is enabled by default

The result: Virtually nobody notices any changes and, hence, virtually
no bug reports about us breaking something important.

Might as well not rewrite the v3.4 parser at all (undo that last minute
v3.4 commit), so that Amos does not have to wait for those changes to
stabilize and can make smooth progress with v3.4 release.

> - 3.5 release: legacy does not allow quotes in tokens, force users to
> specify the mode they want to use.

> - 3.6 release: quoted tokens enabled by default

I do not see much value in delaying the new default until v3.6. If we
are going to introduce upgrade pains, it is better to do that in one
release (v3.5) than spread it over two releases IMO.

However, there are many cases where the parser would issue a "you are
using a legacy syntax" warning (or equivalent). I think it would be
beneficial to eventually remove the special code that issues those
warnings (after a few releases, probably not in v3.6). That removal will
simplify the parser considerably, which is good long-term.

To summarize, here is what I am thinking as far as introducing this feature:

  v3.4: Legacy mode. Special code to allow ACLs with spaces.
  v3.5: Strict mode with smart fallbacks to legacy (and warnings).
  v3.6: ... same but make those Warnings scarier ...
  v3.7: ... same but make those WARNINGS even scarier ...
  v3.8: Strict mode without fallbacks. Remove "legacy_mode" directive.

Please review the new wiki page. If you can think of more breakage cases
to add, let's discuss them.

Thank you,

Alex.
Received on Fri Sep 27 2013 - 23:44:11 MDT

This archive was generated by hypermail 2.2.0 : Sat Sep 28 2013 - 12:00:11 MDT