Re: [PATCH] Improve Configuration Checking and Parsing

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 05 Jan 2013 19:12:30 +1300

On 5/01/2013 3:05 p.m., Tianyin Xu wrote:
> Hi, all,
>
> I modified my patches according to Amos's comments and suggestions
> (Thanks a lot, Amos!).
>
> The attached patch includes all the modifications upon 3.HEAD/trunk.
>
> The only thing I didn't follow is to move the xato* functions in the
> compatibility library and to replace all the ato* functions in the
> code. I thought about it for quite a while, but decided not to have a
> xato* series as lib calls. The main reason is that these xato*
> functions are tailored for configuration parsing instead of general
> string-to-number functionality. All the current ato* functions (in
> src/Parsing.cc) will call self_destruct() when error occurs, which in
> turn prints out the bungled line in the configuration file (this's
> really good!).

Getting rid of that bungled message is one of our config parsing upgrade
goals. If we can adjust the xato*() calls to report errors without
aborting Squid it will make them more useful and re-usable. They should
still log an error, but IMO aborting is going too far. The code which
calls them also has more context to decide whether the config option
deserves a fatal() exit or not. You can see a good example of that in
src/HelperChildConfig.cc at the second chunk where you change atoi() to
xatoi(); the calling code is reporting what values are valid, why and
then calling self_destruct() anyway.

> On the other hand, strto* series as lib calls already
> provide safe string-to-number conversion, and anyone can take
> advantage of them. So I don't think there's any need to have a new
> series of lib calls.
>
> I modified some of the ato* functions during configuration file
> parsing using the safe functions we implemented, but didn't change
> others because they might not be used in configuration parsing (at
> least I don't have good understandings of them). My modifications are
> type safe, i.e., they didn't introduce any semantic or format change.
> Backward compatibility is most important for users and customers. For
> example, I use xatoui() only when the type of the dest variable is
> "unsigned int"; GetPercentage() is introduced to support options like
> "20%" (previously atoi ignores %), etc.
>
> Amos proposed the great idea that "In the cases where -1 is used we
> want to update the documentation and parser to accept the string
> "none". Which maps internally to the disabled value (usually still -1
> or -2). After which anything else must be >0." But for now, I don't
> want a tremendous change in my patch because it's quite dangerous to
> abandon all the previous "-1"-like settings.

Ack. Do this one separately as its own patch please.

Now, audit results from reading your patches...

compat/xstrto.cc:
* the first chunk changing strtoul() to strtol() then checking for <0 is
incorrect. Values such as 0xffffffff (with 8 f's) which strtol() maps to
negative numbers and are valid unsigned numbers. In order to use signed
conversion function to produce a unsigned value the output of the signed
function MUST have a type larger than the unsigned functions output (ie
64-bit signed conversion function can be used to parse 32-bit unsigned
values, and cannot be used to parse 64-bit signed values).
  The more correct change would be strtoul() -> strtoll(). However,
these netfilter authored functions have been placed through some strict
unit tests by the netfilter guys. They may have overlooked one case you
can identify, but before you change them please add cppunit tests to
verify that all the input cases are still handled correctly (we
currently have none for these functions).

* the second chunk. does it still compile when you use the C++
static_cast<unsigned long>() operator ?

in src/Parsing.cc:
* the debugs message about "expected characters" is not very helpful. It
should be changed to say what values are expected and what was found.
   For example: debugs(..., "ERROR: '-' is not a digit. '"<< token <<
"' is supposed to be a number.");

* I spot another bug which remains after your fixes. The "end" pointers
passed to strto*() functions is documented as being NULL or checked for
NULL value, but the Squid code is not initializing any of the "char
*end;" variables to NULL. Please fix these as part of your patch. "char
*end = NULL;" should be sufficient.

* in xatoi(). This is C++ code please define the variable at point of
first use when possible. Also, please use int64_t/uint64_t types rather
than "long long".
   For example: int64_t input = xatoll(token, 10);
  The same goes for other functions where you are adding new local
variables.

* in xatoui(). The 64-bit signed conversion function xatol() or xatoll()
needs to be used, not the xatoi() one.

* GetInteger() has the requirement of parsing masks. Please retain the
comment about that. The portion about %i is invalid after changing to
xatoll(), but GetInteger() still has that usage requirement and your
patch only works because our xatoll() actually uses strtoll() as the
backend.

* GetPercentage() can make use of the strto*() function min/max values
to double-check the range accepted.
   Also, please document in Parsing.h that 0%-100% range is validated
and what happens when the value is out of range.

in src/cache_cf.cc:
* please do not add useless empty lines. Such as in the first chunk at
line 1017 between the if statement '}' and function ending '}'.

* the next two chunks doing "m*d/u*2" please add () brackets around the
(d/u) portion to clarify what the formula is calculating.

* the debugs statements "will be deprecated in future releases." is
incorrect. The situation is "is deprecated." from the moment these
debugs lines get merged.

* at line 3475. port is being set by the strtol() call in the if()
condition. There is no need to parse the token twice.

* the debugs you are adding at line 3705 appears to already be present
in my copy of trunk with better text.
    Hmm. I'm wondering if you did get trunk checkout properly now.

in src/wccp2.cc:
* the second chunk is parsing a port value. It should probably be using
xatos() and p defined as a uint16_t.

Question: can you identify what the benefits/costs of using the strto*()
versus the ato*() functions family are? I think now is a good time to
identify which has the best performance and safety to migrate the code
to them. xstrtoui() / xatoui() are overlapping, as are some of the others.

Amos
Received on Sat Jan 05 2013 - 06:12:42 MST

This archive was generated by hypermail 2.2.0 : Mon Jan 07 2013 - 12:00:05 MST