Re: [PATCH] Improve Configuration Checking and Parsing

From: Tianyin Xu <tixu_at_cs.ucsd.edu>
Date: Sat, 5 Jan 2013 16:12:00 -0800

Hi, Amos,

I addressed your audit results one by one (see my reply below each
audit entry). The updated patches is attached.

Two things I want to mention in the very beginning:

1. I checked the version according to your information, and made sure
it's the newest one (revision 12566). :-)

2. For the precedence level, you mentioned "m*d/u == "m*(d/u)". I
checked both C and C++ reference manual*,**.
* M. A. Ellis and B. Stroustrup, "The Annotated C++ Reference
Manual", Addison-Wesley Publishing, Sept. 1999.
** S. P. Harbison and G. L. Steele Jr., "C: A Reference Manual (3rd
Edition)", Prentice Hall, 1991.

Both of them claims that multiplicative operators (include *, /, %)
have the same precedence level, and group left-to-right. So I think it
makes sense to rewrite m*d/u*2 to (m*d/u)*2 but not to m*(d/u)*2. Let
me know if I misunderstood.

Further audit or comments are very welcomed.

Thanks a lot!
Tianyin

On Fri, Jan 4, 2013 at 10:12 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> On 5/01/2013 3:05 p.m., Tianyin Xu wrote:
>>
>> Hi, all,
>>
>> I modified my patches according to Amos's comments and suggestions
>> (Thanks a lot, Amos!).
>>
>> The attached patch includes all the modifications upon 3.HEAD/trunk.
>>
>> The only thing I didn't follow is to move the xato* functions in the
>> compatibility library and to replace all the ato* functions in the
>> code. I thought about it for quite a while, but decided not to have a
>> xato* series as lib calls. The main reason is that these xato*
>> functions are tailored for configuration parsing instead of general
>> string-to-number functionality. All the current ato* functions (in
>> src/Parsing.cc) will call self_destruct() when error occurs, which in
>> turn prints out the bungled line in the configuration file (this's
>> really good!).
>
>
> Getting rid of that bungled message is one of our config parsing upgrade
> goals. If we can adjust the xato*() calls to report errors without aborting
> Squid it will make them more useful and re-usable. They should still log an
> error, but IMO aborting is going too far. The code which calls them also has
> more context to decide whether the config option deserves a fatal() exit or
> not. You can see a good example of that in src/HelperChildConfig.cc at the
> second chunk where you change atoi() to xatoi(); the calling code is
> reporting what values are valid, why and then calling self_destruct()
> anyway.
>
>
>> On the other hand, strto* series as lib calls already
>> provide safe string-to-number conversion, and anyone can take
>> advantage of them. So I don't think there's any need to have a new
>> series of lib calls.
>>
>> I modified some of the ato* functions during configuration file
>> parsing using the safe functions we implemented, but didn't change
>> others because they might not be used in configuration parsing (at
>> least I don't have good understandings of them). My modifications are
>> type safe, i.e., they didn't introduce any semantic or format change.
>> Backward compatibility is most important for users and customers. For
>> example, I use xatoui() only when the type of the dest variable is
>> "unsigned int"; GetPercentage() is introduced to support options like
>> "20%" (previously atoi ignores %), etc.
>>
>> Amos proposed the great idea that "In the cases where -1 is used we
>> want to update the documentation and parser to accept the string
>> "none". Which maps internally to the disabled value (usually still -1
>> or -2). After which anything else must be >0." But for now, I don't
>> want a tremendous change in my patch because it's quite dangerous to
>> abandon all the previous "-1"-like settings.
>
>
> Ack. Do this one separately as its own patch please.
>
>
> Now, audit results from reading your patches...
>
> compat/xstrto.cc:
> * the first chunk changing strtoul() to strtol() then checking for <0 is
> incorrect. Values such as 0xffffffff (with 8 f's) which strtol() maps to
> negative numbers and are valid unsigned numbers. In order to use signed
> conversion function to produce a unsigned value the output of the signed
> function MUST have a type larger than the unsigned functions output (ie
> 64-bit signed conversion function can be used to parse 32-bit unsigned
> values, and cannot be used to parse 64-bit signed values).
> The more correct change would be strtoul() -> strtoll(). However, these
> netfilter authored functions have been placed through some strict unit tests
> by the netfilter guys. They may have overlooked one case you can identify,
> but before you change them please add cppunit tests to verify that all the
> input cases are still handled correctly (we currently have none for these
> functions).

Yes, I changed it back.

>
> * the second chunk. does it still compile when you use the C++
> static_cast<unsigned long>() operator ?
>

Yes, it still compiles so I changed it using static_cast, and use
static_cast in other functions.

>
> in src/Parsing.cc:
> * the debugs message about "expected characters" is not very helpful. It
> should be changed to say what values are expected and what was found.
> For example: debugs(..., "ERROR: '-' is not a digit. '"<< token << "' is
> supposed to be a number.");
>

Yes, I refined the log messages.

> * I spot another bug which remains after your fixes. The "end" pointers
> passed to strto*() functions is documented as being NULL or checked for NULL
> value, but the Squid code is not initializing any of the "char *end;"
> variables to NULL. Please fix these as part of your patch. "char *end =
> NULL;" should be sufficient.
>

Yes, I initialized all the end pointers to NULL.

> * in xatoi(). This is C++ code please define the variable at point of first
> use when possible. Also, please use int64_t/uint64_t types rather than "long
> long".
> For example: int64_t input = xatoll(token, 10);
> The same goes for other functions where you are adding new local variables.
>

Yes, I change all the "long long" to "int64_t".

> * in xatoui(). The 64-bit signed conversion function xatol() or xatoll()
> needs to be used, not the xatoi() one.
>

Yes, I changed xatoi() to xatoll().

> * GetInteger() has the requirement of parsing masks. Please retain the
> comment about that. The portion about %i is invalid after changing to
> xatoll(), but GetInteger() still has that usage requirement and your patch
> only works because our xatoll() actually uses strtoll() as the backend.
>

Yes, I recover the previous comment and modified it.

> * GetPercentage() can make use of the strto*() function min/max values to
> double-check the range accepted.
> Also, please document in Parsing.h that 0%-100% range is validated and
> what happens when the value is out of range.
>

Yes, I added the document in Parsing.h

> in src/cache_cf.cc:
> * please do not add useless empty lines. Such as in the first chunk at line
> 1017 between the if statement '}' and function ending '}'.
>

Yes, I deleted that empty line and checked others.

> * the next two chunks doing "m*d/u*2" please add () brackets around the
> (d/u) portion to clarify what the formula is calculating.

Sorry I still do not understand this part. Please see the explanation
at the beginning of this mail.

>
> * the debugs statements "will be deprecated in future releases." is
> incorrect. The situation is "is deprecated." from the moment these debugs
> lines get merged.

Yes, I changed the log.

>
> * at line 3475. port is being set by the strtol() call in the if()
> condition. There is no need to parse the token twice.
>

I kept it.

> * the debugs you are adding at line 3705 appears to already be present in my
> copy of trunk with better text.
> Hmm. I'm wondering if you did get trunk checkout properly now.
>

I checked the newest version and that one is not there.:-)

> in src/wccp2.cc:
> * the second chunk is parsing a port value. It should probably be using
> xatos() and p defined as a uint16_t.
>

I kept for the better logging.

>
>
> Question: can you identify what the benefits/costs of using the strto*()
> versus the ato*() functions family are? I think now is a good time to
> identify which has the best performance and safety to migrate the code to
> them. xstrtoui() / xatoui() are overlapping, as are some of the others.
>
>
> Amos

--
Tianyin XU,
http://cseweb.ucsd.edu/~tixu/

Received on Sun Jan 06 2013 - 00:12:10 MST

This archive was generated by hypermail 2.2.0 : Thu Jan 10 2013 - 12:00:05 MST