[PATCH] Improve Configuration Checking and Parsing

From: Tianyin Xu <tixu_at_cs.ucsd.edu>
Date: Fri, 4 Jan 2013 18:05:23 -0800

Hi, all,

I modified my patches according to Amos's comments and suggestions
(Thanks a lot, Amos!).

The attached patch includes all the modifications upon 3.HEAD/trunk.

The only thing I didn't follow is to move the xato* functions in the
compatibility library and to replace all the ato* functions in the
code. I thought about it for quite a while, but decided not to have a
xato* series as lib calls. The main reason is that these xato*
functions are tailored for configuration parsing instead of general
string-to-number functionality. All the current ato* functions (in
src/Parsing.cc) will call self_destruct() when error occurs, which in
turn prints out the bungled line in the configuration file (this's
really good!). On the other hand, strto* series as lib calls already
provide safe string-to-number conversion, and anyone can take
advantage of them. So I don't think there's any need to have a new
series of lib calls.

I modified some of the ato* functions during configuration file
parsing using the safe functions we implemented, but didn't change
others because they might not be used in configuration parsing (at
least I don't have good understandings of them). My modifications are
type safe, i.e., they didn't introduce any semantic or format change.
Backward compatibility is most important for users and customers. For
example, I use xatoui() only when the type of the dest variable is
"unsigned int"; GetPercentage() is introduced to support options like
"20%" (previously atoi ignores %), etc.

Amos proposed the great idea that "In the cases where -1 is used we
want to update the documentation and parser to accept the string
"none". Which maps internally to the disabled value (usually still -1
or -2). After which anything else must be >0." But for now, I don't
want a tremendous change in my patch because it's quite dangerous to
abandon all the previous "-1"-like settings.

Please review my patch. Any comment are highly welcomed, and I'll be
very happy to modify them.

Best,
Tianyin

On Thu, Jan 3, 2013 at 7:38 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> On 2013-01-04 14:50, Tianyin Xu wrote:
>>
>> Hi, all,
>>
>> As proposed earlier, I was working on improving the configuration
>> checking and parsing during the holiday.
>>
>> I mainly focused on two aspects including numeric parsing and enum
>> values by using more strict checkings and more verbosity. I hope to
>> make the software more user-friendly and less error-prone :-)
>> Since all the code are used during user configuration translation, it
>> wouldn't introduce too much overhead.
>>
>> Could anyone tell me how to submit my patches?
>>
>
> Our merge procedure and requirements are all documented here:
> http://wiki.squid-cache.org/MergeProcedure
>
>
>
>> I describe what I did as follows, and I hope it makes sense to you guys.
>>
>> ================
>> 1. Numeric Options
>> ================
>> --------------------------------
>> 1.1. Integer Overflow
>> --------------------------------
>>
>> I mainly modified the parsing functions in squid-3.2.5/src/Parsing.cc
>> and squid-3.2.5/compat/xstrto.cc,
>
>
> FYI: changes built on stable branches such as 3.2 are most likely to have
> problems merging. We will be applying your changes to the 3.HEAD / trunk and
> differences between the branches can cause issues both code bugs on broken
> patching and design problems as the feature differences build up (ie your
> early patches will be in trunk but not 3.2, so your later patches against
> 3.2 will clash with your early ones).
>
>
>>
>> There're several places have integer overflow problems, e.g.,
>>
>> 240 int
>> 241 xatoi(const char *token)
>> 242 {
>> 243 - return xatol(token);
>> 244 + long long input;
>> 245 + int ret;
>> 246 +
>> 247 + input = xatoll(token, 10);
>> 248 + ret = (int) input;
>> 249 +
>> 250 + if (input != (long long) ret) {
>> 251 + debugs(0, 0, "ERROR: The value '" << token << "' is
>> larger than the type 'int'.");
>> 252 + self_destruct();
>> 253 + }
>> 254 +
>> 255 + return ret;
>> 256 +}
>>
>> As this patch illustrated, I mainly check the converted values to make
>> sure it's valid in range.
>>
>> ------------------------------------------------
>> 1.2. String to Number conversion
>> ------------------------------------------------
>>
>> Here, I mainly abandon the direct use of "atoi()", "sscanf()", etc..
>> Squid has already noticed these problem (that's why we have our own
>> xatoi, xatos, etc), so what I do is to make them consistent with what
>> we are expecting. Moreover, some configuration parsing functions still
>> use unsafe code, though we've already had the safe conversion
>> functions. So I simply let it use the available conversion. The
>> following patch is one of the typical modifications:
>>
>> 328 GetInteger64(void)
>> ......
>> 338 - i = strtoll(token, NULL, 10);
>> 339 + input = xatoll(token, 10);
>> ......
>>
>
> We have a compatibility library where these xato*() functions should be
> defined. The versions spread out in the parsing code need to be moved to
> compat/xato.h and compat/xato.cc files. Then we need to provide wrapper
> replacements so the main code can use the "normal" ato*() names and the
> compiler replace with our improved xato*() at build time.
>
> When replacing unsafe functions please ensure the bug in the unsafe one is
> documented as the reason in the commit message. And where relevant please
> add definitions in compat/unsafe.h (with documentation why please) to make
> compile impossible when they are used.
>
>
>> -----------------------------------------
>> 1.3. Signedness conversion
>> -----------------------------------------
>>
>> For signedness issues, I mainly check the patterns which directly
>> assigns a signed value to an unsigned variable, as follows:
>>
>> 165 - s->tcp_keepalive.idle = atoi(t);
>> 166 + s->tcp_keepalive.idle = xatoui(t);
>>
>> In this example, "idle" is defined as "unsigned int" and "t" comes
>> from user input. A negative value (i.e., "-1" to disable the feature)
>> will be converted to a super large positive value if we use atoi()
>> directly.
>
>
> In the cases where -1 is used we want to update the documentation and parser
> to accept the string "none". Which maps internally to the disabled value
> (usually still -1 or -2). After which anything else must be >0.
>
>
>>
>> ============================
>> 2. Enum Options (including Boolean)
>> ============================
>> ----------------------
>> 2.1. Strictness
>> ----------------------
>>
>> For enum options, I mainly strict the if-else checking. I don't know
>> whether this's appropriate.
>>
>> 93 if (!strcasecmp(token, "on") || !strcasecmp(token, "enable"))
>> 94 *var = 1;
>> 95 - else
>> 96 + else if(!strcasecmp(token, "off") || !strcasecmp(token,
>> "disable"))
>> 97 *var = 0;
>> 98 + else {
>> 99 + debugs(0, 0, "ERROR: Invalid option -- Boolean options
>> only accept 'on'/'off' or 'enable'/'disable' to turn on/off the
>> feature.");
>> 100 + self_destruct();
>> 101 + }
>>
>
> It is much like what I would have done. Thank you. Just two notes:
>
> * The debug should probably be: debugs(0, DBG_PARSE_NOTE(DBG_IMPORTANT),
> ...
>
> * we generally document on/off as the values for boolean, enable/disable are
> just also-supported values. You can drop enable/disable out of the debugs
> help documentation and probably add them as extra cases with debugs
> informing the admin to update their config to use values on/off.
>
>
>> ---------------------
>> 2.2. Verbosity
>> ---------------------
>>
>> I personally like to give more error logs when users configuration
>> input has something wrong. For example, in parse_access_log(), if none
>> of the keyword can match the user's input, the following message will
>> be printed out:
>>
>> debugs(3, 0, "Log format '" << logdef_name << "' is not defined");
>>
>> So, I add several log message like this one:
>>
>> 133 - else
>> 134 + else {
>> 135 + debugs(0, 0, "ERROR: Invalid option " << token << ":
>> 'uri_whitespace' accepts 'strip', 'deny', 'allow', 'encode', and
>> 'chop'.");
>> 136 self_destruct();
>> 137 + }
>> 138 }
>>
>
> These are fine, but please add the extra verbose debug details with the
> level parameter being DBG_PARSE_NOTE(2) [or maybe something deeper than 2],
> so that they show up with squid -k parse and not on regular reconfigure
> operations.
>
>
> Amos

--
Tianyin XU,
http://cseweb.ucsd.edu/~tixu/

Received on Sat Jan 05 2013 - 02:05:34 MST

This archive was generated by hypermail 2.2.0 : Sat Jan 05 2013 - 12:00:12 MST