Re: [PATCH] Tokenizer

From: Kinkie <gkinkie_at_gmail.com>
Date: Mon, 2 Jun 2014 19:27:56 +0200

Hi Alex,
welcome back!

On to the meat:

>> +Parser::Tokenizer::token(SBuf &returnedToken, const CharacterSet &delimiters)
>> +{
>> + SBuf savebuf(buf_);
>> + SBuf retval;
>> + SBuf::size_type tokenLen = 0;
>
> This code has changed since the patch was posted for review, but please
> declare both retval and tokenLen as constant.

Done.

>> + SBuf::size_type prefixLen = buf_.substr(0,limit).findFirstNotOf(tokenChars);
>
>> + SBuf::size_type prefixLen = buf_.findFirstNotOf(tokenChars);
>
> Please declare variables as constant whenever possible.

Ok

>> + //fixme: account for buf_.size()
>> + bool neg = false;
>> + const char *s = buf_.rawContent();
>> + const char *end = buf_.rawContent() + buf_.length();
>
> Could you please explain what the above comment means? There is no
> SBuf::size() AFAICT.

length() is an alias of size(). Changed to size() to avoid confusion

>> + * At least one terminating delimiter is required. \0 may be passed
>> + * as a delimiter to treat end of buffer content as the end of token.
>
>> + if (tokenLen == SBuf::npos && !delimiters['\0']) {
>
> The above is from the actually committed code (not the code posted for
> review). Please remove \0 as a special case. As far as I can tell, the
> reason it was added is already covered well enough by the existing API.
> The addition of this special case is likely to just confuse and break
> things. For example, adding \0 to delimiters for the purpose of
> disabling this special case will also tell Tokenizer to skip \0
> characters in input, which is often wrong.

I'll let Amos argue for this as he's the one who has the clearest
picture of the use case.
Given your objection, my 2c suggestion to address the need is to
either have two token() methods, one which requires a separator at the
end of a token and one which doesn't, or to have a bool flag passed to
the method.

If noone objects, I'll commit to trunk the changes agreed above as
soon as it's build-tested; they are harmless from the functional point
of view.

-- 
    Francesco
Received on Mon Jun 02 2014 - 17:28:06 MDT

This archive was generated by hypermail 2.2.0 : Tue Jun 03 2014 - 12:00:17 MDT