Re: [RFC] Time to talk about StringNG merge again?

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 30 Jul 2013 18:53:58 -0600

On 07/30/2013 03:56 AM, Kinkie wrote:

> lp:~squid/squid/stringng-cherrypick has the most uptodate code.

I did not finish the review before running out of time, but I found one
or two bugs in r12761 (in the code I have not looked at before). There
are also a few polishing issues:

> typedef int32_t size_type;

I think we should make it unsigned. I agree with your earlier
observation that it is better to do that now rather than later or never.

> void reserveSpace(size_type minSpace) {reserveCapacity(length()+minSpace);}

As I mentioned before, this lacks overflow control if the sum of
length() and minSpace exceeds size_type maximum. Please add a check.

> cow(minSpace+length());

Same here, in rawSpace().

> /** Request to extend the SBuf's free store space.
> /** Request to resize the SBuf's store capacity

The caller does not really request to extend or resize the store because
they do not normally know what the store size is. I suggest replacing
"extend" and "resize" with "guarantee".

> * This method can also be used to prepare the SBuf by preallocating a
> * predefined amount of free space; this may help to optimize subsequent
> * calls to SBuf::append or similar methods. In this case the returned
> * pointer should be ignored.
...
> char *rawSpace(size_type minSize);

Please remove that part of the comment. You can say "See also:
reserveSpace()" if you want.

However, I would _add_ something like this:

    Unlike reserveSpace(), this method does not guarantee exclusive
    buffer ownership. It is instead optimized for a one writer
    (appender), many readers scenario by avoiding unnecessary
    copying and allocations.

> //reserve twice the format-string size, it's a likely heuristic
> rawSpace(strlen(fmt)*2);
>
> while (length() <= maxSize) {
...
> sz = vsnprintf(bufEnd(), store_->spaceSize(), fmt, ap);

The above does not illustrate how to use rawSpace() well. We do not have
to provide good examples in lower level code, but I think it is a good
idea to eat our own dog food. The above code should probably be rewritten as

     // heuristic: we will need twice as much space as the format
     const size_type minSpace = strlen(fmt)*2;

     while (length() <= maxSize) {
...
         const char *space = rawSpace(minSpace);
         sz = vsnprintf(space, spaceSize(), fmt, ap);

BTW, please add SBuf::spaceSize() as a public method so that
optimization like the above will work. Otherwise, non-privileged users
will have to write

    foobar(buf.rawSpace(minSize), minSize);

instead of getting an advantage of the actual space size being larger
than the requested minimum.

> if (sz >= static_cast<int>(store_->spaceSize()))
> rawSpace(sz*2); // TODO: tune heuristics

I would use reserveSpace() here instead of ignoring rawSpace() result
because we know the buffer needs to be reallocated at this point, right?

> /* snprintf on Linux returns -1 on overflows */
...
> if (sz >= static_cast<int>(store_->spaceSize()))
> rawSpace(sz*2); // TODO: tune heuristics
> else if (sz < 0) // output error in vsnprintf
> throw TextException("output error in vsnprintf",__FILE__, __LINE__);

If snprintf on Linux returns -1 on overflows, we should not throw when
sz is -1. We should grow the buffer instead, right?

> while (length() <= maxSize) {
> sz = vsnprintf(bufEnd(), store_->spaceSize(), fmt, ap);
...
> /* snprintf on FreeBSD returns at least free_space on overflows */
>
> if (sz >= static_cast<int>(store_->spaceSize()))
> rawSpace(sz*2); // TODO: tune heuristics
> else if (sz < 0) // output error in vsnprintf
> throw TextException("output error in vsnprintf",__FILE__, __LINE__);
> else
> break;
> }

Nothing in that loop grows length(), which is your guard against an
"infinite" loop on FreeBSD (at least). Can you think of a better loop guard?

> len_ += sz;
> // TODO: this does NOT belong here, but to class-init or autoconf
> /* on Linux and FreeBSD, '\0' is not counted in return value */
> /* on XXX it might be counted */
> /* check that '\0' is appended and not counted */
>
> if (operator[](len_-1) == '\0') {
> --sz;
> --len_;
> }

There is no guarantee that len_ is positive. We could start with a zero
len_ and an empty "" pattern would result in zero sz. The above code may
then access character at -1 offset of the raw storage array...

vprintf() is probably so slow that any code using it is not optimized.
We should use at(len_-1) instead here.

> /* check that '\0' is appended and not counted */
>
> if (operator[](len_-1) == '\0') {

Perhaps this day was too long, but I do not think the above checks
whether \0 was appended but not counted. It checks whether \0 was
appended and counted. Which is probably what you want to check, so the
comment should be adjusted to remove the word "not".

HTH,

Alex.

> On Mon, Jul 29, 2013 at 11:55 PM, Alex Rousskov wrote:
>> On 07/29/2013 01:38 PM, Kinkie wrote:
>>
>>> So to implement this I should:
>>> - rename current reserveSpace to rawSpace, add returning the pointer
>>> as in current rawSpace.
>>> - reimplement reserveSpace as a convenience calling reserveCapacity
>>> - adjust clients of reserveSpace to use rawSpace instead.
>>>
>>> If this is correct, I'm OK with it.
>>
>>
>> Yes, I think the above plan will result in the implementation I am
>> rooting for.
>>
>>
>> Thank you,
>>
>> Alex.
>>
>
>
>
Received on Wed Jul 31 2013 - 00:54:20 MDT

This archive was generated by hypermail 2.2.0 : Wed Jul 31 2013 - 12:00:07 MDT