Re: [RFC] Time to talk about StringNG merge again?

From: Kinkie <gkinkie_at_gmail.com>
Date: Wed, 31 Jul 2013 14:18:09 +0200

On Wed, Jul 31, 2013 at 2:53 AM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 07/30/2013 03:56 AM, Kinkie wrote:
>
>> lp:~squid/squid/stringng-cherrypick has the most uptodate code.
>
>
> I did not finish the review before running out of time, but I found one
> or two bugs in r12761 (in the code I have not looked at before). There
> are also a few polishing issues:

Thanks.

>> typedef int32_t size_type;
>
> I think we should make it unsigned. I agree with your earlier
> observation that it is better to do that now rather than later or never.

Yes, it is agreed. It's however something I'd prefer to do as a
separate change, after everything else has been defined in order to
have a clear impact.

>> void reserveSpace(size_type minSpace) {reserveCapacity(length()+minSpace);}
>
> As I mentioned before, this lacks overflow control if the sum of
> length() and minSpace exceeds size_type maximum. Please add a check.

reserve* calls cow() which - if a resizing is needed - calls reAlloc
which throws SBufTooBigException if the requested size is too big.
Isn't that good enough?

>> cow(minSpace+length());
>
> Same here, in rawSpace().
>> /** Request to extend the SBuf's free store space.
>> /** Request to resize the SBuf's store capacity
>
> The caller does not really request to extend or resize the store because
> they do not normally know what the store size is. I suggest replacing
> "extend" and "resize" with "guarantee".

Done.

>> * This method can also be used to prepare the SBuf by preallocating a
>> * predefined amount of free space; this may help to optimize subsequent
>> * calls to SBuf::append or similar methods. In this case the returned
>> * pointer should be ignored.
> ...
>> char *rawSpace(size_type minSize);
>
>
> Please remove that part of the comment. You can say "See also:
> reserveSpace()" if you want.

Done.

>
> However, I would _add_ something like this:
> Unlike reserveSpace(), this method does not guarantee exclusive
> buffer ownership. It is instead optimized for a one writer
> (appender), many readers scenario by avoiding unnecessary
> copying and allocations.

This is perfect; thanks!

>
>> //reserve twice the format-string size, it's a likely heuristic
>> rawSpace(strlen(fmt)*2);
>>
>> while (length() <= maxSize) {
> ...
>> sz = vsnprintf(bufEnd(), store_->spaceSize(), fmt, ap);
>
>
> The above does not illustrate how to use rawSpace() well. We do not have
> to provide good examples in lower level code, but I think it is a good
> idea to eat our own dog food. The above code should probably be rewritten as
>
> // heuristic: we will need twice as much space as the format
> const size_type minSpace = strlen(fmt)*2;
>
> while (length() <= maxSize) {
> ...
> const char *space = rawSpace(minSpace);
> sz = vsnprintf(space, spaceSize(), fmt, ap);

Done (but space must be a char *)

> BTW, please add SBuf::spaceSize() as a public method so that
> optimization like the above will work. Otherwise, non-privileged users
> will have to write
>
> foobar(buf.rawSpace(minSize), minSize);
>
> instead of getting an advantage of the actual space size being larger
> than the requested minimum.

Done, but I'm quite wary of this: it implies a knowledge about
internal store positioning which I would prefer to hide from clients.

>> if (sz >= static_cast<int>(store_->spaceSize()))
>> rawSpace(sz*2); // TODO: tune heuristics
> I would use reserveSpace() here instead of ignoring rawSpace() result
> because we know the buffer needs to be reallocated at this point, right?

Hand-unrolled code doesn't ignore that return value anymore (see below)

>> /* snprintf on Linux returns -1 on overflows */
> ...
>> if (sz >= static_cast<int>(store_->spaceSize()))
>> rawSpace(sz*2); // TODO: tune heuristics
>> else if (sz < 0) // output error in vsnprintf
>> throw TextException("output error in vsnprintf",__FILE__, __LINE__);
>
> If snprintf on Linux returns -1 on overflows, we should not throw when
> sz is -1. We should grow the buffer instead, right?

Man snprintf(3) says
If an output error is encountered, a negative value is returned.

I guess the comment is broken.
vsnprintf is standard as of C99; it should be therefore quite
consistent in our build environment.

>> while (length() <= maxSize) {
>> sz = vsnprintf(bufEnd(), store_->spaceSize(), fmt, ap);
> ...
>> /* snprintf on FreeBSD returns at least free_space on overflows */
>>
>> if (sz >= static_cast<int>(store_->spaceSize()))
>> rawSpace(sz*2); // TODO: tune heuristics
>> else if (sz < 0) // output error in vsnprintf
>> throw TextException("output error in vsnprintf",__FILE__, __LINE__);
>> else
>> break;
>> }
>
> Nothing in that loop grows length(), which is your guard against an
> "infinite" loop on FreeBSD (at least). Can you think of a better loop guard?

This loop will be executed at most twice via break clause, as on the
second go we _know_ how much we have to write, the first attempt is .
I'm hand-unrolling it.

>> len_ += sz;
>> // TODO: this does NOT belong here, but to class-init or autoconf
>> /* on Linux and FreeBSD, '\0' is not counted in return value */
>> /* on XXX it might be counted */
>> /* check that '\0' is appended and not counted */
>>
>> if (operator[](len_-1) == '\0') {
>> --sz;
>> --len_;
>> }
>
> There is no guarantee that len_ is positive. We could start with a zero
> len_ and an empty "" pattern would result in zero sz. The above code may
> then access character at -1 offset of the raw storage array...
>
> vprintf() is probably so slow that any code using it is not optimized.
> We should use at(len_-1) instead here.

Ok.
This will go away by itself when we move to unsigned size_type.

>> /* check that '\0' is appended and not counted */
>>
>> if (operator[](len_-1) == '\0') {
>
> Perhaps this day was too long, but I do not think the above checks
> whether \0 was appended but not counted. It checks whether \0 was
> appended and counted. Which is probably what you want to check, so the
> comment should be adjusted to remove the word "not".

You're right, the comment as-is muddies the waters. Reworded as "check
whether '\0' was appended and counted.
But in the end, as vsnprintf is a C99 standard (aand we require C99 +
now to compile, so that check is now probably redundant.

-- 
    /kinkie
Received on Wed Jul 31 2013 - 12:18:18 MDT

This archive was generated by hypermail 2.2.0 : Wed Jul 31 2013 - 12:00:07 MDT