Re: [RFC] Time to talk about StringNG merge again?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 05 Jul 2013 18:33:22 +1200

Snipped out the Done bits.

On 5/07/2013 5:51 p.m., Kinkie wrote:
> On Tue, Apr 16, 2013 at 12:49 AM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
>> Hi Kinkie,
>>
>> Here is my review of
>> https://code.launchpad.net/~squid/squid/stringng-cherrypick r12753. I
>> found approximately three bugs. I am also concerned about size_type
>> type, after starting to use MemBlob which has the same problem. The rest
>> is polishing.
> Ok, thanks. I have some doubts about size_type as well, I am a bit
> worried about touching it at this stage in development, but it may be
> better now than never.
>
>>> if (InitialStore == NULL) {
>>> static char lowPrototype[] = "";
>>> InitialStore = new MemBlob(lowPrototype, 0);
>>> }
>> If possible, please use "new MemBlob(0)" and remove lowPrototype.
>>

>>> void
>>> SBuf::reserveCapacity(size_type minCapacity)
>>> {
>>> Must(0 <= minCapacity && minCapacity <= maxSize);
>>> reserveSpace(minCapacity-length());
>>> }
>> This does not look right. For example, if minCapacity is smaller than
>> length(), we should do nothing, but will throw instead. I think this
>> method should look something like this:
>>
>> Must(0 <= minCapacity && minCapacity <= maxSize);
>> cow(minCapacity);
>>
>>
>>> // we're not concerned about RefCounts here,
>>> // the store knows the last-used portion. If
>>> // it's available, we're effectively claiming ownership
>>> // of it. If it's not, we need to go away (realloc)
>>> if (store_->canAppend(off_+len_, minSpace)) {
>>> debugs(24, 7, "not growing");
>>> return;
>>> }
>> That comment is probably stale because canAppend() is currently not
>> related to ownership of the buffer: It will happily return true for
>> shared buffers. AFAICT, reserveSpace() should be implemented using
>> something like this:
>>
>> Must(0 <= minSpace && minSpace <= maxSize - length());
>> reserveCapacity(length() + minSpace);
> The comment is badly worded but the code is correct: canAppend() tells
> us if the end of our buffer is at the end of the used portion of the
> MemBlob, and there is enough space in the unused MemBlob portion to
> contain the data we plan to use. If the buffer is shared, this
> guarantee may become stale at the next append operation to any SBuf
> sharing the same MemBlob.
>
> A decision needs to be made if these two methods have different
> purposes or are to be simply different expressions of the same
> underlying concept (as it is now).
> IMO, thinking about it, it does make sense that reserveSpace also
> guarantees that after the call the MemBlob will be exclusively owned
> by this SBuf (aka, blindly cow()), while reserveCapacity can mostly be
> meant as an optimization, to hint the SBuf memory management system
> that an append operation of a known size is about to happen, so that
> only one cow is performed.
> What do you think?

Sounds good.

>
>> The above two changes imply that reserve*() methods ensure
>> single-ownership of the buffer by the caller. This is not something
>> documentation of those methods currently guarantees, but I think it is
>> logical to assume that anybody calling one of these two methods wants to
>> modify the string and, hence, will need exclusive control over that
>> string. I cannot think of any reason to call those methods otherwise.
> See above.
>
>>> typedef int32_t size_type;
>> Would it be better to make this unsigned? I do not think string sizes
>> can go negative. std::string is using an unsigned type, right? FWIW,
>> whenever I try to use MemBlob that has a signed size_type as well, I am
>> forced to do casts that would not exist otherwise. I think both types
>> should be changed to reflect what they are representing (a size of a
>> buffer).
> I have the same doubt. I'm just a bit afraid to change this late. Will
> try to do as an isolated change. So far this is NOT done.
>
>> On another note, would it be better to use MemBlob::size_type instead of
>> int32_t?
> currently they're both int32_t. I believe it's more readable with the
> extra level of indirection, at the cost of having to manually keep
> them in sync.
> Added a comment to highlight this fact.

So long as SBuf is backed by MemBlob defining SBuf::size_type to be
MemBlob::size_type makes more sense. One who needs to know can always
look up both (probably should). The compiler will tell about signedness
errors in new/modified code very quickly.

The problem will be converting from signed to unsigned at this stage. I
think get the rest to a milestone we can agree on merging before
attempting please. Then if everthing goes sour in the change we can
still revert and merge with a TODO about perfecting the polish.
When you do the change ensure you have a latest GCC or clang you can
use, they will be far more pedantic about signedness comparisons and
iterator ranges etc where the issues are most likey to occur.

>>> SBuf::append(const std::string &str, SBuf::size_type pos, SBuf::size_type n)
>>> {
>>> return append(str.data(), pos, n); //bounds checked in append()
>>> }
>> This will break when n is npos because str.data() is not 0-terminated.
>> The low-lever c-string append you call here assumes that the string
>> parameter is 0-terminated if n is npos. Please add the corresponding
>> missing test case.
> This led to restructuring the append operation for C-string, which is
> now missing the pos parameter, just like std::string does.

>
>> Also, I suggest splitting this into two methods, one with a required
>> (first?) SBufCaseSensitive parameter and one without it. This will allow
>> callers to specify n without specifying isCaseSensitive and vice versa.
>> The shorter, inlined method will simply call the longer one with
>> caseSensitive, of course, so the implementation will not require more
>> code or performance overhead.
> Ok. I'm calling the two shorthand versions cmp and casecmp respectively
> (please let me know if you'd prefer the naming-convention compliant
> caseCmp instead)

Please. We need semantic similarity to std::string not symbolic.

> Code is as before at lp:~squid/squid/stringng-cherrypick, ready and in
> sync with trunk.
> test-suite runs OK on eu and ubuntu raring.
>
> I'll be on holiday in a couple of days for a couple of weeks.
> --
> /kinkie
Received on Fri Jul 05 2013 - 06:33:32 MDT

This archive was generated by hypermail 2.2.0 : Tue Jul 09 2013 - 12:00:26 MDT