Re: StringNg review request: SBuf

From: Kinkie <gkinkie_at_gmail.com>
Date: Tue, 19 Oct 2010 16:28:59 +0200

On Mon, Oct 18, 2010 at 9:02 PM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 09/26/2010 04:47 PM, Kinkie wrote:
>
>>>>>>      *
>>>>>>      * A SBuf can be empty, but can never be undefined. In other words
>>>>>> there is no equivalent
>>>>>>      * of NULL. defined/undefined semantics need to be implemented by
>>>>>> the
>>>>>> caller.
>>>>>>      */
>>>>>
>>>>> Remove: Avoid documenting what SBuf cannot be. If you insist on
>>>>> discussing
>>>>> the non-existent APIs, move that discussion to the SBuf.dox file.
>>>>
>>>> How about moving a simplified version of it that to the class
>>>> documentation? I have committed such an attempt.
>
>>> You are still documenting negatives, which is counter-productive. I know
>>> what you mean by "can never be undefined" but a new developer will not
>>> know
>>> how to interpret that because there is no defined() function in C/C++.
>
>> Is "A SBuf can be empty (zero-length), but always points to a valid
>> memory address." clearer?
>
> Not really because from the user point of view SBuf does not point to
> anything at all. There are no "*" or "->" operators for SBuf.

Tried clarifying again, "A SBuf can be empty (zero-length), but is
always defined."

>>>>>>     /** Assignment operation with printf(3)-style definition
>>>>>>      * \note arguments may be evaluated more than once, be careful
>>>>>>      *       of side-effects
>>>>>>      */
>>>>>>     SBuf&    Printf(const char *fmt, ...);
>>>>>
>>>>> Add "Avoid this and other ... methods. For transition only."?
>>>>> We should be using ostream formatting instead.
>>>>
>>>> Agreed, if only I could think of a sane way to do so (my ignorance).
>>>
>>> If you point me to a specific (but hopefully rather common) use case
>>> where
>>> you cannot use ostream, I will try to come up with a sane way of using
>>> it.
>>> Most likely, it will require adding a FixedSizeStream class of sorts.
>>
>> My ignorance is in a way to implement an ostream interface with a
>> custom backing store.
>> Amos has shed some light for me (StoreEntryStream), I'll need help or
>> some more detailed documentation to do it on my own.
>
> AFAIK, you need a custom streambuf class that you can then use to create an
> output stream that would write to your buffer. Attached file implements both
> using (const char *buffer, size) pair as a fixed-size buffer. The code is
> known to work in some environments, but may contain bugs or missing
> features. I do not know whether it has been tested outside of the GCC world.
>
> You may import the provided (buffer, size) API as is. If you do, many
> current users would benefit. You may change that API to use SBuf instead.
> Only new users would benefit. Finally, you can leave the old API, add
> SBuf-based API, and implement the old API using SBuf. That way, both old and
> new users would benefit.

I'll add this task after SBuf has been merged, if it's fine by you.

> If you find any bugs, please let me know so that we can fix them in other
> projects as well.

Ok.

> StoreEntryStreamBuf appears to do similar stuff but for a non-fixed buffer.
> You should probably use that code as an example if you decide to make a
> non-fixed SBuf-based stream. It is also likely that the attached code can
> borrow some optimizations from StoreEntryStreamBuf.

IMVHO the osbufstream (or whatever it will be called) should
accumulate data and release them as SBufs. From this point of view, it
wouldn't be fixed-size.

>> As discussed on IRC, will move to
>> const char * rawContent() const
>> (->  raw access to memory for READING)
>> char * rawSpace()
>> (->  raw access to memory for WRITING, does cow(). Return a pointer to
>> the first unused byte in the backend store)
>>
>> There is a problem with this approach though. If we write to the SBuf
>> outside its import methods, we also need to signal it how much we
>> wrote (e.g. a method "haveAppended"). But it's lots of layer-breaking
>> :(
>
> Since we decided to provide write access to the raw buffer, the API has to
> have a minimum set of methods, including appended(). There is nothing wrong
> with that, IMO.

I hope that in time we will be able to do without accessing the raw
buffer, even if it'll mean making the SBuf API wider.

>>> To summarize and polish both for const:
>>>
>>>    // not 0-terminated, writeable internal storage; AVOID
>>>    char *rawStorage();
>>>
>>>    // terminated c-string; may not be internal storage; SLOW
>>>    const char *c_str();
>>>
>>> Eventually, we should make c_str() a const method but let's not go there
>>> now. You may add a TODO about that.
>>
>> Both are now const. Easier to make them non-const if needed than the
>> opposite.
>
> You did not make c_str() const as far as I can see (r9518). It is not
> trivial to do so because c_str() calls terminate() which is not const. Let's
> not do that for now.

yes, it can't be const..
>>>>>>  u_int16_t nreallocs; /// number of reallocs this SBuf has been
>>>>>> subject
>>>>>> to
>>>>>
>>>>> Since nreallocs is copied on assignment, it's true meaning is rather
>>>>> difficult to define. However, it is _not_ the number of reAlloc() calls
>>>>> that _this_ SBuf has seen.
>>>>
>>>> nreallocs is meant to be a hint to the memory management routines to
>>>> define the likeness of needing a grow operation, and thus to define
>>>> how much headroom to leave. It's currently unused in favour of a
>>>> static (20% headroom + rounding) approach.
>>>> Any feedback on optimizing the heuristic and when to propagate the
>>>> hint is welcome.
>>>
>>> My feedback is to remove the unused member until somebody properly
>>> defines
>>> and uses it.
>>
>> I've tried to define it instead.
>
> I do not see any changes in nrealloc description so it is still does not
> reflect the reality. Please do not say that something is done until you
> commit the corresponding change.

Sorry, there must have been some short-circuit. The implementation now
follows the documentation: copy-constructors and assignment operators
no longer copy nreallocs.

>> The algorithm needs tweaking, but
>> this should already
>> give an idea about what it may be about. The useage scenario it is
>> designed for is:
>> SBuf s;
>> while (s.append(something())) {
>>    SBuf read=s.consume(chunk)
>>    doSomethingWith(read);
>> }
>> s will get realloc'ed more and more; by giving more headroom we ensure
>> that it will adapt to the useage pattern.
>>
>> SBuf::size_type SBuf::estimateCapacity (SBuf::size_type desired) const
>> {
>>     if (nreallocs<  10)
>>        return 2*desired;
>>     if (nreallocs<  100)
>>        return 4*desired;
>>     if (nreallocs<  1000)
>>        return 8*desired;
>>     return 16*desired;
>> }
>
> I understand what you are trying to do here, but I have no idea whether such
> optimization would help or hurt. It would be nice to focus on essential
> pieces first and then add questionable optimizations later, after they have
> been tested and proved their effectiveness. It is often much easier to add a
> good new optimization than to find and remove a bad old one.

It'd be interesting to think of some indicators to tune the algorithm.
How about some histograms for sizes and distribution of nrealloc's at
SBuf destruction time, or the distribution of requested sizes (before
algorithms are applied)?

>>>>>>     _SQUID_INLINE_ char * buf() const;
>>>>>>     _SQUID_INLINE_ char * bufEnd() const;
>>>>>
>>>>> Why const if they allow SBuf modification?
>>>>
>>>> They don't allow modifications. They are private functions to get
>>>> pointers and end of "our" store, for interfacing with the legacy C
>>>> world.
>>>
>>> They should return "const char *" if they do not allow modification.
>>
>> I believe I wasn't clear: those are simply private mehtods meant to
>> help with recurring activities.
>
> That does not answer my original question: Why const if they allow SBuf
> modification? Either they should return "const char *" or they should not be
> const themselves.

They don't change the SBuf. But maybe I'm just misunderstanding the
meaning of a const method.
I can de-constify them and adjust calling methods accordingly.

> I will come back to SBuf once MemBlob is committed.

Thanks.

-- 
    /kinkie
Received on Tue Oct 19 2010 - 14:29:06 MDT

This archive was generated by hypermail 2.2.0 : Tue Oct 19 2010 - 12:00:04 MDT