Re: StringNg review request: SBuf

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 18 Oct 2010 13:02:29 -0600

On 09/26/2010 04:47 PM, Kinkie wrote:

>>>>> *
>>>>> * A SBuf can be empty, but can never be undefined. In other words
>>>>> there is no equivalent
>>>>> * of NULL. defined/undefined semantics need to be implemented by
>>>>> the
>>>>> caller.
>>>>> */
>>>>
>>>> Remove: Avoid documenting what SBuf cannot be. If you insist on
>>>> discussing
>>>> the non-existent APIs, move that discussion to the SBuf.dox file.
>>>
>>> How about moving a simplified version of it that to the class
>>> documentation? I have committed such an attempt.

>> You are still documenting negatives, which is counter-productive. I know
>> what you mean by "can never be undefined" but a new developer will not know
>> how to interpret that because there is no defined() function in C/C++.

> Is "A SBuf can be empty (zero-length), but always points to a valid
> memory address." clearer?

Not really because from the user point of view SBuf does not point to
anything at all. There are no "*" or "->" operators for SBuf.

>>>>> /** Assignment operation with printf(3)-style definition
>>>>> * \note arguments may be evaluated more than once, be careful
>>>>> * of side-effects
>>>>> */
>>>>> SBuf& Printf(const char *fmt, ...);
>>>>
>>>> Add "Avoid this and other ... methods. For transition only."?
>>>> We should be using ostream formatting instead.
>>>
>>> Agreed, if only I could think of a sane way to do so (my ignorance).
>>
>> If you point me to a specific (but hopefully rather common) use case where
>> you cannot use ostream, I will try to come up with a sane way of using it.
>> Most likely, it will require adding a FixedSizeStream class of sorts.
>
> My ignorance is in a way to implement an ostream interface with a
> custom backing store.
> Amos has shed some light for me (StoreEntryStream), I'll need help or
> some more detailed documentation to do it on my own.

AFAIK, you need a custom streambuf class that you can then use to create
an output stream that would write to your buffer. Attached file
implements both using (const char *buffer, size) pair as a fixed-size
buffer. The code is known to work in some environments, but may contain
bugs or missing features. I do not know whether it has been tested
outside of the GCC world.

You may import the provided (buffer, size) API as is. If you do, many
current users would benefit. You may change that API to use SBuf
instead. Only new users would benefit. Finally, you can leave the old
API, add SBuf-based API, and implement the old API using SBuf. That way,
both old and new users would benefit.

If you find any bugs, please let me know so that we can fix them in
other projects as well.

StoreEntryStreamBuf appears to do similar stuff but for a non-fixed
buffer. You should probably use that code as an example if you decide to
make a non-fixed SBuf-based stream. It is also likely that the attached
code can borrow some optimizations from StoreEntryStreamBuf.

> As discussed on IRC, will move to
> const char * rawContent() const
> (-> raw access to memory for READING)
> char * rawSpace()
> (-> raw access to memory for WRITING, does cow(). Return a pointer to
> the first unused byte in the backend store)
>
> There is a problem with this approach though. If we write to the SBuf
> outside its import methods, we also need to signal it how much we
> wrote (e.g. a method "haveAppended"). But it's lots of layer-breaking
> :(

Since we decided to provide write access to the raw buffer, the API has
to have a minimum set of methods, including appended(). There is nothing
wrong with that, IMO.

>> To summarize and polish both for const:
>>
>> // not 0-terminated, writeable internal storage; AVOID
>> char *rawStorage();
>>
>> // terminated c-string; may not be internal storage; SLOW
>> const char *c_str();
>>
>> Eventually, we should make c_str() a const method but let's not go there
>> now. You may add a TODO about that.
>
> Both are now const. Easier to make them non-const if needed than the opposite.

You did not make c_str() const as far as I can see (r9518). It is not
trivial to do so because c_str() calls terminate() which is not const.
Let's not do that for now.

>>>>> u_int16_t nreallocs; /// number of reallocs this SBuf has been subject
>>>>> to
>>>>
>>>> Since nreallocs is copied on assignment, it's true meaning is rather
>>>> difficult to define. However, it is _not_ the number of reAlloc() calls
>>>> that _this_ SBuf has seen.
>>>
>>> nreallocs is meant to be a hint to the memory management routines to
>>> define the likeness of needing a grow operation, and thus to define
>>> how much headroom to leave. It's currently unused in favour of a
>>> static (20% headroom + rounding) approach.
>>> Any feedback on optimizing the heuristic and when to propagate the
>>> hint is welcome.
>>
>> My feedback is to remove the unused member until somebody properly defines
>> and uses it.
>
> I've tried to define it instead.

I do not see any changes in nrealloc description so it is still does not
reflect the reality. Please do not say that something is done until you
commit the corresponding change.

> The algorithm needs tweaking, but
> this should already
> give an idea about what it may be about. The useage scenario it is
> designed for is:
> SBuf s;
> while (s.append(something())) {
> SBuf read=s.consume(chunk)
> doSomethingWith(read);
> }
> s will get realloc'ed more and more; by giving more headroom we ensure
> that it will adapt to the useage pattern.
>
> SBuf::size_type SBuf::estimateCapacity (SBuf::size_type desired) const
> {
> if (nreallocs< 10)
> return 2*desired;
> if (nreallocs< 100)
> return 4*desired;
> if (nreallocs< 1000)
> return 8*desired;
> return 16*desired;
> }

I understand what you are trying to do here, but I have no idea whether
such optimization would help or hurt. It would be nice to focus on
essential pieces first and then add questionable optimizations later,
after they have been tested and proved their effectiveness. It is often
much easier to add a good new optimization than to find and remove a bad
old one.

>>>>> _SQUID_INLINE_ char * buf() const;
>>>>> _SQUID_INLINE_ char * bufEnd() const;
>>>>
>>>> Why const if they allow SBuf modification?
>>>
>>> They don't allow modifications. They are private functions to get
>>> pointers and end of "our" store, for interfacing with the legacy C
>>> world.
>>
>> They should return "const char *" if they do not allow modification.
>
> I believe I wasn't clear: those are simply private mehtods meant to
> help with recurring activities.

That does not answer my original question: Why const if they allow SBuf
modification? Either they should return "const char *" or they should
not be const themselves.

I will come back to SBuf once MemBlob is committed.

Cheers,

Alex.

Received on Mon Oct 18 2010 - 19:02:34 MDT

This archive was generated by hypermail 2.2.0 : Tue Oct 19 2010 - 12:00:04 MDT