Re: SBuf review at r9370

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 27 Feb 2009 16:46:57 +1300

Kinkie wrote:
> On Thu, Feb 26, 2009 at 1:46 AM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
>> Hi Kinkie,
>>
>> Here are a few corrections for
>> http://bazaar.launchpad.net/~kinkie/squid/stringng at r9370.
>>
>> * Remove isLiteral and the corresponding code/logic. We might add this
>> very minor performance optimization as a non-public Blob member once the
>> code is known to work well. For now, the code is confusing enough
>> without it.
>
> The Literal thing was actually added to answer to a very specific
> need. IIRC it was because without it valgrind complained about leaked
> memory in case of a global static SBuf.
> If you're sure, I'll remove it.
>
>> Once isLiteral is removed, compare with the prototype pointer instead of
>> testing isLiteral in cow().
>>
>> * Remove isImported. Copy and then free the buffer when importing
>> instead. Same motivation as in the isLiteral comment above.
>
> This too has a IMO a very practical use: it allows us an easy path to
> get into the SBuf world i/o buffer which have been created by the i/o
> layer.
> If you're sure, I'll remove it.
>
>> * SBuf copy constructor and assignment operator still initialize members
>> differently.
>
> Fixed. nreallocs is now imported by copy-constructor.
>
>> * Some throw() declarations are still left in the code. Grep for them.
>
> Are you sure? There's only one that I could find, in the virtual
> destructor for OutOfBoundsException, and in that case it's needed to
> match the signature of TextException.
>
>> * cow() logic feels wrong because grow() does not just grow(). Grow is
>> discussed below. The cow() code should be something very simple like:
>>
>> if (we are alone)
>> return false; // no need to copy
>
> The only difference I could find is an isLiteral check..
>
>> store_ = new MemBuf(*store_); // get ourselves an exclusive copy of
>> the blob
>> return false;
>> No grow() magic -- we are getting an exclusive copy, not growing something.
>
>
> MemBlob has disabled copy-constructor and assignment operator by
> choice, as it gets handled through (refcount) pointers. (BTW: is this
> a violation of the coding guidelines?)
> It's certainly possible to add it, it's just a different style.
>
> The only extra operation performed by the current implementation is to
> run a re-evaluation of the blob store sizing heuristics: at the end
> it's a cheap operation compared to the copying, and since we're
> copying we may as well check it.
> I'll change if you confirm me that you have also considered this side
> of the argument.
>
> I however agree with you that the current cow()/grow() logic is flawed
> in that a certain amout of growing is always performed, regardless the
> fact that the MemBlob is used or not. Which means that a SBuf which
> gets aliased and cow()ed a lot will grow and waste memory.
>
> My suggestion: reintroduce reAlloc() which gets called by cow() and
> does the low-level work, and have both cow() and grow() perform
> checks, heuristics and call reAlloc(). How would you like this?
>
>> * roundto(5,5) should return 5 but returns 10. The comment and
>> implementation should be fixed. The method probably does not belong here
>> though.
>
> Reimplemented and re-checked. Now works as advertised.
> Moved to libmiscutil.
>
>> * estimateCapacity() is too complex to understand its performance
>> effects, especially since nreallocs is a very strange animal. I suggest
>> that we keep it very simple and add optimizations later, if needed.
>> Something like
>>
>> granularity = desired < PageSize ? 16 : PageSize;
>> return roundto(desired, granularity)
>>
>> is much easier to comprehend, analyze, and optimize. Let's postpone
>> complex optimizations.
>
> I agree, with a SLIGHTLY more complex design to match the current
> standard MemPool String sizes.
> Stuff like:
> desired *= constant_growth_factor;
> if (desired< ShortStringsSize)
> return ShortStringsSize;
> if (desired < MediumStringsSize)
> return ShortStringsSize;
> if (desired < BigStringsSize)
> return BigStringsSize;
> if (desired < 2kBuffersSize)
> return BigStringsSize;
> return roundto(desired,PageSize);
>
> We may want to discuss whether squidSystemPageSize should be static to
> this module (as it is now) or belongs to the global namespace.
>
>> * To me, compare() returns the opposite of what strcmp() does. "This"
>> should be treated as the first parameter of strcmp() and not the second
>> one, IMO.
>
> Gah!
> Fixed and added explicit cross-checks to unit tests.
>
>> * startsWith() is buggy -- it ignores the contents of its parameter's
>> buffer.
>
> Does it?
> Its implementation reads: "if *this is too short to match, then it
> can't match. Otherwise does a SBuf equality test on a a substr() of
> *this long just as the parameter..
>
>> * SBuf::consume() returns a copy of SBuf. It should return a reference
>> to this or void.
>
> Er.. no.
> Consume shortens the SBuf by chopping off the head and returns the chopped part.
> See how it's used in the tokenizer.
>
> Given a:
> SBuf data="some very long string";
>
> then:
> SBuf chopped=data.consume(5);
>
> is equivalent to:
> SBuf chopped=data.substr(0,5);
> data.substrSelf(5,SBuf::npos);
>
> or to:
> SBuf chopped=data.substr(0,5);
> data=data.substr(5,SBuf::npos);
>
> It is just (IMVHO) more readable clear in its purpose.
> a MemBuf-alike consume() makes no sense in that it's just an alias for
> substrSelf(0,target).
>
>> * Move "if !canAppend then grow" logic to a single method. Call it
>> reserve(). More on grow() below.
>
> (I need to check this out)
>
>> * grow() comments are wrong. The method does not always allocate a new
>> buffer.
>
> It says so in the comments, just not at the very beginning.
> I've changed the wording somewhat, but here all my limitations as a
> non-native speaker show up.
> Please suggest a different wording if you feel it's not clear.
>
>> * grow() implementation is odd. It ignores the remaining blob space
>> size. Am I missing something? I have a feeling you implemented something
>> other than grow() so that you can rely on its strange combination of
>> do-not-grow and grow-even-if-you-do-not-have-to logic in the rest of the
>> code. This needs to be fixed in grow() and in the callers().
>
> It's up to the caller to determine whether it needs to grow the store.
> It can't be done in MemBlob itself, since it doesn't know what part of
> the blob it needs to copy (unless we code it up so that it trusts the
> caller, but then it won't really make coupling looser)
> If the caller calls grow() when it's not appropriate, it will degrade
> performance by unnecessarily copying data, but it won't break the
> model or cause harm.
>
>> * I wonder if SBuf::grow() should be moved to MemBlob that may
>> (eventually) have a low-level knowledge on how to do that efficiently.
>> What do you and others think?
>
> see previous comment.
>
>> * Please use .length() everywhere you can. Do not mix .len_ and .length().
>
> Ok, checked.
> I've left len_ where it's being assigned to another object's len_ or
> in very low-level stuff (e.g. object dumping, stuff doing pointer
> maths).

Um, IMO its better to be liberal with length() and very,very
conservative with len_.

Dumping stuff and pointer maths particularly is one place its often
better/safer to use "const length() const" instead of len_, unless the
math is actually intended to change len_.

>
>> * s/shortcut: same length and same store/shortcut: same store, offset,
>> and length/
>
> Ok
>
>> * SBuf::consume() comments are stale, talking about NULL.
>
> Ok.
>
>> * "actual" in SBuf::consume() should be const.
>
> Ok.
>
>> * getStats should return a const reference not a copy of, potentially
>> large, stats object.
>
> Done.
>
>> * Please use XMIN or XMAX instead of manually comparing values with a
>> ?-operator. It is much easier to see and check the logic that way.
>
> now using xmin() and xmax() as per Amos' suggestion.

!! no 'x' :)

>
>> * Exceptions thrown by SBuf will lack source code location info, right?
>> I think we need to add a ThrowHere(ex) macro that supplies that to the
>> ex parameter.
>
> No, their constructor passes it through.
> ThrowHere may make sense in a more general way - see our discussions
> on warns() etc.
>
>> * bufEnd() should return the pointer to the byte after the last one. You
>> can call it end(). If you find yourself accessing the last byte often,
>> add last() that returns a char.
>
> I need to rework a few things then.
> last() is not really needed.
>
>> * s/prototype/starter/ or s/prototype/primer/
>> And no need to prefix the corresponding member with store, IMO.
>> * s/A SBuf is GUARRANTEED to be defined.//
>> Any C++ object is "defined".
>> * cow() declaration comments are stale
>
> Fixed.
>
>> * Please rename substrSelf() as discussed.
>
> I don't recall reaching a decision about the target name.
> We can decide to drop it entirely (make it private), but the last
> agreement I recall is that since it's not part of the std::string API,
> we're on our own in defining it, and we haven't really defined it.
>
>> * Please capitalize static members.
>
>> * s/roundto/RoundTo/
>
> Now in libmiscutil. Should it be capitalized anyways?
>
>> * s/importCString/absorb/ or s/importCString/absorbCString/
>> because you are taking full control over the string memory.
>>
>> * roundto is declared as a global static function in a .cci file. It
>> should probably be declared as an inline.
>
> Fixed.
>
>> * Use static_cast in SBuf::plength() and elsewhere. It would be much
>> easier to find and tune casting later if needed.
>>
>> * s/store/blob/ or s/store/mem/
>> We have enough store* stuff in squid.
>>
>> * exportCopy comments are wrong or stale. It does not return NULL. It
>> should throw on memory allocation errors (but probably does not yet).
>> The comment should say that the returned string is zero terminated.
>
> Fixed.
>
>> * Remove comments after #includes. Some tools break on C++ comments
>> there and you are partially lying and partially not supplying any new
>> info there.
>
> Fixed.
>
>> #include "config.h" //various typedefs
>> #include "Debug.h" //debugs etc
>>
>>
>> At this point, I got "Launchpad is offline for scheduled maintenance. We
>> should be back soon." error so I will have to look at MemBlob later.
>
> Thanks.
>

-- 
Please be using
   Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13
   Current Beta Squid 3.1.0.5
Received on Fri Feb 27 2009 - 03:46:36 MST

This archive was generated by hypermail 2.2.0 : Sat Feb 28 2009 - 12:00:04 MST