Re: SBuf review at r9370

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 26 Feb 2009 11:08:19 -0700

On 02/26/2009 10:24 AM, Kinkie wrote:
> On Thu, Feb 26, 2009 at 1:46 AM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
>
>> http://bazaar.launchpad.net/~kinkie/squid/stringng at r9370.
>>
>> * Remove isLiteral and the corresponding code/logic. We might add this
>> very minor performance optimization as a non-public Blob member once the
>> code is known to work well. For now, the code is confusing enough
>> without it.
>>
> The Literal thing was actually added to answer to a very specific
> need. IIRC it was because without it valgrind complained about leaked
> memory in case of a global static SBuf.
> If you're sure, I'll remove it.
>
I am even more sure now that I know the reason you added it. :-)
>> * Remove isImported. Copy and then free the buffer when importing
>> instead. Same motivation as in the isLiteral comment above.
>>
> This too has a IMO a very practical use: it allows us an easy path to
> get into the SBuf world i/o buffer which have been created by the i/o
> layer. If you're sure, I'll remove it.
>
I understand that it can be a useful optimization. I am sure we should
replace that optimization with copying (in absorb()) for now because we
already have enough bugs to worry about without this special case.
Special cases like this significantly increase the probability of a bug
left unnoticed by a reviewer, IMO.
>> * Some throw() declarations are still left in the code. Grep for them.
>>
> Are you sure? There's only one that I could find, in the virtual
> destructor for OutOfBoundsException, and in that case it's needed to
> match the signature of TextException.
>
I forgot that standard exception class forces us to use throw(). I guess
this one will have to stay then.
>> * cow() logic feels wrong because grow() does not just grow(). Grow is
>> discussed below. The cow() code should be something very simple like:
>>
>> if (we are alone)
>> return false; // no need to copy
>> store_ = new MemBuf(*store_); // get ourselves an exclusive copy of
>> the blob
>> return false;
>> No grow() magic -- we are getting an exclusive copy, not growing something.
>>
> MemBlob has disabled copy-constructor and assignment operator by
> choice, as it gets handled through (refcount) pointers. (BTW: is this
> a violation of the coding guidelines?)
> It's certainly possible to add it, it's just a different style.
>
> The only extra operation performed by the current implementation is to
> run a re-evaluation of the blob store sizing heuristics: at the end
> it's a cheap operation compared to the copying, and since we're
> copying we may as well check it.
> I'll change if you confirm me that you have also considered this side
> of the argument.
>
Here, I am not complaining about what happens under the hood. I am
complaining about the on-the-surface logic of cow(). It is broken.
Whether the implementation actually works is irrelevant. There should be
no grow() in cow().

So you need to replace grow() with copying. If we do not want to expose
a MemBuf copy constructor, you can:

    - Implement MemBuf copy constructor as a private member.
    - Add MemBuf::Copy(const MemBuf &mb) that returns a Pointer to new
MemBuf (mb).

This approach is both clean and efficient.

The copy constructor can optimize, but please do keep it simple.
> My suggestion: reintroduce reAlloc() which gets called by cow() and
> does the low-level work, and have both cow() and grow() perform
> checks, heuristics and call reAlloc(). How would you like this?
>
I believe the Copy approach above is superior as far as cow() is
concerned. I will have to revisit grow() separately.
>> * estimateCapacity() is too complex to understand its performance
>> effects, especially since nreallocs is a very strange animal. I suggest
>> that we keep it very simple and add optimizations later, if needed.
>> Something like
>>
>> granularity = desired < PageSize ? 16 : PageSize;
>> return roundto(desired, granularity)
>>
>> is much easier to comprehend, analyze, and optimize. Let's postpone
>> complex optimizations.
>>
> I agree, with a SLIGHTLY more complex design to match the current
> standard MemPool String sizes.
> Stuff like:
> desired *= constant_growth_factor;
> if (desired< ShortStringsSize)
> return ShortStringsSize;
> if (desired < MediumStringsSize)
> return ShortStringsSize;
> if (desired < BigStringsSize)
> return BigStringsSize;
> if (desired < 2kBuffersSize)
> return BigStringsSize;
>
Can you add some static(?) SuggestMemSize() method to the String pool
and call it from SBuf? We should not duplicate the logic of
string-related memory size calculations, IMO.
> return roundto(desired,PageSize);
>
> We may want to discuss whether squidSystemPageSize should be static to
> this module (as it is now) or belongs to the global namespace.
>
Does any other Squid code use it? If yes, it would be nice to
store/calculate it in a single, shared location.
>> * startsWith() is buggy -- it ignores the contents of its parameter's
>> buffer.
>>
>
> Does it?
>
It does. s/*this/S/.

>> * SBuf::consume() returns a copy of SBuf. It should return a reference
>> to this or void.
>>
> Er.. no.
> Consume shortens the SBuf by chopping off the head and returns the chopped part.
>
Ah, I see, sorry. I would change the implementation to

    const SBuf rv(substr(0, actual));
    ... // chop the beginning from "this" buffer as you do now
    return rv;

then.
>> * Exceptions thrown by SBuf will lack source code location info, right?
>> I think we need to add a ThrowHere(ex) macro that supplies that to the
>> ex parameter.
>>
>
> No, their constructor passes it through.
>
But who supplies the file and line values to the constructor if we just
call throw?
>> * Please rename substrSelf() as discussed.
>>
> I don't recall reaching a decision about the target name.
> We can decide to drop it entirely (make it private), but the last
> agreement I recall is that since it's not part of the std::string API,
> we're on our own in defining it, and we haven't really defined it.
>
I do not remember the two(?) replacement names that were proposed, but
they are all better than substrSelf. Give me a list if you want me to
pick one.
> * s/roundto/RoundTo/
>
>
> Now in libmiscutil. Should it be capitalized anyways?
>
If it is global or static, it should be capitalized (per Squid style).

HTH,

Alex.
Received on Thu Feb 26 2009 - 18:09:01 MST

This archive was generated by hypermail 2.2.0 : Thu Feb 26 2009 - 12:00:04 MST