Re: StringNg review: MemBlob

From: Kinkie <gkinkie_at_gmail.com>
Date: Fri, 8 Oct 2010 15:51:45 +0200

On Thu, Oct 7, 2010 at 12:49 AM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> Hi Kinkie,

Hi alex!

>
>    Here is another review you have requested. This review is based on
> lp:~kinkie/squid/stringng revision 9504.

Thanks

>> #include "base/TextException.h"
>
> Remove if not needed in MemBlob.h

Moved.

>> class MemBlobStats {
>
> ...
>>
>> };
>
> Not reviewed per Kinkie's request.

Should now be OK. I've removed the reallocs stat, which doesn't really
belong) to MemBlob, as it's a SBuf construct.

>> /**
>>  * Refcounted memory buffer, content-agnostic. Meant to only be used
>>  * by SBuf.
>>  */
>
> Remove "Meant to only be used by SBuf." This class should be generally
> usable.

Removed. Removed all references to SBuf in MemBlob.*

>> class MemBlob: public RefCountable {
>>    public:
>
> Still missing source formatting.

Done.

>>    char *mem;         ///< Raw memory block
>>    size_type capacity;///< Size of the memory block
>>    size_type size; ///< How much of the memory is actually used
>
> Start comment with a small letter or end with a period. I recommend the
> former for short comments that are usually not full sentences.

Done.

>>    //do not meddle with. Use getStats to obtain a copy for dumping
>>    static MemBlobStats stats;
>>    static const MemBlobStats& getStats();
>
> Both missing documentation. What not to do instructions are not enough.

With the removal of nreallocs stats can be (and was) made private.

> I do not recall whether we have discussed this already, but if you can make
> the stats member private, please do.
>
> getStats() name should start with a capital letter since it is static.

Done.

>>    /**
>>     * Create a new MemBlob, containing a copy of the
>>     * contents of the buffer of size memSize
>>     * pointed to by memBlock. It is up to the caller to
>>     * manage the memBlock
>>     */
>>    MemBlob(const char * memBlock, const size_type memSize);
>
> Remove "It is up to ..." because const pointer already implies that.

ok
>
>
>>    /** check whether the supplied pointer is the first unused byte in the
>> MemBlob
>>     */
>
>>     _SQUID_INLINE_ bool isAtEnd(const char *ptr) const;
>
>
> nitpick: the "supplied pointer" is not a "byte".
>
> Please document whether this returns true when there are no unused MemBblob
> bytes. I still do not like this interface itself though. There must be a
> cleaner way to do what you want. How about this:
>
>    /// whether writing to the [start, start + n) area is currently OK
>    bool writable(const char *start, size_type n) const;
>
> This would both solve the problem with the lack of unused bytes and check
> for exclusive control of the area. Would that work for the current callers?

This is what canAppend does.
What I'll do is make private the isAtEnd and willFit methods (they're
only really used by canAppend).
I would prefer not to hand-inline them tho, for readability reasons.

> Moreover, do we need this method, willFit(), and canAppend()? They all do
> pretty much the same kind of checks, from different angles, with different
> bugs, under very different names. This needs to be polished, but I do not
> know which end to start from. Can you reduce the problem space?

Done, I hope.

>>     /** check whether we can append data at the tail of the MemBlob
>>      *
>>      * \param toAppend number of bytes to be appended
>>      * \return true there is at least toAppend free space available in the
>> MemBlob
>>      */
>>     _SQUID_INLINE_ bool willFit(const size_type toAppend) const;
>
> .
>
>>    /** return the number of bytes available at the tail of the MemBlob
>>     *
>>     */
>
> /// the number of allocated but unused bytes at the end of the blob
>
>>    _SQUID_INLINE_ size_type getAvailableSpace() const;
>
> Wrong name. The method does not give the caller the available space, only
> its size. Here are some combinations that work better:
>
>   size(), spaceSize(), capacity()
>   contentSize(), spaceSize(), capacity()
>   usedSize() and availableSize(), capacity()

Fixed. availableSize() is the winner.

>
>>    /** check whether we can append data at the tail of the MemBlob
>>     *
>>     * \param toAppend number of bytes to be appended
>>     * \return true there is at least toAppend free space available in the
>> MemBlob
>>     */
>>    _SQUID_INLINE_ bool willFit(const size_type toAppend) const;
>
> .
>
>>    /** check if there is enough unused space at the tail of the MemBlob
>>     *
>>     * \return true the append would fit
>>     * \param bufEnd first unused byte after the end of the SBuf being
>> appended to
>>     * \param toAppend how many bytes would be nice to append
>>     */
>>    _SQUID_INLINE_ bool canAppend(char const * bufEnd, const size_type
>> toAppend) const;
>
> MemBlob should not know about SBuf; please remove references to SBuf.
>
> If we are really _appending_ to MemBlob, we should not have to specify the
> end of the used buffer area because MemBlob knows where it ends. See above
> for general comments and suggestions to fix the will/can/is* mess: We need
> fewer append-related members with a consistent naming scheme.

canAppend can probably get a better name (suggestions are highly
welcome). We're not appending to the MemBlob, we're asking for
"permission" to append to somethign that's INSIDE the memblob,
something that the MemBlob doesn't know about; thus the need for the
pointer.
>
>
>>    /** append a c-string by copy and explicit size
>
> Consider:
> * Appends exactly n bytes starting with S. Does not check for
> * null-termination of S. Throws if there is not enough space.

Except that it currently doesn't throw. Should it?

>
>
>>     * It is responsibility of the caller to make sure that there is
>>     * enough space
>
> What happens if there is not? I think we should throw. This needs to be
> documented. See above for a suggestion.

You already answered. Now throwing (via Must(); maybe we could use a
more specific exception?)

>>     * \param S the char* to be appended at the end of MemBlob
>
> "at the end of MemBlob" can be interpreted as the end of allocated area. Use
> "after the used area" or similar.

Done.

>
>
>>     * \param Ssize the length of the array to be appended
>>     */
>>    _SQUID_INLINE_ void append(const char *S, const size_type Ssize);
>
> Parameter names should start with a small letter, IIRC. Consider "buf" and
> "n".

n.

>>    _SQUID_INLINE_ size_type memAllocationPolicy(const size_type size)
>> const;
>
> Why is the allocation size called "policy"? The policy could be the
> algorithm that computes the size, but not the size itself. Use something
> like calcAllocSize().

Done.

>> /**
>>  * Given a requested size, return a rounded allocation size to be used
>>  * for the backing store.
>>  * This is a stopgap call, this job is eventually expected to be handled
>>  * by MemPools via memAllocString.
>>  */
>> MemBlob::size_type
>> MemBlob::memAllocationPolicy(const MemBlob::size_type size) const
>
> Do we need MemBlob:: inside parens? Please remove if we do not. Here and in
> other methods.

Ok.

> This method should not be inlined. The following malloc is expensive anyway
> and we do not want to drag memory allocation dependencies to users.

Sorry, I don't understand. What do you mean? The allocating method is
not inline.

>> {
>>        if (size < 128) return 128;
>
> Should not this be <= 128?
>
>>        if (size < 512) return 512;
>>        if (size < 1024)
>>       return RoundTo(size,128);
>>   if (size < 4096)
>>       return RoundTo(size,512);
>
> Same here.

I chose to always leave some headroom.
I've changed it; as any policy it's rather arbitrary. This is also a
place to possibly account for malloc overheads.

>> bool
>> MemBlob::willFit(const MemBlob::size_type toAppend) const
>> {
>>    return toAppend < getAvailableSpace();
>> }
>
> Should not this be <=?

Yes.

>
> Can you just place that in the .h file, inlining during the declaration?

isn't placing it in the .cci the same?

>> bool
>> MemBlob::canAppend(char const *bufEnd, const MemBlob::size_type toAppend)
>> const
>> {
>>    return isAtEnd(bufEnd) && willFit(toAppend);
>> }
>
> Why cannot I append if bufEnd is not at the end of the used area?

It's to handle the case of multiple SBuf sharing the same MemBlob.
As a practical example
SBuf a="1234567890";
SBuf b=a.substr(0,5);
a.append("a"); //canAppend=true, no cow required: This SBuf owns the
shared blob's tail
b.append("a"); //canAppend=false; cow required; this SBuf doesn't own the tail

A SBuf doesn't know what other SBufs are sharing its MemBlob; it CAN
(via MemBlob->RefCountCount()) know that there ARE some.

> Can you just place this and similar basic one-line methods in the .h file,
> inlining them during the declaration? If you do that and move expensive
> methods to .cc, then we will not have to deal with MemBlob.cci at all!

Sure, I can do that. I just do not see the advantage in not dealing
with the .cci..

>> void
>> MemBlob::append(const char *S, const MemBlob::size_type Ssize)
>> {   /** \note memcpy() is safe because we copy to an unused area
>>    */
>
> Use /// comments for one-line comments or at least do not make them
> two-line.

Ok.

>>    memcpy(mem+size,S,Ssize);
>>    size+=Ssize;
>>    ++stats.append;
>> }
>
> Please add a Must(!overflow) check.
> Please add a NULL S check.

Done.

> This method should not be inlined.
Deinlined.

>> /**
>>  * constructor, creates an empty MemBlob of size >= requested
>>  * \param size the minimum size to be guarranteed from the MemBlob
>>  */
>
> Duplicate documentation? This is a public method, right?
Yes. Fixed this and others.

>
>> MemBlob::MemBlob(const char * memBlock, const MemBlob::size_type memSize)
>> :
>>        mem(NULL), capacity(0), size(0) // will be set by memAlloc
>> {
>>    memAlloc(memSize);
>>    append(memBlock,memSize);
>>    debugs(MEMBLOB_DEBUGSECTION,9, "MemBlob constructed, this="
>>            << static_cast<void*>(this)
>>            << " memBlock=" << static_cast<const void*>(memBlock)
>>            << " memSize=" << memSize << " capacity=" << capacity);
>
> Move debugs() higher, at least before the append() which may throw.

Ok.

>> }
>>
>> MemBlob::~MemBlob()
>> {
>>    debugs(MEMBLOB_DEBUGSECTION,9, "MemBlob destructed, this="
>>            << static_cast<void*>(this)
>>            << " capacity=" << capacity
>>            << " size=" << size);
>>    --stats.live;
>>    stats.liveBytes-=capacity;
>> #if MEMBLOB_USES_MEM_POOLS
>>    //no mempools for now
>>    // \todo reinstate mempools use
>>    memFreeString(capacity,mem);
>> #else
>>    xfree(mem);
>> #endif
>> }
>
> Move debugs() as low as you can.

Ok. May I ask what's the benefit?

>> /**
>>  * create a new empty MemBlob the requested size
>>  * To  be called only by constructors
>>  *
>>  * The block is NOT initialized.
>>  */
>> void
>> MemBlob::memAlloc(const MemBlob::size_type memSize)
>> {
>>    size_t actualAlloc;
>
> Please initialize (to zero).

What's the purpose? One of the points in the whole SBuf thing is not
to have to do it..
Also, right now MemBlob doesn't use MemPools. But once it is
instrumented to properly use them.. (tried to, but need to accomodate
static vars so I rolled back)

> Add Must(!mem).
Done

>
>> #if MEMBLOB_USES_MEM_POOLS
>>    //for now, do without mempools
>>    mem=(char *)memAllocString(memSize,&actualAlloc);
>> #else
>>    //\todo reinstate mempools use
>>    actualAlloc=memAllocationPolicy(memSize);
>>    mem=static_cast<char*>(xmalloc(actualAlloc));
>> #endif
>>    Must(mem);
>>    capacity=actualAlloc;
>>    size=0;
>>    debugs(MEMBLOB_DEBUGSECTION,8,
>>        HERE << "this=" <<(void *)this <<", requested=" << memSize <<
>>        ", received=" << capacity);
>>    ++stats.live;
>>    ++stats.alloc;
>>    stats.liveBytes+=capacity;
>> }
>
>
> Thank you,

Thanks,

-- 
    /kinkie
Received on Fri Oct 08 2010 - 13:51:52 MDT

This archive was generated by hypermail 2.2.0 : Mon Oct 18 2010 - 12:00:05 MDT