Re: pseudo-specs for a String class: char *buf

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 03 Sep 2008 09:49:21 -0600

On Wed, 2008-09-03 at 16:53 +0200, Kinkie wrote:
> On Wed, Sep 3, 2008 at 3:59 PM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
> >
> > I looked at your StringNg wiki page and noticed that your string has
> > a "char *buf" pointer into the memory buffer (in addition to the buffer
> > pointer itself). I think it would be better to use an offset instead of
> > the pointer into internal buffer area:
>
> Yes, I had a discussion with Adrian about the same issue earlier on IRC.
> You make excellent points - as Adrian did :)
> Here's my take
>
> > - cleaner design: no peeking into other object's privates
> Yes. At the same time the KBuf::Buf class (the "other object") is a
> private member class of the KBuf class;
> it's actually little more than a glorified struct, and shouldn't be
> thought of as a first-class citizen on its own.

Private or not, it is still another object.

And, FWIW, I doubt the memory buffer class will remain inside the string
class.

> > - easier to change memory buffer internals
> If you mean "change the buffer contents", that's an operation which
> should be quite rare.
> If you mean "change the code" that should be even rarer.

I meant "change the code". I do expect those changes in the foreseeable
future.

> > - easier to support several buffer types with different internals
> I didn't really think of different buffer types. Do you have in mind
> any scenario where it would be useful?

Yes, I do (e.g., small versus large, thread-safe versus not, and
contiguous versus chunked).

In fact, you kind of documented different buffer implementations
yourself: "small Bufs (<8Kb) should be managed by MemPools. - Bufs
bigger than 8Kb should be allocated in sizes compatible with the system
page size"
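To illustrate how the quoted rule already implies two allocation strategies, here is a rough sketch. The names (BufKind, kindFor, capacityFor) and the 4096-byte page size are my assumptions, not anything from the wiki page, and I arbitrarily treat a request of exactly 8 KB as "large" since the quoted text does not say.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names; only the 8 KB threshold comes from the quoted text.
enum class BufKind { Pooled, PageAligned };

constexpr std::size_t kSmallLimit = 8 * 1024;   // "small Bufs (<8Kb)"
constexpr std::size_t kAssumedPageSize = 4096;  // assumption; query the OS in real code

// Pick the allocation strategy for a requested size.
// Exactly 8 KB is treated as large here (an assumption).
inline BufKind kindFor(std::size_t wanted) {
    return wanted < kSmallLimit ? BufKind::Pooled : BufKind::PageAligned;
}

// Capacity to allocate: pooled buffers keep the requested size in this
// sketch, large buffers are rounded up to a whole number of pages.
inline std::size_t capacityFor(std::size_t wanted,
                               std::size_t pageSize = kAssumedPageSize) {
    if (kindFor(wanted) == BufKind::Pooled)
        return wanted;
    return (wanted + pageSize - 1) / pageSize * pageSize;
}
```

A string that only holds an offset does not need to know which of the two strategies produced its buffer.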

> > - easier to support re-allocation of buffer memory
> > - easier to provide a thread-safe implementation.
>
> On the other hand, char* pointers are significantly more efficient for common
> operations, consistent with the design goals.

I do not think an offset would be significantly less efficient in this
context. I bet 90+% of operations that require raw data access are far
more expensive than adding an offset to a pointer.
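A minimal sketch of what I have in mind, assuming a hypothetical MemBuf and OffsetString (these are illustrative stand-ins, not the actual KBuf classes). The string stores an offset, asks the buffer for its current base address on each access, and so keeps working after the buffer reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the memory buffer; not the actual KBuf::Buf.
class MemBuf {
public:
    explicit MemBuf(const char *s) : data_(s, s + std::strlen(s)) {}

    // Raw access stays behind the buffer's own interface.
    const char *raw() const { return data_.data(); }
    std::size_t size() const { return data_.size(); }

    // Appending may reallocate the storage, which moves raw().
    void append(const char *s) {
        data_.insert(data_.end(), s, s + std::strlen(s));
    }

private:
    std::vector<char> data_;
};

// Hypothetical string that remembers where its data starts as an offset.
class OffsetString {
public:
    OffsetString(std::shared_ptr<MemBuf> buf, std::size_t off, std::size_t len)
        : buf_(std::move(buf)), off_(off), len_(len) {}

    // The only cost over a cached char* is this one addition.
    const char *data() const { return buf_->raw() + off_; }
    std::size_t length() const { return len_; }

    bool equals(const char *s) const {
        return std::strlen(s) == len_ && std::memcmp(data(), s, len_) == 0;
    }

private:
    std::shared_ptr<MemBuf> buf_;
    std::size_t off_;
    std::size_t len_;
};
```

With a cached char* instead of off_, every string sharing the buffer would have to be found and fixed up whenever append() moved the storage.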

> I'm not saying that I won't change them, I'd just like to be shown
> scenarios where it makes a difference.

I believe I provided more than enough reasons and you agreed with at
least some of them. You have provided one so far ("significantly more
efficient for common operations"). I think the burden of proof should be
on you in this case.

> On an unrelated issue, since it was of interest to some of us, here's
> a sample of the caller code for tokenization functions (actual live
> code):
>
> KBuf s1;
> cout << "tokenization: \n";
> {
>     s1 = "The quick brown fox jumped over the lazy dog";
>     const char *needle = " ";  // string literals are const
>     KBuf cs1(needle);
>     while (!s1.isNull()) {
>         cout << "token: " << s1.nextToken(cs1) << endl;
>     }
> }
> cout << endl;

FWIW, I still think that tokenization should be external to the buffer
or string and should not modify them. Please see my earlier posts for
details.
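For comparison, a rough sketch of an external, non-modifying tokenizer. The Tokenizer class is hypothetical, and I use std::string_view (C++17) as a stand-in for whatever read-only view the string class ends up offering.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical tokenizer that lives outside the string and only reads it.
class Tokenizer {
public:
    Tokenizer(std::string_view text, std::string_view delims)
        : rest_(text), delims_(delims) {}

    // Stores the next token in `token`; returns false when none remain.
    bool next(std::string_view &token) {
        // Skip leading delimiters.
        const std::size_t start = rest_.find_first_not_of(delims_);
        if (start == std::string_view::npos) {
            rest_ = std::string_view();
            return false;
        }
        rest_.remove_prefix(start);

        // The token runs up to the next delimiter or the end of input.
        const std::size_t end = rest_.find_first_of(delims_);
        token = rest_.substr(0, end);
        rest_.remove_prefix(end == std::string_view::npos ? rest_.size() : end);
        return true;
    }

private:
    std::string_view rest_;    // unread part of the input; the input is untouched
    std::string_view delims_;
};
```

The caller loop looks much like the one quoted above, except that parsing state lives in the tokenizer and the source string is still intact when the loop ends.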

Thank you,

Alex.
Received on Wed Sep 03 2008 - 15:49:47 MDT
