Re: pseudo-specs for a String class: char *buf

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 03 Sep 2008 23:30:42 -0600

On Thu, 2008-09-04 at 06:47 +0200, Kinkie wrote:
> On Thu, Sep 4, 2008 at 12:31 AM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
> > On Thu, 2008-09-04 at 00:12 +0200, Kinkie wrote:
> >
> >> > I do not think an offset would be significantly less efficient in this
> >> > context. I bet 90+% of operations that require raw data access are far
> >> > more expensive than adding an offset to a pointer.
> >>
> >> The most common one is a NULL check, which is hard to express using
> >> (offset/length).
> >
> > As you know, I do not know what you mean by a NULL check (buffers or
> > strings are not pointers). If you mean an isEmpty() check, then it is
> > implemented as (!length) as the offset is irrelevant for an empty
> > string.
>
> There are uses for declaring an object as undefined, which is a
> different thing from a zero-length string.

"NULL" and "undefined" are different things for many developers. If you
want to propose a special undefined String state, you can add an
isDefined or isSet method.

An isDefined check can be implemented as (bigBuffer != NULL), and the
opposite isNULL check as (!bigBuffer), where bigBuffer is the
reference-counting pointer to the primary buffer. Thus, the argument
that isNULL or isDefined requires storing a raw string pointer for
efficiency reasons is invalid. Neither check needs access to string
contents.
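
To make the above concrete, here is a rough sketch of a string that
stores (bigBuffer, offset, length) and no raw char pointer. Only
isEmpty, isDefined, and bigBuffer come from this thread. Everything
else (substr, rawData, size, and the use of std::shared_ptr as the
reference-counting pointer) is an assumption made for illustration,
not a proposed design.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Illustrative only: std::shared_ptr stands in for whatever
// reference-counting pointer the real buffer class would use.
class String {
public:
    String() = default;  // "undefined": no buffer attached

    explicit String(const std::string &s)
        : bigBuffer(std::make_shared<const std::string>(s)),
          offset(0), length(s.size()) {}

    // Neither check looks at the string contents.
    bool isDefined() const { return static_cast<bool>(bigBuffer); }
    bool isEmpty() const { return !length; }  // offset is irrelevant

    std::size_t size() const { return length; }

    // Shares the buffer; only offset and length change.
    String substr(std::size_t pos, std::size_t len) const {
        String result(*this);
        if (pos > length)
            pos = length;
        if (len > length - pos)
            len = length - pos;
        result.offset = offset + pos;
        result.length = len;
        return result;
    }

    // Raw access costs one pointer addition. The bytes are not
    // 0-terminated at size(); callers must honor size().
    const char *rawData() const {
        assert(isDefined());
        return bigBuffer->data() + offset;
    }

private:
    std::shared_ptr<const std::string> bigBuffer;
    std::size_t offset = 0;
    std::size_t length = 0;
};
```

In this sketch a default-constructed String is both undefined and
empty, while s.substr(3, 0) is defined and empty. Having to keep those
two cases apart is exactly the kind of special state discussed below.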

FWIW, I would not recommend adding isDefined though, because special
states are a primary cause of bugs (developers always forget about
them). Dereferencing NULL pointers in C is a well-known example of that.

> Take the tokenizer for example. A null (we may call it invalid,
> undefined, no-store) KBuf is a very convenient way to signal
> "end-of-stream", as opposed to "a token of zero length".

It is also a convenient way to send null, invalid, undefined, etc.
strings to code that does not expect them.

> Without this, an exception will have to be raised to signal
> the end-of-stream condition.

No exceptions are necessary or desired. Here is a simple tokenizer
API/usage sketch:

 for (Tokenizer tzer(string, delimiter); !tzer.atEnd(); ++tzer) {
    String token = *tzer;
    ...
 }

We can also look at std::iterator but that interface may be slightly
more complex than we need. We can use it if we want to be compatible
with std algorithms, but we probably do not care about those at this
point.
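
For illustration, the atEnd()-style interface sketched above could be
backed by something like the following. This is a minimal sketch, not a
proposed implementation: std::string stands in for the String class
under discussion, the delimiter is assumed to be a single char, and all
member names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef std::string String;  // stand-in for the String being specified

class Tokenizer {
public:
    Tokenizer(const String &input, char delimiter)
        : input_(input), delimiter_(delimiter),
          start_(0), end_(0), done_(false) {
        findEnd();
    }

    // End-of-stream is a tokenizer state, not a special String value.
    bool atEnd() const { return done_; }

    // The current token; may legitimately be a zero-length string.
    String operator*() const {
        assert(!atEnd());
        return input_.substr(start_, end_ - start_);
    }

    Tokenizer &operator++() {
        assert(!atEnd());
        if (end_ == input_.size()) {
            done_ = true;        // the last token has been consumed
        } else {
            start_ = end_ + 1;   // skip the delimiter
            findEnd();
        }
        return *this;
    }

private:
    // Sets end_ to the next delimiter at or after start_,
    // or to the end of the input if there is none.
    void findEnd() {
        end_ = input_.find(delimiter_, start_);
        if (end_ == String::npos)
            end_ = input_.size();
    }

    String input_;
    char delimiter_;
    std::size_t start_;
    std::size_t end_;
    bool done_;
};
```

With this sketch, tokenizing "a,,b" on ',' yields "a", "", and "b"
before atEnd() becomes true, so a zero-length token and end-of-stream
never need to be distinguished through the returned string.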

HTH,

Alex.
Received on Thu Sep 04 2008 - 05:31:10 MDT
