Re: pseudo-specs for a String class

From: Henrik Nordstrom <henrik_at_henriknordstrom.net>
Date: Wed, 27 Aug 2008 01:44:48 +0200

Sorry, MemoryChunk should read MemoryBlob to be consistent with my first
post...

On ons, 2008-08-27 at 01:33 +0200, Henrik Nordstrom wrote:
> On ons, 2008-08-27 at 00:24 +0200, Kinkie wrote:
>
> > This is quite different from my current approach, by which Strings get
> > created and drive the instantiations of Bufs (MemoryRegions).
> > I feel that you'd be trying to reimplement parts of the memory
> > manager. Maximum efficiency, at the expense of quite a bit of
> > flexibility.
>
> MemoryChunk (not Region).
>
> Both modes are needed. It depends on the use.
>
> String will create the MemoryChunk automatically if passed the data.
>
> But some data sources such as networking have other needs and work
> better the other way around, providing the data and then creating
> Strings from that same buffer. But yes, it's possible to build an
> interface for this using only String by introducing a special truncate
> operation which frees data in the MemoryChunk (Buf) via String, but it
> exposes an operation which is not always safe.
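To make the two modes concrete, here is a rough sketch and not a proposed interface. It assumes a String is simply a (blob, offset, length) reference with no trailing \0; every member name, and the use of std::shared_ptr as the reference count, is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical shared byte store (the MemoryBlob / Buf of this thread).
struct MemoryBlob {
    std::vector<char> bytes;
};

// Hypothetical String: a (blob, offset, length) reference, no trailing \0 stored.
class String {
public:
    // Mode 1: the String drives allocation and creates its own blob from the data.
    explicit String(const std::string &data)
        : blob_(std::make_shared<MemoryBlob>()), off_(0), len_(data.size()) {
        blob_->bytes.assign(data.begin(), data.end());
    }

    // Mode 2: the blob already exists (for example, filled by a network read)
    // and the String only references a range of it.
    String(std::shared_ptr<MemoryBlob> blob, std::size_t off, std::size_t len)
        : blob_(std::move(blob)), off_(off), len_(len) {
        assert(off_ + len_ <= blob_->bytes.size());
    }

    // Copy the referenced bytes out into an independent std::string.
    std::string copy() const {
        return std::string(blob_->bytes.data() + off_, len_);
    }

    // True when both Strings reference the same underlying blob.
    bool sharesBlobWith(const String &other) const {
        return blob_ == other.blob_;
    }

private:
    std::shared_ptr<MemoryBlob> blob_;
    std::size_t off_;
    std::size_t len_;
};
```

In the second mode, several Strings cut from one receive buffer keep that buffer alive without copying it, which is the case the networking code cares about.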
>
> > Hm... interesting for annotation purposes, but is it really significant?
>
> The difference between String and MemoryRegion? Not sure. But it also
> doesn't hurt as you can cast freely between the two (even when using
> references).
>
> > My thoughts: \0 is special, and would only be significant when strings
> > need to be exported from the memory-managed code onto nonmanaged code.
>
> Yes.
>
> > Generally speaking, the safest way to do so is by copy rather than by
> > reference, but I'd rather also keep the ability to export by reference
> > - hoping the caller knows what they're doing. In that case the \0 is a
> > must-have safeguard, in some cases might require copying. Unfortunate
> > but unavoidable.
>
> Agreed.
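The two export paths might look roughly like this. It is only a sketch: exportCopy, exportRef and the avail parameter are made-up names, and it assumes the managed data is a plain byte range without a guaranteed terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Export by copy: always safe. The returned std::string owns its own
// buffer, and c_str() on it is guaranteed to be \0-terminated.
std::string exportCopy(const char *data, std::size_t len) {
    return std::string(data, len);
}

// Export by reference: only usable when the byte right after the range is
// readable and already \0. `avail` (hypothetical) is the number of readable
// bytes starting at `data`. Returns nullptr when a copy is required instead.
const char *exportRef(const char *data, std::size_t len, std::size_t avail) {
    return (len < avail && data[len] == '\0') ? data : nullptr;
}
```

A caller would try exportRef first and fall back to exportCopy when it returns nullptr, which is the "in some cases might require copying" situation.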
>
> > > I think we are at the point
> > > where we can fully drop the \0 without too much headache, but it's
> > > also true that in all cases where we tokenise a string there are
> > > separators we can nuke and replace by \0's... However, with the \0
> > > casting between MemoryRegion and String is tricky (needs to copy if
> > > there is no \0) and tokenising gets destructive as it destroys the
> > > original string by replacing separators by \0.
> >
> > Well, tokenising should be replaced by substringing really.. it could
> > mean having to drop strtok().
>
> Substringing is a form of tokenising: splitting a long String into its
> components. How that's done is an implementation detail.
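As an illustration of non-destructive tokenising, the sketch below returns (offset, length) ranges into the original data instead of overwriting separators with \0. std::string stands in for the managed buffer, and tokenise and Range are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token: an (offset, length) range into the original buffer.
using Range = std::pair<std::size_t, std::size_t>;

// Split `s` on `sep` without modifying it. Empty fields are skipped,
// which matches what strtok() does with runs of separators.
std::vector<Range> tokenise(const std::string &s, char sep) {
    std::vector<Range> out;
    std::size_t start = 0;
    while (start < s.size()) {
        std::size_t end = s.find(sep, start);
        if (end == std::string::npos)
            end = s.size();                      // last token runs to the end
        if (end > start)
            out.emplace_back(start, end - start);
        start = end + 1;                         // step past the separator
    }
    return out;
}
```

Each range could then be turned into a String sharing the original blob, so the source string stays intact and no \0 is ever written.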
>
> > > Other modifications of String/MemoryRegion content generally require a
> > > COW operation.
> >
> > It depends: I expect a rather common case to be when only one String
> > owns a Buf/MemoryBlob. In that case modifications are cheap.
>
> That's a very common COW optimization, and assumed.
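The sole-owner shortcut can be sketched as follows. This is a single-threaded illustration only: CowString is a hypothetical name, std::string stands in for the blob, and std::shared_ptr::use_count() stands in for whatever reference count the real Buf/MemoryBlob would carry.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write string. Copying a CowString shares the blob
// (the default copy constructor just copies the shared_ptr).
class CowString {
public:
    explicit CowString(const std::string &s)
        : blob_(std::make_shared<std::string>(s)) {}

    // Modify one byte. If the blob is shared, detach with a private copy
    // first; if this object is the only owner, write in place.
    void set(std::size_t i, char c) {
        assert(i < blob_->size());
        if (blob_.use_count() > 1)
            blob_ = std::make_shared<std::string>(*blob_);
        (*blob_)[i] = c;
    }

    const std::string &str() const { return *blob_; }

    // True when both objects reference the same blob.
    bool sharesWith(const CowString &other) const {
        return blob_ == other.blob_;
    }

private:
    std::shared_ptr<std::string> blob_;
};
```

With one owner, set() never allocates; the copy happens only on the first write to a blob that is still shared.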
>
> Regards
> Henrik

Received on Tue Aug 26 2008 - 23:45:01 MDT
