Re: Question on String design

From: Kinkie <gkinkie_at_gmail.com>
Date: Sun, 14 Dec 2008 08:52:20 +0100

On Sun, Dec 14, 2008 at 6:28 AM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On Sat, 2008-12-13 at 14:10 +0100, kinkie wrote:
>
>> The issue is the interaction between buffers, strings and encodings.
>> There are two possibilities:
>> 1- a StringNg holds a Buf and references an Encoding. It's thus blob-ish
>> and transcoding is done on demand.
>> Advantages: transcoding is done lazily
>> Disadvantages: certain common operations are highly expensive when
>> variable-length encodings are used (e.g. UTF-8), since they require
>> parsing the whole String from the beginning
>> 2- StringNg are always encoded in a fixed-length encoding (e.g. UCS-2)
>> and only reference a SBuf. Transcoding is done on creation and export.
>> Advantages: StringNg manipulation is easy
>> Disadvantages: this approach basically nullifies one of the advantages
>> of SBufs for Strings, which is their ability to share storage.
>
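
To illustrate the cost mentioned under option 1: with a variable-length
encoding, finding the n-th character means scanning from the start of the
string. A minimal sketch, assuming well-formed UTF-8 input; utf8ByteOffset
is a hypothetical helper made up for this example, not existing code:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not existing Squid code): returns the byte offset of
// the n-th code point (0-based) in a UTF-8 string, or std::string::npos if
// the string holds fewer code points. Assumes the input is well-formed UTF-8.
std::size_t utf8ByteOffset(const std::string &s, std::size_t n)
{
    std::size_t seen = 0; // code points passed so far
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Continuation bytes look like 10xxxxxx; any other byte starts
        // a new code point.
        const bool startsCodePoint =
            (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80;
        if (startsCodePoint) {
            if (seen == n)
                return i;
            ++seen;
        }
    }
    return std::string::npos;
}
```

The loop is linear in the string length, whereas with a fixed-length
encoding the same lookup is a single multiplication.
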
> I would not do either at this stage. Let's have a basic String with
> ASCII length/compare/search operations first.
>
> When the code settles and we want to add support for non-ASCII encodings
> and locale, we will take the next step. The API is unlikely to change
> much, so the vast majority of String users will not be affected by the
> increased internal complexity and special operations.
>
> For now, I would just do a basic String that references a Buffer, with
> an offset and length members, and "one octet = one character" ASCII
> interpretation of the content (where interpretation matters).

Ok, can do - and mostly already done.
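
For the record, the basic design described above (a String that references
a Buffer, with offset and length members and a "one octet = one character"
interpretation) could look roughly like the sketch below. All names here
(AsciiString, substr, sharesStorageWith) are made up for illustration, and
std::shared_ptr<const std::string> merely stands in for the real buffer
class:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Illustrative sketch only; names and storage type are assumptions.
class AsciiString
{
public:
    explicit AsciiString(const std::string &text):
        buf_(std::make_shared<const std::string>(text)),
        off_(0), len_(text.size()) {}

    // One octet = one character, so the length is just the byte count.
    std::size_t length() const { return len_; }

    // The caller must ensure i < length().
    char at(std::size_t i) const { return (*buf_)[off_ + i]; }

    // The result shares storage with *this; only offset and length differ.
    // Out-of-range arguments are clamped to the end of the string.
    AsciiString substr(std::size_t pos, std::size_t n) const
    {
        AsciiString r(*this);
        if (pos > len_)
            pos = len_;
        if (n > len_ - pos)
            n = len_ - pos;
        r.off_ = off_ + pos;
        r.len_ = n;
        return r;
    }

    // Byte-wise comparison: <0, 0 or >0, like std::string::compare.
    int compare(const AsciiString &o) const
    {
        return buf_->compare(off_, len_, *o.buf_, o.off_, o.len_);
    }

    // Position of the first occurrence of c, or std::string::npos.
    std::size_t find(char c) const
    {
        for (std::size_t i = 0; i < len_; ++i) {
            if (at(i) == c)
                return i;
        }
        return std::string::npos;
    }

    bool sharesStorageWith(const AsciiString &o) const
    {
        return buf_ == o.buf_;
    }

private:
    std::shared_ptr<const std::string> buf_; // shared, immutable storage
    std::size_t off_; // start of this string within the buffer
    std::size_t len_; // number of octets (= characters)
};
```

Taking a substring copies no character data, so the storage-sharing
advantage of the buffers is preserved; encoding support would have to be
layered on top later without changing this interface much.
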

-- 
    /kinkie
Received on Sun Dec 14 2008 - 07:52:23 MST
