Re: pseudo-specs for a String class

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 27 Aug 2008 17:08:08 +1200

Adrian Chadd wrote:
> (My pre-breakfast 2c, so forgive me if I'm less clear than normal.)
>
>
> 2008/8/27 Kinkie <gkinkie_at_gmail.com>:
>
>> My thoughts: \0 is special, and would only be significant when strings
>> need to be exported from the memory-managed code onto nonmanaged code.
>> Generally speaking, the safest way to do so is by copy rather than by
>> reference, but I'd rather also keep the ability to export by reference
>> - hoping the caller knows what they're doing. In that case the \0 is a
>> must-have safeguard, in some cases might require copying. Unfortunate
>> but unavoidable.
>
> Although plenty of current code assumes a NUL-terminated string, it's
> assumed primarily for two things:
>
> * debug(); which can be replaced with %.*s or whatever it is, to pass
> in a length before the string buffer;

Not relevant in Squid-3. debugs() uses the stream operator of the String
class, which can do whatever it wants to produce a sequence of bytes.
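
A minimal sketch of what that could look like, assuming a String that is
just a pointer plus a length. StrView and render() are made-up names for
illustration, not the actual Squid classes. The point is only that the
operator writes a counted run of bytes and never looks for a terminator:

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the String class under discussion:
// a pointer plus a length, with no terminating NUL required.
struct StrView {
    const char *data;
    std::size_t len;
};

// Writes exactly len bytes; never scans for a terminator.
inline std::ostream &operator<<(std::ostream &os, const StrView &s)
{
    assert(s.data != nullptr || s.len == 0);
    return os.write(s.data, static_cast<std::streamsize>(s.len));
}

// Helper for the example: renders a view through the operator above.
inline std::string render(const StrView &s)
{
    std::ostringstream out;
    out << "header=" << s;
    return out.str();
}
```

With something like this, a debug line can print a slice from the middle
of a socket buffer without copying it or patching in a NUL first.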

> * iterating/parsing; which can be replaced by using the length
> parameter in pointer arithmetic (you can toss the pointer arithmetic
> too in like 99% of the cases; the parser is about where the possible
> speed boosts from pointer arithmetic would even matter)

That's the kicker, as Henrik pointed out. It requires the parser to break
the pre-filled buffer into Strings, which will need some custom
replacement for strtok() (actually faster, but more bug-prone).
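
Roughly the kind of replacement I mean, as a sketch only. Token,
Tokenizer, isDelim() and nextToken() are hypothetical names, and a real
parser would hand back String objects rather than raw pointer/length
pairs. Unlike strtok() it keeps its state in the caller's cursor, never
writes to the buffer, and stops at the length rather than at a NUL:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical token type: a sub-range of the original buffer.
struct Token {
    const char *data;
    std::size_t len;
};

// Hypothetical cursor over a length-delimited buffer.
struct Tokenizer {
    const char *pos;
    const char *end;
};

// True if c is one of the ndelims delimiter bytes.
inline bool isDelim(char c, const char *delims, std::size_t ndelims)
{
    return std::memchr(delims, c, ndelims) != nullptr;
}

// Extracts the next token delimited by any byte in delims (a C string).
// Returns false once the input is exhausted. The buffer is never
// modified and does not need to be NUL-terminated.
inline bool nextToken(Tokenizer &t, const char *delims, Token &out)
{
    assert(t.pos <= t.end);
    const std::size_t n = std::strlen(delims);

    // Skip any leading delimiters.
    while (t.pos < t.end && isDelim(*t.pos, delims, n))
        ++t.pos;
    if (t.pos == t.end)
        return false;

    // Consume bytes up to the next delimiter or the end of the buffer.
    const char *start = t.pos;
    while (t.pos < t.end && !isDelim(*t.pos, delims, n))
        ++t.pos;

    out.data = start;
    out.len = static_cast<std::size_t>(t.pos - start);
    return true;
}
```

The bug-prone part is exactly the bounds handling in those two loops,
which strtok() hides by relying on the terminator.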

>
> Both of which can be eliminated without too much trouble. In fact, I
> ended up with NUL terminated strings as a special flag case during
> transition work so the existing code assuming NULs could still work
> whilst I converted stuff over.
>
>> Well, tokenising should be replaced by substringing really.. it could
>> mean having to drop strtok().
>
> .. and in reality, writing replacement str*() routines for your String
> class instead of using C string.h functions makes everything much
> easier. Including the above.
>
> Kinkie, s27_adri has a whole lot of additional String.c functions for
> manipulating strings.
>
>>> Append operation on String/MemoryRegion objects is easy in this model,
>>> but if the region is not at the end of the MemoryBlob or if the result
>>> gets too large then it will need to trigger a copy to a new MemoryBlob of
>>> sufficient size.
>> Yes.
>
> Which won't happen in like >99% of the cases.
>
>> It depends: I expect a rather common case to be when only one String
>> owns a Buf/MemoryBlob. In that case modifications are cheap.
>
> Actually, the most common operation for Squid once you've fully
> reworked the whole environment to use this model is "lots of Strings
> referencing a large buffer" (ie, the request and reply socket buffer;
> the URL strings once those are converted over.) Almost all of the
> strings in-play are the http header entry strings, and most of -those-
> are never modified.
>
> Most of the -rest- are one String referencing an entire buffer.
>
> In any case, I agree with the general model of:
>
> * Memory: some chunk of contiguous memory somewhere;
> * MemoryRegion: some reference to { Memory, offset, length }
> * String: a MemoryRegion and some routines to manipulate it
>
> Adrian

True, BUT everything behind MemoryRegion is memory
allocator/management business and should not be involved with String.
Only the MemoryRegion API affects String.
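
To illustrate the layering, here is a rough sketch under my own
assumptions. None of these signatures come from existing Squid code; the
member names, the shared_ptr ownership and the "double the size on copy"
policy are all invented for the example. String only ever calls
MemoryRegion, and the append rule discussed above (extend in place only
when the region sits at the tail of the blob and there is room, otherwise
copy) lives entirely below it:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Memory: a contiguous chunk with fixed capacity and a fill mark.
struct Memory {
    std::vector<char> bytes;   // capacity is fixed at construction
    std::size_t used = 0;      // bytes filled so far
    explicit Memory(std::size_t cap) : bytes(cap) {}
};

// MemoryRegion: a reference to { Memory, offset, length }.
class MemoryRegion {
public:
    MemoryRegion(const char *src, std::size_t n, std::size_t spare = 0)
        : mem_(std::make_shared<Memory>(n + spare)), off_(0), len_(n)
    {
        std::copy(src, src + n, mem_->bytes.data());
        mem_->used = n;
    }

    const char *data() const { return mem_->bytes.data() + off_; }
    std::size_t size() const { return len_; }

    // A sub-region shares the same Memory; nothing is copied.
    MemoryRegion sub(std::size_t pos, std::size_t n) const
    {
        assert(pos <= len_ && n <= len_ - pos);
        MemoryRegion r(*this);
        r.off_ += pos;
        r.len_ = n;
        return r;
    }

    void append(const char *src, std::size_t n)
    {
        std::shared_ptr<Memory> keep = mem_;  // keeps src valid if it aliases our blob
        const bool atTail = (off_ + len_ == mem_->used);
        const bool fits = (mem_->used + n <= mem_->bytes.size());
        if (!(atTail && fits)) {
            // Not at the end of the blob, or no room left: copy out.
            auto fresh = std::make_shared<Memory>((len_ + n) * 2);
            std::copy(data(), data() + len_, fresh->bytes.data());
            fresh->used = len_;
            mem_ = fresh;
            off_ = 0;
        }
        std::copy(src, src + n, mem_->bytes.data() + off_ + len_);
        mem_->used += n;
        len_ += n;
    }

    bool sharesMemoryWith(const MemoryRegion &o) const { return mem_ == o.mem_; }

private:
    std::shared_ptr<Memory> mem_;
    std::size_t off_;
    std::size_t len_;
};

// String: a MemoryRegion plus routines; it never touches Memory directly.
class String {
public:
    explicit String(const MemoryRegion &r) : region_(r) {}
    std::size_t length() const { return region_.size(); }
    String substr(std::size_t pos, std::size_t n) const { return String(region_.sub(pos, n)); }
    void append(const String &o) { region_.append(o.region_.data(), o.region_.size()); }
    std::string toStdString() const { return std::string(region_.data(), region_.size()); }
    const MemoryRegion &region() const { return region_; }

private:
    MemoryRegion region_;
};
```

Swapping the allocator or the copy policy then changes nothing above the
MemoryRegion line, which is the separation I am arguing for.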

Amos

-- 
Please use Squid 2.7.STABLE4 or 3.0.STABLE8
Received on Wed Aug 27 2008 - 05:08:18 MDT