Re: pseudo-specs for a String class

From: Adrian Chadd <adrian.chadd_at_gmail.com>
Date: Tue, 26 Aug 2008 09:55:53 +0800

2008/8/26 Kinkie <gkinkie_at_gmail.com>:

> They do not differ significantly, and I'm currently coding an
> out-of-tree, merge-friendly prototype.

Ok.

> The basic idea is somewhat similar to JIT strings:
> two classes work in tandem: String and String::Buf.
> String is the public interface, String::Buf (which is private)
> performs memory management.

Ok.

> Data-wise, String is a triplet (char* data, len, Buf*). Data points
> into memory managed by the Buf. Multiple strings can share a Buf,
> possibly at different offsets. Bufs are allocated slightly bigger than
> needed, and some optimization strategies can be performed to make life
> easier for the memory manager.

Sensible.
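
For readers following along, here is a minimal sketch of the triplet
design quoted above. It is not Kinkie's prototype: the names Str and
Buf, the 16-byte slack, and the use of std::shared_ptr in place of a
hand-rolled refcount are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical backing store. std::shared_ptr stands in for the refcount.
struct Buf {
    std::unique_ptr<char[]> mem;
    std::size_t capacity;  // bytes allocated
    std::size_t used;      // bytes written so far
    explicit Buf(std::size_t cap) : mem(new char[cap]), capacity(cap), used(0) {}
};

// Hypothetical string: the (char* data, len, Buf*) triplet.
class Str {
public:
    Str() : data_(nullptr), len_(0) {}

    explicit Str(const char *s) : data_(nullptr), len_(std::strlen(s)) {
        // Allocate slightly more than needed (the slack size is arbitrary).
        buf_ = std::make_shared<Buf>(len_ + 16);
        std::memcpy(buf_->mem.get(), s, len_);
        buf_->used = len_;
        data_ = buf_->mem.get();
    }

    // Slicing copies no bytes: the slice points into the same Buf.
    Str substr(std::size_t off, std::size_t n) const {
        assert(off <= len_ && n <= len_ - off);
        Str r;
        r.buf_ = buf_;
        r.data_ = data_ + off;
        r.len_ = n;
        return r;
    }

    // Append in place only if this string ends exactly at the Buf's tail
    // and there is slack left; otherwise copy into a fresh Buf. Other
    // strings sharing the old Buf never see their bytes change.
    void append(const char *s, std::size_t n) {
        bool atTail = buf_ && data_ + len_ == buf_->mem.get() + buf_->used;
        if (atTail && buf_->capacity - buf_->used >= n) {
            std::memcpy(buf_->mem.get() + buf_->used, s, n);
            buf_->used += n;
            len_ += n;
            return;
        }
        std::shared_ptr<Buf> nb = std::make_shared<Buf>(len_ + n + 16);
        if (len_ > 0)
            std::memcpy(nb->mem.get(), data_, len_);
        if (n > 0)
            std::memcpy(nb->mem.get() + len_, s, n);
        nb->used = len_ + n;
        data_ = nb->mem.get();
        len_ += n;
        buf_ = nb;  // drops our reference to the old Buf
    }

    bool sharesBufWith(const Str &o) const { return buf_ && buf_ == o.buf_; }
    std::size_t size() const { return len_; }
    std::string str() const { return len_ ? std::string(data_, len_) : std::string(); }

private:
    char *data_;                // points into memory owned by buf_
    std::size_t len_;
    std::shared_ptr<Buf> buf_;  // shared, refcounted storage
};
```

Copying a Str copies three words and bumps a refcount, which is where
the "cheap copying" and "cheap slicing" goals come from. The tail check
in append() is one possible way to keep strings almost-immutable while
still letting the common append case avoid a reallocation.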

> Basic design goals: refcounted, cheap copying, string parsing and
> slicing; reasonably-cheap appending (due to strings being
> almost-immutable).
>
> I'll put the incomplete code somewhere (Launchpad, probably) for
> everyone to review in a few days at most. So far it seems promising,
> in about 100 lines of code I have implemented memory management,
> instantiation, assignment, appending and some debugging (and certainly
> quite a few bugs).

Well, the implementation itself is going to be relatively simple. The
problem isn't the implementation - the problem, as I've said before,
is fixing all the existing Squid String users.

Doing scatter/gather type IO tricks on refcounted buffer regions isn't
all that difficult. Squid's IO patterns make a naive-ish
implementation fine: the majority of "large" buffers have to hang
around until the end of the request anyway (to keep the request/reply
headers intact), so you're not going to waste any RAM by keeping the
one large buffer in memory instead of breaking it up and keeping only
the parts that are still referenced.
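
As a rough illustration of the scatter/gather point, the sketch below
gathers several regions of one refcounted buffer into a single POSIX
writev() call. Region and writeRegions are hypothetical names, and
std::shared_ptr<const std::string> is only a stand-in for whatever
refcounted buffer type is actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

// Hypothetical region: a window into a shared, refcounted buffer.
struct Region {
    std::shared_ptr<const std::string> buf;
    std::size_t offset;
    std::size_t length;
};

// Gather the regions into one writev() call without copying any bytes.
// A real caller would also handle short writes and the IOV_MAX limit.
ssize_t writeRegions(int fd, const std::vector<Region> &regions) {
    std::vector<iovec> iov;
    iov.reserve(regions.size());
    for (const Region &r : regions) {
        assert(r.offset <= r.buf->size() &&
               r.length <= r.buf->size() - r.offset);
        iovec v;
        // writev() only reads from iov_base, so dropping const is safe here.
        v.iov_base = const_cast<char *>(r.buf->data() + r.offset);
        v.iov_len = r.length;
        iov.push_back(v);
    }
    return writev(fd, iov.data(), static_cast<int>(iov.size()));
}
```

Each Region holds a reference, so the one large buffer stays alive for
as long as any region of it is queued for output.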

I think I mentioned this on IRC earlier - the design which worked for
Squid-2 is a reference counted buffer with a string layer on top that
maps an {offset, length} pair into the backend buffer. I also thought
about a buffer "reference" which carries the {offset, length} itself,
so you can use it in places like stmem without having to use Strings.
I tended to want to use "strings" where string-like semantics existed
(i.e., all the data manipulation methods that help define "string"
behaviour) and buffers for socket data.
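
To make the layering concrete, here is a small sketch of that split,
written in C++ rather than the C used in Squid-2. It is not the
s27_adri code: SharedBuf, BufRef, RefString and makeBuf are names
invented for this example, and std::shared_ptr stands in for the
reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical refcounted backend buffer.
typedef std::shared_ptr<std::vector<char> > SharedBuf;

SharedBuf makeBuf(const char *bytes, std::size_t n) {
    return std::make_shared<std::vector<char> >(bytes, bytes + n);
}

// Buffer "reference": a buffer plus {offset, length}, with no string
// semantics. Suitable for socket data or stmem-style storage.
struct BufRef {
    SharedBuf buf;
    std::size_t offset;
    std::size_t length;

    const char *data() const { return buf->data() + offset; }

    BufRef sub(std::size_t off, std::size_t len) const {
        assert(off <= length && len <= length - off);
        return BufRef{buf, offset + off, len};
    }
};

// String layer: the same reference plus string-like operations.
class RefString {
public:
    static const std::size_t npos = static_cast<std::size_t>(-1);

    explicit RefString(const BufRef &r) : ref_(r) {}

    std::size_t size() const { return ref_.length; }

    // Index of the first occurrence of c, or npos if absent.
    std::size_t find(char c) const {
        if (ref_.length == 0)
            return npos;
        const void *p = std::memchr(ref_.data(), c, ref_.length);
        if (!p)
            return npos;
        return static_cast<std::size_t>(static_cast<const char *>(p) - ref_.data());
    }

    bool equals(const char *s) const {
        std::size_t n = std::strlen(s);
        return n == ref_.length &&
               (n == 0 || std::memcmp(ref_.data(), s, n) == 0);
    }

    // Substrings share the backend buffer; only offset and length change.
    RefString substr(std::size_t off, std::size_t len) const {
        return RefString(ref_.sub(off, len));
    }

    std::string str() const {
        return ref_.length ? std::string(ref_.data(), ref_.length) : std::string();
    }

private:
    BufRef ref_;
};
```

Keeping BufRef free of string methods is one way to express the split
described above: code that only moves bytes around takes a BufRef, and
only code that parses or compares text needs the string layer.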

I suggest you check out the changes I made in s27_adri to achieve all
of this in a sensible fashion. The bulk of the String-related usage is
the same between Squid-2 and Squid-3. You may also have C++ compiler /
language-related stuff to "get right" which I didn't have to deal with
in C (copy constructors, type conversion, etc.), but I'm sure it won't
be difficult if done piecemeal.

Adrian
Received on Tue Aug 26 2008 - 01:55:55 MDT
