Re: Ref-counted strings in Squid-2/Cacheboy

From: Adrian Chadd <adrian_at_squid-cache.org>
Date: Wed, 21 Jan 2009 10:35:23 -0500

I'd like to avoid having to write to those pages if possible. Leaving
the incoming data read-only saves another write-back pass for those
pages through the cache/bus, and in the case of tiny objects (i.e.,
where parsing becomes a -big- part of the overhead), the extra writes
may end up hurting.

NUL-terminated strings make iteration easier (you only need an address
register and a check for 0) but current CPUs with their
plenty-of-registers and superscalar execution mostly make that point
moot. You can check, increment the pointer and decrement a length
value pretty damned quickly. :)
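
To make that concrete, here is a minimal sketch of a scan over a
pointer-plus-length buffer. The struct strview type and the
strview_find() name are made up for illustration; they are not the
actual String API in Squid-2 or Cacheboy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counted-string view: a pointer plus a length, no NUL needed. */
struct strview {
    const char *buf;
    size_t len;
};

/* Return the offset of the first occurrence of c, or sv.len if absent. */
static size_t
strview_find(struct strview sv, char c)
{
    const char *p = sv.buf;
    size_t left = sv.len;

    assert(sv.buf != NULL || sv.len == 0);
    /* One compare, one pointer increment, one length decrement per octet. */
    while (left > 0 && *p != c) {
        p++;
        left--;
    }
    return sv.len - left;
}
```

The loop never reads past buf + len and never writes to the buffer, so
the incoming data can stay read-only.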

There aren't all that many places that assume C buffer semantics for
String. Most of it isn't all that hairy (access_log, etc); some of it
is only hairy because of the use of _C_ string library functions with
String.buf() (ftp); the biggest annoyance is the vary code and the
client-side code. Oh, and one has to copy the buffer anyway for regexp
lookups (the POSIX regex API requires a NUL-terminated string), at
least until we convert to PCRE, whose regex run function takes a
length parameter. :)
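
As a rough sketch of the regexp case: the copy amounts to something
like the following. counted_regexec() is a hypothetical helper name,
not existing Squid code; regexec() itself is the standard POSIX call.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Match a length-delimited buffer against a compiled POSIX regex.
 * regexec() wants a NUL-terminated string, so copy into a temporary first.
 * Returns 1 on match, 0 on no match, -1 on allocation failure.
 */
static int
counted_regexec(const regex_t *re, const char *buf, size_t len)
{
    char *tmp = malloc(len + 1);
    int rc;

    if (tmp == NULL)
        return -1;
    memcpy(tmp, buf, len);
    tmp[len] = '\0';            /* terminate the copy, not the source */
    rc = regexec(re, tmp, 0, NULL, 0);
    free(tmp);
    return rc == 0 ? 1 : 0;
}
```

Note that an embedded NUL in the source buffer would still cut the
match short here, since regexec() stops at the first NUL it sees.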

The point is, once you've been forced to tidy up the String users by
removing the assumption that NUL will occur, you'll (hopefully) have
been forced to write nicer replacement code, and everyone benefits
from that.

Adrian

2009/1/21 Henrik Nordstrom <henrik_at_henriknordstrom.net>:
> On Fri 2009-01-16 at 12:53 -0500, Adrian Chadd wrote:
>
>> So far, so good. It turns out doing this as an intermediary step
>> worked out better than trying to replace the String code in its
>> entirety with replacement code which doesn't assume NUL terminated
>> strings.
>
> Just a thought, but is there really any parsing step where we cannot
> just overwrite the next octet with a \0 to get NUL-terminated strings?
> This is what the parser does today, right?
>
> The HTTP parser certainly can NUL-terminate everything in place. Header
> names always end with a ':', which we always throw away, and the data
> ends with a newline, which is also thrown away.
>
> Regards
> Henrik
>
>
Received on Wed Jan 21 2009 - 15:35:33 MST
