Re: [PATCH] Make SBuf::find_first_of O(N) and implement find_first_not_of

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 17 Dec 2013 08:32:35 -0700

On 12/17/2013 03:12 AM, Kinkie wrote:
>>> The choice to make the vector size fixed does have some sense.
>>> std::vector::operator[] doesn't resize. If we want to make
>>> variable-size vectors, then add() becomes heavy because it must
>>> reserve() to resize chars_ if the vector is not big enough, and then
>>> we should de-inline add.. I'm fine either way, I leave the choice to
>>> you.
>>
>> Sorry, I was confused: I started thinking "number of characters in
>> CharacterSet" instead of "number of internal array elements". All
>> CharacterSets will have exactly 256 elements, of course (that condition
>> applies to many methods not just this one).
>
> This is not strictly necessary: at the expense of some computation the
> array should have at least (pseudo-code)
> max(asc(char[]))-min(asc(char[])) slots. At the expense of some (but
> less than in the previous case less) computation, at least
> max(asc(char[])). We'd have to check bounds in operator[],
> chars_.reserve() in add() and in operator += but that's about it. The
> expected saving is 1/2 to 2/3 of the memory (assuming we will mostly
> deal with 7-bit printable characters). Your call if it's worth it.

I think it would be wrong to optimize CharacterSet RAM usage at the
expense of code speed and complexity. I recommend leaving all
CharacterSet storage containers at 256 bytes, at least for now.

>>>> Rewrite the += operator loop to simply
>>>> this->add() every character in the src set. Use std::for_each or another
>>>> <algorithm> for that if possible.
>>
>> I think the above is still valid though.
>
> Sure, it can be done. But it'd mean building a position-to-character
> mapping method (currently we only map character-to-position). This in
> turn would mean using a position-based loop and operator[] instead of
> an iterator. Is it worth it for something which is essentially a dumb
> copy on a probably-not-too-used operation?
> Note: I'm not resisting the change. Just pointing to the drawbacks I
> see so that an informed decision can be made :)

Just using add() in an explicit for-loop is probably the best we can do
for now.

HTH,

Alex.
Received on Tue Dec 17 2013 - 15:33:08 MST

This archive was generated by hypermail 2.2.0 : Tue Dec 17 2013 - 12:00:11 MST