Re: [PATCH] Make SBuf::find_first_of O(N) and implement find_first_not_of

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 16 Dec 2013 15:36:46 -0700

On 12/16/2013 02:59 PM, Kinkie wrote:
> On Mon, Dec 16, 2013 at 5:40 PM, Alex Rousskov wrote:
>> On 12/15/2013 04:54 AM, Kinkie wrote:
>>> attached patch is from branch lp:~squid/squid/characterset/

>>> + /// add a given char to the character set. c must be >= 0.
>>> + CharacterSet & add(const unsigned char c) {chars_[static_cast<uint8_t>(c)] = true; return *this; }
>>
>> The "c must be >= 0" comment is redundant given that "c" is unsigned. If
>> you are worried about folks adding negative signed chars, try adding a
>> corresponding add(const signed char) method with a check.
>
> Yup. Relic from when c was signed.
> Regarding signed-to-unsigned conversions, I believe we have the
> compiler catch them now, don't we?

I do not know, sorry (but you can easily test).

>> * CharacterSet creation and merging (+=) are heavy operations that do
>> not benefit from being inlined. I know it is a pain to deal with
>> Makefile dependencies, and I do not insist on this change, but please do
>> seriously consider placing them in .cc file.
>
> If we were to remove them from the class definition but leave them in
> the .h-file, would they still be inlined? (that's the easy way out).

I believe you would have to use an "inline" declaration keyword in that
case (to avoid "redefinition" errors), which would make them inlined.

> Otherwise, I'll try to put them in a new .cc file.

Thank you.

>> FWIW, the following problems can be solved easier or better in a .cc
>> file. With a little bit of additional work, you can even remove #include
>> <vector> from CharacterSet users that way!
>
> I don't see how if we want to leave CharacterSet::operator[] inlined,
> and that's arguably the most time-critical operation this code
> contains.

Good point. The [] operator should stay inlined, dragging <vector> with it.

>>> + //precondition: src.chars_.size() == chars_.size()
>>
>> There is no such precondition. I should be able to merge ALPHA and
>> NUMERIC sets, for example.

> The choice to make the vector size fixed does have some sense.
> std::vector::operator[] doesn't resize. If we want to make
> variable-size vectors, then add() becomes heavy because it must
> reserve() to resize chars_ if the vector is not big enough, and then
> we should de-inline add.. I'm fine either way, I leave the choice to
> you.

Sorry, I was confused: I started thinking "number of characters in
CharacterSet" instead of "number of internal array elements". All
CharacterSets will have exactly 256 elements, of course (that condition
applies to many methods not just this one).

>> Rewrite the += operator loop to simply
>> this->add() every character in the src set. Use std::for_each or another
>> <algorithm> for that if possible.

I think the above is still valid though.

Thank you,

Alex.
Received on Mon Dec 16 2013 - 22:37:18 MST

This archive was generated by hypermail 2.2.0 : Tue Dec 17 2013 - 12:00:11 MST