Re: StringNG SBuf fixes for rfind()

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 05 Mar 2013 11:24:08 -0700

On 03/03/2013 10:45 PM, Amos Jeffries wrote:

> http://master.squid-cache.org/~amosjeffries/patches/SBuf_fixes_mk3.patch

* SBuf::rfind(const SBuf &needle, SBuf::size_type endPos) const

> + // on empty hay std::string returns npos
> + if (length() < needle.length())
> + return SBuf::npos;

The comment is inaccurate: empty std::string returns npos only when the
needle is not empty. Besides, the if-statement does not check for an
empty hay at all. The above code is correct -- if needle does not fit we
can only return npose, of course, but I do not think we need a reference
to std::string in this case.

> // on empty needle std::string returns the position the search starts
> if (needle.length() == 0)
> return endPos;

The comment is a little misleading: std::string actually returns its
length when searching start exceeds that length:

> std::string("a2").rfind("", 0) = 0
> std::string("a2").rfind("", 1) = 1
> std::string("a2").rfind("", 2) = 2
> std::string("a2").rfind("", 3) = 2

The code itself is correct because you adjust endPos before getting to
that if-statement. Instead of trying hard to describe what std::string
does correctly, I suggest just saying something like this:

    // match std::string::rfind() behavior on empty needles

* SBuf::rfind(char c, SBuf::size_type endPos) const

> + // on empty hay std::string returns size of hay
> + if (length() < 1)
> return SBuf::npos;

The comment is out of sync with the correct code. Here we do not need to
refer to std::string IMO.

> There is a rather strange problem with using memrchr(). If you pass it
> the accurate count of bytes to check it fails to match start/end of hay
> properly in sub-string cases.

It does not in my tests:

> memrchr("a2", '2', 0) = 0
> memrchr("a2", '2', 1) = 0
> memrchr("a2", '2', 2) = 0x8048ba1
> memrchr("a2", '2', 3) = 0x8048ba1

> memrchr("2a", '2', 0) = 0
> memrchr("2a", '2', 1) = 0x8048c61
> memrchr("2a", '2', 2) = 0x8048c61
> memrchr("2a", '2', 3) = 0x8048c61

> memrchr("a2a", '2', 0) = 0
> memrchr("a2a", '2', 1) = 0
> memrchr("a2a", '2', 2) = 0x8048c02
> memrchr("a2a", '2', 3) = 0x8048c02

All of the above seem to work as expected when the "accurate count of
bytes to check" is passed.

> I have had to use endPos+1 and artificially manipulate the situations
> where endPos == length().

Your mk3 code seems to be correct, with no strange problems in this
area: You are correctly converting character position to the number of
bytes to scan. I recommend replacing the "weirdness" comment with
something like "convert character position to the number of characters
to scan".

HTH,

Alex.
Received on Tue Mar 05 2013 - 18:24:15 MST

This archive was generated by hypermail 2.2.0 : Wed Mar 06 2013 - 12:00:05 MST