Re: [PATCH] Use SBuf for tunnel.cc I/O

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 03 Jun 2014 07:39:41 -0600

On 06/02/2014 11:20 PM, Amos Jeffries wrote:
> On 3/06/2014 6:39 a.m., Alex Rousskov wrote:
>> On 06/01/2014 02:37 AM, Amos Jeffries wrote:
>>> Lets take a step aside and cover why the buf2 is a pointer instead of an
>>> separate SBuf.
>>
>> I do not think those reasons are important enough in this context. It is
>> like suggesting "let's examine the benefits of using uninitialized
>> variables and creating dangling pointers" -- whatever the benefits are,
>> they are insufficient to knowingly add such code to Squid.
>>
>> Giving Comm a raw pointer to an internal job member is very similar to
>> "using uninitialized variables and creating dangling pointers". No
>> amount of resulting optimizations can justify adding that kind of API IMO.

> The caller Job has always been providing IoCallback with a buffer to
> fill. Before it was a shared char*. Now it is a shared SBuf*.

I can argue that screwing up an SBuf pointer (which actually points to
something else!) is a lot easier than screwing up a basic char pointer,
but my main argument is that adding a broken API is a bad idea,
regardless of whether we already have similar broken APIs.

There have been bugs where not canceling comm_read (which is still not
possible to do reliably!) led to disasters. There is no compelling
reason to duplicate that experience now IMO.

>>> * simple solution and ony way to guarantee that data-copy is not
>>> performed is to have comm_read operate on a pointer to sb1.
>>
>> AFAICT, among correct solutions, there is no simple solution that avoids
>> data copy. We can either remain "simple" and copy OR add "complexity" to
>> avoid copying.
>
> The solution Squid has always had:
> Caller provides comm_read() with a buffer *pointer* and explicitly
> cancels the read or closes the FD before it dies so the IoCallback /w
> pointer will be cleaned up.

Right. The above solution is both complex and wrong. We have to keep
using it until a better solution is available. No need to double the
problem by adding buf2 pointer to the buf pointer.

>>> comm_write() is a slightly different story. It *is* possible to take sb2
>>> as a copy then the write handler sb1.consume(len) for the written bytes.

Yes, read and write needs are different and probably require a different
optimal API.

> All of the parser-ng progress is being done on the assumed basis that we
> will have zero-copy to get data into SBuf somewhere between TCP read(2)
> and parser. If we have to revert to TCP->char*->SBuf we incur +1 or +2
> data copies *per packet*.

It is, of course, possible to eliminate copies. However, there should be
no assumption that such elimination is done using raw SBuf pointers. I
suspect performance gains from better parsers will outweigh extra read
copies, but I may be wrong, and it is possible to add a good API that
eliminates those extra copies.

> Optimizing performance using SBuf to manage 3 bytes of "GET" method
> makes no sense if squid has to re-allocate and memmove() several
> up-to-64KB chunks to parse them in the first place.
>
> IMO it will be worth some time now sorting this out properly before we
> add more processing code using SBuf.

Agreed. The best time for sorting this out was before buf2 commit, but
going through the motions now is better than doing it later.

> If you want an IRC meet-up to make this faster and save some time I can
> do that any of the next few days.

Sure, I will ping you.

Cheers,

Alex.
Received on Tue Jun 03 2014 - 13:39:45 MDT

This archive was generated by hypermail 2.2.0 : Tue Jun 03 2014 - 12:00:17 MDT