Re: [PATCH] Use SBuf for tunnel.cc I/O

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 03 Jun 2014 17:20:52 +1200

On 3/06/2014 6:39 a.m., Alex Rousskov wrote:
> On 06/01/2014 02:37 AM, Amos Jeffries wrote:
>> Lets take a step aside and cover why the buf2 is a pointer instead of an
>> separate SBuf.
>
> I do not think those reasons are important enough in this context. It is
> like suggesting "let's examine the benefits of using uninitialized
> variables and creating dangling pointers" -- whatever the benefits are,
> they are insufficient to knowingly add such code to Squid.
>
> Giving Comm a raw pointer to an internal job member is very similar to
> "using uninitialized variables and creating dangling pointers". No
> amount of resulting optimizations can justify adding that kind of API IMO.
>

The caller Job has always been providing IoCallback with a buffer to
fill. Before it was a shared char*. Now it is a shared SBuf*.

 ===> rev.13324 simply replaces a char[] with a SBuf.

Any "dangling pointers" will be affecting Squid regardless of the SBuf
I/O patch.

I did go though carefully and make sure references to the
IoCallback::buf member were handled properly inside comm and IoCallback.
So uninitialized variables should also not be an issue for the patch as
comitted, if we find any they need fixing rather than revert.

So where is the problem?

>
>> * simple solution and ony way to guarantee that data-copy is not
>> performed is to have comm_read operate on a pointer to sb1.
>
> AFAICT, among correct solutions, there is no simple solution that avoids
> data copy. We can either remain "simple" and copy OR add "complexity" to
> avoid copying.

The solution Squid has always had:
  Caller provides comm_read() with a buffer *pointer* and explicitly
cancels the read or closes the FD before it dies so the IoCallback /w
pointer will be cleaned up.

Complexity is added due to two reasons:
 1) the need to retain char* API for comm_read/comm_write. If we dared
convert all I/O buffers to SBuf immediately that complexity could be
removed.
 2) the natural complexity difference between char[] and SBuf. That is
unfortunate, but the limited SBuf API usage needed does not appear to be
a bad trade for the zero-copy benefit to be gained end-to-end through Squid.

>
>> comm_write() is a slightly different story. It *is* possible to take sb2
>> as a copy then the write handler sb1.consume(len) for the written bytes.
>>
>> However, since read has the "SBuf *buf2" as a pointer
>
> ... so we are back to the discussion above: Since I do not think having
> buf2 pointer should be allowed, any conclusion based on having it there
> becomes irrelevant.
>
>
>> If you can provide a way to avoid all the above problems while
>> performing comm_read() on a non-pointer IoCallback "SBuf buf2;" I am
>> very interested.
>
> I do not think me providing a solution to all problems should be a
> precondition to removing unaudited code that introduces a raw pointer
> shared between Comm and jobs. If I understand the intent behind the
> audit procedures correctly, it suggests the opposite: removal of the
> code until a better solution is agreed upon.
>
> There are no simple solutions here and the complex ones require serious
> discussions and significant investment of time. IMO, there are better
> ways to spend those resources right now. In other words, there are more
> important problems to fix. Removing a data copy while crossing Comm
> boundary can wait, especially if nobody wants to take a lead on that
> project. Meanwhile, we can add a safe SBuf-friendly API and continue to
> copy.

All of the parser-ng progress is being done on the assumed basis that we
will have zero-copy to get data into SBuf somewhere between TCP read(2)
and parser. If we have to revert to TCP->char*->SBuf we incur +1 or +2
data copies *per packet*.
 Optimizing performance using SBuf to manage 3 bytes of "GET" method
makes no sense if squid has to re-allocate and memmove() several
up-to-64KB chunks to parse them in the first place.

IMO it will be worth some time now sorting this out properly before we
add more processing code using SBuf.

If you want an IRC meet-up to make this faster and save some time I can
do that any of the next few days.

Amos
Received on Tue Jun 03 2014 - 05:21:02 MDT

This archive was generated by hypermail 2.2.0 : Tue Jun 03 2014 - 12:00:17 MDT