Re: [PATCH] Move timeout handling from ConnOpener::connect to ConnOpener::timeout

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 24 Jan 2013 16:40:40 -0700

On 01/18/2013 04:54 AM, Amos Jeffries wrote:
> On 18/01/2013 6:47 p.m., Alex Rousskov wrote:
>> Will comm_close() break if we call it instead of reaching into Comm guts
>> and manipulating its tables? If yes, what is Comm missing? Perhaps we
>> can give that missing piece to Comm so that we can comm_close()?

Amos,

    First of all, thank you for taking the time to answer my questions.
I believe this issue is an important part of ConnOpener cleanup.

> * Notice that until ConnOpener has finished the FD is not open, as in
> "if(isOpen(fd)) return;" produces false, so comm_close() will exit
> before it gets to any cleanup.

AFAICT, Comm marks FD open upon successful comm_openex() exit.
ConnOpener even asserts that the temporary FD is flagged as open:

> // Update the fd_table directly because conn_ is not yet storing the FD
> assert(temporaryFd_ < Squid_MaxFD);
> assert(fd_table[temporaryFd_].flags.open);

Connection::isOpen() will still return false, but we do not care about
that. We just want to call comm_close() as everybody else and
comm_close() does not require Connection.

> * The registered close handler is earlyAbort which flags the FD as error
> instead of timeout or success.

Yes. We would have to remove our close handler before closing if we do
not want it to be called when we close, just like all other code using
Comm I/O does. The close handler is there for ConnOpener to be notified
if some _other_ code closes our FD (e.g., shutdown).

> * fd_close(temporaryFd_), which we use already, cleans up all the
> non-ConnOpener state which is associated with the FD.

AFAICT, here is some of the other things we are missing by poorly
duplicating comm_close() sequence:

> ++ statCounter.syscalls.sock.closes;
>
> /* When one connection closes, give accept() a chance, if need be */
> Comm::AcceptLimiter::Instance().kick();

> fdd_table[fd].close_file = file;
> fdd_table[fd].close_line = line;

> comm_empty_os_read_buffers(fd);

Are these critical? Probably not, but they do invalidate the point that
our cleanup code is equivalent to comm_close(). Sooner or later, some of
these will come back to hunt us.

> It is simpler just to prune out the two bits of ConnOpener state in
> ConnOpener itself rather than rely on the generic close sequence, which
> has to do a lot of checking for irrelevant things.

ConnOpener state should be pruned in ConnOpener, of course, but I think
it is clear that we are not doing a good job cleaning Comm state. We
duplicate some comm_close() code but missing some details. I agree that
comm_close() sequence has many irrelevant things, but it is also used
only in a rare case when ConnOpener fails. We do not need to optimize
that path!

>> I am just trying to find a way to call comm_close() to clean up Comm
>> state without violating Comm APIs. We will clean the job's state in
>> addition to calling comm_close().
>
> ConnOpener lives in the limbo area between fd_open()/fd_close() which
> sets up the TCP state on an FD and comm_open()/comm_close() which setup
> and tear down the mountain of comm transaction I/O state on an FD.

I do not see what our temporary FD lacks that the rest of Squid code
using comm_open() and comm_close() have. I do not see why it is in that
limbo area. Perhaps it was at some point but not anymore?

> On the other error/timeout exit points from ConnOpener fd_close()+unwinding
> is the more appropriate way to cleanup [because Comm failed to setup the
> FD]

It did not fail. Comm_openex() was successful.

> You may indeed be able to re-write comm_close() such that it will work
> on any FD, including one which failed to reach success in ConnOpener.
> But I think that is a much more difficult/larger project outside the
> scope of the priority bug-fixes Rainer is working on here.

Perhaps I am missing something, but I do not see why we cannot call
comm_close() without any rewrite. We would still need to solve several
ConnOpener-specific problems, of course, the ones that are orthogonal to
us not properly cleaning up Comm state on failures.

Thank you,

Alex.
Received on Thu Jan 24 2013 - 23:40:51 MST

This archive was generated by hypermail 2.2.0 : Fri Jan 25 2013 - 12:00:09 MST