Re: [PATCH] Comm cleanup (Async ConnOpener)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Mon, 19 Jul 2010 23:26:42 +1200

Alex Rousskov wrote:
> On 07/18/2010 04:52 AM, Amos Jeffries wrote:
>> This updated my previous submission for the comm connection opener model.
>>
>> diff since last time:
>>
>> - removed the multi-path pathways. They are now isolated at the upper
>> level. ConnOpener now handles only one Comm::Connection.
>>
>> - implemented ConnOpener as an AsyncJob child.
>>
>>
>> What has changed from HEAD:
>>
>> ConnectionDetails objects have been renamed Comm::Connection and been
>> extended to hold the FD and Squids' socket flags.
> ...
>> On COMM_ERR_CONNECT and COMM_TIMEOUT the conn will be !isOpen() or NULL
>> and should not be relied on.
>
> If the connection pointer is NULL, a job opening multiple connections
> would not be able to tell which connection just failed. Let's not hide
> connection pointer from the initiator.
>
>
>
> * Can Adaptation::Icap::Xaction::closeConnection set connection to NULL?

In the reuseConnection == false case, it can after the connection->close().

>
> * Connection::close should set fd to -1 only when open (since we have to
> spent time doing the check, let's use the result).

Done. Using conn->isOpen() which was already present.

>
> * declare ConnOpener::start() as virtual, especially since you declare
> other API methods virtual.
>

Done.

> * ConnOpener::start(), doneAll(), and swanSong() should remain protected
> as they are in the AsyncJob.
>

Done. Apart from doneAll(). It is required to be accessible to the
ConnectRetry wrapper until comm write callbacks are turned into methods.

> * IMO, "********" distract and encourage others to put even more
> fanciful decorations into the code.

Sigh. So pretty... Removed.

>
> * ConnOpener::operator = should return a reference to ConnOpener, not
> just ConnOpener.
>

Done.

>
> * You document private and protected methods in .h files where they are
> declared. Did not we agree to document private and protected methods in
> .cc files, where they are defined? Or was I doing it wrong for the last
> few patches?

We did. Sorry. Fixed.

>
> * s/solo/conn/ and adjust its documentation since ConnOpener no longer
> handles multiple connections?

Done.

>
> * Move connect_timeout declaration lower, just above connstart to reduce
> padding?

Done.

>
> * s/struct _calls/struct Calls/

Done.

>
> * Naming consistency:
>
> s/connstart/connStart/
> s/earlyabort/earlyAbort/
> s/connect_timeout/connectTimeout/
> s/fail_retries/failRetries/
>

clash with member methods. Using camelCase _ suffixed private naming
instead.

>
> * cbdataReferenceDone() already checks for NULL pointer. You can save
> two lines and some cycles if you do not do it in
> Comm::Connection::~Connection().

Done.

>
> * If you do not want Connections to be copied and assigned, make the
> corresponding methods private. It would produce errors earlier and the
> error messages will be a little more clear.

Done.

>
> * No need to pre-declare "class Connection" in
> src/comm/ListenStateData.h because you already included comm/forward.h.
>

Done.

> * Do you need "Comm::" prefixes inside Comm namespace in comm/forward.h?
>
>
> * Please remove the following line. It is a waste of CPU cycles, not to
> mention a blunt layering violation:
>
> memset(&calls, 0, sizeof(calls));

Done.

>
>
> * Please remove these redundant and inconsistent lines in
> ConnOpener::~ConnOpener():
>
> + solo = NULL;
> + calls.earlyabort = NULL;
> + calls.timeout = NULL;
>

Done.

> * I do not recommend putting debugging in doneAll(). The best place to
> report state is in status(), which is called automatically before and
> after every job call (and can be used in the debugging statements inside
> the job, of course).

Killed.

>
> * doneAll() should call its parent, even if the current parent always
> returns true.

Done.

>
> * Should ConnOpener::swanSong() call comm_close instead of just close()?

No. Until ConnOpener() has completed successfully there is no other
valid user of the conn->fd. The only handlers to deal with are
ConnOpeners' earlyAbort and timeout handlers, which are explicitly
cancelled and removed.
  comm_close does a lot of additional handler and callback state
updating over multiple callbacks. This is all unnecessary waste of
cycles on a closed or maybe half-open FD.

  As you noticed the earlyAbort and timeout handler removal and
cancellation only happens if the ConnOpener knows the FD is sane enough
to do a removal before scheduling the callback. Otherwise the calls are
left untouched and swanSong cleans up.

You said I was to account for swanSong being called under any unknown
circumstances, so I've made it handle any possible state of ConnOpener
prior to destructor. The important local fields are removed in an order
to reduce the work done by callCallback() to absolute minimum in case of
more errors.
   callCallback is the "normal" exit point for regular operations and
needs to cleanup with comm_remove_* and touch a lot of complex fde table
state if things have run properly. On some of the swanSong error states
(half-open FD hits Must() somewhere in INPROGRESS state) this is not
possible and WILL crash Squid.

>
> * ConnOpener::callCallback() clears the close handler and timeout
> depending on solo state, which is kind of wrong, but the best thing to
> do is to remove that cleanup code from callCallback() completely because
> the same code is already present in swanSong() and because it is
> unrelated to calling the callback.

Ideally yes. I'm not certain enough about the state of the Job on the
second (ConnectRetry) legacy entrance to remove either at this point.
When the Write handling is able to be a method properly then the
swanSong() pre-cancellations can be removed.
  For now we have the NULL checks saving cycles where possible without
adding any risky situations.

>
> * It is better not to not assert() inside async job code. Use Must() and
> let Squid kill the job instead of killing Squid. One could make an
> exception for before-start() code and for the destructor, but even those
> are debatable.

You mean the one in callCallback()? Removed.

>
> * solo->fd > 0 is inconsistent with other checks. I would recommend
> replacing (solo != NULL && solo->fd >= 0) with an inlined
> hasConnection() or similar method to avoid these kind of problems.

Updated to conn_->isOpen() after renaming.

>
> * The /* handle connecting to one single path */ comment is obsolete.
>
Updated.

> * It may not be a good idea to separate calls.earlyabort and
> calls.timeout existence from the existence of the corresponding Comm
> close or timeout handler. Set them when setting the Comm handler. Clear
> them when clearing the handler.

What do you mean? they _are_ the close and timeout handlers for the
duration of the ConnOpener activity. Set on comm_openex and removed on
ConnOpener::callCallback()

>
> * I still think that keeping an array of unopened connections and using
> the first entry as "real connection" in FwdState is wrong. However, if
> you insist on doing that, let's at least hide that ugliness using a few
> simple inlined methods:
>
> - s/paths.size() > 0 && paths[0]->fd > -1/connectedToServer()/
> - s/paths[0]->/serverConnection()/
>
> The serverConnection() method should throw or assert if paths are empty.
>
> Actually, you already have conn() method so you can just fix that: make
> it return a reference to the pointer and check that paths are not empty.
> this is especially important since we call conn() from outside the
> FwdState object.
>

Fixed and changed to serverConnection(). All code pulling paths[0] for
use changed to use the wrapper.

>
> * "if (paths[0]->fd > -1)" statements in the reforwarding code do not
> check that the paths are not empty. Other FwdState code does. Should we
> check here too? Or do not check in other contexts?
>
> Same question for FwdState::serverClosed() and some other FwdState code.
> Let's be safe and consistent everywhere.

Added FwdState::isServerConnectionOpen() to unify the tests.

>
>
> * FwdState::unregister() used to prevent FwdState from closing the
> unregistered connection. The new code does not. Was the old code buggy?
>

Only in Comm::Connection semantics. Added an alternative version and
deprecated the old unregister(int) with a hack to keep it going on the
old semantics.

>
> * The following code is redundant as the compiler will do similar stuff,
> just more efficiently before exiting the "conn" context:
>
> conn = NULL; // maybe release the conn memory

Small optimization. Pinned and persistent connections can nest fairly
deeply. With a conn at each level. This releases those resources as soon
as we are surely done with them.

>
> * Shifting the hosts array is quite expensive as a lot of refcounted
> pointers need to be re-assigned. Would it be better to replace 0 with
> connIdx and increment that index instead of shifting?

For performance yes. I'll look at implementing that boost later.

>
> * Empty comment near:
> + char nextHop[256]; //
>

Removed.

> * In the description below,
> s/possible paths .../connections to open, in order, until successful/
> because it is not clear from the context what "path" is.
> + /** possible paths which may be tried (in sequence stored) */
> + Comm::Paths paths;

Done.

>
> And why is it called "paths"? Each paths element (i.e., a path) only
> contains one point (the next hop). A path is a sequence of points.
> Should we call the "paths" member "hops", "peers", or "servers"?

Sequence of two location data points are stored in each conn:
   (local,remote)

The paths vector<> holds a series of such.

Though I agree paths is not a good name, and I am leaning towards
remoteDestinations or somesuch now.

>
> * ftp.cc contains duplicate FtpStateData::ftpPasvCallback creation code
> that leads to AsyncJob::AsyncStart(cs). Please move it into a new
> startPassiveFtp() method or some such.

Disagree. The conn details are not the same. Only the name of the
callback and the fact its now starting a ConnOpener. Which means a new
file-global function is needed just for four lines of code.

Maybe when FtpState is cleaned up further.

>
> * FwdState::complete may shift paths (via reforward). If that happens,
> the following will close a different/new connection or assert/throw (or
> even segfault in the current patch):
>
> fwd->complete();
> - comm_close(fd);
> + fwd->conn()->close();

I gopher.cc? Good catch. Rolling that change back.

>
> Overall, exposing connection pointer to Servers seems wrong, especially
> because the pointer may change from one call to another. A Server that
> needs that pointer, should store it at initialization time, just like it
> stores fd in the current code.
>
> You have "TODO: store Comm::Connection instead of FD". This is an XXX
> and not just a TODO because of the above concerns.
>

This auditing task is scoped to do the connection setup and get
split-stack IPv6 a chance of going operational. Which means as little
upper layer change in the Servers as we can cope with.

The following steps are to cleanup ListenStateData and CommCalls a bit
more then to roll conn across all the objects which as FD between
themselves.

Which reminds me:
   Is there any reason why the CommCalls have unique params classes and
dialers for CommTimeoutCb* and CommCloseCb* ? I'd kind of like to roll
them and CommConnectCb* into passing CommCommonCb* if thats possible,
and cut ~25% of the CommCalls code on the next cleanup step.

>
> * Will we start leaking peers after the change below?
>> - if (fwd->servers)
>> - _peer = fwd->servers->_peer; /* might be NULL */
>> + if (fwd->conn() != NULL)
>> + _peer = cbdataReference(fwd->conn()->getPeer()); /* might be NULL */
>
> I cannot find the corresponding cbdataReferenceDone added somewhere.

Haven't seen any. Though it may be happening. Added to the matching
destructor.

>
> * Creating a temporary refocunted pointer to a local variable is a bad
> idea, no matter what Ident::Start does. The object will be deallocated
> twice: once at the end of the if-statement, and once when the refcounter
> reaches zero:
>
> + Comm::Connection cc; // IDENT will clone it's own copy
> + Comm::ConnectionPointer ccp = &cc;
>

Okay. Made it dynamic.

>
> * Declare Comm::ConnectionPointer p inside the for-loop in
> peerSelectDnsResults() if that variable is not used outside the loop.

Done.

>
>
> That is all I had time for, sorry.
>
> Alex.
>
>

All changed. Running my checks again now before commit and re-submit.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.5
Received on Mon Jul 19 2010 - 11:27:45 MDT

This archive was generated by hypermail 2.2.0 : Mon Jul 19 2010 - 12:00:08 MDT