Re: [PATCH] Comm cleanup (Async ConnOpener)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sun, 18 Jul 2010 19:15:37 -0600

On 07/18/2010 04:52 AM, Amos Jeffries wrote:
> This updated my previous submission for the comm connection opener model.
>
> diff since last time:
>
> - removed the multi-path pathways. They are now isolated at the upper
> level. ConnOpener now handles only one Comm::Connection.
>
> - implemented ConnOpener as an AsyncJob child.
>
>
> What has changed from HEAD:
>
> ConnectionDetails objects have been renamed Comm::Connection and been
> extended to hold the FD and Squids' socket flags.
...
> On COMM_ERR_CONNECT and COMM_TIMEOUT the conn will be !isOpen() or NULL
> and should not be relied on.

If the connection pointer is NULL, a job opening multiple connections
would not be able to tell which connection just failed. Let's not hide
connection pointer from the initiator.

* Can Adaptation::Icap::Xaction::closeConnection set connection to NULL?

* Connection::close should set fd to -1 only when open (since we have to
spent time doing the check, let's use the result).

* declare ConnOpener::start() as virtual, especially since you declare
other API methods virtual.

* ConnOpener::start(), doneAll(), and swanSong() should remain protected
as they are in the AsyncJob.

* IMO, "********" distract and encourage others to put even more
fanciful decorations into the code.

* ConnOpener::operator = should return a reference to ConnOpener, not
just ConnOpener.

* You document private and protected methods in .h files where they are
declared. Did not we agree to document private and protected methods in
.cc files, where they are defined? Or was I doing it wrong for the last
few patches?

* s/solo/conn/ and adjust its documentation since ConnOpener no longer
handles multiple connections?

* Move connect_timeout declaration lower, just above connstart to reduce
padding?

* s/struct _calls/struct Calls/

* Naming consistency:

  s/connstart/connStart/
  s/earlyabort/earlyAbort/
  s/connect_timeout/connectTimeout/
  s/fail_retries/failRetries/

* cbdataReferenceDone() already checks for NULL pointer. You can save
two lines and some cycles if you do not do it in
Comm::Connection::~Connection().

* If you do not want Connections to be copied and assigned, make the
corresponding methods private. It would produce errors earlier and the
error messages will be a little more clear.

* No need to pre-declare "class Connection" in
src/comm/ListenStateData.h because you already included comm/forward.h.

* Do you need "Comm::" prefixes inside Comm namespace in comm/forward.h?

* Please remove the following line. It is a waste of CPU cycles, not to
mention a blunt layering violation:

memset(&calls, 0, sizeof(calls));

* Please remove these redundant and inconsistent lines in
ConnOpener::~ConnOpener():

+ solo = NULL;
+ calls.earlyabort = NULL;
+ calls.timeout = NULL;

* I do not recommend putting debugging in doneAll(). The best place to
report state is in status(), which is called automatically before and
after every job call (and can be used in the debugging statements inside
the job, of course).

* doneAll() should call its parent, even if the current parent always
returns true.

* Should ConnOpener::swanSong() call comm_close instead of just close()?

* ConnOpener::callCallback() clears the close handler and timeout
depending on solo state, which is kind of wrong, but the best thing to
do is to remove that cleanup code from callCallback() completely because
the same code is already present in swanSong() and because it is
unrelated to calling the callback.

* It is better not to not assert() inside async job code. Use Must() and
let Squid kill the job instead of killing Squid. One could make an
exception for before-start() code and for the destructor, but even those
are debatable.

* solo->fd > 0 is inconsistent with other checks. I would recommend
replacing (solo != NULL && solo->fd >= 0) with an inlined
hasConnection() or similar method to avoid these kind of problems.

* The /* handle connecting to one single path */ comment is obsolete.

* It may not be a good idea to separate calls.earlyabort and
calls.timeout existence from the existence of the corresponding Comm
close or timeout handler. Set them when setting the Comm handler. Clear
them when clearing the handler.

* I still think that keeping an array of unopened connections and using
  the first entry as "real connection" in FwdState is wrong. However, if
you insist on doing that, let's at least hide that ugliness using a few
simple inlined methods:

  - s/paths.size() > 0 && paths[0]->fd > -1/connectedToServer()/
  - s/paths[0]->/serverConnection()/

The serverConnection() method should throw or assert if paths are empty.

Actually, you already have conn() method so you can just fix that: make
it return a reference to the pointer and check that paths are not empty.
this is especially important since we call conn() from outside the
FwdState object.

* "if (paths[0]->fd > -1)" statements in the reforwarding code do not
check that the paths are not empty. Other FwdState code does. Should we
check here too? Or do not check in other contexts?

Same question for FwdState::serverClosed() and some other FwdState code.
Let's be safe and consistent everywhere.

* FwdState::unregister() used to prevent FwdState from closing the
unregistered connection. The new code does not. Was the old code buggy?

* The following code is redundant as the compiler will do similar stuff,
just more efficiently before exiting the "conn" context:

    conn = NULL; // maybe release the conn memory

* Shifting the hosts array is quite expensive as a lot of refcounted
pointers need to be re-assigned. Would it be better to replace 0 with
connIdx and increment that index instead of shifting?

* Empty comment near:
+ char nextHop[256]; //

* In the description below,
s/possible paths .../connections to open, in order, until successful/
because it is not clear from the context what "path" is.
+ /** possible paths which may be tried (in sequence stored) */
+ Comm::Paths paths;

And why is it called "paths"? Each paths element (i.e., a path) only
contains one point (the next hop). A path is a sequence of points.
Should we call the "paths" member "hops", "peers", or "servers"?

* ftp.cc contains duplicate FtpStateData::ftpPasvCallback creation code
that leads to AsyncJob::AsyncStart(cs). Please move it into a new
startPassiveFtp() method or some such.

* FwdState::complete may shift paths (via reforward). If that happens,
the following will close a different/new connection or assert/throw (or
even segfault in the current patch):

         fwd->complete();
- comm_close(fd);
+ fwd->conn()->close();

Overall, exposing connection pointer to Servers seems wrong, especially
because the pointer may change from one call to another. A Server that
needs that pointer, should store it at initialization time, just like it
stores fd in the current code.

You have "TODO: store Comm::Connection instead of FD". This is an XXX
and not just a TODO because of the above concerns.

* Will we start leaking peers after the change below?
> - if (fwd->servers)
> - _peer = fwd->servers->_peer; /* might be NULL */
> + if (fwd->conn() != NULL)
> + _peer = cbdataReference(fwd->conn()->getPeer()); /* might be NULL */

I cannot find the corresponding cbdataReferenceDone added somewhere.

* Creating a temporary refocunted pointer to a local variable is a bad
idea, no matter what Ident::Start does. The object will be deallocated
twice: once at the end of the if-statement, and once when the refcounter
reaches zero:

+ Comm::Connection cc; // IDENT will clone it's own copy
+ Comm::ConnectionPointer ccp = &cc;

* Declare Comm::ConnectionPointer p inside the for-loop in
peerSelectDnsResults() if that variable is not used outside the loop.

That is all I had time for, sorry.

Alex.

> FD opening, FD problems, connection errors, timeouts, early remote
> TCP_RST or NACK closure during the setup are all now wrapped out of
> sight inside ConnOpener.
> NP: I was not able to determine accurately the ICAP code which handled
> some of those errors. So it is likely to be sitting around still an
> unused. If so it needs to be called from the connect-callback handler on
> the appropriate result state as described above.
>
> The main-level component may set FD handlers as needed for read/write
> and closure of the link in their connection-done handler where the FD
> first becomes visible to them.
>
>
> Future work remains the same, once this is stable in 3.HEAD is to (in no
> particular order):
>
> * make ICAP do DNS lookups to set its server Comm::Connection properly.
> For now it's stuck with the gethostbyname() blocking lookup.
>
> * push the IDENT, NAT, EUI and TLS operations down into the Comm layer
> with simple flags for other layers to turn them on/off as desired.
>
> * make the general code pass Comm::Connection around so everything like
> ACLs can access the client and server conn when they need to.
>
> * implement SOCKS as just another connection mode flag.
>
>
> Amos
>
Received on Mon Jul 19 2010 - 01:16:43 MDT

This archive was generated by hypermail 2.2.0 : Mon Jul 19 2010 - 12:00:08 MDT