Re: [AUDIT PATCH] comm cleanup - pconn changes

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 04 Oct 2010 14:31:25 -0600

On 10/02/2010 10:55 AM, Amos Jeffries wrote:
> Implements the pconn and IdleConnList as storage lists of
> Comm::Connection.
>
> With Comm::Connection the key definition has to change. The old keying
> based solely on domain would produce a Connection object whose FD did
> not exactly match the remote IP:port selected by the new routing logics.
>
> I've kept the concept that the pconn key is the details to describe the
> remote end of the link used to fetch any given domain. As such its
> become much simpler now. Just the remote IP:port from the TCP details
> and requested domain (post re-writing) or peer hostname for peer pconn.
>
> The change also removes the client address from relevance (enhancement
> bug fix). Instead we scan the list of pconn going to our desired
> remote-end for the local-end details (tcp_outgoing_addr or tproxy IPs)
> to pick an exact pconn match (bug fix).

.

> - list->removeFD(fd); /* might delete list */
> - comm_close(fd);
> + /* might delete list */
> + if (list && list->remove(conn)) {
> + Comm::ConnectionPointer nonConst = conn;
> + nonConst->close();
> + }

The list member cannot be NULL until the remove is called. The nonConst
conversion is also weird but I will discuss that later. Should probably
be rewritten as:

     Comm::ConnectionPointer deletedConn =
         list->remove(conn); // might delete list
     if (deletedConn != NULL)
         deletedConn->close();

However, this is still not equivalent to the old code because the old
code was closing the connection regardless of whether it was found in
the list. Is that an intentional change? Perhaps add a comment about it?

> - list->removeFD(fd); /* might delete list */
> - comm_close(fd);
> + Comm::ConnectionPointer temp = new Comm::Connection; // XXX: transition. make timeouts pass conn in
> + temp->fd = fd;
> + if (list->remove(temp)) {
> + temp->close();
> + } else
> + temp->fd = -1; // XXX: transition. prevent temp erasure double-closing FD until timeout CB passess conn in.

Lost that critical, IMO, "might delete list" comment.

Also, the creation of a temporary connection seems unnecessary when we
have a findIndex method. Use findIndex or, better, add a findConn method
if you want to close the found connection.

The same concerns about not-closing of the not-found connection apply
here as well.

> + for (int index = 0; index < nfds; index++)
> + theList[index] = oldList[index];

This makes IdleConnList::push slower that the old code that used
xmemcpy(). The shifting overhead is one of the reasons we should not use
[C] arrays here, IMO.

> -IdleConnList::read(int fd, char *buf, size_t len, comm_err_t flag, int xerrno, void *data)
> +IdleConnList::read(const Comm::ConnectionPointer &conn, char *buf, size_t len, comm_err_t flag, int xerrno, void *data)

Please capitalize static Read() if you are changing its profile anyway.

> + int used = strlen(buf);

Please use const.

> + Comm::ConnectionPointer nonConst = conn;
> + Comm::ConnectionPointer nonConst = conn;
> + Comm::ConnectionPointer nonConst = conn;
...

This needs to be fixed somehow as it is only going to get worse. There
are two questions here:

1) Does "const Pointer" always mean that we cannot change the class the
Pointer points to? No, it does not. It only means that we can never
change the Pointer itself. Our RefCount is rather confused about
const-correctness, but we can (and should) fix that.

2) Does Comm have a right to change Connection or similar classes
submitted to it for I/O management? Here, the decision is up to us, but
I think it is pretty clear that a lot of our Comm code already fiddles
with FD state (via fd_table and such) and, hence, with the Connection state.

If you agree, then calling conn->close() for "const Connection::Pointer
&conn" passed to a Comm I/O handler is just fine. If you need help
fixing RefCount to support that, please let me know. Otherwise, just
follow auto_ptr or any semi-standard smart pointer layout.

> + /**
> + * Search the list for an existing connection. Matches by FD.
> + *
> + * \retval false The connection does not currently exist in the list.
> + * We seem to have hit and lost a race condition.
> + * Nevermind, but MUST NOT do anything with the raw FD.
> + */
> + bool remove(const Comm::ConnectionPointer &conn);

You forgot to mention that remove() removes :-).

As discussed above, the "MUST NOT do anything" caveat seems to
contradict the current code which does something. Can you describe a
race condition that would make the Comm callback state essentially
invalid? Perhaps we can avoid spreading this FUD throughout the handlers
code...

> + /**
> + * Updates destLink to point at an existing open connection if available and retriable.
> + * Otherwise, return false.
> + *
> + * We close available persistent connection if the caller transaction is not
> + * retriable to avoid having a growing number of open connections when many
> + * transactions create persistent connections but are not retriable.
> + */
> + bool pop(Comm::ConnectionPointer &serverConn, const char *domain, bool retriable);

Why not always return the [possibly nil] popped connection instead?
Seems like a simpler and safer interface then separating the serverConn
value from the pop() status value.

> - int *fds;
> + Comm::ConnectionPointer *theList;

Please use

   * std::vector if you want expensive extractions occasional expensive
insertions due to memory [re]allocations; or
   * std::list if you want small memory allocations/deletions on every
insertion/extraction.

I would probably use std::list and later either make an intrusive
Connection list to avoid all allocations or try to MemPool list item
wrappers (not sure whether such MemPooling is possible).

This container member should also be documented, including the order of
the elements.

> + * A pool of persistent connections for a particular service type.
> + * HTTP servers being one such pool type, ICAP services another etc.

Just like HTTP, ICAP connections are to servers, not services. They are
identified by the IP:port address and not service URIs. Technically, we
can maintain one pool for both HTTP and ICAP servers/services and
everything would work. In some (most?!) environments, it may actually be
the right thing to do because it will keep the total number of idle
pconns constant.

Besides, PconnPool does not exactly pool individual connections but
rather IdleConnLists.

Consider:
* Manages idle persistent connections to a caller-defined set of
* servers (e.g., all HTTP servers). Uses a collection of IdleConnLists
* internally. Controls lists existence and limits the total number of
* idle connections across the collection.

It would also be nice to document that IdleConnLists assumes that their
PconnPool parent lives forever, but that is outside your patch scope.

Thank you,

Alex.
Received on Mon Oct 04 2010 - 20:31:38 MDT

This archive was generated by hypermail 2.2.0 : Tue Oct 05 2010 - 12:00:04 MDT