Re: Comm::TcpAcceptor::doAccept fd limit handling is broken

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 16 Jul 2014 13:40:13 -0600

On 07/16/2014 11:11 AM, Rainer Weikusat wrote:

> void
> Comm::TcpAcceptor::doAccept(int fd, void *data)
> {
> try {
> debugs(5, 2, HERE << "New connection on FD " << fd);
>
> Must(isOpen(fd));
> TcpAcceptor *afd = static_cast<TcpAcceptor*>(data);
>
> if (!okToAccept()) {
> AcceptLimiter::Instance().defer(afd);
> } else {
> afd->acceptNext();
> }
> SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);

> This is broken because it re-enables readable notifications even if no
> connection was accepted.

In other words, it re-enables a check for new connections when

* we know that we currently cannot accept any new connections,
* Squid is under stress already, and
* our code does not handle multiple deferences of the same TcpAcceptor
object well.

A triple whammy of a bug! And this looks like a very old bug, going back
to at least 2009 (I have not checked further). Fortunately, the bug
primarily affects misconfigured or DoSed Squids.

> Judging from what I've seen
> from the code, the deferred queue could probably be ripped
> completely.

Well, we need to restart listening when the number of descriptors goes
down again. And since there may be many different waiting listeners some
kind of queue seems to be needed to store them, right? We are using the
wrong structure for that storage, but that is a separate issue.

> More conservative change which fixes this for my test case
>
>
> --- mad-squid/src/comm/TcpAcceptor.cc 8 Jan 2013 17:36:44 -0000 1.1.1.2
> +++ mad-squid/src/comm/TcpAcceptor.cc 16 Jul 2014 16:51:25 -0000
> @@ -204,7 +204,6 @@
> } else {
> afd->acceptNext();
> }
> - SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 0);
>
> } catch (const std::exception &e) {
> fatalf("FATAL: error while accepting new client connection: %s\n", e.what());
> @@ -257,7 +256,8 @@
> notify(flag, newConnDetails);
> mustStop("Listener socket closed");
> return;
> - }
> + } else if (!isLimited)
> + SetSelect(conn->fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, this, 0);
>
> debugs(5, 5, HERE << "Listener: " << conn <<
> " accepted new connection " << newConnDetails <<

Looks good to me. AFAICT, the "else" keyword is not needed, and we
should move the SetSelect() call _after_ logging the already accepted
connection. Let's wait for other opinions though.

> (code modified to accept only one connection at the same time):

AFAICT, both official and patched code accept only one connection at a
time -- our Select code does not support multiple listeners on a single
FD. Your change avoids queuing "billions" of pointers to the same
"disabled" listener in the deferred queue [that we do not handle well].

> NB^2: This is an attempt to be polite and helpful, not an invitation to
> use me for target practice.

Sorry to hear that -- polite and helpful are the best targets :-)!

Thanks a lot,

Alex.
Received on Wed Jul 16 2014 - 19:40:37 MDT

This archive was generated by hypermail 2.2.0 : Thu Jul 17 2014 - 12:00:12 MDT