Re: ICAP connections under heavy loads

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 09 Sep 2012 20:30:03 +1200

On 9/09/2012 5:03 a.m., Alex Rousskov wrote:
> On 09/07/2012 09:51 PM, Amos Jeffries wrote:
>> On 8/09/2012 3:17 a.m., Alex Rousskov wrote:
>>> On 09/07/2012 08:33 AM, Alexander Komyagin wrote:
>>>>>> However, as I stated earlier, the comm.cc problem (actually semantics
>>>>>> problem) persists. I think it should be documented that second and
>>>>>> subsequent calls to comm_connect_addr() do not guarantee connection
>>>>>> establishment unless there was a correct select() notification.
>> Which to me means we cannot rely on it at all for the timeout callback
>> cases
> Moreover, there is no need to call comm_connect_addr() in timeout cases.
> The connection has timed out and should be closed. End of story. We do
> not need to try extra hard and check whether the connection was
> established at the OS level just when the timeout handler was scheduled;
> we will not be able to detect some such cases anyway and all such cases
> will be rare with any reasonable timeout.

We are agreed.

IIRC the race is made worse by our internal code going timout event
(async) scheduling a call (async), so doubling the async queue delay.
Which shows up worst on heavily loaded proxies.

The patch woudl be this:

=== modified file 'src/comm/ConnOpener.cc'
--- src/comm/ConnOpener.cc 2012-08-28 19:12:13 +0000
+++ src/comm/ConnOpener.cc 2012-09-09 08:20:09 +0000
@@ -139,6 +139,10 @@
          // it never reached fully open, so cleanup the FD handlers
          // Note that comm_close() sequence does not happen for
partially open FD
          Comm::SetSelect(temporaryFd_, COMM_SELECT_WRITE, NULL, NULL, 0);
+ if (calls_.earlyAbort_ != NULL) {
+ calls_.earlyAbort_->cancel("Comm::ConnOpener::doneConnecting");
+ calls_.earlyAbort_ = NULL;
+ }
          calls_.earlyAbort_ = NULL;
          if (calls_.timeout_ != NULL) {
calls_.timeout_->cancel("Comm::ConnOpener::doneConnecting");
@@ -309,14 +313,14 @@
      doneConnecting(COMM_ERR_CLOSING, io.xerrno); // NP: is closing or
shutdown better?
  }

-/**
+/** Timeout during connection attempt.
   * Handles the case(s) when a partially setup connection gets timed out.
- * NP: When commSetConnTimeout accepts generic CommCommonCbParams this
can die.
   */
  void
  Comm::ConnOpener::timeout(const CommTimeoutCbParams &)
  {
- connect();
+ debugs(5, 5, HERE << conn_ << ": * - ERR took too long to receive
response.");
+ doneConnecting(COMM_TIMEOUT, errno);
  }

  /* Legacy Wrapper for the retry event after COMM_INPROGRESS

>> The entire point of calling comm_connect_addr() from timeout callback is
>> to determine whether select() notification is in flight. If that fails
>> (as it does for your cases) it is useless doing, we may as well treat
>> them all as timeouts and abort quickly.
> Exactly. I hope we are all in agreement that the latter is the right
> thing to do. There is no point in complicating and slowing down the code
> to optimize handling of rare low-level race conditions. We can handle
> them by closing the connection.
>
>
>> I think from all this discussion the ConnOpener::DelayedConnectRetry
>> handler needs to be investigated as well.
> Agreed. Why is that handler reusing the same failed socket? Should not
> we close the failed socket descriptor upon detecting a failure and then
> open a new one just before trying again?

Hmm. Not sure. This is a straight copy of the older code.

If you want a new socket FD most of what start() does needs to be
re-done for it. Except that the start timer needs to be left unchanged,
and the async calls need to be switched to teh new FD (would that be a
reassignment, or a cancel + replace?)cancelled before replacing on the
new FD.

>>> comm_connect_addr() is called by ConnOpener in at least two cases:
>>>
>>> 1. To initiate the connection, after comm_openex in ConnOpener::start().
>> ConnOpener::connect() - called synchronously from start().
>>
>>> 2. After select(2) or equivalent said that the socket is ready for
>>> writing.
>> This is connect() scheduled from ConnOpener::InProgressConnectRetry().
> Agreed.
>
>> ** we could cancel the timeout and earlyAbort calls in the wrapper when
>> scheduling the connect() call to happen later. This will reduce the
>> async lag when those might kick in.
> I agree that we could do that, but it will only complicate the code as
> we will have to make sure that we do not delay reconnecting beyond the
> timeout (which may be in 0.000 seconds), and that we re-establish the
> timeout handler once we restart connecting again. I really do not think
> that the increased complexity is worth occasionally eliminating one
> timeout notification in an already rare failing case.
>
> Here is the sketch for connect retries on failures, as I see it:
>
> default:
> ++failRetries_; // TODO: rename to failedTries
>
> // we failed and have to start from scratch
> ... close the temporary socket ...
>
> // if we exhausted all retries, bail with an error
> if (failRetries_ > Config.connect_retries) {
> ... send COMM_ERR_CONNECT to the caller ...
> return;
> }
>
> // we can still retry after a small pause
> // we might timeout before that pause ends, but that is OK
> eventAdd("Comm::ConnOpener::DelayedConnectRetry", ...)
>
> More changes would be needed to make this work correctly:
>
> * the timeout handler should not be associated with the socket
> descriptor (since it may change). We can use eventAdd instead.
>
> * the Comm::ConnOpener::DelayedConnectRetry() wrapper should call a
> method that will open a new socket.

Sounds okay to me.

>
>> 3. after system-connect(), before timeout or select() notification is
>> received
>>
>> This is ConnOpener::DelayedRetryConnectRetry()
>>
>> * select maybe ready or not, timeout may also be pending or not. Thus
>> the if() timeout checks inside connect() swicth cases to abort on timeout.
> Yes, but we should just let the timeout handler do its thing instead of
> trying to predict what it will do and when it will do it, IMO. The
> current retry code may already schedule an event notification beyond the
> timeout horizon so while it tries to be super accurate with timeouts, it
> still misses a case. KISS instead.
>
>
> HTH,
>
> Alex.
>
>
>
>> 4. on timeout notification before select() notification is received
>> (maybe ready, maybe not)
>>
>> This is ConnOpener::timeout().
>>
>> * only the OK switch case from connect() is useful, all other events
>> COMM_TIMOUT callback notice needs to be sent.
>>
>>>
>>> Let's focus on the second context. We know the socket is writeable. This
>>> can mean that either (2a) the connection was established (or at least
>>> ready for writing) OR (2b) that there was a problem establishing the
>>> connection.
>>>
>>> Can a getsockopt() call in case 2b guarantee problem detection? If my
>>> reading of Stevens explanation is correct, it cannot (because Stevens
>>> suggests three ways to get that guarantee instead of calling getsockopt,
>>> implying that getsockopt is not reliable in this context).
>>>
>>> If I am right, the new documentation should reflect this uncertainty.
>>> However, as discussed before, the higher level code will still work even
>>> if we do not notice the error right away (we will get an error when we
>>> try to write or read).
>>>
>>>
>>> HTH,
>>>
>>> Alex.
>>>

Amos
Received on Sun Sep 09 2012 - 08:30:17 MDT

This archive was generated by hypermail 2.2.0 : Mon Sep 10 2012 - 12:00:03 MDT