Re: ICAP connections under heavy loads

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 28 Aug 2012 09:26:36 -0600

On 08/28/2012 08:56 AM, Alexander Komyagin wrote:

> It seems that I've found the problem related to the case when the listen
> queue of the ICAP server is full and it can no longer accept new
> connections.

Hi Alexander,

    It may help if you summarize the problem before (or after) giving
the details. It is clear that the ICAP server is overloaded, but I
assume you are referring to some Squid-specific problem instead, and it
is not clear which of the Squid events you describe below are a problem
you are trying to address.

> This way Squid has a bunch (like 800 or so) of connections in SYN_SENT
> state, and after their timeout we encounter exceptions in noteCommRead
> (errno 110 and 104 for some) and noteCommWrote (errno 32) in the Xaction
> object. And then Squid temporarily turns off the icap av.

I assume the above is _not_ the problem you are trying to describe but
the correct behavior (i.e., the overloaded ICAP server is marked as
"bad" and is avoided for a while, as expected). Please correct me if I
am wrong.
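
For reference, on Linux the errno values quoted above (110, 104, 32)
correspond to ETIMEDOUT, ECONNRESET, and EPIPE. The numbers differ on
other platforms, so a small helper can decode them on the host where the
log was produced. This is only an illustrative sketch; describe_errno is
a hypothetical name, not something from Squid.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (message)' for a numeric errno on the current platform."""
    # errno.errorcode maps numbers to symbolic names for this platform only.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The three values reported in the quoted message.
    for code in (110, 104, 32):
        print(code, describe_errno(code))
```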

> I noticed two VERY strange things:
>
> 1) in noteCommRead connection->isOpen() returns true, and Xaction thinks
> that the connection is established, but prior to the exception that
> connection was marked as SYN_SENT in `netstat -lant` output.

Connection::isOpen() means that the connection has a valid/opened socket
descriptor and nothing more. If you prefer, it means that the connection
is "not closed". It is more of a socket-level check rather than a
TCP-level check.
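
The same distinction can be shown with plain sockets outside of Squid.
The sketch below is not Squid code: is_open and is_connected are
hypothetical helpers, and a never-connected socket stands in for the
SYN_SENT case, which is hard to reproduce deterministically. The point
is only that "has a valid descriptor" and "has an established TCP peer"
are separate questions.

```python
import socket


def is_open(sock):
    """Socket-level check: does the object still hold a valid descriptor?"""
    return sock.fileno() >= 0


def is_connected(sock):
    """TCP-level check: does the socket have an established peer?"""
    try:
        sock.getpeername()
        return True
    except OSError:
        # Raised when the socket is not connected or is already closed.
        return False


def demo():
    """Return (is_open, is_connected) before connect, after connect, after close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    states = [(is_open(client), is_connected(client))]
    client.connect(listener.getsockname())
    states.append((is_open(client), is_connected(client)))
    client.close()
    states.append((is_open(client), is_connected(client)))
    listener.close()
    return states


if __name__ == "__main__":
    print(demo())
```

The first pair is the interesting one: the descriptor is valid, so an
isOpen()-style check passes, yet there is no established connection.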

> 2) client (httperf) has request timeout set to 5 sec, so all Xaction
> objects should be destroyed by the time I got those exceptions (through
> noteInitiatorAborted() ), right?

Yes (if your ICAP transaction timeout is much longer than 5 seconds),
provided the HTTP client closes the connection in a way that Squid can
notice immediately.
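
As a general socket-level illustration (again, not Squid code), an
orderly close by the peer shows up as a zero-length read, which is the
kind of event a server can notice immediately. A client that simply
disappears without closing produces no such event. The helper name
peer_closed is hypothetical, and a local socketpair stands in for a real
client connection.

```python
import socket


def peer_closed(sock):
    """Return True if the peer closed its end and no unread data remains.

    Side effect: leaves the socket in non-blocking mode.
    """
    sock.setblocking(False)
    try:
        # MSG_PEEK looks at pending data without consuming it.
        data = sock.recv(1, socket.MSG_PEEK)
    except BlockingIOError:
        # Nothing to read: as far as we can tell, the peer is still there.
        return False
    return data == b""


def demo():
    """Check one end of a socket pair before and after the other end closes."""
    a, b = socket.socketpair()
    before = peer_closed(b)
    a.close()
    after = peer_closed(b)
    b.close()
    return before, after


if __name__ == "__main__":
    print(demo())
```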

> Then why do I still get noteComm* callbacks and it looks like
> noteInitiatorAborted() was never called for corresponding "nasty"
> Xaction objects?
>
> It is worth mentioning that most Xaction objects are "good" and I can
> see noteInitiatorAborted() being successfully called for them. But those
> "nasty" objects are really breaking the stuff.

I do not know what you mean by "nasty objects" or "breaking stuff" but
it is possible that there are Squid bugs in detecting client termination
conditions, propagating client termination notifications to ICAP, and/or
processing received notifications, of course.

To triage, trace a single HTTP client transaction and a single
"nasty"(?) ICAP transaction using debug_options ALL,9 to see why
noteInitiatorAborted() is not scheduled or is not acted upon. Does the
HTTP code detect that the client has closed the connection? If yes, does
it send an abort notification to ICAP? If yes, does the ICAP code
correctly react to that notification?

Post that trace here if you need help analyzing it (but, as usual, there
is no guarantee somebody will help, especially if the trace requires a
lot of work to isolate the interesting bits).
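
One way to cut an ALL,9 trace down before posting is to keep only the
lines that mention the callbacks discussed in this thread. The sketch
below makes no assumption about the log line format; it only does
substring matching and assumes the method names appear in the trace
text. filter_trace and NEEDLES are hypothetical names, and the log path
is whatever your installation uses.

```python
import sys

# Method names taken from this thread; adjust as needed.
NEEDLES = ("noteInitiatorAborted", "noteCommRead", "noteCommWrote")


def filter_trace(lines, needles=NEEDLES):
    """Yield only the lines that contain at least one of the substrings."""
    for line in lines:
        if any(needle in line for needle in needles):
            yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python filter_trace.py /path/to/cache.log
    with open(sys.argv[1], errors="replace") as log:
        sys.stdout.writelines(filter_trace(log))
```

Substring matching will drop surrounding context, so it is a starting
point for finding the relevant transaction rather than a replacement
for reading the neighboring lines.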

Thank you,

Alex.
Received on Tue Aug 28 2012 - 15:26:50 MDT
