Re: ICAP connections under heavy loads

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 29 Aug 2012 10:27:08 -0600

On 08/29/2012 04:45 AM, Alexander Komyagin wrote:

> I have Squid 3.2.1 (transparent) + c-icap (e.g. clamav). Currently I'm
> testing the performance of this setup with the httperf tool.
>
> When the number of client HTTP requests is so big that c-icap becomes
> overloaded, some connections from Squid to the ICAP service stay in
> the SYN_SENT TCP state because the c-icap listen backlog is full.

OK, this sounds normal.
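
If you want to reproduce that symptom outside of Squid, here is a
minimal standalone sketch (my own illustration, not Squid or c-icap
code; the file name is hypothetical and error checking is omitted) of
how a full listen backlog leaves connecting clients stuck in SYN_SENT
on Linux:

  // backlog_demo.cc: fill a tiny listen backlog and never accept();
  // further connect() attempts then hang in SYN_SENT because the
  // kernel silently drops their SYNs and the client keeps retrying.
  #include <arpa/inet.h>
  #include <fcntl.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main() {
      int srv = socket(AF_INET, SOCK_STREAM, 0);
      sockaddr_in addr = {};
      addr.sin_family = AF_INET;
      addr.sin_port = htons(1344);                  // ICAP's standard port
      addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
      bind(srv, (sockaddr *)&addr, sizeof(addr));
      listen(srv, 1);                               // tiny backlog; no accept()

      for (int i = 0; i < 10; ++i) {                // more clients than backlog
          int c = socket(AF_INET, SOCK_STREAM, 0);
          fcntl(c, F_SETFL, O_NONBLOCK);
          connect(c, (sockaddr *)&addr, sizeof(addr)); // -1 with EINPROGRESS
      }
      pause();  // now run `netstat -tn | grep SYN_SENT` in another shell
  }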

> The corresponding Xaction
> objects are not destroyed after the client request times out (I use a
> 5 second timeout for httperf requests)

I am not intimately familiar with httperf (we use Web Polygraph), but I
assume that httperf immediately closes any timed-out connection and that
Squid's client-side code promptly notices the timeout. Please correct me
if my assumptions are wrong.

If an ICAP Xaction stays alive long after the corresponding HTTP client
transaction is gone, then this could be a bug or deficiency. However,
please note that Squid tries to relay all available information to the
ICAP service (in case the ICAP service is logging it or needs it for
some other important reason -- not all services just check for viruses).
If an ICAP transaction still has things to do, it may outlive the
corresponding HTTP transaction.

We could make this "try as hard as you can to relay information to ICAP"
behavior conditional on whether the service is "essential" or
"optional", but a separate, dedicated setting may be warranted: I can
imagine an optional logging or leak-detection ICAP service that does not
want to kill transactions when it is malfunctioning but still wants to
receive all HTTP messages, even when their corresponding HTTP clients
are gone.
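
For reference, the essential/optional distinction itself is already
expressed per service with the bypass option of icap_service in
squid.conf; the dedicated setting above would be orthogonal to it. A
configuration sketch (service name taken from your setup, the URL is a
placeholder):

  # bypass=off: essential; service failures abort the HTTP transaction
  # bypass=on:  optional; failures are ignored and the message is
  #             processed as if the service were not configured
  icap_service avi respmod_precache bypass=off icap://127.0.0.1:1344/avscan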

> causing a lot of noteCommRead and noteCommWrote
> exceptions at the same time (2-4 minutes after the test), when r/w
> operations on the socket time out.

That sounds normal to me (because your ICAP service is not responsive in
this case).

> As a consequence, Squid leaks FDs (for a while)

By "leaks FD's (for a while)" do you mean that Squid uses more FDs than
it would if the ICAP service was working? Or that there is actually an
FD loss? The former is expected. The latter would be a bug.

> and meaninglessly switches icap status in minutes after the test.

Why do you describe the "down" status of an overloaded ICAP service as
"meaningless"? The status sounds appropriate to me! Again, I do not know
much about httperf internals, but in the real world (or in a
corresponding Polygraph test), the ICAP service may always be
overwhelmed when there is too much traffic, so a "down" state would be
warranted in many such cases (Squid, of course, cannot predict whether
the service overload is temporary).
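
If it helps, the up/down switching is governed by existing squid.conf
directives (the values shown below are the defaults, if I remember them
correctly):

  icap_service_failure_limit 10   # failures tolerated before marking "down"
  icap_service_revival_delay 180  # seconds before a "down" service is retried

A 180-second revival delay may also be why you see the status flipping
for minutes after the test ends.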

> I found out that all the Xaction objects that are not destroyed were
> initiated by HttpStateData (the 'avi' adaptation check). It looks like
> Squid is missing HttpStateData cleanup when the client request times
> out.

HttpStateData launches RESPMOD transactions. If the ICAP service wants
to see response data even after the client is gone, it may prefer that
Squid keep sending it. Again, I agree that this "persistence" is not
always desirable, but there are important cases where it is. This is
similar to the quick_abort settings for HTTP, but applies to ICAP, which
does not have a corresponding setting (yet?).
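
For comparison, these are the HTTP-side knobs I mean (squid.conf
defaults, quoted from memory):

  # how much more of a cachable response Squid keeps fetching after
  # its last client disconnects; ICAP has no equivalent knob today
  quick_abort_min 16 KB
  quick_abort_max 16 KB
  quick_abort_pct 95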

> In Xaction, new connections are created with a ConnOpener job.
> ConnOpener sets the connection FD iff it _really_ thinks the
> connection is now established (comm_connect_addr returned COMM_OK).
> But according to `netstat`, that connection was always in the SYN_SENT
> state. Maybe I just missed the point where it became ESTABLISHED. I
> will try to check it later.

Since Squid does not work at the TCP packet level, it may think that the
connection is "open" when, in fact, it is not fully established. In the
Squid context, isOpen() means that we can do things like schedule I/O
for that connection or extract the peer address. It does not mean
ESTABLISHED in the TCP sense.
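
The usual non-blocking connect() sequence illustrates the difference.
This is a generic sockets sketch, not Squid's actual comm code; the
port is a placeholder and error checking is omitted:

  #include <arpa/inet.h>
  #include <cerrno>
  #include <fcntl.h>
  #include <netinet/in.h>
  #include <poll.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main() {
      sockaddr_in to = {};
      to.sin_family = AF_INET;
      to.sin_port = htons(1344);                   // placeholder ICAP port
      to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

      int fd = socket(AF_INET, SOCK_STREAM, 0);
      fcntl(fd, F_SETFL, O_NONBLOCK);
      if (connect(fd, (sockaddr *)&to, sizeof(to)) < 0 && errno == EINPROGRESS) {
          // The kernel is still (re)sending SYNs; netstat shows SYN_SENT.
          // Yet we already hold a usable FD: it can be registered with
          // select/poll and I/O can be scheduled on it right away.
          pollfd pfd = {fd, POLLOUT, 0};
          poll(&pfd, 1, 5000);                     // wait up to 5s for the handshake
      }
      // Only now is the real outcome known:
      int err = 0;
      socklen_t len = sizeof(err);
      getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
      return err;                                  // 0 iff the handshake completed
  }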

HTH,

Alex.