Re: Slow ICP under high loads (squid-1.2beta20 & 21)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Mon, 25 May 1998 18:39:34 +0300

On 27 May 98, at 11:26, Stewart Forster <slf@connect.com.au> wrote:

> For some time now I've been encountering very slow ICP responses from caches
> under high load (sustained > 30 TCP & > 100 ICP requests/sec). I've managed
> to trace this down to some code that was added a little while ago in the
> effort to reduce CPU usage.
 
> ** if ((incoming_counter++ & 15) == 0)
> ** comm_poll_incoming();

> ie. the above code calls comm_poll_incoming every 16 read or write
> requests. I've tried lower values, but the only one that seems to work
> okay is if I remove the "if ((incoming_counter++ & 15) == 0)" test and just
> call comm_poll_incoming for EVERY read or write event. This then results in
> timely ICP transactions and packets not being dropped. I've tested this in
> a production environment to a sustained 65 TCP hits/sec and 200 ICP hits/sec
> with no ICP slowdown evident yet.
>
> I'm not sure what the best fix is, but the "every 16" approach is only
> fine for low-medium loaded caches.

 This is not all that easy. For every open TCP session it takes many
 reads/writes to fulfill even the smallest request. For the NOVM type
 of squid there are at least 4 open files for each open client session.
 If we do not service at least 4 reads/writes for each new accept, we
 tend to run out of files and slow down service times for already
 pending sessions. In fact, this is the only protection against DoS in
 squid, I guess. This "poll incoming every few R/W" was introduced to
 keep the TCP queues from overflowing because they are not emptied
 fast enough, which would cause dropped TCP packets and delays.

 In general, this is a matter of preference. For some, servicing ready
 FD's of open sessions is a much higher priority than answering ICP for
 peers; for others it might be the other way around. But if ICP response
 times get high under heavy load, that is a good indicator that the
 remote peer is (over)loaded, and the peer selection algorithm may pick
 another, less loaded peer. By polling the ICP socket "on every corner"
 you get perfect ICP response times, but in a way you "lie" to the
 remote peer that you are almost idling, while when it later comes to
 fetch an object over TCP it may run into the awful service times of an
 overloaded squid.

 I recall that the initial patch for this took into account the number
 of ready incoming sockets last seen: if it was > 0 it set the repoll
 frequency to every 2 ordinary FD's, and if it was 0, to every 8. Not
 perfect, but more flexible.
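
 Just to illustrate the idea, something along these lines (names are
 illustrative only; pollIncomingSockets() here stands in for
 comm_poll_incoming() and is assumed to return how many incoming
 sockets were ready, which the real function does not promise):

     /* Sketch only, not the original patch. */
     extern int pollIncomingSockets(void);  /* stand-in, assumed to return
                                             * the count of ready sockets */

     static int incoming_counter = 0;
     static int incoming_interval = 8; /* poll incoming every N ordinary FD's */

     static void
     checkIncoming(void)
     {
         if ((++incoming_counter % incoming_interval) != 0)
             return;
         /* if incoming sockets were busy, recheck every 2 FD's,
          * otherwise back off to every 8 */
         incoming_interval = (pollIncomingSockets() > 0) ? 2 : 8;
     }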

 Perhaps there should be some self-tuning variable that changes with
 the load pattern and can be biased by a configurable preference, but
 I personally really don't like the idea of giving the incoming
 sockets almost infinite preference over those FD's not yet serviced.
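
 If someone wanted to try that, a self-tuning interval could be as
 simple as the sketch below. incoming_bias is a made-up configuration
 knob, not an existing squid.conf option; it sets the smallest interval
 we ever allow, so a low value favours incoming sockets and a high
 value favours already-open sessions:

     #define INCOMING_INTERVAL_MAX 16

     static int incoming_interval = 8;

     static void
     tuneIncomingInterval(int n_ready, int incoming_bias)
     {
         if (n_ready > 0)
             incoming_interval -= n_ready; /* incoming was busy: poll sooner */
         else
             incoming_interval++;          /* idle: back off */

         if (incoming_interval < incoming_bias)
             incoming_interval = incoming_bias;
         if (incoming_interval > INCOMING_INTERVAL_MAX)
             incoming_interval = INCOMING_INTERVAL_MAX;
     }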

 BTW, you didn't mention what "very slow" ICP responses means in real
 time. If it is more than 1 sec, then rough math shows that it takes
 1000ms/16 = 62.5ms to service a single read/write call on average,
 which IMHO points to much more serious problems somewhere else.

----------------------------------------------------------------------
 Andres Kroonmaa                          mail: andre@online.ee
 Network Manager
 Organization: MicroLink Online           Tel:  6308 909
 Tallinn, Sakala 19                        Pho:  +372 6308 909
 Estonia, EE0001   http://www.online.ee   Fax:  +372 6308 901
----------------------------------------------------------------------
