Re: Slow ICP under high loads (squid-1.2beta20 & 21)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Mon, 25 May 1998 18:39:34 +0300

On 27 May 98, at 11:26, Stewart Forster <slf@connect.com.au> wrote:

> For some time now I've been encountering very slow ICP responses from caches
> under high load (sustained > 30 TCP & > 100 ICP requests/sec). I've managed
> to trace this down to some code that was added a little while ago in the
> effort to reduce CPU usage.
 
> ** if ((incoming_counter++ & 15) == 0)
> ** comm_poll_incoming();

> ie. the above code calls comm_poll_incoming every 16 read or write
> requests. I've tried lower values, but the only one that seems to work
> okay is if I remove the "if ((incoming_counter++ & 15) == 0)" test and just
> call comm_poll_incoming for EVERY read or write event. This then results in
> timely ICP transactions and packets not being dropped. I've tested this in
> a production environment to a sustained 65 TCP hits/sec and 200 ICP hits/sec
> with no ICP slowdown evident yet.
>
> I'm not sure what the best fix is, but the "every 16" approach is only
> fine for low-medium loaded caches.

 This is not all that easy. For every open TCP session it takes many
 reads/writes to fulfill even the smallest request. For the NOVM type
 of squid there are at least 4 open files for each open client session.
 If we do not service at least 4 reads/writes for each new accept, we
 tend to run out of files and slow down service times for already
 pending sessions. In fact, this is the only protection against DoS in
 squid, I guess. This "poll incoming every few R/W" was introduced to
 keep the TCP queues from overflowing because they are not emptied
 fast enough, which would cause dropped TCP packets and delays.

 In general, this is a matter of preference. For some, servicing ready
 FD's of open sessions is a much higher priority than answering ICP for
 peers; for others it might be the other way around. But if ICP response
 times get high under heavy load, that is a good indicator that the
 remote peer is (over)loaded, and the peer selection algorithm may pick
 another, less loaded peer. By polling the ICP socket "on every corner"
 you get perfect ICP response times, but in a way you "lie" to the
 remote peer that you are almost idling, while when it later comes to
 fetch an object over TCP it may run into the awful service times of an
 overloaded squid.

 I recall that the initial patch for this took into account the number
 of ready incoming sockets last seen: if it was > 0 it set the repoll
 frequency to every 2 ordinary FD's, and if it was 0, to every 8. Not
 perfect, but more flexible.
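
 Just to illustrate the idea, something along these lines (names are
 illustrative only; pollIncomingSockets() here stands in for
 comm_poll_incoming() and is assumed to return how many incoming
 sockets were ready, which the real function does not promise):

     /* Sketch only, not the original patch. */
     extern int pollIncomingSockets(void);  /* stand-in, assumed to return
                                             * the count of ready sockets */

     static int incoming_counter = 0;
     static int incoming_interval = 8; /* poll incoming every N ordinary FD's */

     static void
     checkIncoming(void)
     {
         if ((++incoming_counter % incoming_interval) != 0)
             return;
         /* if incoming sockets were busy, recheck every 2 FD's,
          * otherwise back off to every 8 */
         incoming_interval = (pollIncomingSockets() > 0) ? 2 : 8;
     }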

 Perhaps there should be some self-tuning variable that changes with
 the load pattern and can be biased by a configurable preference, but
 I personally really don't like the idea of giving the incoming
 sockets almost infinite preference over those FD's not yet serviced.
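
 If someone wanted to try that, a self-tuning interval could be as
 simple as the sketch below. incoming_bias is a made-up configuration
 knob, not an existing squid.conf option; it sets the smallest interval
 we ever allow, so a low value favours incoming sockets and a high
 value favours already-open sessions:

     #define INCOMING_INTERVAL_MAX 16

     static int incoming_interval = 8;

     static void
     tuneIncomingInterval(int n_ready, int incoming_bias)
     {
         if (n_ready > 0)
             incoming_interval -= n_ready; /* incoming was busy: poll sooner */
         else
             incoming_interval++;          /* idle: back off */

         if (incoming_interval < incoming_bias)
             incoming_interval = incoming_bias;
         if (incoming_interval > INCOMING_INTERVAL_MAX)
             incoming_interval = INCOMING_INTERVAL_MAX;
     }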

 BTW, you didn't mention what "very slow" ICP responses means in real
 time. If it is more than 1 sec, then rough math shows that it takes
 1000ms/16 = 62.5ms to service a single read/write call on average,
 which IMHO points to much more serious problems somewhere else.

----------------------------------------------------------------------
 Andres Kroonmaa                          mail: andre@online.ee
 Network Manager
 Organization: MicroLink Online           Tel:  6308 909
 Tallinn, Sakala 19                        Pho:  +372 6308 909
 Estonia, EE0001   http://www.online.ee   Fax:  +372 6308 901
----------------------------------------------------------------------
