Re: Slow ICP under high loads (squid-1.2beta20 & 21)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 27 May 1998 09:41:38 +0300


On 28 May 98, at 11:17, Stewart Forster <slf@connect.com.au> wrote:

 Hi,
 
> > pending sessions. In fact, this is the only protection against DoS in
> > squid, I guess. This "poll incoming every few R/W" was introduced to
> > avoid TCP queues blowing because they are not emptied fast enough,
> > causing dropped tcp packets and delays.
>
> Your above explanation assumes that your external bandwidth is
> not capable of supporting what your cache can accept and pump through it.
> For us, our external bandwidth is capable, therefore we can drive
> exceptional amounts of traffic through our caches and NOT see TCP queues
> or FD limits blowing.

 Actually, I assumed that squid would not pay attention frequently enough
 to those TCP FD's that have lots of traffic in their input queues. That
 gets worse when you have lots of external bandwidth coming in, just as
 happened to the UDP queues you describe below. The difference is that
 TCP input queues cannot be made that large, so they lose data sooner,
 and TCP packet drops can be pretty expensive for faraway sites. If you
 give too much preference to ICP and the other incoming sockets, you waste
 too much time on them and these TCP drops _will_ happen. That is exactly
 what I am trying to avoid.
 
> I'm not after perfect response times. I'm happy with the ICP
> slowing down a tad. What I'm not happy with is when the default is such
> that the caches ARE practically idling due to circumstances, AND ICP times
> are blowing out. If it's a matter of polling a touch faster to get the
> caches to respond roughly to what they can handle, I'm going to do it.

 I don't think a cache can be idling and still have its ICP times blow out.
 The static divisor of 16 means that, in the worst case, incoming sockets
 are not repolled until 16 events have happened on ordinary FD's. Its whole
 purpose is to avoid needless polling of idle incoming sockets. If no other
 FD's are ready at all, the whole loop is redone with comm_*_incoming, and
 the main poll() returns immediately if anything is pending on the incoming
 sockets. So an idle cache services only the incoming sockets, as fast as
 it can, no matter what the divisor is.
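
 Roughly, the divisor logic inside comm_poll() looks like this (a
 paraphrased sketch, not the literal source; apart from fdIsHttpOrIcp()
 the macro and helper names are illustrative):

   #include <poll.h>

   #define INCOMING_DIVISOR 16

   extern int fdIsHttpOrIcp(int fd);        /* true for the ICP/HTTP sockets */
   extern void service_ordinary_fd(int fd); /* run the registered read/write handler */
   extern void comm_poll_incoming(void);    /* poll + service the incoming sockets */

   static int incoming_counter = 0;

   static void
   handle_ready_fds(struct pollfd *pfds, int nfds)
   {
       int i;
       for (i = 0; i < nfds; i++) {
           int fd = pfds[i].fd;
           if (!(pfds[i].revents & (POLLIN | POLLOUT)))
               continue;                    /* nothing pending on this FD */
           if (fdIsHttpOrIcp(fd))
               continue;                    /* incoming sockets are skipped here */
           service_ordinary_fd(fd);
           if ((++incoming_counter % INCOMING_DIVISOR) == 0)
               comm_poll_incoming();        /* revisit incoming after 16 events */
       }
   }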

 Think about it: at 250 ICP/sec an ICP packet arrives roughly every 4 ms.
 To be a blisteringly fast cache that keeps up with that, while also
 serving a few TCP sessions/sec, you'd have to get through up to 16 reads
 or writes between incoming polls within those 4 ms, i.e. one event every
 0.25 ms. Pretty tough.
   I agree that a divisor of 16 is not for everyone, especially since the
 ICP socket queues both remote requests and remote responses, so a spike
 in remote requests can blow the queue and cause us to miss positive
 responses.
   The problem, though, is that squid makes no distinction between ICP
 requests and ICP responses. I think it should. We want to receive the
 responses to our own queries as fast as possible, and at the same time
 return responses to remote requests as fast as the current load allows.
 The HTTP accept socket might need yet another preference.
   So I think we should always use a separate ICP socket for our outgoing
 queries, on a different UDP port from the socket that takes incoming
 requests. Poll the incoming-request socket NO more often than ordinary
 sockets, and poll the socket carrying responses to our queries more
 frequently. Then again, we won't see more ICP responses than we send
 queries, roughly one per TCP request, so this is about equal to the
 HTTP accept rate.
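
 In code the split could look something like this (a minimal sketch with
 plain BSD sockets rather than squid's comm_* wrappers; the names are
 illustrative):

   #include <string.h>
   #include <sys/socket.h>
   #include <netinet/in.h>

   static int icp_server_fd;  /* peers' requests arrive here (well-known port) */
   static int icp_client_fd;  /* our queries go out here, so replies come back here */

   static void
   icp_open_sockets(void)
   {
       struct sockaddr_in sa;

       memset(&sa, 0, sizeof(sa));
       sa.sin_family = AF_INET;
       sa.sin_addr.s_addr = htonl(INADDR_ANY);

       icp_server_fd = socket(AF_INET, SOCK_DGRAM, 0);
       sa.sin_port = htons(3130);         /* advertised ICP port */
       bind(icp_server_fd, (struct sockaddr *) &sa, sizeof(sa));

       icp_client_fd = socket(AF_INET, SOCK_DGRAM, 0);
       sa.sin_port = htons(0);            /* ephemeral port, sees only replies to us */
       bind(icp_client_fd, (struct sockaddr *) &sa, sizeof(sa));
   }

 The point being that icp_client_fd only ever carries responses to our own
 queries, so it can safely get the eager polling treatment, while
 icp_server_fd is polled like any ordinary socket.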

 NB! There is one strange piece of code left in comm_poll:
   if (fdIsHttpOrIcp(fd))
      continue;
 It makes squid ignore incoming sockets _even_ when they are ready,
 until incoming_counter wraps. That artificially slows down ICP
 servicing, while the original intent was to poll these sockets more
 frequently; incoming_counter was only introduced to avoid polling
 them far too often.
 Why not service them right away as they become ready?
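 Something like this, perhaps (just a sketch, not a patch; the PF handler
 type and the read_handler/read_data members are what I assume fd_table
 provides):

   if (fdIsHttpOrIcp(fd)) {
       PF *hdl = fd_table[fd].read_handler;
       if (hdl) {
           fd_table[fd].read_handler = NULL;  /* one-shot, as for ordinary FD's */
           hdl(fd, fd_table[fd].read_data);   /* service it now instead of skipping */
       }
       continue;
   }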

 Btw, it would be very nice if squid could implement some kind of
 priorities, something like the priority queues in routers, and in such a
 way that the configuration could force selected sessions to different
 priorities via ACL's. For us, it would be wonderful to pinch porn sites
 down to the lowest possible priority.

 One way to implement FD priorities could be to add a variable to the
 fd_table structure giving each FD a relative priority, then do several
 polls, including in each pass only those FD's whose priority is less than
 or equal to the priority of interest. With a maximum of 10 priorities, a
 full cycle takes 10 polls: the highest-priority FD's are polled every
 time, the lowest-priority FD's every 10th time. This isn't a perfect
 scheme, but I think we really should come up with one. Its drawback is
 that lower-priority sockets have to wait for one or more poll timeouts
 before getting service when there is no activity on the higher-priority
 FD's. Maybe it would be more appropriate to poll all FD's at once, then
 service only the prio 0 FD's, repoll, then service prio 0 and 1, repoll,
 and so on...
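
 As a rough sketch of the multi-pass variant (everything here is
 hypothetical, none of it is existing squid code):

   #include <poll.h>

   #define MAX_PRIO 10
   #define MAX_FDS  1024

   extern int biggest_fd;                /* highest FD currently in use */
   extern int fd_priority[MAX_FDS];      /* would live in fd_table: 0 = highest */
   extern int fd_wants_read[MAX_FDS];    /* FD has a read handler registered */
   extern void service_fd(int fd);       /* call the FD's registered handler */

   static void
   priority_poll_pass(int pass)
   {
       struct pollfd pfds[MAX_FDS];
       int nfds = 0;
       int fd, i;

       /* include only FD's at least as urgent as this pass */
       for (fd = 0; fd <= biggest_fd && fd < MAX_FDS; fd++) {
           if (!fd_wants_read[fd] || fd_priority[fd] > pass)
               continue;
           pfds[nfds].fd = fd;
           pfds[nfds].events = POLLIN;
           nfds++;
       }
       if (poll(pfds, nfds, 10) <= 0)    /* 10 ms timeout per pass */
           return;
       for (i = 0; i < nfds; i++)
           if (pfds[i].revents & POLLIN)
               service_fd(pfds[i].fd);
   }

   /* one full cycle is MAX_PRIO passes: priority-0 FD's are polled in
    * every pass, priority-9 FD's only in the last one */
   void
   priority_poll_cycle(void)
   {
       int pass;
       for (pass = 0; pass < MAX_PRIO; pass++)
           priority_poll_pass(pass);
   }

 As said above, when the high-priority FD's are idle a low-priority event
 may sit through several of those 10 ms pass timeouts before its own pass
 comes around; that timeout is the knob to tune.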

 Squid should keep some FD's at the highest priority, like disk io, dns
 and icmp; and even if remote ICP requests don't get through, we really
 want to keep receiving the ICP responses to our own queries.
 I personally would set priorities something like this (sketched as an
 enum just after the list):
  0) squid internal (dns, disk, redirect, icmp, snmp, logs, ..)
  1) remote ICP responses
  2) accept socket(s) and normal tcp fetch sockets
  3) normal open tcp client sockets
  4) remote ICP requests
  5) sessions specially marked as low priority by ACL's
  ..
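
 The same list as a hypothetical enum (the names are mine, nothing like
 this exists in the source yet):

   enum fd_priority {
       PRIO_INTERNAL  = 0,   /* dns, disk, redirect, icmp, snmp, logs, .. */
       PRIO_ICP_REPLY = 1,   /* remote ICP responses to our own queries */
       PRIO_ACCEPT    = 2,   /* accept socket(s) and normal tcp fetch sockets */
       PRIO_CLIENT    = 3,   /* normal open tcp client sockets */
       PRIO_ICP_QUERY = 4,   /* remote ICP requests */
       PRIO_ACL_LOW   = 5    /* sessions specially marked low priority by ACL's */
   };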
  
> > BTW, you didn't mention, what was the very slow ICP responses in real
>
> We were seeing up to 2.5 seconds. Solaris has the ability to set
> a fairly large UDP receive buffer. Ours was set to 256K. That allows
> about 600 ICP responses to queue up, or at 250 ICP requests/sec, about
> 2.5 seconds' worth, given squid not sucking up UDP fast enough. That's
> where the problem was: squid wasn't processing the incoming UDP queue
> quickly enough. ICP packets which didn't fit in the queue were simply
> dropped.
 

----------------------------------------------------------------------
 Andres Kroonmaa                      mail: andre@online.ee
 Network Manager
 Organization: MicroLink Online       Tel:  6308 909
 Tallinn, Sakala 19                   Pho: +372 6308 909
 Estonia, EE0001  http://www.online.ee    Fax: +372 6308 901
----------------------------------------------------------------------
