Re: Slow ICP under high loads (squid-1.2beta20 & 21)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 27 May 1998 09:41:38 +0300


On 28 May 98, at 11:17, Stewart Forster <slf@connect.com.au> wrote:

 Hi,
 
> > pending sessions. In fact, this is the only protection against DoS in
> > squid, I guess. This "poll incoming every few R/W" was introduced to
> > avoid TCP queues blowing because they are not emptied fast enough,
> > causing dropped tcp packets and delays.
>
> Your above explanation assumes that your external bandwidth is
> not capable of supporting what your cache can accept and pump through it.
> For us, our external bandwidth is capable, therefore we can drive
> exceptional amounts of traffic through our caches and NOT see TCP queues
> or FD limits blowing.

 Actually, I assumed that squid would not pay attention frequently enough
 to those TCP FD's that have lots of traffic in their input queues. That
 gets worse when you have lots of external bandwidth coming in, just as
 happened to the UDP queues you describe below. The difference is that
 TCP input queues cannot be made that large, so they lose data sooner,
 and TCP packet drops can be pretty expensive for faraway sites. If you
 give too much preference to ICP and the other incoming sockets, you waste
 too much time on them and these TCP drops _will_ happen. That is exactly
 what I am trying to avoid.
 
> I'm not after perfect response times. I'm happy with the ICP
> slowing down a tad. What I'm not happy with is when the default is such
> that the caches ARE practically idling due to circumstances, AND ICP times
> are blowing out. If it's a matter of polling a touch faster to get the
> caches to respond roughly to what they can handle, I'm going to do it.

 I don't think a cache can be idling and still have its ICP times blow out.
 The static divisor of 16 means that, in the worst case, incoming sockets
 are not repolled until 16 events have happened on ordinary FD's. Its whole
 purpose is to avoid needless polling of idle incoming sockets. If no other
 FD's are ready at all, the whole loop is redone with comm_*_incoming, and
 the main poll() returns immediately if anything is pending on the incoming
 sockets. So an idle cache services only the incoming sockets, as fast as
 it can, no matter what the divisor is.
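
 Roughly, the divisor logic inside comm_poll() looks like this (a
 paraphrased sketch, not the literal source; apart from fdIsHttpOrIcp()
 the macro and helper names are illustrative):

   #include <poll.h>

   #define INCOMING_DIVISOR 16

   extern int fdIsHttpOrIcp(int fd);        /* true for the ICP/HTTP sockets */
   extern void service_ordinary_fd(int fd); /* run the registered read/write handler */
   extern void comm_poll_incoming(void);    /* poll + service the incoming sockets */

   static int incoming_counter = 0;

   static void
   handle_ready_fds(struct pollfd *pfds, int nfds)
   {
       int i;
       for (i = 0; i < nfds; i++) {
           int fd = pfds[i].fd;
           if (!(pfds[i].revents & (POLLIN | POLLOUT)))
               continue;                    /* nothing pending on this FD */
           if (fdIsHttpOrIcp(fd))
               continue;                    /* incoming sockets are skipped here */
           service_ordinary_fd(fd);
           if ((++incoming_counter % INCOMING_DIVISOR) == 0)
               comm_poll_incoming();        /* revisit incoming after 16 events */
       }
   }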

 Think about it: at 250 ICP/sec an ICP packet arrives roughly every 4 ms.
 To be a blisteringly fast cache that keeps up with that, while also
 serving a few TCP sessions/sec, you'd have to get through up to 16 reads
 or writes between incoming polls within those 4 ms, i.e. one event every
 0.25 ms. Pretty tough.
   I agree that a divisor of 16 is not for everyone, especially since the
 ICP socket queues both remote requests and remote responses, so a spike
 in remote requests can blow the queue and cause us to miss positive
 responses.
   The problem, though, is that squid makes no distinction between ICP
 requests and ICP responses. I think it should. We want to receive the
 responses to our own queries as fast as possible, and at the same time
 return responses to remote requests as fast as the current load allows.
 The HTTP accept socket might need yet another preference.
   So I think we should always use a separate ICP socket for our outgoing
 queries, on a different UDP port from the socket that takes incoming
 requests. Poll the incoming-request socket NO more often than ordinary
 sockets, and poll the socket carrying responses to our queries more
 frequently. Then again, we won't see more ICP responses than we send
 queries, roughly one per TCP request, so this is about equal to the
 HTTP accept rate.
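
 In code the split could look something like this (a minimal sketch with
 plain BSD sockets rather than squid's comm_* wrappers; the names are
 illustrative):

   #include <string.h>
   #include <sys/socket.h>
   #include <netinet/in.h>

   static int icp_server_fd;  /* peers' requests arrive here (well-known port) */
   static int icp_client_fd;  /* our queries go out here, so replies come back here */

   static void
   icp_open_sockets(void)
   {
       struct sockaddr_in sa;

       memset(&sa, 0, sizeof(sa));
       sa.sin_family = AF_INET;
       sa.sin_addr.s_addr = htonl(INADDR_ANY);

       icp_server_fd = socket(AF_INET, SOCK_DGRAM, 0);
       sa.sin_port = htons(3130);         /* advertised ICP port */
       bind(icp_server_fd, (struct sockaddr *) &sa, sizeof(sa));

       icp_client_fd = socket(AF_INET, SOCK_DGRAM, 0);
       sa.sin_port = htons(0);            /* ephemeral port, sees only replies to us */
       bind(icp_client_fd, (struct sockaddr *) &sa, sizeof(sa));
   }

 The point being that icp_client_fd only ever carries responses to our own
 queries, so it can safely get the eager polling treatment, while
 icp_server_fd is polled like any ordinary socket.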

 NB! There is one strange piece of code left in comm_poll:
   if (fdIsHttpOrIcp(fd))
      continue;
 It makes squid ignore incoming sockets _even_ when they are ready,
 until incoming_counter wraps. That artificially slows down ICP
 servicing, while the original intent was to poll these sockets more
 frequently; incoming_counter was only introduced to avoid polling
 them far too often.
 Why not service them right away as they become ready?
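 Something like this, perhaps (just a sketch, not a patch; the PF handler
 type and the read_handler/read_data members are what I assume fd_table
 provides):

   if (fdIsHttpOrIcp(fd)) {
       PF *hdl = fd_table[fd].read_handler;
       if (hdl) {
           fd_table[fd].read_handler = NULL;  /* one-shot, as for ordinary FD's */
           hdl(fd, fd_table[fd].read_data);   /* service it now instead of skipping */
       }
       continue;
   }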

 Btw, it would be very nice if squid could implement some kind of
 priorities, something like the priority queues in routers, and in such a
 way that the configuration could force selected sessions to different
 priorities via ACL's. For us, it would be wonderful to pinch porn sites
 down to the lowest possible priority.

 One way to implement FD priorities could be to add a variable to the
 fd_table structure giving each FD a relative priority, then do several
 polls, including in each pass only those FD's whose priority is less than
 or equal to the priority of interest. With a maximum of 10 priorities, a
 full cycle takes 10 polls: the highest-priority FD's are polled every
 time, the lowest-priority FD's every 10th time. This isn't a perfect
 scheme, but I think we really should come up with one. Its drawback is
 that lower-priority sockets have to wait for one or more poll timeouts
 before getting service when there is no activity on the higher-priority
 FD's. Maybe it would be more appropriate to poll all FD's at once, then
 service only the prio 0 FD's, repoll, then service prio 0 and 1, repoll,
 and so on...
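
 As a rough sketch of the multi-pass variant (everything here is
 hypothetical, none of it is existing squid code):

   #include <poll.h>

   #define MAX_PRIO 10
   #define MAX_FDS  1024

   extern int biggest_fd;                /* highest FD currently in use */
   extern int fd_priority[MAX_FDS];      /* would live in fd_table: 0 = highest */
   extern int fd_wants_read[MAX_FDS];    /* FD has a read handler registered */
   extern void service_fd(int fd);       /* call the FD's registered handler */

   static void
   priority_poll_pass(int pass)
   {
       struct pollfd pfds[MAX_FDS];
       int nfds = 0;
       int fd, i;

       /* include only FD's at least as urgent as this pass */
       for (fd = 0; fd <= biggest_fd && fd < MAX_FDS; fd++) {
           if (!fd_wants_read[fd] || fd_priority[fd] > pass)
               continue;
           pfds[nfds].fd = fd;
           pfds[nfds].events = POLLIN;
           nfds++;
       }
       if (poll(pfds, nfds, 10) <= 0)    /* 10 ms timeout per pass */
           return;
       for (i = 0; i < nfds; i++)
           if (pfds[i].revents & POLLIN)
               service_fd(pfds[i].fd);
   }

   /* one full cycle is MAX_PRIO passes: priority-0 FD's are polled in
    * every pass, priority-9 FD's only in the last one */
   void
   priority_poll_cycle(void)
   {
       int pass;
       for (pass = 0; pass < MAX_PRIO; pass++)
           priority_poll_pass(pass);
   }

 As said above, when the high-priority FD's are idle a low-priority event
 may sit through several of those 10 ms pass timeouts before its own pass
 comes around; that timeout is the knob to tune.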

 Squid should keep some FD's at the highest priority, like disk io, dns
 and icmp; and even if remote ICP requests don't get through, we really
 want to keep receiving the ICP responses to our own queries.
 I personally would set priorities something like this (sketched as an
 enum just after the list):
  0) squid internal (dns, disk, redirect, icmp, snmp, logs, ..)
  1) remote ICP responses
  2) accept socket(s) and normal tcp fetch sockets
  3) normal open tcp client sockets
  4) remote ICP requests
  5) sessions specially marked as low priority by ACL's
  ..
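
 The same list as a hypothetical enum (the names are mine, nothing like
 this exists in the source yet):

   enum fd_priority {
       PRIO_INTERNAL  = 0,   /* dns, disk, redirect, icmp, snmp, logs, .. */
       PRIO_ICP_REPLY = 1,   /* remote ICP responses to our own queries */
       PRIO_ACCEPT    = 2,   /* accept socket(s) and normal tcp fetch sockets */
       PRIO_CLIENT    = 3,   /* normal open tcp client sockets */
       PRIO_ICP_QUERY = 4,   /* remote ICP requests */
       PRIO_ACL_LOW   = 5    /* sessions specially marked low priority by ACL's */
   };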
  
> > BTW, you didn't mention, what was the very slow ICP responses in real
>
> We were seeing up to 2.5 seconds. Solaris has the ability to set
> a fairly large UDP receive buffer. Ours was set to 256K. That allows
> about 600 ICP responses to queue up, or at 250 ICP requests/sec, about
> 2.5 seconds' worth, given squid not sucking up UDP fast enough. That's
> where the problem was: squid wasn't processing the incoming UDP queue
> quickly enough. ICP packets which didn't fit in the queue were simply
> dropped.
 

----------------------------------------------------------------------
 Andres Kroonmaa                      mail: andre@online.ee
 Network Manager
 Organization: MicroLink Online       Tel:  6308 909
 Tallinn, Sakala 19                   Pho: +372 6308 909
 Estonia, EE0001  http://www.online.ee    Fax: +372 6308 901
----------------------------------------------------------------------
