Re: Patch to fix comm_incoming (squid-1.2beta22)

From: Stewart Forster <slf@dont-contact.us>
Date: Fri, 05 Jun 1998 10:00:15 +1000

Hi Andres,

> I personally do not like the idea of calling all special sockets directly
> in a row, and here's why:
> 1) what you want to achieve is avoiding an extra syscall in the form of poll()

        Yes. Is poll/select more expensive than read?

> 2) you do at least 2 syscalls in the form of read() instead

        Yep. That could be a problem. Again, which is cheaper?
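        To put a number on that, a micro-benchmark along these lines
could settle it; it times empty poll() calls against non-blocking
read() calls on an idle UDP socket. This is purely an illustrative
sketch, not part of the patch:

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <poll.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define N 100000

    static double
    now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int
    main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);    /* idle UDP socket */
        struct pollfd pfd;
        char buf[1];
        double t;
        int i;

        fcntl(fd, F_SETFL, O_NONBLOCK);
        pfd.fd = fd;
        pfd.events = POLLIN;

        t = now();
        for (i = 0; i < N; i++)
            poll(&pfd, 1, 0);           /* zero timeout: never blocks */
        printf("poll: %.2f usec/call\n", (now() - t) / N * 1e6);

        t = now();
        for (i = 0; i < N; i++)
            read(fd, buf, sizeof(buf)); /* returns EAGAIN immediately */
        printf("read: %.2f usec/call\n", (now() - t) / N * 1e6);

        close(fd);
        return 0;
    }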

> 3) you need to modify each handler (pass a pointer to an int to it) just to
> take note of the no of ready FD's

        Just the two incoming handlers. That's already done. Actually
it could have been done much as read() does, with each handler returning
the number of events handled as an integer. That would have been more
elegant. I see your point, but I also see how the current implementation
could be reworked to make this point moot.

> 4) in case we want to have separate sockets for ICP.client and ICP.server
> and/or several HTTP accept sockets, it will make the needed no of syscalls
> far higher than poll()ing them all first.

        Agreed for ICP and HTTP. These will potentially have very different
rates. Separate counters?

> 5) you need special handling for them all, e.g. writes, somewhat busting
> the overall Squid design.

        Huh?

> alternatively:
> 1) using poll() for special sockets we'd do at worst 1 more syscall, in case
> some of the sockets are ready, and at best _only_ 1 syscall in case there are
> none. (which IMHO is far more likely to happen, especially if you want
> good reaction time to incoming bursts)
> 2) poll returns you the number of FD's ready, so you need not hack with &incame
> 3) we service only those FD's that are indeed ready
> 4) we can have the same procedure to handle ready FD's regardless of what
> sockets they are.
>
> > incoming_interval = incoming_interval + (no_incoming_sockets / 2)
> > - incame;
> >
> > This allows for fairly rapid dropping of the incoming_interval
> > when things start blowing out, but only increases fairly slowly. Also,
> > since we only want to call the checks when things are probably going to
> > happen we avoid excessive calls and also excessive delays.
>
> Stew, your algorithm has assumptions that may break the idea. E.g. if the
> number of incoming sockets is 1, the above algorithm will move towards a 0
> incoming_interval, which is definitely wrong for http-only caches.

        I didn't specify it, but the algorithm never lets
(no_incoming_sockets / 2) be less than 1, so HTTP-only caches will still
work fine and not approach a 0 incoming_interval.
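        For concreteness, a minimal sketch of that update rule with the
clamp made explicit (names follow the discussion rather than the literal
patch; the INCOMING_FACTOR scaling covered further down is omitted):

    static int no_incoming_sockets = 2; /* e.g. one HTTP + one ICP FD */
    static int incoming_interval = 0;   /* io_events between incoming checks */

    /* incame = number of incoming events serviced on this check */
    static void
    update_incoming_interval(int incame)
    {
        int target = no_incoming_sockets / 2;
        if (target < 1)
            target = 1; /* with 1 socket, target would round to 0 and
                         * drag the interval to 0; the clamp prevents that */
        incoming_interval += target - incame;
        if (incoming_interval < 0)
            incoming_interval = 0;
    }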

> Please let me define the goal from a slightly different perspective:
> 1) whatever the algorithm, its only purpose is to _predict_ when the next
> incoming event is likely to happen

        My algorithm does that well. (Tested).

> 2) we tend to achieve reasonable prediction by measuring the average incoming
> event rate (relative to the regular event rate, not real time)

        We are not really concerned with the average incoming event rate.
What we are concerned about is how often we need to recheck/repoll for
new events on the incoming sockets.

> 3) we can NEVER predict the actual next event correctly: our prediction will
> ALWAYS make us repoll either "too-late" or "too-early"

        My algorithm as posted will move naturally towards a repoll/recheck
rate which has at least the desired average number of events occurring.

> 4) we want to err on the side of too-late rather than too-early (surprise?),
> BUT only VERY slightly (so as not to poll FD's just before they become ready)

        That's cool, and no surprise. My algorithm does that. Tested.
Actually I've made it a little aggressive by making the target average
number of events half the number of incoming sockets, because I value
faster response times even at the expense of some CPU.

> 5) we surely want the algorithm to track the real world as precisely as
> possible with a minimal set of data and assumptions, yet still NOT get
> confused by accidental spikes in traffic rate

        By setting INCOMING_FACTOR high enough (6-8) the algorithm becomes
fairly immune to short spikes. It tracks the period required between
rechecks in a VERY simple way. Tested.

> 6) we want it to react to rate increase of incoming events adequately fast

        In testing on our caches, my algorithm went from no load to
full load within 5 seconds of startup. That's fast enough, yet slow
enough to avoid spikes. Because the desired average number of incoming
events is quite low, large numbers of events act fairly quickly to drag
down the incoming_interval.

> 7) we want it to react to a lack of incoming events slowly, but still react

        Again, the new algorithm does this. If the average number of
events we wanted was 1, then incoming_interval increases by only 1 for
every non-event. Given an INCOMING_FACTOR of 6, that requires 64
non-events in a row before it increases by another whole integer amount.
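        To spell that arithmetic out, here is a sketch of the damping as
described, with the interval kept in fixed point, scaled by
2^INCOMING_FACTOR (a reconstruction from this discussion, not the
literal patch):

    #define INCOMING_FACTOR 6

    static int scaled_interval = 0;     /* scaled by 1 << INCOMING_FACTOR */

    /* target = desired average number of events per check */
    static void
    update_interval(int target, int incame)
    {
        scaled_interval += target - incame; /* +1 per non-event if target == 1 */
        if (scaled_interval < 0)
            scaled_interval = 0;
    }

    static int
    effective_interval(void)
    {
        /* with target == 1, it takes 1 << 6 == 64 consecutive
         * non-events to raise this by one whole unit */
        return scaled_interval >> INCOMING_FACTOR;
    }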

> 8) whenever any event happens on an incoming socket, all we know is that there
> were some (N > 0) FD's ready at the same time.

        That's from using poll, right?

> 9) we should not assume any rate from the ready_FD count, unless we measure
> no of messages received on each FD individually.

        Which is what's done right now.

> (That's because all we know is that an event happened; we do not know the exact
> timing, and assuming >1 events in a row to represent any incoming rate is
> largely unwarranted. E.g. http and ICP can come in at exactly the same rate,
> but always in pairs, and requests from remote ICP peers can come in at the
> same slow rate, but in chunks of 1-3; counting these may lead to way wrong
> predictions. Do not confuse prediction with the statistical average incoming rate)

        No, I don't confuse the two, but do you have a better idea to guess
at when an event is going to happen?

> From that I'd propose:
> 1) not to count number of ready FD's on all incoming sockets at all
> (if we really want such a count to influence the algorithm, then count messages
> serviced in a row on each FD and use the max of all per-FD counts)

        We don't now, but go ahead...

> 2) take the only input to the algorithm to be ready_incoming_FDs == false/true
>
> 3) in comm_incoming, poll() all incoming sockets, service those ready and
> set next poll time.
>
> 4) use a variable (e.g. next_poll) that "predicts" after how many ordinary
> io_events we expect activity on incoming sockets.
>
> 5) when io_events reach next_poll, we do check comm_incoming
>
> 1) if prediction of next_poll was too-late (sockets ready) we reduce next_poll
> by reduce_rate _slightly_ (to avoid over-regulation)
>
> 2) if prediction of next_poll was too-early (no FD's ready), we increase next_poll
> by increase_rate _slightly_ (to avoid too fast decay)
>
> 3) if prediction was too-late >1 consecutive times, then we know that our
> next_poll reduce_rate is too slow, adapt by upping reduce_rate (say * 2)
>
> 4) if prediction was too-early >1 consecutive times, then we are checking incoming
> sockets too often, but we might wish not to change next_poll's increase_rate,
> instead allowing the algorithm to decay slowly and linearly.

        You have just greatly complicated what the algorithm already does,
and you have done it unnecessarily by restricting your access to the
available information. The current algorithm as it stands is extremely
good; after testing it on our caches I watched the incoming_interval
fluctuate between two integer values quite happily (2 to 3 under high
load on one of our slower cache boxes), or sit quite happily at 7 for
ages on our faster cache box.
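        For reference, the quoted scheme might look something like this
as code (a sketch only; next_poll, reduce_rate and increase_rate are the
names from the proposal, and the starting values are illustrative):

    static int next_poll = 16;          /* io_events until next incoming check */
    static int reduce_rate = 1;
    static const int increase_rate = 1; /* deliberately constant, per point 4 */
    static int consecutive_late = 0;

    static void
    adapt_next_poll(int fds_were_ready)
    {
        if (fds_were_ready) {
            /* too-late: shrink the interval; if we keep missing,
             * double the correction (point 3) */
            if (++consecutive_late > 1)
                reduce_rate *= 2;
            next_poll -= reduce_rate;
            if (next_poll < 1)
                next_poll = 1;
        } else {
            /* too-early: reset and decay slowly, linearly (points 2, 4) */
            consecutive_late = 0;
            reduce_rate = 1;
            next_poll += increase_rate;
        }
    }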

        The only things I think could currently be improved upon are these:

        1) Re-introduction of poll() if worthwhile. I agree with you here;
           it may be that poll()/select() is cheaper. Need to measure (see
           the micro-benchmark sketch above).

        2) Reworking the handlers so they simply return the incame value
           as an integer, rather than passing a pointer to retrieve that
           value (a sketch follows below).
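           A sketch of what (2) might look like; the handler name and
           the draining loop here are illustrative, not the actual
           Squid code:

    #include <sys/types.h>
    #include <sys/socket.h>

    /* each incoming handler services everything pending on its
     * (non-blocking) FD and returns the count, instead of writing
     * the count back through an int pointer */
    static int
    icpIncomingHandler(int fd)
    {
        char buf[4096];
        int incame = 0;

        while (recv(fd, buf, sizeof(buf), 0) > 0) {
            /* ... dispatch the message in buf ... */
            incame++;
        }
        return incame;
    }

           The comm loop could then just sum the handlers' return values
           rather than threading a pointer through every call.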

        Cheers,

                Stew.

-- 
Stewart Forster (Snr. Development Engineer)
connect.com.au pty ltd, Level 9, 114 Albert Rd, Sth Melbourne, VIC 3205, Aust.
Email: slf@connect.com.au   Phone: +61 3 9251-3684   Fax: +61 3 9251-3666