Re: Multiple issues in Squid-3.2.3 SMP + rock + aufs + a bit of load

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 03 Dec 2012 09:07:24 -0700

On 12/03/2012 01:44 AM, Henrik Nordström wrote:
> The store rebuilding does
>
> eventAdd("storeRebuild", RebuildStep, rb, 0.01, 1);
>
> which says both
>
> delay = 0.01, avoid saturating things too much
>
> weight = 1, this is a heavy job, slow down event.cc processing when this
> is seen. Only one heavy job should be scheduled per event loop.

Yes, if implemented correctly, it is essentially the same as the "ASAP
but after the next I/O loop" API I was talking about. In my thinking, I
avoided the existing weight API because I knew its implementation had
been broken for a while, but I now realize and agree that we should just
fix that implementation instead of adding another API.

>> 1) When entering the non-I/O engines loop, remember "now" and only
>> process timed events that are ready at that remembered time. Do not
>> advance that time value until we exit the loop and let I/O engine run.
>> This does not change the structure of the engines loop. Some events will
>> still be late, but there will be no disruptive injections of new I/O
>> activity that might not use async calls. This is probably a
>> smaller/easier change, but it also masks the true problem.
>
> event.cc actually does this (and always have), but gets broken by
> EventLoop::runOnce() doing this:

Yes. IIRC, it has been broken before my AsyncCall changes to the main
loop. I added "XXX: We may be called again..." to src/event.cc, but did
not try to fix it because I was worried that I do not understand
something and decided to leave the apparent bug as is to minimize
damage/surprise.

> do {
> // generate calls and events
> typedef engine_vector::iterator EVI;
> for (EVI i = engines.begin(); i != engines.end(); ++i) {
> if (*i != waitingEngine)
> checkEngine(*i, false);
> }
>
> // dispatch calls accumulated so far
> sawActivity = dispatchCalls();
> if (sawActivity)
> runOnceResult = false;
> } while (sawActivity);
>
> if (waitingEngine != NULL)
> checkEngine(waitingEngine, true);
>
>
> This will poll non-comm event loops until they are completely idle
> having nothing to do.

Yes, and that breaks the weight API. The "exhaust all AsyncCalls"
approach is still needed but we should honor the weight API and stop
processing timed events sooner if needed.

> The sawActivity loop should go away I think, polling each event engine
> once per loop. It's the responsibility of each event engine to run as
> many events as it needs per loop.

I was going to agree with that wholeheartedly, but then I thought about
signals. If we visit each non-waiting engine once, then we will only
process signals once per main loop step. Is that OK? I do not think so
because it will result in Squid sleeping in select(2) while there is a
signal waiting to be honored (the signal was received after the signal
engine was checked but before select(2) was called, so select will not
be interrupted).

I think we can still do that provided the
AsyncEngine::timeTillNextEvent() API discussed below is adjusted to
allow us to skip the waiting engine completely: If the call returns a
"check me again ASAP" AsyncEngine::EVENT_ASAP value, the main loop will
skip the waiting engine. Only the signal engine will return such ASAP
values.

> Looking at the code I think I see why this sawActivity loop got added
> around the non-comm event engines. There is a AsyncCalls induced delay
> procssing of events fired by checkEngine() making it fail to properly
> set the required loop_delay. We need to extend the event engine API with
> a call to return the minimum loop delay each engine needs after
> dispatchCalls() have been run allowing it to account for new events
> added.

I do not think that is why the sawActivity loop was added, but I agree
that the new AsyncEngine::timeTillNextEvent() or similar API would be a
nice solution.

> An interim solution without changing event engine API to include such
> call is to drop loop_delay down to 0 if there is any activity,
> disregarding what the event engine said in such case. This will cause a
> slight CPU penalty under lower loads as the event loop fires more often
> than needed but is a very easy change. I.e. something like this:
> (completely untested)
>
> === modified file 'src/EventLoop.cc'
> --- src/EventLoop.cc 2012-02-05 06:09:46 +0000
> +++ src/EventLoop.cc 2012-12-03 08:42:39 +0000
> @@ -107,19 +107,26 @@
> if (!waitingEngine && !engines.empty())
> waitingEngine = engines.back();
>
> - do {
> - // generate calls and events
> - typedef engine_vector::iterator EVI;
> - for (EVI i = engines.begin(); i != engines.end(); ++i) {
> - if (*i != waitingEngine)
> + // generate calls and events
> + typedef engine_vector::iterator EVI;
> + for (EVI i = engines.begin(); i != engines.end(); ++i) {
> + if (*i != waitingEngine)
> checkEngine(*i, false);
> - }
> + }
>
> - // dispatch calls accumulated so far
> - sawActivity = dispatchCalls();
> - if (sawActivity)
> - runOnceResult = false;
> - } while (sawActivity);
> + // dispatch calls accumulated so far
> + sawActivity = dispatchCalls();
> + if (sawActivity) {
> + runOnceResult = false;
> + /* Delay is unknown if there was events triggered. Schedule next
> + * event loop immediately so it can calculate delay proper
> + * TODO: Maybe replace this with polling each engine on their
> + * requested delay here instead of trying to do so in
> checkEngine()
> + * checkEngine() can't as it needs to be calculated after
> + * dispatchCalls()
> + */
> + loop_delay = 0;
> + }
>
> if (waitingEngine != NULL)
> checkEngine(waitingEngine, true);

I am worried that this will result in 50% or more zero-delay select(2)
calls for nearly all Squids because there is usually some AsyncCall
activity per I/O loop (but possibly no timed-events). And as the code
migrates from zero-delay timed events to AsyncCalls, the situation will
become worse. That does not sound like a good fix because it is not
clear (to me) what kind of side effects this will have on performance.

Could you please fix this properly by adding
AsyncEngine::timeTillNextEvent() or similar API and removing loop_delay
manipulation from checkEvents() as discussed above?

Thank you,

Alex.
Received on Mon Dec 03 2012 - 16:07:35 MST

This archive was generated by hypermail 2.2.0 : Tue Dec 04 2012 - 12:00:05 MST