Re: Multiple issues in Squid-3.2.3 SMP + rock + aufs + a bit of load

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 04 Dec 2012 12:18:29 +1300

On 04.12.2012 05:07, Alex Rousskov wrote:
> On 12/03/2012 01:44 AM, Henrik Nordström wrote:
>> The store rebuilding does
>>
>> eventAdd("storeRebuild", RebuildStep, rb, 0.01, 1);
>>
>> which says both
>>
>> delay = 0.01, avoid saturating things too much
>>
>> weight = 1, this is a heavy job; slow down event.cc processing when
>> this is seen. Only one heavy job should be scheduled per event loop.
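>>
>> To make the intended weight semantics concrete, a minimal sketch
>> (illustrative only, not the actual event.cc code; Event, checkOnePass
>> and the queue are made-up names here):
>>
>> #include <deque>
>>
>> struct Event {
>>     double when;             // absolute due time
>>     int weight;              // 0 = lightweight, 1 = heavy
>>     void (*handler)(void *);
>>     void *arg;
>> };
>>
>> // Dispatch due events until the per-pass weight budget is spent.
>> // With a budget of 1, the first weight-1 (heavy) job ends the pass,
>> // so only one heavy job runs per event loop.
>> void checkOnePass(std::deque<Event> &queue, double now)
>> {
>>     int budget = 1;
>>     while (!queue.empty() && queue.front().when <= now && budget > 0) {
>>         Event e = queue.front();
>>         queue.pop_front();
>>         budget -= e.weight;  // weight-0 events do not consume budget
>>         e.handler(e.arg);
>>     }
>> }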
>
> Yes, if implemented correctly, it is essentially the same as the "ASAP
> but after the next I/O loop" API I was talking about. In my thinking, I
> avoided the existing weight API because I knew its implementation had
> been broken for a while, but I now realize and agree that we should
> just fix that implementation instead of adding another API.
>
>
>>> 1) When entering the non-I/O engines loop, remember "now" and only
>>> process timed events that are ready at that remembered time. Do not
>>> advance that time value until we exit the loop and let the I/O
>>> engine run. This does not change the structure of the engines loop.
>>> Some events will still be late, but there will be no disruptive
>>> injections of new I/O activity that might not use async calls. This
>>> is probably a smaller/easier change, but it also masks the true
>>> problem.
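>>>
>>> A minimal sketch of (1), with made-up names (queue holds timed
>>> events sorted by ascending due time):
>>>
>>> struct Event { double when; void (*handler)(void *); void *arg; };
>>>
>>> const double entryTime = current_dtime;  // remembered once, on entry
>>> while (!queue.empty() && queue.front().when <= entryTime) {
>>>     Event e = queue.front();
>>>     queue.pop_front();
>>>     e.handler(e.arg);  // events becoming due while we drain here
>>> }                      // wait for the next loop; no new I/O injected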
>>
>> event.cc actually does this (and always has), but gets broken by
>> EventLoop::runOnce() doing this:
>
> Yes. IIRC, it was already broken before my AsyncCall changes to the
> main loop. I added "XXX: We may be called again..." to src/event.cc,
> but did not try to fix it because I was worried that I did not
> understand something, and decided to leave the apparent bug as is to
> minimize damage/surprise.
>
>
>> do {
>>     // generate calls and events
>>     typedef engine_vector::iterator EVI;
>>     for (EVI i = engines.begin(); i != engines.end(); ++i) {
>>         if (*i != waitingEngine)
>>             checkEngine(*i, false);
>>     }
>>
>>     // dispatch calls accumulated so far
>>     sawActivity = dispatchCalls();
>>     if (sawActivity)
>>         runOnceResult = false;
>> } while (sawActivity);
>>
>> if (waitingEngine != NULL)
>>     checkEngine(waitingEngine, true);
>>
>>
>> This will poll the non-comm event engines until they are completely
>> idle, with nothing left to do.
>
> Yes, and that breaks the weight API. The "exhaust all AsyncCalls"
> approach is still needed but we should honor the weight API and stop
> processing timed events sooner if needed.
>
>
>> The sawActivity loop should go away, I think, polling each event
>> engine once per loop. It's the responsibility of each event engine to
>> run as many events as it needs per loop.
>
> I was going to agree with that wholeheartedly, but then I thought
> about signals. If we visit each non-waiting engine once, then we will
> only process signals once per main loop step. Is that OK? I do not
> think so, because it will result in Squid sleeping in select(2) while
> there is a signal waiting to be honored (the signal was received after
> the signal engine was checked but before select(2) was called, so
> select will not be interrupted).
>
> I think we can still do that, provided the
> AsyncEngine::timeTillNextEvent() API discussed below is adjusted to
> allow us to skip the waiting engine completely: if the call returns a
> "check me again ASAP" AsyncEngine::EVENT_ASAP value, the main loop
> will skip the waiting engine. Only the signal engine will return such
> ASAP values.
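>
> For illustration, the step after dispatchCalls() might look something
> like this (a sketch only; timeTillNextEvent() and EVENT_ASAP are the
> proposed names, and MAX_LOOP_DELAY is a hypothetical idle cap; nothing
> here exists yet):
>
> // ask each engine when it next needs service
> int loop_delay = MAX_LOOP_DELAY;   // hypothetical cap when idle
> bool checkAgainAsap = false;
> typedef engine_vector::iterator EVI;
> for (EVI i = engines.begin(); i != engines.end(); ++i) {
>     const int delay = (*i)->timeTillNextEvent();
>     if (delay == AsyncEngine::EVENT_ASAP)
>         checkAgainAsap = true;     // only the signal engine does this
>     else if (delay >= 0 && delay < loop_delay)
>         loop_delay = delay;
> }
> if (!checkAgainAsap && waitingEngine != NULL)
>     checkEngine(waitingEngine, true);  // may sleep up to loop_delay
> // else: skip the waiting engine and rerun the other engines at once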
>
>
>> Looking at the code, I think I see why this sawActivity loop got
>> added around the non-comm event engines. There is an
>> AsyncCalls-induced delay in processing events fired by checkEngine(),
>> making it fail to properly set the required loop_delay. We need to
>> extend the event engine API with a call that returns the minimum loop
>> delay each engine needs after dispatchCalls() has been run, allowing
>> it to account for newly added events.
>
> I do not think that is why the sawActivity loop was added, but I
> agree that the new AsyncEngine::timeTillNextEvent() or similar API
> would be a nice solution.
>
>
>> An interim solution, without changing the event engine API to include
>> such a call, is to drop loop_delay down to 0 if there is any activity,
>> disregarding what the event engine said in that case. This will cause
>> a slight CPU penalty under lower loads, as the event loop fires more
>> often than needed, but it is a very easy change. I.e. something like
>> this: (completely untested)
>>
>> === modified file 'src/EventLoop.cc'
>> --- src/EventLoop.cc 2012-02-05 06:09:46 +0000
>> +++ src/EventLoop.cc 2012-12-03 08:42:39 +0000
>> @@ -107,19 +107,26 @@
>>      if (!waitingEngine && !engines.empty())
>>          waitingEngine = engines.back();
>>
>> -    do {
>> -        // generate calls and events
>> -        typedef engine_vector::iterator EVI;
>> -        for (EVI i = engines.begin(); i != engines.end(); ++i) {
>> -            if (*i != waitingEngine)
>> +    // generate calls and events
>> +    typedef engine_vector::iterator EVI;
>> +    for (EVI i = engines.begin(); i != engines.end(); ++i) {
>> +        if (*i != waitingEngine)
>>              checkEngine(*i, false);
>> -        }
>> +    }
>>
>> -        // dispatch calls accumulated so far
>> -        sawActivity = dispatchCalls();
>> -        if (sawActivity)
>> -            runOnceResult = false;
>> -    } while (sawActivity);
>> +    // dispatch calls accumulated so far
>> +    sawActivity = dispatchCalls();
>> +    if (sawActivity) {
>> +        runOnceResult = false;
>> +        /* Delay is unknown if there were events triggered. Schedule the
>> +         * next event loop immediately so it can calculate the delay properly.
>> +         * TODO: Maybe replace this with polling each engine on their
>> +         * requested delay here instead of trying to do so in checkEngine();
>> +         * checkEngine() can't, as it needs to be calculated after
>> +         * dispatchCalls().
>> +         */
>> +        loop_delay = 0;
>> +    }
>>
>>      if (waitingEngine != NULL)
>>          checkEngine(waitingEngine, true);
>
> I am worried that this will result in 50% or more zero-delay select(2)
> calls for nearly all Squids, because there is usually some AsyncCall
> activity per I/O loop (but possibly no timed events). And as the code
> migrates from zero-delay timed events to AsyncCalls, the situation
> will become worse. That does not sound like a good fix, because it is
> not clear (to me) what kind of side effects this will have on
> performance.
>
> Could you please fix this properly by adding
> AsyncEngine::timeTillNextEvent() or a similar API and removing the
> loop_delay manipulation from checkEvents() as discussed above?
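>
> I.e. the sawActivity loop would reduce to a single pass, roughly
> (sketch only, untested):
>
> // one pass: run non-waiting engines, then drain the Call queue once
> typedef engine_vector::iterator EVI;
> for (EVI i = engines.begin(); i != engines.end(); ++i) {
>     if (*i != waitingEngine)
>         checkEngine(*i, false);
> }
> if (dispatchCalls())
>     runOnceResult = false;
>
> // compute loop_delay here from each engine's timeTillNextEvent(),
> // as sketched earlier, instead of inside checkEvents()
>
> if (waitingEngine != NULL)
>     checkEngine(waitingEngine, true);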
>

Would it make sense to do one of:

* have the AsyncCall queue drain with a timeout, so that no sequence of
calls can lag any of the sensitive engines too much (sketched below).
That, along with the event engines scheduling their heavyweight actions
as Calls instead of executing them immediately.

* have the AsyncCalls engine maintain a real-time benchmark of each
call, marking the min/max duration spent servicing it, so that we can
identify the heavyweight AsyncCalls and schedule a Call checking for
signals and events after each big one.

* split the single AsyncCalls queue into an Xaction queue and a Bg
queue, such that active transaction calls, signals, and events are not
held up by background events which are not sensitive to lag.

With a main loop something like:
  + signals scheduled into one Queue
  + timed events scheduled into one Queue
  + signals scheduled into one Queue
  + Xaction Queue draining
  + signals scheduled into one Queue
  + one Bg Call drained
  + ... repeat

With two queues, we can schedule as many heavyweight Calls as needed at
any point without blocking the loop on all of them happening, while
still 'blocking' the loop for the other, more important Calls queue.
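
A rough sketch of one loop iteration, to make the shape concrete (every
name below is invented; this is not a patch):

    checkSignals();                  // cheap; repeated between stages
    timedEvents.scheduleDue();       // due timed events become Calls
    checkSignals();
    // drain the transaction-critical Calls, but against a deadline so
    // a runaway sequence cannot starve the comm engine (idea 1 above)
    const double deadline = currentTime() + 0.05; // e.g. 50ms budget
    while (!xactionQueue.empty() && currentTime() < deadline)
        xactionQueue.pop()->run();
    checkSignals();
    if (!bgQueue.empty())
        bgQueue.pop()->run();        // exactly one heavy background Call
    commEngine.wait(loop_delay);     // select(2) with the computed delay

The deadline also buys the first idea for free: heavyweight work can be
queued in any quantity, but each iteration spends at most that budget on
it before the comm engine runs again.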

Amos