Re: pconn.cc assert index >= 0, async call queue madness

From: Amos Jeffries <squid3@dont-contact.us>
Date: Tue, 08 Apr 2008 23:50:02 +1200

Henrik Nordstrom wrote:
> mån 2008-04-07 klockan 23:41 -0600 skrev Alex Rousskov:
>> Can you post cache.log lines with full debugging enabled? I would be
>> happy to help with the queue review, but it would speed things up if I
>> do not have to go through the complicated steps required to reproduce
>> the problem.
>
> The cache.log is in ~hno/tmp/async-queue.log on squid-cache.org if you
> want it.
>
> But I think it's easier to discuss the problem on a meta level. What we
> need is more of a discussion on programming & design style than the
> specific case
>
> - There is two events that may happen, 'A' and 'B'.
> - In the processing of one interest of the other is deregistered with
> the silent assumption that after that it won't get invoked, but the
> object it's related to still exists.
> - Both events fire at the same time and gets queued in the async call
> queue.
> - 'A' executes. Deregisters interest in 'B' and assumes it won't get
> called, but it's already in the async call queue.
> - 'B' executes unexpectedly..
>
> There is three ways to deal with this
>
> a) make sure the event interest is registered with a cbdata that gets
> invalidated when the interest is deregistered.
>
> b) shield every such action so that if fails gracefully if it gets
> unexpectedly called due to race conditions with other actions on the
> same object.
>
> c) Make the async call queue always drain between each event processed
> and hope that's sufficient to avoid races (it's not 100%)
>
> 'a' is a lot easier to audit as you only need to look at one type of
> event at a time to judge if it's safe or not and is why I push for that
> model.
>
> 'b' is maybe a little more efficient as it needs less cbdata type
> objects. And also a quicker fix when running into this kind of problems,
> but in many cases it's very hard to identify all possible races..
>
> 'c' would also work I think, but not sure how to fit that in the current
> event engine mess with multiple paths that may each fire multiple events
> and only very loose coupling between them...

If its not easy and its not simple it means more bugs. I don't think 'c'
is a good choice here without a full ground-up recode. Nay on that from
me on general principles.

'a' gains on grounds of easy, but can it be done and gain performance?
'b' sounds like the assert() jungle squid currently has

undecided on the other two.

/2cents

>
> The question is primarily which of the three should we use as the
> default model.
>
> Regards
> Henrik
>

Amos

-- 
Please use Squid 2.6.STABLE19 or 3.0.STABLE4
Received on Tue Apr 08 2008 - 05:49:45 MDT

This archive was generated by hypermail 2.2.0 : Wed Apr 30 2008 - 12:00:07 MDT