Re: [PATCH] helper closing sequence cleanup

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 06 May 2009 10:49:48 +1200

Alex Rousskov wrote:
> On 05/05/2009 06:38 AM, Amos Jeffries wrote:
>> This one will definitely need a review and second opinion.
>>
>> While looking over the helper shutdown sequence to investigate bug 2648.
>>
>> If I understand it now the two states for end-of-life are:
>> - shutdown :: closure is needed, but for some reason it can't be done
>> immediately.
>> - closing :: actual comm_close() has been called on the read socket.
>
> Could you please document the corresponding helper_server flags,
> n_running, and n_active? And the stats members as well? This will help
> those reviewers who do not know the original code intent behind all
> those state variables and flags :-).

A one of those, yes it would. I only looked at the shutdown/closing
pathways. Will give the running pathways a look a well with that in mind.

>
> The patch overall seems to be cleaning up the code quite a bit, thank you.
>
> Should we make _helper_stateful a kid of _helper to avoid duplicate
> declarations and reduce the number of things we need to document? This
> may also allow you to write one DoAbort method for the two helper kinds.

Yea that was part of the idea for next step. I didn't go so far as to
alter the C/C++ nature of the code though.

>
> I am concerned about n_active management in the current code because it
> is decremented in helperShutdown even when we call "continue" and do
> almost nothing for that particular "link". Your patch seems more logical
> than the current code, but still it decrements the n_active counter
> before something specific has happened. Perhaps n_active documentation
> would clarify that, but maybe we should move n_active change to
> helperDoAbort or elsewhere?

One of the first things I looked at was moving it. It's done in the
current place because the shutdown procedure becomes one-way with the
next function call at that point.

They n_active-- should all be now after the continue and directly in
front of the *DoAbort() in helper*Shutdown(). Is there one particular
you noticed out of place still?

>
> Please add a description to helperDoAbort and friends. Can this be
> renamed to match shutdown or closing state terminology (e.g.,
> helperDoShutdown) or are we doing something different than shutting down
> or closing here? I think you have a good beginning of this documentation
> in this email already.

The docs are all at the point of function declaration. :)

helperDoShutdown is already there and one of the processes that call these.

>
> Can we remove flags.closing and rely on the corresponding wfd state
> instead? Is wfd always set to -1 when we set flags.closing to 1?

Did I not comment that some closures are rfd closed first, others are
wfd closed? The helepr closing happens both ways depending on whether
its Abort or Graceful close happening.

> What happens to the helper state after we set flags.closing and
> comm_close closes the descriptor? We do not have a flags.closed state.
> Is that because the helper is completely destroyed when comm_close does
> its job and calls us back?

As far as I can tell this ....

+/**
+ * Abort reading from this helper immediately.
+ *
+ * This is a sharp short closure for the helper which may be used
+ * for quick closure or error aborting.
+ * Any pending requests are abandoned.
+ *
+ * It's used during helprShutdown, but that is just graceful enough
+ * to schedule the close and wait for existing queues to drain.
+ */
+static void helperDoAbort(helper_server * srv);
+static void helperStatefulDoAbort(helper_stateful_server * srv);
+
+/**
+ * Graceful closure of the helper.
+ * Write socket is closed and helper marked as closing so that
+ * further requests are not queued.
+ *
+ * Pending queued items are currently abandoned (need to fix that),
+ * but requests already sent to the helper are waited for
+ * and the sockets only fully closed when none remain to receive back.
+ */
+static void helperDoGracefulClose(helper_server * srv);
+static void helperStatefulDoGracefulClose(helper_stateful_server * srv);

Then the close handler associated with the rfd is *Free.

This is why I named them that way around. Because marking thw wfd and
closing flags and waiting for the state to fall naturally into unused
and do its free looks more graceful than closing read and abandoning data.

I'd really like these to always be done the same way in a simple Close()
call, but can't see a good way to do it easily yet.

>
> Thank you,
>
> Alex.
>
>
>> I found a few things not quite right with the handling assuming both of
>> the above are true ...
>>
>> * line 1553 :: all of the comm_close calls appeared to be matched with
>> flags.closing=1 except this one in the stateful delayed shutdown
>> closure. This is probably the actual cause of bug 2648.
>>
>>
>> * line 1381, 1424 :: The lookups for next available helper were
>> skipping helpers with shutdown pending. But identifying those currently
>> closing as candidates. This leaves the helpers in an error-state closure
>> as candidates for use. It may not be an issue now with immediate comm
>> callbacks, but considering the closing state has a potential async step
>> in the middle I don't think its a good idea to ignore any longer.
>>
>>
>> * helper*Shutdown() :: were marking helpers as non-active before
>> checking to see if they had pending work to do, instead of marking only
>> the closed helpers as inactive. The effects of this are probably hidden
>> behind the comm shutdown closures on reconfigure.
>>
>>
>> * line 1011, 1123 :: error-state closures are called directly on the
>> FD, but the helper state is not updated to indicate the closure state of
>> the helper state object. At the very least flags.closing=1 needs to be
>> done here too. I've opted for making these closures part of the
>> hard-close (see below)
>>
>> The rest of the patch is removing duplicate code around comm_close()
>> into the two states of closure ( x2 for the two types of helper):
>>
>>
>> helper*DoAbortClose()
>>
>> Where the read socket for receiving further data goes down first. The
>> helper and its pending requests are abandoned. The shutdown terminator
>> is pushed to the write socket. The callbacks cleanup the remaining state
>> and close the write.
>>
>> I've pulled that closure code out of the shutdown sequences, this moves
>> a step closer to merging the two. Also allows the error-case closure
>> noted above to use the full close sequence on the state object with
>> windows shutdown etc. Which was missing previously.
>>
>> This one has windows shutdown handlers.
>>
>>
>> helper*DoGracefulClose()
>>
>> Where the write socket is closed so no more data can be pushed at the
>> helper, but the read socket remains open for the last pending data
>> results to come back.
>>
>> This is used for delayed shutdown completion, and for closure of the
>> write socket during read socket shutdown. Also enforces mark of the
>> closing flag for when its the first point of closure.
>>
>> This one I'm not sure of why, but its missing windows shutdown
>> handlers. The unwrapped comm_close may screw with the HANDLE closure.
>> May need to become a NOP for windows.
>>
>>
>> Amos
>>
>

-- 
Please be using
   Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
   Current Beta Squid 3.1.0.7
Received on Tue May 05 2009 - 22:49:57 MDT

This archive was generated by hypermail 2.2.0 : Wed May 06 2009 - 12:00:03 MDT