Re: [RFC] shutdown stages

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 09 Aug 2011 10:13:55 -0600

On 08/08/2011 08:15 PM, Amos Jeffries wrote:

> Okay we seem to agree on all the prerequisites and problems. Just the
> algorithm of choice differs.

Yes, that is rather likely.

> So separating the central issues into changes. Ordered by severity.
>
>
> 1) pending async queue after the grace period is over / shutdown
> functions scheduling calls
>
> Insert a checking member to SignalEngine as an intermediary between
> other code and StopEventLoop().
>
> Problem: Can we get to Async queue size() from SignalEngine members?

Can we just implement the following?

    If in shutdown and there are no events, quit the main loop.
    Otherwise, keep iterating the main loop.

where "event" includes all scheduled async calls and scheduled timed
events, ignoring the ones after the shutdown timeout.

In pseudo code, this can be written as

    main()
    {
        initial setup
        while (!in_shutdown || haveWorkToDo())
            do a single main loop iteration
        return 0;
    }

    onShutdown()
    {
        in_shutdown = true;
        schedule forceful termination of master transactions in N seconds
        stop accepting new master transactions
        other pre-shutdown steps...
    }

    onMasterTransactionEnd()
    {
        ...
        if (in_shutdown && there are no more master transactions)
            schedule an async call to perform the remaining shutdown steps
    }

haveWorkToDo() will have access to the number of scheduled async calls
and [relevant] timed events. That should not be hard to write.
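To make the idea concrete, here is a minimal, self-contained sketch of that loop. The queue types and the names (asyncCalls, timedEvents, in_shutdown, haveWorkToDo) are illustrative stand-ins for the pseudocode above, not Squid's actual internals:

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Sketch of an event loop that keeps iterating until shutdown has been
// requested AND no scheduled work remains. Names are hypothetical.
struct EventLoop {
    std::deque<std::function<void()>> asyncCalls; // scheduled async calls
    std::deque<std::function<void()>> timedEvents; // relevant timed events
    bool in_shutdown = false;

    // true while any scheduled async call or relevant timed event remains
    bool haveWorkToDo() const {
        return !asyncCalls.empty() || !timedEvents.empty();
    }

    // a single main loop iteration: fire at most one event of each kind
    void iterate() {
        if (!timedEvents.empty()) {
            auto ev = timedEvents.front();
            timedEvents.pop_front();
            ev();
        }
        if (!asyncCalls.empty()) {
            auto call = asyncCalls.front();
            asyncCalls.pop_front();
            call();
        }
    }

    // the main() shape from the pseudocode: run until we are shutting
    // down and there is nothing left to do; returns iteration count
    int run() {
        int iterations = 0;
        while (!in_shutdown || haveWorkToDo()) {
            iterate();
            ++iterations;
        }
        return iterations;
    }
};
```

Note that the loop condition never checks in_shutdown alone: as long as haveWorkToDo() is true, the loop keeps servicing calls, which is exactly what lets existing master transactions finish normally.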

The key here is that almost everything proceeds as usual while we let
the existing master transactions end. The main loop runs until the very end.

> 2) useless waiting / grace period
>
> How about a shutdown "ticker" style event?
> Rather than a blanket 30-second grace period jump. We schedule a
> shutdownTick() every second starting with shutdown signal arrival. The
> tick event checks to see if a) all transactions are over, or b) grace
> period has expired since its first call. Either way things can be kicked
> along, else it re-schedules itself in one more second.
>
> The alternative of scheduling it as a call could lead to very tight
> loops and CPU consumption. So I'm avoiding it for now despite
> potentially being a better solution in the long run. Events can be
> easily converted in a second round of optimization.

With the design sketched above, you do not need a ticker. If
transactions stop earlier, we quit the main loop earlier. If
transactions keep running, the shutdown timer will expire and its
handler will forcefully kill all running transactions (while the main
loop still runs!).

> Problem: How to best detect the number of active transactions?

We need a master transaction class that will provide a forceful
termination API. There will be a list of master transaction objects. The
size of the list will determine the number of master transactions. The
list and the API will be used to kill master transactions that take too
long.
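A rough sketch of such a class, assuming a simple global registry; all names here (MasterTransaction, All, TerminateAll) are hypothetical since no such class exists yet:

```cpp
#include <cassert>
#include <list>

// Sketch: master transactions register themselves in a global list.
// The list size gives the active transaction count; the shutdown-timeout
// handler walks the list and forcefully terminates stragglers.
class MasterTransaction {
public:
    MasterTransaction() { All().push_back(this); }
    ~MasterTransaction() { All().remove(this); }

    // forceful termination API: a real implementation would abort I/O,
    // close connections, etc.; here we just record the fact
    void terminate() { terminated_ = true; }
    bool terminated() const { return terminated_; }

    // the global registry of active master transactions
    static std::list<MasterTransaction*> &All() {
        static std::list<MasterTransaction*> all;
        return all;
    }

    // what the expired shutdown timer would call, with the main loop
    // still running to deliver the resulting cleanup events
    static void TerminateAll() {
        for (auto *xact : All())
            xact->terminate();
    }

private:
    bool terminated_ = false;
};
```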

> Do we have any global clients + internal transactions count already
> which I've been too thick to see?
> If not...every transaction manager object seems to have an
> AccessLogEntry in its members. (Recall that is why I suggested that
> object became Xaction master slab). A static counter of transactions in
> there could be used.

We need more than a counter (see above). Ideally, we need to split
HttpRequest into MasterTransaction and pure HttpRequest. For now, we
could just add a MasterTransaction member to the HttpRequest. We could
reuse and adjust the existing AccessLogEntry for that, but I would
rather not mix the two completely different uses to save a few bytes!
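The interim step could look roughly like this; the member name and types are purely illustrative, not actual Squid declarations:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Placeholder for the future master transaction class (see above).
struct MasterTransaction {
    bool terminated = false; // forceful-termination flag, sketch only
};

// Interim shape: HttpRequest carries a MasterTransaction member rather
// than overloading AccessLogEntry, so master-transaction state stays
// separate from access-logging state.
struct HttpRequest {
    std::string method;
    std::string uri;
    std::shared_ptr<MasterTransaction> masterXaction =
        std::make_shared<MasterTransaction>();
};
```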

Cheers,

Alex.
Received on Tue Aug 09 2011 - 16:14:13 MDT
