Re: [RFC] shutdown stages

From: Alex Rousskov <rousskov@measurement-factory.com>
Date: Mon, 08 Aug 2011 11:08:17 -0600

On 08/07/2011 05:22 AM, Amos Jeffries wrote:

> At present shutdown happens in two stages: the server socket and auth
> are shut down immediately, and everything else 30 seconds or so later.
>
> We have a number of bugs open about this, ranging from complaints that
> an idle Squid still takes the full 30-second shutdown when it should
> fast-track the process and close quickly, through to crashes when
> helper shutdown or the swap.state dump is started too late and not
> given enough time to complete.
>
> I would like to split that process into several stages, shutting down
> as much as possible as quickly as is reasonable.
>
> stage 1:
> * TIMEOUT: shutdown_lifetime/2 converted to a timeout, to trigger
> stage-2 early if needed.
> * close ICP, HTCP, HTTP, HTTPS ports, but leave DNS open
> * prevent keep-alive, and close all connections on completion
> * disable the background cyclic events (i.e. garbage collection). We
> can trigger these "manually" as part of the shutdown if they are needed
> at certain points.
> * begin swap.state dump of index
> * call stage 2 immediately IF no active client connections AND stage<2.
>
> stage 2:
> * TIMEOUT: shutdown_lifetime/2 converted to a timeout, to trigger
> stage-3 early if needed.
> * disable new reads.
> * disconnect delay pools: push any buffered information out to clients
> at full speed, then terminate. (Bad idea? Good idea?)
> * call stage 3 immediately IF no active client connections AND stage<3.
>
> stage 3:
> * terminate remaining incomplete active clients using their close
> handlers.
> * loop until async call queue is empty.
> * skip to stage N immediately.
>
> stage N:
> * terminate the event loop and exit.
>
>
> Attached is a proposed patch which alters shutting_down into an
> incremental stage counter and updates the documentation, but does not
> alter the fundamentals of the current behaviour.
>
> I believe we can easily add several shutdown functions to the
> SignalEngine, or move these to the cache manager as a series of async
> calls, with the SignalEngine simply triggering the cache manager
> shutdown event.
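
For concreteness, here is roughly how I read the incremental counter
idea: a counter advanced by a single recurring event. This is only a
sketch, not the attached patch; enterNextStage() and
closeListeningPorts() are invented names, while shutting_down,
eventAdd(), eventDelete(), Config.shutdownLifetime, and SquidShutdown()
name existing Squid pieces (the wiring here is illustrative).

    // Sketch only: shutting_down as an incremental stage counter, with
    // each stage advancing to the next after shutdown_lifetime/2 seconds.
    // An early-advance path (e.g., "no active client connections") would
    // call enterNextStage(NULL) directly and eventDelete() the pending
    // timeout so the stage is not advanced twice.

    extern int shutting_down;      // 0 while running; >0 = current stage

    static void
    enterNextStage(void *)
    {
        ++shutting_down;

        switch (shutting_down) {
        case 1: // close ICP/HTCP/HTTP(S) ports, keep DNS; begin
                // the swap.state index dump
            closeListeningPorts();      // invented helper
            eventAdd("shutdown", enterNextStage, NULL,
                     Config.shutdownLifetime / 2.0, 1);
            break;
        case 2: // disable new reads; flush buffered data to clients
            eventAdd("shutdown", enterNextStage, NULL,
                     Config.shutdownLifetime / 2.0, 1);
            break;
        case 3: // terminate remaining clients via their close handlers,
                // then drain the async call queue
            eventAdd("shutdown", enterNextStage, NULL, 0.0, 1);
            break;
        default: // stage N: terminate the event loop and exit
            SquidShutdown();
        }
    }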

I do not know whether the proposed stages are right or wrong overall
because I do not see what specific principle was used to divide things
up this way and not some other way. I do think the concept of stages is
applicable, but I would define it differently:

Stage 1; "Time-limited no-harm winding down": Stop accepting/creating
new master transactions. All existing master transactions should start
shutting down if possible. No harm to any existing transactions at this
stage. The maximum duration of this stage is controlled by
shutdown_lifetime in squid.conf.
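
In code terms, I picture Stage 1 as something like the minimal sketch
below. stopAcceptingConnections(), masterTransactionCount(), and the
enterStage*() names are invented for illustration; eventAdd() and
Config.shutdownLifetime are the existing scheduler call and squid.conf
value.

    static void enterStage2(void *); // defined in the Stage 2 sketch below

    // Stage 1: stop creating new master transactions; harm nobody.
    static void
    enterStage1()
    {
        stopAcceptingConnections(); // no new master transactions after this

        if (masterTransactionCount() == 0) {
            enterStage2(NULL);      // idle Squid: no reason to wait
            return;
        }

        // Let existing transactions wind down on their own, but cap the
        // wait at shutdown_lifetime.
        eventAdd("enterStage2", enterStage2, NULL,
                 static_cast<double>(Config.shutdownLifetime), 1);
    }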

Stage 2; "Controlled or high-level tear down": All existing master
transactions are forcefully terminated. All modules are asked to shut
down if possible. If some cache_dirs need to start writing their index
to disk, they will do this at this time. Event loop ends at the very end
of this stage, when there are no new events to process. The maximum
duration of this stage is unknown/unlimited because a module may take a
long time to cleanup (e.g., write its index file to disk).
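
Continuing the same hypothetical sketch, Stage 2 might look like this;
terminateAllMasterTransactions() and the RegisteredModule/Modules
registry are invented names, not real Squid APIs.

    // Stage 2: forcefully end whatever Stage 1 could not finish
    // gracefully, then let every module clean up at its own pace.
    static void
    enterStage2(void *)
    {
        terminateAllMasterTransactions(); // e.g., via Comm close handlers

        // cache_dirs may begin writing their index (swap.state) here.
        for (RegisteredModule *m = Modules; m != NULL; m = m->next)
            m->startShutdown();

        // Deliberately no timeout: the event loop keeps running until
        // the last module-generated event is processed, and ends on its
        // own when there is nothing left to do.
    }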

Stage 3; "Uncontrolled or low-level tear down": Squid main() exits.
Static destruction/cleanup starts. The maximum duration of this stage is
unknown/unlimited but no slow activities are expected at this stage.

We could limit Stage 2+3 duration by forcefully killing our own process,
I guess. I do not know whether that is needed. It would require a
separate timeout value, IMO.
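
For illustration, the simplest mechanism I can think of is a POSIX alarm
armed when Stage 2 begins. The limit value is hypothetical; nothing like
it exists in squid.conf today.

    #include <csignal>
    #include <unistd.h>

    // Hard deadline for Stages 2+3: if cleanup overruns, give up.
    static void
    hardShutdownHandler(int)
    {
        _exit(1); // async-signal-safe; skips all cleanup, by design
    }

    static void
    armHardShutdownLimit(unsigned int limitSeconds)
    {
        std::signal(SIGALRM, hardShutdownHandler);
        alarm(limitSeconds); // SIGALRM fires after limitSeconds
    }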

A few specific low-level comments:

1) We must not disable delay pools. At least in some environments, those
pools protect infrastructure or clients; in other environments, the
pools affect billing/expenses. They must be enforced at all times.

2) The event loop (including the async call queue) should run as long as
somebody is using it. Its too-early termination is one of the biggest
design flaws causing bugs now. I believe you have documented that it
should run longer, but I am not sure. I tried to be more explicit about
this in my stage descriptions.
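
What I mean is a demand-driven exit condition, in the spirit of Squid's
EventLoop but with invented helpers (asyncCallQueueEmpty(),
pendingTimedEvents(), and the outstandingUsers counter are all
illustrations, not existing members):

    void
    EventLoop::run()
    {
        // Quit only when the async call queue is drained, no timed
        // events remain, and no component still needs the loop.
        while (!asyncCallQueueEmpty()
                || pendingTimedEvents() > 0
                || outstandingUsers > 0)
            runOnce(); // dispatch ready async calls and timed events

        // Only now is it safe to return toward main() and Stage 3.
    }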

3) I would not "disable new reads" as a separate step. At some point
(when the shutdown timeout expires), we have to kill all still-existing
transactions, if any. At that point, all I/O activity will cease. This
should be driven from transactions to Comm, and not by some global Comm
flags, IMO.
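
For illustration only, the kill step could iterate the remaining
transactions and let each one close its own descriptors; XactionList,
ActiveXactions, and terminate() are invented names, while comm_close()
is the existing Comm primitive.

    static void
    killRemainingTransactions()
    {
        for (XactionList::iterator i = ActiveXactions.begin();
                i != ActiveXactions.end(); ++i)
            (*i)->terminate(); // ends with comm_close() on the
                               // transaction's descriptors, which runs
                               // its registered close handlers

        // Comm keeps no shutdown state of its own; it services these
        // comm_close() calls exactly as during normal operation.
    }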

HTH,

Alex.