Re: [PATCH] If a worker process crashes during shutdown, dump core and prevent restarts

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 30 Mar 2011 13:35:49 -0600

On 03/29/2011 03:04 PM, Alex Rousskov wrote:
> On 11/23/2010 04:58 PM, Amos Jeffries wrote:
>> On Sat, 20 Nov 2010 00:45:21 +0300, Dmitry Kurochkin
>> <dmitry.kurochkin_at_measurement-factory.com> wrote:
>>> If a worker process crashes during shutdown, dump core and prevent
>>> restarts.
>>>
>>> Before the change, if a worker process crashed during shutdown, the
>>> death() handler would exit with code 1, and the master process would
>>> restart the worker. Now workers send SIGUSR1 to the master when shutting
>>> down. When the master process gets the SIGUSR1 signal, it stops
>>> restarting workers.
>>>
>>> SIGUSR1 is already used for log rotation, but it is fine to use SIGUSR1
>>> for master process shutdown notifications because the master is never
>>> responsible for both log rotation and kid restarts.
>>>
>>> Terminate with abort(3) instead of exit(3) to leave a core dump if a
>>> Squid worker crashes during shutdown.
>>>
>>> The patch also fixes a potential infinite loop in the master process.
>>> The master finished only when all kids had exited with success, or all
>>> kids were hopeless, or all kids had been killed by a signal. But in mixed
>>> cases, such as when some kids are hopeless and others were killed, the
>>> master process would not exit. After the change, the master exits when
>>> there are no running kids and no kids should be restarted.
>>>
>>> Add a syslog notice if a kid becomes hopeless.
>>>
>>> Regards,
>>> Dmitry
>>
>>
>> This seems to me a cleaner implementation of the kill-parent hack. Thank
>> you.
>>
>> +0. (seems right by reading, but I can't evaluate it properly yet.)
>
> I am going to commit this change if there are no new objections.

Committed to trunk as r11330.
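
The master-side logic described in the quoted summary can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the committed code: KidState, shutdownNotified, shouldRestart(),
masterShouldExit() and the other names are hypothetical, and the sketch
assumes a POSIX system where SIGUSR1 is available.

```cpp
#include <cassert>
#include <csignal>
#include <vector>

// Hypothetical per-kid bookkeeping; not Squid's actual Kid class.
struct KidState {
    bool running = false;        // process is currently alive
    bool hopeless = false;       // failed too often; never restarted again
    bool exitedCleanly = false;  // exited with a success status
    bool killedBySignal = false; // terminated by a signal
};

// Set by the SIGUSR1 handler when a worker reports that it is shutting down.
// sig_atomic_t keeps the handler's write async-signal-safe.
static volatile std::sig_atomic_t shutdownNotified = 0;

extern "C" void onShutdownNotice(int) { shutdownNotified = 1; }

// Assumed to be called once by the master before it starts watching kids.
void installShutdownNoticeHandler() { std::signal(SIGUSR1, onShutdownNotice); }

// Record that a running kid was terminated by a signal.
void noteKilledBySignal(KidState &kid) {
    assert(kid.running); // only a live kid can be killed
    kid.running = false;
    kid.killedBySignal = true;
}

// A kid is restarted only if it is down, is not in a terminal state,
// and no worker has announced shutdown.
bool shouldRestart(const KidState &kid) {
    if (shutdownNotified)
        return false;
    return !kid.running && !kid.hopeless &&
           !kid.exitedCleanly && !kid.killedBySignal;
}

// New exit rule: no running kids and no kids that should be restarted.
// Unlike an "all kids share the same terminal state" check, this also
// covers mixed cases (some kids hopeless, others killed).
bool masterShouldExit(const std::vector<KidState> &kids) {
    for (const auto &kid : kids) {
        if (kid.running || shouldRestart(kid))
            return false;
    }
    return true;
}
```

With one hopeless kid and one kid killed by a signal, masterShouldExit()
returns true here, which is the mixed case that used to keep the master
looping.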

Thank you,

Alex.
Received on Wed Mar 30 2011 - 19:36:09 MDT
