Re: vanishing coordinator process in squid 3.2

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 03 Feb 2012 14:13:53 -0700

On 02/03/2012 04:21 AM, alex sharaz wrote:

> I’m running 3.2.0.14…..build …91 on a number of servers and I’ve noticed
> that fairly frequently the coordinator process vanishes. There’s nothing
> in the logs to say that (in this case) kid9 ( 8 worker processes)
> terminated for any particular reason.

Hello Alex,

    I trust you checked the system logs (e.g., /var/log/messages) in
addition to the Squid general log (e.g., cache.log).

> I still have worker processes
> active and they still seem to be processing connections.

In most cases, workers should work OK without coordinator until they
need to coordinate with each other. Coordination is needed to respond to
most cache manager queries and for a clean shutdown.

> At the moment I’m killing off the worker processes using kill -9 and
> just restarting everything with /usr/local/squid/sbin/squid -SYC

Do you need the -C option?

> So
>
> 1). Anything I can switch on logging wise to see why the process is
> vanishing

I see two options:

1) Attach a gdb session to a still-running coordinator and set up a
breakpoint on _exit or similar. You will need to customize handling of
common signals for this to work. The Squid wiki might have an example.

2) Configure coordinator process with debug_options ALL,9 or similar
using squid.conf conditionals. You can even give it a dedicated log file.

#1 is preferable for initial triage if you can make it work.
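
A minimal sketch of option #2, assuming the 8-worker setup described
above, where the coordinator is kid9. The log file path is a
hypothetical example, so adjust it to your installation:

```
# Applies only to the process numbered 9, which is the coordinator
# when there are 8 workers (an assumption taken from the report above).
if ${process_number} = 9
# Full debugging for this one process; this is very verbose.
debug_options ALL,9
# Hypothetical dedicated log file for the coordinator.
cache_log /usr/local/squid/var/logs/cache-coordinator.log
endif
```

If the number of workers changes, the coordinator's process number
changes with it, so the condition needs to be updated as well.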

> 2). Is there a better way of restarting the coord process than killing
> everything and starting again?

The Master process should restart Coordinator. I have not checked but
perhaps your -C option is in the way?

That said, I do not think we fully support Coordinator restarts today.
After a restart by the Master process, Coordinator may not notice that
there are other kids running, and so coordination may not work well.
This is something we should improve eventually.

Short-term, I would focus on avoiding Coordinator deaths. Figuring out
why Coordinator dies would be the first step. See above for a couple of
suggestions.

Thank you,

Alex.
Received on Fri Feb 03 2012 - 21:14:10 MST