Re: [RFC] Squid process model and service name impact

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 28 Jan 2014 11:21:50 -0700

On 01/27/2014 06:44 AM, Amos Jeffries wrote:
> On 27/01/2014 8:18 a.m., Henrik Nordström wrote:
>> How do using a service name for these differ from having a squid.conf
>> set the name (possibly using the same service name as a macro
>> expansion)?

> squid.conf lines effects differ based on all the variability of process
> macros and if..endif conditionals.
>
> Service name is static for the instance.

I suspect this discussion has gotten a little sidetracked, with parties
discussing rather different things while appearing to argue with each
other about the same concept. Let me try to summarize a solution that, I
hope, will keep everybody happy enough:

1) Provide ${service_name} macro (already done; needs minor polishing).
2) Provide configurable IPC-related path prefixes (in progress?).
3) Make #2 use #1 by default.
4) Make pid_filename use --prefix(?) and #1 by default, without changing
   the current default when using the default service name (see the
   sketch below).
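
For concreteness, here is a rough squid.conf-level sketch of the combined
result (the ipc_prefix directive name and the prefix-based directory are
illustrations only, not existing Squid defaults):

  # hypothetical #2 directive, defaulting to a service-name-based path:
  ipc_prefix /usr/local/squid/var/run/${service_name}

  # hypothetical #4 built-in default, expanding to .../squid.pid for the
  # default service name and to, e.g., .../backup.pid for "-n backup":
  pid_filename /usr/local/squid/var/run/${service_name}.pid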

There are two big design questions relevant here:

A) Whether it is a good idea to have hard-coded paths or path prefixes.
The answer is an obvious "no". All paths or, where applicable, path
prefixes should be configurable via squid.conf.

B) Whether it is a good idea to use one configurable option to affect
the default of another configurable option. The answer to that is less
obvious and probably depends on the option. It is, however, common and
good practice to introduce such dependencies for nested paths (e.g.,
--prefix affects many other default paths).

The important thing is that (B) is not a substitute for (A). If we want
to remove something hard-coded in Squid, we should satisfy (A) and,
optionally, provide the convenience of (B). The solution summarized
above accomplishes both.

> Meaning we can use service name much more definitively as a macro for
> the configurable options that need to be unique,

Yes.

> and as a separator for the non-configurable details as well.

If we are changing those non-configurable details, we should make them
configurable. I understand that it may be very difficult in some cases,
but we should make sure we are dealing with those very difficult cases
before giving up. I do not think we are dealing with a difficult case
here as far as shared memory and IPC sockets are concerned.

> Delaying Mem::Init() until after the config parsing does not appear to
> be an option. There are a handful of components whose squid.conf parsing
> depends on it being done beforehand.

Which is nothing more than a known bug, not a design guideline for
future changes. If something can be done after parsing, it should be
done after parsing. We already have the RunnersRegistry API to do
post-parsing configuration without introducing new main.cc dependencies.
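
For illustration, here is a minimal sketch of such a post-parsing runner,
assuming the current RegisteredRunner interface and the same rrAfterConfig
hook that SharedMemPagesRr uses; the class name and the path-finalizing
call are hypothetical:

  // hypothetical runner; IpcPathsRr and FinalizeIpcPrefix() are
  // illustrative names, not existing Squid code
  #include "base/RunnersRegistry.h"

  class IpcPathsRr: public RegisteredRunner
  {
  public:
      /* RegisteredRunner API */
      virtual void run(const RunnerRegistry &);
  };

  RunnerRegistrationEntry(rrAfterConfig, IpcPathsRr);

  void
  IpcPathsRr::run(const RunnerRegistry &)
  {
      // squid.conf (and, hence, any service_name-based macro values) is
      // final here, so defaults that depend on it can be computed now
      // instead of in main() or Mem::Init()
      FinalizeIpcPrefix(); // hypothetical
  }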

> Using service name places the control in users hands without having to
> redesign the squid.conf parsing and initialization in a big way right now.

Yes.

> I also think that it provides sufficient control over all the relevant
> pieces

It does not. The hard-coded path prefix remains a problem whether or not
the suffix uses a configurable service name. A semi-configurable suffix
helps in the one rare case of running concurrent Squid instances, nothing
more.

> As it is I have spent the day fighting the code against memory related
> globals involvement with service_name. Since it is not initialized until
> main(), which for some components is very late already. I think we can
> delay them a bit until Mem::Init() time, but later will cause problems
> with the config parsing itself.

Is Mem::Init() really relevant to shared memory? AFAICT, shared memory
initialization happens _after_ squid.conf has been parsed:

> RunnerRegistrationEntry(rrAfterConfig, SharedMemPagesRr);

There are also IPC socket paths, but AFAICT IPC socket creation is not
part of the Mem::Init() code and happens well after squid.conf parsing,
just before we enter the main loop:

> if (IamCoordinatorProcess())
>     AsyncJob::Start(Ipc::Coordinator::Instance());
> else if (UsingSmp() && (IamWorkerProcess() || IamDiskProcess()))
>     AsyncJob::Start(new Ipc::Strand);
>
> /* at this point we are finished the synchronous startup. */
> starting_up = 0;
>
> mainLoop.run();

Thus, all of the SMP-related hard-coded paths that I know about can use
squid.conf settings. Did I miss any?

>>> I am a little doubtful that we should be letting the PID file remain
>>> configurable, due to the same issue.
>>
>> Please enlight me on what the issue really is here.
>>
>
> Same as with the UDS sockets. Having it squid.conf configurable allows
> for one instance of SMP Squid to have multiple *.pid files with any
> permutation of PID file to SMP shared resources.
> Why do we only have one .pid file with SMP-aware Squid and not one for
> each process started?
>
> pid_filename /var/run/squid/kid${process_number}.pid

If an option is useful and has a good default, it should be allowed,
even if it is possible to misconfigure Squid by misusing that option.

> The complexity it creates is really not necessary IMO.

Unlike your misconfiguration example above, "unnecessary complexity"
_is_ a valid argument. If there is no use for an option, it should not
be allowed (or should be deleted).

I am sure there are use cases for the pid_filename option that are as
legitimate as the use case of running concurrent instances of a single
Squid build. The simplest one is "I want to run/test a Squid executable
built by somebody else with a --prefix that does not suit my needs or
access rights".
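
For instance, such a user could simply override the compiled-in default
(the path below is purely illustrative):

  # squid.conf: run somebody else's build entirely out of a home directory
  pid_filename /home/alice/squid/run/squid.pid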

>> Then we have a broken process model somewhere.
>
> Yup. squid -k and system processes depending on having *the* PID file
> pointing at a process that can reach the coordinator. Versus the PID
> file pointing at a process isolated from the others due to per-worker
> squid.conf settings.

I do not understand the "isolated from the others due to per-worker
squid.conf settings" part. Is that another misconfiguration example? I do
not think we should be discussing misconfiguration examples here; we all
know it is possible to misconfigure Squid badly.

I agree that the process model needs fixing in several places, but I do
not know of any place that would mandate the PID file path prefix to be
hard-coded.

>>> It seems to me the only reason that is being configured
>>> at all is to get around this same (bug 3608) problem of instance
>>> identification,
>>
>> Not at all, not for me at least. It's about being able to relocate the
>> Squid installation, i.e. for running it in a user home directory without
>> needing to rebuild Squid from source each time you change location.
>
> Being able to relocate the running state directory from squid.conf seems
> to me the scope of chroot directive rather than pid_filename.

chroot is too heavy to be used in many cases. It may not even cover the
primitive "I want to run/test a Squid executable built by somebody else
with a --prefix that does not suit my needs. I have no sudo." case.

> What I am thinking for .pid is one of:
>
> A) replace the @DEFAULT_PID_FILE@ with --prefix based path and
> ${service_name} macro for generating default filename.
>
> B) deprecate pid_filename in favour of directive pidfile_path and make
> the name be ${pidfile_path}/${service_name}.pid
>
> C) remove pid_filename directive and use chroot directive and -n command
> line instead.

I do not understand the need for (B) and (C). So far, the only examples
you gave where (B) and (C) might be needed are examples of misconfigured
Squids. Are there any examples not involving a misconfigured Squid?
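
For reference, here is a side-by-side sketch of how (A) and (B) would
look to an administrator (the directory shown is illustrative only, and
pidfile_path is the proposed, not an existing, directive):

  # (A): pid_filename stays; its compiled-in default gains ${service_name}:
  pid_filename /usr/local/squid/var/run/${service_name}.pid

  # (B): pid_filename goes away; only the directory remains configurable:
  pidfile_path /usr/local/squid/var/run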

Thank you,

Alex.