Re: [RFC] Squid process model and service name impact

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 29 Jan 2014 13:29:22 +1300

On 2014-01-29 07:21, Alex Rousskov wrote:
> On 01/27/2014 06:44 AM, Amos Jeffries wrote:
>> On 27/01/2014 8:18 a.m., Henrik Nordström wrote:
>>> How does using a service name for these differ from having a squid.conf
>>> set the name (possibly using the same service name as a macro
>>> expansion)?
>
>> The effects of squid.conf lines differ based on all the variability of
>> process macros and if..endif conditionals.
>>
>> Service name is static for the instance.
>
> I suspect this discussion has side tracked a little with parties
> discussing rather different things while appearing to argue with each
> other about the same concept. Let me try to summarize a solution that,
> I hope, will keep everybody happy enough:
>
> 1) Provide ${service_name} macro (already done; needs minor polishing).
> 2) Provide configurable IPC-related path prefixes (in progress?).

  - IPC parts done, patch in trunk.
  - shared-memory parts done (I think), patch in bug 3608 for
    testing+review before commit.
  - review of any other pieces overlooked so far, TBD.

> 3) Make #2 use #1 by default.
> 4) Make pid_filename use --prefix(?) and #1 by default, without changing
> the current default when using default service name.

  #3 == #4

  5) Decide whether to add a new service_path directive, separate from
     chroot, to configure the /var/run (or "/") path; it could use #1.

>
> There are two big design questions relevant here:
>
> A) Whether it is a good idea to have hard-coded paths or path prefixes.
> The answer is an obvious "no". All paths or, where applicable, path
> prefixes should be configurable via squid.conf.

Ack.

>
> B) Whether it is a good idea to use one configurable option to affect
> the default of another configurable option. The answer to that is less
> obvious and is probably dependent on the option. It is, however, a
> common and good practice to introduce such dependencies for nested
> paths (e.g., --prefix affects a lot of other default paths).
>
> The important thing is that (B) does not overwrite (A). If we want to
> remove something hard-coded in Squid, we should satisfy (A) and,
> optionally, provide the convenience of (B). The summarized solution
> above accomplishes both.

This is where I believe the answer to (B) is the command-line -n option.
It is configurable, and its "static" and "global" properties sit halfway
between --prefix and chroot. In its current iteration it is not a path
itself, but is used as a path segment.

>> Meaning we can use service name much more definitively as a macro for
>> the configurable options that need to be unique,
>
> Yes.
>
>> and as a separator for the non-configurable details as well.
>
> If we are changing those non-configurable details, we should make them
> configurable. I understand that it may be very difficult in some cases,
> but we should make sure we are dealing with those very difficult cases
> before giving up. I do not think we are dealing with a difficult case
> here as far as shared memory and IPC sockets are concerned.
>
>
>> Delaying Mem::Init() until after the config parsing does not appear to
>> be an option. There are a handful of components whose squid.conf
>> parsing depends on it being done beforehand.
>
> Which is nothing more than a known bug, not a design guideline for
> future changes. If something can be done after parsing, it should be
> done after parsing. We already have RunnersRegistry API to do
> post-parsing configuration without introducing new main.cc
> dependencies.
>

By "depends" I mean depending on the state existing, rather than the
code linker dependencies.

>
>> Using service name places the control in users' hands without having to
>> redesign the squid.conf parsing and initialization in a big way right
>> now.
>
> Yes.
>
>
>> I also think that it provides sufficient control over all the relevant
>> pieces
>
> It does not. The hard-coded path prefix remains a problem, whether the
> suffix is using a configurable service name or not. Using a
> semi-configurable suffix helps in one rare case of running concurrent
> Squid instances, nothing more.
>

So which cases do you see it not covering?

>
>> As it is, I have spent the day fighting the involvement of memory-related
>> globals with service_name, since it is not initialized until main(),
>> which for some components is already very late. I think we can delay
>> them a bit until Mem::Init() time, but any later will cause problems
>> with the config parsing itself.
>
> Is Mem::Init() really relevant to shared memory? AFAICT, shared memory
> initialization happens _after_ squid.conf has been parsed:
>
>> RunnerRegistrationEntry(rrAfterConfig, SharedMemPagesRr);
>
>
> There are also IPC socket paths, but AFAICT IPC socket creation is not
> a part of the Mem::Init() code and happens way after squid.conf parsing,
> and just before we enter the main loop:
>
>> if (IamCoordinatorProcess())
>>     AsyncJob::Start(Ipc::Coordinator::Instance());
>> else if (UsingSmp() && (IamWorkerProcess() || IamDiskProcess()))
>>     AsyncJob::Start(new Ipc::Strand);
>>
>> /* at this point we are finished the synchronous startup. */
>> starting_up = 0;
>>
>> mainLoop.run();
>
> Thus, all of the SMP-related hard-coded paths that I know about can use
> squid.conf settings. Did I miss any?
>
>
>>>> I am a little doubtful that we should be letting the PID file remain
>>>> configurable, due to the same issue.
>>>
>>> Please enlighten me on what the issue really is here.
>>>
>>
>> Same as with the UDS sockets. Having it squid.conf configurable allows
>> for one instance of SMP Squid to have multiple *.pid files with any
>> permutation of PID file to SMP shared resources.
>> Why do we only have one .pid file with SMP-aware Squid and not one for
>> each process started?
>>
>> pid_filename /var/run/squid/kid${process_number}.pid
>
>
> If an option is useful and has a good default, it should be allowed,
> even if it is possible to misconfigure Squid by misusing that option.
>

Agreed. So the question for all of this is:
   What is it useful for?
    Is there a way to do that with less possibility of misuse?

>
>> The complexity it creates is really not necessary IMO.
>
> Unlike your misconfiguration example above, "unnecessary complexity"
> _is_ a valid argument. If there is no use for an option, it should not
> be allowed (or should be deleted).
>
> I am sure there are use cases for the pid_filename option that are as
> legitimate/valid as a use case of running concurrent instances of a
> single Squid build. The simplest one is "I want to run/test a Squid
> executable built by somebody else with a --prefix that does not suit my
> needs or access rights".

.pid is not governed directly by --prefix. The pid_filename and
--with-pidfile configure options actually get in the way of the above
use-case by allowing that other person to have built really weird
locations for individual things (aka hard-coded default pid_filename
value) forcing the user to configure yet another squid.conf option
(++annoyance).

The use-case for --with-pidfile is distribution packages with a non-FHS
location for .pid. However, in such cases the user is familiar with that
distro's path structure, and changing that layout is not in the user's
interest so much as making sure the path is somewhere they can access.
To me the best way there is the chroot directive again, not pid_filename,
since chroot fixes more path issues than just the .pid one.

>
>>> Then we have a broken process model somewhere.
>>
>> Yup. squid -k and system processes depending on having *the* PID file
>> pointing at a process that can reach the coordinator. Versus the PID
>> file pointing at a process isolated from the others due to per-worker
>> squid.conf settings.
>
> I do not understand the "isolated from the others due to per-worker"
> setting part. Is that related to a misconfiguration example?

Yes, misconfiguration due to adding one of these critical squid.conf
directives inside SMP if..endif config lines, or using ${process_*}
macros.
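
To make that failure mode concrete, here is a minimal sketch in Python of
what per-process macro expansion does to a single pid_filename value. It is
a toy model, not Squid's parser: expand_macros and pid_files are
hypothetical helpers, only ${process_number} and ${service_name} are
handled, and the kid numbers 1..3 are an arbitrary example.

```python
def expand_macros(value, process_number, service_name="squid"):
    """Expand the two macros this toy model knows about (hypothetical helper)."""
    return (value.replace("${process_number}", str(process_number))
                 .replace("${service_name}", service_name))


def pid_files(pid_filename, kids, service_name="squid"):
    """Return the distinct PID file paths that the given kid processes
    would each compute from one pid_filename setting."""
    return sorted({expand_macros(pid_filename, k, service_name) for k in kids})


if __name__ == "__main__":
    kids = [1, 2, 3]  # arbitrary example process numbers
    # A per-process macro yields one PID file per kid.
    print(pid_files("/var/run/squid/kid${process_number}.pid", kids))
    # A value that is static for the instance yields exactly one.
    print(pid_files("/var/run/${service_name}.pid", kids, "foo"))
```

In this model the ${process_number} form produces as many .pid files as
there are kids, while a value built only from the service name stays the
same for every process in the instance.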

> I do not
> think we should be discussing misconfiguration examples here. We all
> know it is possible to misconfigure Squid badly.

We need to know what can go wrong, and how easily, to judge accurately
which idea has the fewest problems. The fewer potential problems we can
push on users the better.

>
> I agree that the process model needs fixing in several places, but I do
> not know of any place that would mandate the PID file path prefix to be
> hard-coded.

"hard-coded" is being used a bit much here. I don't think any of us are
arguing for that.
Let's look at the BNF:

  FOO.pid = chroot pid_filename

  chroot = {squid.conf chroot} | "/"

  pid_filename = {squid.conf pid_filename} | DEFAULT_PID_FILE

  DEFAULT_PID_FILE = {./configure --with-pidfile} | ( PREFIX "/squid.pid" )

  PREFIX = {./configure --prefix} | "/"

We have no fewer than 4 configuration points for this one file, some of
which replace others, and some of which are joined together unless another
has been configured. I am proposing we can do with a simpler setup:

  FOO.pid = chroot DEFAULT_PID_PATH "/" service_name ".pid"

  chroot = {squid.conf chroot} | "/"

  service_name = {squid -n FOO} | "squid"

  DEFAULT_PID_PATH = {./configure --with-pidfile} | PREFIX

  PREFIX = {./configure --prefix} | "/"

Most of the configuration is in the path. The file name part is "static"
but configurable via the command line, with fewer of the per-worker problems
brought in by squid.conf flexibility (chroot still suffers those).
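
As a rough illustration only, the two grammars above can be modelled in a
few lines of Python. This is a sketch of the BNF as written here, not Squid
code: the function names and keyword arguments are made up for the example,
each argument stands in for one configuration point (None meaning "not
configured"), and in the proposed form --with-pidfile is assumed to name a
directory rather than a file.

```python
import posixpath


def _under(root, path):
    """Place an absolute path beneath root (models chroot-style prefixing)."""
    return posixpath.join(root, path.lstrip("/"))


def current_pid_path(conf_chroot=None, conf_pid_filename=None,
                     with_pidfile=None, prefix=None):
    """Current grammar: four configuration points, some replacing others."""
    prefix = prefix or "/"                                   # PREFIX
    default_pid_file = with_pidfile or _under(prefix, "squid.pid")
    pid_filename = conf_pid_filename or default_pid_file     # squid.conf wins
    return _under(conf_chroot or "/", pid_filename)


def proposed_pid_path(service_name="squid", conf_chroot=None,
                      with_pidfile=None, prefix=None):
    """Proposed grammar: the directory comes from build options and chroot,
    the file name comes only from the -n service name."""
    default_pid_path = with_pidfile or prefix or "/"         # DEFAULT_PID_PATH
    pid_file = posixpath.join(default_pid_path, service_name + ".pid")
    return _under(conf_chroot or "/", pid_file)


if __name__ == "__main__":
    # Two instances of one build differ only in the -n value.
    for name in ("foo", "test"):
        print(proposed_pid_path(name, prefix="/usr/local/squid"))
```

With this model, two instances started from the same build with "-n foo"
and "-n test" resolve to foo.pid and test.pid under the same directory
without any squid.conf change.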

Benefits:
  * multiple instances with same ./configure can run without clashing
    on .pid
  * multiple instances with default squid.conf can run without clashing
    on .pid
  * base path is configurable by user from squid.conf (via chroot)
  * default path is configurable by distributor (via --prefix or
    --with-pidfile)
  * script/code running Squid is where -n gets set, so if it needs access
    to foo.pid or test.pid separately it already knows whether it ran
    "-n foo" or "-n test".

So why exactly do we need to configure pid_filename?
  Does a user who is running under their local directory care whether it's
called squid.pid or squid3.pid or test.pid, when all the other overlaps
between the processes are collisions anyway? I don't think so.

>>>> It seems to me the only reason that is being configured
>>>> at all is to get around this same (bug 3608) problem of instance
>>>> identification,
>>>
>>> Not at all, not for me at least. It's about being able to relocate the
>>> Squid installation, i.e. for running it in a user home directory without
>>> needing to rebuild Squid from source each time you change location.
>>
>> Being able to relocate the running state directory from squid.conf
>> seems to me the scope of chroot directive rather than pid_filename.
>
> chroot is too heavy to be used in many cases. It may not even cover the
> primitive "I want to run/test a Squid executable built by somebody else
> with a --prefix that does not suit my needs. I have no sudo." case.
>

Perhaps we need to look into providing a lightweight jail/location
directive to solve that use-case rather than forcing the admin to know and
configure each component path individually.

>
>> What I am thinking for .pid is one of:
>>
>> A) replace the @DEFAULT_PID_FILE@ with --prefix based path and
>> ${service_name} macro for generating default filename.
>>
>> B) deprecate pid_filename in favour of directive pidfile_path and make
>> the name be ${pidfile_path}/${service_name}.pid
>>
>> C) remove pid_filename directive and use chroot directive and -n
>> command line instead.
>
> I do not understand the need for (B) and (C). So far, the only examples
> you gave where (B) and (C) might be needed are examples of misconfigured
> Squids. Are there any examples not involving a misconfigured Squid?
>

Examples of problems where everything is working fine? No, or it would
not be a problem description.

* When the admin is running a single instance of Squid with default
locations for their build, pid_filename is not needed, and not used.

* When the admin is running in a different setup to the build defaults,
pid_filename is a must, and configured. The problem is that it "must" be
configured, alongside all the other path directives. Squid _works_ but with
a lot of settings that really should not be mandatory / separate just to
change the location of the entire instance.

Amos
Received on Wed Jan 29 2014 - 00:29:28 MST