Re: SMP: process-specific options

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 21 Feb 2010 21:52:58 +1300

Alex Rousskov wrote:
> On 02/20/2010 09:24 PM, Amos Jeffries wrote:
>> On Sat, 20 Feb 2010 19:14:48 -0700, Alex Rousskov
>> <rousskov_at_measurement-factory.com> wrote:
>>> Hello,
>>>
>>> If you recall, I am working on Squid that starts multiple processes,
>>> each doing similar things. Even with this simple design, folks want to
>>>
>>> (a) have differently configured processes (e.g., a process that is
>>> dedicated to a given http_port or even a cache_dir option); and
>>>
>>> (b) bind processes to specific CPU cores (i.e., support CPU affinity)
>>>
>>>
>>> I propose the following configuration approach that I think is simple to
>>> implement but allows a lot of flexibility:
>>>
>>> 1. Each forked process gets a unique process name, which is just a
>>> number from 1 to N. The process knows its name. If a forked process dies
>>> and is reforked, the reforked process keeps the original name.
>>>
>>> 2. The squid.conf parser substitutes ${process_name} strings with the
>>> process name doing the parsing. This substitution is performed before
>>> individual options are parsed.
>>>
>>> 3. The squid.conf parser supports if-statement blocks. Each if-statement
>>> must start on its own line (as if there is an option called "if"). Each
>>> if-statement block ends with "endif" on its own line (as if there is an
>>> "endif" option). The only two supported conditions for now are a simple
>>> comparison:
>>>
>>> if ${process_name} = 1
>>> ... regular squid.conf options for the first forked process ....
>>> ... regular squid.conf options for the first forked process ....
>>> ... regular squid.conf options for the first forked process ....
>>> endif
>>>
>>> and a set membership test, for when we want to specify options for
>>> multiple processes:
>>>
>>> if ${process_name} in {1,7,8}
>>> ... regular squid.conf options for the selected forked process ....
>>> ... regular squid.conf options for the selected forked process ....
>>> ... regular squid.conf options for the selected forked process ....
>>> endif
>>>
>>> If the condition is false, the parser skips all regular squid.conf
>>> options inside the block until the matching endif. Otherwise, the parser
>>> behaves as if the if-statement was not there.
>>>
>>> This approach supports process-specific options without rewriting the
>>> existing options or the squid.conf parser. I think the implementation is
>>> straightforward, even if we want to support nested if-statements. We
>>> just push the current if-statement condition on stack and either skip or
>>> honor options until we find endif and pop the current condition.
>>>
>>> As a side effect, we can use the same if-statement approach to quickly
>>> disable large portions of the configuration file using conditions that
>>> are always false.
>>>
>>>
>>> 4. CPU affinity is supported using a new cpu_affinity option that
>>> specifies either a single CPU core ID (1..C) or the affinity mask:
>>>
>>> # start this (and each) process on its own core:
>>> cpu_affinity core=${process_name}
>>>
>>> # use any first four cores for this (and each) process:
>>> cpu_affinity mask=0xF
>>>
>>> # place process5 on CPU core1:
>>> if ${process_name} = 5
>>> cpu_affinity core=1
>>> endif
>>>
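
To make the two cpu_affinity forms concrete, here is a minimal sketch of how "core=N" and "mask=0x.." might be reduced to a set of CPUs. The function names are hypothetical, and mapping the proposed 1-based core IDs onto zero-based OS CPU indexes is my assumption, not something the proposal states. os.sched_setaffinity() is only available on some platforms (notably Linux), hence the guard.

```python
import os


def parse_cpu_affinity(value):
    """Turn 'core=N' or 'mask=0x..' into a set of zero-based CPU indexes.

    Assumption: the proposal numbers cores 1..C, so core N maps to index N-1.
    """
    key, _, arg = value.partition("=")
    if key == "core":
        core = int(arg)
        if core < 1:
            raise ValueError("core IDs start at 1: %r" % value)
        return {core - 1}
    if key == "mask":
        mask = int(arg, 0)  # base 0 accepts both 0xF and plain decimal
        if mask <= 0:
            raise ValueError("affinity mask must select a CPU: %r" % value)
        # Bit k set in the mask selects CPU index k.
        return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}
    raise ValueError("expected core=N or mask=M, got %r" % value)


def apply_cpu_affinity(value):
    """Pin the calling process to the parsed CPUs where the OS supports it."""
    cpus = parse_cpu_affinity(value)
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

With this mapping, "core=${process_name}" in process 5 would select CPU index 4, and "mask=0xF" would select indexes 0 through 3.
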
>>>
>>> Any objections, improvement suggestions, or better ideas?
>>>
>>> Thank you,
>>>
>>> Alex.
>> +1.
>> Matches almost exactly something I've had in the back of my head for a
>> short while. :)
>>
>> Just on the config UI, the numeric process numbering doesn't really suit
>> the implications of using "_name" tag in the variable. I think it better
>> suits a more generic ${process}, which we can make into anything at a later
>> date.
>
> I agree that number and name do not match well. I struggled with this. I
> did not want to use process_id to avoid the clash with system PID. Just
> "process" sounds too generic and difficult to extend though. Would
> process_number be better than process_name? Any other ideas?
>
>> Other things from my prior thoughts about this design is that it implies
>> squid-N.pid, and cache-N.log (for now) files with N being the process
>> number/name.
>
> I am not sure it implies that, actually (even for now), but it certainly
> would be an option.

Under the current architecture the child's PID is the one recorded in the
.pid file, and its exit code determines whether the master re-spawns it or
shuts down. I was previously working from that angle.

If we invert the current design and have master being the one .pid entry...

... you would be altering the .pid to hold the master PID, which somehow
receives the PID for each child back from the child and shuts off the
children when it gets a shutdown signal.
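
A rough sketch of that inverted arrangement, only to show the bookkeeping: the master keeps a table from process number to child, and stops every child on shutdown. The Master class and WORKER_CMD are hypothetical stand-ins (a sleeping Python child plays the worker). It assumes the master spawns the workers directly, in which case the spawn call itself hands the child PID back to the parent and no separate channel is needed. A worker that detaches or re-forks would break that assumption.

```python
import subprocess
import sys

# Stand-in for a real worker: a child that sleeps until it is terminated.
WORKER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


class Master:
    """Hypothetical master that owns the single .pid entry and tracks workers."""

    def __init__(self, worker_count):
        self.worker_count = worker_count
        self.workers = {}  # process number (1..N) -> subprocess.Popen

    def spawn(self, number):
        # The parent learns the child's PID directly from the spawn call.
        self.workers[number] = subprocess.Popen(WORKER_CMD)
        return self.workers[number].pid

    def start(self):
        for number in range(1, self.worker_count + 1):
            self.spawn(number)

    def respawn_dead(self):
        """Restart exited workers; each keeps its original process number."""
        restarted = []
        for number, proc in list(self.workers.items()):
            if proc.poll() is not None:
                self.spawn(number)
                restarted.append(number)
        return restarted

    def shutdown(self):
        """What the master would do on a shutdown signal: stop every child."""
        for proc in self.workers.values():
            if proc.poll() is None:
                proc.terminate()
        for proc in self.workers.values():
            proc.wait()
        self.workers.clear()
```

Calling Master(4).start() would leave four children running until shutdown() is called. A real master would also write its own PID to the .pid file and call shutdown() from its signal handler.
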

cache.log depends on how you plan to re-plumb the linkage between master
and worker. The current silent process monitoring method is iffy in the
middle.
  What were you planning for monitoring? socket read/write events?

If you have each instance completely independent, with its own
current-style monitor talking to a master instance, it's easy to plumb
cache.log through the link socket, but _that_ means that PIDs are harder
to manage as the worker child is not connected to the sockets directly.

This is where I get stuck in the what-ifs and am looking forward to
hearing your ideas/plans on that bit.

>
>> IMHO, it's a toss-up whether the exact syntax wants to follow the Apache2
>> <If ...> ... </if> syntax. What are your feelings on making the config more
>> apache-like?
>
> I think Apache httpd syntax is terrible for humans and does not match
> well the current squid.conf syntax either. If you do not like if/endif,
> we can use C-style {curly braces} to mark the block instead. Would you
> prefer that?

IMO that would be uglier and less clear than even the apache way :)

if...endif does make the right kind of sense, however it is syntaxed.
Okay, let's go with your initial proposal.
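
For reference, the substitution plus if/endif handling from the proposal can be sketched as a small pre-pass over the config lines. This is an illustration under stated assumptions, not the planned parser: preprocess() and evaluate() are hypothetical names, blank lines are dropped for brevity, and only the two proposed condition forms ("A = B" and "A in {x,y,z}") are recognised. Nesting uses the stack idea from the proposal: push each condition, honor options only while every open condition is true, pop on endif.

```python
import re


def evaluate(condition):
    """Evaluate 'A = B' or 'A in {x,y,z}' after macro substitution."""
    match = re.fullmatch(r"(\S+)\s*=\s*(\S+)", condition)
    if match:
        return match.group(1) == match.group(2)
    match = re.fullmatch(r"(\S+)\s+in\s+\{([^}]*)\}", condition)
    if match:
        members = [item.strip() for item in match.group(2).split(",")]
        return match.group(1) in members
    raise ValueError("unsupported if-condition: %r" % condition)


def preprocess(lines, process_name):
    """Return the option lines that apply to the given process."""
    stack = []  # one bool per open if-block: did its condition hold?
    kept = []
    for raw in lines:
        # Substitute before anything else looks at the line.
        line = raw.replace("${process_name}", str(process_name)).strip()
        if not line:
            continue
        words = line.split(None, 1)
        if words[0] == "if":
            stack.append(evaluate(words[1] if len(words) > 1 else ""))
        elif line == "endif":
            if not stack:
                raise ValueError("endif without a matching if")
            stack.pop()
        elif all(stack):  # true when no block is open
            kept.append(line)
    if stack:
        raise ValueError("if without a matching endif")
    return kept
```

Given a config with "if ${process_name} = 1" around a cache_dir line, process 1 would see that line and every other process would skip it. An always-false condition such as "if 0 = 1" gives the comment-out side effect mentioned in the proposal.
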

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE7 or 3.0.STABLE24
   Current Beta Squid 3.1.0.16
Received on Sun Feb 21 2010 - 08:53:21 MST
