Re: squid-smp: synchronization issue & solutions

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 22 Nov 2009 00:12:50 +1300

Robert Collins wrote:
> On Tue, 2009-11-17 at 08:45 -0700, Alex Rousskov wrote:
>
>
>>> Important features of OpenMP you might be interested in...
>>>
>>> ** If your compiler does not support OpenMP then you don't have to do
>>> anything special; the compiler simply ignores these #pragmas
>>> and runs the code as a sequential single-threaded program,
>>> without affecting the end goal.
>
> I don't think this is useful to us: all the platforms we consider
> important have threading libraries. Support does seem widespread.
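
As a concrete illustration of the fallback behaviour described above (a
minimal sketch, not Squid code; the file name and loop are invented), an
OpenMP pragma degrades to a plain sequential loop when the compiler has no
OpenMP support, and otherwise defaults to one thread per core:

    // omp_sketch.cc -- illustrative only, not Squid code.
    // With OpenMP:    g++ -fopenmp omp_sketch.cc
    // Without OpenMP: g++ omp_sketch.cc   (the pragma is silently ignored
    //                 and the loop runs in a single thread)
    #include <cstdio>
    #include <vector>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    int main()
    {
        std::vector<int> data(1000000, 1);
        long sum = 0;

        // With OpenMP this splits the iterations across one thread per
        // core by default; without it, it is an ordinary sequential loop.
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < (long)data.size(); ++i)
            sum += data[i];

    #ifdef _OPENMP
        std::printf("OpenMP enabled, default threads: %d\n", omp_get_max_threads());
    #endif
        std::printf("sum = %ld\n", sum);
        return 0;
    }
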
>
>>> ** Programmers need not create any locking mechanism or worry
>>> about critical sections,
>
> We have to worry about this because:
> - OpenMP is designed for large data set manipulation
> - few of our datasets are large except for:
> - some ACLs
> - the main hash table
>
> So we'll need separate threads created around large constructs like
> 'process a request' (unless we take a thread-per-CPU approach and a
> queue of jobs). Either approach will require careful synchronisation on
> the 20 or so shared data structures.
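
To make that "careful synchronisation" concrete: a minimal C++11-style
sketch (the StoreIndex class and its methods are invented for illustration,
not Squid's real store API) of what guarding just one such shared structure
looks like once several request-processing threads touch it:

    // shared_index_sketch.cc -- illustrative only, not Squid code.
    #include <map>
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>

    class StoreIndex {
        std::map<std::string, std::string> entries_;
        std::mutex lock_;    // every shared structure needs some lock like this
    public:
        void insert(const std::string &key, const std::string &val) {
            std::lock_guard<std::mutex> guard(lock_);
            entries_[key] = val;
        }
        bool lookup(const std::string &key, std::string &out) {
            std::lock_guard<std::mutex> guard(lock_);
            std::map<std::string, std::string>::iterator i = entries_.find(key);
            if (i == entries_.end())
                return false;
            out = i->second;
            return true;
        }
    };

    int main()
    {
        StoreIndex index;
        std::vector<std::thread> workers;

        // several "process a request" threads all hitting the one shared index
        for (int t = 0; t < 4; ++t) {
            workers.push_back(std::thread([&index, t]() {
                index.insert("req" + std::to_string(t), "body");
                std::string val;
                index.lookup("req" + std::to_string(t), val);
            }));
        }
        for (size_t i = 0; i < workers.size(); ++i)
            workers[i].join();
        return 0;
    }

Multiply that by the twenty-odd shared structures and it is clear the
locking has to be designed up front, not sprinkled in afterwards.
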
>
>>> ** By default it creates a number of threads equal to the number of
>>> processors (* cores per processor) in your system.
>> All of the above make me think that OpenMP-enabled Squid may be
>> significantly slower than multi-instance Squid. I doubt OpenMP is so
>> smart that it can correctly and efficiently orchestrate the work of
>> Squid "threads" that are often not even visible/identifiable in the
>> current code.
>
> I think it could, if we had a shared-nothing model under the hood so
> that we could 'simply' parallelise the front end dispatch and let
> everything run. However, that doesn't really fit our problem.
>
>>>> - Designed for parallelizing computation-intensive programs such as
>>>> various math models running on massively parallel computers. AFAICT, the
>>>> OpenMP steering group is comprised of folks that deal with such models
>>>> in such environments. Our environment and performance goals are rather
>>>> different.
>>>>
>>> But that doesn't mean that we cannot have independent threads,
>> It means that there is a high probability that it will not work well for
>> other, very different, problem areas. It may work, but not work well enough.
>
> I agree. From my reading OpenMP isn't really suited to our domain.
> I've asked around a little and no one has said 'Yes! you should Do It'.
> The similar servers I know of, like Drizzle (MySQL), do not do it.
>
>>>> I think our first questions should instead include:
>>>>
>>>> Q1. What are the major areas or units of asynchronous code execution?
>>>> Some of us may prefer large areas such as "http_port acceptor" or
>>>> "cache" or "server side". Others may root for AsyncJob as the largest
>>>> asynchronous unit of execution. These two approaches and their
>>>> implications differ a lot. There may be other designs worth considering.
>
> I'd like to let people start writing (and perf testing!) patches, to
> unblock people. I think the primary questions are:
> - do we permit multiple approaches inside the same code base. E.g.
> OpenMP in some bits, pthreads / windows threads elsewhere, and 'job
> queues' or some such abstraction elsewhere ?
> (I vote yes, but with caution: someone trying something we don't
> already do should keep it on a branch and really measure it well until
> it's got plenty of buy-in).

I'm also in favor of the mixed approach, with care that the particular
approach taken at each point is appropriate for the operation being done.
For example, I wouldn't place each Call into a process, but a thread
each might be arguable; whereas a Job might be a process with multiple
threads, or a thread with async hops in time.

>
> - If we do *not* permit multiple approaches, then what approach do we
> want for parallelisation? E.g. a number of long-lived threads that take
> on work, or many transient threads as particular bits of the code need
> them. I favour the former (long-lived 'worker' threads).
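
The long-lived worker model, as a bare-bones sketch (again using C++11
primitives purely for brevity; in Squid this would sit on pthreads or our
own job/call machinery, and WorkerPool is an invented name):

    // worker_pool_sketch.cc -- illustrative only, not Squid code.
    #include <condition_variable>
    #include <cstdio>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class WorkerPool {
        std::queue<std::function<void()> > jobs_;
        std::mutex lock_;
        std::condition_variable wakeup_;
        std::vector<std::thread> workers_;
        bool quitting_;

        void run() {
            for (;;) {
                std::function<void()> job;
                {
                    std::unique_lock<std::mutex> guard(lock_);
                    while (jobs_.empty() && !quitting_)
                        wakeup_.wait(guard);
                    if (jobs_.empty())
                        return;        // quitting and nothing left to do
                    job = jobs_.front();
                    jobs_.pop();
                }
                job();   // runs outside the lock; must only touch its own state
            }
        }

    public:
        explicit WorkerPool(unsigned n) : quitting_(false) {
            // long-lived threads created once (e.g. one per core), not per request
            for (unsigned i = 0; i < n; ++i)
                workers_.push_back(std::thread(&WorkerPool::run, this));
        }
        void post(const std::function<void()> &job) {
            std::lock_guard<std::mutex> guard(lock_);
            jobs_.push(job);
            wakeup_.notify_one();
        }
        ~WorkerPool() {
            {
                std::lock_guard<std::mutex> guard(lock_);
                quitting_ = true;
            }
            wakeup_.notify_all();
            for (size_t i = 0; i < workers_.size(); ++i)
                workers_[i].join();
        }
    };

    int main()
    {
        WorkerPool pool(4);            // e.g. one worker per core
        for (int i = 0; i < 8; ++i)
            pool.post([i]() { std::printf("handled job %d\n", i); });
        return 0;                      // destructor lets workers drain the queue
    }

The transient-thread alternative replaces the queue with a thread
create/join per piece of work, which is simpler but pays thread start-up
cost on every request.
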
>
> If we can reach either a 'yes' on the first of these two questions or a
> decision on the second, then folk can start working on their favourite
> part of the code base. As long as it's well tested and delivered with
> appropriate synchronisation, I think the benefit of letting folk scratch
> itches will be considerable.
>
> I know you have processes vs threads as a key question, but I don't
> actually think it is.

I don't think so either. It sounds like a good question, but it's a choice
between two alternatives where the best alternative is number 3: both.

We _already_ have a mixed environment. The helpers and diskd/unlinkd are
perfect examples of the process model having been chosen for some small
internal units of Squid, while idns vs dnsserver is an example of the
other choice being made.

We are not deciding how to make Squid parallel, but how to make it
massively _more_ parallel than it already is.

>
> We *already* have significant experience with threads (threaded disk io
> engine) and multiple processes (diskd io engine, helpers). We shouldn't
> require a single answer for breaking Squid up, but rather good analysis by
> the person doing the work on breaking a particular bit of it up.
>
>
>>> I am thinking about a hybrid of both...
>>>
>>> Somebody might implement the process model, then we would merge the
>>> process and thread models... together we could have a better Squid.
>>> :)
>>> What do you think?
>> I doubt we have the resources to do a generic process model so I would
>> rather decide on a single primary direction (processes or threads) and
>> try to generalize that later if needed. However, a process (if we decide
>> to go down that route) may still have lower-level threads, but that is a
>> secondary question/decision.
>
> We could simply adopt ACE wholesale and focus on the squid unique bits
> of the stack. Squid is a pretty typical 'all in one' bundle at the
> moment; I'd like to see us focus and reuse/split out unrelated bits
> more.
>
> -Rob

I think we can open the doors earlier than that. I'm happy with an
approach that would see the smaller units of Squid growing in
parallelism to encompass two full cores.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
   Current Beta Squid 3.1.0.14
Received on Sat Nov 21 2009 - 11:13:05 MST
