Re: squid-smp: synchronization issue & solutions

From: Robert Collins <robertc_at_robertcollins.net>
Date: Sat, 21 Nov 2009 16:59:43 +1100

On Tue, 2009-11-17 at 08:45 -0700, Alex Rousskov wrote:

> > Important features of OpenMP you might be interested in:
> >
> > ** If your compiler does not support OpenMP, you don't have to do
> > anything special: the compiler simply ignores the #pragmas and runs
> > the code as a sequential single-threaded program, without affecting
> > the end goal.

I don't think this is useful to us: all the platforms we consider
important have threading libraries. OpenMP compiler support does seem
widespread, though.
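
For reference, this is the kind of construct being described (purely
illustrative, not Squid code). Built with OpenMP the loop iterations are
split across threads; built without it the pragma is ignored and the
loop runs sequentially:

    #include <vector>

    // Illustrative only, not Squid code. With OpenMP the iterations
    // are divided among threads; without OpenMP support the #pragma
    // is ignored and this is an ordinary sequential loop.
    void scale(std::vector<double> &v, double factor)
    {
    #pragma omp parallel for
        for (long i = 0; i < static_cast<long>(v.size()); ++i)
            v[i] *= factor;
    }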
 
> > ** Programmers need not create any locking mechanism or worry
> > about critical sections,

We have to worry about this because:
 - OpenMP is designed for large data set manipulation
 - few of our datasets are large except for:
   - some ACLs
   - the main hash table

So we'll need separate threads created around large constructs like
'process a request' (unless we take a thread-per-CPU approach and a
queue of jobs). Either approach will require careful synchronisation on
the 20 or so shared data structures.
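
Roughly the shape I have in mind for the worker/queue variant (names
purely illustrative, not existing Squid classes). The queue's
mutex/condvar pair is one synchronisation point; every shared table a
job touches needs the same care:

    #include <pthread.h>
    #include <cstddef>
    #include <deque>

    // Illustrative only, not existing Squid code. One long-lived
    // worker per CPU pulls 'process a request'-sized jobs off a shared
    // queue protected by a mutex and condition variable.
    struct Job {
        virtual ~Job() {}
        virtual void run() = 0;
    };

    class JobQueue {
    public:
        JobQueue() {
            pthread_mutex_init(&mtx_, NULL);
            pthread_cond_init(&cv_, NULL);
        }

        void post(Job *job) {
            pthread_mutex_lock(&mtx_);
            jobs_.push_back(job);
            pthread_cond_signal(&cv_);
            pthread_mutex_unlock(&mtx_);
        }

        Job *take() {
            pthread_mutex_lock(&mtx_);
            while (jobs_.empty())
                pthread_cond_wait(&cv_, &mtx_);
            Job *job = jobs_.front();
            jobs_.pop_front();
            pthread_mutex_unlock(&mtx_);
            return job;
        }

    private:
        pthread_mutex_t mtx_;
        pthread_cond_t cv_;
        std::deque<Job *> jobs_;
    };

    // Each worker thread loops forever, taking and running jobs.
    static void *workerMain(void *arg) {
        JobQueue *queue = static_cast<JobQueue *>(arg);
        for (;;) {
            Job *job = queue->take();
            job->run();
            delete job;
        }
        return NULL;
    }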

> > ** By default it creates a number of threads equal to the number of
> > processors (* cores per processor) in your system.
>
> All of the above make me think that OPENMP-enabled Squid may be
> significantly slower than multi-instance Squid. I doubt OPENMP is so
> smart that it can correctly and efficiently orchestrate the work of
> Squid "threads" that are often not even visible/identifiable in the
> current code.

I think it could, if we had a shared-nothing model under the hood so
that we could 'simply' parallelise the front end dispatch and let
everything run. However, that doesn't really fit our problem.

> >> - Designed for parallelizing computation-intensive programs such as
> >> various math models running on massively parallel computers. AFAICT, the
> >> OpenMP steering group is comprised of folks that deal with such models
> >> in such environments. Our environment and performance goals are rather
> >> different.
> >>
> >
> > But that doesn't mean that we cannot have independent threads,
>
> It means that there is a high probability that it will not work well for
> other, very different, problem areas. It may work, but not work well enough.

I agree. From my reading OpenMP isn't really suited to our domain.
I've asked around a little and no one has said 'Yes! You should do it'.
The similar servers I know of, like Drizzle (MySQL), do not use it.

> >> I think our first questions should instead include:
> >>
> >> Q1. What are the major areas or units of asynchronous code execution?
> >> Some of us may prefer large areas such as "http_port acceptor" or
> >> "cache" or "server side". Others may root for AsyncJob as the largest
> >> asynchronous unit of execution. These two approaches and their
> >> implications differ a lot. There may be other designs worth considering.

I'd like to let people start writing (and perf testing!) patches, to
unblock people. I think the primary questions are:
 - Do we permit multiple approaches inside the same code base? E.g.
OpenMP in some bits, pthreads / windows threads elsewhere, and 'job
queues' or some such abstraction elsewhere.
    (I vote yes, but with caution: someone trying something we don't
already do should keep it on a branch and really measure it well until
it's got plenty of buy-in.)

 - If we do *not* permit multiple approaches, then what approach do we
want for parallelisation? E.g. a number of long-lived threads that take
on work, or many transient threads as particular bits of the code need
them. I favour the former (long-lived 'worker' threads; a concrete
sketch of starting them follows below).
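
To be concrete about the former (again illustrative, building on the
JobQueue sketch above): spawn one worker per CPU at startup and leave
them running for the life of the process. This assumes a platform that
supports _SC_NPROCESSORS_ONLN:

    #include <pthread.h>
    #include <unistd.h>

    // Illustrative only: start one long-lived worker thread per CPU,
    // all pulling from the same shared JobQueue.
    void startWorkers(JobQueue &queue) {
        long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
        for (long i = 0; i < (ncpus > 0 ? ncpus : 1); ++i) {
            pthread_t tid;
            pthread_create(&tid, NULL, workerMain, &queue);
            pthread_detach(tid);
        }
    }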

If we can reach either a 'yes' on the first of these two questions or a
decision on the second, then folk can start working on their favourite
part of the code base. As long as it's well tested and delivered with
appropriate synchronisation, I think the benefit of letting folk scratch
itches will be considerable.

I know you have processes vs threads as a key question, but I don't
actually think it is.

We *already* have significant experience with threads (the threaded disk
I/O engine) and multiple processes (the diskd I/O engine, helpers). We
shouldn't require a single answer for breaking Squid up; rather, good
analysis by the person doing the work on breaking a particular bit of it
up.

> > I am thinking about a hybrid of both...
> >
> > Somebody might implement the process model, then we would merge both
> > process and thread models... together we could have a better Squid.
> > :)
> > What do you think?
>
> I doubt we have the resources to do a generic process model so I would
> rather decide on a single primary direction (processes or threads) and
> try to generalize that later if needed. However, a process (if we decide
> to go down that route) may still have lower-level threads, but that is a
> secondary question/decision.

We could simply adopt ACE wholesale and focus on the Squid-specific bits
of the stack. Squid is a pretty typical 'all in one' bundle at the
moment; I'd like to see us focus more, and reuse or split out the
unrelated bits.

-Rob
