Re: squid-smp: synchronization issue & solutions

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 17 Nov 2009 08:45:51 -0700

On 11/17/2009 04:09 AM, Sachin Malave wrote:

>> After spending 2 minutes on openmp.org, I am not very excited about
>> using OpenMP. Please correct me if I am wrong, but OpenMP seems to be:
>>
>> - An "approach" or "model" requiring compiler support and language
>> extensions. It is _not_ a library. Your examples with #pragmas are a good
>> illustration.

> Important features of OpenMP that you might be interested in:
>
> ** If your compiler does not support OpenMP, you don't have to do
> anything special. The compiler simply ignores these #pragmas
> and runs the code as a sequential, single-threaded program,
> without affecting the end result.
>
> ** Programmers do not need to create any locking mechanism or worry
> about critical sections.
>
> ** By default, it creates a number of threads equal to the number of
> processors (times cores per processor) in your system.

All of the above make me think that OpenMP-enabled Squid may be
significantly slower than multi-instance Squid. I doubt OpenMP is so
smart that it can correctly and efficiently orchestrate the work of
Squid "threads" that are often not even visible/identifiable in the
current code.
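
For illustration only, here is a minimal sketch of the kind of loop the
#pragma model is aimed at. It is not taken from Squid or from the sample
code discussed in this thread; totalBytes() is a hypothetical helper. With
OpenMP enabled (for example, GCC's -fopenmp), the iterations are split among
threads. Without it, the pragma is ignored and the loop runs sequentially
with the same result.

```cpp
#include <vector>

// Hypothetical helper, not Squid code: sums a list of object sizes.
// The reduction clause gives each thread a private partial sum and
// combines them at the end; without it, the shared "total" would be
// updated by several threads at once (a data race).
long long totalBytes(const std::vector<long long> &sizes)
{
    long long total = 0;
    const long n = static_cast<long>(sizes.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += sizes[i];
    return total;
}
```

Note that the loop has independent, countable iterations, which is what
makes it easy to parallelize this way. Event-driven code with callbacks
scattered across modules does not obviously have that shape.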

>> - Designed for parallelizing computation-intensive programs such as
>> various math models running on massively parallel computers. AFAICT, the
>> OpenMP steering group is comprised of folks that deal with such models
>> in such environments. Our environment and performance goals are rather
>> different.
>>
>
> But that doesn't mean that we cannot have independent threads.

It means that there is a high probability that it will not work well for
other, very different, problem areas. It may work, but not work well enough.

>> I think our first questions should instead include:
>>
>> Q1. What are the major areas or units of asynchronous code execution?
>> Some of us may prefer large areas such as "http_port acceptor" or
>> "cache" or "server side". Others may root for AsyncJob as the largest
>> asynchronous unit of execution. These two approaches and their
>> implications differ a lot. There may be other designs worth considering.
>>
>
> See the sample code I sent in my last mail. There I have separated out
> the schedule() and dial() functions, so that one thread is registering
> calls in AsyncCallQueue and another is dispatching them.
> We can concentrate on other areas as well.

schedule() and dial() are low level routines that are irrelevant for Q1.
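
To make the distinction concrete, here is a rough sketch of what a
schedule/dial split looks like at the queue level. ToyCallQueue and
demoScheduleThenDial() are hypothetical names invented for this example;
this is NOT Squid's AsyncCallQueue and does not reproduce the sample code
referred to above. It assumes a C++11 compiler with thread support.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical toy queue; NOT Squid's AsyncCallQueue.
class ToyCallQueue
{
public:
    using Call = std::function<void()>;

    // Register a call; may be invoked from any thread.
    void schedule(Call call)
    {
        std::lock_guard<std::mutex> guard(lock_);
        calls_.push(std::move(call));
    }

    // Run queued calls in FIFO order until the queue is empty.
    // Returns the number of calls that were run.
    int dialAll()
    {
        int fired = 0;
        for (;;) {
            Call call;
            {
                std::lock_guard<std::mutex> guard(lock_);
                if (calls_.empty())
                    return fired;
                call = std::move(calls_.front());
                calls_.pop();
            }
            call(); // run outside the lock so a call may schedule more calls
            ++fired;
        }
    }

private:
    std::mutex lock_;
    std::queue<Call> calls_;
};

// One thread schedules three calls; the calling thread dials them afterwards.
int demoScheduleThenDial()
{
    ToyCallQueue queue;
    int sum = 0;
    std::thread producer([&queue, &sum] {
        for (int i = 1; i <= 3; ++i)
            queue.schedule([&sum, i] { sum += i; });
    });
    producer.join();
    queue.dialAll();
    return sum;
}
```

The lock protects only the queue itself. It says nothing about which
larger units of Squid may run concurrently, or about the state that the
queued calls touch, and that is the subject of Q1.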

>> Q2. Threads versus processes. Depending on Q1, we may have a choice. The
>> choice will affect the required locking mechanism and other key decisions.
>>
>
> If you are planning to use processes, then it is as good as running
> multiple Squids on a single machine.

I am not planning to use processes yet, but if they are indeed as good
as running multiple Squids, that is a plus. Hopefully, we can do better
than multi-instance Squid, but we should be at least as bad/good.

> The only thing is that they must be
> accepting requests on different ports. But if we want to distribute
> a single Squid's work, then I feel threading is the best choice.

You can have a process accepting a request and then forwarding the work
to another process or receiving a cache hit from another process.
Inter-process communication is slower than inter-thread communication,
but it is not impossible.
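
As a rough illustration of the hand-off described above, the sketch below
has one process pass a request string to a second process over a Unix
socket pair and read back a reply. It assumes a POSIX system. The function
names and the "worker saw: " reply text are made up for the example, and
nothing here reflects an actual or planned Squid design.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// Read from fd until end-of-file or error.
static std::string readAll(int fd)
{
    std::string data;
    char buf[256];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        data.append(buf, static_cast<size_t>(n));
    return data;
}

// Hypothetical sketch: the calling ("acceptor") process forwards a request
// to a forked "worker" process and returns the worker's reply.
// Returns an empty string on failure.
std::string forwardToWorker(const std::string &request)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return "";

    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }

    if (pid == 0) {
        // Worker: read the whole request, send a reply, exit.
        close(fds[0]);
        const std::string reply = "worker saw: " + readAll(fds[1]);
        const ssize_t sent = write(fds[1], reply.data(), reply.size());
        close(fds[1]);
        _exit(sent < 0 ? 1 : 0);
    }

    // Acceptor: send the request, signal end of request, wait for the reply.
    close(fds[1]);
    const ssize_t sent = write(fds[0], request.data(), request.size());
    shutdown(fds[0], SHUT_WR); // worker's readAll() now sees end-of-file
    const std::string reply = (sent < 0) ? std::string() : readAll(fds[0]);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return reply;
}
```

Every request and reply here is copied through the kernel, which is where
the extra cost relative to inter-thread communication comes from.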

> I AM THINKING ABOUT HYBRID OF BOTH...
>
> Somebody might implement the process model, and then we could merge the
> process and thread models. Together we could have a better Squid.
> :)
> What do you think?

I doubt we have the resources to do a generic process model so I would
rather decide on a single primary direction (processes or threads) and
try to generalize that later if needed. However, a process (if we decide
to go down that route) may still have lower-level threads, but that is a
secondary question/decision.

Cheers,

Alex.
Received on Tue Nov 17 2009 - 15:44:42 MST
