Re: squid-smp: synchronization issue & solutions from Sachin Malave on 2009-11-17 (squid-dev)

From: Sachin Malave <sachinmalave_at_gmail.com>
Date: Tue, 17 Nov 2009 16:39:49 +0530

On Mon, Nov 16, 2009 at 9:43 PM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 11/15/2009 11:59 AM, Sachin Malave wrote:
>
>> For the last few days I have been analyzing the Squid code for SMP support. I found
>> one big issue regarding the debugs() function. It is very hard to get rid of
>> this issue, as it appears almost everywhere in the code. So, for
>> testing purposes, I have disabled the debug option in squid.conf as
>> follows:
>>
>> -------------------------------
>> debug_options 0,0
>> -------------------------------
>>
>> Well, this was the only way, as I did not want to spend time on this issue...
>
> You can certainly disable any feature as an intermediate step as long as
> the overall approach allows for the later efficient support of the
> temporary disabled feature. Debugging is probably the worst feature to
> disable though because without it we do not know much about Squid operation.
>
I agree, we should find a way to re-enable this feature. It is only
temporarily disabled.
Of course, locking debugs() was not the solution; that is why it is disabled.

>
>> Now concentrating on locking mechanism...
>
> I would not recommend starting with such low-level decisions as locking
> mechanisms. We need to decide what needs to be locked first. AFAIK,
> there is currently no consensus whether we start with processes or
> threads, for example. The locking mechanism would depend on that.
>

>
>> As the OpenMP library is widely supported by almost all platforms and
>> compilers, I am taking the locking mechanism from it.
>> Just include omp.h and compile the code with the -fopenmp option if using gcc.
>> Others may use something similar on their platform. Well, that is not a big
>> issue.

>
> After spending 2 minutes on openmp.org, I am not very excited about
> using OpenMP. Please correct me if I am wrong, but OpenMP seems to be:
>
> - An "approach" or "model" requiring compiler support and language
> extensions. It is _not_ a library. Your examples with #pragmas are a good
> illustration.
>

We have to use something to create and manage threads. There are
other libraries and models too, but I feel we need something that will
work on all platforms.
Some important features of OpenMP that you might be interested in:

** If your compiler does not support OpenMP, you do not have to do
anything special. The compiler simply ignores these #pragmas
and the code runs as a sequential, single-threaded program,
without affecting the end result.

** Programmers do not need to create their own locking mechanism or
worry about how critical sections are implemented.

** By default it creates as many threads as there are processors
(times cores per processor) in your system.

** Its fork-and-join model is scalable. (Of course, we must find
such areas in the existing code.)

** OpenMP is old but still growing, providing new features with new
releases. Think about other threading libraries: I think development
of some of them has stopped, some of them are not freely available, and
some of them are available only on Windows.

** IT IS FREE and OPEN-SOURCE like us..

** Intel has just released TBB (Threading Building Blocks), but I
doubt its performance on AMD (non-Intel) hardware.

** You might be thinking about good old pthreads, but I think OpenMP is
much safer and better than pthreads for programmers,

ESPECIALLY FOR ONE WHO IS MAKING CHANGES IN EXISTING CODE, and it is easy to debug.

Please think about my last point... :)
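
As a rough illustration of the first two points (this is not code from
the Squid tree), here is a minimal sketch with two hypothetical helpers,
sumWithOpenMP() and countEven(). Built with OpenMP enabled (for example
with gcc's -fopenmp), the loops run in parallel; built without it, the
pragmas are ignored and the same code runs sequentially. Note that the
shared counter still has to be marked with a directive: OpenMP supplies
the lock, but not the decision about what needs protecting.

```cpp
// Sketch only: hypothetical helpers, not part of Squid.
#include <cassert>
#include <vector>

// Sums the values. With OpenMP enabled the loop is split across threads
// and the per-thread partial sums are combined by the reduction clause.
// Without OpenMP the pragma is ignored and the loop runs sequentially.
long sumWithOpenMP(const std::vector<int> &values)
{
    long total = 0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += values[i];
    return total;
}

// Counts even values. The shared counter is updated inside a critical
// section, so only one thread at a time executes the increment.
long countEven(const std::vector<int> &values)
{
    long count = 0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        if (values[i] % 2 == 0) {
            #pragma omp critical
            ++count;
        }
    }
    return count;
}
```

Both functions return the same result with or without OpenMP; only the
number of threads doing the work changes.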

> - Designed for parallelizing computation-intensive programs such as
> various math models running on massively parallel computers. AFAICT, the
> OpenMP steering group is comprised of folks that deal with such models
> in such environments. Our environment and performance goals are rather
> different.
>

But that does not mean that we cannot have independent threads. The only
thing is that we have to start these threads in main(), because main()
never ends. Otherwise those independent threads will die once the
function that started them returns.
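
A minimal sketch of this idea, assuming two made-up workers (the names
runIndependentWorkers() and WorkerTotals are hypothetical): each OpenMP
section runs as an independent piece of work, and the parallel region
ends with an implicit join. That join is why long-lived workers would
have to be started from a function that does not return, such as main().

```cpp
// Sketch only: hypothetical workers, not Squid code.
struct WorkerTotals {
    int accepted;
    int served;
};

// Runs two independent loops, one per OpenMP section. Each section
// writes only its own variable, so no locking is needed here.
// Without OpenMP the sections simply run one after the other.
WorkerTotals runIndependentWorkers(int iterations)
{
    int accepted = 0;
    int served = 0;
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int i = 0; i < iterations; ++i)
                ++accepted; // stands in for "accept a connection"
        }
        #pragma omp section
        {
            for (int i = 0; i < iterations; ++i)
                ++served;   // stands in for "serve a request"
        }
    }
    // Implicit barrier: both sections have finished before we get here.
    return WorkerTotals{accepted, served};
}
```

In a real daemon the loops would run for the life of the process, so the
call would sit in main() and never reach the return statement.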

>
>> 1. hash_link ---- LOCKED
>>
>> 2. dlink_list ---- LOCKED
>>
>> 3. ipcache, fqdncache ---- LOCKED,
>>
>> 4. FD / fde handling --- WELL, SEEMS NOT TO BE CREATING PROBLEMS. If any, then
>> please discuss.
>>
>> 5. statistic counters --- NOT LOCKED (I know this is very important,
>> but these are scattered all around the Squid code. Right now they may be
>> holding wrong values)
>>
>> 6. memory manager --- DID NOT FOLLOW
>>
>> 7. configuration objects --- DID NOT FOLLOW
>
> I worry that the end result of this exercise would produce a slow and
> buggy Squid for several reasons:
>
> - Globally locking low-level but interdependent objects is likely to
> create deadlocks when two or more locked objects need to lock other
> locked objects in a circular fashion.
>

Is there any other option? As discussed, Amos is trying to make these
areas as independent as possible, so that we would have less locking
in the code.
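
One possible alternative to a full lock, at least for simple cases such
as the statistic counters in item 5 above, is an atomic update. The
sketch below is only an assumption about how that could look;
SketchCounters and recordRequests() are hypothetical names, not Squid
structures.

```cpp
// Sketch only: hypothetical counters, not Squid's statistics code.
#include <cassert>

struct SketchCounters {
    long requests = 0;
    long hits = 0;
};

// Records `total` requests, treating every `hitEvery`-th one as a hit.
// Each increment is a single atomic update, so no named lock is held
// and two counters can never deadlock against each other.
// Without OpenMP the pragmas are ignored and the loop is sequential.
SketchCounters recordRequests(int total, int hitEvery)
{
    assert(hitEvery > 0);
    SketchCounters counters;
    #pragma omp parallel for
    for (int i = 0; i < total; ++i) {
        #pragma omp atomic
        counters.requests += 1;
        if (i % hitEvery == 0) {
            #pragma omp atomic
            counters.hits += 1;
        }
    }
    return counters;
}
```

This only covers single scalar updates. Structures such as hash_link or
dlink_list change several fields at once and would still need a lock or
a design that keeps them private to one thread.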

> - Locking low-level objects without an overall performance-aware plan is
> likely to result in performance-killing competition for critical locks.
> I believe that with the right design, many locks can be avoided.
>
>
> I think our first questions should instead include:
>
> Q1. What are the major areas or units of asynchronous code execution?
> Some of us may prefer large areas such as "http_port acceptor" or
> "cache" or "server side". Others may root for AsyncJob as the largest
> asynchronous unit of execution. These two approaches and their
> implications differ a lot. There may be other designs worth considering.
>

See the sample code I sent in my last mail. There I separated
the schedule() and dial() functions, so that one thread registers
calls in AsyncCallQueue and another dispatches them.
We can concentrate on other areas as well.
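
For readers without the earlier mail, the split between scheduling and
dialing can be sketched roughly as follows. This is not the sample code
referred to above and not Squid's AsyncCallQueue; SketchCallQueue,
dialOne() and runScheduleAndDial() are hypothetical names, and the
busy-waiting dispatcher is a simplification.

```cpp
// Sketch only: a toy call queue, not Squid's AsyncCallQueue.
#include <functional>
#include <queue>
#include <utility>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

static int sketchThreadIndex()
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

static int sketchTeamSize()
{
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

class SketchCallQueue {
public:
    // Registers a call; the named critical section guards the queue.
    void schedule(std::function<void()> call)
    {
        #pragma omp critical(sketch_call_queue)
        calls_.push(std::move(call));
    }

    // Tells the dispatcher that no more calls will be scheduled.
    void finish()
    {
        #pragma omp critical(sketch_call_queue)
        finished_ = true;
    }

    // Dials at most one call. Returns false once the queue is empty
    // and finish() has been called.
    bool dialOne()
    {
        std::function<void()> call;
        bool keepGoing = true;
        #pragma omp critical(sketch_call_queue)
        {
            if (!calls_.empty()) {
                call = std::move(calls_.front());
                calls_.pop();
            } else if (finished_) {
                keepGoing = false;
            }
        }
        if (call)
            call(); // run the call outside the critical section
        return keepGoing;
    }

private:
    std::queue<std::function<void()>> calls_;
    bool finished_ = false;
};

// Thread 0 schedules `count` calls; thread 1 dials them. If only one
// thread is available (or OpenMP is disabled), the same thread
// schedules everything first and then dials.
std::vector<int> runScheduleAndDial(int count)
{
    SketchCallQueue queue;
    std::vector<int> dialed; // touched only by the dialing thread
    #pragma omp parallel num_threads(2)
    {
        const int me = sketchThreadIndex();
        const int team = sketchTeamSize();
        if (me == 0) {
            for (int i = 0; i < count; ++i)
                queue.schedule([&dialed, i] { dialed.push_back(i); });
            queue.finish();
        }
        if (me == 1 || team == 1) {
            while (queue.dialOne()) {
                // busy-wait until the scheduler has finished
            }
        }
    }
    return dialed;
}
```

Calls are dialed in the order they were scheduled, so
runScheduleAndDial(3) yields the values 0, 1, 2. A real dispatcher would
block instead of spinning when the queue is empty.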

> Q2. Threads versus processes. Depending on Q1, we may have a choice. The
> choice will affect the required locking mechanism and other key decisions.
>

If you are planning to use processes, then it is as good as running
multiple Squids on a single machine; the only thing is that they must
accept requests on different ports. But if we want to distribute a
single Squid's work, then I feel threading is the best choice.

I AM THINKING ABOUT A HYBRID OF BOTH...

Somebody might implement the process model; then we would merge the
process and thread models. Together we could have a better Squid.
:)
What do you think?

>
> Thank you,
>
> Alex.
>
>

-- 
Mr. S. H. Malave
Computer Science & Engineering Department,
Walchand College of Engineering,Sangli.
sachinmalave_at_wce.org.in

Received on Tue Nov 17 2009 - 11:09:57 MST
