Re: squid-smp from Sachin Malave on 2009-10-15 (squid-dev)

From: Sachin Malave <sachinmalave_at_gmail.com>
Date: Thu, 15 Oct 2009 06:46:21 -0400

On Thu, Oct 15, 2009 at 5:56 AM, Adrian Chadd <adrian_at_squid-cache.org> wrote:
> Oh, I can absolutely give you guys food for thought. I was just hoping
> someone else would already try to do a bit of legwork.
>
> Things to think about:
>
> * Do you really, -really- want to reinvent the malloc wheel? This is
> separate from caching results and pre-made class instances. There's
> been a lot of work in well-performing, thread-aware malloc libraries
> * Do you want to run things in multiple processes or multiple threads?
> Or support both?
> * How much of the application do you want to push out into separate
> threads? run lots of "copies" of Squid concurrently, with some locking
> going on? Break up individual parts of the processing pipeline into
> threads? (eg, what I'm going to be experimenting with soon - handling
> ICP/HTCP in a separate thread for some basic testing)
> * Survey the current codebase and figure out what depends upon what -
> in a way that you can use for figuring out what needs to be made
> re-entrant and what may need locking. Think about how to achieve all
> of this. Best example of this - you're going to need to figure out how
> to do concurrent debug logging and memory allocation - so see what
> that code uses, what that code's code uses, etc.
> * 10GE cards are dumping individual PCIe channels to CPUs; which means
> that the "most efficient" way of pumping data around will be to
> somehow throw individual connections onto specific CPUs, and keep them
> there. There's no OS support for this yet, but OSes may be "magical"
> (ie, handing you sockets in specific threads via accept() and hoping
> that the NIC doesn't reorganise its connection->PCIe channel hash)
> * Do you think it's worth being able to migrate specific connections
> between threads? Or once they're in a thread they're there for good?
> * If you split up squid into "lots of threads running the whole app",
> what and where would you envisage locking and blocking? What about
> data sharing? How would that scale given a handful of example
> workloads? What about in abnormal situations? How well will things
> degrade?
> * What about using message passing and message queues? Where would it
> be appropriate? Where wouldn't it be appropriate? Why?
>
> Here's an example:
>
> * Imagine you're doing store lookups using message passing with your
> "store" being a separate thread with a message queue. Think about how
> you'd handle say, ICP peering between two caches doing > 10,000
> requests a second. What repercussions does that have for the locking
> of the message queues between other threads? What are the other
> threads doing?
>
> With that in mind, survey the kinds of ways that current network apps
> "do" threading:
>
> * look at the various ways apache does it - eg, the per-connection
> thread+process hybrid model, the event-worker thread model, etc

Yes, I think that if we have a loosely coupled architecture
(distributed or multiprocessor), then it is better to use processes;
otherwise, on a multi-core platform, the threading model can be used.
(I am targeting multi-core.)

> * look at memcached - one thread doing accept'ing, farming requests
> off to other threads that just run a squid-like event loop. Minimal
> inter-thread communication for the most part
> * investigate what the concurrency hooks for various frameworks do -
> eg, the boost asio library stuff has "colours" which you mark thread
> events with. These colours dictate which events need to be run
> sequentially and which can run in parallel
> * look at all of the random blogs written by windows networking coders
> - they're further ahead of the massively-concurrent network
> application stack because Windows has had it for a number of years.
>

OK! The questions you have raised will be considered when creating
threads or processes.
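
To make the message-queue question above concrete, here is a minimal
sketch of a queue that one thread (say, a "store" thread) could drain
while other threads push lookups into it. It is only an illustration
under my own assumptions, not code from the Squid tree: the names
msg_queue, mq_init, mq_push, mq_pop and the capacity MQ_CAP are all
hypothetical, and it assumes POSIX threads.

```c
#include <pthread.h>
#include <stddef.h>

#define MQ_CAP 8 /* hypothetical fixed capacity, chosen for the sketch */

/* A bounded FIFO protected by one mutex and two condition variables. */
struct msg_queue {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    pthread_cond_t  not_full;
    void  *items[MQ_CAP];
    size_t head;   /* index of the oldest item */
    size_t count;  /* number of items currently queued */
};

void mq_init(struct msg_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    q->head = 0;
    q->count = 0;
}

/* Producer side: blocks while the queue is full. */
void mq_push(struct msg_queue *q, void *item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == MQ_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % MQ_CAP] = item;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: blocks while the queue is empty. */
void *mq_pop(struct msg_queue *q)
{
    void *item;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    item = q->items[q->head];
    q->head = (q->head + 1) % MQ_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

Every push and pop here takes the same mutex, so with many producer
threads this single lock is exactly where the contention you describe
for > 10,000 requests a second would show up. Per-thread queues or
batching several messages per lock acquisition are possible ways
around it, but I have not measured either.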

> Now. You've mentioned you've looked at the others and you think major
> replumbing is going to be needed. Here's a hint - it's going to be
> needed. Thinking you can avoid it is silly. Figuring out what you can
> do right now that doesn't lock you into a specific trajectory is -not-
> silly. For example, figuring out what APIs need to be changed to make
> them re-entrant is not silly. Most of the functions in lib/ that return
> static char buffers need to be changed. That can be done
> -now- without having to lock yourself into a particular concurrency
> model.
>
> 2c,

Thank you. :)
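
As I understand the point about static char buffers, the change would
look roughly like the sketch below. This is an assumption for
illustration only: fmt_ipv4_static() and fmt_ipv4_r() are hypothetical
names, not actual functions from Squid's lib/. The first version hands
back a shared static buffer; the second writes into storage owned by
the caller, so it carries no hidden shared state.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical "before": the result lives in one static buffer, so a
 * second call (or a call from another thread) overwrites the first. */
const char *fmt_ipv4_static(uint32_t addr)
{
    static char buf[16];

    snprintf(buf, sizeof(buf), "%u.%u.%u.%u",
             (unsigned)((addr >> 24) & 0xff),
             (unsigned)((addr >> 16) & 0xff),
             (unsigned)((addr >> 8) & 0xff),
             (unsigned)(addr & 0xff));
    return buf;
}

/* Hypothetical "after": the caller supplies the buffer.
 * Returns buf on success, or NULL if len is too small. */
char *fmt_ipv4_r(uint32_t addr, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%u.%u.%u.%u",
                     (unsigned)((addr >> 24) & 0xff),
                     (unsigned)((addr >> 16) & 0xff),
                     (unsigned)((addr >> 8) & 0xff),
                     (unsigned)(addr & 0xff));

    if (n < 0 || (size_t)n >= len)
        return NULL; /* output error or truncated result */
    return buf;
}
```

With the second form, two callers can each format an address into
their own char[16] at the same time without locking, which is why
this kind of change does not depend on whether we end up with threads,
processes, or both.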

>
>
>
> Adrian
>
> 2009/10/15 Amos Jeffries <squid3_at_treenet.co.nz>:
>> Adrian Chadd wrote:
>>>
>>> 2009/10/15 Sachin Malave <sachinmalave_at_gmail.com>:
>>>
>>>> It's not that we want to make the project worse. Squid was not
>>>> deployed on SMP before because we did not have shared-memory
>>>> architectures (multi-cores), and library support for multi-threading
>>>> was a nightmare for people. Now things have changed: it is very easy
>>>> to manage threads, people have multi-core machines on their desktops,
>>>> and since the hardware is available, sooner or later somebody has to
>>>> try to build SMP support. Think about the future.
>>>>
>>>> To cope with internet speeds and the increase in the number of users,
>>>> Squid must use multi-core architectures and distribute its work.
>>>
>>> I 100% agree with your comments. I agree 100% that Squid needs to be
>>> made scalable on multi-core boxes.
>>>
>>> Writing threaded code may be easier now than in the past, but the ways
>>> of screwing stability, debuggability, performance and such -haven't-
>>> changed. This is what I'm trying to get across. :)
>>
>> Aye, understood. Which is why I've made sure all this discussion is done in
>> squid-dev. So those like yourself who might have anything to point at as
>> good/bad examples can do so.
>>
>> Sure, Squid can be re-written from the ground up yet again. But none of us
>> want the ten year delay that will cause. The answer is either to drop
>> eight years of improvements and use the Squid-2 code, or to go ahead with
>> a somewhat incompletely upgraded Squid-3 code, leveraging some of the SMP
>> work to further upgrade the remaining sections while just slipping SMP
>> into the currently upgraded components.
>>
>> Do you actually have any relevant implementations you in your infinite
>> wisdom and foresight want to point us at? Or just diss us for not knowing
>> enough?
>>
>> I'm already aware of the overall models Varnish, Oops, Apache, Polipo,
>> and Nginx are documented as using. Without looking at the code it's clear
>> that their approaches are not beneficial to Squid without major re-plumbing.
>>
>> The solution we have to use is a mix, possibly unique to Squid, which
>> retains Squid's features and niche coverage. The right mix of tools for each
>> task to be performed: child processes, IPC, and events. Now adding threads
>> for the pieces that are applicable. There is order in the chaos.
>>
>> Amos
>> --
>> Please be using
>> Current Stable Squid 2.7.STABLE7 or 3.0.STABLE19
>> Current Beta Squid 3.1.0.14
>>
>>
>

-- 
Mr. S. H. Malave
Computer Science & Engineering Department,
Walchand College of Engineering,Sangli.
sachinmalave_at_wce.org.in

Received on Thu Oct 15 2009 - 10:46:28 MDT
