Re: [squid-users] Squid-2, Squid-3, roadmap

From: Mark Nottingham <mnot@dont-contact.us>
Date: Fri, 7 Mar 2008 10:33:30 +1100

Ideally, you'd avoid locking as much as possible; e.g., have a pool of
threads for disk access (as now with aufs), a pool for header parsing,
a pool for forward requests, and so on. I don't think it's a good idea
at all to re-architect squid into a thread-per-connection model or
anything; just find the places that are bottlenecks and allow some
parallelism, keeping the number of threads low.

(says he, the non-threads programmer. I'm not *that* crazy...)
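
A rough sketch of the pool-per-stage idea, in Python purely for brevity
(Squid itself is C/C++). Every name here (StagePool, parse_headers,
run_demo) is made up for illustration and is not Squid code. The main
loop stays single-threaded and shares no state with the workers; it only
hands jobs to a small pool through a queue and collects results from a
completion queue, so the queues are the only synchronisation points.

```python
import queue
import threading


class StagePool:
    """A small fixed pool of worker threads fed by a single inbox queue."""

    def __init__(self, name, handler, completions, workers=2):
        self.handler = handler          # function run by the workers
        self.completions = completions  # results go back to the main loop here
        self.inbox = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, name=f"{name}-{i}", daemon=True)
            for i in range(workers)
        ]
        for t in self.threads:
            t.start()

    def submit(self, job_id, item):
        """Called from the main loop; never blocks on the work itself."""
        self.inbox.put((job_id, item))

    def _run(self):
        while True:
            job = self.inbox.get()
            if job is None:             # sentinel: shut this worker down
                break
            job_id, item = job
            self.completions.put((job_id, self.handler(item)))

    def close(self):
        for _ in self.threads:
            self.inbox.put(None)
        for t in self.threads:
            t.join()


def parse_headers(raw):
    """Hypothetical stage: split raw header text into a dict."""
    headers = {}
    for line in raw.split("\r\n"):
        if ":" in line:
            key, value = line.split(":", 1)
            headers[key.strip().lower()] = value.strip()
    return headers


def run_demo():
    completions = queue.Queue()
    parser = StagePool("parse", parse_headers, completions, workers=2)
    requests = ["Host: example.com\r\nAccept: */*", "Host: example.org"]
    for req_id, raw in enumerate(requests):
        parser.submit(req_id, raw)
    results = {}
    for _ in requests:                  # main loop drains completions
        req_id, headers = completions.get()
        results[req_id] = headers
    parser.close()
    return results
```

This only shows the shape. In standard CPython the GIL keeps CPU-bound
workers from actually running in parallel, so a real version would live
in C/C++ threads, and it would need error handling for a handler that
raises.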

Redirectors and other helpers are already able to run on other CPUs,
so that's a non-issue.

Cheers,

On 07/03/2008, at 3:05 AM, Adrian Chadd wrote:

> Well, the way I'd approach it is to first get an idea of how to throw
> things into 'threads', and probably draft and craft a basic event loop
> and submission queue for "stuff" to happen across threads.
>
> Then "Squid" can run as one thread, and CPU intensive stuff can happen
> via message queues to other threads.
>
> Eventually my gut feeling (reliable as it is) tells me that the most
> efficient and scalable way of doing this is to create a lightweight
> "squid" that handles just client and server-side interactions, with
> storage, logging, ACLs and other stuff happening in other threads, and
> then create multiple "squid" threads that run almost independently
> from one another. This would avoid all of the crazy fine-grain locking
> that traditionally is done to take a non-threaded app into the
> threaded world. I really think avoiding that is a very good idea.
>
> Oh, and no, there's nothing in Squid right now that "jumps out" save
> perhaps pushing regular expression lookups into a separate thread or
> threads. But really, if you're going to do that then you're better off
> pushing a large part of the ACL subsystem into separate threads and
> having the main code submit lookup requests there. Of course, what
> would be interesting there is benchmarking how effective it'd be to
> batch things like ACL lookups in "groups" to try and get some cache
> coherency effects going, rather than the current tendency for Squid to
> process a request as far as it can go before something blocking comes
> along, blowing away as much of the CPU cache as possible in the
> meantime.
>
> But really, the big problem is to spend some time looking at efficient
> ways of parallelising network applications and what works well on
> current hardware/OSes. I'm just playing around with a simple TCP proxy
> right now which I'll use to experiment with "better" ways of doing
> stuff reasonably portably. I can then set this as the "upper bounds"
> for how well stuff may perform, and can then spend some time looking
> at how to tune things like parallelism, IO handling, memory allocation
> and event notification. Then I can spend some more time looking at
> batching operations such as IO, ACL lookups, etc - see if better use
> of CPU caches can be made and also see if doing all the system
> read/write syscalls in one hit per loop rather than spread out
> throughout the program execution makes any difference.
>
> It's really hard to benchmark -these- inside Squid, and thus it's very
> difficult to figure out how to make better use of current hardware.
> _This_ is the "First Problem" to solve.
>
> Of course, all of this depends entirely on whether I get enough
> clients to start funding some of this work, and how much I can
> dedicate to this over my Semantics, Experimental Methods and
> Behavioural Neuropsychology classes this semester. :)
>
>
>
>
> Adrian
> (Sleep? Hah!)
>
> On Thu, Mar 06, 2008, Chris Woodfield wrote:
>> I'll readily admit that I Am Not A Developer, but I'm wondering if
>> this could be something that could be worked incrementally - finding
>> easy-to-cleave-off subsystems that can be moved to separate threads
>> similarly to how asyncio was. The most obvious one I can think of is
>> the front-end client/server network socket communication code; next
>> would be logging. Are there any other subsystems that jump out as
>> "independent" enough to do this in the existing code base?
>>
>> -C
>>
>> On Mar 6, 2008, at 4:17 AM, Adrian Chadd wrote:
>>
>>> On Wed, Mar 05, 2008, Michael Puckett wrote:
>>>> Mark Nottingham wrote:
>>>>>
>>>>> A killer app for -3 would be multi-core support (and the perf
>>>>> advantages that it would bring), or something else that the
>>>>> re-architecture makes possible that isn't easy in -2. AIUI,
>>>>> though, that isn't the case; i.e., -3 doesn't make this
>>>>> significantly easier.
>>>> Absolutely THE killer app for either -2 or -3. The fact that
>>>> multi-core processors are now the de facto standard in any box
>>>> makes this more important by the day IMHO. Being able to do
>>>> sustained IO across multiple Gb NICs will absolutely require it.
>>>> This is the single biggest performance enhancement that could be
>>>> implemented. So where does multi-core support fall on either
>>>> roadmap?
>>>
>>> 12 months away on my draft Squid-2 roadmap, if there was enough
>>> commercial interest. Thing is, the Squid internals are very horrible
>>> for SMP (both 2 and 3) and the list of stuff that I've put into the
>>> squid-2 roadmap is what I think is the minimum amount of work
>>> required before really starting to take advantage of multiple cores.
>>>
>>>
>>>
>>>
>>> Adrian
>>>
>>> --
>>> - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
>>> - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
>>>
>

--
Mark Nottingham       mnot@yahoo-inc.com
Received on Thu Mar 06 2008 - 16:34:59 MST
