Re: Thoughts about moving work off the main squid thread to achieve parallelism

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 08 Mar 2011 15:43:17 +1300

 On Mon, 7 Mar 2011 19:35:33 +0000, Ming Fu wrote:
> Hi,
>
> My name is Ming Fu. I have worked on squid on and off since 2000.
>
> My current interest is improving the performance of squid 3.
>

 Greetings and Welcome,

  Performance is one of the areas where we are very keen to find
 improvements. Your interest and any patches are greatly appreciated.

> What I am thinking of doing is moving the non-cacheable processing
> off the main squid thread.
>
> My assumptions are the following:
>
> 1. A significant portion of replies from web servers are not
> cacheable.
> 2. Offloading non-cacheable processing from the main squid thread can
> save some CPU load on that thread. This is similar to what is
> already happening for disk write and unlink.

 The way Squid-3.2+ SMP support operates, there are N "main threads"
 running in parallel. As I understand it, the RockStore project now
 pushes storage handling out to a separate process.

>
> Two approaches I can think of:
> 1. Move the processing of non-cacheable replies to separate threads;
> these threads do not need to access the cache.
> 2. Push the work down to the kernel's socket layer: some kind of
> kernel filter that is able to associate two sockets and copy the
> incoming data from one socket to the other. Squid establishes the
> association and provides information that lets the kernel filter
> detect the end of a reply (chunked encoding or content-length). The
> kernel breaks the association when one reply has been processed, and
> squid regains control of the sockets.
>
> Option 2 could potentially be faster than option 1, but will depend
> on the OS platform. I come from a BSD background, and I have some
> confidence that this will be possible on FreeBSD.
>
> Do my thoughts make sense?

 They do. I've been looking at these same ideas for a while.

 Option 2 sounds interesting. We have had sendfile() suggested in the
 past, but it is unable to support any of the features people are
 finding important these days (data size reporting, bandwidth control and
 in-transit adaptation). If you know of other kernel features which can
 speedily pass data around, they would be worth discussion and analysis
 for suitability.
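
 For illustration only, here is a user-space sketch in Python of the
 bounded relay that option 2 describes: copy exactly one
 content-length-delimited reply body from one socket to another, then
 hand control back to the caller. The name relay_exact and its
 parameters are hypothetical, not Squid code, and a kernel-level filter
 would do this copy without returning to user space. Chunked encoding
 is not handled here.

```python
import socket


def relay_exact(src, dst, length, bufsize=65536):
    """Copy up to `length` bytes from src to dst; return bytes copied.

    Hypothetical helper: `length` stands in for a reply's Content-Length.
    Bytes beyond `length` are left unread on src, so the caller regains
    control of both sockets at the reply boundary.
    """
    remaining = length
    while remaining > 0:
        # Never ask for more than the reply still owes us.
        chunk = src.recv(min(bufsize, remaining))
        if not chunk:
            # Peer closed before the full body arrived.
            break
        dst.sendall(chunk)
        remaining -= len(chunk)
    return length - remaining
```

 Because the loop counts bytes itself, this is also the natural place
 to hook size reporting or bandwidth control, which is what a plain
 sendfile() call does not offer.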

 The idea behind option 1 does fit well with the design we are working
 towards at present. The "easy" way to implement it would be to bypass
 the disk file operations for marked requests/replies.
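
 A rough structural sketch of that bypass, in Python and under heavy
 assumptions: replies marked non-cacheable skip the store step and are
 forwarded by a worker pool, while everything else stays on the main
 path. The names is_cacheable and dispatch are hypothetical, and the
 cacheability test (only Cache-Control no-store/private) is a
 deliberate simplification, not Squid's actual logic.

```python
from concurrent.futures import ThreadPoolExecutor


def is_cacheable(headers):
    """Simplified assumption: only no-store/private mark a reply."""
    cc = headers.get("Cache-Control", "").lower()
    directives = {d.strip().split("=")[0] for d in cc.split(",") if d.strip()}
    return not (directives & {"no-store", "private"})


def dispatch(headers, body, store, forward, pool):
    """Route one reply; return a Future if it was handed to a worker."""
    if is_cacheable(headers):
        # Main path: store the body, then forward it to the client.
        store(body)
        forward(body)
        return None
    # Bypass path: no store call, forwarding happens off the main thread.
    return pool.submit(forward, body)
```

 Python threads do not provide CPU parallelism, so this shows only the
 routing decision, not a performance result.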

 Alex has been working on storage improvements in a project called
 RockStore, which is nearly complete now. Take a look at the code changes
 there as part of the planning; storage bypass may be done already and
 awaiting QA. Future changes relating to storage will be added to squid on
 top of this RockStore update.

 Hope this helps, and looking forward to working with you.

 Amos
Received on Tue Mar 08 2011 - 02:43:23 MST
