Re: [SQU] multiprocessor machines

From: Adrian Chadd <adrian@dont-contact.us>
Date: Tue, 2 Jan 2001 05:16:24 -0700

On Tue, Jan 02, 2001, Adam Rice wrote:
> Adrian Chadd wrote:
>
> > What people have to realise is that the mechanics behind a poll() loop
> > really don't scale, and even though it can be made to scale better,
> > it will take a lot of work. Throwing more CPU at it will make the
> > code run better, but the main influencing factor behind the behaviour
> > of poll() is the network traffic patterns, not the CPU you throw at it.
> > I've kernel-profiled squid boxes spending 50% of their time in a big
> > poll() loop in the kernel..
>
> I read (on Kernel Traffic, many moons ago) of a neat method to get
> around this. In the simplest case, you have two threads, both of which
> are in poll(). One thread takes care of the few file descriptors that
> are really busy (the overhead from poll() is kept low by only having a
> few file descriptors on the list), and one takes care of the rest that
> are mostly idle (the overhead from poll() is kept low by not being
> called very often). This could be extended to additional threads in an
> obvious way.
>
> The benefits of this approach would be that the idle/busyness of HTTP
> connections is pretty predictable, you'd get SMP support "for free", and
> it would work with current kernels that don't have some notional poll()
> successor implemented. The drawbacks would be that it would perform
> badly in benchmarks (where all connections would tend to be the same),
> and that I can see it being extremely challenging to implement.

Since I keep getting people throwing "why not split the poll() loop up
across multiple CPUs?" at me, I really should braindump into the list
for the archives' sake.

Your approach would work. But it would give squid SMP support in much the
same way that async io gives squid "SMP support". If you break up the
poll() loop into multiple threads, you still have the underlying problem that
most of squid isn't thread-safe.
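
A rough picture of the split you're describing might look like the
fragment below. This is a completely untested sketch, and every name in
it is mine - none of it is actual squid code:

#include <string.h>
#include <poll.h>
#include <pthread.h>

#define MAX_FDS 1024

/* One of these per poll() thread: the "busy" set holds the few hot
 * descriptors and is polled with a short timeout, the "idle" set holds
 * everything else and is polled with a long timeout. */
struct poll_set {
    struct pollfd fds[MAX_FDS];
    int nfds;
    int timeout_ms;
    pthread_mutex_t lock;
};

static struct poll_set busy_set = { .timeout_ms = 10,
                                    .lock = PTHREAD_MUTEX_INITIALIZER };
static struct poll_set idle_set = { .timeout_ms = 1000,
                                    .lock = PTHREAD_MUTEX_INITIALIZER };

static void *poll_thread(void *arg)
{
    struct poll_set *set = arg;
    struct pollfd local[MAX_FDS];
    int i, nfds;

    for (;;) {
        /* snapshot the set, so descriptors can be moved between the
         * busy and idle sets by whoever tracks per-fd activity */
        pthread_mutex_lock(&set->lock);
        nfds = set->nfds;
        memcpy(local, set->fds, nfds * sizeof(local[0]));
        pthread_mutex_unlock(&set->lock);

        if (poll(local, nfds, set->timeout_ms) <= 0)
            continue;
        for (i = 0; i < nfds; i++) {
            if (local[i].revents) {
                /* hand the ready fd off to the (single-threaded) squid
                 * core, and note the activity so the fd can be promoted
                 * or demoted between the two sets */
            }
        }
    }
    return NULL;
}

You'd start one thread over &busy_set and one over &idle_set with
pthread_create(), and extend it to more sets/threads the same way. The
hard part is the code that decides when to move a descriptor between
the sets.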

Now, there are a few ways to handle this:

* the "standard" thread way
  Take squid, add a "giant mutex" to it, and then make it multithreaded.
  You now have a single mutex protecting everything - so even though all
  threads can be in poll() at the same time, only one thread can be in the
  squid main body at a time. Then work on pushing finer-grained mutexes
  "into" squid.

  The trouble with this way is that you've just made the main squid code
  *VERY* complex, and as you add more CPUs you'll have lots of threads
  contending for mutexes.

  It's the "general thread program" way, but it'll take a long time, make
  squid even more complex (at a time when we're trying to make it simpler
  and tidier!), and it'll scale poorly with more CPUs. (There's a rough
  sketch of this way after the list.)

* the "async io" thread way
  Similar to above, but don't go much past the "giant mutex" bit. Instead,
  have each of the threads run poll() over a set of file descriptors, and
  then add the completion callbacks to a queue which is drained and run
  by the "main" squid thread.

  You could probably do this without changing *too* much of the code.

  The trouble with this way is that you've brought concurrency to poll(),
  but not to the main squid thread. So, if you have 2 CPUs, you'll end up
  saturating one with poll() and one with squid. If you have 4 CPUs, you'll
  still end up saturating one with squid, but the poll() load is shared
  across the other three CPUs, and their usage is a function of network
  traffic and the load of the first CPU. So, you end up with simpler code
  and a faster squid, but it's just as inefficient as asyncufs. (A sketch
  of this way also follows the list.)

* the "adrian" way
  The Adrian Way is a little stranger - I'm working on making the rest of
  squid work better with some of the newer OS / machine advances that exist
  today. Once this is done, we can "optimise" poll()/select(). My SMP-aware
  code is still experimental, and I'll write up a quick paper on it once
  I have the time to complete it.
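
Since I promised sketches above: here's a rough, untested picture of the
"giant mutex" way. None of the names below are real squid code; it's
only meant to show where the lock sits:

#include <poll.h>
#include <pthread.h>

/* The "giant mutex": one lock protecting the entire non-thread-safe
 * squid main body. */
static pthread_mutex_t squid_giant_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for running an fd's read/write handler and
 * touching all of the global state behind it. */
static void run_handler(int fd)
{
    (void) fd;
}

/* Every worker can sit in poll() over its own slice of the descriptors
 * at the same time, but only one of them at a time can hold the giant
 * lock and run main-body code. */
static void *worker(void *arg)
{
    struct pollfd *fds = arg;
    int i, nfds = 0;    /* maintained by whoever registers fds here */

    for (;;) {
        if (poll(fds, nfds, 1000) <= 0)
            continue;
        pthread_mutex_lock(&squid_giant_lock);
        for (i = 0; i < nfds; i++)
            if (fds[i].revents)
                run_handler(fds[i].fd);
        pthread_mutex_unlock(&squid_giant_lock);
    }
    return NULL;
}

Pushing the mutexes "into" squid then means replacing that single lock
with finer-grained ones around individual subsystems - which is exactly
where the complexity and the lock contention come from.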
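
And a similarly rough sketch of the "async io" way's completion queue -
again, every name here is made up for illustration:

#include <poll.h>
#include <pthread.h>

#define QUEUE_SIZE 4096

/* Completion queue: the poller threads produce ready descriptors, the
 * single "main" squid thread consumes them.  (No overflow handling -
 * this is only a sketch.) */
static int ready_fds[QUEUE_SIZE];
static unsigned int q_head, q_tail;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cond = PTHREAD_COND_INITIALIZER;

static void queue_push(int fd)
{
    pthread_mutex_lock(&q_lock);
    ready_fds[q_tail++ % QUEUE_SIZE] = fd;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_lock);
}

/* One of these per CPU, each over its own descriptor set.  A real
 * version would stop polling a descriptor until its callback has run,
 * otherwise a level-triggered poll() keeps re-queueing it. */
static void *poller(void *arg)
{
    struct pollfd *fds = arg;
    int i, nfds = 0;    /* maintained by whoever registers fds here */

    for (;;) {
        if (poll(fds, nfds, 1000) <= 0)
            continue;
        for (i = 0; i < nfds; i++)
            if (fds[i].revents)
                queue_push(fds[i].fd);
    }
    return NULL;
}

/* The existing single-threaded main body just drains the queue and runs
 * the completion callbacks, so none of that code has to become
 * thread-safe. */
static void main_loop(void)
{
    int fd;

    for (;;) {
        pthread_mutex_lock(&q_lock);
        while (q_head == q_tail)
            pthread_cond_wait(&q_cond, &q_lock);
        fd = ready_fds[q_head++ % QUEUE_SIZE];
        pthread_mutex_unlock(&q_lock);
        (void) fd;  /* run the fd's read/write callback here */
    }
}

That's all the extra concurrency you get, though - the main_loop()
thread is still the bottleneck, which is the 2-CPU/4-CPU behaviour
described above.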

See, if we choose one of the first two methods, we're locking ourselves
into a paradigm which isn't very flexible. I'm trying to make squid rely
less on older OS concepts so it becomes more adaptable to newer
"features".

The modio work on sourceforge is an example of this - I'm ripping out the
in-memory StoreEntry index/hash and pushing the object lookup into the
object storage layers. This allows people to try out ideas like
reiserfs_raw and other disk-based hashes, and it allows me to rework the
storage manager/storage client code to make squid work better with the
network code in today's OSes.
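
To give a feel for what "pushing the object lookup into the storage
layers" means, here's a heavily hypothetical interface sketch. None of
the names below (other than StoreEntry) are real squid or modio code;
it's just the shape of the idea:

/* StoreEntry is the real squid type; everything else here is made up. */
typedef struct _StoreEntry StoreEntry;

/* Instead of squid keeping one global in-memory index of every object,
 * each storage module answers lookups itself - an in-memory hash for a
 * memory/ufs-style store, a reiserfs_raw directory lookup, some other
 * disk-based hash, and so on. */
typedef struct {
    void *state;                 /* per-module private data */
    StoreEntry *(*lookup)(void *state, const void *key, int keylen);
    void (*insert)(void *state, const void *key, int keylen,
                   StoreEntry *e);
    void (*release)(void *state, const void *key, int keylen);
} store_backend;

The storage manager then hands the key to the configured backends
rather than consulting a single global hash.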

This will (hopefully!) result in the implementation of the last few years
of web cache ideas and bring squid back up to par with some of the
commercial cache vendors while still maintaining flexibility.

I am *very* happy to hear new ideas - if anyone has any, feel free to email
me.

(Don't take this to mean I'm not looking at having squid run on
"older"/commercial OSen; I'm quite cognisant of them, and I still have to
support them .. :-)

Now, if I only had infinite spare time .. :-)

Adrian

--
To unsubscribe, see http://www.squid-cache.org/mailing-lists.html