Re: 1.2b20-1: Async IO fixes & optimizations

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Wed, 06 May 1998 20:37:56 +0000

Stewart Forster wrote:

> Eeps. Sending signals is a seriously slow way to do communication.
> Sure it'll break the select loop, but each finishing thread may
> incur a lot of system overhead for signal processing. I wouldn't
> recommend this approach.

We have to choose: either something that breaks select(), or a very
short timeout. After some consideration I agree that the correct path is
to use a very short timeout. A threaded Squid is most useful on a loaded
machine, and there the average select call is very short anyway.
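
A minimal sketch of the short-timeout approach, for illustration only. The
names select_once, aio_poll_done, completed_requests and the 10 ms value of
ASYNC_IO_TIMEOUT_USEC are hypothetical and are not taken from the Squid
source; the point is only that select() is capped by a small timeout and
finished requests are polled on every pass, so no signal is needed to break
out of select().

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical cap on how long select() may sleep (10 ms, illustrative). */
#define ASYNC_IO_TIMEOUT_USEC 10000

/* Hypothetical stand-in for the count of requests finished by I/O threads. */
static int completed_requests = 0;

/* Hypothetical poll: report and clear the number of finished requests. */
static int aio_poll_done(void)
{
    int n = completed_requests;
    completed_requests = 0;
    return n;
}

/*
 * One pass of the main loop: wait briefly for descriptor activity, then
 * poll for finished async requests whether or not select() reported
 * anything. Returns the select() result; *aio_done receives the poll count.
 */
static int select_once(int nfds, fd_set *readfds, int *aio_done)
{
    struct timeval tv;
    int ready;

    tv.tv_sec = 0;
    tv.tv_usec = ASYNC_IO_TIMEOUT_USEC;
    ready = select(nfds, readfds, NULL, NULL, &tv);

    *aio_done = aio_poll_done();
    return ready;
}
```

On a busy machine select() usually returns well before the timeout expires,
so the cap mostly matters when the process is otherwise idle.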

> I wanted to avoid any knowledge of what the thread library was doing
> from the main part of squid in order to prevent data sharing. I see
> that you broke this by introducing external data into the low-level
> thread library.

Yet another reason to use the short-timeout path ;-)

> Sigh. I wanted to avoid all use of locks to prevent any form
> of resource waiting. The method I used did this. I even had a
> working version that did away with pthread_cond_wait() at the
> expense of more FD's. Sure pthread_cond_wait() has implicit
> internal locks, but I was happy for it to handle that side of
> the business itself.

The only added resource waiting would be if pthread_mutex_trylock has
any significant implicit waits, which I hope it does not. The main thread
should never block on the mutex, and the I/O thread is already blocked on
cond_wait, which in turn unlocks/locks the mutex.
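
As a rough sketch of the main-thread side, assuming a per-thread slot
protected by a mutex: the io_slot structure and the try_dispatch function
are hypothetical names invented for this example, not the actual code.
pthread_mutex_trylock returns immediately with a non-zero value if the mutex
is held, so the caller can simply try another thread or retry later.

```c
#include <pthread.h>

/* Hypothetical slot shared between the main thread and one I/O thread. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             req;   /* pending request id, 0 means none */
} io_slot;

/*
 * Main thread: hand a request to the I/O thread without ever blocking.
 * Returns 1 if the request was queued, 0 if the slot was busy.
 */
static int try_dispatch(io_slot *s, int req)
{
    if (pthread_mutex_trylock(&s->mutex) != 0)
        return 0;                    /* mutex held elsewhere: do not wait */

    s->req = req;
    pthread_cond_signal(&s->cond);   /* wake the I/O thread if it is waiting */
    pthread_mutex_unlock(&s->mutex);
    return 1;
}
```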

> There wasn't any race condition to begin with. There never was
> the situation where the child could read corrupted half-written
> data.

Well, this can be argued.
* Can you guarantee that an update of a pointer is atomic on all
processors / configurations?
* You had both threads depending on ->req and ->donereq, which are updated
independently. A typical corrupted-data race.
* There was the condition-variable race condition you documented
yourself (ok, this is not corrupted data, but it is a race condition
which the documented way to use condition variables avoids).
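
For reference, the documented condition-variable pattern mentioned in the
last point can be sketched as below. The work_slot structure, the worker
and submit_and_wait functions, and the "double the request" stand-in for
real I/O are all hypothetical. The predicate (has_req) is changed only while
holding the mutex, and the waiter re-checks it in a loop, so a signal sent
before the worker starts waiting is not lost.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request slot; every field is protected by mutex. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             has_req;  /* predicate tested by the waiter */
    int             req;
    int             result;
} work_slot;

/* I/O thread: wait for one request, process it, publish the result. */
static void *worker(void *arg)
{
    work_slot *w = arg;

    pthread_mutex_lock(&w->mutex);
    while (!w->has_req)                          /* re-check after every wakeup */
        pthread_cond_wait(&w->cond, &w->mutex);  /* unlocks while waiting, relocks on return */
    w->result = w->req * 2;                      /* stand-in for the real I/O */
    w->has_req = 0;
    pthread_mutex_unlock(&w->mutex);
    return NULL;
}

/* Submit one request, wait for the worker to finish, return its result. */
static int submit_and_wait(int req)
{
    work_slot w;
    pthread_t tid;

    pthread_mutex_init(&w.mutex, NULL);
    pthread_cond_init(&w.cond, NULL);
    w.has_req = 0;
    w.req = 0;
    w.result = 0;

    if (pthread_create(&tid, NULL, worker, &w) != 0)
        return -1;

    pthread_mutex_lock(&w.mutex);
    w.req = req;                    /* request and predicate change together */
    w.has_req = 1;
    pthread_cond_signal(&w.cond);
    pthread_mutex_unlock(&w.mutex);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.mutex);
    return w.result;
}
```

Because the request and its flag are written under the same lock, the
worker can never observe one without the other, which also sidesteps the
question of whether a bare pointer store is atomic.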

I think that our first goal should be to get a working and known-to-be-stable
implementation, then look at how it can be safely optimized. I admit
that I did not exactly follow this when doing the SIGCONT hack either.

My SIGCONT hack is hereby recalled. Please remove any traces of SIGCONT,
squid_in_select and thread_done, and use a very short ASYNC_IO select()
timeout.

/Henrik

Received on Tue Jul 29 2003 - 13:15:49 MDT