Re: Squid & fork

From: Michael Pelletier <mikep@dont-contact.us>
Date: Tue, 17 Feb 1998 11:37:09 -0500 (EST)

On Tue, 17 Feb 1998, Pavel P. Zabortsev wrote:

> Is it true that Squid does NOT fork itself when a request arrives?
> That is, at any moment only one copy (!) of Squid is running. I think
> that this results in reduced performance! But maybe I have not found
> how to make Squid start several copies of itself? How can I do that?

Yes, it's true, and that's one of the advantages of Squid. Forking a copy
means duplicating the process's address space (or at least its page
tables, on copy-on-write systems), which takes a not-insignificant amount
of time and memory. Every fork() carries this overhead, and it saps the
performance of the server.

Instead, Squid has a kind of pseudo-multi-threaded architecture, where
each request is assigned a file descriptor and a series of "handlers,"
which are pointers to subroutines.

The select() loop polls the descriptors to see which of them are ready for
one of their handlers, such as the "read" handler, the "write" handler,
the "close" handler, and so on, and calls the matching handler subroutine
for each descriptor that is ready. Descriptors that aren't ready, such as
those waiting on a slow server to respond or on a DNS lookup, are skipped
until they're ready to go.
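
To make the idea concrete, here is a minimal sketch of a select()-driven
handler loop in Python. It is not Squid's actual code (Squid is written in
C); the names run_select_loop, on_read and demo are made up for
illustration, and a pair of connected local sockets stands in for real
client connections.

```python
import select
import socket


def run_select_loop(read_handlers, timeout=0.1, max_iterations=10):
    """Call the read handler of every descriptor that select() reports ready.

    read_handlers maps a socket to a callable. A handler returns False
    when its descriptor is finished and should be dropped from the loop.
    """
    iterations = 0
    while read_handlers and iterations < max_iterations:
        iterations += 1
        # Ask the kernel which descriptors have data waiting right now.
        ready, _, _ = select.select(list(read_handlers), [], [], timeout)
        for sock in ready:
            handler = read_handlers[sock]
            if handler(sock) is False:
                del read_handlers[sock]
                sock.close()
        # Descriptors that were not ready are simply left for a later pass.
    return iterations


def demo():
    # Two hypothetical "clients": one has already sent data, one is slow.
    fast_local, fast_remote = socket.socketpair()
    slow_local, slow_remote = socket.socketpair()
    fast_remote.sendall(b"GET /fast")

    received = []

    def on_read(sock):
        received.append(sock.recv(4096))
        return False  # one read is enough for this sketch

    handlers = {fast_local: on_read, slow_local: on_read}
    run_select_loop(handlers, timeout=0.05, max_iterations=3)

    still_waiting = slow_local in handlers
    for s in (fast_remote, slow_local, slow_remote):
        s.close()
    return received, still_waiting
```

Calling demo() services the fast descriptor on the first pass, while the
slow one stays registered and costs nothing beyond its slot in the
select() call. A real server would also register write and close
handlers, as described above.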

A computer can think a lot faster than I/O can take place, and Squid takes
advantage of this fact, by not twiddling its thumbs while waiting on I/O
-- it goes out and takes care of something else that's ready.

In the "cache information" screen of your cache manager, you can see an
item called "Select loop called" which indicates the number of times that
the select loop has been called and the average amount of time each loop
took to handle all the descriptors that were ready for action. Mine
currently shows 5,460,705 calls at an average of 66.606 milliseconds each,
and my server has been up for four days.
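
As a rough sanity check on those figures, the arithmetic can be sketched
in a few lines of Python. The function names are made up for
illustration, and the reading of the result rests on an assumption: that
the reported average covers the whole pass through the loop, including
the time spent blocked inside select().

```python
SECONDS_PER_DAY = 24 * 60 * 60


def loops_per_second(loop_count, uptime_days):
    """Average number of select-loop passes per second of uptime."""
    return loop_count / (uptime_days * SECONDS_PER_DAY)


def implied_uptime_days(loop_count, avg_loop_ms):
    """Uptime implied if every pass took the average time, back to back."""
    return loop_count * (avg_loop_ms / 1000.0) / SECONDS_PER_DAY


# Figures quoted above; "four days" is taken as exactly 4 (an assumption).
rate = loops_per_second(5_460_705, 4)
days = implied_uptime_days(5_460_705, 66.606)
print(f"{rate:.1f} loops per second, implied uptime {days:.1f} days")
```

That works out to roughly 16 passes per second, and the loop count times
the average comes to a little over four days, which agrees with the
stated uptime if the assumption above holds.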

Squid's single process is designed in such a way that it can easily handle
hundreds and hundreds of simultaneous requests, because most of those
requests won't need to be acted on at precisely the same time. Why fork()
another process with all the memory and overhead it needs just to sit
around and wait on slow I/O?

        -Mike Pelletier.
Received on Tue Feb 17 1998 - 08:44:03 MST
