Re: apache style squid?

From: Dean Gaudet <dgaudet-list-squid-dev@dont-contact.us>
Date: Tue, 7 Oct 1997 02:01:24 -0700 (PDT)

> forking is VERY bad. LOTS of swap used. Lots of context to switch.
> Bad news all over.

Swap use depends entirely on how well-behaved your children are, and on the
characteristics of your unix. Every Unix has copy-on-write... if yours
doesn't, you're probably not running Squid on it. For example, I've got
several linux 2.0.30 boxes running Apache with 400 to 500 children each,
and each child has only about 80k of data that isn't shared with all the
other children. Linux allocates memory optimistically, Solaris doesn't --
when I try the same thing on a Solaris box I have to slap a 2GB disk on it
just for swap, which is never actually used.
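
If you want to see the copy-on-write behaviour for yourself, a toy like
this makes it visible (my own untested sketch, the names and sizes are
made up -- watch per-child memory in ps or top while the children sleep):

    /* Sketch: forked children share the parent's pages copy-on-write.
     * Each child only reads the big buffer, so almost nothing gets
     * copied; a write would fault in private copies page by page. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define BIG (8 * 1024 * 1024)   /* 8MB of parent data */

    int main(void)
    {
        char *buf = malloc(BIG);
        int i;

        memset(buf, 'x', BIG);      /* make the pages really exist */

        for (i = 0; i < 10; i++) {
            if (fork() == 0) {
                volatile char c = buf[0];   /* read only: stays shared */
                (void) c;
                sleep(30);          /* linger so you can inspect RSS */
                _exit(0);
            }
        }
        while (wait(NULL) > 0)
            ;
        return 0;
    }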

Solaris has terribly high context-switch overhead; I've never understood
why. Linux doesn't. For example, I've seen a dual Pentium 133 running
linux do 100 fork()s and exit()s per second while doing other things. An
Ultra-2 (dual 167s) couldn't quite manage 50 fork()s and exit()s while
doing nothing else. Those aren't comparable hardware -- the Ultra box is
way better than the Pentium, and RAM wasn't an issue. It's not even a
comparable level of SMP kernel: that's linux 2.0.30 with its single
kernel lock versus Solaris 2.5.1 with its fine-grained locking. But linux
easily outperformed Solaris. (This wasn't something I set out to test...
it just happened while I was testing Apache for signal race conditions.)
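
If anyone wants to reproduce the numbers, the measurement boils down to a
loop like this (a sketch, not the actual test I ran -- mine was a side
effect of the signal testing):

    /* Sketch of a fork()/exit() rate microbenchmark. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/wait.h>

    #define N 1000

    int main(void)
    {
        struct timeval start, end;
        double secs;
        int i;

        gettimeofday(&start, NULL);
        for (i = 0; i < N; i++) {
            pid_t pid = fork();
            if (pid == 0)
                _exit(0);           /* child exits immediately */
            waitpid(pid, NULL, 0);  /* parent reaps it */
        }
        gettimeofday(&end, NULL);

        secs = (end.tv_sec - start.tv_sec)
             + (end.tv_usec - start.tv_usec) / 1e6;
        printf("%d fork/exit pairs in %.2fs (%.0f/sec)\n",
               N, secs, N / secs);
        return 0;
    }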

It could be something as simple as CISC vs. RISC overhead, but I doubt it.
Maybe some day I can run the same test under sparclinux vs. Solaris.

> Using threads means no mmap() and no fork()'ing. Very efficient.

Generally, yeah. But it's really worth a look at the JAWS project papers
-- see <http://www.cs.wustl.edu/~jxh/research/research.html>. They
discuss various models. One that I'm really excited about is the
completion port/fiber model that NT provides. I'm pretty certain the same
can be done with some of the more recent POSIX standards -- stuff that
isn't on any system yet, though.
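
(For the curious: I'm thinking of the realtime aio_* interface. A read
gets queued and collected later, roughly like this -- purely a sketch,
since as I said nothing ships with it yet, and the busy-wait is just for
the demo:)

    /* Sketch of the POSIX aio interface: queue a read, keep working,
     * collect the result later.  Illustrative only. */
    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[4096];
        struct aiocb cb;
        int fd = open("/etc/motd", O_RDONLY);

        memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf = buf;
        cb.aio_nbytes = sizeof(buf);
        cb.aio_offset = 0;

        aio_read(&cb);              /* queue it, don't block */

        /* ... a server would go service other connections here ... */

        while (aio_error(&cb) == EINPROGRESS)
            ;                       /* busy-wait only for the demo */

        printf("read %ld bytes\n", (long) aio_return(&cb));
        return 0;
    }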

> > Don't need large numbers of file descriptors per process. Each
> > process is only handling 1 request at a time, so if
> > it's using more than 15 file descriptors I'd be
> > surprised.
>
> Threads are still a problem; you'd only have one process. Still, we're pushing
> through 3M TCP hits/day (+7M UDP hits) per cache box, and we're still only
> using 2000 fd's maximum. There was a bug, since removed, which prevented squid
> from killing dead/stalled connections. With that fixed, it's no longer
> a problem. With separate sub-threads, connections could probably be closed
> more safely than with the current model, saving even more on FD
> usage.

Does Solaris 2.6 have a 2048 descriptor limit? It does fix the
select()-above-1024 problem, right? I know you guys use poll() for that
size, just curious. Linux 2.1.x can easily go above 1024 descriptors;
2.0.x tops out around 1024 (due to the way it implements select).
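
(For anyone who hasn't used it: poll() takes an array of whatever size
you like instead of select()'s fixed-width fd_set, which is where the
1024 ceiling comes from. A sketch:)

    /* Sketch: poll() scales past select()'s FD_SETSIZE ceiling --
     * the pollfd array can be as big as your descriptor limit. */
    #include <poll.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NFDS 2048               /* comfortably past 1024 */

    int main(void)
    {
        struct pollfd *fds = calloc(NFDS, sizeof(*fds));
        int i, n;

        for (i = 0; i < NFDS; i++) {
            fds[i].fd = -1;         /* -1 entries are ignored; a real
                                     * server puts live sockets here */
            fds[i].events = POLLIN;
        }

        n = poll(fds, NFDS, 1000);  /* wait up to one second */
        printf("%d descriptors ready\n", n);

        free(fds);
        return 0;
    }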

> > Some memory management wins in being able to alloc() a bunch
> > of private memory, and then exit() the process, having
> > the parent re-start you. I.e. guaranteed return of
> > memory to the system.

This isn't the way to manage memory, and even Apache doesn't do this. It
uses resource pools: every resource is tied to a pool, and each request
has a pool associated with it. Pools allocate memory in 8k blocks. When
Apache runs multithreaded (the code is multithreaded, but nobody has done
a pthreads version yet) it doesn't have to lock between threads except
when it's touching the list of 8k blocks. In the common case the pool
allocator amounts to a pointer add and nothing more; it only does real
work when it runs out of room in the blocks already in the pool, so it's
really light.
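
The core of the idea fits in a screenful. Here's my own boiled-down
sketch (not Apache's actual code, and the names are mine):

    /* Sketch of a resource-pool allocator: bump-pointer allocation
     * out of 8k blocks, everything freed at once when the request
     * finishes.  Not Apache's code, just the shape of it. */
    #include <stdlib.h>

    #define BLOCK_SIZE 8192

    typedef struct block {
        struct block *next;
        char *cur, *end;
    } block;

    typedef struct pool {
        block *blocks;
    } pool;

    static block *new_block(size_t size)
    {
        block *b = malloc(sizeof(block) + size);
        b->next = NULL;
        b->cur = (char *) (b + 1);
        b->end = b->cur + size;
        return b;
    }

    void *pool_alloc(pool *p, size_t size)
    {
        block *b = p->blocks;
        void *mem;

        size = (size + 7) & ~(size_t) 7;    /* 8-byte alignment */

        /* common case: one pointer add in the current block */
        if (b && b->cur + size <= b->end) {
            mem = b->cur;
            b->cur += size;
            return mem;
        }
        /* slow path: chain on a new block; oversized requests get
         * a block bigger than 8k all to themselves */
        b = new_block(size > BLOCK_SIZE ? size : BLOCK_SIZE);
        b->next = p->blocks;
        p->blocks = b;
        mem = b->cur;
        b->cur += size;
        return mem;
    }

    void pool_destroy(pool *p)  /* end of request: free it all */
    {
        block *b = p->blocks, *next;
        for (; b; b = next) {
            next = b->next;
            free(b);
        }
        p->blocks = NULL;
    }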

> > Don't need as much buffering when you sleep on read()'s,
> > write()'s et al. i.e. lighter malloc()
> > load. i.e. lower memory fragmentation in theory. :)
>
> True we don't need as much buffering. A good malloc() library solves
> most memory fragmentation issues.

Pools also seem to do well here, since most allocations fit within the
block size. Only when something large is allocated does a block bigger
than 8k get created (the oversized path in the sketch above).

Dean
