Re: copy-on-write

From: Oskar Pearson <oskar@dont-contact.us>
Date: Sat, 31 May 1997 02:09:21 +0200 (GMT)

Hi all

Duane - is this list big enough to discuss stuff like this? About how many
people read it? It's not very active... there might be people on squid-users
who don't know about this list...

> > It would be much more efficient to consider ALL pages shared all the time.
>
> This contradicts definition of fork(). And, shared pages cannot be
> protected against reading. this has security implications.
>
> > If a child process writes to ram and its parent isn't 1 (ie its parent
> > hasn't died or been killed) in the process table struct the kernel then
> > duplicates the block...
> >
> > Much more efficient... I wonder if it's the way some systems do it ;)
>
> Nope. Recall what page types can be for any MMU hardware: read-only,
Hmm - Ok, I thought that this was done at the kernel level, rather than
using the MMU... of course this could be very inefficient...

> read/shared, read/exec, exec/shared, read/write/shared. Of these, only
> read-only gives a chance to track two processes trying to write the
> same page. If you have ALL pages shared/writeable, both child and parent
> would write concurrently without any kernel notice - bad thing (tm).
Unless you get the actual write call to look at the permissions on the
block.
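
For reference, the fork() behaviour this argument is about can be shown in
a few lines. This is only a sketch in Python (my choice of language, not
anything from the thread), and the function name is made up for the
example. After fork(), a write made by the child must not show up in the
parent, which is exactly what copy-on-write has to preserve:

```python
import os

def child_write_is_private():
    """Fork, let the child modify a buffer, return (parent view, child view)."""
    data = bytearray(b"parent")
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: this write lands in the child's own copy of the memory.
            os.close(r)
            data[:] = b"child!"
            os.write(w, bytes(data))
        finally:
            os._exit(0)
    # Parent: read what the child saw, then reap the child.
    os.close(w)
    child_view = os.read(r, 64)
    os.close(r)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

The parent still sees its original bytes after the child has overwritten
its own copy; how the kernel achieves that (MMU page protection or
otherwise) is not visible from here.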

> What you are wondering about is actually vfork(). It is pretty good
> for Solaris and very likely would solve problems for forking just a child
> process. But how many OS-es support vfork()? man page says "it's cool,
> but na-na-na, be careful..." not a big deal, but makes coders nervous
> when its gotchas are different on every OS...
hmm :)
from the linux man page for fork:

       Under Linux, vfork is merely an alias for fork.
       fork does never return the error ENOMEM.

and the clone man page says:
       clone is an alternate interface to fork, with more
       options. fork is equivalent to clone(0, SIGCLD|COPYVM).

> > So - I still hold that there isn't a need for a stub process....

> Term "stub process" in todo list is pretty far from well defined.
> IMHO, ftp transfers should be included into squid main process some
> day, dnsservers implemented either using async resolver lib or totally
There is a web server called zeus (http://www.zeus.co.uk/products/server/).
After a little interrogation it seems that they have their own resolver
library... (It wants "a file to use for resolving" and runs all as one
process)

> If I recall right, squid is scheduled for major rewrite before 1.2 alpha
> appears. I think, it is quite a right time to start some debate about its
> overall design evolution. Funny, to talk about this stub stuff, on one
> hand there is a desire to merge most of squid's parts into one to avoid
> multiple processes, while on the other hand there is a desire to split
> squid into functionally separate tasks to make code cleaner and more
> readable.
>
> PS. Duane, is there any fundamental redesign planned for 1.2? I think I
> have few interesting ideas to share...

Well as I see it there are 2 ways to go:

re-write using threads: makes the source easier to understand
keep the current ideas: might end up just as easy to understand, may be more
        efficient

Any thoughts about the "squid-filesystem", with one large file containing all
data? (I posted a message about this to squid-dev a long time ago)

About zeus again - I asked them if they had a problem with "max filehandles
per process" and they said not.... They seem to handle as many as they can
with one process and if it's coming too close, they fork another process
and get it to handle incoming connections for a while...

You can also say "run 3 copies of zeus", but they say that the number of
copies shouldn't exceed the number of processors in the system... (ie it
handles SMP well, which would be great in squid...)
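
I don't know how zeus does this internally, so the following is only a
sketch of the general idea of "N copies serving one port", again in
Python. The function name and the one-connection-per-worker behaviour are
assumptions made to keep the demo short: the parent opens the listening
socket, forks one worker per processor, and every worker calls accept()
on the same inherited socket.

```python
import os
import socket

def prefork_demo(workers=None):
    """Pre-fork workers that all accept() on one inherited listening socket."""
    workers = workers or os.cpu_count() or 1
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            try:
                # Worker: handle exactly one connection, then exit (demo only).
                conn, _addr = listener.accept()
                conn.sendall(str(os.getpid()).encode())
                conn.close()
            finally:
                os._exit(0)
        pids.append(pid)
    listener.close()  # the workers keep their own inherited copies

    # Connect once per worker and record which pid answered each time.
    answered = []
    for _ in range(workers):
        with socket.create_connection(("127.0.0.1", port)) as c:
            answered.append(int(c.recv(32)))
    for pid in pids:
        os.waitpid(pid, 0)
    return pids, answered
```

Calling prefork_demo(3) corresponds to "run 3 copies": the kernel hands
each incoming connection to one of the waiting workers, so no process has
to pass filehandles around. A real server would loop in each worker
instead of exiting after one connection.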

This is a message I wrote to someone else a while back about something
similar (how to code a process that handles connections with the
minimum of forking and without running into the "per-process" limit):

-----------------------------
This is what apache does (never actually looked at the source though ;):

Forks multiple processes:
does a listen on the port it's supposed to be watching
when it gets a connection, it passes the filehandle to the first process
        it forked and considers it busy until it gets
        a reply from that process saying "done"
if it gets another connection it passes it to the first available child.

There are problems with this idea... I think (judging by Apache's
performance on our web server...)

If the processes take a long time to complete (if they are doing blocking
dns lookups, for example) you need to run lots and lots of the child processes.
We use MaxClients 190; in other words it can run up to 190 copies of httpd
at times (this is why the load is sometimes 20 ;)

If you did the following
start a process
pre-fork a baby process
        that baby process handles all connections
                if the baby nearly runs out of filehandles it says
                "#$%# - daddy, help me" and passes back the token
                telling the father to spawn another process. Father does
                so, and baby1 gives the filehandle/token to baby2 to handle
                all new incoming connections. (possibly using shared ram?)
        baby2 runs out of filehandles. If baby1 is also low, father then starts
                baby3, otherwise baby1 takes over again while baby2 slowly
                closes filehandles.

This essentially limits the number of filehandles as follows:
        max of 253 babies, if you use a unix socket to handle the communication
        from baby1 -> daddy -> baby2 with some FD's for logs
        thus you have about 253*255 sockets (64515)

--------------------------

Great in theory, but with no practical way of doing it?
Having a "daddy process" that forks means that you don't have the much
discussed "stub process" problem.
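
On the practical side, the hand-over step in the scheme above (one process
giving a filehandle to another) can at least be sketched with descriptor
passing over a Unix socket. This is an illustration only, under some
assumptions: it uses Python 3.9+ for socket.send_fds() and
socket.recv_fds(), it shows two processes rather than
baby1 -> daddy -> baby2, and the function name and messages are made up.

```python
import os
import socket

def hand_over_descriptor():
    """Pass an open descriptor to another process over a Unix socket."""
    sender_end, receiver_end = socket.socketpair(socket.AF_UNIX,
                                                 socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        try:
            sender_end.close()
            # Receiving process: get the descriptor and start using it.
            _msg, fds, _flags, _addr = socket.recv_fds(receiver_end, 16, 1)
            os.write(fds[0], b"handled by baby2")
            os.close(fds[0])
        finally:
            os._exit(0)
    receiver_end.close()
    # Sending process: the pipe is created after the fork, so the other
    # process can only reach it through the descriptor we send.
    r, w = os.pipe()
    socket.send_fds(sender_end, [b"token"], [w])
    os.close(w)                 # our copy is no longer needed
    data = os.read(r, 64)       # blocks until the receiver writes
    os.close(r)
    sender_end.close()
    os.waitpid(pid, 0)
    return data
```

The same mechanism works for a listening or connected socket instead of a
pipe, which is what the "token" hand-over would need. Whether it is cheap
enough to do under load is a separate question.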

Also - it may be worth considering a different algorithm for cache content
removal... I don't know enough about this...
http://www6.nttlabs.com/HyperNews/get/PAPER250.html

Might be something interesting on
http://www.cs.bu.edu/faculty/djy/cs835-fall96.html

        Oskar
Received on Tue Jul 29 2003 - 13:15:41 MDT
