Re: copy-on-write

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 28 May 1997 21:32:47 +0300 (EETDST)

> From: Oskar Pearson <oskar@is.co.za>
>
> I think that the kernel must work the other way:
>
> It would be very inefficient to have to work through the entire contents
> of ram at the time of a fork and say "Fork. This page is now marked as
> shared. This page is also marked as shared. This page is marked as
> shared...."

    It is even more complicated as most of the xxx-MB is on the heap,
 dynamically allocated and incredibly fragmented.

> It would be much more efficient to consider ALL pages shared all the time.

    This contradicts the definition of fork(). Also, shared pages cannot
 be protected against reading, which has security implications.

> If a child process writes to ram and its parent isn't 1 (ie its parent
> hasn't died or been killed) in the process table struct the kernel then
> duplicates the block...
>
> Much more efficient... I wonder if it's the way some systems do it ;)

    Nope. Recall what page types any MMU hardware can offer: read-only,
 read/shared, read/exec, exec/shared, read/write/shared. Of these, only
 read-only gives the kernel a chance to catch two processes trying to
 write the same page. If ALL pages were shared/writable, both child and
 parent would write concurrently without the kernel noticing - bad thing (tm).
 Having ALL pages read-only all the time? Hmm... that leaves only one way:
 mark all parent pages read-only upon fork() and duplicate those written
 later. As I said, this would slow both processes down for a longer time
 and introduce quite a headache when a child forks further, with some
 pages already duplicated and others not, and the poor kernel has to keep
 track of all of them.
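
    To make the fork() semantics above concrete, here is a minimal sketch in
 C. It only shows the behaviour a program can observe (the child's writes
 never reach the parent's memory); whether the kernel gets there by
 copy-on-write or by copying everything up front is invisible at this level.
 The function name and the 1 MB buffer size are my own illustration, not
 anything taken from squid.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the parent's heap buffer is unchanged
 * after the child overwrote its own copy, 0 if the write leaked through,
 * -1 on error. */
int child_write_is_private(void)
{
    size_t len = 1 << 20;            /* 1 MB of heap, arbitrary size */
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', len);           /* touch every page in the parent */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: these writes must end up in pages private to the child. */
        memset(buf, 'C', len);
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    /* Parent: its copy must still hold the original contents. */
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             buf[0] == 'P' && buf[len - 1] == 'P';
    free(buf);
    return ok ? 1 : 0;
}
```

    On a kernel that does copy-on-write, the child's memset() is the point
 where the page faults and the duplication actually happen.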

    What you are wondering about is actually vfork(). It works pretty well
 on Solaris and would very likely solve the problems of forking just a
 child process. But how many OSes support vfork()? The man page says "it's
 cool, but na-na-na, be careful...". Not a big deal, but it makes coders
 nervous when its gotchas are different on every OS...
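
    For reference, the usual safe pattern with vfork() looks roughly like
 the sketch below: the child does nothing except exec or _exit(), because
 until then it is borrowing the parent's address space. spawn_and_wait() is
 a made-up name, and the _DEFAULT_SOURCE define is an assumption about glibc
 needing it to declare vfork() under strict compiler modes; other systems
 may differ, which is exactly the portability worry.

```c
#define _DEFAULT_SOURCE          /* assumption: lets glibc declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program argv[0] with arguments argv and
 * returns its exit status, or -1 on error or abnormal termination. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* vfork() child: only exec or _exit() is safe here. No stdio,
         * no malloc, no return from this function. */
        execv(argv[0], argv);
        _exit(127);              /* exec failed */
    }

    /* The parent resumes only after the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Called with { "/bin/sh", "-c", "exit 3", NULL } it should hand back the
 shell's exit status without the parent's pages ever being copied or
 re-protected.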

> So - I still hold that there isn't a need for a stub process....

    The term "stub process" in the todo list is pretty far from well
 defined. IMHO, ftp transfers should be folded into the squid main process
 some day, and dnsservers implemented either with an async resolver lib or
 as totally separate processes talking to squid via UDP messages over
 loopback.
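
    A rough sketch of the "separate process plus loopback UDP" idea, under
 heavy assumptions: the helper here is simply forked for brevity instead of
 being a standalone program, it answers a single request, and the "ok:"
 reply is a placeholder rather than a real lookup. query_helper() and
 helper_serve_once() are hypothetical names and are not squid's dnsserver
 interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: answers one datagram with "ok:<request>", then exits. */
static void helper_serve_once(int sock)
{
    char req[256], rep[300];
    struct sockaddr_in from;
    socklen_t fromlen = sizeof from;

    ssize_t n = recvfrom(sock, req, sizeof req - 1, 0,
                         (struct sockaddr *)&from, &fromlen);
    if (n < 0)
        _exit(1);
    req[n] = '\0';
    int len = snprintf(rep, sizeof rep, "ok:%s", req);
    sendto(sock, rep, (size_t)len, 0, (struct sockaddr *)&from, fromlen);
    _exit(0);
}

/* Sends `request` to a freshly started helper over loopback UDP and copies
 * the reply into `reply` (replylen must be at least 1). Returns the reply
 * length, or -1 on error. */
int query_helper(const char *request, char *reply, size_t replylen)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;
    int hsock = socket(AF_INET, SOCK_DGRAM, 0);
    if (hsock < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                    /* let the kernel pick a port */
    if (bind(hsock, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(hsock, (struct sockaddr *)&addr, &addrlen) < 0) {
        close(hsock);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(hsock);
        return -1;
    }
    if (pid == 0)
        helper_serve_once(hsock);         /* never returns */
    close(hsock);                         /* parent keeps only the address */

    int csock = socket(AF_INET, SOCK_DGRAM, 0);
    struct timeval tv = { 2, 0 };         /* do not wait forever for a reply */
    ssize_t n = -1;
    if (csock >= 0 &&
        setsockopt(csock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0 &&
        sendto(csock, request, strlen(request), 0,
               (struct sockaddr *)&addr, sizeof addr) >= 0)
        n = recv(csock, reply, replylen - 1, 0);
    if (n >= 0)
        reply[n] = '\0';
    if (csock >= 0)
        close(csock);
    if (n < 0)
        kill(pid, SIGTERM);               /* helper may still be blocked */
    waitpid(pid, NULL, 0);
    return (int)n;
}
```

    The point of the design is that the main process only ever sees a
 socket: a slow or crashed helper costs a timeout, not a blocked squid.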

    A stub process is not bad in itself - it is much cleaner and easier to
 keep track of than hacking around kernel internals and historical drawbacks.

 ----

    If I recall right, squid is scheduled for a major rewrite before the
 1.2 alpha appears. I think it is the right time to start some debate about
 its overall design evolution. It is funny to talk about this stub stuff: on
 one hand there is a desire to merge most of squid's parts into one to avoid
 multiple processes, while on the other hand there is a desire to split
 squid into functionally separate tasks to make the code cleaner and more
 readable.

 PS. Duane, is there any fundamental redesign planned for 1.2? I think I
 have a few interesting ideas to share...

-------------------------------------------------------------------
 Andres Kroonmaa Telefon: 6308 909
 Network administrator
 E-mail: andre@ml.ee Phone: (+372) 6308 909
 Organization: MicroLink Online
 EE0001, Estonia, Tallinn, Sakala 19 Fax: (+372) 6308 901
-------------------------------------------------------------------
Received on Tue Jul 29 2003 - 13:15:41 MDT
