Re: copy-on-write

From: Oskar Pearson <oskar@dont-contact.us>
Date: Wed, 28 May 1997 16:49:08 +0200 (GMT)

Andres Kroonmaa wrote:

Quoting all of this so that we can remember it :)

>
> > Hi
> >
> > Looking at
> > http://squid.nlanr.net/Squid/Devel/todo.html
> > http://squid.nlanr.net/Squid/Devel/Todo/9703122.txt
> >
> > I don't think that this is necessary (at least with linux, probably with
> > newer versions of solaris too, from what I have heard)
> >
> > Most systems use a "Copy-On-Write" method when they fork. This means that
> > parent and child share pages until either process writes to a page, in
> > which case the OS then duplicates that page.
> >
> > This means that if a 400M program forks and then execs another process,
> > you don't need 800M of ram to do it, and all it does is insert a process
> > into the process list (thus adding a little struct to a table).
> >
> > So - is there something I am missing?
>
> "Copy-On-Write" relies on "read-only" page protection. After a fork()
> you are guaranteed to have an exact copy of the whole process image,
> including the read/write data parts. Now, to implement "Copy-On-Write" for
> all the data squid uses, the OS would have to mark all parent process pages
> read-only just for the sake of the child process. This is quite an overhead,
> and as the parent process probably still modifies most pages all the time,
> copy-on-write would still duplicate all pages, only later, via page faults.
>
> As all this would slow down both processes overall, OSes do not do
> that for data pages; instead they copy full process images. To overcome
> this overhead, algorithmic changes are needed, and this is exactly what
> the stub process is for.

I agreed with you for many weeks, but realised something the other
day:

Since the only way to create a process is by forking, copying the full
process image on every fork, as you suggest, would be a bad idea.

I think that the kernel must work the other way:

It would be very inefficient to have to work through the entire contents
of ram at the time of a fork and say "Fork. This page is now marked as
shared. This page is also marked as shared. This page is marked as
shared...."

It would be much more efficient to consider ALL pages shared all the time.

If a child process writes to RAM and its parent PID in the process table
struct isn't 1 (i.e. its parent hasn't died or been killed), the kernel then
duplicates the block...
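
The behaviour I am describing can be observed from user space. Here is a minimal sketch in Python, assuming a Unix system where os.fork() and anonymous private mmap are available. The function name child_write_is_private is made up for illustration; it only shows the visible semantics (a write in the child never shows up in the parent), not how any particular kernel does the bookkeeping.

```python
import mmap
import os


def child_write_is_private(num_pages=4):
    """Fork, let the child write to an inherited private mapping, and
    return (byte seen by the parent, byte seen by the child)."""
    # Anonymous private mapping; it is inherited by the child across fork().
    buf = mmap.mmap(-1, num_pages * mmap.PAGESIZE, flags=mmap.MAP_PRIVATE)
    buf[0:1] = b"P"  # written by the parent before forking

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write must not be visible to the parent. On a
        # copy-on-write kernel it is the point where the page gets copied.
        os.close(read_fd)
        buf[0:1] = b"C"
        os.write(write_fd, buf[0:1])
        os._exit(0)

    # Parent: collect what the child saw, then look at our own copy.
    os.close(write_fd)
    child_byte = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return buf[0:1], child_byte
```

Calling child_write_is_private() should give the parent its original byte and the child its own modified byte, whichever way the kernel arranges the copying underneath.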

Much more efficient... I wonder if it's the way some systems do it ;)

I will try and look at the Linux source code sometime...

So - I still hold that there isn't a need for a stub process....
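
To make the fork-then-exec case concrete, here is a rough sketch, again in Python and assuming a Unix system. The function name fork_and_exec, the ballast_mb parameter and the default command are all hypothetical; the bytearray just stands in for a large in-memory cache. It does not prove anything about what a given kernel copies, it only shows the pattern under discussion: a process holding a lot of memory forks, and the child immediately replaces its image with exec.

```python
import os
import sys


def fork_and_exec(argv=None, ballast_mb=64):
    """Hold ballast_mb megabytes, fork, exec argv in the child, and
    return the child's exit code (or -1 if it did not exit normally)."""
    if argv is None:
        # Hypothetical default: a child that does nothing and exits 0.
        argv = [sys.executable, "-c", "pass"]

    # Stand-in for a large process image (e.g. an in-memory cache).
    ballast = bytearray(ballast_mb * 1024 * 1024)

    pid = os.fork()
    if pid == 0:
        try:
            # The child never writes to the ballast; exec replaces its
            # whole address space with the new program.
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)

    _, status = os.waitpid(pid, 0)
    # The parent's data is still intact after the child has come and gone.
    assert len(ballast) == ballast_mb * 1024 * 1024
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Running this with a large ballast_mb while watching free memory is one rough way to check whether a given system really needs twice the RAM for the fork.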

        Oskar
Received on Tue Jul 29 2003 - 13:15:41 MDT
