Re: copy-on-write

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Mon, 2 Jun 1997 12:07:03 +0300 (EETDST)

> From: Oskar Pearson <oskar@is.co.za>
>
> Duane - is this list big enough to discuss stuff like this? about how many
> people read this? It's not very active... people might be on squid-users
> that don't know about this list...

    For me, squid-users is too much; I read only this list. So if anything
 interesting appears on the users list, someone could crosspost it here.

> > > Much more efficient... I wonder if it's the way some systems do it ;)
> >
> > Nope. Recall what page types can be for any MMU hardware: read-only,
> Hmm - Ok, I thought that this was done at the kernel level, rather than
> using the MMU... of course this could be very inefficient...
>
> > read/shared, read/exec, exec/shared, read/write/shared. Of these, only
> > read-only gives a chance to track two processes trying to write the
> > same page. If you have ALL pages shared/writeable, both child and parent
> > would write concurrently without any kernel notice - bad thing (tm).
>
> Unless you get the actual write call to look at the permissions on the
> block.

 What do you mean by a write call? (Shared) memory is supposed to be accessed
 directly. You are right that the kernel does the setup and the permission
 checks, but the kernel is notified by the MMU hardware, via a hardware
 trap, about events like a page not being in RAM or a write to a read-only
 page. The kernel then looks up its allocation structures and decides
 whether to map the page in or deliver SIGSEGV. The MMU notifies the kernel
 only about violations: if a page is mapped in memory and its permissions
 allow a process to write there, there is no kernel notification at all.
 This is what I was talking about.
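
 Just to illustrate (a minimal sketch of my own, not Squid code): with
 copy-on-write fork(), the child's first write traps into the kernel, which
 copies the page privately, so the parent never sees the change:

    /* cow.c - demonstrate that a write after fork() stays private */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int counter = 0;

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(1);
        }
        if (pid == 0) {                 /* child */
            counter = 42;               /* write faults; kernel copies page */
            printf("child:  counter = %d\n", counter);  /* prints 42 */
            _exit(0);
        }
        wait(NULL);                     /* parent */
        printf("parent: counter = %d\n", counter);      /* still 0 */
        return 0;
    }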

> > What you are wondering about is actually vfork(). It is pretty good
> > for Solaris and very likely would solve problems for forking just a child
> > process. But how many OS-es support vfork()? man page says "its cool,
> > but na-na-na, be careful..." not a big deal, but makes coders nervous
> > when its gotchas are different on every OS...
> hmm :)
> from the linux man page for fork:

  Here is one from FreeBSD (Solaris is pretty close):

     vfork - spawn new process in a virtual memory efficient way

DESCRIPTION
  Vfork() can be used to create new processes without fully copying the
  address space of the old process, which is horrendously inefficient in a
  paged environment. It is useful when the purpose of fork(2) would have
  been to create a new system context for an execve(2). Vfork() differs
  from fork(2) in that the child borrows the parent's memory and thread of
  control until a call to execve(2) or an exit (either by a call to exit(3)
  or abnormally). The parent process is suspended while the child is using
  its resources.
-----
 In the NOTES section, it says that this function will be eliminated in the
 future and made an alias for fork() once another (better) system sharing
 mechanism is implemented.
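
 For what it's worth, a minimal sketch of the canonical vfork() pattern
 (my own example, not from the man page): the child must do nothing but
 exec or _exit(), since it borrows the parent's address space:

    /* vfork_demo.c - spawn a command the vfork() way */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = vfork();
        if (pid < 0) {
            perror("vfork");
            exit(1);
        }
        if (pid == 0) {
            /* Child: parent is suspended until we exec or _exit().
             * Touching parent memory or calling exit(3) here is unsafe. */
            execlp("ls", "ls", "-l", (char *)NULL);
            _exit(127);                 /* only reached if the exec failed */
        }
        waitpid(pid, NULL, 0);          /* parent resumes after the exec */
        return 0;
    }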

> > > So - I still hold that there isn't a need for a stub process....
>
> > Term "stub process" in todo list is pretty far from well defined.
> > IMHO, ftp transfers should be included into squid main process some
> > day, dnsservers implemented either using async resolver lib or totally
> There is a web server called zeus (http://www.zeus.co.uk/products/server/)
> after a little interrogation it seems that they have their own resolver
> library... (It wants "a file to use for resolving" and runs all as one
> process)

    bind 4.9.x contrib contains an async resolver lib by Darren Reed,
 which I guess does exactly this. (arlib)
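
 I don't remember arlib's exact interface, so here is only a hedged sketch
 of the general shape such a resolver takes in a select() loop, using the
 stock res_mkquery() from the BIND resolver library; the dns_send_query()
 name is mine:

    /* Build a DNS query and send it on a UDP socket; the caller adds
     * the returned fd to its select() read set and parses the reply
     * when it becomes readable, instead of blocking in gethostbyname().
     * (res_init() should have been called once at startup.) */
    #include <arpa/nameser.h>
    #include <netinet/in.h>
    #include <resolv.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int dns_send_query(const char *name, struct sockaddr_in *server)
    {
        unsigned char query[512];
        int fd, len;

        len = res_mkquery(QUERY, name, C_IN, T_A, NULL, 0, NULL,
                          query, sizeof(query));
        if (len < 0)
            return -1;

        fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
            return -1;

        if (sendto(fd, query, len, 0, (struct sockaddr *)server,
                   sizeof(*server)) < 0) {
            close(fd);
            return -1;
        }
        return fd;      /* watch this fd in the main select() loop */
    }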

> > PS. Duane, is there any fundamental redesign planned for 1.2? I think I
>
> Well as I see it there are 2 ways to go:
>
> re-write using threads: makes the source easier to understand
> keep the current ideas: might end up just as easy to understand, may be more
> efficient

 As far as I understand, a rewrite in threads is held back by several
 factors. First, not all OSes support POSIX threads, but more important may
 be the fact that going to threads means abandoning Squid's current
 select-loop design entirely; that would be totally different code, and thus
 not Squid any more. Coders who write select-style code fluently might get
 stuck with a totally different style of concurrency, so this is not only a
 rewrite, it is also a re-learning of programming techniques. While I am a
 fan of threaded coding, I'd suggest treating the rewrite in threads as a
 parallel branch, comparing the two on difficulty, performance, etc., and
 switching only once we feel confident with the threaded code. This is a
 decision for the main Squid developer team.
    Of course, parts of Squid could run as threads inside a select-style
 Squid. This is quite possible and can have benefits. For example, the ICP
 service could run as a separate thread, sitting in a blocking read on the
 ICP socket; upon a read it would service the request immediately and then
 suspend again in the next read. But it would need concurrency independent
 of the main process, so a need for locking of shared memory structures
 arises at once. The benefits would be reduced polling of incoming sockets,
 better response time for them, and good integration with the other parts
 of Squid, which would be hard to achieve with a separate process.
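
 A minimal sketch of that idea, assuming POSIX threads (icp_socket and
 store_lock are made-up names for illustration):

    /* Dedicated ICP thread: block in recvfrom(), answer immediately,
     * lock shared store metadata while building the reply. */
    #include <pthread.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    extern int icp_socket;                  /* assumed: bound UDP socket */
    pthread_mutex_t store_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *icp_service_thread(void *arg)
    {
        char buf[4096];
        struct sockaddr from;
        socklen_t fromlen;
        ssize_t n;

        (void)arg;
        for (;;) {
            fromlen = sizeof(from);
            n = recvfrom(icp_socket, buf, sizeof(buf), 0, &from, &fromlen);
            if (n <= 0)
                continue;
            pthread_mutex_lock(&store_lock);   /* protect shared structures */
            /* ... look up the object, build an ICP_HIT/ICP_MISS reply
             * into buf (placeholder) ... */
            pthread_mutex_unlock(&store_lock);
            sendto(icp_socket, buf, n, 0, &from, fromlen);
        }
        return NULL;
    }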

> Any thought about the "squid-filesystem", with one large file containing all
> data... (I posted a message about this to squid-dev a long time ago)

    Yes, many have thought the same way, I guess. You'll have to deal with
 the problems of fragmentation, allocation, and load sharing, and all this
 over many files on different Unix filesystems. If you want to get the most
 out of it, you'd want to use raw devices, and that means going pretty
 low-level toward the hardware: implementing disk caching, crash
 recovery(!), error checking and fixing... lots of work. Considered,
 deferred...

> If you did the following
> start a process
> pre-fork a baby process
> that baby process handles all connections
> if the baby nearly runs out of filehandles it says
> "#$%# - daddy, help me" and passes back the token
> telling the father to spawn another process. Father does
> so, and baby1 gives the filehandle/token to baby2 to handle
> all new incoming connections. (possibly using shared ram?)
> baby 2 runs out of filehandles. if baby1 is also low, father then starts
> baby3, otherwise baby1 takes over again while baby2 slowly
> closes filehandles.
>
> This essentially limits the number of filehandles as follows:
> max of 253 babies, if you use a unix socket to handle the communication
> from baby1 -> daddy -> baby2 with some FD's for logs
> thus you have about 253*255 sockets (64515)

    I think something like this is the way to go. I'll post my thoughts
 about this separately; there are many problems to solve...
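
 One of the hard ones is the "passes back the token" step itself: handing
 a descriptor from one process to another. On BSD-derived Unixes this is
 done with SCM_RIGHTS over a Unix domain socket; a minimal sketch (my
 function name, error handling trimmed):

    /* Pass fd to the process on the other end of the Unix socket
     * 'channel'; the kernel duplicates the descriptor into the
     * receiver, which picks it up with a matching recvmsg(). */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int send_fd(int channel, int fd)
    {
        struct msghdr msg;
        struct iovec iov;
        struct cmsghdr *cmsg;
        char cbuf[CMSG_SPACE(sizeof(int))];
        char dummy = 'F';

        memset(&msg, 0, sizeof(msg));
        iov.iov_base = &dummy;              /* must carry at least one byte */
        iov.iov_len = 1;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cbuf;
        msg.msg_controllen = sizeof(cbuf);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;       /* ancillary payload is an fd */
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(channel, &msg, 0) < 0 ? -1 : 0;
    }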

 regards,

-------------------------------------------------------------------
 Andres Kroonmaa Telefon: 6308 909
 Network administrator
 E-mail: andre@ml.ee Phone: (+372) 6308 909
 Organization: MicroLink Online
 EE0001, Estonia, Tallinn, Sakala 19 Fax: (+372) 6308 901
-------------------------------------------------------------------