Re: memory mapped store entries

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Thu, 27 Aug 1998 12:41:29 +0300 (EETDST)

On 27 Aug 98, at 18:44, Chris Wedgwood <chris@cybernet.co.nz> wrote:

> On Thu, Aug 27, 1998 at 09:23:43AM +0300, Andres Kroonmaa wrote:
>
> > Dirty pages have some max lifetime in any kernel. Kernel _must_
> > flush them in some timeframe, otherwise it is a _broken_ kernel.
>
> I agree it should, and I assumed it did - but my simple tests
> indicate on an idle linux machine, this timeframe is more than 10
> minutes.
 
 Oh, Solaris, for example, guarantees that a dirty page is flushed within
 30 seconds at worst (by default).
 
> > No. You don't want to explicitly msync anything. msync will block the
> > whole process for the duration of a pageout.
>
> No it won't - or shouldn't. See the man page for msync, it can be
> done asynchronously, or at worst - in another thread (assuming we're
> using async-io).

 You're right, I didn't read the man page closely enough.
 
> If the kernel has a semaphore or lock around all those pages for the
> duration of the sync, then yes - that will block, but I
> don't know of any kernel which does this.

 Perhaps not. I was assuming that msync would not return until the
 pageouts are guaranteed. The async flag (MS_ASYNC) changes my view.
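
 For reference, a minimal sketch of the two msync modes discussed above,
 assuming a POSIX system. The helper name msync_demo, the temp-file path
 and the "store entry" payload are made up for illustration and are not
 squid code. MS_ASYNC only schedules the write-back and returns at once;
 MS_SYNC returns only after the write-back has completed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_LEN 4096  /* illustrative mapping size */

/* Hypothetical demo: dirty a shared file mapping, flush it both ways,
 * then read the file back through the descriptor.
 * Returns 0 on success, -1 on any failure. */
int msync_demo(void)
{
    char path[] = "/tmp/msync-sketch-XXXXXX";
    const char msg[] = "store entry";      /* 11 chars + NUL */
    char buf[sizeof msg] = {0};
    char *p;
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                          /* file vanishes on close */
    if (ftruncate(fd, MAP_LEN) != 0)
        goto out;

    p = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    assert(sizeof msg <= MAP_LEN);
    memcpy(p, msg, sizeof msg);            /* page is now dirty */

    /* Schedule write-back without blocking the caller. */
    if (msync(p, MAP_LEN, MS_ASYNC) != 0)
        goto unmap;
    /* Block until the dirty pages have been written. */
    if (msync(p, MAP_LEN, MS_SYNC) != 0)
        goto unmap;

    /* The data written through the mapping is visible via the fd. */
    if (pread(fd, buf, sizeof buf, 0) == (ssize_t)sizeof buf &&
        memcmp(buf, msg, sizeof msg) == 0)
        rc = 0;
unmap:
    munmap(p, MAP_LEN);
out:
    close(fd);
    return rc;
}
```

 Calling msync_demo() should return 0 on a system where both modes are
 supported. Note that this only exercises the calls; it says nothing
 about how long an unsynced dirty page may sit in memory.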

> > Again I disagree. mmapped data could become inconsistent _only_ in the
> > case that the squid process is wiped out of existence while it is in
> > the process of updating a single store entry AND has not yet updated
> > the structure consistently. In any other case, the mmapped store data
> > will be consistent, as the kernel _must_ flush all dirty pages when it
> > has to unmap all pages related to the process.
>
> Consider this:
>
> - swap is sync'd to disk (somehow)
> - lots of new files are added to store, perhaps some removed
> - file blocks and file-metadata blocks get written to disk
> - power goes off
>
> Now, what's mmap'd is out of date and _wrong_.

 Yeah, it will be as old as the last page flush, no older. It will just
 lack recent changes, but it won't be corrupt. The same goes for current
 squid. And unless you use synchronous disk I/O, you'd have the FS in a
 mess anyway, so what's the point in arguing about mmap or any other
 delayed write?

> > It is much more reliable than catching signals and trying to write
> > out swaplog as it is done currently.
>
> Yes, I agree. mmap is a win - but it's not 100% foolproof against
> corruption because my tests show the kernel does not flush dirty
> buffers at all or very frequently, not my kernel anyhow.

 Then your kernel is broken, or timed page flushes are explicitly disabled.

> Test code appended. Run it... wait for 10 minutes or so, then
> HARD-REBOOT. Examine /tmp/mmap-test when the machine comes back up
> and tell me if it's not all zeros still.

 I don't have a spare box around, but this is at the core of the Solaris
 I/O system, so I'm pretty sure it is flushed within 30 secs.

 Anyway, we have arrived at the understanding that we must have an ifdef
 that would do msync regularly on some systems...
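
 A rough sketch of what such an ifdef could look like, under stated
 assumptions: the macro NEED_PERIODIC_MSYNC, the 30-second interval, the
 struct mapped_store and both function names are hypothetical and do not
 exist in squid. The idea is simply to call msync with MS_ASYNC from the
 main loop once the interval has elapsed, and to compile that out on
 systems whose kernel already flushes dirty pages on a timer.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical build-time switch: set to 0 where the kernel is trusted
 * to flush dirty mmapped pages by itself. */
#ifndef NEED_PERIODIC_MSYNC
#define NEED_PERIODIC_MSYNC 1
#endif
#define MSYNC_INTERVAL_SECS 30   /* illustrative, mirrors the Solaris default */

struct mapped_store {
    void  *base;       /* start of the shared mapping */
    size_t len;        /* length of the mapping in bytes */
    time_t last_sync;  /* when a flush was last scheduled */
};

/* Demo-only helper: back the store with an unlinked temp file. */
int store_open_temp(struct mapped_store *st, size_t len)
{
    char path[] = "/tmp/mmap-sketch-XXXXXX";
    void *p;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping keeps the file alive */
    if (p == MAP_FAILED)
        return -1;
    st->base = p;
    st->len = len;
    st->last_sync = 0;
    return 0;
}

/* Call from the main loop. Returns 1 if a flush was scheduled,
 * 0 if nothing was done, -1 if msync failed. */
int store_maybe_sync(struct mapped_store *st, time_t now)
{
    assert(st != NULL);
#if NEED_PERIODIC_MSYNC
    if (now - st->last_sync < MSYNC_INTERVAL_SECS)
        return 0;                  /* interval not yet elapsed */
    if (msync(st->base, st->len, MS_ASYNC) != 0)
        return -1;
    st->last_sync = now;
    return 1;
#else
    (void)now;
    return 0;                      /* kernel flushes on its own */
#endif
}
```

 With the switch enabled, store_maybe_sync(&st, time(NULL)) called once
 per main-loop pass would schedule at most one asynchronous flush every
 30 seconds and never block on the pageout itself.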

 ----------------------------------------------------------------------
  Andres Kroonmaa mail: andre@online.ee
  Network Manager
  Organization: MicroLink Online Tel: 6308 909
  Tallinn, Sakala 19 Pho: +372 6308 909
  Estonia, EE0001 http://www.online.ee Fax: +372 6308 901
 ----------------------------------------------------------------------
Received on Tue Jul 29 2003 - 13:15:53 MDT
