Re: memory mapped store entries

From: Chris Wedgwood <chris@dont-contact.us>
Date: Tue, 25 Aug 1998 15:37:23 +1200

On Tue, Aug 25, 1998 at 12:58:12PM +1000, Stewart Forster wrote:

> Lock the page with a semaphore.

Semaphores can be really expensive. On big MP machines with async IO
and high loads, this needs to be as cheap as reasonably possible.

> What about when paging occurs due to memory shortage. Sure paging
> is bad in any event. I was thinking mlock() would ensure the
> mmap()ed pages would remain resident so they didn't require any
> page-faults to bring them in again after being paged out.

mlock is expensive, and so is mmap.

mmap+writev is a big win for large files, but my tests indicate it
sucks rocks for small files. mmap takes time to set up and tear down,
and it thrashes the TLB a lot.
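
For concreteness, a minimal sketch of the two paths being compared;
the function names and the 8K buffer are mine, not Squid's, and error
handling is abbreviated:

    #include <sys/mman.h>
    #include <unistd.h>

    /* Large-file path: map the object and push it out in one write().
     * Saves a copy through a user buffer, but pays for the map/unmap
     * and the TLB entries it touches. */
    static int send_mmap(int in_fd, int out_fd, size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
        if (p == MAP_FAILED)
            return -1;
        ssize_t n = write(out_fd, p, len);
        munmap(p, len);
        return n == (ssize_t)len ? 0 : -1;
    }

    /* Small-file path: a plain copy loop. One extra data copy, but no
     * VM setup/teardown, which dominates for small objects. */
    static int send_copy(int in_fd, int out_fd)
    {
        char buf[8192];
        ssize_t n;
        while ((n = read(in_fd, buf, sizeof buf)) > 0)
            if (write(out_fd, buf, n) != n)
                return -1;
        return n == 0 ? 0 : -1;
    }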

madvise is pretty much necessary, and not all OSs have this.

mmap+squidFS would rock, but that assumes your VM is as big as your
disk space, which for most people isn't an option (a 32-bit address
space tops out at a few GB of mappings, nowhere near a big spool).

I think if mmap+write is going to be used, a hybrid approach would be
best, or better still abstract this a bit so things like the linux
copyfd and HP-UX sendfile can be used on architectures which support
them.
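
Something like the following is what I mean by abstracting it. The
HAVE_SENDFILE symbol and the file_to_fd() name are made up here, and
the sendfile() call shown is the Linux-style one:

    #include <unistd.h>
    #ifdef HAVE_SENDFILE
    #include <sys/sendfile.h>
    #endif

    static int file_to_fd(int in_fd, int out_fd, off_t off, size_t len)
    {
    #ifdef HAVE_SENDFILE
        /* Zero-copy path where the OS provides one. */
        return sendfile(out_fd, in_fd, &off, len) == (ssize_t)len ? 0 : -1;
    #else
        /* Portable fallback: plain read/write loop. */
        char buf[8192];
        if (lseek(in_fd, off, SEEK_SET) == (off_t)-1)
            return -1;
        while (len > 0) {
            ssize_t n = read(in_fd, buf, len < sizeof buf ? len : sizeof buf);
            if (n <= 0 || write(out_fd, buf, (size_t)n) != n)
                return -1;
            len -= (size_t)n;
        }
        return 0;
    #endif
    }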

> Maybe. You have assumed ample memory to prevent buffer cache
> thrash for your mmap()ed pages.

mmap'ing to the extent of serious overcommit will make performance
tuning complicated.

Right now, most (all?) OSs do read-ahead for read, but not all do for
mmap - Linux 2.0.x sucks rocks for mmap+write if paging is required
(but if it isn't, it kills for speed).

Linux 2.1.x is better, and I think Solaris is better still (using
madvise sensibly).
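
Where madvise exists it's cheap to apply. A sketch of hinting a
mapping that will be read front to back (the wrapper name is mine):

    #include <sys/mman.h>

    static void hint_sequential(void *addr, size_t len)
    {
    #ifdef MADV_SEQUENTIAL
        /* Advisory only, so a failure here is harmless. */
        (void)madvise(addr, len, MADV_SEQUENTIAL);
    #endif
    }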

> I'd do that with a threaded fsync() happening every N secs.

fsync syncs metadata too... fdatasync syncs only the data. Perhaps do
one of the former and 5 of the latter or something?
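
Roughly, the sync thread would then look like this; the 1-in-6 ratio
and the interval are illustrative only, not measured:

    #include <unistd.h>

    static void sync_loop(int fd, unsigned interval_secs)
    {
        unsigned pass;
        for (pass = 0;; pass++) {
            sleep(interval_secs);
            if (pass % 6 == 5)
                fsync(fd);      /* data + metadata: the expensive one */
            else
                fdatasync(fd);  /* data only, skips the inode update */
        }
    }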

> You'd have to do comparisons first to say that.

Oh, and mlock is only available to uid=0 on most OSs, which probably
means it's not going to be of use to many people... (nobody runs squid
as root, right boys and girls?)
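
So if mlock gets used at all, it has to fail soft. A sketch of
degrading gracefully when the process isn't privileged (the wrapper
name is mine):

    #include <sys/mman.h>
    #include <errno.h>

    static int try_lock_pages(void *addr, size_t len)
    {
        if (mlock(addr, len) == 0)
            return 1;   /* pages pinned in core */
        if (errno == EPERM)
            return 0;   /* not root: run unpinned, as most sites will */
        return -1;      /* genuine failure */
    }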

> That's pretty low load. I'm talking about 5000 HTTP requests per
> minute at peak loads across 80Gb cache swaps. The mmap() system
> you propose will save a fair chunk of memory on that.

Ouch... Bang bang the Packet Man.

What OS/mem/config are you running?

> I propose that a side-patch for your stuff is generated and tested
> against some real killer caches before committing it. If it's just
> a #define at compile time, then I'm happy with that too.

./configure --enable-mmap-io

-cw