Re: squid rewrite from Andres Kroonmaa on 1997-06-09 (squid-dev)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Mon, 9 Jun 1997 11:37:40 +0200 (EETDST)

> From: Oskar Pearson <oskar@is.co.za>
>
> What I am basically proposing:
>
> Create a process that does the listen on the incoming sockets. It also
> functions as a database manager, in that it knows all of the objects
> in the cache and where they are stored. This database will include
> locking, so that as an object is being downloaded it isn't expired.
>
> This database is stored as (?a separate file?) a memory-mapped (man mmap)
> file with the synchronous flags on so that one update immediately
> becomes visible to all processes that have the same memory-mapped
> file. Note that these processes can't change the file (they open
> it with "PROT_READ" - only the "central-object-manager" has access
> to "PROT_WRITE")

Mmm, I guess the sync flag is there to flush changes to disk immediately.
If the file is mmap-ed MAP_SHARED, changes are visible to all procs
already. Syncing to disk on every update would only slow things down.
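
A minimal sketch of the MAP_SHARED point, in Python for brevity. The
function name, the temp file and the 6-byte payload are made up for
illustration; only the mmap flags come from the discussion. One process
holds a writable mapping, a forked child reads through a PROT_READ
mapping of the same file, and no sync call is made in between:

```python
import mmap
import os
import tempfile

def shared_update_visible():
    """Write through one MAP_SHARED mapping, read it in a child via a read-only one."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"\0" * mmap.PAGESIZE)          # backing file, one page long
        f.flush()

        # Hypothetical "central-object-manager": the only writable mapping.
        writer = mmap.mmap(f.fileno(), mmap.PAGESIZE, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ | mmap.PROT_WRITE)
        # Service processes map the same file read-only.
        reader = mmap.mmap(f.fileno(), mmap.PAGESIZE, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ)

        go_r, go_w = os.pipe()                  # parent -> child: "update is done"
        res_r, res_w = os.pipe()                # child -> parent: what the child saw
        pid = os.fork()
        if pid == 0:                            # child: read-only service process
            os.read(go_r, 1)                    # wait until the parent has written
            os.write(res_w, reader[:6])
            os._exit(0)

        writer[:6] = b"object"                  # update the database; no msync/flush
        os.write(go_w, b"x")
        seen = os.read(res_r, 6)
        os.waitpid(pid, 0)

        for fd in (go_r, go_w, res_r, res_w):
            os.close(fd)
        writer.close()
        reader.close()
        return seen
```

The child sees the new bytes because both mappings share the same
pages. Flushing to disk is a separate question of durability, not of
visibility between processes.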

> If the baby-process gets too loaded (eg nears the fd limit) it means that
> the parent process can then fork (possibly pre-fork?) a second
> process that it can then pass all requests to. If there are multiple
> CPUs in the machine it can send every 2nd request to one process, then
> the other... thus we eliminate both problems at once.

I'd suggest preforking a fixed number of service processes and
distributing load between them evenly. This limits the impact of a
crash of any one service process. If you have 1 proc serving 1000 sessions
and it goes down, I guess about 300-400 users will be really pissed. If
you have spread these 1000 sessions across 10 processes, only 30-40 users
will be pissed...
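
A rough sketch of the prefork idea, in Python for brevity. This is not
how Squid does it; the function name, the pipe-based hand-off and the
"index:request" reply format are assumptions for illustration. The
parent forks all workers up front and hands requests out round-robin:

```python
import os

def prefork_round_robin(n_workers, requests):
    """Fork n_workers up front, dispatch round-robin, return {worker: replies}."""
    workers = []                                # (pid, pipe to worker, pipe from worker)
    for idx in range(n_workers):
        to_r, to_w = os.pipe()
        from_r, from_w = os.pipe()
        pid = os.fork()
        if pid == 0:                            # child: one service process
            try:
                for _, w, r in workers:         # drop handles inherited from siblings
                    w.close()
                    r.close()
                os.close(to_w)
                os.close(from_r)
                with os.fdopen(to_r) as inp, os.fdopen(from_w, "w") as out:
                    for line in inp:            # "serve" each request by echoing it
                        out.write(f"{idx}:{line}")
                        out.flush()
            finally:
                os._exit(0)
        os.close(to_r)
        os.close(from_w)
        workers.append((pid, os.fdopen(to_w, "w"), os.fdopen(from_r)))

    served = {i: [] for i in range(n_workers)}
    for n, req in enumerate(requests):
        idx = n % n_workers                     # every n-th request to the same worker
        _, to_w, from_r = workers[idx]
        to_w.write(req + "\n")
        to_w.flush()
        served[idx].append(from_r.readline().rstrip("\n"))

    for pid, to_w, from_r in workers:           # closing the pipe ends the worker loop
        to_w.close()
        from_r.close()
        os.waitpid(pid, 0)
    return served
```

With 1000 sessions over 10 workers each worker holds 100 of them, so
one crash takes out a tenth of the sessions instead of all of them.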

> The only way I can figure a way around both these problems is to
> use shared memory (or some form of ICP - shared memory will be the most
> efficient, I think, since it doesn't involve system calls and waiting
> on unix-domain sockets etc)

Yep, shared mem would be best.

> If we want to get really fancy we could mmap the entire structure
> that the data is stored in.... letting the OS handle all of the disk-io
> without the lag caused by the filesystem.

Reconsider. mmap-ed data has one big drawback: you never know when
you will block the whole process on page IO. The select loop is
non-blocking, but when you simply touch a page that turns out to be
on disk, it is hard to do that in a non-blocking way. The biggest problem
is that the OS cannot optimize disk IO this way. It must service 1
page fault at a time, as opposed to choosing among 1000s of descriptors
when using select().
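
For contrast, a small sketch of the select() side, in Python for
brevity. The function name and the use of socketpairs as stand-in
client connections are assumptions for illustration. The loop asks the
kernel which of many descriptors are ready and only touches those, so
it never stalls on an idle one:

```python
import select
import socket

def ready_subset(n_conns, active):
    """Return the indexes of the connections that select() reports as readable."""
    pairs = [socket.socketpair() for _ in range(n_conns)]
    try:
        for i in active:                        # only some clients have sent data
            pairs[i][1].send(b"GET / HTTP/1.0\r\n\r\n")
        readers = [ours for ours, _ in pairs]
        # One call covers every connection; idle ones are simply not returned.
        readable, _, _ = select.select(readers, [], [], 1.0)
        return sorted(readers.index(s) for s in readable)
    finally:
        for ours, theirs in pairs:
            ours.close()
            theirs.close()
```

An access to an mmap-ed page offers no such readiness check: if the
page is not resident, the process just stops until it has been read in.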

-------------------------------------------------------------------
Andres Kroonmaa Telefon: 6308 909
Network administrator
E-mail: andre@ml.ee Phone: (+372) 6308 909
Organization: MicroLink Online
EE0001, Estonia, Tallinn, Sakala 19 Fax: (+372) 6308 901
-------------------------------------------------------------------
Received on Tue Jul 29 2003 - 13:15:41 MDT
