Re: squid-1.2-SQUIDFS

From: Stephen R. van den Berg <srb@dont-contact.us>
Date: Sun, 31 May 1998 02:49:19 +0200

Alex Rousskov wrote:
>Assume that the entire disk cache is mmap()-ed. Assume, you have almost
>enough memory, but you are short on 100K only. Assume the OS decides to swap
>out main.o portion of the Squid process. This would require, say, 10 disk
>I/Os. The net disk activity is, thus, minimized (no I/Os after everything is
>prefetched into memory). However, since main.o is on disk, Squid does nothing
>but waiting for being paged in.

Well, there are some flaws in this logic, I think.
1. Once we're 100KB short on memory, whether we're mmapping or not, the
   above worst case is equally likely to occur. That is, there is no
   particular reason why mmapping would make this worst case happen more
   often than in the non-mmapping case.
2. Even more so, when we mmap files instead of reading them into memory,
   we share the IO buffers with the OS, so we actually save memory: the
   data is no longer duplicated in a private buffer, as the sketch below
   illustrates. Due to the reduced memory requirements, the worst-case
   swapping problems will set in considerably later.
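
As a concrete illustration, here is a minimal sketch in plain C (not
actual Squid code; "swapfile" merely stands in for a cache swap file)
of the two data paths:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat st;
        int fd = open("swapfile", O_RDONLY);  /* hypothetical cache file */

        if (fd < 0 || fstat(fd, &st) < 0) {
            perror("swapfile");
            return 1;
        }

        /* read(): the kernel copies from its page cache into a private
         * buffer, so the object's data exists twice in memory. */
        char *copy = malloc(st.st_size);
        if (copy == NULL || read(fd, copy, st.st_size) != st.st_size) {
            perror("read");
            return 1;
        }

        /* mmap(): the mapping is backed by the kernel's own page-cache
         * pages, so user and kernel space share one set of IO buffers. */
        char *shared = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (shared == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* ... serve the object from `shared' with no second copy ... */
        munmap(shared, st.st_size);
        free(copy);
        close(fd);
        return 0;
    }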
   
>preserve. Moreover, IMO, the more tasks you hand over to OS, the better the
>chances that it will make suboptimal (for your application) decision are. As

True, but by mmapping files you are, in a way, actually reducing the
involvement of the OS, since user and kernel space share the same IO
buffers.

>far as I understand, none of the Unix OSes was _designed_ to handle massive
>amounts of mmap()-ed files (by count and/or volume). Thus, I would expect a
>lot of "surprises" when it comes to real world performance.

Possible. Then again, from a kernel point of view, mmapping large files
is basically the same operation as using a lot of swap space. I would
expect OSes to be quite capable of handling that.
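
Where the platform provides madvise(), the application can even tell
the pager what to expect; a hypothetical helper (illustrative only, not
Squid code) might look like:

    #include <stddef.h>
    #include <sys/mman.h>

    /* Map a large cache file read-only and hint the VM system about the
     * access pattern, so its paging decisions are less of a gamble. */
    void *map_cache_file(int fd, size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

        if (p == MAP_FAILED)
            return NULL;
        /* Cached objects are typically streamed front to back once, so
         * the kernel may read ahead and drop pages behind the reader. */
        madvise(p, len, MADV_SEQUENTIAL);
        return p;
    }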

>Finally, as far as memory copying is concerned, has anybody calculated the
>number of malloc()/memcpy()/strlen()/free() and similar operations Squid does
>to process a single request? I would expect that the total overhead Squid
>introduces dominates whatever copying is done by the kernel. If somebody
>cares about this overhead, the place to start is Squid, not the OS, I guess.

The overhead you're talking about occurs at the start of each request,
while the header is being processed. Once Squid starts passing data
through, the cost is dominated by memory footprint and copying. Both
bottlenecks have the potential to dominate; which one bites first
probably depends on the kernel architecture at hand and on the request
patterns.
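
For comparison, the classic pass-through loop looks roughly like this
(illustrative only, not Squid's actual code; the buffer size is
arbitrary). Every block is copied twice, once from the kernel page
cache into buf and once from buf into the socket buffers; with the
object mmap()ed, the first of those copies disappears:

    #include <unistd.h>

    /* Relay one cached object from the cache fd to the client socket.
     * Short writes are treated as errors for brevity. */
    ssize_t relay(int cache_fd, int client_fd)
    {
        char buf[16384];
        ssize_t n, total = 0;

        while ((n = read(cache_fd, buf, sizeof buf)) > 0) {
            if (write(client_fd, buf, n) != n)
                return -1;
            total += n;
        }
        return n < 0 ? -1 : total;
    }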

-- 
Sincerely,                                                          srb@cuci.nl
           Stephen R. van den Berg (AKA BuGless).
The eleventh commandment: Thou shalt not re-curse!