Hmmmm.

From: Michael O'Reilly <michael@dont-contact.us>
Date: Mon, 17 Mar 1997 22:01:20 +0800

Food for thought. This is a profiling of real time (i.e. wall clock
time) consumed per system call (this was generated with the use of
strace, and a bit of perl).
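A rough sketch of the aggregation step (the original used a bit of perl; this is an equivalent Python version, assuming strace was run with -T so each line ends with the elapsed wall-clock time in angle brackets, e.g. `open("/cache/00/0000", O_RDONLY) = 7 <0.036348>`):

```python
import re
from collections import defaultdict

# Matches "<syscall>(<args>) = <ret> <elapsed>" lines from strace -T.
LINE = re.compile(r'^(\w+)\(.*<([\d.]+)>\s*$')

def profile(lines):
    count = defaultdict(int)
    total = defaultdict(float)
    for line in lines:
        m = LINE.match(line)
        if not m:          # skips unfinished/resumed lines, signals, etc.
            continue
        call, secs = m.group(1), float(m.group(2))
        count[call] += 1
        total[call] += secs
    grand = sum(total.values()) or 1.0
    # Columns as in the table below: calls, total s, mean s, share of traced time.
    rows = [(c, count[c], total[c], total[c] / count[c], total[c] / grand)
            for c in total]
    return sorted(rows, key=lambda r: r[2])

sample = [
    'open("/cache/00/0000", O_RDONLY) = 7 <0.0400>',
    'read(7, "...", 4096) = 4096 <0.0100>',
    'open("/cache/00/0001", O_RDONLY) = 8 <0.0300>',
]
for row in profile(sample):
    print(row)
```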

The columns run:
#1 is the number of times this system call was made during the trace
        run.
#2 is the total time spent in this system call in seconds.
#3 is the average time used per invocation.
#4 is the portion of total time used on this system call.

        <system call> <#1> <#2> <#3> <#4>

                bind 25 0.0134 0.000537 0.00006
                stat 10 0.0262 0.002620 0.00012
              sendto 78 0.0857 0.001099 0.00041
            recvfrom 79 0.1086 0.001374 0.00052
            getdents 20 0.1486 0.007432 0.00071
          setsockopt 1205 0.4845 0.000402 0.00231
         getsockname 2311 0.9258 0.000401 0.00441
              socket 1205 1.1376 0.000944 0.00542
               lseek 4098 1.5391 0.000376 0.00733
             connect 2393 2.0355 0.000851 0.00970
              accept 2311 2.7471 0.001189 0.01309
        gettimeofday 9526 3.8969 0.000409 0.01857
               close 4980 4.0235 0.000808 0.01917
               fcntl 17304 6.2089 0.000359 0.02959
               write 13615 18.6318 0.001368 0.08879
              select 39879 25.4206 0.000637 0.12114
              unlink 1073 39.7614 0.037056 0.18948
                read 7883 44.5279 0.005649 0.21220
                open 1599 58.1204 0.036348 0.27697
Total time elapsed: 209.843380000015

As you can see, select gets called an ungodly number of times. As far
as I can see, this is due to the very frequent ICP checking.
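Back of the envelope, using the figures from the table above:

```python
# 39879 select() calls over the 209.84 s trace window.
calls, elapsed = 39879, 209.84
rate = calls / elapsed
print(round(rate))   # roughly 190 select() calls per second
```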

The bulk of the wall clock time, though, is used in unlink, read, and open,
which account for about 68% of total elapsed time!

The main problem seems to be the synchronous I/O waiting on disk,
amplified by the fact that only one disk queue is being used.

I had some thoughts about having an Apache-style model, with multiple
processes using mmap() to share the index, and locking to serialize the
accept()s. Anyone looking at this? It'd be a lot more portable than
the async disk I/O I see some people looking at...
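A minimal sketch of the idea, assuming POSIX (the names here are illustrative, not Squid's): a shared anonymous mmap() region stands in for the cache index, and an flock()ed file stands in for the lock that would serialize accept() among forked workers:

```python
import fcntl
import mmap
import os
import struct
import tempfile

INDEX_SIZE = 4096
# MAP_SHARED | MAP_ANONYMOUS: visible to all children forked after this point.
index = mmap.mmap(-1, INDEX_SIZE)
lockfile = tempfile.TemporaryFile()

def locked(fn):
    """Run fn while holding the exclusive lock (the accept() serializer)."""
    fcntl.flock(lockfile, fcntl.LOCK_EX)
    try:
        return fn()
    finally:
        fcntl.flock(lockfile, fcntl.LOCK_UN)

def worker():
    # A real worker would accept() a connection here, then update the
    # shared index; we just bump a counter at offset 0.
    index.seek(0)
    index.write(struct.pack("I", 42))

pid = os.fork()
if pid == 0:
    locked(worker)
    os._exit(0)

os.waitpid(pid, 0)
index.seek(0)
hits = struct.unpack("I", index.read(4))[0]
print(hits)   # the child's write is visible through the shared mapping -> 42
```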

The cache is handling 8 gig of disk, running on a PPro 200 with 256Meg
of ram. Squid was using 170meg of ram when this trace was taken, and
there was about 60 meg of ram being used for disk cache. It normally
takes about 55,000 connections/hour.

comments?

Michael.
Received on Mon Mar 17 1997 - 06:20:37 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:34:42 MST