disk utilisation over 80 %

From: Marcel Janssen <maja@dont-contact.us>
Date: Wed, 05 May 1999 15:52:55 +0200

Hi,
We run Squid 2.1 PATCH 2 on a SPARCstation 5
with 224 MB of memory and 2 cache_dirs of about 3.6 GB.

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t3d0s0    1753142  761457  939091    45%    /
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
swap                  143988      12  143976     1%    /tmp
/dev/dsk/c0t1d0s6    4372182 2686552 1641909    63%    /usr/local/squid/cache1
/dev/dsk/c0t4d0s6    4372182 3031577 1296884    71%    /usr/local/squid/cache2

cache_mem 2 MB
maximum_object_size 16 MB
the rest is default

Disk utilisation on the root disk (sd3 in the iostat output below) is
over 80% and the system is waiting for I/O on it. (BTW: logging is on
the root disk.)

     sd1             sd3             sd4             nfs1
 rps wps  util   rps wps  util   rps wps  util   rps wps  util
   1   5   4.3    62   3  81.6     1   6   4.1     0   0   0.0
   0   5   3.5    63   3  82.1     2  10   7.3     0   0   0.0
   1   3   2.0    62   2  81.7     2   4   3.6     0   0   0.0
   1   6   4.7    62   3  83.6     1   7   6.3     0   0   0.0
   1   5   3.7    64   4  85.7     1   5   3.8     0   0   0.0
   1   4   3.9    64   3  82.9     1   5   4.0     0   0   0.0
   1   7   4.6    62   3  81.5     1   4   3.8     0   0   0.0
   1   5   3.9    65   3  84.1     1   5   4.1     0   0   0.0
   1   7   5.2    65   3  83.4     1   5   4.0     0   0   0.0
   1   5   3.0    61   9  82.7     2   2   3.3     0   0   0.0
   1   5   3.9    58  17  85.8     2   6   6.9     0   0   0.0
   1   4   2.7    58  19  86.2     1   7   5.4     0   0   0.0
   1   4   3.3    57  17  84.7     1   5   3.8     0   0   0.0
   1   5   4.4    59  17  86.3     2   5   4.5     0   0   0.0
   1   5   4.5    56  16  83.9     2   7   5.7     0   0   0.0
   1   5   3.8    59  18  88.5     1   5   3.5     0   0   0.0
   2   5   3.7    58  19  87.4     0   1   1.2     0   0   0.0
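
A minimal sketch, not part of the setup described here, of how the util
column in output like the above could be averaged per disk. It assumes
every data row holds one rps/wps/util triple per disk, in the order of
the header. The name mean_util is hypothetical, and the sample rows are
simply the first three rows from the table above.

```python
from statistics import mean

def mean_util(text, disks):
    """Average the util column for each disk in iostat-style rows.

    Each data row is assumed to hold one rps/wps/util triple per disk,
    in the order given by `disks`. Header lines are skipped.
    """
    samples = {disk: [] for disk in disks}
    for line in text.splitlines():
        try:
            fields = [float(x) for x in line.split()]
        except ValueError:
            continue  # header line such as "rps wps util ..."
        if len(fields) != 3 * len(disks):
            continue  # blank or malformed row
        for i, disk in enumerate(disks):
            samples[disk].append(fields[3 * i + 2])  # third value is util
    return {disk: mean(vals) for disk, vals in samples.items() if vals}

# First three rows of the table above, copied as-is.
sample = """
          sd1 sd3 sd4 nfs1
 rps wps util rps wps util rps wps util rps wps util
   1 5 4.3 62 3 81.6 1 6 4.1 0 0 0.0
   0 5 3.5 63 3 82.1 2 10 7.3 0 0 0.0
   1 3 2.0 62 2 81.7 2 4 3.6 0 0 0.0
"""

averages = mean_util(sample, ["sd1", "sd3", "sd4", "nfs1"])
busiest = max(averages, key=averages.get)
print(busiest, round(averages[busiest], 1))
```

Splitting on whitespace means the same function works whether or not
the columns are aligned.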

This is what I found in cache.log:

tail -f cache.log
   (a sample)
1999/05/05 14:28:12| WARNING: newer swaplog entry for fileno 0000052E
1999/05/05 14:29:56| sslReadClient: FD 100: read failure: (131) Connection reset by peer
1999/05/05 14:31:43| WARNING: newer swaplog entry for fileno 01000E5D
1999/05/05 14:56:27| comm_accept: FD 54: (130) Software caused connection abort
1999/05/05 14:56:27| httpAccept: FD 54: accept failure: (130) Software caused connection abort
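
A rough sketch of how the error lines in a sample like this could be
tallied by error number. It assumes the "date time| message" layout
shown above and that a system error appears as "(number) text" at the
end of the message. The names LINE_RE, ERRNO_RE and summarise are
hypothetical, and the sample lines are the ones quoted above.

```python
import re
from collections import Counter

# Matches "1999/05/05 14:56:27| <message>"
LINE_RE = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\| (?P<msg>.*)$")
# Matches a trailing "(130) Software caused connection abort"
ERRNO_RE = re.compile(r"\((?P<errno>\d+)\) (?P<text>.+)$")

def summarise(lines):
    """Count log lines per (error number, error text) pair."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a timestamped log line
        e = ERRNO_RE.search(m.group("msg"))
        if e:
            counts[(int(e.group("errno")), e.group("text"))] += 1
    return counts

sample = [
    "1999/05/05 14:28:12| WARNING: newer swaplog entry for fileno 0000052E",
    "1999/05/05 14:29:56| sslReadClient: FD 100: read failure: (131) Connection reset by peer",
    "1999/05/05 14:31:43| WARNING: newer swaplog entry for fileno 01000E5D",
    "1999/05/05 14:56:27| comm_accept: FD 54: (130) Software caused connection abort",
    "1999/05/05 14:56:27| httpAccept: FD 54: accept failure: (130) Software caused connection abort",
]

errors = summarise(sample)
for (number, text), count in sorted(errors.items()):
    print(number, text, count)
```

The WARNING lines carry no error number, so they are left out of the
tally.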

After the last two messages (the comm_accept and httpAccept failures at
14:56:27) the disk I/O on sd3 shot up:

     sd1             sd3             sd4             nfs1
 rps wps  util   rps wps  util   rps wps  util   rps wps  util
   7  17  17.8     0   4   5.7     4   2   3.8     0   0   0.0
   4  19  15.2     4   2   7.4     4   2   3.9     0   0   0.0
   4  22  18.1     6   3  13.3     3   1   3.1     0   0   0.0
   4  23  19.2    26   5  38.9     2   1   2.4     0   0   0.0
   3  20  17.1    37   6  50.5     1   1   1.9     0   0   0.0
   4  19  16.7    43   7  59.8     3   5   5.2     0   0   0.0
   2   9   7.6    52  11  77.8     2   6   5.5     0   0   0.0

CPU idle time rose to an average of about 90% because
processes were waiting on I/O.

Could someone explain:

A: 1999/05/05 14:29:56| sslReadClient: FD 100: read failure: (131) Connection reset by peer

B: 1999/05/05 14:56:27| comm_accept: FD 54: (130) Software caused connection abort
   1999/05/05 14:56:27| httpAccept: FD 54: accept failure: (130) Software caused connection abort

C: What's happening here?

-- 
With kind regards,
Marcel Janssen
Oce Technologies
Received on Wed May 05 1999 - 08:06:18 MDT
