Re: Problems with Solaris 2.6 and Squid-1.1.20 !

From: Tomi Orava <Tomi.Orava@dont-contact.us>
Date: Fri, 13 Mar 1998 09:21:11 +0200 (EET)

>
> On Thu, 12 Mar 1998, Tomi Orava wrote:
>
> > Notice the big difference between the selection loop average times as
> > well as in CPU usage values. Vmstat command verifies that it really
> > is the case that Squid is using 60-90 % of CPU from time to time, which is
> > abnormal behaviour.
>
> Indeed. Have you truss'ed the squid process to find out what it is doing?

Yes I have. Here is a sample truss output, in case you could figure something out
from it. To me it looks like a really heavy polling compared to a busy production
machine with lots of requests ...

[ sample 1]

fstat64(23, 0xEFFFF030) = 0
getsockopt(23, 65535, 8192, 0xEFFFF134, 0xEFFFF12C) = 0
fcntl(23, F_SETFD, 0x00000003) = 0
fcntl(23, F_GETFL, 0x00000000) = 2
fstat64(23, 0xEFFFF030) = 0
getsockopt(23, 65535, 8192, 0xEFFFF134, 0xEFFFF12C) = 0
fstat64(23, 0xEFFFF030) = 0
getsockopt(23, 65535, 8192, 0xEFFFF134, 0xEFFFF130) = 0
setsockopt(23, 65535, 8192, 0xEFFFF134, 4) = 0
fcntl(23, F_SETFL, 0x00000006) = 0
setsockopt(23, 6, 1, 0xEFFFF194, 4) = 0
connect(23, 0x0041A990, 16) Err#150 EINPROGRESS
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
connect(23, 0x0041A990, 16) Err#133 EISCONN
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
write(23, " G E T / ~ d e s / h t".., 497) = 497
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
read(23, " H T T P / 1 . 0 2 0 0".., 65535) = 207
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
write(20, " H T T P / 1 . 0 2 0 0".., 207) = 207
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
read(23, "FFD8FFE0\010 J F I F\001".., 65535) = 1460
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
write(20, "FFD8FFE0\010 J F I F\001".., 1460) = 1460
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0
read(23, "A8E0BBD2 ^17F8 SFF\0\t\t".., 65535) = 1460
poll(0xEFFFDA60, 2, 0) = 0

[ sample 2 ]

write(20, " H T T P / 1 . 0 2 0 0".., 8192) = 8192
poll(0xEFFFDA60, 2, 0) = 0
open64("/var/spool/squid/01/02", O_RDONLY|O_NDELAY) = 23
fcntl(23, F_SETFD, 0x00000001) = 0
fstat64(23, 0xEFFFFB88) = 0
getdents64(23, 0x002E8A50, 1048) = 80
getdents64(23, 0x002E8A50, 1048) = 0
close(23) = 0
poll(0xEFFFDBE0, 6, 0) = 1
poll(0xEFFFDA60, 2, 0) = 0
write(20, " hDB 7 uDBE5FF\0C2 1E5F9".., 5629) = 5629
close(20) = 0
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 7, 1000) = 2
poll(0xEFFFDA60, 2, 0) = 0
write(18, " 8 8 9 7 7 1 4 8 7 . 0 4".., 142) = 142
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDA60, 2, 0) = 0
write(19, " 0 0 0 0 0 1 4 4 3 5 0".., 104) = 104
poll(0xEFFFDA60, 2, 0) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 0) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
poll(0xEFFFDBE0, 5, 1000) = 0
open64("/var/spool/squid/02/02", O_RDONLY|O_NDELAY) = 20
fcntl(20, F_SETFD, 0x00000001) = 0
fstat64(20, 0xEFFFFB88) = 0
getdents64(20, 0x002E8A50, 1048) = 80
getdents64(20, 0x002E8A50, 1048) = 0
close(20) = 0
poll(0xEFFFDBE0, 5, 0) = 0
poll(0xEFFFDBE0, 5, 0) = 0
poll(0xEFFFDBE0, 6, 1000) = 1
poll(0xEFFFDA60, 2, 0) = 0

Another very interesting warning message I got by using command `squid -k shutdown`.

1998/03/13 09:10:03| Preparing for shutdown after 3989 connections
1998/03/13 09:10:03| Waiting 1 seconds for active connections to finish
1998/03/13 09:10:17| FD 15 Closing HTTP connection
1998/03/13 09:10:17| FD 16 Closing ICP connection
1998/03/13 09:10:28| comm_select: select failure: (9) Bad file number
1998/03/13 09:10:28| examine_select: Examining open file descriptors...
1998/03/13 09:10:28| FD 4: (9) Bad file number
1998/03/13 09:10:28| WARNING: FD 4 has handlers, but it's invalid.
1998/03/13 09:10:28| FD 4 is a Socket called ''
1998/03/13 09:10:28| lifetm:0 tmout:0 read:1e290 write:1c528
1998/03/13 09:10:28| Select loop Error. Retry 1
1998/03/13 09:10:28| Shutting down...
1998/03/13 09:10:28| storeWriteCleanLog: Starting...
1998/03/13 09:10:29| Finished. Wrote 353 lines.
1998/03/13 09:10:29| Took 1 seconds ( 353.0 lines/sec).
CPU Usage: user 7550 sys 39
Maximum Resident Size: 0 KB
Page faults with physical i/o: 4559

>
> I haven't used the retry patch but I guess it didn't introduce any
> obscurity which may cause this. So it's supposed to be a Solaris problem.
> Have you also compiled squid on the 2.6 machine ? Did you make use of
> LD_LIBRARY_PATH ? Does netstat show any errors ?
> Just guessing into the blue ...

Yes, I have compiled the particular Squid in the very same machine. LD_LIBRARY_PATH
has not been set at the moment, but I'll set it and compile & run again. Netstat did
not show anything strange going on compared once again to the working setup.

//Tomi Orava

Tomi Orava | The views and opinions expressed in this article
Oy LM Ericsson Ab | are those of the poster and not his employer.
LMF/ZT/Unix Administration
phone: +358 9 2992604
Received on Fri Mar 13 1998 - 02:45:42 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:39:21 MST