On 23/06/2008, Adrian Chadd <adrian_at_creative.net.au> wrote:
> On Mon, Jun 23, 2008, Bradley Kite wrote:
>  > Hi all,
>  >
>  > I have a problem that I am currently trying to diagnose. Please
>  > correct me if you think this should be on the -users list instead, but
>  > I think its a bug and I need further help trying to diagnose it.
>
>
> Ruh roh!
>
>
>  > Dropping the squid server out of service on the load balancer to stop
>  > actual traffic, and then running "squid -k debug" produces the
>  > following messages for lots of different FDs (I presume its for all
>  > FD's that squid thinks are active):
>
>
> Ok:
>
>
>  > 2008/06/23 10:57:55| comm_call_handlers(): got fd=11314 read_event=19
>  > write_event=18 F->read_handler=0x8070890 F->write_handler=(nil)
>
>
> Ok. hm.
>
>
>  > 2008/06/23 10:57:55| comm_call_handlers(): Calling read handler on fd=11314
>  > 2008/06/23 10:57:55| clientReadRequest: FD 11314: reading request...
>  > 2008/06/23 10:57:55| clientReadRequest: FD 11314 closed?
>
>
> So thats from:
>
>         debug(33, 5) ("clientReadRequest: FD %d closed?\n", fd);
>         F->flags.socket_eof = 1;
>         conn->defer.until = squid_curtime + 1;
>         conn->defer.n++;
>         fd_note_static(fd, "half-closed");
>
>
>  > 2008/06/23 10:57:55| cbdataLock: 0x655b16c0
>  > 2008/06/23 10:57:55| cbdataValid: 0x655b16c0
>  > 2008/06/23 10:57:55| cbdataUnlock: 0x655b16c0
>  > 2008/06/23 10:57:55| commSetSelect: FD 11314 type 1
>  > 2008/06/23 10:57:55| commSetEvents(fd=11314)
>
>
> I'd modify that debug to include the conn->defer.n value, so you can see what the
>  heck is going on. The defer count should grow to a fixed amount before the socket
>  is properly closed.
>
>  Try setting half_closed_clients to off?
>
>  Ah!
>
>
>  > I could set "half_closed_clients off", however, even at the start of
>  > the decline in file descriptors (ie when there are still file
>  > descriptors available) there are problems browsing certain websites,
>  > so I think this will just mask a symptom of the problem rather than
>  > fix it.
>
>
> Like what?
>
>
>  > A simple restart of squid fixes the issue, but only for a while. Our
>  > support guys are having to restart squid on various devices about 5-10
>  > times a day at the moment in order to try minimise impact to our
>  > customers.
>  >
>  > We are running squid-2.6-STABLE18 on Linux (Centos 5), using the
>  > epoll() event notification interface.
>  >
>  > Any help will be much appreciated as I've run out of things to check.
>  > Please let me know if you need any further information.
>
>
> its hard to tell. The sockets are returning EOF (0 byte reads) and should
>  be closed. The question is why thats happening..
>
Hi Adrian,
I have some debug for you now:
This server just started showing the symptoms described earlier. When
dropped into debug mode, the defer count is mostly around 6 or 7
thousand - each being 1 second then its relative to the time of the
probem (or the time of the initial request since the problem).
2008/06/24 15:20:48| clientReadRequest: FD 1944 closed? Defer Count: 5974
2008/06/24 15:20:48| clientReadRequest: FD 1942 closed? Defer Count: 6302
2008/06/24 15:20:48| clientReadRequest: FD 1941 closed? Defer Count: 6179
2008/06/24 15:20:48| clientReadRequest: FD 1940 closed? Defer Count: 7005
On this particular occasion the problem has been caught early, so not
that many FD's are in this state.
squidclient mgr:info is below, if that provides any additional clues:
Squid Object Cache: Version 2.6.STABLE18
Start Time:     Tue, 24 Jun 2008 03:02:55 GMT
Current Time:   Tue, 24 Jun 2008 14:30:21 GMT
Connection information for squid:
        Number of clients accessing cache:      5081
        Number of HTTP requests received:       5475693
        Number of ICP messages received:        0
        Number of ICP messages sent:    0
        Number of queued ICP replies:   0
        Request failure ratio:   0.00
        Average HTTP requests per minute since start:   7965.5
        Average ICP messages per minute since start:    0.0
        Select loop called: 81904070 times, 0.504 ms avg
Cache information for squid:
        Request Hit Ratios:     5min: 0.0%, 60min: 0.0%
        Byte Hit Ratios:        5min: 0.7%, 60min: 3.1%
        Request Memory Hit Ratios:      5min: 0.0%, 60min: 0.0%
        Request Disk Hit Ratios:        5min: 0.0%, 60min: 0.0%
        Storage Swap size:      191721208 KB
        Storage Mem size:       916892 KB
        Mean Object Size:       23.52 KB
        Requests given to unlinkd:      0
Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.06286  0.06640
        Cache Misses:          0.06286  0.06640
        Cache Hits:            0.00000  0.00000
        Near Hits:             0.00000  0.00000
        Not-Modified Replies:  0.00000  0.00000
        DNS Lookups:           0.00094  0.00190
        ICP Queries:           0.00000  0.00000
Resource usage for squid:
        UP Time:        41245.752 seconds
        CPU Time:       5015.237 seconds
        CPU Usage:      12.16%
        CPU Usage, 5 minute avg:        1.06%
        CPU Usage, 60 minute avg:       1.32%
        Process Data Segment Size via sbrk(): 2095744 KB
        Maximum Resident Size: 0 KB
        Page faults with physical i/o: 378705
Memory usage for squid via mallinfo():
        Total space in arena:  2095876 KB
        Ordinary blocks:       2066961 KB 329163 blks
        Small blocks:               0 KB      0 blks
        Holding blocks:         60984 KB      6 blks
        Free Small blocks:          0 KB
        Free Ordinary blocks:   28914 KB
        Total in use:          -2066359 KB 101%
        Total free:             28914 KB 0%
        Total size:            -2037444 KB
Memory accounted for:
        Total accounted:       1677823 KB
        memPoolAlloc calls: 768370423
        memPoolFree calls: 747816806
File descriptor usage for squid:
        Maximum number of file descriptors:   16384
        Largest file desc currently in use:   2499
        Number of file desc currently in use: 1418
        Files queued for open:                   0
        Available number of file descriptors: 14966
        Reserved number of file descriptors:   100
        Store Disk files open:                   0
        IO loop method:                     epoll
Internal Data Structures:
        8182835 StoreEntries
        181239 StoreEntries with MemObjects
        181238 Hot Object Cache Items
        8151855 on-disk objects
In particular, squid thinks it has 1418 file descriptors in use.
netstat shows only a few hundred:
[root_at_elc-url-sqd-13 squid]# netstat -an | grep 3128 | wc -l
287
I am now going to set "half_closed_connections off" and see what debug
can be obtained from that.
In the mean time if you have any other suggestions please let me know.
-- Brad.Received on Tue Jun 24 2008 - 15:08:50 MDT
This archive was generated by hypermail 2.2.0 : Wed Jun 25 2008 - 12:00:08 MDT