Re: [Fwd: [Fwd: [squid-users] Update: #redirectors decays overtime]] from Mike Kiernan on 2001-08-22 (squid-users)

From: Mike Kiernan <mkiernan@dont-contact.us>
Date: Wed, 22 Aug 2001 10:52:12 +0200

Robert,

Many thanks for the response. You are _the_ man!
I made your suggested edit to redirect.c:

/* *** MIKE ****
redirectors->ipc_type = IPC_TCP_SOCKET;
*** MIKE **** */
redirectors->ipc_type = IPC_FIFO;

You can see the results for yourself on the attached
graph (the blue line is the server with this fix - as
you can see after 10 o'clock the number of redirectors
is stable at 90 and doesn't decay). I can probably
reduce the number of redirectors back down to a
more sensible number now.

This leaves me with a few questions:

- why is the default to use sockets rather than pipes
   for local IPC? [I come from Solaris, where normally
   you'd switch from sockets to doors on the local box]
- is there a perf difference between the two IPC mechanisms
   on Linux?
- why are redirectors, dnsservers etc. hardcoded so low?
- in these days of continuous uptime etc., why does
   squid decide to exit when the situation is far from
   beyond repair?
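
On the perf question, a rough way to measure it locally might look like
the sketch below. It forks an echo child (standing in for a redirector)
and times request/reply round trips over either a pair of pipes or a
socket pair. Everything in it is an assumption for illustration:
time_round_trips is a made-up name, the URL is a placeholder, and the
socket case uses an AF_UNIX socketpair rather than the loopback TCP
connection that, as far as I understand, IPC_TCP_SOCKET sets up, so the
numbers would only be indicative.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Close two descriptors, which may be the same one (socketpair case). */
static void close_pair(int a, int b)
{
    close(a);
    if (b != a)
        close(b);
}

/* Hypothetical helper, not part of Squid. Forks an echo child and times
 * n request/reply round trips. use_socket == 0 uses two pipes,
 * use_socket != 0 uses an AF_UNIX socketpair (a stand-in for Squid's
 * socket-based IPC). Returns elapsed seconds, or -1.0 on any failure. */
double time_round_trips(int use_socket, int n)
{
    int parent_rd, parent_wr, child_rd, child_wr;

    if (use_socket) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return -1.0;
        parent_rd = parent_wr = sv[0];
        child_rd = child_wr = sv[1];
    } else {
        int down[2], up[2];
        if (pipe(down) < 0)
            return -1.0;
        if (pipe(up) < 0) {
            close_pair(down[0], down[1]);
            return -1.0;
        }
        child_rd = down[0];
        parent_wr = down[1];
        parent_rd = up[0];
        child_wr = up[1];
    }

    pid_t pid = fork();
    if (pid < 0) {
        close_pair(parent_rd, parent_wr);
        close_pair(child_rd, child_wr);
        return -1.0;
    }
    if (pid == 0) {
        /* Child: echo whatever arrives until the parent closes its end. */
        char buf[256];
        ssize_t len;
        close_pair(parent_rd, parent_wr);
        while ((len = read(child_rd, buf, sizeof buf)) > 0)
            if (write(child_wr, buf, (size_t)len) != len)
                _exit(1);
        _exit(0);
    }

    /* Parent: send one line, wait for the full echo, repeat n times. */
    close_pair(child_rd, child_wr);
    const char *msg = "http://example.invalid/image.gif\n"; /* placeholder */
    size_t msg_len = strlen(msg);
    char reply[256];
    struct timeval t0, t1;
    double elapsed = -1.0;
    int i;

    gettimeofday(&t0, NULL);
    for (i = 0; i < n; i++) {
        size_t got = 0;
        if (write(parent_wr, msg, msg_len) != (ssize_t)msg_len)
            break;
        while (got < msg_len) {
            ssize_t r = read(parent_rd, reply + got, sizeof reply - got);
            if (r <= 0)
                break;
            got += (size_t)r;
        }
        if (got != msg_len || memcmp(reply, msg, msg_len) != 0)
            break;
    }
    gettimeofday(&t1, NULL);
    if (i == n)
        elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;

    close_pair(parent_rd, parent_wr); /* child sees EOF and exits */
    waitpid(pid, NULL, 0);
    return elapsed;
}
```

Comparing time_round_trips(0, 10000) against time_round_trips(1, 10000)
on the box in question would give a first impression; a return of -1.0
means something failed along the way.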

I don't mind patching source when need be, but in the long
term (i.e. upgrading) it's all very time-consuming going
through the patch/test cycle. I'd rather put effort into
ensuring changes that make sense for everyone go into
the base source release. Is there a facility for logging
RFEs for squid?

thanks again for your swift response...

Mike Kiernan

> On 22 Aug 2001 09:14:49 +0200, Mike Kiernan wrote:
> > Still no feedback on this squirm problem - I'd
> > really appreciate any ideas on how to take this further...
>
> I'm probably not the right person to answer :-0 given that I'm not as
> psychic as some folk are for finding/solving linux problems :].
>
> > Attached is a graph of ps -C squirm | wc -l
> > stats in rrdtool which shows the number of
> > squirms decreasing, squid crashes, and squirms
> > increase to 90, and start decaying again etc.
> > (data from 3 squid servers on the same graph).
> > I don't believe increasing the number of redirectors
> > is going to help, and isn't very efficient since it's
> > only peak burst traffic that breaks it...(it's already
> > up to 90, dnsservers up to 30)
> >
> > The entry on crash in the log is:
> >
> >
> > 2001/08/21 22:49:09| helperHandleRead: FD 103 read: (11) Resource
> > temporarily unavailable
> > 2001/08/21 22:49:09| WARNING: redirector #47 (FD 103) exited
> > 2001/08/21 22:49:09| storeDirWriteCleanLogs: Starting...
> > 2001/08/21 22:49:09| WARNING: Closing open FD 6
> > 2001/08/21 22:49:09| Finished. Wrote 1102 entries.
> > 2001/08/21 22:49:09| Took 0.0 seconds (103193.2 entries/sec).
> > FATAL: Too few redirector processes are running
> > Squid Cache (Version 2.4.STABLE1): Terminated abnormally.
> > CPU Usage: 24699.830 seconds = 9356.610 user + 15343.220 sys
> > Maximum Resident Size: 0 KB
> > Page faults with physical i/o: 11333
> > Memory usage for squid via mallinfo():
> > total space in arena: 104958 KB
> > Ordinary blocks: 101531 KB 125120 blks
> > Small blocks: 0 KB 0 blks
> > Holding blocks: 3000 KB 6 blks
> > Free Small blocks: 0 KB
> > Free Ordinary blocks: 3426 KB
> > Total in use: 104531 KB 100%
> > Total free: 3426 KB 3%
> > 2001/08/21 22:49:13| Starting Squid Cache version 2.4.STABLE1 for
> > i686-pc-linux-gnu...
> > 2001/08/21 22:49:13| Process ID 7299
> > 2001/08/21 22:49:13| With 8192 file descriptors available
> > 2001/08/21 22:49:13| helperOpenServers: Starting 30 'dnsserver'
> > processes
> > 2001/08/21 22:49:13| helperOpenServers: Starting 90 'squirm' processes
> > 2001/08/21 22:49:14| Unlinkd pipe opened on FD 129
> >
> >
> > strace of one squirm child at the time shows:
> >
> > read(0, "http://republika.pl/kastom/image"..., 4096) = 71
> > write(1, "http://republika.pl/kastom/image"..., 71) = 71
> > open("/var/log/squirm/match", O_WRONLY|O_APPEND|O_CREAT, 0666) = -1
> > ENOENT (No such file or directory)
> > read(0, 0x40014000, 4096) = -1 ECONNRESET (Connection
>
> squid won't be deliberately closing the helper connection. Something
> therefore is causing the helper to fail on its reset, or something is
> interfering with the ipc link.
>
> You might try changing the IPC_TCP_SOCKET in redirect.c to IPC_FIFO.
> (Unless you use OSF1 and Poll().)
>
> Other than that, I can whip up a restart-dead-helpers patch for you,
> based on some other work I've done, but it won't solve the core issue
> that when your machine is busy it's dropping links.
>
> Rob
>
> > reset by peer)
> > munmap(0x40015000, 4096) = 0
> > _exit(0) = ?
> >
> >
> > thanks,
> > Mike
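
The restart-dead-helpers idea Rob mentions could look roughly like the
sketch below. This is not his patch and not Squid code: spawn_helper and
restart_dead_helpers are hypothetical names, and the sketch only reaps
exited children and starts replacements. A real version would also have
to rebuild the IPC channel to each new helper and requeue any request
that was in flight.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: start one helper process running argv[0] with argv.
 * Returns the child's pid, or -1 if fork() failed. */
pid_t spawn_helper(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    return pid;
}

/* Hypothetical: poll every slot in pids[] without blocking. Any helper
 * that has exited is reaped and replaced in the same slot.
 * Returns the number of helpers successfully restarted. */
int restart_dead_helpers(pid_t *pids, int n, char *const argv[])
{
    int restarted = 0;
    int i;

    for (i = 0; i < n; i++) {
        int status;
        if (pids[i] <= 0)
            continue; /* empty slot */
        /* WNOHANG: returns 0 at once if the helper is still running. */
        if (waitpid(pids[i], &status, WNOHANG) == pids[i]) {
            pids[i] = spawn_helper(argv);
            if (pids[i] > 0)
                restarted++;
        }
    }
    return restarted;
}
```

Called periodically from the main loop, something like this would keep
the helper count topped up instead of letting it decay until squid
gives up, though as Rob says it would not explain why the links drop.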

--
Onet.pl S.A.
http://www.onet.pl/
Krakow, Poland

Received on Wed Aug 22 2001 - 02:55:17 MDT
