Squid hangs weekly on SUN Ultra

From: Andreas Strotmann <Strotmann@dont-contact.us>
Date: Mon, 30 Mar 1998 08:47:49 +0100

Hi,

we've been wrestling with this problem for months now, and it's not
getting better. Can anybody help?

We're running Squid 1.20 on a 2-processor (200 MHz) Sun UltraSparc running
Solaris 2.5.1; the problem I'm reporting was first observed with Squid
1.15 and later with 1.18. Squid has been configured to use 4096 file
descriptors (FDs), and has been observed using at least 2100 of them.

About once a week, apparently during a peak-load period, Squid
literally freezes. It can't be killed; only rebooting the machine works.
Analyzing the system core dump has revealed that processes on that machine
are stuck in an I/O wait state. If the "crash" goes undetected for long
enough, even logging in becomes impossible.

The machine is a dedicated proxy server. It has 16 GB of disk space on
4 disks, 700+ MB of main memory, new CPUs, and an ATM network connection.
It serves the equivalent of a 1.5-2 Mbit/s connection during peak loads.

According to a NetCache technical report I recently read, that appears to
be more than a machine with this configuration running Squid ought to be
able to deliver. However, when we first started using Squid, performance
would slow down and fade gradually as the machine neared its limit. The
current kind of sudden, total breakdown, which doesn't even cause an
automatic reboot, is a most frustrating situation. It is made worse by the
fact that we're a university that cannot afford to have its computer
center offer a 24-hour, 365-day service. Proxy-Autoconfig doesn't work in
this situation, because the browser doesn't recognize that the proxy is a
living-dead zombie...
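
The failure mode described above, where the proxy may still accept
connections but never answers, could in principle be detected by an
external probe with a hard timeout. The following is only a minimal Python
sketch under assumptions that are not part of the original report: the
function name proxy_is_responsive, the 10-second default timeout, and the
placeholder probe URL are all hypothetical, and it presumes a second
machine is available to run the check.

```python
import socket


def proxy_is_responsive(host, port, timeout=10.0,
                        probe_url="http://example.invalid/"):
    """Return True if the proxy sends back at least one byte in time.

    All names and defaults here are illustrative assumptions.
    """
    request = f"HEAD {probe_url} HTTP/1.0\r\n\r\n".encode("ascii")
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(request)
            # A hung proxy may still complete the TCP handshake (the kernel
            # does that on its behalf) yet never write a reply, so the
            # decisive check is whether any response bytes arrive.
            return len(sock.recv(1)) > 0
    except OSError:
        # Covers refused connections, resets, and timeouts
        # (socket.timeout is a subclass of OSError).
        return False
```

Any reply at all, even an error page, counts as "alive" here; the sketch
only distinguishes a proxy that answers from one that stays silent.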

We've tried the following:

 - Sun sent new CPUs
 - the disks have been replaced
 - all (minor) other services have been removed from the machine

to no avail. I'm writing this because the proxy server has just spent
another weekend in this state (Friday, just after all the staff have left,
is a typical time for this to happen :-( ).

I know that Squid is running at its very limits on our machine;
unfortunately, that means that it can only service 25% or so of our total
WAN traffic. Is this the kind of behaviour to expect when a machine
running Squid reaches its limits?

Any help is appreciated!!

Thanks,

Andreas

-- 
Andreas Strotmann       / ~~~~~~ \________________A.Strotmann@Uni-Koeln.DE
Universitaet zu Koeln  /| University of Cologne   \
Regionales Rechenzentrum| Regional Computer Center \
Robert-Koch-Str. 10    /|    Tel: +49-221-478-5524 |\   Home: -221-4200663
D-50931  Koeln        __|__  FAX: +49-221-478-5590 |__________~~~~~~~~~~~~   
Received on Sun Mar 29 1998 - 23:50:15 MST
