[squid-users] Hunting SegFaults

From: Brian <hiryuu@dont-contact.us>
Date: Tue, 14 May 2002 00:49:57 -0400

As part of our Squid reverse-proxy setup, we have this zipcache, which dies
about once a month:
2002/05/13 15:06:42| storeAufsOpenDone: (2) No such file or directory
2002/05/13 15:06:42| /var/squid3/00/36/000036D7
FATAL: Received Segment Violation...dying.
2002/05/13 15:07:14| storeDirWriteCleanLogs: Starting...
2002/05/13 15:07:14| WARNING: Closing open FD 12
2002/05/13 15:07:14| WARNING: Closing open FD 13
2002/05/13 15:07:15| 65536 entries written so far.
2002/05/13 15:07:15| 131072 entries written so far.
2002/05/13 15:07:15| 196608 entries written so far.
2002/05/13 15:07:16| 262144 entries written so far.
2002/05/13 17:20:18| Starting Squid Cache version 2.4.STABLE3 for i686-pc-linux-gnu...

This is Debian Linux 3.0-testing and squid uses 3 aufs cache_dirs.
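
For reference, the cache_dir lines look roughly like this -- /var/squid3
matches the path in the log above, but the sizes and L1/L2 values here are
just placeholders:

    # cache_dir <type> <path> <Mbytes> <L1> <L2>
    cache_dir aufs /var/squid1 8192 16 256
    cache_dir aufs /var/squid2 8192 16 256
    cache_dir aufs /var/squid3 8192 16 256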

There are two basic problems here:
1. It died.
2. Somewhere after 262144 entries, the clean-log rewrite stalled, and it
took a kill -9 to make it exit.

The front-line servers also die occasionally, but they come back. They're
smaller (70,000 objects), so perhaps the item count plays a role in
problem 2?

This squid also has:

    quick_abort_min -1 KB
    range_offset_limit -1 bytes

while the front-line servers don't. This means this box often keeps fetching
cache misses at full LAN speed, even after the client has given up. At one
point, I also had READ_AHEAD_GAP set to 2MB, but that made it downright
unstable (crashing about once a week).
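
(As far as I can tell, READ_AHEAD_GAP is a compile-time constant in the 2.4
source rather than a squid.conf directive, so the 2MB experiment meant
rebuilding with something along these lines -- quoting the define from
memory:)

    /* somewhere in the squid source, e.g. src/defines.h;  */
    /* the stock value is much smaller than the 2MB below: */
    #define READ_AHEAD_GAP (1<<21)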

How do I find out what's going on here? Since squid traps the segfault and
keeps running long enough to rewrite the clean logs, would a core dump even
work? If so, how do you create and use a core dump from a multi-threaded
process?
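
Is it just a matter of something like the following, or does squid's own
signal handler get in the way? (Guessing at the incantation; adjust paths
for wherever your binary and core file land.)

    # allow core files, then start squid with -C so it does
    # NOT catch fatal signals (otherwise no core gets written);
    # -N keeps it in the foreground for the test
    ulimit -c unlimited
    squid -C -N

    # once it dumps, get a backtrace from every thread:
    gdb /usr/sbin/squid core
    (gdb) thread apply all bt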

        -- Brian
