RE: [squid-users] SQUID2.6STABLE12 - Assertion failure in hash.c related to DNS cache.

From: Pulu Anau <panau@dont-contact.us>
Date: Fri, 20 Apr 2007 12:31:24 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Just to put an update in here in case anyone else has this problem in the future and wants to search it...

It wasn't what I said below about it being updated right when it hits.. the code actually expires it and then instantly recreates it (if the record in the ipcache table is expired, which all of these were).

Ran configure with --disable-internal-dns and --disable-snmp (this also caused crashes in hash.c even though I wasn't using it) and squid is now running like a champ. Thank goodness that old dnsserver stuff was still in the codebase.

Again this was completely standard Redhat 4 on both 32 and 64bit (two entirely different servers). Guess I'll wait for the real fix when it starts happening to more people...

Thanks,

Pulu


- -----Original Message-----
From: Pulu Anau [mailto:panau@2wire.com]
Sent: Thursday, April 19, 2007 2:11 PM
To: squid-users@squid-cache.org
Subject: RE: [squid-users] SQUID2.6STABLE12 - Assertion failure in hash.c related to DNS cache.

* PGP Signed: 04/19/07 at 14:10:31

I've taken all the changes to the source I made out, and got a stack trace from the original error (see below). In addition I turned all the related debugging from squid (not much, but also below).

It appears that the problem occurs when a domain is released from the ipcache at the moment it's getting queried.

Other related settings: I have 3 search domains that get checked, and two nameservers. Ipcache_size, _low, and _high are at the default settings.

If anyone has any idea I would love to hear it, I really don't want to go back to 2.5. Are there other debug sections I should enable? It doesn't appear that hash.c calls the debug routines at all.

Oh yeah, one other thing that I don't think is (directly?) related, when I look at the backtrace, idnsALookup is called with a callback value that it appears from the code it passes directly to idnsCachedLookup, but on all of these I've seen gdb shows the value of the callback as 0.

Thanks,

Pulu


Backtrace:
#0 0x008597a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0015f7a5 in raise () from /lib/tls/libc.so.6
#2 0x00161209 in abort () from /lib/tls/libc.so.6
#3 0x00158d91 in __assert_fail () from /lib/tls/libc.so.6
#4 0x080eedbc in hash_lookup (hid=0x8e5e080, k=0xadb16a0) at hash.c:194
#5 0x0807e470 in idnsCachedLookup (key=0xadb16a0 "login.yahoo.com", callback=0, data=0xa93e3f8) at dns_internal.c:1015
#6 0x0807e528 in idnsALookup (name=0xadb16a0 "login.yahoo.com", callback=0x80a0e0a <ipcacheHandleReply>, data=0xa93e3f8) at dns_internal.c:1041
#7 0x080a10a6 in ipcache_nbgethostbyname (name=0x9235f58 "login.yahoo.com", handler=0x80761f6 <commConnectDnsHandle>, handlerData=0x9358b58) at ipcache.c:456
#8 0x080761ee in commConnectStart (fd=521, host=0x9235f58 "login.yahoo.com", port=80, callback=0, data=0x935c4d8) at comm.c:283
#9 0x08084cd2 in fwdConnectStart (data=0x935c4d8) at forward.c:621
#10 0x08084d3b in fwdStartComplete (servers=0x938eed0, data=0x935c4d8) at forward.c:631
#11 0x080ae3a4 in peerSelectCallback (psstate=0x926f7c0) at peer_select.c:202
#12 0x080ae810 in peerSelectFoo (ps=0x926f7c0) at peer_select.c:310
#13 0x080ae104 in peerSelect (request=0x9235ed0, entry=0xd650d88, callback=0x8084cda <fwdStartComplete>, callback_data=0x935c4d8) at peer_select.c:157
#14 0x08085593 in fwdStart (fd=396, e=0xd650d88, r=0x9235ed0) at forward.c:881
#15 0x08071da4 in clientProcessMiss (http=0xd0fb890) at client_side.c:3459
#16 0x080719dd in clientProcessRequest (http=0xd0fb890) at client_side.c:3375
#17 0x0806b3e9 in clientCheckNoCacheDone (answer=0, data=0x1b2a) at client_side.c:667
#18 0x08055093 in aclCheckCallback (checklist=0x922f060, answer=ACCESS_DENIED) at acl.c:2273
#19 0x08054ef4 in aclCheck (checklist=0x922f060) at acl.c:2230
#20 0x08055390 in aclNBCheck (checklist=0x922f060, callback=0, callback_data=0x6) at acl.c:2408
#21 0x0806b38a in clientCheckNoCache (http=0xd0fb890) at client_side.c:655
#22 0x0806a778 in clientAccessCheck2 (data=0xd0fb890) at client_side.c:369
#23 0x0806b32d in clientRedirectDone (data=0xd0fb890, result=0x0) at client_side.c:647
#24 0x080afcca in redirectHandleReply (data=0x93593e8, reply=0x0) at redirect.c:70
#25 0x08090d3b in helperHandleRead (fd=7, data=0x8e6e698) at helper.c:769
#26 0x080792cc in do_comm_select (msec=10) at comm_generic.c:264
#27 0x08078cc1 in comm_select (msec=10) at comm_generic.c:386
#28 0x080a3b91 in main (argc=2, argv=0xbfeecfa4) at main.c:838
#29 0x0014cde3 in __libc_start_main () from /lib/tls/libc.so.6
#30 0x0804ebe9 in _start ()


Last three lines from cache.log (crashed immediately after this)
2007/04/19 13:26:28| ipcache_nbgethostbyname: Name 'login.yahoo.com'.
2007/04/19 13:26:28| ipcacheRelease: Releasing entry for 'login.yahoo.com'
2007/04/19 13:26:28| ipcache_nbgethostbyname: MISS for 'login.yahoo.com'

My GCC (should it matter):
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)


- -----Original Message-----
From: Pulu Anau [mailto:panau@2wire.com]
Sent: Wednesday, April 18, 2007 5:35 PM
To: squid-users@squid-cache.org
Subject: [squid-users] SQUID2.6STABLE12 - Assertion failure in hash.c related to DNS cache.

> Old Signed: 04/18/07 at 17:35:07

Hi,

Note: http://marc.info/?l=squid-users&m=115342544210559&w=2

I am having the exact same problem and in the same place as the message mentioned above. Unfortunately I attempted to turn the crash into a memory leak by removing the assertion that was causing the crash (assert(walker != walker->next); in lib/hash.c) and ended up with debug data that doesn’t match my cores. But, the crash just moved to another place in hash.c:

(gdb) bt
#0 0x008597a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x008997a5 in raise () from /lib/tls/libc.so.6
#2 0x0089b209 in abort () from /lib/tls/libc.so.6
#3 0x00892d91 in __assert_fail () from /lib/tls/libc.so.6
#4 0x080db31c in hash_remove_link (hid=Variable "hid" is not available.
) at hash.c:281
#5 0x0807552d in idnsCallback (q=0xaf9ee38, answers=0x9b12ab8, n=2, error=0x0) at dns_internal.c:650
#6 0x08076339 in idnsGrokReply (buf=Variable "buf" is not available.
) at dns_internal.c:826
#7 0x08076543 in idnsRead (fd=6, data=0x0) at dns_internal.c:883
#8 0x08071ff8 in comm_select (msec=153) at comm_generic.c:264
#9 0x080966c1 in main (argc=2, argv=0xbfe6f8b4) at main.c:838
#10 0x00886de3 in __libc_start_main () from /lib/tls/libc.so.6
#11 0x0804ee41 in _start ()

The assertion that triggers this is (I believe, I’m far from a C programming wizard) from the same thing, searching for a key in this hash that should be there but isn’t.

Until it crashes it runs along fine, the performance is so much better than previous versions (cpu usage on the server went from 85% squid to 2%) that I’d hate to go back. This is built from the source RPM at http://people.redhat.com/stransky/squid on RHES4 both 32bit and 64bit. The patches in the RPM are almost all related to file locations.

Does anyone have any idea how I might better try and track this down? Would making/finding an external DNS resolver help?

Any help would be greatly appreciated.

Pulu


* Pulu Anau <panau@2wire.com>
* 0x4A79A7C4(L)

* Pulu Anau <panau@2wire.com>
* 0x4A79A7C4(L)

-----BEGIN PGP SIGNATURE-----
Version: 9.6.1 (Build 1012)

wsFVAwUBRikVDIWUwm1KeafEAQiZsxAAndqAdU1z2Ortgd6N8igBVfMlSFFESP9A
fEbPGuwHrvEKNtSaY3V4lnNjtjnxZVMhxFD5bhHnYGCml5PT/sH3lX+foqg76UHj
q0MjX6x9RQIUBzNmpyiZnM7eHO7Fc5CJO7AfFdOEVuXgP2eHnlKy0F4FkNT8RApe
dA31n8bG55XMRRwxkK/6vnN4kizkhd5+6UbTIIhcmVPxrwvUA3RWeJZPeCxswsoa
zrRgPq25DGQhrPGx7g6Zys5CrA/lozKBWm2RWJTRE085j3b5fLU4beQG8IKaxZGk
q5+O8IpNBYWV+kwmJ0kb0HHs4O8gHMHGWHT6GZYx/icKE0FzKzfIpXWFCHY77r+a
0WV4sQGqF5NvoJTZn1+JUXUUN52fvukZ/6zsJbwbtBTSbyavilYb74/OPmXMt8hh
jRMwmyskUQFNUTRfHsvlmYJ974nZ5laWuCm8/zYEC+WggF3L64gI/uUBHpAbcTvy
KFh7VjeFQjyDOwmMonwxzG6RVXZaUV8eeStsI8lk8uep+spv6ce4U+ifAPIYoC2W
UPnWqeYwX5Rz+d37XAehv9drToqgLuxIX/Yyn/fRmXXXqtQnrK3X9teEQXcbZyQG
bl84XwCnd1WRZrrVl1ubjk3X/gXsvmqOTdTCfE9fVPniu6DCFaVwOGMpVzW4Qrdc
2B9cnVqeSlk=
=VLnh
-----END PGP SIGNATURE-----

Received on Fri Apr 20 2007 - 13:31:36 MDT

This archive was generated by hypermail pre-2.1.9 : Tue May 01 2007 - 12:00:01 MDT