Re: MemPools rewrite

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Tue, 7 Nov 2000 10:33:13 +0200

 How do you chase a memory corruption bug that occurs once in
 several millions of requests?

 It pops up in my chunked mempools, but I can't find any error
 on my side. It seems that a rare combination of events is triggering
 something that corrupts my freelist handling.

 One thing that could corrupt it is an item being freed twice.
 I added a check to detect that, and now it has run for two days
 without crashing... ;( And I can't find a way to reproduce it.
 I've got just a gut feeling that it has to do with either DNS
 or peer traffic, but nothing to back it up.
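
 For illustration only, a double-free check of the kind mentioned above
 could look roughly like the sketch below. It is not the check from the
 actual MemPool.c patch: the names demo_pool, item_hdr, demo_pool_get,
 demo_pool_put and the two magic values are all hypothetical. It assumes
 each pooled object carries a small header whose marker is flipped on
 every get and put.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical marker values; any two distinct constants would do. */
#define DEMO_MAGIC_USED 0xA110CA7Eu
#define DEMO_MAGIC_FREE 0xDEADBEEFu

/* Header stored in front of every pooled object (an assumption of this sketch). */
typedef struct item_hdr {
    uint32_t magic;           /* DEMO_MAGIC_USED or DEMO_MAGIC_FREE */
    struct item_hdr *next;    /* freelist link, meaningful only while free */
} item_hdr;

typedef struct {
    item_hdr *free_list;      /* singly linked list of free items */
    size_t obj_size;          /* payload size of one object */
    unsigned double_frees;    /* number of rejected frees */
} demo_pool;

/* Take an object from the freelist, or malloc a new one if the list is empty. */
static void *demo_pool_get(demo_pool *pool)
{
    item_hdr *h;

    assert(pool != NULL);
    h = pool->free_list;
    if (h != NULL) {
        pool->free_list = h->next;
    } else {
        h = malloc(sizeof(*h) + pool->obj_size);
        if (h == NULL)
            return NULL;
    }
    h->magic = DEMO_MAGIC_USED;
    h->next = NULL;
    return h + 1;             /* payload starts right after the header */
}

/* Return an object to the pool. Returns 0 on success, or -1 if the header
 * is not marked as in use (a second free, or a pointer the pool never
 * handed out); in that case the freelist is left untouched. */
static int demo_pool_put(demo_pool *pool, void *obj)
{
    item_hdr *h = (item_hdr *)obj - 1;

    assert(pool != NULL);
    if (h->magic != DEMO_MAGIC_USED) {
        pool->double_frees++;
        return -1;
    }
    h->magic = DEMO_MAGIC_FREE;
    h->next = pool->free_list;
    pool->free_list = h;
    return 0;
}
```

 A check like this only catches a second free that arrives while the item
 is still on the freelist. Once the item has been handed out again, the
 stale free looks legitimate, which may be one reason such a bug stays
 hidden for days.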

 The reason I suspect it's not my code is that an older Squid version
 without the mempool hacks is also crashing occasionally, also very
 rarely. But I have no stack traces for it (for some reason it
 is not dumping core).

Program received signal SIGSEGV, Segmentation fault.
0x80dceca in memPoolGet (pool=0x82b7c40) at MemPool.c:204
204 chunk->freeList = free->next;
(gdb) bt
#0 0x80dceca in memPoolGet (pool=0x82b7c40) at MemPool.c:204
#1 0x80dd3ce in memPoolAlloc (pool=0x82b7c40) at MemPool.c:323
#2 0x80a6d2b in memAllocate (type=MEM_CLIENT_SOCK_BUF) at mem.c:129
#3 0x8074e92 in clientProcessExpired (data=0x8981dd0) at client_side.c:344
#4 0x807782b in clientCacheHit (data=0x8981dd0,
    buf=0xa8ae550 "HTTP/1.0 200 OK\r\nContent-length: 3389\r\nContent-type: image/gif\r\nDate: Sat, 04 Nov 2000 01:14:38 GMT\r\nExpires: Sat, 04 Nov 2000 13:14:38 GMT\r\nLast-modified: Thu, 06 Apr 2000 17:20:26 GMT\r\nCache-control"..., size=3687)
    at client_side.c:1371
#5 0x80c73a0 in storeClientReadHeader (data=0x88fffb0,
    buf=0xa8ae550 "HTTP/1.0 200 OK\r\nContent-length: 3389\r\nContent-type: image/gif\r\nDate: Sat, 04 Nov 2000 01:14:38 GMT\r\nExpires: Sat, 04 Nov 2000 13:14:38 GMT\r\nLast-modified: Thu, 06 Apr 2000 17:20:26 GMT\r\nCache-control"..., len=3790)
    at store_client.c:422
#6 0x806819e in storeAufsReadDone (fd=14, my_data=0x88e6728, len=3790, errflag=0) at store_io_asyncufs.c:218
#7 0x806753f in aioCheckCallbacks () at async_io.c:370
#8 0x807eeda in comm_poll (msec=525) at comm_select.c:340
#9 0x80a63ee in main (argc=4, argv=0x8047880) at main.c:708
(gdb) p *chunk
$4 = {freeList = 0xf76fa9a9, objCache = 0xa892550, count = 1, next = 0x8983798, prev = 0x0, lastref = 973437275}
(gdb) p *0xf76fa9a9
Cannot access memory at address 0xf76fa9a9

 So the chunk struct must have been corrupted earlier: its freeList
 points at memory that cannot be accessed. Where and when, I have no idea.
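
 One way to catch this kind of corruption closer to its source is to
 validate freeList before dereferencing it. The sketch below assumes that
 free items live inside the chunk's objCache region and that the chunk
 holds a fixed number of equally sized objects. The freeList and objCache
 field names come from the gdb output above; demo_chunk,
 demo_chunk_freelist_ok, obj_size and capacity are hypothetical. Note
 that the value in the dump, 0xf76fa9a9, is an odd address far from
 objCache (0xa892550), so a range and alignment test of this kind would
 reject it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the chunk struct seen in the gdb dump. */
typedef struct demo_chunk {
    void *freeList;   /* first free object in this chunk, or NULL */
    void *objCache;   /* start of the chunk's object storage */
    int count;        /* bookkeeping field, not used by the check */
} demo_chunk;

/* Return 1 if freeList is NULL or points at the start of an object slot
 * inside [objCache, objCache + obj_size * capacity); return 0 otherwise.
 * obj_size and capacity are assumed to be known to the caller. */
static int demo_chunk_freelist_ok(const demo_chunk *chunk,
                                  size_t obj_size, size_t capacity)
{
    uintptr_t base, p, off;

    assert(chunk != NULL);
    if (chunk->freeList == NULL)
        return 1;                      /* empty freelist is valid */
    if (obj_size == 0)
        return 0;
    base = (uintptr_t)chunk->objCache;
    p = (uintptr_t)chunk->freeList;
    if (p < base)
        return 0;                      /* below the chunk */
    off = p - base;
    if (off >= obj_size * capacity)
        return 0;                      /* past the end of the chunk */
    return off % obj_size == 0;        /* must sit on a slot boundary */
}
```

 Calling such a check on every get and put, and aborting on failure,
 would not explain who overwrote the chunk, but it would make the crash
 happen nearer to the moment of corruption than a later SIGSEGV does.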

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 Delfi Online
 Tel: 6501 731, Fax: 6501 708
 Pärnu mnt. 158, Tallinn,
 11317 Estonia
Received on Tue Nov 07 2000 - 01:36:10 MST
