Re: Strange assertion failure in 2.6 snapshot...

From: Joe Cooper <joe@dont-contact.us>
Date: Tue, 05 Mar 2002 22:38:25 -0600

I can trigger it on demand, every time. It is just a plain ol' purge
from squidclient:

squidclient -m PURGE http://www.somesite.com/blah.html

Crash. Any purge will do. But I think this is the first crasher I've
run into, so you're not doing too bad. ;-)
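
For anyone who wants to reproduce this without squidclient, here is a rough sketch of the kind of request involved. It is an approximation, not a capture of what squidclient actually sends: the helper names `build_purge_request` and `send_purge` are made up for illustration, the header set is a guess at a minimal one, and the proxy address (`localhost`, port 3128) is an assumption about a default setup.

```python
import socket
from urllib.parse import urlsplit


def build_purge_request(url: str) -> bytes:
    """Build a minimal HTTP/1.0 PURGE request aimed at a proxy.

    Hypothetical helper: the exact headers squidclient sends may differ.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected an absolute URL, e.g. http://host/path")
    lines = [
        f"PURGE {url} HTTP/1.0",   # proxies take the absolute URL in the request line
        f"Host: {parts.netloc}",
        "Accept: */*",
        "",                        # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_purge(url: str, proxy_host: str = "localhost",
               proxy_port: int = 3128, timeout: float = 5.0) -> bytes:
    """Send the PURGE to the proxy and return the raw response bytes.

    proxy_host and proxy_port are assumptions; adjust them for the cache under test.
    """
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_purge_request(url))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:           # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `send_purge("http://www.somesite.com/blah.html")` against the affected box should be roughly equivalent to the squidclient line above, provided the cache's access controls allow PURGE from that host.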

I can give you an account on my machine to bounce over to the server in
question (which is behind a firewall) and root access on it, if it would
be helpful.

Adrian Chadd wrote:
> On Tue, Mar 05, 2002, Joe Cooper wrote:
>
>>Hey folks,
>>
>>It appears that 2.6 after the commloops merge will not perform a PURGE
>>without an assertion failure. I get the following:
>>
>>2002/03/05 08:36:30| assertion failed: store_client.c:211:
>>"sc->cmp_offset == copy_offset"
>>
>
> That's definitely from my work.
> Interesting... so you're saying that a PURGE will _reliably_ trigger
> this bug?
>
> I've had the bug once since I committed the code (damn timing! :)
> and someone else has reported it to squid-dev, but I haven't
> been able to reproduce it. If you can do it reliably here, please tell
> me (and tell me exactly how you're doing the PURGE) and I'll be
> happy to track this down and squish it. :-)
>
> The code goes through this path quite frequently, so I'm not sure why
> it's deciding all of a sudden to read less of an object.
> http->out.offset is only ever incremented, so something is being
> replaced here (storeentry/storeclient?)
>
>
>
> adrian
>
>
>>And a backtrace of:
>>
>>Program received signal SIGABRT, Aborted.
>>[Switching to Thread 1024 (LWP 27216)]
>>0x400db5c1 in __kill () from /lib/libc.so.6
>>(gdb) bt
>>#0 0x400db5c1 in __kill () from /lib/libc.so.6
>>#1 0x4005538e in raise (sig=6) at signals.c:65
>>#2 0x400dc9a8 in abort () at ../sysdeps/generic/abort.c:88
>>#3 0x08065e15 in xassert ()
>>#4 0x0809bf30 in storeClientCopy ()
>>#5 0x0805fb83 in clientWriteComplete ()
>>#6 0x0806274d in CommWriteStateCallbackAndFree ()
>>#7 0x08065249 in comm_select ()
>>#8 0x08085b35 in main ()
>>#9 0x400c9e5e in __libc_start_main (main=0x8085860 <main>, argc=2,
>> ubp_av=0xbffffb54, init=0x8049fb0 <_init>, fini=0x80bc720 <_fini>,
>> rtld_fini=0x4000d3c4 <_dl_fini>, stack_end=0xbffffb4c)
>> at ../sysdeps/generic/libc-start.c:129
>>
>>(gdb) frame 4
>>#4 0x0809bf30 in storeClientCopy ()
>>(gdb) print sc->cmp_offset
>>Attempt to extract a component of a value that is not a structure pointer.
>>(gdb) print sc
>>$1 = 0x400830d0
>>(gdb) print *sc
>>$2 = 0.0084132443630149294931509329313612522
>>
>>This is a snapshot from immediately after the merge, so maybe Adrian has
>>fixed it since then? I'll fetch it and check it out next. I was just
>>guessing about what sort of information would be needed for debugging--I
>>can do another backtrace and print any other values needed to figure out
>>what's happening.
>>
>>Anyway. This seemed weird enough to get some input from Adrian...
>>
>>Will give this a go on 2.5 and 2.4, also, but I seem to recall I was
>>already running 2.5 for some earlier work in this area with no problems.
>> (And I suspect someone would have mentioned it by now, also.)
>>--
>>Joe Cooper <joe@swelltech.com>
>>http://www.swelltech.com
>>Web Caching Appliances and Support
>>
>>
>
>
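
To make the quoted assertion a bit more concrete, here is a toy model of the invariant it checks. This is not Squid's code and not a diagnosis: `StoreClientModel`, `replay`, and the rule that the expected offset advances by the size of each copy are all assumptions made for illustration. It only shows why swapping in a fresh store client mid-transfer, as Adrian speculates above, would trip a check of the form "sc->cmp_offset == copy_offset" when the caller's offset only ever grows.

```python
class StoreClientModel:
    """Hypothetical stand-in for a store client; not Squid's real structure."""

    def __init__(self):
        # Offset at which this client expects the next copy to start.
        self.cmp_offset = 0

    def copy(self, copy_offset: int, size: int) -> int:
        # Same shape as the failing check: the requested offset must match
        # the offset left behind by the previous copy.
        if self.cmp_offset != copy_offset:
            raise AssertionError(
                f"sc->cmp_offset ({self.cmp_offset}) != copy_offset ({copy_offset})"
            )
        # Assumed bookkeeping: the expected offset advances by the bytes copied.
        self.cmp_offset = copy_offset + size
        return self.cmp_offset


def replay(reads, replace_client_before=None):
    """Replay a sequence of read sizes against the model.

    replace_client_before: index of the read before which the client is
    swapped for a fresh one (the assumed failure mode), or None for no swap.
    """
    sc = StoreClientModel()
    out_offset = 0  # plays the role of http->out.offset: only ever incremented
    for i, size in enumerate(reads):
        if i == replace_client_before:
            sc = StoreClientModel()  # fresh client, so cmp_offset is back at 0
        out_offset = sc.copy(out_offset, size)
    return out_offset
```

With no swap, `replay([100, 50, 25])` runs cleanly. With `replace_client_before=1`, the second copy asks for offset 100 while the fresh client still expects 0, and the check fails.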

-- 
Joe Cooper <joe@swelltech.com>
http://www.swelltech.com
Web Caching Appliances and Support
Received on Tue Mar 05 2002 - 21:39:28 MST
