Re: debugs from an interesting CBDATA case study

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 07 Jan 2014 08:35:54 -0700

On 01/07/2014 02:40 AM, Amos Jeffries wrote:
> For background:
> CBDATA uses an external lock counting and pointer management structure
> to perform auto_ptr/unique_ptr semantics on objects. If the objects
> "this" pointer does not match a magic binary pattern in the CBDATA
> object (cookie) for the class it will assert as shown below.
>
>
> For the code being patched:
>
> * Both Comm::TcpReceiver and ServerStateData inherit from AsynsJob using
> the "virtual public" inheritence.
>
> * Both Comm::TcpReceiver and ServerStateData attempt to use
> JobCallback() passing their 'this' pointer.
>
> * The instantiated object in all cases described below is an
> HttpStateData object which inherits from both of these classes.
>
> * the only change between builds is the class inheritence order.

If you are stuck, I suggest two things:

0) Make sure HttpStateData is the only class in its class hierarchy with
CBDATA_CLASS2() or equivalent.

1) Create the closed_ callback in start() or the method which start()
calls, not the constructor. I do not see how from-child-constructor
calling can create this specific bug (based on your description), but it
is not kosher and should be fixed anyway.

2) [Add cbdata debugging to] find which object stores the invalid cbdata
pointer. Then find how that invalid pointer gets to that object. Since
you can reproduce this easily, it should not take long to pin point the
wrong-pointer source. Once we know what creates the wrong cbdata
pointer, we should be able to fix the problem. (And if the pointer is
correct, we would know that something corrupts job memory).

The information necessary for #2 may already be present in your logs and
stack traces, but it is difficult for me to weed it out.

Good luck,

Alex.

>>From the hierarchy definition:
> class HttpStateData : public ServerStateData, public Comm::TcpReceiver
>
> This is replicable with any MISS request.
>
> 2014/01/02 02:01:50.351 kid1| src/http.cc(2243) httpStart: httpStart:
> "GET http://example.com/"
> 2014/01/02 02:01:50.352 kid1| src/cbdata.cc(324) cbdataInternalAlloc:
> cbdataAlloc: 0x9c8fdf0
> 2014/01/02 02:01:50.352 kid1| ../src/base/AsyncJob.cc(28) AsyncJob:
> AsyncJob constructed, this=0x9c8fe80 type=HttpStateData [job5]
> 2014/01/02 02:01:50.352 kid1| src/store.cc(545) lock: StoreEntry::lock:
> key 'D9237C562111E3C9330563403648CEC3' count=3
> 2014/01/02 02:01:50.352 kid1| src/http.cc(113) HttpStateData:
> HttpStateData 0x9c8fdf0 created
> 2014/01/02 02:01:50.353 kid1| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x9c8fdf0=1
> 2014/01/02 02:01:50.353 kid1| src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x9c8fdf0
> 2014/01/02 02:01:50.353 kid1| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x9c8fdf0=2
> 2014/01/02 02:01:50.353 kid1| src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x9c8fdf0
> 2014/01/02 02:01:50.359 kid1| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x9c8fe24=1
> 2014/01/02 02:01:50.359 kid1| assertion failed: ../../src/cbdata.cc:132:
> "cookie == ((long)this ^ Cookie)"
>
>
> #3 0x0814a09b in xassert (msg=0x837f7b8 "cookie == ((long)this ^
> Cookie)", file=0x837f5ba "../../src/cbdata.cc", line=132) at
> ../../src/debug.cc:566
> __FUNCTION__ = "xassert"
> #4 0x0810ed83 in check (aLine=423, this=0x85b52d4) at
> ../../src/cbdata.cc:132
> this = 0x85b52d4
> #5 cbdataInternalLock (p=p_at_entry=0x85b52e4) at ../../src/cbdata.cc:423
> c = 0x85b52d4
> __FUNCTION__ = "cbdataInternalLock"
> #6 0x082e2b11 in CommCommonCbParams::CommCommonCbParams
> (this=this_at_entry=0xbffff254, aData=aData_at_entry=0x85b52e4) at
> ../../src/CommCalls.cc:10
> No locals.
> #7 0x082e3813 in CommCloseCbParams::CommCloseCbParams (this=0xbffff254,
> aData=0x85b52e4) at ../../src/CommCalls.cc:107
> No locals.
> #8 0x0834f190 in CommCbMemFunT (aMeth=
> (void (Comm::TcpReceiver::*)(Comm::TcpReceiver * const, const
> CommCbMemFunT<Comm::TcpReceiver, CommCloseCbParams>::Params &))
> 0x834e270 <Comm::TcpReceiver::tcpConnectionClosed(CommCloseCbParams
> const&)>, aJob=..., this=0xbffff268) at ../../../src/CommCalls.h:180
> No locals.
> #9 Comm::TcpReceiver::tcpConnectionInit (this=this_at_entry=0x85b52e4) at
> ../../../src/comm/TcpReceiver.cc:36
> No locals.
> ----- failing:
> void
> Comm::TcpReceiver::tcpConnectionInit()
> {
> /* Ideally this would be setup by the constructor but it involves
> * calls to toCbdata() virtual function implemented by the derived class
> * so must be run explicitly by that class's constructor.
> */
>
> typedef CommCbMemFunT<Comm::TcpReceiver, CommCloseCbParams> Dialer;
> //line 36:
> closed_ = JobCallback(33, 5, Dialer, this,
> Comm::TcpReceiver::tcpConnectionClosed);
> comm_add_close_handler(tcp->fd, closed_);
> }
> -----
>
> #10 0x081a8e20 in HttpStateData::HttpStateData
> (this=this_at_entry=0x85b52b0, theFwdState=theFwdState_at_entry=0x85b3380,
> __in_chrg=<optimized out>,
> __vtt_parm=<optimized out>) at ../../src/http.cc:140
> No locals.
> #11 0x081a8f0c in httpStart (fwd=fwd_at_entry=0x85b3380) at
> ../../src/http.cc:2244
> __FUNCTION__ = "httpStart"
>
>
>
>
>
> *******************************************************
>
>
>>From the hierarchy definition:
> class HttpStateData : public Comm::TcpReceiver, public ServerStateData
>
> job12 is a POST request. It gets past the httpStart() function which
> dies above. There is a series of "waiting for request body to complete"
> calls. Then ...
>
> 2014/01/05 04:22:43.901| ../src/base/AsyncJob.cc(117) callStart:
> HttpStateData status in: [ job12]
> 2014/01/05 04:22:43.901| src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x89ed1f8
> 2014/01/05 04:22:43.901| src/Server.cc(368) sentRequestBody:
> sentRequestBody: FD 22: size 463: errflag 0.
> 2014/01/05 04:22:43.901| src/Server.cc(369) sentRequestBody:
> sentRequestBody called
> 2014/01/05 04:22:43.901| src/BodyPipe.cc(377) postConsume:
> reentrant debuging 2-{src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x89ed1f8}-2
> consumed 106 bytes [106<=106<=106 0+2047 pipe0x89e9080 cons0x89ed224]
> 2014/01/05 04:22:43.902| src/Server.cc(425) sendMoreRequestBody: will
> write 106 request body bytes
> 2014/01/05 04:22:43.902| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x89ed1f8=10
> 2014/01/05 04:22:43.902| src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x89ed1f8
> 2014/01/05 04:22:43.903| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x89ed1f8=11
> 2014/01/05 04:22:43.903| src/cbdata.cc(510) cbdataReferenceValid:
> cbdataReferenceValid: 0x89ed1f8
> 2014/01/05 04:22:43.903| src/cbdata.cc(419) cbdataInternalLock:
> cbdataLock: 0x89ed224=1
> 2014/01/05 04:22:43.903| assertion failed: ../../src/cbdata.cc:132:
> "cookie == ((long)this ^ Cookie)"
>
>
> #3 0x0814a1fb in xassert (msg=0x837f8f8 "cookie == ((long)this ^
> Cookie)", file=0x837f6fa "../../src/cbdata.cc", line=132) at
> ../../src/debug.cc:566
> __FUNCTION__ = "xassert"
> #4 0x0810eee3 in check (aLine=423, this=0x89ed214) at
> ../../src/cbdata.cc:132
> this = 0x89ed214
> #5 cbdataInternalLock (p=p_at_entry=0x89ed224) at ../../src/cbdata.cc:423
> c = 0x89ed214
> __FUNCTION__ = "cbdataInternalLock"
> #6 0x082e2c91 in CommCommonCbParams::CommCommonCbParams
> (this=this_at_entry=0xbffff2ac, aData=aData_at_entry=0x89ed224) at
> ../../src/CommCalls.cc:10
> No locals.
> #7 0x082e3524 in CommIoCbParams::CommIoCbParams (this=0xbffff2ac,
> aData=0x89ed224) at ../../src/CommCalls.cc:78
> No locals.
> #8 0x0823de31 in CommCbMemFunT (aMeth=&virtual
> ServerStateData::sentRequestBody(CommIoCbParams const&), aJob=...,
> this=0xbffff2c8)
> at ../../src/CommCalls.h:180
> No locals.
> #9 ServerStateData::sendMoreRequestBody (this=this_at_entry=0x89ed224) at
> ../../src/Server.cc:427
> conn = {p_ = 0x89ed008}
> __FUNCTION__ = "sendMoreRequestBody"
> buf = {buf = 0x89f13d8
> "0h0f0D0B0_at_0\t\006\005+\016\003\002\032\005", size = 106, max_capacity =
> 2097152000, capacity = 2048, stolen = 0,
> static CBDATA_MemBuf = 10}
> #10 0x0823ef48 in ServerStateData::sentRequestBody (this=0x89ed224,
> io=...) at ../../src/Server.cc:403
> __FUNCTION__ = "sentRequestBody"
> #11 0x081aee60 in CommCbMemFunT<HttpStateData, CommIoCbParams>::doDial
> (this=0x89f139c) at ../../src/CommCalls.h:197
> No locals.
> #12 0x081aef75 in JobDialer<HttpStateData>::dial
> (this=this_at_entry=0x89f139c, call=...) at ../../src/base/AsyncJobCalls.h:166
> __FUNCTION__ = "dial"
> #13 0x081af0d6 in AsyncCallT<CommCbMemFunT<HttpStateData,
> CommIoCbParams> >::fire (this=0x89f1380) at ../../src/base/AsyncCall.h:137
> No locals.
> #14 0x082d0442 in AsyncCall::make (this=0x89f1380) at
> ../../../src/base/AsyncCall.cc:32
> __FUNCTION__ = "make"
> ...
>
Received on Tue Jan 07 2014 - 15:36:06 MST

This archive was generated by hypermail 2.2.0 : Tue Jan 14 2014 - 12:00:13 MST