Re: Possible reason for std::vector crashes

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Tue, 08 Apr 2014 18:48:07 +0300

The problem is that the peer_select code trying to access a std::vector
which is already destroyed.
If we call peerSelect for fwd FwdState object eg:
    peerSelect(&fwd->serverDestinations, ..., fwd);

and the fwd state object become invalid (eg because client closed the
connection) then it is possible to see this crash.

I believe the fix for these cases is easy, we need to add the following
code at the beggining of peerSelectDnsResults and peerSelectDnsPaths
methods (and any other method uses the vector):

  if (!cbdataReferenceValid(psstate->callback_data)) {
    delete psstate;
    return;
  }

Regards,
   Christos

On 03/26/2014 11:13 AM, Kinkie wrote:
> And instead..
> the code testing Alex's suspects triggered in peer_select.cc:337, this
> is the relevant snippet of stack trace:
>
>
> #3 0x0000000000632f2d in Vector<RefCount<Comm::Connection> >::size (
> this=<optimized out>, this=<optimized out>) at ../src/base/Vector.h:355
> #4 0x0000000000638a45 in peerSelectDnsResults (ia=0x29ff8b0, details=...,
> data=0x2cf59b0) at peer_select.cc:337
> #5 0x000000000060d4e6 in ipcacheCallback (i=i_at_entry=0x29ff890,
> wait=wait_at_entry=19) at ipcache.cc:347
> #6 0x000000000060dbd4 in ipcacheHandleReply (data=<optimized out>,
> answers=<optimized out>, na=<optimized out>, error_message=0x0)
> at ipcache.cc:497
> #7 0x000000000058b357 in idnsCallback (q=0x24eec80, q_at_entry=0x24f5370,
> error=error_at_entry=0x0) at dns_internal.cc:1127
>
>
> I'm currently investigating more in depth.
>
> On Wed, Mar 26, 2014 at 8:32 AM, Kinkie <gkinkie_at_gmail.com> wrote:
>> Hi,
>> I think I have isolated the changelet in FwdState::connectionList.
>> However WHAT in that changelet may be causing the crashes is still
>> unknown. The hunt for iterate-after-clear has so far turned up
>> nothing.
>>
>> On Wed, Mar 26, 2014 at 8:01 AM, Niki Gorchilov <niki_at_gorchilov.com> wrote:
>>> Hello there,
>>>
>>> Is there any progress regarding the random coredumps related to
>>> std::vector migration?
>>>
>>> Best,
>>> Niki
>>>
>>> On Mon, Mar 17, 2014 at 6:41 PM, Kinkie <gkinkie_at_gmail.com> wrote:
>>>> Yes, I will.
>>>>
>>>> On Mon, Mar 17, 2014 at 5:24 PM, Alex Rousskov
>>>> <rousskov_at_measurement-factory.com> wrote:
>>>>> Hello,
>>>>>
>>>>> I have discovered one difference between std::vector and Squid's own
>>>>> Vector implementation that might explain some of the random crashes that
>>>>> some of us have been seeing after the Vector class was replaced with
>>>>> std::vector in trunk.
>>>>>
>>>>> Squid Vector sets the number of stored elements to zero on destruction.
>>>>> Std::vector does not do that. Here is the output of a simple test code
>>>>> quoted in the postscriptum:
>>>>>
>>>>>> alive std::vector size: 1
>>>>>> deleted std::vector size: 3
>>>>>> alive Squid Vector size: 1
>>>>>> deleted Squid Vector size: 0
>>>>>
>>>>> Both vectors behave correctly, but std::vector code is optimized not to
>>>>> do extra work such as maintaining the size member. This means that
>>>>> iterating a deleted Squid Vector is often safe (until the freed memory
>>>>> is overwritten) because the broken caller code would just discover that
>>>>> there are no items in the vector and move on.
>>>>>
>>>>> Iterating the deleted std::vector usually leads to crashes because all
>>>>> iteration-related methods such as size() rely on immediately deleted and
>>>>> changed pointers (std::vector does not store its size as a data member
>>>>> but uses memory pointers to compute the number of stored elements).
>>>>>
>>>>> This difference leads to std::vector migration problems such as this
>>>>> trivially reproducible one:
>>>>>
>>>>>> Adaptation::AccessRules &
>>>>>> Adaptation::AllRules()
>>>>>> {
>>>>>> static AccessRules TheRules;
>>>>>> return TheRules;
>>>>>> }
>>>>>
>>>>> After TheRules array is deallocated during global destructions,
>>>>> iterating AllRules() becomes unsafe. The old Vector-based code usually
>>>>> worked because deallocated TheRules had zero size.
>>>>>
>>>>> The specific bug mentioned above is trivial to work around by allocating
>>>>> TheRules dynamically (and never deleting it). However, if similar bugs
>>>>> exist in other code where Vector is used as a data member of some
>>>>> transaction-related structure, then they will lead to crashes. It is
>>>>> quite possible that the affected structure's memory itself is never
>>>>> deleted when the offending code accesses it (due to cbdata locks, for
>>>>> example) so the [equally buggy] Vector-based code "works".
>>>>>
>>>>> One way we can try to catch such cases is by temporary switching back to
>>>>> Vector and then:
>>>>>
>>>>> * Modifying Vector to mark deleted Vector objects:
>>>>>
>>>>> Vector::~Vector() {
>>>>> clean();
>>>>> deleted = true;
>>>>> }
>>>>>
>>>>> * And adjusting *every* Vector method to assert if a deleted object is
>>>>> still being used. For example:
>>>>>
>>>>> template<class E>
>>>>> size_t
>>>>> Vector<E>::size() const
>>>>> {
>>>>> assert(!deleted);
>>>>> return count;
>>>>> }
>>>>>
>>>>> If one of those asserts is triggered, we would be closer to solving this
>>>>> mystery.
>>>>>
>>>>>
>>>>> Kinkie, if you agree with this analysis, would you be able to make the
>>>>> above modifications and test?
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Alex.
>>>>>
>>>>>
>>>>> Goes into Squid's main.cc:
>>>>>> +#include <vector>
>>>>>> +#include "base/Vector.h"
>>>>>> int
>>>>>> main(int argc, char **argv)
>>>>>> {
>>>>>> + typedef std::vector<int> StdVector;
>>>>>> + StdVector *stdv = new StdVector;
>>>>>> + stdv->push_back(1);
>>>>>> + std::cout << "alive std::vector size: " << stdv->size() << "\n";
>>>>>> + delete stdv;
>>>>>> + std::cout << "deleted std::vector size: " << stdv->size() << "\n";
>>>>>> +
>>>>>> + typedef Vector<int> SqdVector;
>>>>>> + SqdVector *sqdv = new SqdVector;
>>>>>> + sqdv->push_back(1);
>>>>>> + std::cout << "alive Squid Vector size: " << sqdv->size() << "\n";
>>>>>> + delete sqdv;
>>>>>> + std::cout << "deleted Squid Vector size: " << sqdv->size() << "\n";
>>>>>> + if (true) return 0;
>>>>>> return SquidMainSafe(argc, argv);
>>>>>> }
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Francesco
>>
>>
>>
>> --
>> Francesco
>
>
>
Received on Tue Apr 08 2014 - 15:48:15 MDT

This archive was generated by hypermail 2.2.0 : Tue Apr 08 2014 - 12:00:12 MDT