I did a bzr update just now, to commit, and I see that this problem is
fixed :-)
Please ignore....
On 04/08/2014 06:48 PM, Tsantilas Christos wrote:
> The problem is that the peer_select code trying to access a std::vector
> which is already destroyed.
> If we call peerSelect for fwd FwdState object eg:
> peerSelect(&fwd->serverDestinations, ..., fwd);
>
> and the fwd state object become invalid (eg because client closed the
> connection) then it is possible to see this crash.
>
> I believe the fix for these cases is easy, we need to add the following
> code at the beggining of peerSelectDnsResults and peerSelectDnsPaths
> methods (and any other method uses the vector):
>
> if (!cbdataReferenceValid(psstate->callback_data)) {
> delete psstate;
> return;
> }
>
> Regards,
> Christos
>
> On 03/26/2014 11:13 AM, Kinkie wrote:
>> And instead..
>> the code testing Alex's suspects triggered in peer_select.cc:337, this
>> is the relevant snippet of stack trace:
>>
>>
>> #3 0x0000000000632f2d in Vector<RefCount<Comm::Connection> >::size (
>> this=<optimized out>, this=<optimized out>) at ../src/base/Vector.h:355
>> #4 0x0000000000638a45 in peerSelectDnsResults (ia=0x29ff8b0, details=...,
>> data=0x2cf59b0) at peer_select.cc:337
>> #5 0x000000000060d4e6 in ipcacheCallback (i=i_at_entry=0x29ff890,
>> wait=wait_at_entry=19) at ipcache.cc:347
>> #6 0x000000000060dbd4 in ipcacheHandleReply (data=<optimized out>,
>> answers=<optimized out>, na=<optimized out>, error_message=0x0)
>> at ipcache.cc:497
>> #7 0x000000000058b357 in idnsCallback (q=0x24eec80, q_at_entry=0x24f5370,
>> error=error_at_entry=0x0) at dns_internal.cc:1127
>>
>>
>> I'm currently investigating more in depth.
>>
>> On Wed, Mar 26, 2014 at 8:32 AM, Kinkie <gkinkie_at_gmail.com> wrote:
>>> Hi,
>>> I think I have isolated the changelet in FwdState::connectionList.
>>> However WHAT in that changelet may be causing the crashes is still
>>> unknown. The hunt for iterate-after-clear has so far turned up
>>> nothing.
>>>
>>> On Wed, Mar 26, 2014 at 8:01 AM, Niki Gorchilov <niki_at_gorchilov.com> wrote:
>>>> Hello there,
>>>>
>>>> Is there any progress regarding the random coredumps related to
>>>> std::vector migration?
>>>>
>>>> Best,
>>>> Niki
>>>>
>>>> On Mon, Mar 17, 2014 at 6:41 PM, Kinkie <gkinkie_at_gmail.com> wrote:
>>>>> Yes, I will.
>>>>>
>>>>> On Mon, Mar 17, 2014 at 5:24 PM, Alex Rousskov
>>>>> <rousskov_at_measurement-factory.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have discovered one difference between std::vector and Squid's own
>>>>>> Vector implementation that might explain some of the random crashes that
>>>>>> some of us have been seeing after the Vector class was replaced with
>>>>>> std::vector in trunk.
>>>>>>
>>>>>> Squid Vector sets the number of stored elements to zero on destruction.
>>>>>> Std::vector does not do that. Here is the output of a simple test code
>>>>>> quoted in the postscriptum:
>>>>>>
>>>>>>> alive std::vector size: 1
>>>>>>> deleted std::vector size: 3
>>>>>>> alive Squid Vector size: 1
>>>>>>> deleted Squid Vector size: 0
>>>>>>
>>>>>> Both vectors behave correctly, but std::vector code is optimized not to
>>>>>> do extra work such as maintaining the size member. This means that
>>>>>> iterating a deleted Squid Vector is often safe (until the freed memory
>>>>>> is overwritten) because the broken caller code would just discover that
>>>>>> there are no items in the vector and move on.
>>>>>>
>>>>>> Iterating the deleted std::vector usually leads to crashes because all
>>>>>> iteration-related methods such as size() rely on immediately deleted and
>>>>>> changed pointers (std::vector does not store its size as a data member
>>>>>> but uses memory pointers to compute the number of stored elements).
>>>>>>
>>>>>> This difference leads to std::vector migration problems such as this
>>>>>> trivially reproducible one:
>>>>>>
>>>>>>> Adaptation::AccessRules &
>>>>>>> Adaptation::AllRules()
>>>>>>> {
>>>>>>> static AccessRules TheRules;
>>>>>>> return TheRules;
>>>>>>> }
>>>>>>
>>>>>> After TheRules array is deallocated during global destructions,
>>>>>> iterating AllRules() becomes unsafe. The old Vector-based code usually
>>>>>> worked because deallocated TheRules had zero size.
>>>>>>
>>>>>> The specific bug mentioned above is trivial to work around by allocating
>>>>>> TheRules dynamically (and never deleting it). However, if similar bugs
>>>>>> exist in other code where Vector is used as a data member of some
>>>>>> transaction-related structure, then they will lead to crashes. It is
>>>>>> quite possible that the affected structure's memory itself is never
>>>>>> deleted when the offending code accesses it (due to cbdata locks, for
>>>>>> example) so the [equally buggy] Vector-based code "works".
>>>>>>
>>>>>> One way we can try to catch such cases is by temporary switching back to
>>>>>> Vector and then:
>>>>>>
>>>>>> * Modifying Vector to mark deleted Vector objects:
>>>>>>
>>>>>> Vector::~Vector() {
>>>>>> clean();
>>>>>> deleted = true;
>>>>>> }
>>>>>>
>>>>>> * And adjusting *every* Vector method to assert if a deleted object is
>>>>>> still being used. For example:
>>>>>>
>>>>>> template<class E>
>>>>>> size_t
>>>>>> Vector<E>::size() const
>>>>>> {
>>>>>> assert(!deleted);
>>>>>> return count;
>>>>>> }
>>>>>>
>>>>>> If one of those asserts is triggered, we would be closer to solving this
>>>>>> mystery.
>>>>>>
>>>>>>
>>>>>> Kinkie, if you agree with this analysis, would you be able to make the
>>>>>> above modifications and test?
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Alex.
>>>>>>
>>>>>>
>>>>>> Goes into Squid's main.cc:
>>>>>>> +#include <vector>
>>>>>>> +#include "base/Vector.h"
>>>>>>> int
>>>>>>> main(int argc, char **argv)
>>>>>>> {
>>>>>>> + typedef std::vector<int> StdVector;
>>>>>>> + StdVector *stdv = new StdVector;
>>>>>>> + stdv->push_back(1);
>>>>>>> + std::cout << "alive std::vector size: " << stdv->size() << "\n";
>>>>>>> + delete stdv;
>>>>>>> + std::cout << "deleted std::vector size: " << stdv->size() << "\n";
>>>>>>> +
>>>>>>> + typedef Vector<int> SqdVector;
>>>>>>> + SqdVector *sqdv = new SqdVector;
>>>>>>> + sqdv->push_back(1);
>>>>>>> + std::cout << "alive Squid Vector size: " << sqdv->size() << "\n";
>>>>>>> + delete sqdv;
>>>>>>> + std::cout << "deleted Squid Vector size: " << sqdv->size() << "\n";
>>>>>>> + if (true) return 0;
>>>>>>> return SquidMainSafe(argc, argv);
>>>>>>> }
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Francesco
>>>
>>>
>>>
>>> --
>>> Francesco
>>
>>
>>
>
>
-- Tsantilas Christos Network and Systems Engineer email:christos_at_chtsanti.net web:http://www.chtsanti.net Phone:+30 6977678842Received on Tue Apr 08 2014 - 15:57:01 MDT
This archive was generated by hypermail 2.2.0 : Tue Apr 08 2014 - 12:00:12 MDT