Possible reason for std::vector crashes

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 17 Mar 2014 10:24:29 -0600

Hello,

    I have discovered one difference between std::vector and Squid's own
Vector implementation that might explain some of the random crashes that
some of us have been seeing after the Vector class was replaced with
std::vector in trunk.

Squid Vector sets the number of stored elements to zero on destruction.
Std::vector does not do that. Here is the output of a simple test code
quoted in the postscriptum:

> alive std::vector size: 1
> deleted std::vector size: 3
> alive Squid Vector size: 1
> deleted Squid Vector size: 0

Both vectors behave correctly, but std::vector code is optimized not to
do extra work such as maintaining the size member. This means that
iterating a deleted Squid Vector is often safe (until the freed memory
is overwritten) because the broken caller code would just discover that
there are no items in the vector and move on.

Iterating the deleted std::vector usually leads to crashes because all
iteration-related methods such as size() rely on immediately deleted and
changed pointers (std::vector does not store its size as a data member
but uses memory pointers to compute the number of stored elements).

This difference leads to std::vector migration problems such as this
trivially reproducible one:

> Adaptation::AccessRules &
> Adaptation::AllRules()
> {
> static AccessRules TheRules;
> return TheRules;
> }

After TheRules array is deallocated during global destructions,
iterating AllRules() becomes unsafe. The old Vector-based code usually
worked because deallocated TheRules had zero size.

The specific bug mentioned above is trivial to work around by allocating
TheRules dynamically (and never deleting it). However, if similar bugs
exist in other code where Vector is used as a data member of some
transaction-related structure, then they will lead to crashes. It is
quite possible that the affected structure's memory itself is never
deleted when the offending code accesses it (due to cbdata locks, for
example) so the [equally buggy] Vector-based code "works".

One way we can try to catch such cases is by temporary switching back to
Vector and then:

* Modifying Vector to mark deleted Vector objects:

Vector::~Vector() {
    clean();
    deleted = true;
}

* And adjusting *every* Vector method to assert if a deleted object is
still being used. For example:

template<class E>
size_t
Vector<E>::size() const
{
    assert(!deleted);
    return count;
}

If one of those asserts is triggered, we would be closer to solving this
mystery.

Kinkie, if you agree with this analysis, would you be able to make the
above modifications and test?

Thank you,

Alex.

Goes into Squid's main.cc:
> +#include <vector>
> +#include "base/Vector.h"
> int
> main(int argc, char **argv)
> {
> + typedef std::vector<int> StdVector;
> + StdVector *stdv = new StdVector;
> + stdv->push_back(1);
> + std::cout << "alive std::vector size: " << stdv->size() << "\n";
> + delete stdv;
> + std::cout << "deleted std::vector size: " << stdv->size() << "\n";
> +
> + typedef Vector<int> SqdVector;
> + SqdVector *sqdv = new SqdVector;
> + sqdv->push_back(1);
> + std::cout << "alive Squid Vector size: " << sqdv->size() << "\n";
> + delete sqdv;
> + std::cout << "deleted Squid Vector size: " << sqdv->size() << "\n";
> + if (true) return 0;
> return SquidMainSafe(argc, argv);
> }
Received on Mon Mar 17 2014 - 16:24:42 MDT

This archive was generated by hypermail 2.2.0 : Mon Mar 17 2014 - 12:00:12 MDT