Using std::shared_ptr to share data between producer/consumer threads - c++

I am trying to use std::shared_ptr to point to the data being produced by one thread and consumed by another. The storage field is a shared pointer to the base class,
Here's the simplest Google Test I could create that reproduced the problem:
#include "gtest/gtest.h"
#include <thread>
struct A
{
virtual ~A() {}
virtual bool isSub() { return false; }
};
struct B : public A
{
bool isSub() override { return true; }
};
TEST (SharedPointerTests, threadedProducerConsumer)
{
int loopCount = 10000;
shared_ptr<A> ptr;
thread producer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
ptr = make_shared<B>(); // <--- THREAD
});
thread consumer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
shared_ptr<A> state = ptr; // <--- THREAD
});
producer.join();
consumer.join();
}
When run, sometimes gives:
[ RUN ] SharedPointerTests.threadedProducerConsumer
pure virtual method called
terminate called without an active exception
Aborted (core dumped)
GDB shows the crash with two threads at the locations shown. The stacks follow:
Stack 1
#0 0x00000000006f430a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fffe00008c0)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:144
#1 0x00000000006f26a7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffdf960bc8,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:553
#2 0x00000000006f1692 in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffdf960bc0,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:810
#3 0x00000000006f16ca in std::shared_ptr<A>::~shared_ptr (this=0x7fffdf960bc0, __in_chrg=<optimized out>)
at /usr/include/c++/4.8/bits/shared_ptr.h:93
#4 0x00000000006e7288 in SharedPointerTests_threadedProducerConsumer_Test::__lambda2::operator() (__closure=0xb9c940)
at /home/drew/dev/SharedPointerTests.hh:54
#5 0x00000000006f01ce in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()>::_M_invoke<>(std::_Index_tuple<>) (this=0xb9c940) at /usr/include/c++/4.8/functional:1732
#6 0x00000000006efe13 in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()>::operator()(void) (
this=0xb9c940) at /usr/include/c++/4.8/functional:1720
#7 0x00000000006efb7c in std::thread::_Impl<std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()> >::_M_run(void) (this=0xb9c928) at /usr/include/c++/4.8/thread:115
#8 0x00007ffff6d19ac0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007ffff717bf8e in start_thread (arg=0x7fffdf961700) at pthread_create.c:311
#10 0x00007ffff647ee1d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Stack 2
#0 0x0000000000700573 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > >::_S_destroy<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > (__a=..., __p=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/alloc_traits.h:281
#1 0x00000000007003b6 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > >::destroy<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > (__a=..., __p=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/alloc_traits.h:405
#2 0x00000000006ffe76 in std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2>::_M_destroy (
this=0x7fffe00008f0) at /usr/include/c++/4.8/bits/shared_ptr_base.h:416
#3 0x00000000006f434c in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:161
#4 0x00000000006f26a7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffe8161b68,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:553
#5 0x00000000006f16b0 in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffe8161b60,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:810
#6 0x00000000006f4c3f in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::operator=<B>(std::__shared_ptr<B, (__gnu_cxx::_Lock_policy)2>&&) (this=0x7fffffffdcb0, __r=<unknown type in /home/drew/dev/unittests, CU 0x0, DIE 0x58b8c>)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:897
#7 0x00000000006f2d2a in std::shared_ptr<A>::operator=<B>(std::shared_ptr<B>&&) (this=0x7fffffffdcb0,
__r=<unknown type in /home/drew/dev/unittests, CU 0x0, DIE 0x55e1c>)
at /usr/include/c++/4.8/bits/shared_ptr.h:299
#8 0x00000000006e7232 in SharedPointerTests_threadedProducerConsumer_Test::__lambda1::operator() (__closure=0xb9c7a0)
at /home/drew/dev/SharedPointerTests.hh:48
#9 0x00000000006f022c in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()>::_M_invoke<>(std::_Index_tuple<>) (this=0xb9c7a0) at /usr/include/c++/4.8/functional:1732
#10 0x00000000006efe31 in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()>::operator()(void) (
this=0xb9c7a0) at /usr/include/c++/4.8/functional:1720
#11 0x00000000006efb9a in std::thread::_Impl<std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()> >::_M_run(void) (this=0xb9c788) at /usr/include/c++/4.8/thread:115
#12 0x00007ffff6d19ac0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007ffff717bf8e in start_thread (arg=0x7fffe8162700) at pthread_create.c:311
#14 0x00007ffff647ee1d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
I have tried various approaches here, including using std::dynamic_pointer_cast but I haven't had any luck.
In reality the producer stores many different subclasses of A by their type_id in a std::map<type_id const*,std::shared_ptr<A>> (one instance per type) which I look up from the consumer by type.
My understanding is that std::shared_ptr is threadsafe for these types of operations. What am I missing?

shared_ptr has thread-safety on its control block. When a shared_ptr is created and points to a newly created resource it creates a control block. According to MSDN this holds:
The shared_ptr objects that own a resource share a control block. The control block holds:
the number of shared_ptr objects that own the resource,
the number of weak_ptr objects that point to the resource,
the deleter for that resource if it has one,
the custom allocator for the control block if it has one.
This means that shared_ptr will ensure that there are no synchronization issues with multiple copies of shared_ptr pointing to the same memory. However, it does not manage the synchronization of the memory itself. See the section on thread safety (emphasis mine)
Multiple threads can read and write different shared_ptr objects at the same time, even when the objects are copies that share ownership.
Your code shares ptr which means you have a data race. Also note that it is possible for your producer thread to produce several objects before the consumer thread is scheduled to run, meaning that you lose some objects.
As has been pointed out in a comment, you can use atomic operations on shared_ptr. The producer thread then looks like:
thread producer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
{
auto p = std::make_shared<B>(); // <--- THREAD
std::atomic_store<A>( &ptr, p );
}
});
The object is created and then atomically stored into ptr. The consumer then needs to atomically load the object.
thread consumer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
{
auto state = std::atomic_load<A>( &ptr ); // <--- THREAD
}
});
This still has the disadvantage that objects will be lost when the producer thread is allowed to run for multiple iterations.
These examples were written in Visual Studio 2012. At this time, gcc hasn't fully implemented atomic shared_ptr access, as noted in section 20.7.2.5

Related

How to do thread safe shared_ptr modification and access?

Goal: I want to modify internal information and access this information from many threads synchronously as fast as possible
I simplified code bellow, but this is how I tried to achieve this.
I have 2 shared pointers.
One is called m_mutable_data and the other is called m_const_data.
m_mutable_data is updated in strand guarded way. m_const_data is updated with contents of m_mutable_data every 60s also in the strand guarded way.
This is the only place m_const_data shared pointer is reset with new data. m_const_data is read synchronously by many threads, 1000+ times per second.
Code
class black_list_container : public std::enable_shared_from_this<black_list_container>
{
struct meta_data
{
bool blacked;
}
struct black_list_data
{
std::unordered_map<uint32_t,meta_data> data;
}
public:
#pragma optimize( "", off )
bool is_blacked(uint32_t id)
{
// This call is called from many different threads (1000+ calls per second)
// should be synchronous and as fast as possible
auto c = m_const_data;
return c->data[id].blacked;
}
#pragma optimize( "", on )
#pragma optimize( "", off )
void update_const_data()
{
// Called internaly by timer every 60s to update m_const_data with contents of m_mutable_data
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
auto snapshot = new black_list_data();
snapshot->data = m_mutable_data->data;
m_const_data.reset(snapshot);
});
}
#pragma optimize( "", on )
private:
void internal_modification_mutable_data()
{
// Called internaly by different metrics
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
// .... do some modification on internal m_mutable_data
});
}
boost::asio::io_context::strand m_strand;
std::shared_ptr<black_list_data> m_mutable_data;
std::shared_ptr<black_list_data> m_const_data;
};
Very, very seldom this code crashes in method 'is_blacked' on line
auto c = m_const_data;
This is the backtrace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./STRATUM-01'.
Program terminated with signal 6, Aborted.
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64
(gdb) bt
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
#1 0x00007fe09aaf2a78 in abort () from /lib64/libc.so.6
#2 0x00007fe09ab33ed7 in __libc_message () from /lib64/libc.so.6
#3 0x00007fe09ab3c299 in _int_free () from /lib64/libc.so.6
#4 0x00000000005fae36 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fe0440aeaa0) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:154
#5 0x00000000006b9205 in ~__shared_count (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:684
#6 ~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:1123
#7 ~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr.h:93
#8 black_list_container_impl::is_blacked (this=0x7fe08c287e50, id=23654) at /var/lib/jenkins/workspace/validator/src/black_list_container.cpp:69
I'm not exactly sure why destruction of shared_ptr is called in frame #7
Obviously I did not achieve my goal so please direct me into pattern that actually achieves my goal in thread safe way.
I know I could have used
std::atomic<std::shared_ptr<black_list_data>> m_const_data;
but would not this affect performance while reading from many different threads?
I think I found the answer to my question in this article Atomic Smart Pointers.
So I have to change code in update_const_data() to
auto snapshot = std::make_shared<black_list_data>();
snapshot->data = m_mutable_data->data;
std::atomic_store(&m_const_data, snapshot);
and code in is_blacked() to
auto c = std::atomic_load(&m_const_data);

Thread sanitizer reporting with QCoreApplication::postEvent() when calling into received event instance

I have following code:
std::thread thread;
EventReceiver target;
{
thread = std::thread([&]() {
auto event = new MyEvent();
QCoreApplication::postEvent(&target, event);
}
});
IdleProcessor::processEvents();
thread.join();
}
Where
class MyEvent : public QEvent, public async::ExecutorEventInterface
{
public:
MyEvent()
: QEvent(QEvent::User){
};
void execute() override
{
}
};
class EventReceiver : public QObject
{
Q_OBJECT
public:
EventReceiver();
/**
* #brief receives the MainExecutor posted event
* #param event
* #return
*/
bool event(QEvent* event) override
{
auto myEvent = dynamic_cast<ExecutorEventInterface*>(event);
if (myEvent)
{
myEvent->execute();
return true;
}
return false;
}
};
I get thread sanitizer report, and don't figure out why.
==================
WARNING: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) (pid=56278)
Read of size 8 at 0x7b0800026018 by main thread:
#0 async::EventReceiver::event(QEvent*) eventreceiver.cpp:16 (unittests:x86_64+0x107353775)
#1 QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) <null> (QtCore:x86_64+0x1ea917)
#2 AsyncTest_test_Test::TestBody() async_tests.cpp:103 (unittests:x86_64+0x1004fa8f3)
#3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (unittests:x86_64+0x1042addbd)
#4 TestApplication::event(QEvent*) testapplication.cpp:20 (unittests:x86_64+0x1041f89cf)
#5 QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) <null> (QtCore:x86_64+0x1ea917)
#6 start <null> (libdyld.dylib:x86_64+0x163d4)
Previous write of size 8 at 0x7b0800026018 by thread T4:
#0 async::ExecutorEventInterface::ExecutorEventInterface() executoreventinterface.h:9 (unittests:x86_64+0x10058dd10)
#1 MyEvent::MyEvent() async_tests.cpp:73 (unittests:x86_64+0x100599466)
#2 MyEvent::MyEvent() async_tests.cpp:74 (unittests:x86_64+0x1005993e8)
#3 AsyncTest_test_Test::TestBody()::$_6::operator()() const async_tests.cpp:98 (unittests:x86_64+0x100599322)
#4 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, AsyncTest_test_Test::TestBody()::$_6> >(void*) type_traits:4428 (unittests:x86_64+0x100598fb5)
Location is heap block of size 32 at 0x7b0800026000 allocated by thread T4:
#0 operator new(unsigned long) <null> (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x6d723)
#1 AsyncTest_test_Test::TestBody()::$_6::operator()() const async_tests.cpp:98 (unittests:x86_64+0x100599308)
#2 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, AsyncTest_test_Test::TestBody()::$_6> >(void*) type_traits:4428 (unittests:x86_64+0x100598fb5)
Thread T4 (tid=424005, finished) created by main thread at:
#0 pthread_create <null> (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2936d)
#1 std::__1::thread::thread<AsyncTest_test_Test::TestBody()::$_6, void>(AsyncTest_test_Test::TestBody()::$_6&&) __threading_support:336 (unittests:x86_64+0x100598642)
#2 std::__1::thread::thread<AsyncTest_test_Test::TestBody()::$_6, void>(AsyncTest_test_Test::TestBody()::$_6&&) thread:360 (unittests:x86_64+0x1004faa88)
#3 AsyncTest_test_Test::TestBody() async_tests.cpp:94 (unittests:x86_64+0x1004fa81e)
#4 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (unittests:x86_64+0x1042addbd)
#5 TestApplication::event(QEvent*) testapplication.cpp:20 (unittests:x86_64+0x1041f89cf)
#6 QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) <null> (QtCore:x86_64+0x1ea917)
#7 start <null> (libdyld.dylib:x86_64+0x163d4)
SUMMARY: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) eventreceiver.cpp:16 in async::EventReceiver::event(QEvent*)
Why that happens considering that EventReceiver is created in a thread and than posted to the main thread, seems to be a race on virtual method execute but at that point object is fully constructed.
LATER EDIT
This seems to be a benign race though. The sanitizer does not know/see that EventReceiver::event(QEvent* event) happens after/is caused by using QCoreApplication::postEvent(&target, event), which happens after MyEvent is constructed (because MyEvent instance is input to QCoreApplication::postEvent)
I can add explicit fencing for this to be clear to the sanitizer but this seems excessive ? Wonder if it's recommended to just use some annotations to disable this.

Boost python not seeing memory is owned by smart pointer

I am getting a seg fault trigger when the destructor below destroys it's vector elements. Originally it was a vector<Parent> but I changed this to vector<unique_ptr<Parent>> and since then the crash occurs every time:
class Owner
{
Owner() = default;
~Owner() = default; // Seg faults
void assignVec(std::vector<std::unique_ptr<Parent>>& vec)
{
_vec = std::move(vec);
}
std::vector<std::unique_ptr<Parent>> _vec;
};
Each vector element subtype is a polymorphic class, also inheriting from boost::python::wrapper
class Child: public Parent, public boost::python::wrapper<Parent>
{
Child();
virtual ~Child() = default;
};
where:
class Parent
{
Parent() = default;
virtual ~Parent() = default;
};
So the entire inheritance hierarchy does have virtual destructors.
GDB backtrace is showing:
#0 0x00007ffff636b207 in __GI_raise (sig=sig#entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1 0x00007ffff636c8f8 in __GI_abort () at abort.c:90
#2 0x00007ffff63add27 in __libc_message (do_abort=do_abort#entry=2, fmt=fmt#entry=0x7ffff64bf678 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3 0x00007ffff63b6489 in malloc_printerr (ar_ptr=0x7ffff66fb760 <main_arena>, ptr=<optimized out>, str=0x7ffff64bcd31 "free(): invalid pointer", action=3) at malloc.c:5004
#4 _int_free (av=0x7ffff66fb760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3843
#5 0x00007fffc373972f in Child::~Child (this=0x2742b10, __in_chrg=<optimized out>) at Child.h:23
#6 0x000000000045694e in std::default_delete<Parent>::operator() (this=0x11922e0, __ptr=0x2742b10) at /opt/gcc-8.2.0/include/c++/8.2.0/bits/unique_ptr.h:81
#7 0x0000000000454c27 in std::unique_ptr<Parent, std::default_delete<Parent> >::~unique_ptr (this=0x11922e0, __in_chrg=<optimized out>) at /opt/gcc-8.2.0/include/c++/8.2.0/bits/unique_ptr.h:274
#8 0x000000000045a882 in std::_Destroy<std::unique_ptr<Parent, std::default_delete<Parent> > > (__pointer=0x11922e0) at /opt/gcc-8.2.0/include/c++/8.2.0/bits/stl_construct.h:98
#9 0x0000000000458f67 in std::_Destroy_aux<false>::__destroy<std::unique_ptr<Parent, std::default_delete<Parent> >*> (__first=0x11922e0, __last=0x11922e8) at /opt/gcc-8.2.0/include/c++/8.2.0/bits/stl_construct.h:108
#10 0x0000000000457636 in std::_Destroy<std::unique_ptr<Parent, std::default_delete<Parent> >*> (__first=0x11922e0, __last=0x11922e8) at /opt/gcc-8.2.0/include/c++/8.2.0/bits/stl_construct.h:137
#11 0x000000000045584d in std::_Destroy<std::unique_ptr<Parent, std::default_delete<Parent> >*, std::unique_ptr<Parent, std::default_delete<Parent> > > (__first=0x11922e0, __last=0x11922e8)
at /opt/gcc-8.2.0/include/c++/8.2.0/bits/stl_construct.h:206
#12 0x000000000049b53d in std::vector<std::unique_ptr<Parent, std::default_delete<Parent> >, std::allocator<std::unique_ptr<Parent, std::default_delete<Parent> > > >::~vector (this=0x7fffffffc4a8, __in_chrg=<optimized out>)
at /opt/gcc-8.2.0/include/c++/8.2.0/bits/stl_vector.h:567
#13 0x000000000048c677 in Owner::~Owner (this=0x7fffffffc4a8, __in_chrg=<optimized out>)
Printing this in frame 5 does show a valid object. Frame 4 source code of free() is:
static void _int_free(mstate av, mchunkptr p, int have_lock)
{
INTERNAL_SIZE_T size; /* its size */
mfastbinptr* fb; /* associated fastbin */
mchunkptr nextchunk; /* next contiguous chunk */
INTERNAL_SIZE_T nextsize; /* its size */
int nextinuse; /* true if nextchunk is used */
INTERNAL_SIZE_T prevsize; /* size of previous contiguous chunk */
mchunkptr bck; /* misc temp for linking */
mchunkptr fwd; /* misc temp for linking */
const char *errstr = NULL;
int locked = 0;
size = chunksize(p);
/* Little security check which won't hurt performance: the
allocator never wrapps around at the end of the address space.
Therefore we can exclude some size values which might appear
here by accident or by "design" from some intruder. */
if (__builtin_expect ((uintptr_t) p > (uintptr_t) -size, 0)
|| __builtin_expect (misaligned_chunk (p), 0))
{
errstr = "free(): invalid pointer";
errout:
if (have_lock || locked)
(void)mutex_unlock(&av->mutex);
malloc_printerr (check_action, errstr, chunk2mem(p), av); // CRASHES HERE
Does anyone have any advice how to proceed debugging this?
UPDATE:
I have created a small example in a unit test, creating Owner and a vector, calling assignVec() and the problem does not occur. However, once the vector is passed in, nothing else obtains the Parent memory.
UPDATE2:
We think the problem is with boost python needing to be informed of the smart pointer. Apparently Boost Python doesn't support unique_ptr and we're struggling to get it to recognize shared_ptr (both Boost and std), even using the typical register technique.
Based on the code you are giving and with the assumption boost::python has not a bug here I would guess your usage of moving semantics might be the cause:
void assignVec(std::vector<std::unique_ptr<Parent>>& vec)
{
_vec = std::move(vec);
}
Here you are moving from a l-value-reference to a vector, vector& to your member. Problem is: Usually one only moves from r-values references (vector&&) as they ca not accessed any more, because the only bind to temporaries or if the one explicitly acknowledges the moving by creating an r/x-values reference like you did by using std::move.
Problem is: I bet you caller of assignVec might not be aware of this, because then why did you not already use a r-value reference in the signature, so that the caller has to std::move explicitly? My assumption is, you caller does not, and is doing more than the one thing that is legal to moved from values: Destruct them.
Of course you ask yourself, why does it break then in the destructor? In my experience segmentation faults occur some expression after the cause, in this case I would say the undefined behaviour of the caller of assignVec by still using the vector given to assignVec.

Why the destrcution is called twice?

I have got one crash. and I use gdb to analyze the stack,I got the below result.
13 0x00007f423c6e9670 in ?? ()
#14 0x00007f42340496d8 in ?? ()
#15 0x0000000003cef568 in ?? ()
#16 0x00000000008da861 in HuffmanEnd ()
#17 0x00000000008d4a83 in faacEncClose ()
#18 0x00000000004fd797 in RecorderSession::~RecorderSession (this=0x7f423404ea90, __in_chrg=<value optimized out>)
at /root/Desktop/VideoRecoder/2.0/src/videorecorder/RecorderSession.cpp:203
#19 0x00000000004fdae9 in RecorderSession::~RecorderSession (this=0x7f423404ea90, __in_chrg=<value optimized out>)
at /root/Desktop/VideoRecoder/2.0/src/videorecorder/RecorderSession.cpp:203
#20 0x0000000000500d0b in RecorderSession::OnHangup (this=0x7f423404ea90) at /root/Desktop/VideoRecoder/2.0/src/videorecorder/RecorderSession.cpp:295
#21 0x000000000045e083 in CSipPhone::on_call_state (call_id=2, e=<value optimized out>)
As we see, the crash happens in the HuffmanEnd. But I don't understand why the ~RecorderSession is called twice although I use code "delete this" to delete the RecorderSession object as below:
int RecorderSession::OnHangup()
{
delete this;
return 0;
}
So does the "delete this" cause this phenomenon?
The chances are that your function OnHangup itself is already being called from the destructor of the object in question. Thus you are calling delete on yourself when the object is already in the middle of being destroyed, causing the double delete.
It seems your object is created by placement new, or as a local object on the stack, or as a namespace-scope / global, or as a member of another object.
In that case Dtor will be called one more time.

std::vector<std::mutex> stuck with optimization

I wrote a program which uses massive parallel execution. I am working with an Array of objects and an Array of mutexes for synchronization. My code Looks something like this:
std::vector<MyObject> objects;
std::vector<std::mutex> mutexes;
void work(int data)
{
for(unsigned int i = 0; i < objects.size(); ++i)
{
//Check if data Needs to be processed for objects[i]
if(dontNeedToProcess)continue;
mutexes[i].lock();
//Work with data for objects[i]
mutexes[i].unlock();
}
}
The function "work" is called by multiple threads with different data. After some hours (sometimes even days) the program is stuck. When I run it with gdb I can see that the program hangs while locking the mutex.
The Problem is now that I compiled the progrem with optimization (-O2) and the mutexes and "i" are optimized out.
Is it possible that the optimization causes this behavior when using an Array of mutexes?
Edit:
All the threads are at the same Position. The Backtrace Looks like the following:
#0 0xb7779d3c in __kernel_vsyscall ()
#1 0xb67ff672 in __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144
#2 0xb67fb1c2 in _L_lock_920 ()
from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3 0xb67fb043 in __GI___pthread_mutex_lock (mutex=0x9de56530)
at ../nptl/pthread_mutex_lock.c:114
#4 0xb69043c4 in pthread_mutex_lock (mutex=0x9de56530) at forward.c:192
#5 0xb72754f3 in pthread_mutex_lock ()
from /usr/lib/i386-linux-gnu/libasan.so.1
#6 0x0809ac85 in __gthread_mutex_lock (__mutex=<optimized out>)
at /usr/include/i386-linux-gnu/c++/4.9/bits/gthr-default.h:748
#7 lock (this=<optimized out>) at /usr/include/c++/4.9/mutex:135
...