I wrote a program that uses massively parallel execution. I am working with an array of objects and an array of mutexes for synchronization. My code looks something like this:
std::vector<MyObject> objects;
std::vector<std::mutex> mutexes;
void work(int data)
{
for(unsigned int i = 0; i < objects.size(); ++i)
{
//Check if data Needs to be processed for objects[i]
if(dontNeedToProcess)continue;
mutexes[i].lock();
//Work with data for objects[i]
mutexes[i].unlock();
}
}
The function "work" is called by multiple threads with different data. After some hours (sometimes even days) the program gets stuck. When I run it under gdb, I can see that the program hangs while locking the mutex.
The problem now is that I compiled the program with optimization (-O2), so the mutexes and "i" are optimized out.
Is it possible that the optimization causes this behavior when using an array of mutexes?
Edit:
All the threads are at the same position. The backtrace looks like the following:
#0 0xb7779d3c in __kernel_vsyscall ()
#1 0xb67ff672 in __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:144
#2 0xb67fb1c2 in _L_lock_920 ()
from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3 0xb67fb043 in __GI___pthread_mutex_lock (mutex=0x9de56530)
at ../nptl/pthread_mutex_lock.c:114
#4 0xb69043c4 in pthread_mutex_lock (mutex=0x9de56530) at forward.c:192
#5 0xb72754f3 in pthread_mutex_lock ()
from /usr/lib/i386-linux-gnu/libasan.so.1
#6 0x0809ac85 in __gthread_mutex_lock (__mutex=<optimized out>)
at /usr/include/i386-linux-gnu/c++/4.9/bits/gthr-default.h:748
#7 lock (this=<optimized out>) at /usr/include/c++/4.9/mutex:135
...
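A minimal sketch of the same loop using std::lock_guard instead of manual lock()/unlock(), assuming the hang could come from a path that leaves a mutex locked (an exception or an early exit between the two calls). That cause is only an assumption; the backtrace alone does not confirm it. MyObject, the placeholder check and run_workers() are hypothetical stand-ins, not the original code.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's MyObject.
struct MyObject { long total = 0; };

std::vector<MyObject> objects(4);
std::vector<std::mutex> mutexes(4);   // sized once, never resized while threads run

void work(int data)
{
    for (std::size_t i = 0; i < objects.size(); ++i)
    {
        if (data < 0) continue;        // placeholder for the "dontNeedToProcess" check
        // The guard unlocks mutexes[i] when it goes out of scope, including
        // when an exception is thrown or an early exit is added later.
        std::lock_guard<std::mutex> guard(mutexes[i]);
        objects[i].total += data;      // placeholder for the real work
    }
}

// Runs work(1) .. work(8) on eight threads and returns objects[0].total.
long run_workers()
{
    for (auto& o : objects) o.total = 0;
    std::vector<std::thread> threads;
    for (int t = 1; t <= 8; ++t) threads.emplace_back(work, t);
    for (auto& th : threads) th.join();
    return objects[0].total;
}
```

The sketch also keeps both vectors at a fixed size while the threads run; resizing either one concurrently would be a separate source of trouble.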
Goal: I want to modify internal information and access this information from many threads synchronously, as fast as possible.
I simplified the code below, but this is how I tried to achieve this.
I have 2 shared pointers.
One is called m_mutable_data and the other is called m_const_data.
m_mutable_data is updated in strand guarded way. m_const_data is updated with contents of m_mutable_data every 60s also in the strand guarded way.
This is the only place m_const_data shared pointer is reset with new data. m_const_data is read synchronously by many threads, 1000+ times per second.
Code
class black_list_container : public std::enable_shared_from_this<black_list_container>
{
struct meta_data
{
bool blacked;
};
struct black_list_data
{
std::unordered_map<uint32_t,meta_data> data;
};
public:
#pragma optimize( "", off )
bool is_blacked(uint32_t id)
{
// This call is called from many different threads (1000+ calls per second)
// should be synchronous and as fast as possible
auto c = m_const_data;
return c->data[id].blacked;
}
#pragma optimize( "", on )
#pragma optimize( "", off )
void update_const_data()
{
// Called internally by timer every 60s to update m_const_data with contents of m_mutable_data
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
auto snapshot = new black_list_data();
// Members are reached through self because the lambda does not capture this
snapshot->data = self->m_mutable_data->data;
self->m_const_data.reset(snapshot);
});
}
#pragma optimize( "", on )
private:
void internal_modification_mutable_data()
{
// Called internally by different metrics
// Guarded with strand
m_strand->post([self{shared_from_this()}]{
// .... do some modification on internal m_mutable_data
});
}
boost::asio::io_context::strand m_strand;
std::shared_ptr<black_list_data> m_mutable_data;
std::shared_ptr<black_list_data> m_const_data;
};
Very, very seldom this code crashes in method 'is_blacked' on line
auto c = m_const_data;
This is the backtrace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./STRATUM-01'.
Program terminated with signal 6, Aborted.
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64
(gdb) bt
#0 0x00007fe09aaf1387 in raise () from /lib64/libc.so.6
#1 0x00007fe09aaf2a78 in abort () from /lib64/libc.so.6
#2 0x00007fe09ab33ed7 in __libc_message () from /lib64/libc.so.6
#3 0x00007fe09ab3c299 in _int_free () from /lib64/libc.so.6
#4 0x00000000005fae36 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fe0440aeaa0) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:154
#5 0x00000000006b9205 in ~__shared_count (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:684
#6 ~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr_base.h:1123
#7 ~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/shared_ptr.h:93
#8 black_list_container_impl::is_blacked (this=0x7fe08c287e50, id=23654) at /var/lib/jenkins/workspace/validator/src/black_list_container.cpp:69
I'm not exactly sure why destruction of shared_ptr is called in frame #7.
Obviously I did not achieve my goal, so please point me to a pattern that actually achieves it in a thread-safe way.
I know I could have used
std::atomic<std::shared_ptr<black_list_data>> m_const_data;
but wouldn't this affect performance while reading from many different threads?
I think I found the answer to my question in this article Atomic Smart Pointers.
So I have to change code in update_const_data() to
auto snapshot = std::make_shared<black_list_data>();
snapshot->data = m_mutable_data->data;
std::atomic_store(&m_const_data, snapshot);
and code in is_blacked() to
auto c = std::atomic_load(&m_const_data);
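Putting the two changes together, here is a minimal sketch of the snapshot pattern, assuming the std::atomic_load / std::atomic_store overloads for std::shared_ptr are available (they are part of C++11 and deprecated from C++20 in favour of std::atomic<std::shared_ptr<T>>). The class name black_list_snapshot and the publish() method are hypothetical and the strand is left out. The lookup uses find() rather than data[id], because operator[] on std::unordered_map inserts a default element for a missing key, which would modify the shared snapshot from the reader threads.

```cpp
#include <atomic>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

struct black_list_data
{
    std::unordered_map<std::uint32_t, bool> blacked;
};

// Hypothetical, simplified container: readers only ever see immutable snapshots.
class black_list_snapshot
{
public:
    black_list_snapshot() : m_const_data(std::make_shared<black_list_data>()) {}

    // Called from many threads. The local copy keeps the snapshot alive for
    // the duration of the lookup, even if publish() swaps in a new one.
    bool is_blacked(std::uint32_t id) const
    {
        std::shared_ptr<const black_list_data> c = std::atomic_load(&m_const_data);
        auto it = c->blacked.find(id);          // find() never modifies the map
        return it != c->blacked.end() && it->second;
    }

    // Called from the single writer (the strand in the original code).
    void publish(black_list_data next)
    {
        std::shared_ptr<const black_list_data> snapshot =
            std::make_shared<black_list_data>(std::move(next));
        std::atomic_store(&m_const_data, snapshot);
    }

private:
    std::shared_ptr<const black_list_data> m_const_data;
};
```

Every access to m_const_data goes through the atomic functions here; mixing them with plain reads or writes of the same shared_ptr object would bring the data race back.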
I do operations on an STL map in the following functions, all of which are protected by a mutex:-
static std::mutex track_active_lock_mtx;
typedef intrusive_ptr<WatchCtxInternal> WatchCtxInternal_h;
static std::map<WatchCtxInternal*, WatchCtxInternal_h> actives;
void* get_ptr(WatchCtxInternal_h ctx)
{
unique_lock<mutex> trackActiveLock(track_active_lock_mtx);
if(actives.find(ctx.get()) == actives.end()) {
actives.insert(make_pair(ctx.get(), ctx));
}
trackActiveLock.unlock();
return ctx.get();
}
void genericWatcher(void *watcherCtx)
{
unique_lock<mutex> trackActiveLock(track_active_lock_mtx);
auto it = actives.find((WatchCtxInternal*)watcherCtx);
if (it == actives.end()) {
return;
}
//do unrelated stuff
actives.erase(it);
}
I got a segmentation fault in the first function:-
Program terminated with signal SIGSEGV, Segmentation fault.
#0 _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0xf31256e8, __x=0x65687465) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1261
1261 if (!_M_impl._M_key_compare(_S_key(__x), __k))
(gdb) bt
#0 _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0xf31256e8, __x=0x65687465) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1261
#1 find (__k=<optimized out>, this=0xf6ac8e2c <actives>) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_tree.h:1913
#2 find (__x=<optimized out>, this=0xf6ac8e2c <actives>) at /volume/evo/files/opt/poky/1.8.2-4/sysroots/i586-poky-linux/usr/include/c++/4.9.2/bits/stl_map.h:860
#3 get_ptr (ctx=...)
(gdb)fr 3
(gdb) p ctx
$4 = {px = 0xf3124d30}
EDIT: I managed to get a stack trace using the Memcheck tool. What is happening is that the static map gets cleaned up as part of the process exit, but a callback to genericWatcher is occurring in the other thread before completely exiting:-
main.cpp
static void thread1(void *arg) {
//call genericWatcher repeatedly
}
int main() {
if(fork() == 0) {
pthread_create(..., thread1,..)
//call get_ptr() repeatedly
}
return 0;
}
Is there any way to prevent this? I could allocate a singleton that holds the actives map, but I try to avoid using singletons.
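One possible way around the exit-time destruction, sketched under assumptions: keep the map and its mutex in function-local statics that are allocated with new and never deleted, so no destructor runs for them while another thread may still be calling in. This is a deliberate leak and not the only option (joining or stopping the watcher thread before main returns is another). The names track(), untrack() and track_mutex() are hypothetical, and the mapped type is reduced to int to keep the sketch self-contained.

```cpp
#include <map>
#include <mutex>
#include <utility>

struct WatchCtxInternal {};   // stand-in for the question's type

// Allocated on first use and intentionally never deleted, so nothing tears
// these objects down during static destruction at process exit.
static std::mutex& track_mutex()
{
    static std::mutex* m = new std::mutex;
    return *m;
}

static std::map<WatchCtxInternal*, int>& actives()
{
    static std::map<WatchCtxInternal*, int>* m = new std::map<WatchCtxInternal*, int>;
    return *m;
}

// Returns true if ctx was newly inserted.
bool track(WatchCtxInternal* ctx)
{
    std::lock_guard<std::mutex> lock(track_mutex());
    return actives().insert(std::make_pair(ctx, 1)).second;
}

// Returns true if ctx was present and has been removed.
bool untrack(WatchCtxInternal* ctx)
{
    std::lock_guard<std::mutex> lock(track_mutex());
    return actives().erase(ctx) == 1;
}
```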
The most likely point of failure is the erase call in your release callback because it's the only access point to your map that hasn't got any guarding mechanism. Are you sure at that point that your WatchCtx is part of the map's keys? If not, it sounds possible that the insert is already letting go.
But, like Velkan already said, valgrind (or your debugger of choice) will give you certainty.
#include <iostream>
#include <pthread.h>
using namespace std;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void* Func(void *)
{
    pthread_mutex_lock(&mutex);
    cout << "First thread execution" << endl;
    pthread_mutex_unlock(&mutex);
    return NULL; // a void* function must return a value
}
int main()
{
    pthread_t th1;
    pthread_create(&th1, NULL, Func, NULL);
    pthread_mutex_lock(&mutex);
    cout << "In main thread" << endl;
    pthread_mutex_unlock(&mutex); // release the lock taken above
    // pthread_join(th1, NULL); // Note this code is commented
    return 0;
}
I have executed the above program on Linux Fedora 22 (also on http://www.cpp.sh/) around 20 times, and across those runs I found the following outputs:-
Output1:
In main thread
First thread execution
Output2:
First thread execution
In main thread
Output3:
In main thread
Output4:
In main thread
First thread execution
First thread execution
Outputs 1 to 3 are expected, as the main thread does not wait for the child thread to exit. The execution order of the two threads (main and child) depends entirely on kernel thread scheduling.
But output 4 is strange! "First thread execution" gets printed twice!
Now if I run the program after uncommenting 'pthread_join(th1, NULL)' or adding 'pthread_exit(NULL)', I never get the strange output (i.e. "First thread execution" is never printed twice), even if I run the code 10000 times.
My questions to the experts are:
Without pthread_join/pthread_exit, what is happening behind the scenes so that "First thread execution" gets printed twice?
The responsibility of pthread_join is to get the exit code of a particular thread, and after a successful call to pthread_join the kernel frees the resources of that thread. If I do not call pthread_join on a joinable thread, it results in a resource leak, but why the strange behavior mentioned above?
We might say this is undefined behavior, but it would be great if an expert could provide a technical explanation.
How can pthread_join/pthread_exit prevent the strange behavior mentioned above? What hidden thing are they doing that keeps it from appearing?
Thanks to the experts in advance.
I've observed this kind of double printing in a similar situation. While your thread was waiting in the write system call doing its normal output, specifically, in this stack:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff789114f in _IO_file_overflow () from /lib64/libc.so.6
the program terminated normally, and normal termination caused the output subsystem to flush all buffers. The output buffer on stdout was not yet marked free (the write system call hadn't returned yet), so it was written out again:
#0 0x00007ffff78f4640 in write () from /lib64/libc.so.6
#1 0x00007ffff788fb93 in _IO_file_write () from /lib64/libc.so.6
#2 0x00007ffff788fa72 in new_do_write () from /lib64/libc.so.6
#3 0x00007ffff7890e05 in _IO_do_write () from /lib64/libc.so.6
#4 0x00007ffff7890140 in _IO_file_sync () from /lib64/libc.so.6
#5 0x00007ffff7891f56 in _IO_default_setbuf () from /lib64/libc.so.6
#6 0x00007ffff7890179 in _IO_file_setbuf () from /lib64/libc.so.6
#7 0x00007ffff7892703 in _IO_cleanup () from /lib64/libc.so.6
#8 0x00007ffff78512f8 in __run_exit_handlers () from /lib64/libc.so.
In any case, join your threads (If you used C++ threads, it would have reminded you to do that) or otherwise synchronize access to the output stream.
The main thread might end earlier than the spawned thread.
Ending the main thread implies ending the whole process, along with all threads being brought down abruptly. This might invoke undefined behaviour, thus anything can happen.
To get around this
either join the spawned thread using pthread_join() from main(),
or end the main thread using pthread_exit(), which just ends the main thread and keeps the process from being ended.
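A minimal sketch of the first option, assuming the same two-thread layout as the question: the creating thread calls pthread_join() before it returns, so the process cannot start its exit-time flush while the worker is still writing. The names worker(), run_joined() and lines_printed are hypothetical; the counter exists only so the result can be checked.

```cpp
#include <pthread.h>
#include <iostream>

static pthread_mutex_t out_mutex = PTHREAD_MUTEX_INITIALIZER;
static int lines_printed = 0;   // protected by out_mutex

static void* worker(void*)
{
    pthread_mutex_lock(&out_mutex);
    std::cout << "First thread execution" << std::endl;
    ++lines_printed;
    pthread_mutex_unlock(&out_mutex);
    return nullptr;
}

// Starts the worker, prints from the calling thread, then waits for the
// worker to finish. Returns the number of lines printed, or -1 on failure.
int run_joined()
{
    pthread_t th;
    if (pthread_create(&th, nullptr, worker, nullptr) != 0) return -1;

    pthread_mutex_lock(&out_mutex);
    std::cout << "In main thread" << std::endl;
    ++lines_printed;
    pthread_mutex_unlock(&out_mutex);

    // Without this join, returning from main() could end the process while
    // the worker is still inside its write.
    pthread_join(th, nullptr);
    return lines_printed;
}
```

The order of the two lines still depends on scheduling; the join only guarantees that both have been written before the caller continues.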
I'm communicating with a hardware device using QSerialPort. New data does not emit the "readyRead"-Signal, so I decided to write a read thread using QThread.
This is the code:
void ReadThread::run()
{
while(true){
readData();
if (buffer.size() > 0) parseData();
}
}
and
void ReadThread::readData()
{
buffer.append(device->readAll());
}
with buffer being a private QByteArray and device being a pointer to the QSerialPort. parseData will parse the data and emit some signals. The buffer is cleared when parseData is left.
This works, however after some time (sometimes 10 seconds, sometimes 1 hour) the program crashes with SIGSEGV with the following trace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff3498700 (LWP 24870)]
malloc_consolidate (av=av@entry=0x7fffec000020) at malloc.c:4151
(gdb) bt
#0 malloc_consolidate (av=av@entry=0x7fffec000020) at malloc.c:4151
#1 0x00007ffff62c2ee8 in _int_malloc (av=av@entry=0x7fffec000020, bytes=bytes@entry=32769) at malloc.c:3423
#2 0x00007ffff62c4661 in _int_realloc (av=av@entry=0x7fffec000020, oldp=oldp@entry=0x7fffec0013b0, oldsize=oldsize@entry=64, nb=nb@entry=32784) at malloc.c:4286
#3 0x00007ffff62c57b9 in __GI___libc_realloc (oldmem=0x7fffec0013c0, bytes=32768) at malloc.c:3029
#4 0x00007ffff70d1cdd in QByteArray::reallocData(unsigned int, QFlags<QArrayData::AllocationOption>) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#5 0x00007ffff70d1f07 in QByteArray::resize(int) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#6 0x00007ffff799f9fc in free (bytes=<optimized out>, this=0x609458)
at ../../include/QtSerialPort/5.3.2/QtSerialPort/private/../../../../../src/serialport/qt4support/include/private/qringbuffer_p.h:140
#7 read (maxLength=<optimized out>, data=<optimized out>, this=0x609458)
at ../../include/QtSerialPort/5.3.2/QtSerialPort/private/../../../../../src/serialport/qt4support/include/private/qringbuffer_p.h:326
#8 QSerialPort::readData (this=<optimized out>, data=<optimized out>, maxSize=<optimized out>) at qserialport.cpp:1341
#9 0x00007ffff722bdf0 in QIODevice::read(char*, long long) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#10 0x00007ffff722cbaf in QIODevice::readAll() () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#11 0x00007ffff7bd0741 in readThread::readData (this=0x6066c0) at ../reader.cpp:212
#12 0x00007ffff7bc80d0 in readThread::run (this=0x6066c0) at ../reader.cpp:16
#13 0x00007ffff70cdd2e in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007ffff6e1c0a4 in start_thread (arg=0x7ffff3498700) at pthread_create.c:309
#15 0x00007ffff632f04d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
I'm not sure how to reproduce the problem correctly, since it appears randomly. If I comment out the "readData()" in my while loop, the crashes do not appear anymore (of course no data can be parsed, then).
Does anyone have a clue what this could be?
What is the buffer? Could it be that another thread is reading the data from the buffer and clearing it afterwards?
Try to lock it (and all other data shared between threads), e.g. with a mutex:
QMutex mx; // could be also member of the ReadThread class
void ReadThread::readData()
{
mx.lock();
buffer.append(device->readAll());
mx.unlock();
}
And do the same in the code which reads and clears the buffer from another thread (I'm not assuming that this is parseData()).
Another possibility is that parseData() calls some code running in the GUI thread. This doesn't work in Qt4 and probably not in Qt5 either.
You're using the instance of a QObject from multiple threads at once. This generally speaking leads to undefined behavior, as you've just seen. QSerialPort will work just fine on the GUI thread. Only once you get it to work there, you can move it to a worker thread.
Note that if the event loop (app.exec() call in main() or QThread::run()) isn't executing, the signals won't be happening. It looks as if you tried to write pseudo synchronous code and have (predictably) failed. Don't do that.
Something like this is supposed to work:
#include <QtCore>
#include <QtSerialPort>
int main(int argc, char ** argv) {
QCoreApplication app(argc, argv);
QSerialPort port;
port.setPortName(...);
port.setBaudRate(...);
... // etc
if (! port.open(QIODevice::ReadWrite)) {
qWarning() << "can't open the port";
return 1;
}
... // set the port
connect(&port, &QIODevice::readyRead, [&]{
qDebug() << "got" << port.readAll().size() << "bytes";
});
return app.exec(); // the signals will be emitted from here
}
Ensure that all serial port related objects are initialized and used only in the separate thread. Send received data or parsed events to the UI thread by using signal/slot-mechanism.
Note also that if you inherit QThread in readThread, the constructor may be executed in the UI thread and other functions in the readThread. In that case, start the readThread and run separate initialization function before other functions (for example, by sending proper signal from the UI thread).
I am trying to use std::shared_ptr to point to the data being produced by one thread and consumed by another. The storage field is a shared pointer to the base class.
Here's the simplest Google Test I could create that reproduced the problem:
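#include "gtest/gtest.h"
#include <memory>
#include <thread>
using namespace std; // the test below uses shared_ptr, make_shared and thread unqualified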
struct A
{
virtual ~A() {}
virtual bool isSub() { return false; }
};
struct B : public A
{
bool isSub() override { return true; }
};
TEST (SharedPointerTests, threadedProducerConsumer)
{
int loopCount = 10000;
shared_ptr<A> ptr;
thread producer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
ptr = make_shared<B>(); // <--- THREAD
});
thread consumer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
shared_ptr<A> state = ptr; // <--- THREAD
});
producer.join();
consumer.join();
}
When run, it sometimes gives:
[ RUN ] SharedPointerTests.threadedProducerConsumer
pure virtual method called
terminate called without an active exception
Aborted (core dumped)
GDB shows the crash with two threads at the locations shown. The stacks follow:
Stack 1
#0 0x00000000006f430a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fffe00008c0)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:144
#1 0x00000000006f26a7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffdf960bc8,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:553
#2 0x00000000006f1692 in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffdf960bc0,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:810
#3 0x00000000006f16ca in std::shared_ptr<A>::~shared_ptr (this=0x7fffdf960bc0, __in_chrg=<optimized out>)
at /usr/include/c++/4.8/bits/shared_ptr.h:93
#4 0x00000000006e7288 in SharedPointerTests_threadedProducerConsumer_Test::__lambda2::operator() (__closure=0xb9c940)
at /home/drew/dev/SharedPointerTests.hh:54
#5 0x00000000006f01ce in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()>::_M_invoke<>(std::_Index_tuple<>) (this=0xb9c940) at /usr/include/c++/4.8/functional:1732
#6 0x00000000006efe13 in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()>::operator()(void) (
this=0xb9c940) at /usr/include/c++/4.8/functional:1720
#7 0x00000000006efb7c in std::thread::_Impl<std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda2()> >::_M_run(void) (this=0xb9c928) at /usr/include/c++/4.8/thread:115
#8 0x00007ffff6d19ac0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007ffff717bf8e in start_thread (arg=0x7fffdf961700) at pthread_create.c:311
#10 0x00007ffff647ee1d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Stack 2
#0 0x0000000000700573 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > >::_S_destroy<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > (__a=..., __p=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/alloc_traits.h:281
#1 0x00000000007003b6 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > >::destroy<std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2> > (__a=..., __p=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/alloc_traits.h:405
#2 0x00000000006ffe76 in std::_Sp_counted_ptr_inplace<B, std::allocator<B>, (__gnu_cxx::_Lock_policy)2>::_M_destroy (
this=0x7fffe00008f0) at /usr/include/c++/4.8/bits/shared_ptr_base.h:416
#3 0x00000000006f434c in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fffe00008f0)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:161
#4 0x00000000006f26a7 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffe8161b68,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:553
#5 0x00000000006f16b0 in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffe8161b60,
__in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:810
#6 0x00000000006f4c3f in std::__shared_ptr<A, (__gnu_cxx::_Lock_policy)2>::operator=<B>(std::__shared_ptr<B, (__gnu_cxx::_Lock_policy)2>&&) (this=0x7fffffffdcb0, __r=<unknown type in /home/drew/dev/unittests, CU 0x0, DIE 0x58b8c>)
at /usr/include/c++/4.8/bits/shared_ptr_base.h:897
#7 0x00000000006f2d2a in std::shared_ptr<A>::operator=<B>(std::shared_ptr<B>&&) (this=0x7fffffffdcb0,
__r=<unknown type in /home/drew/dev/unittests, CU 0x0, DIE 0x55e1c>)
at /usr/include/c++/4.8/bits/shared_ptr.h:299
#8 0x00000000006e7232 in SharedPointerTests_threadedProducerConsumer_Test::__lambda1::operator() (__closure=0xb9c7a0)
at /home/drew/dev/SharedPointerTests.hh:48
#9 0x00000000006f022c in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()>::_M_invoke<>(std::_Index_tuple<>) (this=0xb9c7a0) at /usr/include/c++/4.8/functional:1732
#10 0x00000000006efe31 in std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()>::operator()(void) (
this=0xb9c7a0) at /usr/include/c++/4.8/functional:1720
#11 0x00000000006efb9a in std::thread::_Impl<std::_Bind_simple<SharedPointerTests_threadedProducerConsumer_Test::TestBody()::__lambda1()> >::_M_run(void) (this=0xb9c788) at /usr/include/c++/4.8/thread:115
#12 0x00007ffff6d19ac0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007ffff717bf8e in start_thread (arg=0x7fffe8162700) at pthread_create.c:311
#14 0x00007ffff647ee1d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
I have tried various approaches here, including using std::dynamic_pointer_cast but I haven't had any luck.
In reality the producer stores many different subclasses of A by their type_id in a std::map<type_id const*,std::shared_ptr<A>> (one instance per type) which I look up from the consumer by type.
My understanding is that std::shared_ptr is threadsafe for these types of operations. What am I missing?
shared_ptr has thread-safety on its control block. When a shared_ptr is created and points to a newly created resource it creates a control block. According to MSDN this holds:
The shared_ptr objects that own a resource share a control block. The control block holds:
the number of shared_ptr objects that own the resource,
the number of weak_ptr objects that point to the resource,
the deleter for that resource if it has one,
the custom allocator for the control block if it has one.
This means that shared_ptr will ensure that there are no synchronization issues with multiple copies of shared_ptr pointing to the same memory. However, it does not manage the synchronization of the memory itself. See the section on thread safety (emphasis mine)
Multiple threads can read and write different shared_ptr objects at the same time, even when the objects are copies that share ownership.
Your code shares ptr which means you have a data race. Also note that it is possible for your producer thread to produce several objects before the consumer thread is scheduled to run, meaning that you lose some objects.
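To illustrate the rule quoted above, here is a minimal sketch of the case that is safe without extra synchronization: each thread owns its own shared_ptr object (a by-value capture), and all of them share ownership of one immutable value. The function name sum_with_private_copies and the atomic counter are hypothetical, added only to make the result checkable.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Every thread reads the shared value `iterations` times through its own
// shared_ptr object and adds it to a common atomic total.
long sum_with_private_copies(int threads, int iterations)
{
    std::shared_ptr<const int> shared = std::make_shared<int>(1);
    std::atomic<long> total{0};

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
    {
        // Capturing `shared` by value gives each thread its own shared_ptr
        // object; only the control block is shared, and its reference
        // counting is thread-safe.
        pool.emplace_back([shared, iterations, &total] {
            for (int i = 0; i < iterations; ++i)
            {
                std::shared_ptr<const int> copy = shared; // copies this thread's own object
                total += *copy;
            }
        });
    }
    for (auto& th : pool) th.join();
    return total.load();
}
```

The test in the question differs in one respect: both lambdas capture ptr by reference, so the producer writes the very shared_ptr object that the consumer is copying from.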
As has been pointed out in a comment, you can use atomic operations on shared_ptr. The producer thread then looks like:
thread producer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
{
auto p = std::make_shared<B>(); // <--- THREAD
std::atomic_store<A>( &ptr, p );
}
});
The object is created and then atomically stored into ptr. The consumer then needs to atomically load the object.
thread consumer([loopCount,&ptr]()
{
for (int i = 0; i < loopCount; i++)
{
auto state = std::atomic_load<A>( &ptr ); // <--- THREAD
}
});
This still has the disadvantage that objects will be lost when the producer thread is allowed to run for multiple iterations.
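If every produced object has to reach the consumer, a single shared slot is not enough. One common alternative, sketched here under assumptions, is a queue protected by a mutex and a condition variable. HandoffQueue, Item and consume_all are hypothetical names, and this swaps in a different technique from the atomic store and load shown above.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

struct Item { int id; };

// Unbounded hand-off queue: push() never drops an item, pop() blocks until
// one is available.
class HandoffQueue
{
public:
    void push(std::shared_ptr<Item> item)
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(item));
        }
        cv_.notify_one();
    }

    std::shared_ptr<Item> pop()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        std::shared_ptr<Item> item = std::move(q_.front());
        q_.pop();
        return item;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::shared_ptr<Item>> q_;
};

// Produces loopCount items on one thread, consumes them on another, and
// returns how many non-null items the consumer received.
int consume_all(int loopCount)
{
    HandoffQueue queue;
    int received = 0;
    std::thread producer([&] {
        for (int i = 0; i < loopCount; ++i)
            queue.push(std::make_shared<Item>(Item{i}));
    });
    std::thread consumer([&] {
        for (int i = 0; i < loopCount; ++i)
            if (queue.pop()) ++received;
    });
    producer.join();
    consumer.join();
    return received;
}
```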
These examples were written in Visual Studio 2012. At this time, gcc hasn't fully implemented atomic shared_ptr access, as noted in section 20.7.2.5
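Where the atomic shared_ptr functions are not available, a plain mutex around the shared pointer gives the same guarantee at the cost of a lock per access. A minimal sketch, assuming only C++11; GuardedPtr is a hypothetical helper, not a standard class.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical helper: every read and write of the shared_ptr object itself
// happens under the mutex.
template <typename T>
class GuardedPtr
{
public:
    // Returns a copy, so the caller holds its own reference after the lock
    // has been released.
    std::shared_ptr<T> load() const
    {
        std::lock_guard<std::mutex> lock(m_);
        return ptr_;
    }

    void store(std::shared_ptr<T> next)
    {
        std::lock_guard<std::mutex> lock(m_);
        // After the swap the previous value lives in `next`; it is released
        // when the parameter is destroyed, after the lock_guard is gone.
        ptr_.swap(next);
    }

private:
    mutable std::mutex m_;
    std::shared_ptr<T> ptr_;
};
```

In the test from the question, ptr would become a GuardedPtr<A>, the producer would call store(std::make_shared<B>()) and the consumer would call load().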