A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs

A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs - c++

I have created a generic message queue for use in a multi-threaded application. Specifically, single producer, multi-consumer. Main code below.
1) I wanted to know if I should pass a shared_ptr allocated with new into the enqueue method by value, or is it better to have the queue wrapper allocate the memory itself and just pass in a genericMsg object by const reference?
2) Should I have my dequeue method return a shared_ptr, have a shared_ptr passed in as a parameter by reference (current strategy), or just have it directly return a genericMsg object?
3) Will I need signal/wait in enqueue/dequeue or will the read/write locks suffice?
4) Do I even need to use shared_ptrs? Or will this depend solely on the implementation I use? I like that the shared_ptrs will free memory once all references are no longer using the object. I can easily port this to regular pointers if that's recommended, though.
5) I'm storing a pair here because I'd like to discriminate what type of message I'm dealing with else w/o having to do an any_cast. Every message type has a unique ID that refers to a specific struct. Is there a better way of doing this?
Generic Message Type:
template<typename Message_T>
class genericMsg
{
public:
genericMsg()
{
id = 0;
size = 0;
}
genericMsg (unsigned int &_id, unsigned int &_size, Message_T &_data)
{
id = _id;
size = _size;
data = _data;
}
~genericMsg()
{}
unisgned int id;
unsigned int size;
Message_T data; //All structs stored here contain only POD types
};
Enqueue Methods:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(boost::shared_ptr< genericMsg<Message_T> > data)
{
WriteLock w_lock(myLock);
this->qData.push_back(std::make_pair(data->id, data));
}
VS:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(const genericMsg<Message_T> &data_in)
{
WriteLock w_lock(myLock);
boost::shared_ptr< genericMsg<Message_T> > data =
new genericMsg<Message_T>(data_in.id, data_in.size, data_in.data);
this->qData.push_back(std::make_pair(data_in.id, data));
}
Dequeue Method:
// ----------------------------------------------------------------
// -- Thread safe function that grabs a genericMsg object from the
// -- front of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
void dequeue(boost::shared_ptr< genericMsg<Message_T> > &msg)
{
ReadLock r_lock(myLock);
msg = boost::any_cast< boost::shared_ptr< genericMsg<Message_T> > >(qData.front().second);
qData.pop_front();
}
Get message ID:
inline unsigned int getMessageID()
{
ReadLock r_lock(myLock);
unsigned int tempID = qData.front().first;
return tempID;
}
Data Types:
std::deque < std::pair< unsigned int, boost::any> > qData;
Edit:
I have improved upon my design. I now have a genericMessage base class that I directly subclass from in order to derive the unique messages.
Generic Message Base Class:
class genericMessage
{
public:
virtual ~genericMessage() {}
unsigned int getID() {return id;}
unsigned int getSize() {return size;}
protected:
unsigned int id;
unsigned int size;
};
Producer Snippet:
boost::shared_ptr<genericMessage> tmp (new derived_msg1(MSG1_ID));
theQueue.enqueue(tmp);
Consumer Snippet:
boost::shared_ptr<genericMessage> tmp = theQueue.dequeue();
if(tmp->getID() == MSG1_ID)
{
boost::shared_ptr<derived_msg1> tObj = boost::dynamic_pointer_cast<derived_msg1>(tmp);
tObj->printData();
}
New Queue:
std::deque< boost::shared_ptr<genericMessage> > qData;
New Enqueue:
void mq_class::enqueue(const boost::shared_ptr<genericMessage> &data_in)
{
boost::unique_lock<boost::mutex> lock(mut);
this->qData.push_back(data_in);
cond.notify_one();
}
New Dequeue:
boost::shared_ptr<genericMessage> mq_class::dequeue()
{
boost::shared_ptr<genericMessage> ptr;
{
boost::unique_lock<boost::mutex> lock(mut);
while(qData.empty())
{
cond.wait(lock);
}
ptr = qData.front();
qData.pop_front();
}
return ptr;
}
Now, my question is am I doing dequeue correctly? Is there another way of doing it? Should I pass in a shared_ptr as a reference in this case to achieve what I want?

Edit (I added answers for parts 1, 2, and 4).
1) You should have a factory method that creates new genericMsgs and returns a std::unique_ptr. There is absolutely no good reason to pass genericMsg in by const reference and then have the queue wrap it in a smart pointer: Once you've passed by reference you have lost track of ownership, so if you do that the queue is going to have to construct (by copy) the entire genericMsg to wrap.
2) I can't think of any circumstance under which it would be safe to take a reference to a shared_ptr or unique_ptr or auto_ptr. shared_ptrs and unique_ptrs are for tracking ownership and once you've taken a reference to them (or the address of them) you have no idea how many references or pointers are still out there expecting the shared_ptr/unique_ptr object to contain a valid naked pointer.
unique_ptr is always preferred to a naked pointer, and is preferred to a shared_ptr in cases where you only have a single piece of code (validly) pointing to an object at a time.
https://softwareengineering.stackexchange.com/questions/133302/stdshared-ptr-as-a-last-resort
http://herbsutter.com/gotw/_103/
Bad practice to return unique_ptr for raw pointer like ownership semantics? (the answer explains why it is good practice not bad).
3) Yes, you need to use a std::condition_variable in your dequeue function. You need to test whether qData is empty or not before calling qData.front() or qData.pop_front(). If qData is empty you need to wait on a condition variable. When enqueue inserts an item it should signal the condition variable to wake up anyone who may have been waiting.
Your use of reader/writer locks is completely incorrect. Don't use reader/writer locks. Use std::mutex. A reader lock can only be used on a method that is completely const. You are modifying qData in dequeue, so a reader lock will lead to data races there. (Reader writer locks are only applicable when you have stupid code that is both const and holds locks for extended period of time. You are only keeping the lock for the period of time it takes to insert or remove from the queue, so even if you were const the added overhead of reader/writer locks would be a net lose.)
An example of implementing a (bounded) buffer using mutexes and condition_variables can be found at: Is this a correct way to implement a bounded buffer in C++.
4) unique_ptr is always preferred to naked pointers, and usually preferred to shared_ptr. (The main exception where shared_ptr might be better is for graph-like data structures.) In cases like yours where you are reading something in on side, creating a new object with a factory, moving the ownership to the queue and then moving ownership out of the queue to the consumer it sounds like you should be using unique_ptr.
5) You are reinventing tagged unions. Virtual functions were added to c++ specifically so you wouldn't need to do this. You should subclass your messages from a class that has a virtual function called do_it() (or better yet, operator()() or something like that). Then instead of tagging each struct, make each struct a subclass of your message class. When you dequeue each struct (or ptr to struct) just call do_it() on it. Strong static typing, no casts. See C++ std condition variable covering a lot of share variables for an example.
Also: if you are going to stick with the tagged unions: you can't have separate calls to get the id and the data item. Consider: If thread A calls to get the id, then thread B calls to get the id, then thread B retrieves the data item, now what happens when thread A calls to retrieve a data item? It gets a data item, but not with the type that it expected. You need to retrieve the id and the data item under the same critical section.

First of all, it's better to use 3rd-party concurrency containers than to implement them yourself, except it's for education purpose.
Your messages doesn't look to have costly constructors/destructor, so you can store them by value and forget about all your other questions. Use move semantics (if available) for optimizations.
If your profiler says "by value" is bad idea in your particular case:
I suppose your producer creates messages, puts them into your queue and loses any interest in them. In this case you don't need shared_ptr because you don't have shared ownership. You can use unique_ptr or even a raw pointer. It's implementation details and better to hide them inside the queue.
From performance point of view, it's better to implement lock-free queue. "locks vs. signals" depends completely on your application. For example, if you use thread pool and kind of a scheduler it's better to allow your clients to do something useful while queue is full/empty. In simpler cases reader/writer lock is just fine.

If I want to be thread safe, I usually use const objects and modify only on copy or create constructor. In this way you don't need to use any lock mechanism. In a threaded system, it is usually more effective than use mutexes on a single instance.
In your case only deque would need lock.

Related

Synchronizing method calls on shared object from multiple threads

I am thinking about how to implement a class that will contain private data that will be eventually be modified by multiple threads through method calls. For synchronization (using the Windows API), I am planning on using a CRITICAL_SECTION object since all the threads will spawn from the same process.
Given the following design, I have a few questions.
template <typename T> class Shareable
{
private:
const LPCRITICAL_SECTION sync; //Can be read and used by multiple threads
T *data;
public:
Shareable(LPCRITICAL_SECTION cs, unsigned elems) : sync{cs}, data{new T[elems]} { }
~Shareable() { delete[] data; }
void sharedModify(unsigned index, T &datum) //<-- Can this be validly called
//by multiple threads with synchronization being implicit?
{
EnterCriticalSection(sync);
/*
The critical section of code involving reads & writes to 'data'
*/
LeaveCriticalSection(sync);
}
};
// Somewhere else ...
DWORD WINAPI ThreadProc(LPVOID lpParameter)
{
Shareable<ActualType> *ptr = static_cast<Shareable<ActualType>*>(lpParameter);
T copyable = /* initialization */;
ptr->sharedModify(validIndex, copyable); //<-- OK, synchronized?
return 0;
}
The way I see it, the API calls will be conducted in the context of the current thread. That is, I assume this is the same as if I had acquired the critical section object from the pointer and called the API from within ThreadProc(). However, I am worried that if the object is created and placed in the main/initial thread, there will be something funky about the API calls.
When sharedModify() is called on the same object concurrently,
from multiple threads, will the synchronization be implicit, in the
way I described it above?
Should I instead get a pointer to the
critical section object and use that instead?
Is there some other
synchronization mechanism that is better suited to this scenario?

When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
It's not implicit, it's explicit. There's only only CRITICAL_SECTION and only one thread can hold it at a time.
Should I instead get a pointer to the critical section object and use that instead?
No. There's no reason to use a pointer here.
Is there some other synchronization mechanism that is better suited to this scenario?
It's hard to say without seeing more code, but this is definitely the "default" solution. It's like a singly-linked list -- you learn it first, it always works, but it's not always the best choice.

When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Implicit from the caller's perspective, yes.
Should I instead get a pointer to the critical section object and use that instead?
No. In fact, I would suggest giving the Sharable object ownership of its own critical section instead of accepting one from the outside (and embrace RAII concepts to write safer code), eg:
template <typename T>
class Shareable
{
private:
CRITICAL_SECTION sync;
std::vector<T> data;
struct SyncLocker
{
CRITICAL_SECTION &sync;
SyncLocker(CRITICAL_SECTION &cs) : sync(cs) { EnterCriticalSection(&sync); }
~SyncLocker() { LeaveCriticalSection(&sync); }
}
public:
Shareable(unsigned elems) : data(elems)
{
InitializeCriticalSection(&sync);
}
Shareable(const Shareable&) = delete;
Shareable(Shareable&&) = delete;
~Shareable()
{
{
SyncLocker lock(sync);
data.clear();
}
DeleteCriticalSection(&sync);
}
void sharedModify(unsigned index, const T &datum)
{
SyncLocker lock(sync);
data[index] = datum;
}
Shareable& operator=(const Shareable&) = delete;
Shareable& operator=(Shareable&&) = delete;
};
Is there some other synchronization mechanism that is better suited to this scenario?
That depends. Will multiple threads be accessing the same index at the same time? If not, then there is not really a need for the critical section at all. One thread can safely access one index while another thread accesses a different index.
If multiple threads need to access the same index at the same time, a critical section might still not be the best choice. Locking the entire array might be a big bottleneck if you only need to lock portions of the array at a time. Things like the Interlocked API, or Slim Read/Write locks, might make more sense. It really depends on your thread designs and what you are actually trying to protect.

Is this inter-thread object sharing strategy sound?

I'm trying to come up with a fast way of solving the following problem:
I have a thread which produces data, and several threads which consume it. I don't need to queue produced data, because data is produced much more slowly than it is consumed (and even if this failed to be the case occasionally, it wouldn't be a problem if a data point were skipped occasionally). So, basically, I have an object that encapsulates the "most recent state", which only the producer thread is allowed to update.
My strategy is as follows (please let me know if I'm completely off my rocker):
I've created three classes for this example: Thing (the actual state object), SharedObject<Thing> (an object that can be local to each thread, and gives that thread access to the underlying Thing), and SharedObjectManager<Thing>, which wraps up a shared_ptr along with a mutex.
The instance of the SharedObjectManager (SOM) is a global variable.
When the producer starts, it instantiates a Thing, and tells the global SOM about it. It then makes a copy, and does all of it's updating work on that copy. When it is ready to commit it's changes to the Thing, it passes the new Thing to the global SOM, which locks it's mutex, updates the shared pointer it keeps, and then releases the lock.
Meanwhile, the consumer threads all intsantiate SharedObject<Thing>. these objects each keep a pointer to the global SOM, as well as a cached copy of the shared_ptr kept by the SOM... It keeps this cached until update() is explicitly called.
I believe this is getting hard to follow, so here's some code:
#include <mutex>
#include <iostream>
#include <memory>
class Thing
{
private:
int _some_member = 10;
public:
int some_member() const { return _some_member; }
void some_member(int val) {_some_member = val; }
};
// one global instance
template<typename T>
class SharedObjectManager
{
private:
std::shared_ptr<T> objPtr;
std::mutex objLock;
public:
std::shared_ptr<T> get_sptr()
{
std::lock_guard<std::mutex> lck(objLock);
return objPtr;
}
void commit_new_object(std::shared_ptr<T> new_object)
{
std::lock_guard<std::mutex> lck (objLock);
objPtr = new_object;
}
};
// one instance per consumer thread.
template<typename T>
class SharedObject
{
private:
SharedObjectManager<T> * som;
std::shared_ptr<T> cache;
public:
SharedObject(SharedObjectManager<T> * backend) : som(backend)
{update();}
void update()
{
cache = som->get_sptr();
}
T & operator *()
{
return *cache;
}
T * operator->()
{
return cache.get();
}
};
// no actual threads in this test, just a quick sanity check.
SharedObjectManager<Thing> glbSOM;
int main(void)
{
glbSOM.commit_new_object(std::make_shared<Thing>());
SharedObject<Thing> myobj(&glbSOM);
std::cout<<myobj->some_member()<<std::endl;
// prints "10".
}
The idea for use by the producer thread is:
// initialization - on startup
auto firstStateObj = std::make_shared<Thing>();
glbSOM.commit_new_object(firstStateObj);
// main loop
while (1)
{
// invoke copy constructor to copy the current live Thing object
auto nextState = std::make_shared<Thing>(*(glbSOM.get_sptr()));
// do stuff to nextState, gradually filling out it's new value
// based on incoming data from other sources, etc.
...
// commit the changes to the shared memory location
glbSOM.commit_new_object(nextState);
}
The use by consumers would be:
SharedObject<Thing> thing(&glbSOM);
while(1)
{
// think about the data contained in thing, and act accordingly...
doStuffWith(thing->some_member());
// re-cache the thing
thing.update();
}
Thanks!

That is way overengineered. Instead, I'd suggest to do following:
Create a pointer to Thing* theThing together with protection mutex. Either a global one, or shared by some other means. Initialize it to nullptr.
In your producer: use two local objects of Thing type - Thing thingOne and Thing thingTwo (remember, thingOne is no better than thingTwo, but one is called thingOne for a reason, but this is a thing thing. Watch out for cats.). Start with populating thingOne. When done, lock the mutex, copy thingOne address to theThing, unlock the mutex. Start populating thingTwo. When done, see above. Repeat untill killed.
In every listener: (make sure the pointer is not nullptr). Lock the mutex. Make a copy of the object pointed two by the theThing. Unlock the mutex. Work with your copy. Burn after reading. Repeat untill killed.

Read-write thread-safe smart pointer in C++, x86-64

I develop some lock free data structure and following problem arises.
I have writer thread that creates objects on heap and wraps them in smart pointer with reference counter. I also have a lot of reader threads, that work with these objects. Code can look like this:
SmartPtr ptr;
class Reader : public Thread {
virtual void Run {
for (;;) {
SmartPtr local(ptr);
// do smth
}
}
};
class Writer : public Thread {
virtual void Run {
for (;;) {
SmartPtr newPtr(new Object);
ptr = newPtr;
}
}
};
int main() {
Pool* pool = SystemThreadPool();
pool->Run(new Reader());
pool->Run(new Writer());
for (;;) // wait for crash :(
}
When I create thread-local copy of ptr it means at least
Read an address.
Increment reference counter.
I can't do these two operations atomically and thus sometimes my readers work with deleted object.
The question is - what kind of smart pointer should I use to make read-write access from several threads with correct memory management possible? Solution should exist, since Java programmers don't even care about such a problem, simply relying on that all objects are references and are deleted only when nobody uses them.
For PowerPC I found http://drdobbs.com/184401888, looks nice, but uses Load-Linked and Store-Conditional instructions, that we don't have in x86.
As far I as I understand, boost pointers provide such functionality only using locks. I need lock free solution.

boost::shared_ptr have atomic_store which uses a "lock-free" spinlock which should be fast enough for 99% of possible cases.
boost::shared_ptr<Object> ptr;
class Reader : public Thread {
virtual void Run {
for (;;) {
boost::shared_ptr<Object> local(boost::atomic_load(&ptr));
// do smth
}
}
};
class Writer : public Thread {
virtual void Run {
for (;;) {
boost::shared_ptr<Object> newPtr(new Object);
boost::atomic_store(&ptr, newPtr);
}
}
};
int main() {
Pool* pool = SystemThreadPool();
pool->Run(new Reader());
pool->Run(new Writer());
for (;;)
}
EDIT:
In response to comment below, the implementation is in "boost/shared_ptr.hpp"...
template<class T> void atomic_store( shared_ptr<T> * p, shared_ptr<T> r )
{
boost::detail::spinlock_pool<2>::scoped_lock lock( p );
p->swap( r );
}
template<class T> shared_ptr<T> atomic_exchange( shared_ptr<T> * p, shared_ptr<T> r )
{
boost::detail::spinlock & sp = boost::detail::spinlock_pool<2>::spinlock_for( p );
sp.lock();
p->swap( r );
sp.unlock();
return r; // return std::move( r )
}

With some jiggery-pokery you should be able to accomplish this using InterlockedCompareExchange128. Store the reference count and pointer in a 2 element __int64 array. If reference count is in array[0] and pointer in array[1] the atomic update would look like this:
while(true)
{
__int64 comparand[2];
comparand[0] = refCount;
comparand[1] = pointer;
if(1 == InterlockedCompareExchange128(
array,
pointer,
refCount + 1,
comparand))
{
// Pointer is ready for use. Exit the while loop.
}
}
If an InterlockedCompareExchange128 intrinsic function isn't available for your compiler then you may use the underlying CMPXCHG16B instruction instead, if you don't mind mucking around in assembly language.

The solution proposed by RobH doesn't work. It has the same problem as the original question: when accessing the reference count object, it might already have been deleted.
The only way I see of solving the problem without a global lock (as in boost::atomic_store) or conditional read/write instructions is to somehow delay the destruction of the object (or the shared reference count object if such thing is used). So zennehoy has a good idea but his method is too unsafe.
The way I might do it is by keeping copies of all the pointers in the writer thread so that the writer can control the destruction of the objects:
class Writer : public Thread {
virtual void Run() {
list<SmartPtr> ptrs; //list that holds all the old ptr values
for (;;) {
SmartPtr newPtr(new Object);
if(ptr)
ptrs.push_back(ptr); //push previous pointer into the list
ptr = newPtr;
//Periodically go through the list and destroy objects that are not
//referenced by other threads
for(auto it=ptrs.begin(); it!=ptrs.end(); )
if(it->refCount()==1)
it = ptrs.erase(it);
else
++it;
}
}
};
However there are still requirements for the smart pointer class. This doesn't work with shared_ptr as the reads and writes are not atomic. It almost works with boost::intrusive_ptr. The assignment on intrusive_ptr is implemented like this (pseudocode):
//create temporary from rhs
tmp.ptr = rhs.ptr;
if(tmp.ptr)
intrusive_ptr_add_ref(tmp.ptr);
//swap(tmp,lhs)
T* x = lhs.ptr;
lhs.ptr = tmp.ptr;
tmp.ptr = x;
//destroy temporary
if(tmp.ptr)
intrusive_ptr_release(tmp.ptr);
As far as I understand the only thing missing here is a compiler level memory fence before lhs.ptr = tmp.ptr;. With that added, both reading rhs and writing lhs would be thread-safe under strict conditions: 1) x86 or x64 architecture 2) atomic reference counting 3) rhs refcount must not go to zero during the assignment (guaranteed by the Writer code above) 4) only one thread writing to lhs (using CAS you could have several writers).
Anyway, you could create your own smart pointer class based on intrusive_ptr with necessary changes. Definitely easier than re-implementing shared_ptr. And besides, if you want performance, intrusive is the way to go.

The reason this works much more easily in java is garbage collection. In C++, you have to manually ensure that a value is not just starting to be used by a different thread when you want to delete it.
A solution I've used in a similar situation is to simply delay the deletion of the value. I create a separate thread that iterates through a list of things to be deleted. When I want to delete something, I add it to this list with a timestamp. The deleting thread waits until some fixed time after this timestamp before actually deleting the value. You just have to make sure that the delay is large enough to guarantee that any temporary use of the value has completed.
100 milliseconds would have been enough in my case, I chose a few seconds to be safe.

Is it safe to modify data of pointer in vector from another thread?

Things seem to be working but I'm unsure if this is the best way to go about it.
Basically I have an object which does asynchronous retrieval of data. This object has a vector of pointers which are allocated and de-allocated on the main thread. Using boost functions a process results callback is bound with one of the pointers in this vector. When it fires it will be running on some arbitrary thread and modify the data of the pointer.
Now I have critical sections around the parts that are pushing into the vector and erasing in case the asynch retrieval object is receives more requests but I'm wondering if I need some kind of guard in the callback that is modifying the pointer data as well.
Hopefully this slimmed down pseudo code makes things more clear:
class CAsyncRetriever
{
// typedefs of boost functions
class DataObject
{
// methods and members
};
public:
// Start single asynch retrieve with completion callback
void Start(SomeArgs)
{
SetupRetrieve(SomeArgs);
LaunchRetrieves();
}
protected:
void SetupRetrieve(SomeArgs)
{
// ...
{ // scope for data lock
boost::lock_guard<boost::mutex> lock(m_dataMutex);
m_inProgress.push_back(SmartPtr<DataObject>(new DataObject)));
m_callback = boost::bind(&CAsyncRetriever::ProcessResults, this, _1, m_inProgress.back());
}
// ...
}
void ProcessResults(DataObject* data)
{
// CALLED ON ANOTHER THREAD ... IS THIS SAFE?
data->m_SomeMember.SomeMethod();
data->m_SomeOtherMember = SomeStuff;
}
void Cleanup()
{
// ...
{ // scope for data lock
boost::lock_guard<boost::mutex> lock(m_dataMutex);
while(!m_inProgress.empty() && m_inProgress.front()->IsComplete())
m_inProgress.erase(m_inProgress.begin());
}
// ...
}
private:
std::vector<SmartPtr<DataObject>> m_inProgress;
boost::mutex m_dataMutex;
// other members
};
Edit: This is the actual code for the ProccessResults callback (plus comments for your benefit)
void ProcessResults(CRetrieveResults* pRetrieveResults, CRetData* data)
{
// pRetrieveResults is delayed binding that server passes in when invoking callback in thread pool
// data is raw pointer to ref counted object in vector of main thread (the DataObject* in question)
// if there was an error set the code on the atomic int in object
data->m_nErrorCode.Store_Release(pRetrieveResults->GetErrorCode());
// generic iterator of results bindings for generic sotrage class item
TPackedDataIterator<GenItem::CBind> dataItr(&pRetrieveResults->m_DataIter);
// namespace function which will iterate results and initialize generic storage
GenericStorage::InitializeItems<GenItem>(&data->m_items, dataItr, pRetrieveResults->m_nTotalResultsFound); // this is potentially time consuming depending on the amount of results and amount of columns that were bound in storage class definition (i.e.about 8 seconds for a million equipment items in release)
// atomic uint32_t that is incremented when kicking off async retrieve
m_nStarted.Decrement(); // this one is done processing
// boost function completion callback bound to interface that requested results
data->m_complete(data->m_items);
}

As it stands, it appears that the Cleanup code can destroy an object for which a callback to ProcessResults is in flight. That's going to cause problems when you deref the pointer in the callback.
My suggestion would be that you extend the semantics of your m_dataMutex to encompass the callback, though if the callback is long-running, or can happen inline within SetupRetrieve (sometimes this does happen - though here you state the callback is on a different thread, in which case you are OK) then things are more complex. Currently m_dataMutex is a bit confused about whether it controls access to the vector, or its contents, or both. With its scope clarified, ProcessResults could then be enhanced to verify validity of the payload within the lock.

No, it isn't safe.
ProcessResults operates on the data structure passed to it through DataObject. It indicates that you have shared state between different threads, and if both threads operate on the data structure concurrently you might have some trouble coming your way.

Updating a pointer should be an atomic operation, but you can use InterlockedExchangePointer (in Windows) to be sure. Not sure what the Linux equivalent would be.
The only consideration then would be if one thread is using an obsolete pointer. Does the other thread delete the object pointed to by the original pointer? If so, you have a definite problem.

Deleting pointer sometimes results in heap corruption

I have a multithreaded application that runs using a custom thread pool class. The threads all execute the same function, with different parameters.
These parameters are given to the threadpool class the following way:
// jobParams is a struct of int, double, etc...
jobParams* params = new jobParams;
params.value1 = 2;
params.value2 = 3;
int jobId = 0;
threadPool.addJob(jobId, params);
As soon as a thread has nothing to do, it gets the next parameters and runs the job function. I decided to take care of the deletion of the parameters in the threadpool class:
ThreadPool::~ThreadPool() {
for (int i = 0; i < this->jobs.size(); ++i) {
delete this->jobs[i].params;
}
}
However, when doing so, I sometimes get a heap corruption error:
Invalid Address specified to RtlFreeHeap
The strange thing is that in one case it works perfectly, but in another program it crashes with this error. I tried deleting the pointer at other places: in the thread after the execution of the job function (I get the same heap corruption error) or at the end of the job function itself (no error in this case).
I don't understand how deleting the same pointers (I checked, the addresses are the same) from different places changes anything. Does this have anything to do with the fact that it's multithreaded?
I do have a critical section that handles the access to the parameters. I don't think the problem is about synchronized access. Anyway, the destructor is called only once all threads are done, and I don't delete any pointer anywhere else. Can pointer be deleted automatically?
As for my code. The list of jobs is a queue of a structure, composed of the id of a job (used to be able to get the output of a specific job later) and the parameters.
getNextJob() is called by the threads (they have a pointer to the ThreadPool) each time they finished to execute their last job.
void ThreadPool::addJob(int jobId, void* params) {
jobData job; // jobData is a simple struct { int, void* }
job.ID = jobId;
job.params = params;
// insert parameters in the list
this->jobs.push(job);
}
jobData* ThreadPool::getNextJob() {
// get the data of the next job
jobData* job = NULL;
// we don't want to start a same job twice,
// so we make sure that we are only one at a time in this part
WaitForSingleObject(this->mutex, INFINITE);
if (!this->jobs.empty())
{
job = &(this->jobs.front());
this->jobs.pop();
}
// we're done with the exclusive part !
ReleaseMutex(this->mutex);
return job;
}

Let's turn this on its head: Why are you using pointers at all?
class Params
{
int value1, value2; // etc...
}
class ThreadJob
{
int jobID; // or whatever...
Params params;
}
class ThreadPool
{
std::list<ThreadJob> jobs;
void addJob(int job, const Params & p)
{
ThreadJob j(job, p);
jobs.push_back(j);
}
}
No new, delete or pointers... Obviously some of the implementation details may be cocked, but you get the overall picture.

Thanks for extra code. Now we can see a problem -
in getNextJob
if (!this->jobs.empty())
{
job = &(this->jobs.front());
this->jobs.pop();
After the "pop", the memory pointed to by 'job' is undefined. Don't use a reference, copy the actual data!
Try something like this (it's still generic, because JobData is generic):
jobData ThreadPool::getNextJob() // get the data of the next job
{
jobData job;
WaitForSingleObject(this->mutex, INFINITE);
if (!this->jobs.empty())
{
job = (this->jobs.front());
this->jobs.pop();
}
// we're done with the exclusive part !
ReleaseMutex(this->mutex);
return job;
}
Also, while you're adding jobs to the queue you must ALSO lock the mutex, to prevent list corruption. AFAIK std::lists are NOT inherently thread-safe...?

Using operator delete on pointer to void results in undefined behavior according to the specification.
Chapter 5.3.5 of the draft of the C++ specification. Paragraph 3.
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73)
And corresponding footnote.
This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void

All access to the job queue must be synchronized, i.e. performed only from 1 thread at a time by locking the job queue prior to access. Do you already have a critical section or some similar pattern to guard the shared resource? Synchronization issues often lead to weird behaviour and bugs which are hard to reproduce.

It's hard to give a definitive answer with this amount of code. But generally speaking, multithreaded programming is all about synchronizing access to data that might be accessed from multiple threads. If there is no long or other synchronization primitive protecting access to the threadpool class itself, then you can potentially have multiple threads reaching your deletion loop at the same time, at which point you're pretty much guaranteed to be double-freeing memory.
The reason you're getting no crash when you delete a job's params at the end of the job function might be because access to a single job's params is already implicitly serialized by your work queue. Or you might just be getting lucky. In either case, it's best to think about locks and synchronization primitive as not being something that protects code, but as being something that protects data (I've always thought the term "critical section" was a bit misleading here, as it tends to lead people to think of a 'section of lines of code' rather than in terms of data access).. In this case, since you want to access your jobs data from multiple thread, you need to be protecting it via a lock or some other synchronization primitive.

If you try to delete an object twice, the second time will fail, because the heap is already freed. This is the normal behavior.
Now, since you are in a multithreading context... it might be that the deletions are done "almost" in parallel, which might avoid the error on the second deletion, because the first one is not yet finalized.

Use smart pointers or other RAII to handle your memory.
If you have access to boost or tr1 lib you can do something like this.
class ThreadPool
{
typedef pair<int, function<void (void)> > Job;
list< Job > jobList;
HANDLE mutex;
public:
void addJob(int jobid, const function<void (void)>& job) {
jobList.push_back( make_pair(jobid, job) );
}
Job getNextJob() {
struct MutexLocker {
HANDLE& mutex;
MutexLocker(HANDLE& mutex) : mutex(mutex){
WaitForSingleObject(mutex, INFINITE);
}
~MutexLocker() {
ReleaseMutex(mutex);
}
};
Job job = make_pair(-1, function<void (void)>());
const MutexLocker locker(this->mutex);
if (!this->jobList.empty()) {
job = this->jobList.front();
this->jobList.pop();
}
return job;
}
};
void workWithDouble( double value );
void workWithInt( int value );
void workWithValues( int, double);
void test() {
ThreadPool pool;
//...
pool.addJob( 0, bind(&workWithDouble, 0.1));
pool.addJob( 1, bind(&workWithInt, 1));
pool.addJob( 2, bind(&workWithValues, 1, 0.1));
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs - c++

If I want to be thread safe, I usually use const objects and modify only on copy or create constructor. In this way you don't need to use any lock mechanism. In a threaded system, it is usually more effective than use mutexes on a single instance. In your case only deque would need lock.

Related

Synchronizing method calls on shared object from multiple threads

Is this inter-thread object sharing strategy sound?

Read-write thread-safe smart pointer in C++, x86-64

Is it safe to modify data of pointer in vector from another thread?

Deleting pointer sometimes results in heap corruption

Categories

Resources