Synchronizing method calls on a shared object from multiple threads - C++

I am thinking about how to implement a class that will contain private data that will eventually be modified by multiple threads through method calls. For synchronization (using the Windows API), I am planning on using a CRITICAL_SECTION object, since all the threads will spawn from the same process.
Given the following design, I have a few questions.
template <typename T>
class Shareable
{
private:
    const LPCRITICAL_SECTION sync; // Can be read and used by multiple threads
    T *data;

public:
    Shareable(LPCRITICAL_SECTION cs, unsigned elems) : sync{cs}, data{new T[elems]} { }

    ~Shareable() { delete[] data; }

    void sharedModify(unsigned index, T &datum) // <-- Can this be validly called
                                                // by multiple threads with synchronization being implicit?
    {
        EnterCriticalSection(sync);
        /*
            The critical section of code involving reads & writes to 'data'
        */
        LeaveCriticalSection(sync);
    }
};
// Somewhere else ...
DWORD WINAPI ThreadProc(LPVOID lpParameter)
{
    Shareable<ActualType> *ptr = static_cast<Shareable<ActualType>*>(lpParameter);
    ActualType copyable = /* initialization */;
    ptr->sharedModify(validIndex, copyable); // <-- OK, synchronized?
    return 0;
}
The way I see it, the API calls will be conducted in the context of the current thread. That is, I assume this is the same as if I had acquired the critical section object from the pointer and called the API from within ThreadProc(). However, I am worried that if the object is created and placed in the main/initial thread, there will be something funky about the API calls.
1. When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
2. Should I instead get a pointer to the critical section object and use that instead?
3. Is there some other synchronization mechanism that is better suited to this scenario?

When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
It's not implicit, it's explicit. There's only one CRITICAL_SECTION, and only one thread can hold it at a time.
Should I instead get a pointer to the critical section object and use that instead?
No. There's no reason to use a pointer here.
Is there some other synchronization mechanism that is better suited to this scenario?
It's hard to say without seeing more code, but this is definitely the "default" solution. It's like a singly-linked list -- you learn it first, it always works, but it's not always the best choice.

When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Implicit from the caller's perspective, yes.
Should I instead get a pointer to the critical section object and use that instead?
No. In fact, I would suggest giving the Shareable object ownership of its own critical section instead of accepting one from the outside (and embrace RAII concepts to write safer code), e.g.:
template <typename T>
class Shareable
{
private:
    CRITICAL_SECTION sync;
    std::vector<T> data;

    struct SyncLocker
    {
        CRITICAL_SECTION &sync;
        SyncLocker(CRITICAL_SECTION &cs) : sync(cs) { EnterCriticalSection(&sync); }
        ~SyncLocker() { LeaveCriticalSection(&sync); }
    };

public:
    Shareable(unsigned elems) : data(elems)
    {
        InitializeCriticalSection(&sync);
    }

    Shareable(const Shareable&) = delete;
    Shareable(Shareable&&) = delete;

    ~Shareable()
    {
        {
            SyncLocker lock(sync);
            data.clear();
        }
        DeleteCriticalSection(&sync);
    }

    void sharedModify(unsigned index, const T &datum)
    {
        SyncLocker lock(sync);
        data[index] = datum;
    }

    Shareable& operator=(const Shareable&) = delete;
    Shareable& operator=(Shareable&&) = delete;
};
Is there some other synchronization mechanism that is better suited to this scenario?
That depends. Will multiple threads be accessing the same index at the same time? If not, then there is not really a need for the critical section at all. One thread can safely access one index while another thread accesses a different index.
If multiple threads need to access the same index at the same time, a critical section might still not be the best choice. Locking the entire array might be a big bottleneck if you only need to lock portions of the array at a time. Things like the Interlocked API, or Slim Read/Write locks, might make more sense. It really depends on your thread designs and what you are actually trying to protect.
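For example, if reads vastly outnumber writes, a Slim Reader/Writer lock lets readers proceed in parallel. Here is a minimal sketch of that idea (SharedArray and its read/write methods are illustrative, not part of the original design):

#include <windows.h>
#include <vector>

template <typename T>
class SharedArray
{
    mutable SRWLOCK lock;
    std::vector<T> data;

public:
    explicit SharedArray(unsigned elems) : data(elems)
    {
        InitializeSRWLock(&lock); // SRW locks need no matching delete call
    }

    void write(unsigned index, const T &datum)
    {
        AcquireSRWLockExclusive(&lock); // one writer at a time
        data[index] = datum;
        ReleaseSRWLockExclusive(&lock);
    }

    T read(unsigned index) const
    {
        AcquireSRWLockShared(&lock);    // many readers may run concurrently
        T copy = data[index];
        ReleaseSRWLockShared(&lock);
        return copy;
    }
};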

Related

A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs

I have created a generic message queue for use in a multi-threaded application. Specifically, single producer, multi-consumer. Main code below.
1) I wanted to know if I should pass a shared_ptr allocated with new into the enqueue method by value, or is it better to have the queue wrapper allocate the memory itself and just pass in a genericMsg object by const reference?
2) Should I have my dequeue method return a shared_ptr, have a shared_ptr passed in as a parameter by reference (current strategy), or just have it directly return a genericMsg object?
3) Will I need signal/wait in enqueue/dequeue or will the read/write locks suffice?
4) Do I even need to use shared_ptrs? Or will this depend solely on the implementation I use? I like that the shared_ptrs will free memory once all references are no longer using the object. I can easily port this to regular pointers if that's recommended, though.
5) I'm storing a pair here because I'd like to discriminate what type of message I'm dealing with without having to do an any_cast. Every message type has a unique ID that refers to a specific struct. Is there a better way of doing this?
Generic Message Type:
template<typename Message_T>
class genericMsg
{
public:
    genericMsg()
    {
        id = 0;
        size = 0;
    }

    genericMsg(unsigned int &_id, unsigned int &_size, Message_T &_data)
    {
        id = _id;
        size = _size;
        data = _data;
    }

    ~genericMsg()
    {}

    unsigned int id;
    unsigned int size;
    Message_T data; // All structs stored here contain only POD types
};
Enqueue Methods:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(boost::shared_ptr< genericMsg<Message_T> > data)
{
    WriteLock w_lock(myLock);
    this->qData.push_back(std::make_pair(data->id, data));
}
VS:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(const genericMsg<Message_T> &data_in)
{
    WriteLock w_lock(myLock);
    boost::shared_ptr< genericMsg<Message_T> > data(
        new genericMsg<Message_T>(data_in.id, data_in.size, data_in.data));
    this->qData.push_back(std::make_pair(data_in.id, data));
}
Dequeue Method:
// ----------------------------------------------------------------
// -- Thread safe function that grabs a genericMsg object from the
// -- front of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
void dequeue(boost::shared_ptr< genericMsg<Message_T> > &msg)
{
    ReadLock r_lock(myLock);
    msg = boost::any_cast< boost::shared_ptr< genericMsg<Message_T> > >(qData.front().second);
    qData.pop_front();
}
Get message ID:
inline unsigned int getMessageID()
{
    ReadLock r_lock(myLock);
    unsigned int tempID = qData.front().first;
    return tempID;
}
Data Types:
std::deque < std::pair< unsigned int, boost::any> > qData;
Edit:
I have improved upon my design. I now have a genericMessage base class that I directly subclass from in order to derive the unique messages.
Generic Message Base Class:
class genericMessage
{
public:
    virtual ~genericMessage() {}

    unsigned int getID() {return id;}
    unsigned int getSize() {return size;}

protected:
    unsigned int id;
    unsigned int size;
};
Producer Snippet:
boost::shared_ptr<genericMessage> tmp (new derived_msg1(MSG1_ID));
theQueue.enqueue(tmp);
Consumer Snippet:
boost::shared_ptr<genericMessage> tmp = theQueue.dequeue();

if(tmp->getID() == MSG1_ID)
{
    boost::shared_ptr<derived_msg1> tObj = boost::dynamic_pointer_cast<derived_msg1>(tmp);
    tObj->printData();
}
New Queue:
std::deque< boost::shared_ptr<genericMessage> > qData;
New Enqueue:
void mq_class::enqueue(const boost::shared_ptr<genericMessage> &data_in)
{
    boost::unique_lock<boost::mutex> lock(mut);
    this->qData.push_back(data_in);
    cond.notify_one();
}
New Dequeue:
boost::shared_ptr<genericMessage> mq_class::dequeue()
{
    boost::shared_ptr<genericMessage> ptr;
    {
        boost::unique_lock<boost::mutex> lock(mut);
        while(qData.empty())
        {
            cond.wait(lock);
        }
        ptr = qData.front();
        qData.pop_front();
    }
    return ptr;
}
Now, my question is am I doing dequeue correctly? Is there another way of doing it? Should I pass in a shared_ptr as a reference in this case to achieve what I want?
Edit (I added answers for parts 1, 2, and 4).
1) You should have a factory method that creates new genericMsgs and returns a std::unique_ptr. There is absolutely no good reason to pass genericMsg in by const reference and then have the queue wrap it in a smart pointer: Once you've passed by reference you have lost track of ownership, so if you do that the queue is going to have to construct (by copy) the entire genericMsg to wrap.
2) I can't think of any circumstance under which it would be safe to take a reference to a shared_ptr or unique_ptr or auto_ptr. shared_ptrs and unique_ptrs are for tracking ownership and once you've taken a reference to them (or the address of them) you have no idea how many references or pointers are still out there expecting the shared_ptr/unique_ptr object to contain a valid naked pointer.
unique_ptr is always preferred to a naked pointer, and is preferred to a shared_ptr in cases where you only have a single piece of code (validly) pointing to an object at a time.
https://softwareengineering.stackexchange.com/questions/133302/stdshared-ptr-as-a-last-resort
http://herbsutter.com/gotw/_103/
Bad practice to return unique_ptr for raw pointer like ownership semantics? (the answer explains why it is good practice not bad).
3) Yes, you need to use a std::condition_variable in your dequeue function. You need to test whether qData is empty or not before calling qData.front() or qData.pop_front(). If qData is empty you need to wait on a condition variable. When enqueue inserts an item it should signal the condition variable to wake up anyone who may have been waiting.
Your use of reader/writer locks is completely incorrect. Don't use reader/writer locks. Use std::mutex. A reader lock can only be used on a method that is completely const. You are modifying qData in dequeue, so a reader lock will lead to data races there. (Reader/writer locks are only applicable when you have stupid code that is both const and holds locks for an extended period of time. You are only keeping the lock for the period of time it takes to insert or remove from the queue, so even if you were const, the added overhead of reader/writer locks would be a net loss.)
An example of implementing a (bounded) buffer using mutexes and condition_variables can be found at: Is this a correct way to implement a bounded buffer in C++.
4) unique_ptr is always preferred to naked pointers, and usually preferred to shared_ptr. (The main exception where shared_ptr might be better is for graph-like data structures.) In cases like yours where you are reading something in on side, creating a new object with a factory, moving the ownership to the queue and then moving ownership out of the queue to the consumer it sounds like you should be using unique_ptr.
5) You are reinventing tagged unions. Virtual functions were added to C++ specifically so you wouldn't need to do this. You should subclass your messages from a class that has a virtual function called do_it() (or better yet, operator()() or something like that). Then instead of tagging each struct, make each struct a subclass of your message class. When you dequeue each struct (or ptr to struct) just call do_it() on it. Strong static typing, no casts. See C++ std condition variable covering a lot of share variables for an example, and the sketch after the next paragraph.
Also: if you are going to stick with the tagged unions: you can't have separate calls to get the id and the data item. Consider: If thread A calls to get the id, then thread B calls to get the id, then thread B retrieves the data item, now what happens when thread A calls to retrieve a data item? It gets a data item, but not with the type that it expected. You need to retrieve the id and the data item under the same critical section.
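A minimal sketch of the virtual-function approach from point 5 (Message and TextMessage are illustrative names, not from the question's code):

#include <iostream>
#include <memory>
#include <string>

// Base class with a virtual "do the work" hook, so consumers
// never need to inspect an id or perform any cast.
struct Message
{
    virtual ~Message() {}
    virtual void operator()() = 0; // each concrete message handles itself
};

struct TextMessage : Message
{
    std::string text;
    explicit TextMessage(const std::string &t) : text(t) {}
    void operator()() { std::cout << text << '\n'; }
};

// Consumer side: dequeue and dispatch, no ids, no casts.
// std::unique_ptr<Message> msg = queue.dequeue();
// (*msg)(); // virtual dispatch picks the right handler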
First of all, it's better to use 3rd-party concurrency containers than to implement them yourself, unless it's for education purposes.
Your messages don't look like they have costly constructors/destructors, so you can store them by value and forget about all your other questions. Use move semantics (if available) for optimization.
If your profiler says "by value" is a bad idea in your particular case:
I suppose your producer creates messages, puts them into your queue, and loses any interest in them. In this case you don't need shared_ptr, because you don't have shared ownership. You can use unique_ptr or even a raw pointer. These are implementation details, and it's better to hide them inside the queue.
From a performance point of view, it's better to implement a lock-free queue. "Locks vs. signals" depends completely on your application. For example, if you use a thread pool and some kind of scheduler, it's better to allow your clients to do something useful while the queue is full/empty. In simpler cases a reader/writer lock is just fine.
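As an illustration of the lock-free idea, here is a minimal sketch of the simplest case, a bounded single-producer/single-consumer ring buffer (SpscQueue and the fixed capacity are assumptions; the poster's single-producer/multi-consumer scenario would need a more involved MPMC design):

#include <atomic>
#include <array>
#include <cstddef>

template <typename T, size_t N>
class SpscQueue
{
    std::array<T, N> buf;
    std::atomic<size_t> head{0}; // next slot to read  (consumer only)
    std::atomic<size_t> tail{0}; // next slot to write (producer only)

public:
    bool push(const T &v) // call from the producer thread only
    {
        size_t t = tail.load(std::memory_order_relaxed);
        size_t next = (t + 1) % N;
        if (next == head.load(std::memory_order_acquire))
            return false; // full
        buf[t] = v;
        tail.store(next, std::memory_order_release); // publish the write
        return true;
    }

    bool pop(T &out) // call from the consumer thread only
    {
        size_t h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire))
            return false; // empty
        out = buf[h];
        head.store((h + 1) % N, std::memory_order_release); // free the slot
        return true;
    }
};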
If I want to be thread-safe, I usually use const objects and modify them only on copy or in the constructor. That way you don't need any locking mechanism. In a threaded system, it is usually more effective than using mutexes on a single instance.
In your case only the deque would need a lock.

Is this an acceptable way to lock a container using C++?

I need to implement (in C++) a thread safe container in such a way that only one thread is ever able to add or remove items from the container. I have done this kind of thing before by sharing a mutex between threads. This leads to a lot of mutex objects being littered throughout my code and makes things very messy and hard to maintain.
I was wondering if there is a neater and more object oriented way to do this. I thought of the following simple class wrapper around the container (semi-pseudo C++ code)
class LockedList {
private:
    std::list<MyClass> m_List;
public:
    MutexObject Mutex;
};
so that locking could be done in the following way
LockedList lockableList;     // create instance
lockableList.Mutex.Lock();   // lock object
...                          // search and add or remove items
lockableList.Mutex.Unlock(); // unlock object
So my question really is to ask if this is a good approach from a design perspective. I know that allowing public access to members is frowned upon from a design perspective; does the above design have any serious flaws in it? If so, is there a better way to implement thread-safe container objects?
I have read a lot of books on design and C++ in general but there really does seem to be a shortage of literature regarding multithreaded programming and multithreaded software design.
If the above is a poor approach to solving the problem I have could anyone suggest a way to improve it, or point me towards some information that explains good ways to design classes to be thread safe??? Many thanks.
I would rather design a resource owner that locks a mutex and returns an object that can be used by the thread. Once the thread has finished with it and stops using the object, the resource is automatically returned to its owner and the lock released.
template<typename Resource>
class ResourceOwner
{
    Lock lock;
    Resource resource;

public:
    ResourceHolder<Resource> getExclusiveAccess()
    {
        // Let the ResourceHolder lock and unlock the lock.
        // So while a thread holds a copy of this object only it
        // can access the resource. Once the thread releases all
        // copies then the lock is released allowing another
        // thread to call getExclusiveAccess().
        //
        // Make it behave like a form of smart pointer:
        // 1) So you can pass it around.
        // 2) So all properties of the resource are provided via ->
        // 3) So the lock is automatically released when the thread
        //    releases the object.
        return ResourceHolder<Resource>(lock, resource);
    }
};
The resource holder (not thought hard so this can be improved)
template<typename Resource>
class ResourceHolder
{
    // Use a shared_ptr to hold the scoped lock.
    // When first created it will lock the lock. When the shared_ptr
    // destroys the scoped lock (after all copies are gone),
    // this will unlock the lock, thus allowing others to use
    // getExclusiveAccess() on the owner.
    std::shared_ptr<scoped_lock> locker;
    Resource& resource; // local reference on the resource.

public:
    ResourceHolder(Lock& lock, Resource& r)
        : locker(new scoped_lock(lock))
        , resource(r)
    {}

    // Access to the resource via the -> operator,
    // thus allowing you to use all normal functionality of
    // the resource.
    Resource* operator->() { return &resource; }
};
Now a lockable list is:
ResourceOwner<list<int>> lockedList;

void threadedCode()
{
    ResourceHolder<list<int>> list = lockedList.getExclusiveAccess();
    list->push_back(1);
}
// When list goes out of scope here,
// it is destroyed and the member locker will unlock `lock`
// in its destructor, thus allowing the next thread to call getExclusiveAccess().
I would do something like this to make it more exception-safe by using RAII.
class LockedList {
private:
    std::list<MyClass> m_List;
    MutexObject Mutex;

    friend class LockableListLock;
};

class LockableListLock {
private:
    LockedList& list_;

public:
    LockableListLock(LockedList& list) : list_(list) { list_.Mutex.Lock(); }
    ~LockableListLock() { list_.Mutex.Unlock(); }
};
You would use it like this
LockedList list;
{
    LockableListLock lock(list); // The list is now locked.
    // do stuff to the list
} // The list is automatically unlocked when lock goes out of scope.
You could also make the class force you to lock it before doing anything with it by adding wrappers around the interface for std::list in LockableListLock so instead of accessing the list through the LockedList class, you would access the list through the LockableListLock class. For instance, you would make this wrapper around std::list::begin()
std::list<MyClass>::iterator LockableListLock::begin() {
    return list_.m_List.begin();
}
and then use it like this
LockedList list;
LockableListLock lock(list);

// list.begin(); // This is a compiler error, so you can't
//               // access the list without locking it.
lock.begin();    // This gets you the beginning of the list.
Okay, I'll state a little more directly what others have already implied: at least part, and quite possibly all, of this design is probably not what you want. At the very least, you want RAII-style locking.
I'd also make the locked class (or whatever you prefer to call it) a template, so you can decouple the locking from the container itself.
// C++ like pesudo-code. Not intended to compile as-is.
struct mutex {
    void lock() { /* ... */ }
    void unlock() { /* ... */ }
};

struct lock {
    lock(mutex &m) : m_(m) { m_.lock(); }
    ~lock() { m_.unlock(); }
    mutex &m_;
};
template <class container>
class locked {
    typedef typename container::value_type value_type;
    typedef typename container::reference_type reference_type;
    // ...

    container c;
    mutex m;

public:
    void push_back(reference_type const t) {
        lock l(m);
        c.push_back(t);
    }
    void push_front(reference_type const t) {
        lock l(m);
        c.push_front(t);
    }
    // etc.
};
This makes the code fairly easy to write and (for at least some cases) still get correct behavior -- e.g., where your single-threaded code might look like:
std::vector<int> x;
x.push_back(y);
...your thread-safe code would look like:
locked<std::vector<int> > x;
x.push_back(y);
Assuming you provide the usual begin(), end(), push_front, push_back, etc., your locked<container> will still be usable like a normal container, so it works with standard algorithms, iterators, etc.
The problem with this approach is that it makes LockedList non-copyable. For details on this snag, please look at this question:
Designing a thread-safe copyable class
I have tried various things over the years, and a mutex declared beside the container declaration always turns out to be the simplest way to go (once all the bugs have been fixed after naively implementing other methods).
You do not need to 'litter' your code with mutexes. You just need one mutex, declared beside the container it guards.
It's hard to say that coarse-grained locking is a bad design decision. We'd need to know about the system that the code lives in to talk about that. It is a good starting point, however, if you don't yet know that it won't work. Do the simplest thing that could possibly work first.
You could improve that code by making it less likely to fail if you leave scope without unlocking, though.
struct ScopedLocker {
    ScopedLocker(MutexObject &mo_) : mo(mo_) { mo.Lock(); }
    ~ScopedLocker() { mo.Unlock(); }
    MutexObject &mo;
};
You could also hide the implementation from users.
class LockedList {
private:
    std::list<MyClass> m_List;
    MutexObject Mutex;

public:
    struct ScopedLocker {
        ScopedLocker(LockedList &ll);
        ~ScopedLocker();
    };
};
Then you just pass the locked list to it without them having to worry about details of the MutexObject.
You can also have the list handle all the locking internally, which is alright in some cases. The design issue is iteration. If the list locks internally, then operations like this are much worse than letting the user of the list decide when to lock.
void foo(LockedList &list) {
    for (size_t i = 0; i < 100000000; i++) {
        list.push_back(i);
    }
}
Generally speaking, it's a hard topic to give advice on because of problems like this. More often than not, it's more about how you use an object. There are a lot of leaky abstractions when you try and write code that solves multi-processor programming. That is why you see more toolkits that let people compose the solution that meets their needs.
There are books that discuss multi-processor programming, though they are few. With all the new C++11 features coming out, there should be more literature coming within the next few years.
I came up with this (which I'm sure can be improved to take more than two arguments):
template<class T1, class T2>
class combine : public T1, public T2
{
public:
    /// We always need a virtual destructor.
    virtual ~combine() { }
};
This allows you to do:
// Combine an std::mutex and std::map<std::string, std::string> into
// a single instance.
combine<std::mutex, std::map<std::string, std::string>> mapWithMutex;

// Lock the map within scope to modify the map in a thread-safe way.
{
    // Lock the map.
    std::lock_guard<std::mutex> locked(mapWithMutex);

    // Modify the map.
    mapWithMutex["Person 1"] = "Jack";
    mapWithMutex["Person 2"] = "Jill";
}
If you wish to use an std::recursive_mutex and an std::set, that would also work.

How and what data must be synced in multithreaded C++

I built a little application which has a render thread and some worker threads for tasks which can be done alongside the rendering, e.g. uploading files onto some server. Now in those worker threads I use different objects to store feedback information and share these with the render thread, which reads them for output purposes. So render = output, worker = input. Those shared objects are int, float, bool, STL string and STL list.
I had this running a few months and all was fine, except for 2 random crashes during output, but I have learned about thread syncing now. I read that int, bool, etc. do not require syncing and I think it makes sense, but when I look at string and list I fear potential crashes if 2 threads attempt to read/write an object at the same time. Basically I expect one thread changes the size of the string while the other might use the outdated size to loop through its characters and then read from unallocated memory. This evening I want to build a little test scenario with 2 threads writing/reading the same object in a loop; however, I was hoping to get some ideas here as well.
I was reading about the CRITICAL_SECTION in Win32 and thought it may be worth a try. Yet I am unsure what the best way would be to implement it. If I put it at the start and at the end of a read/write function, it feels like some time is wasted. And if I wrap EnterCriticalSection and LeaveCriticalSection in Set and Get functions for each object I want to have synced across the threads, it is a lot of administration.
I think I must crawl through more references.
Okay I am still not sure how to proceed. I was studying the links provided by StackedCrooked but do still have no image of how to do this.
I have copied/modified this together now and have no idea how to continue or what to do. Does anyone have ideas?
class CSync
{
public:
    CSync()
        : m_isEnter(false)
    { InitializeCriticalSection(&m_CriticalSection); }

    ~CSync()
    { DeleteCriticalSection(&m_CriticalSection); }

    bool TryEnter()
    {
        m_isEnter = TryEnterCriticalSection(&m_CriticalSection) != 0;
        return m_isEnter;
    }

    void Enter()
    {
        if(!m_isEnter)
        {
            EnterCriticalSection(&m_CriticalSection);
            m_isEnter = true;
        }
    }

    void Leave()
    {
        if(m_isEnter)
        {
            LeaveCriticalSection(&m_CriticalSection);
            m_isEnter = false;
        }
    }

private:
    CRITICAL_SECTION m_CriticalSection;
    bool m_isEnter;
};
/* not needed
class CLockGuard
{
public:
    CLockGuard(CSync& refSync) : m_refSync(refSync) { Lock(); }
    ~CLockGuard() { Unlock(); }

private:
    CSync& m_refSync;

    CLockGuard(const CLockGuard& refcSource);
    CLockGuard& operator=(const CLockGuard& refcSource);

    void Lock() { m_refSync.Enter(); }
    void Unlock() { m_refSync.Leave(); }
};*/
template<class T> class Call_proxy
{
public:
    Call_proxy(T* pp, CSync& sync)
        : p(pp)
        , m_refSync(sync)
    {}
    ~Call_proxy() { m_refSync.Leave(); }

    T* operator->() { return p; }

private:
    T* p;
    CSync& m_refSync;
};

template<class T> class Wrap
{
public:
    Wrap(T* pp, CSync& sync)
        : p(pp)
        , m_refSync(sync)
    {}

    Call_proxy<T> operator->() { m_refSync.Enter(); return Call_proxy<T>(p, m_refSync); }

private:
    T* p;
    CSync& m_refSync;
};
int main()
{
    CSync sync;
    Wrap<string> safeVar(new string, sync);
    // safeVar what now?
    return 0;
}
Okay so I was preparing a little test now to see if my attempts do something good, so first I created a setup to make the application crash, I believed...
But that does not crash!? Does that mean now I need no syncing? What does the program need to effectively crash? And if it does not crash, why do I even bother? It seems I am missing some point again. Any ideas?
string gl_str, str_test;

void thread1()
{
    while(true)
    {
        gl_str = "12345";
        str_test = gl_str;
    }
}

void thread2()
{
    while(true)
    {
        gl_str = "123456789";
        str_test = gl_str;
    }
}

CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)thread1, NULL, 0, NULL);
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)thread2, NULL, 0, NULL);
Just added more stuff and now it crashes when calling clear(). Good.
int gl_int; // shared int used below

void thread1()
{
    while(true)
    {
        gl_str = "12345";
        str_test = gl_str;
        gl_str.clear();
        gl_int = 124;
    }
}

void thread2()
{
    while(true)
    {
        gl_str = "123456789";
        str_test = gl_str;
        gl_str.clear();
        if(gl_str.empty())
            gl_str = "aaaaaaaaaaaaa";
        gl_int = 244;
        if(gl_int == 124)
            gl_str.clear();
    }
}
The rule is simple: if the object can be modified in any thread, all accesses to it require synchronization. The type of object doesn't matter: even bool or int require external synchronization of some sort (possibly by means of a special, system-dependent function, rather than with a lock). There are no exceptions, at least in C++. (If you're willing to use inline assembler, and understand the implications of fences and memory barriers, you may be able to avoid a lock.)
I read int, bool, etc do not require syncing
This is not true:
- A thread may store a copy of the variable in a CPU register and keep using the old value even if the original variable has been modified by another thread.
- Simple operations like i++ are not atomic.
- The compiler may reorder reads and writes to the variable. This may cause synchronization issues in multithreaded scenarios.
See Lockless Programming Considerations for more details.
You should use mutexes to protect against race conditions. See this article for a quick introduction to the boost threading library.
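To make the point about primitives concrete, here is a minimal C++11 sketch (assuming C++11 is available; with older compilers, a mutex or the Interlocked API serves the same purpose; gl_int mirrors the variable from the question's test code):

#include <atomic>

std::atomic<int> gl_int(0); // safe to read and write from any thread

void worker()
{
    gl_int.fetch_add(1);          // an atomic increment, unlike ++ on a plain int
    int snapshot = gl_int.load(); // a read always sees a fully written value
    (void)snapshot;
}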
First, you do need protection even for accessing the most primitive of data types.
If you have an int x somewhere, you can write
x += 42;
... but that will mean, at the lowest level: read the old value of x, calculate a new value, write the new value to the variable x. If two threads do that at about the same time, strange things will happen. You need a lock/critical section.
I'd recommend using the C++11 and related interfaces, or, if that is not available, the corresponding things from the boost::thread library. If that is not an option either, critical sections on Win32 and pthread_mutex_* for Unix.
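A minimal C++11 sketch of that recommendation, applied to the question's shared string (setString/getString are illustrative wrappers, not part of the original code):

#include <mutex>
#include <string>

std::mutex g_strMutex;
std::string gl_str; // the shared string from the question

void setString(const std::string &s)
{
    std::lock_guard<std::mutex> lock(g_strMutex); // unlocks automatically on scope exit
    gl_str = s;
}

std::string getString()
{
    std::lock_guard<std::mutex> lock(g_strMutex);
    return gl_str; // return a copy; the caller works on its own snapshot
}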
NO, Don't Start Writing Multithreaded Programs Yet!
Let's talk about invariants first.
In a (hypothetical) well-defined program, every class has an invariant.
The invariant is some logical statement that is always true about an instance's state, i.e. about the values of all its member variables. If the invariant ever becomes false, the object is broken, corrupted, your program may crash, bad things have already happened. All your functions assume that the invariant is true when they are called, and they make sure that it is still true afterwards.
When a member function changes a member variable, the invariant might temporarily become false, but that is OK because the member function will make sure that everything "fits together" again before it exits.
You need a lock that protects the invariant - whenever you do something that might affect the invariant, take the lock and do not release it until you've made sure that the invariant is restored.
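A small sketch of that idea (the Ledger class and its invariant are hypothetical):

#include <mutex>
#include <vector>

// The invariant here is "total == sum of entries". The lock is held
// across the whole modification, so no other thread can ever observe
// the invariant while it is temporarily false.
class Ledger
{
    std::mutex m;
    std::vector<int> entries;
    int total = 0; // invariant: total == sum of entries

public:
    void add(int amount)
    {
        std::lock_guard<std::mutex> lock(m);
        entries.push_back(amount); // invariant temporarily false here...
        total += amount;           // ...and restored before the lock is released
    }
};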

Thread safe container

There is some exemplary class of container in pseudo code:
class Container
{
public:
    Container(){}
    ~Container(){}

    void add(data new_data)
    {
        // addition of data
    }

    data get(size_t which)
    {
        // returning some data
    }

    void remove(size_t which)
    {
        // delete specified object
    }

private:
    data d;
};
How can this container be made thread safe? I heard about mutexes - where should these mutexes be placed? Should the mutex be static for a class or maybe in global scope? What is a good library for this task in C++?
First of all, mutexes should not be static for a class as long as you are going to use more than one instance. There are many cases where you should or shouldn't use them. So without seeing your code it's hard to say. Just remember, they are used to synchronise access to shared data. So it's wise to place them inside methods that modify or rely on the object's state. In your case I would use one mutex to protect the whole object and lock all three methods. Like:
class Container
{
public:
    Container(){}
    ~Container(){}

    void add(data new_data)
    {
        lock_guard<Mutex> lock(mutex);
        // addition of data
    }

    data get(size_t which)
    {
        lock_guard<Mutex> lock(mutex);
        // getting copy of value
        // return that value
    }

    void remove(size_t which)
    {
        lock_guard<Mutex> lock(mutex);
        // delete specified object
    }

private:
    data d;
    Mutex mutex;
};
Intel Thread Building Blocks (TBB) provides a bunch of thread-safe container implementations for C++. It has been open sourced, you can download it from: http://threadingbuildingblocks.org/ver.php?fid=174 .
First: sharing mutable state between threads is hard. You should be using a library that has been audited and debugged.
Now that it is said, there are two different functional issue:
you want a container to provide safe atomic operations
you want a container to provide safe multiple operations
The idea of multiple operations is that multiple accesses to the same container must be executed successively, under the control of a single entity. They require the caller to "hold" the mutex for the duration of the transaction so that only it changes the state.
1. Atomic operations
This one appears simple:
add a mutex to the object
at the start of each method grab a mutex with a RAII lock
Unfortunately it's also plain wrong.
The issue is re-entrancy. It is likely that some methods will call other methods on the same object. If those once again attempt to grab the mutex, you get a deadlock.
It is possible to use re-entrant mutexes. They are a bit slower, but allow the same thread to lock a given mutex as much as it wants. The number of unlocks should match the number of locks, so once again, RAII.
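A minimal sketch using C++11's std::recursive_mutex (the Container methods are illustrative):

#include <mutex>

// std::recursive_mutex lets the same thread re-lock a mutex it
// already holds, avoiding self-deadlock when one public method
// calls another.
class Container
{
    std::recursive_mutex m;

public:
    void add()
    {
        std::lock_guard<std::recursive_mutex> lock(m);
        // ... modify state ...
    }

    void addTwice()
    {
        std::lock_guard<std::recursive_mutex> lock(m);
        add(); // re-locks m on the same thread: fine with a recursive mutex
        add();
    }
};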
Another approach is to use dispatching methods:
class C {
public:
    void foo() { Lock lock(_mutex); foo_impl(); }
private:
    void foo_impl() { /* do something; may call other *_impl methods */ }
};
The public methods are simple forwarders to private work-methods and simply lock. Then one just has to ensure that private methods never take the mutex...
Of course there are risks of accidentally calling a locking method from a work-method, in which case you deadlock. Read on to avoid this ;)
2. Multiple operations
The only way to achieve this is to have the caller hold the mutex.
The general method is simple:
add a mutex to the container
provide a handle on this method
cross your fingers that the caller will never forget to hold the mutex while accessing the class
I personally prefer a much saner approach.
First, I create a "bundle of data", which simply represents the class data (+ a mutex), and then I provide a Proxy, in charge of grabbing the mutex. The data is locked so that only the proxy may access the state.
class ContainerData {
protected:
    friend class ContainerProxy;

    Mutex _mutex;

    void foo();
    void bar();

private:
    // some data
};

class ContainerProxy {
    ContainerData& _data;
    Lock _lock;

public:
    ContainerProxy(ContainerData& data): _data(data), _lock(data._mutex) {}

    void foo() { _data.foo(); }
    void bar() { foo(); _data.bar(); }
};
Note that it is perfectly safe for the Proxy to call its own methods. The mutex will be released automatically by the destructor.
The mutex can still be reentrant if multiple Proxies are desired. But really, when multiple proxies are involved, it generally turns into a mess. In debug mode, it's also possible to add a "check" that the mutex is not already held by this thread (and assert if it is).
3. Reminder
Using locks is error-prone. Deadlocks are a common cause of error and occur as soon as you have two mutexes (or one and re-entrancy). When possible, prefer using higher level alternatives.
Add a mutex as an instance variable of the class. Initialize it in the constructor, lock it at the very beginning of every method, including the destructor, and unlock it at the end of the method. Adding a global mutex for all instances of the class (static member or just in global scope) may be a performance penalty.
There is also a very nice collection of lock-free containers (including maps) by Max Khiszinsky:
LibCDS¹ Concurrent Data Structures
Here is the documentation page:
http://libcds.sourceforge.net/doc/index.html
It can be kind of intimidating to get started, because it is fully generic and requires you register a chosen garbage collection strategy and initialize that. Of course, the threading library is configurable and you need to initialize that as well :)
See the following links for some getting started info:
initialization of CDS and the threading manager
http://sourceforge.net/projects/libcds/forums/forum/1034512/topic/4600301/
the unit tests (cd build && ./build.sh ----debug-test for a debug build)
Here is base template for 'main':
#include <cds/threading/model.h> // threading manager
#include <cds/gc/hzp/hzp.h>      // Hazard Pointer GC

int main()
{
    // Initialize CDS library
    cds::Initialize();

    // Initialize Garbage collector(s) that you use
    cds::gc::hzp::GarbageCollector::Construct();

    // Attach main thread
    // Note: it is needed if the main thread can access libcds containers
    cds::threading::Manager::attachThread();

    // Do some useful work
    ...

    // Finish main thread - detaches internal control structures
    cds::threading::Manager::detachThread();

    // Terminate GCs
    cds::gc::hzp::GarbageCollector::Destruct();

    // Terminate CDS library
    cds::Terminate();
}
Don't forget to attach any additional threads you are using:
#include <cds/threading/model.h>

int myThreadFunc(void *)
{
    // initialize libcds thread control structures
    cds::threading::Manager::attachThread();

    // Now, you can work with GCs and libcds containers
    ....

    // Finish working thread
    cds::threading::Manager::detachThread();
}
¹ (not to be confused with Google's compact data structures library)

Deleting pointer sometimes results in heap corruption

I have a multithreaded application that runs using a custom thread pool class. The threads all execute the same function, with different parameters.
These parameters are given to the threadpool class the following way:
// jobParams is a struct of int, double, etc...
jobParams* params = new jobParams;
params->value1 = 2;
params->value2 = 3;
int jobId = 0;
threadPool.addJob(jobId, params);
As soon as a thread has nothing to do, it gets the next parameters and runs the job function. I decided to take care of the deletion of the parameters in the threadpool class:
ThreadPool::~ThreadPool() {
    for (int i = 0; i < this->jobs.size(); ++i) {
        delete this->jobs[i].params;
    }
}
However, when doing so, I sometimes get a heap corruption error:
Invalid Address specified to RtlFreeHeap
The strange thing is that in one case it works perfectly, but in another program it crashes with this error. I tried deleting the pointer at other places: in the thread after the execution of the job function (I get the same heap corruption error) or at the end of the job function itself (no error in this case).
I don't understand how deleting the same pointers (I checked, the addresses are the same) from different places changes anything. Does this have anything to do with the fact that it's multithreaded?
I do have a critical section that handles the access to the parameters. I don't think the problem is about synchronized access. Anyway, the destructor is called only once all threads are done, and I don't delete any pointer anywhere else. Can a pointer be deleted automatically?
As for my code. The list of jobs is a queue of a structure, composed of the id of a job (used to be able to get the output of a specific job later) and the parameters.
getNextJob() is called by the threads (they have a pointer to the ThreadPool) each time they finish executing their last job.
void ThreadPool::addJob(int jobId, void* params) {
    jobData job; // jobData is a simple struct { int, void* }
    job.ID = jobId;
    job.params = params;

    // insert parameters in the list
    this->jobs.push(job);
}

jobData* ThreadPool::getNextJob() {
    // get the data of the next job
    jobData* job = NULL;

    // we don't want to start a same job twice,
    // so we make sure that we are only one at a time in this part
    WaitForSingleObject(this->mutex, INFINITE);

    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();
    }

    // we're done with the exclusive part !
    ReleaseMutex(this->mutex);

    return job;
}
Let's turn this on its head: Why are you using pointers at all?
class Params
{
    int value1, value2; // etc...
};

class ThreadJob
{
    int jobID; // or whatever...
    Params params;
};

class ThreadPool
{
    std::list<ThreadJob> jobs;

    void addJob(int job, const Params & p)
    {
        ThreadJob j(job, p);
        jobs.push_back(j);
    }
};
No new, delete or pointers... Obviously some of the implementation details may be cocked, but you get the overall picture.
Thanks for the extra code. Now we can see a problem - in getNextJob:
if (!this->jobs.empty())
{
    job = &(this->jobs.front());
    this->jobs.pop();
After the "pop", the memory pointed to by 'job' is undefined. Don't use a reference, copy the actual data!
Try something like this (it's still generic, because JobData is generic):
jobData ThreadPool::getNextJob() // get the data of the next job
{
    jobData job;

    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = (this->jobs.front());
        this->jobs.pop();
    }
    // we're done with the exclusive part !
    ReleaseMutex(this->mutex);

    return job;
}
Also, while you're adding jobs to the queue you must ALSO lock the mutex, to prevent list corruption. AFAIK std::lists are NOT inherently thread-safe...?
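A hedged sketch of what a locked addJob could look like, mirroring the question's addJob and reusing the same mutex as getNextJob():

void ThreadPool::addJob(int jobId, void* params) {
    jobData job; // jobData is a simple struct { int, void* }
    job.ID = jobId;
    job.params = params;

    WaitForSingleObject(this->mutex, INFINITE); // same mutex as getNextJob()
    this->jobs.push(job);
    ReleaseMutex(this->mutex);
}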
Using operator delete on pointer to void results in undefined behavior according to the specification.
Chapter 5.3.5 of the draft of the C++ specification. Paragraph 3.
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73)
And corresponding footnote.
This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void
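A minimal sketch of the fix, assuming every params pointer stored in the pool really points to a jobParams:

// Recover the real type before deleting, so the correct destructor
// runs and the behavior is defined.
void destroyParams(void* p) {
    delete static_cast<jobParams*>(p); // assumes p was allocated as a jobParams
}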
All access to the job queue must be synchronized, i.e. performed only from 1 thread at a time by locking the job queue prior to access. Do you already have a critical section or some similar pattern to guard the shared resource? Synchronization issues often lead to weird behaviour and bugs which are hard to reproduce.
It's hard to give a definitive answer with this amount of code. But generally speaking, multithreaded programming is all about synchronizing access to data that might be accessed from multiple threads. If there is no lock or other synchronization primitive protecting access to the threadpool class itself, then you can potentially have multiple threads reaching your deletion loop at the same time, at which point you're pretty much guaranteed to be double-freeing memory.
The reason you're getting no crash when you delete a job's params at the end of the job function might be because access to a single job's params is already implicitly serialized by your work queue. Or you might just be getting lucky. In either case, it's best to think about locks and synchronization primitives not as something that protects code, but as something that protects data (I've always thought the term "critical section" was a bit misleading here, as it tends to lead people to think of a 'section of lines of code' rather than in terms of data access). In this case, since you want to access your jobs data from multiple threads, you need to be protecting it via a lock or some other synchronization primitive.
If you try to delete an object twice, the second delete is undefined behavior, because that heap block has already been freed.
Now, since you are in a multithreading context, the deletions might be done "almost" in parallel, which might mask the error on the second deletion, because the first one is not yet finalized.
Use smart pointers or other RAII to handle your memory.
If you have access to the boost or tr1 libraries, you can do something like this.
class ThreadPool
{
    typedef pair<int, function<void (void)> > Job;

    struct MutexLocker {
        HANDLE& mutex;
        MutexLocker(HANDLE& mutex) : mutex(mutex) {
            WaitForSingleObject(mutex, INFINITE);
        }
        ~MutexLocker() {
            ReleaseMutex(mutex);
        }
    };

    list< Job > jobList;
    HANDLE mutex;

public:
    void addJob(int jobid, const function<void (void)>& job) {
        const MutexLocker locker(this->mutex); // guard the list on the producer side too
        jobList.push_back( make_pair(jobid, job) );
    }

    Job getNextJob() {
        Job job = make_pair(-1, function<void (void)>());

        const MutexLocker locker(this->mutex);
        if (!this->jobList.empty()) {
            job = this->jobList.front();
            this->jobList.pop_front();
        }
        return job;
    }
};
void workWithDouble( double value );
void workWithInt( int value );
void workWithValues( int, double );

void test() {
    ThreadPool pool;
    //...
    pool.addJob( 0, bind(&workWithDouble, 0.1));
    pool.addJob( 1, bind(&workWithInt, 1));
    pool.addJob( 2, bind(&workWithValues, 1, 0.1));
}