Wait-free or lock-free initialization - c++

I've been looking at lazy initialization (let's say- global variables, but it could be of anything). So far, what I've come up with is something like
enum state {
uninitialized,
initializing,
initialized
};
state s;
char memory[sizeof(T)];
T& initialize() {
auto val = compare_and_swap(&s, state::uninitialized, state::initializing);
if (val == initialized)
return *(T*)memory;
if (val == initializing) {
while(atomic_read(&s) != state::initialized);
return *(T*)memory;
}
new (memory) T();
atomic_write(&s, state::initialized);
return *(T*)memory;
}
In the case where it's already been initialized, then it's wait-free. But I've got a problem with the case where one thread is initializing. The number of steps required to finish initializing or wait for initialization to finish isn't proportional to the number of threads. But if the initializing thread is paused, the other threads all have to wait arbitrarily until it's resumed. So in the general case it's not lock-free or wait-free.
Is it possible to create lazy initialization which is wait-free or lock-free?

If you're willing to initialize more than one object, then you can make lockfree code by only storing a pointer:
std::atomic<T *> p { nullptr };
T & get()
{
T * q = p.load();
if (!q)
{
T * r = new T;
if (p.compare_exchange_strong(q, r))
{
return *r;
}
else
{
delete r;
return *q;
}
}
return *q;
}
The cost of lock-free algorithms is that you generally have to "try and fail", so you have to pay the price of trying locally even if you have to discard the result.
As you rightly pointed out, if only a single thread performs the initialization, you always depend on that single thread, and you cannot be lock-free.
You will need corresponding clean-up code, too, if the destructor has side effects:
delete p.exchange(nullptr);

I wouldnt have thought either is possible in the general case. If you did this as part of global initialization, before main is called and before the threads are created, then you stand a chance. How can it be wait-free or lock-free?
It could be lock-free if you allow the initialise to fail and retry later.
It cant be wait-free if another thread has the item.
If you care about the performance here, then you could use the thread id to hash into a bucket of items, to try and reduce the frequency of contention.

Related

When would getters and setters with mutex be thread safe?

Consider the following class:
class testThreads
{
private:
int var; // variable to be modified
std::mutex mtx; // mutex
public:
void set_var(int arg) // setter
{
std::lock_guard<std::mutex> lk(mtx);
var = arg;
}
int get_var() // getter
{
std::lock_guard<std::mutex> lk(mtx);
return var;
}
void hundred_adder()
{
for(int i = 0; i < 100; i++)
{
int got = get_var();
set_var(got + 1);
sleep(0.1);
}
}
};
When I create two threads in main(), each with a thread function of hundred_adder modifying the same variable var, the end result of the var is always different i.e. not 200 but some other number.
Conceptually speaking, why is this use of mutex with getter and setter functions not thread-safe? Do the lock-guards fail to prevent the race-condition to var? And what would be an alternative solution?
Thread a: get 0
Thread b: get 0
Thread a: set 1
Thread b: set 1
Lo and behold, var is 1 even though it should've been 2.
It should be obvious that you need to lock the whole operation:
for(int i = 0; i < 100; i++){
std::lock_guard<std::mutex> lk(mtx);
var += 1;
}
Alternatively, you could make the variable atomic (even a relaxed one could do in your case).
int got = get_var();
set_var(got + 1);
Your get_var() and set_var() themselves are thread safe. But this combined sequence of get_var() followed by set_var() is not. There is no mutex that protects this entire sequence.
You have multiple concurrent threads executing this. You have multiple threads calling get_var(). After the first one finishes it and unlocks the mutex, another thread can lock the mutex immediately and obtain the same value for got that the first thread did. There's absolutely nothing that prevents multiple threads from locking and obtaining the same got, concurrently.
Then both threads will call set_var(), updating the mutex-protected int to the same value.
That's just one possibility that can happen here. You could easily have multiple threads acquiring the mutex sequentially and thus incrementing var by several values, only to be followed by some other, stalled thread, that called get_var() several seconds ago, and only now getting around to calling set_var(), thus resetting var to a much smaller value.
The code show in thread-safe in a sense that it will never set or get partial value of the variable.
But your usage of the methods does not guarantee that value will correctly change: reading and writing from multiple threads can collide with each other. Both threads read the value (11), both increment it (to 12) and than both set to the same (12) - now you counted 2 but effectively incremented only once.
Option to fix:
provide "safe increment" operation
provide equivalent of InterlockedCompareExchange to make sure value you are updating correspond to original one and retry as necessary
wrap calling code into separate mutex or use other synchronization mechanism to prevent operations to intermix.
Why don't you just use std::atomic for the shared data (var in this case)? That will be more safe efficient.
This is an absolute classic.
One thread obtains the value of var, releases the mutex and another obtains the same value before the first thread has chance to update it.
Consequently the process risks losing increments.
There are three obvious solutions:
void testThreads::inc_var(){
std::lock_guard<std::mutex> lk(mtx);
++var;
}
That's safe because the mutex is held until the variable is updated.
Next up:
bool testThreads::compare_and_inc_var(int val){
std::lock_guard<std::mutex> lk(mtx);
if(var!=val) return false;
++var;
return true;
}
Then write code like:
int val;
do{
val=get_var();
}while(!compare_and_inc_var(val));
This works because the loop repeats until it confirms it's updating the value it read. This could result in live-lock though in this case it has to be transient because a thread can only fail to make progress because another does.
Finally replace int var with std::atomic<int> var and either use ++var or var.compare_exchange(val,val+1) or var.fetch_add(1); to update it.
NB: Notice compare_exchange(var,var+1) is invalid...
++ is guaranteed to be atomic on std::atomic<> types but despite 'looking' like a single operation in general no such guarantee exists for int.
std::atomic<> also provides appropriate memory barriers (and ways to hint what kind of barrier is needed) to ensure proper inter-thread communication.
std::atomic<> should be a wait-free, lock-free implementation where available. Check your documentation and the flag is_lock_free().

Using a mutex to block execution from outside the critical section

I'm not sure I got the terminology right but here goes - I have this function that is used by multiple threads to write data (using pseudo code in comments to illustrate what I want)
//these are initiated in the constructor
int* data;
std::atomic<size_t> size;
void write(int value) {
//wait here while "read_lock"
//set "write_lock" to "write_lock" + 1
auto slot = size.fetch_add(1, std::memory_order_acquire);
data[slot] = value;
//set "write_lock" to "write_lock" - 1
}
the order of the writes is not important, all I need here is for each write to go to a unique slot
Every once in a while though, I need one thread to read the data using this function
int* read() {
//set "read_lock" to true
//wait here while "write_lock"
int* ret = data;
data = new int[capacity];
size = 0;
//set "read_lock" to false
return ret;
}
so it basically swaps out the buffer and returns the old one (I've removed capacity logic to make the snippets shorter)
In theory this should lead to 2 operating scenarios:
1 - just a bunch of threads writing into the container
2 - when some thread executes the read function, all new writers will have to wait, the reader will wait until all existing writes are finished, it will then do the read logic and scenario 1 can continue.
The question part is that I don't know what kind of a barrier to use for the locks -
A spinlock would be wasteful since there are many containers like this and they all need cpu cycles
I don't know how to apply std::mutex since I only want the write function to be in a critical section if the read function is triggered. Wrapping the whole write function in a mutex would cause unnecessary slowdown for operating scenario 1.
So what would be the optimal solution here?
If you have C++14 capability then you can use a std::shared_timed_mutex to separate out readers and writers. In this scenario it seems you need to give your writer threads shared access (allowing other writer threads at the same time) and your reader threads unique access (kicking all other threads out).
So something like this may be what you need:
class MyClass
{
public:
using mutex_type = std::shared_timed_mutex;
using shared_lock = std::shared_lock<mutex_type>;
using unique_lock = std::unique_lock<mutex_type>;
private:
mutable mutex_type mtx;
public:
// All updater threads can operate at the same time
auto lock_for_updates() const
{
return shared_lock(mtx);
}
// Reader threads need to kick all the updater threads out
auto lock_for_reading() const
{
return unique_lock(mtx);
}
};
// many threads can call this
void do_writing_work(std::shared_ptr<MyClass> sptr)
{
auto lock = sptr->lock_for_updates();
// update the data here
}
// access the data from one thread only
void do_reading_work(std::shared_ptr<MyClass> sptr)
{
auto lock = sptr->lock_for_reading();
// read the data here
}
The shared_locks allow other threads to gain a shared_lock at the same time but prevent a unique_lock gaining simultaneous access. When a reader thread tries to gain a unique_lock all shared_locks will be vacated before the unique_lock gets exclusive control.
You can also do this with regular mutexes and condition variables rather than shared. Supposedly shared_mutex has higher overhead, so I'm not sure which will be faster. With Gallik's solution you'd presumably be paying to lock the shared mutex on every write call; I got the impression from your post that write gets called way more than read so maybe this is undesirable.
int* data; // initialized somewhere
std::atomic<size_t> size = 0;
std::atomic<bool> reading = false;
std::atomic<int> num_writers = 0;
std::mutex entering;
std::mutex leaving;
std::condition_variable cv;
void write(int x) {
++num_writers;
if (reading) {
--num_writers;
if (num_writers == 0)
{
std::lock_guard l(leaving);
cv.notify_one();
}
{ std::lock_guard l(entering); }
++num_writers;
}
auto slot = size.fetch_add(1, std::memory_order_acquire);
data[slot] = x;
--num_writers;
if (reading && num_writers == 0)
{
std::lock_guard l(leaving);
cv.notify_one();
}
}
int* read() {
int* other_data = new int[capacity];
{
std::unique_lock enter_lock(entering);
reading = true;
std::unique_lock leave_lock(leaving);
cv.wait(leave_lock, [] () { return num_writers == 0; });
swap(data, other_data);
size = 0;
reading = false;
}
return other_data;
}
It's a bit complicated and took me some time to work out, but I think this should serve the purpose pretty well.
In the common case where only writing is happening, reading is always false. So you do the usual, and pay for two additional atomic increments and two untaken branches. So the common path does not need to lock any mutexes, unlike the solution involving a shared mutex, this is supposedly expensive: http://permalink.gmane.org/gmane.comp.lib.boost.devel/211180.
Now, suppose read is called. The expensive, slow heap allocation happens first, meanwhile writing continues uninterrupted. Next, the entering lock is acquired, which has no immediate effect. Now, reading is set to true. Immediately, any new calls to write enter the first branch, and eventually hit the entering lock which they are unable to acquire (as its already taken), and those threads then get put to sleep.
Meanwhile, the read thread is now waiting on the condition that the number of writers is 0. If we're lucky, this could actually go through right away. If however there are threads in write in either of the two locations between incrementing and decrementing num_writers, then it will not. Each time a write thread decrements num_writers, it checks if it has reduced that number to zero, and when it does it will signal the condition variable. Because num_writers is atomic which prevents various reordering shenanigans, it is guaranteed that the last thread will see num_writers == 0; it could also be notified more than once but this is ok and cannot result in bad behavior.
Once that condition variable has been signalled, that shows that all writers are either trapped in the first branch or are done modifying the array. So the read thread can now safely swap the data, and then unlock everything, and then return what it needs to.
As mentioned before, in typical operation there are no locks, just increments and untaken branches. Even when a read does occur, the read thread will have one lock and one condition variable wait, whereas a typical write thread will have about one lock/unlock of a mutex and that's all (one, or a small number of write threads, will also perform a condition variable notification).

Do I need to use volatile keyword if I declare a variable between mutexes and return it?

Let's say I have the following function.
std::mutex mutex;
int getNumber()
{
mutex.lock();
int size = someVector.size();
mutex.unlock();
return size;
}
Is this a place to use volatile keyword while declaring size? Will return value optimization or something else break this code if I don't use volatile? The size of someVector can be changed from any of the numerous threads the program have and it is assumed that only one thread (other than modifiers) calls getNumber().
No. But beware that the size may not reflect the actual size AFTER the mutex is released.
Edit:If you need to do some work that relies on size being correct, you will need to wrap that whole task with a mutex.
You haven't mentioned what the type of the mutex variable is, but assuming it is an std::mutex (or something similar meant to guarantee mutual exclusion), the compiler is prevented from performing a lot of optimizations. So you don't need to worry about return value optimization or some other optimization allowing the size() query from being performed outside of the mutex block.
However, as soon as the mutex lock is released, another waiting thread is free to access the vector and possibly mutate it, thus changing the size. Now, the number returned by your function is outdated. As Mats Petersson mentions in his answer, if this is an issue, then the mutex lock needs to be acquired by the caller of getNumber(), and held until the caller is done using the result. This will ensure that the vector's size does not change during the operation.
Explicitly calling mutex::lock followed by mutex::unlock quickly becomes unfeasible for more complicated functions involving exceptions, multiple return statements etc. A much easier alternative is to use std::lock_guard to acquire the mutex lock.
int getNumber()
{
std::lock_guard<std::mutex> l(mutex); // lock is acquired
int size = someVector.size();
return size;
} // lock is released automatically when l goes out of scope
Volatile is a keyword that you use to tell the compiler to literally actually write or read the variable and not to apply any optimizations. Here is an example
int example_function() {
int a;
volatile int b;
a = 1; // this is ignored because nothing reads it before it is assigned again
a = 2; // same here
a = 3; // this is the last one, so a write takes place
b = 1; // b gets written here, because b is volatile
b = 2; // and again
b = 3; // and again
return a + b;
}
What is the real use of this? I've seen it in delay functions (keep the CPU busy for a bit by making it count up to a number) and in systems where several threads might look at the same variable. It can sometimes help a bit with multi-threaded things, but it isn't really a threading thing and is certainly not a silver bullet

Confusion about implementation error within shared_ptr destructor

I have just seen Herb Sutter's talk: C++ and Beyond 2012: Herb Sutter - atomic<> Weapons, 2 of 2
He shows bug in implementation of std::shared_ptr destructor:
if( control_block_ptr->refs.fetch_sub(1, memory_order_relaxed ) == 0 )
delete control_block_ptr; // B
He says, that due to memory_order_relaxed, delete can be placed before fetch_sub.
At 1:25:18 - Release doesn't keep line B below, where it should be
How that is possible? There is happens-before / sequenced-before relationship, because they are both in single thread. I might be wrong, but there is also carries-a-dependency-to between fetch_sub and delete.
If he is right, which ISO items support that?
Imagine a code that releases a shared pointer:
auto tmp = &(the_ptr->a);
*tmp = 10;
the_ptr.dec_ref();
If dec_ref() doesn't have a "release" semantic, it's perfectly fine for a compiler (or CPU) to move things from before dec_ref() to after it (for example):
auto tmp = &(the_ptr->a);
the_ptr.dec_ref();
*tmp = 10;
And this is not safe, since dec_ref() also can be called from other thread in the same time and delete the object.
So, it must have a "release" semantic for things before dec_ref() to stay there.
Lets now imagine that object's destructor looks like this:
~object() {
auto xxx = a;
printf("%i\n", xxx);
}
Also we will modify example a bit and will have 2 threads:
// thread 1
auto tmp = &(the_ptr->a);
*tmp = 10;
the_ptr.dec_ref();
// thread 2
the_ptr.dec_ref();
Then, the "aggregated" code will look like:
// thread 1
auto tmp = &(the_ptr->a);
*tmp = 10;
{ // the_ptr.dec_ref();
if (0 == atomic_sub(...)) {
{ //~object()
auto xxx = a;
printf("%i\n", xxx);
}
}
}
// thread 2
{ // the_ptr.dec_ref();
if (0 == atomic_sub(...)) {
{ //~object()
auto xxx = a;
printf("%i\n", xxx);
}
}
}
However, if we only have a "release" semantic for atomic_sub(), this code can be optimized that way:
// thread 2
auto xxx = the_ptr->a; // "auto xxx = a;" from destructor moved here
{ // the_ptr.dec_ref();
if (0 == atomic_sub(...)) {
{ //~object()
printf("%i\n", xxx);
}
}
}
But that way, destructor will not always print the last value of "a" (this code is not race free anymore). That's why we also need acquire semantic for atomic_sub (or, strictly speaking, we need an acquire barrier when counter becomes 0 after decrement).
Looks like he is talking about synchronization of actions on shared object itself, which are not shown on his code blocks (and as the result - confusing).
That's why he put acq_rel - because all actions on the object should happens before its destruction, all in order.
But I'm still not sure why he talks about swapping delete with fetch_sub.
This is a late reply.
Let's start out with this simple type:
struct foo
{
~foo() { std::cout << value; }
int value;
};
And we'll use this type in a shared_ptr, as follows:
void runs_in_separate_thread(std::shared_ptr<foo> my_ptr)
{
my_ptr->value = 5;
my_ptr.reset();
}
int main()
{
std::shared_ptr<foo> my_ptr(new foo);
std::async(std::launch::async, runs_in_separate_thread, my_ptr);
my_ptr.reset();
}
Two threads will be running in parallel, both sharing ownership of a foo object.
With a correct shared_ptr implementation
(that is, one with memory_order_acq_rel), this program has defined behavior.
The only value that this program will print is 5.
With an incorrect implementation (using memory_order_relaxed) there
are no such guarantees. The behavior is undefined because a data race of
foo::value is introduced. The trouble occurs only for cases when the destructor
gets called in the main thread. With a relaxed memory order, the write
to foo::value in the other thread may not propagate to the destructor in the main thread.
A value other than 5 could be printed.
So what's a data race? Well, check out the definition and pay attention to the last bullet point:
When an evaluation of an expression writes to a memory location and another evaluation reads or modifies the same memory location, the expressions are said to conflict. A program that has two conflicting evaluations has a data race unless either
both conflicting evaluations are atomic operations (see std::atomic)
one of the conflicting evaluations happens-before another (see std::memory_order)
In our program, one thread will write to foo::value and one thread will
read from foo::value. These are supposed to be sequential; the write
to foo::value should always happen before the read. Intuitively, it
makes sense that they would be as the destructor is supposed to be the last
thing that happens to an object.
memory_order_relaxed does not offer such ordering guarantees though and so memory_order_acq_rel is required.
In the talk Herb shows memory_order_release not memory_order_relaxed, but relaxed would have even more problems.
Unless delete control_block_ptr accesses control_block_ptr->refs (which it probably doesn't) then the atomic operation does not carry-a-dependency-to the delete. The delete operation might not touch any memory in the control block, it might just return that pointer to the freestore allocator.
But I'm not sure if Herb is talking about the compiler moving the delete before the atomic operation, or just referring to when the side effects become visible to other threads.

Need some advice to make the code multithreaded

I received a code that is not for multi-threaded app, now I have to modify the code to support for multi-threaded.
I have a Singleton class(MyCenterSigltonClass) that based on instruction in:
http://en.wikipedia.org/wiki/Singleton_pattern
I made it thread-safe
Now I see inside the class that contains 10-12 members, some with getter/setter methods.
Some members are declared as static and are class pointer like:
static Class_A* f_static_member_a;
static Class_B* f_static_member_b;
for these members, I defined a mutex(like mutex_a) INSIDE the class(Class_A) , I didn't add the mutex directly in my MyCenterSigltonClass, the reason is they are one to one association with my MyCenterSigltonClass, I think I have option to define mutex in the class(MyCenterSigltonClass) or (Class_A) for f_static_member_a.
1) Am I right?
Also, my Singleton class(MyCenterSigltonClass) contains some other members like
Class_C f_classC;
for these kind of member variables, should I define a mutex for each of them in MyCenterSigltonClass to make them thread-safe? what would be a good way to handle these cases?
Appreciate for any suggestion.
-Nima
Whether the members are static or not doesn't really matter. How you protect the member variables really depends on how they are accessed from public methods.
You should think about a mutex as a lock that protects some resource from concurrent read/write access. You don't need to think about protecting the internal class objects necessarily, but the resources within them. You also need to consider the scope of the locks you'll be using, especially if the code wasn't originally designed to be multithreaded. Let me give a few simple examples.
class A
{
private:
int mValuesCount;
int* mValues;
public:
A(int count, int* values)
{
mValuesCount = count;
mValues = (count > 0) ? new int[count] : NULL;
if (mValues)
{
memcpy(mValues, values, count * sizeof(int));
}
}
int getValues(int count, int* values) const
{
if (mValues && values)
{
memcpy(values, mValues, (count < mValuesCount) ? count : mValuesCount);
}
return mValuesCount;
}
};
class B
{
private:
A* mA;
public:
B()
{
int values[5] = { 1, 2, 3, 4, 5 };
mA = new A(5, values);
}
const A* getA() const { return mA; }
};
In this code, there's no need to protect mA because there's no chance of conflicting access across multiple threads. None of the threads can modify the state of mA, so all concurrent access just reads from mA. However, if we modify class A:
class A
{
private:
int mValuesCount;
int* mValues;
public:
A(int count, int* values)
{
mValuesCount = 0;
mValues = NULL;
setValues(count, values);
}
int getValues(int count, int* values) const
{
if (mValues && values)
{
memcpy(values, mValues, (count < mValuesCount) ? count : mValuesCount);
}
return mValuesCount;
}
void setValues(int count, int* values)
{
delete [] mValues;
mValuesCount = count;
mValues = (count > 0) ? new int[count] : NULL;
if (mValues)
{
memcpy(mValues, values, count * sizeof(int));
}
}
};
We can now have multiple threads calling B::getA() and one thread can read from mA while another thread writes to mA. Consider the following thread interaction:
Thread A: a->getValues(maxCount, values);
Thread B: a->setValues(newCount, newValues);
It's possible that Thread B will delete mValues while Thread A is in the middle of copying it. In this case, you would need a mutex within class A to protect access to mValues and mValuesCount:
int getValues(int count, int* values) const
{
// TODO: Lock mutex.
if (mValues && values)
{
memcpy(values, mValues, (count < mValuesCount) ? count : mValuesCount);
}
int returnCount = mValuesCount;
// TODO: Unlock mutex.
return returnCount;
}
void setValues(int count, int* values)
{
// TODO: Lock mutex.
delete [] mValues;
mValuesCount = count;
mValues = (count > 0) ? new int[count] : NULL;
if (mValues)
{
memcpy(mValues, values, count * sizeof(int));
}
// TODO: Unlock mutex.
}
This will prevent concurrent read/write on mValues and mValuesCount. Depending on the locking mechanisms available in your environment, you may be able to use a read-only locking mechanism in getValues() to prevent multiple threads from blocking on concurrent read access.
However, you'll also need to understand the scope of the locking you need to implement if class A is more complex:
class A
{
private:
int mValuesCount;
int* mValues;
public:
A(int count, int* values)
{
mValuesCount = 0;
mValues = NULL;
setValues(count, values);
}
int getValueCount() const { return mValuesCount; }
int getValues(int count, int* values) const
{
if (mValues && values)
{
memcpy(values, mValues, (count < mValuesCount) ? count : mValuesCount);
}
return mValuesCount;
}
void setValues(int count, int* values)
{
delete [] mValues;
mValuesCount = count;
mValues = (count > 0) ? new int[count] : NULL;
if (mValues)
{
memcpy(mValues, values, count * sizeof(int));
}
}
};
In this case, you could have the following thread interaction:
Thread A: int maxCount = a->getValueCount();
Thread A: // allocate memory for "maxCount" int values
Thread B: a->setValues(newCount, newValues);
Thread A: a->getValues(maxCount, values);
Thread A has been written as though calls to getValueCount() and getValues() will be an uninterrupted operation, but Thread B has potentially changed the count in the middle of Thread A's operations. Depending on whether the new count is larger or smaller than the original count, it may take a while before you discover this problem. In this case, class A would need to be redesigned or it would need to provide some kind of transaction support so the thread using class A could block/unblock other threads:
Thread A: a->lockValues();
Thread A: int maxCount = a->getValueCount();
Thread A: // allocate memory for "maxCount" int values
Thread B: a->setValues(newCount, newValues); // Blocks until Thread A calls unlockValues()
Thread A: a->getValues(maxCount, values);
Thread A: a->unlockValues();
Thread B: // Completes call to setValues()
Since the code wasn't initially designed to be multithreaded, it's very likely you'll run into these kinds of issues where one method call uses information from an earlier call, but there was never a concern for the state of the object changing between those calls.
Now, begin to imagine what could happen if there are complex state dependencies among the objects within your singleton and multiple threads can modify the state of those internal objects. It can all become very, very messy with a large number of threads and debugging can become very difficult.
So as you try to make your singleton thread-safe, you need to look at several layers of object interactions. Some good questions to ask:
Do any of the methods on the singleton reveal internal state that may change between method calls (as in the last example I mention)?
Are any of the internal objects revealed to clients of the singleton?
If so, do any of the methods on those internal objects reveal internal state that may change between method calls?
If internal objects are revealed, do they share any resources or state dependencies?
You may not need any locking if you're just reading state from internal objects (first example). You may need to provide simple locking to prevent concurrent read/write access (second example). You may need to redesign the classes or provide clients with the ability to lock object state (third example). Or you may need to implement more complex locking where internal objects share state information across threads (e.g. a lock on a resource in class Foo requires a lock on a resource in class Bar, but locking that resource in class Bar doesn't necessarily require a lock on a resource in class Foo).
Implementing thread-safe code can become a complex task depending on how all your objects interact. It can be much more complicated than the examples I've given here. Just be sure you clearly understand how your classes are used and how they interact (and be prepared to spend some time tracking down difficult to reproduce bugs).
If this is the first time you're doing threading, consider not accessing the singleton from the background thread. You can get it right, but you probably won't get it right the first time.
Realize that if your singleton exposes pointers to other objects, these should be made thread safe as well.
You don't have to define a mutex for each member. For example, you could instead use a single mutex to synchronize access each to member, e.g.:
class foo
{
public:
...
void some_op()
{
// acquire "lock_" and release using RAII ...
Lock(lock_);
a++;
}
void set_b(bar * b)
{
// acquire "lock_" and release using RAII ...
Lock(lock_);
b_ = b;
}
private:
int a_;
bar * b_;
mutex lock_;
}
Of course a "one lock" solution may be not suitable in your case. That's up to you to decide. Regardless, simply introducing locks doesn't make the code thread-safe. You have to use them in the right place in the right way to avoid race conditions, deadlocks, etc. There are lots of concurrency issues you could run in to.
Furthermore you don't always need mutexes, or other threading mechanisms like TSS, to make code thread-safe. For example, the following function "func" is thread-safe:
class Foo;
void func (Foo & f)
{
f.some_op(); // Foo::some_op() of course needs to be thread-safe.
}
// Thread 1
Foo a;
func(&a);
// Thread 2
Foo b;
func(&b);
While the func function above is thread-safe the operations it invokes may not be thread-safe. The point is you don't always need to pepper your code with mutexes and other threading mechanisms to make the code thread safe. Sometimes restructuring the code is sufficient.
There's a lot of literature on multithreaded programming. It's definitely not easy to get right so take your time in understanding the nuances, and take advantage of existing frameworks like Boost.Thread to mitigate some of the inherent and accidental complexities that exist in the lower-level multithreading APIs.
I'd really recommend the Interlocked.... Methods to increment, decrement and CompareAndSwap values when using code that needs to be multi-thread-aware. I don't have 1st-hand C++ experience but a quick search for http://www.bing.com/search?q=c%2B%2B+interlocked reveals lots of confirming advice. If you need perf, these will likely be faster than locking.
As stated by #Void a mutex alone is not always the solution to a concurrency problem:
Regardless, simply introducing locks doesn't make the code
thread-safe. You have to use them in the right place in the right way
to avoid race conditions, deadlocks, etc. There are lots of
concurrency issues you could run in to.
I want to add another example:
class MyClass
{
mutex m_mutex;
AnotherClass m_anotherClass;
void setObject(AnotherClass& anotherClass)
{
m_mutex.lock();
m_anotherClass = anotherClass;
m_mutex.unlock();
}
AnotherClass getObject()
{
AnotherClass anotherClass;
m_mutex.lock();
anotherClass = m_anotherClass;
m_mutex.unlock();
return anotherClass;
}
}
In this case the getObject() method is always safe because is protected with mutex and you have a copy of the object which is returned to the caller which may be a different class and thread. This means you are working on a copy which might be old (in the meantime another thread might have changed the m_anotherClass by calling setObject() ).Now what if you turn m_anotherClass to a pointer instead of an object-variable ?
class MyClass
{
mutex m_mutex;
AnotherClass *m_anotherClass;
void setObject(AnotherClass *anotherClass)
{
m_mutex.lock();
m_anotherClass = anotherClass;
m_mutex.unlock();
}
AnotherClass * getObject()
{
AnotherClass *anotherClass;
m_mutex.lock();
anotherClass = m_anotherClass;
m_mutex.unlock();
return anotherClass;
}
}
This is an example where a mutex is not enough to solve all the problems.
With pointers you can have a copy only of the pointer but the pointed object is the same in the both the caller and the method. So even if the pointer was valid at the time that the getObject() was called you don't have any guarantee that the pointed value will exists during the operation you are performing with it. This is simply because you don't have control on the object lifetime. That's why you should use object-variables as much as possible and avoid pointers (if you can).