About shared_mutex and shared_ptr across multiple threads - c++

I implemented code in which multiple instances running on different threads read each other's data using a reader-writer lock and shared_ptr. It seems fine, but I am not 100% sure about that, and I have some questions about the usage of these.
Detail
I have multiple instances of a class called Chunk, and each instance does some calculations in a dedicated thread. A chunk needs to read its neighbour chunks' data as well as its own, but it never writes its neighbours' data, so a reader-writer lock is used. Also, neighbours can be set at runtime. For example, I might want to set a different neighbour chunk at runtime, or sometimes just nullptr. It is possible to delete a chunk at runtime, too. Raw pointers could be used, but I thought shared_ptr and weak_ptr were better for keeping track of lifetimes: my own data in a shared_ptr and neighbours' data in weak_ptr.
I provided a simpler version of my code below. ChunkData holds the data and a mutex for it. I use InitData for data initialization, and DoWork is called in a dedicated thread after that. The other functions can be called from the main thread.
This seems to work, but I am not so confident, especially about the use of shared_ptr across multiple threads.
What happens if one thread calls shared_ptr's reset() (in the destructor and InitData) while another uses it via weak_ptr's lock() (in DoWork)? Does this need a lock on dataMutex or chunkMutex?
How about the copy (in SetNeighbour)? Do I need locks for that as well?
I think the other parts are OK, but please let me know if you find anything dangerous. I'd appreciate it.
By the way, I considered storing a shared_ptr to Chunk instead of ChunkData, but decided against it because internal code that I don't manage has a GC system and can delete a pointer to Chunk when I don't expect it.
class Chunk
{
public:
class ChunkData
{
public:
shared_mutex dataMutex; // mutex to read/write data
int* data;
int size;
ChunkData() : data(nullptr), size(0) { }
~ChunkData()
{
if (data)
{
delete[] data;
data = nullptr;
}
}
};
private:
mutex chunkMutex; // mutex to read/write member variables
shared_ptr<ChunkData> chunkData;
weak_ptr<ChunkData> neighbourChunkData;
string result;
string name;
public:
Chunk(string _name)
: chunkData(make_shared<ChunkData>())
, name(_name)
{
}
~Chunk()
{
EndProcess();
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset();
}
void InitData(int size)
{
ChunkData* NewData = new ChunkData();
NewData->size = size;
NewData->data = new int[size];
{
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset(NewData);
cout << "init chunk " << name << endl;
}
}
// This is executed in other thread. e.g. thread t(&Chunk::DoWork, this);
void DoWork()
{
lock_guard lock(chunkMutex); // we modify some members such as result(string) reading chunk data, so need this.
if (chunkData)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data)
{
// read chunkData->data[i] and modify some members such as result(string)
for (int i = 0; i < chunkData->size; ++i)
{
// Is this fine, or should I write data result outside of readLock scope?
result += to_string(chunkData->data[i]) + " ";
}
}
}
// does this work?
if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock())
{
shared_lock readLock(neighbour->dataMutex);
if (neighbour->data)
{
// read neighbour->data[i] and modify some members as above
}
}
}
shared_ptr<ChunkData> GetChunkData()
{
unique_lock lock(chunkMutex);
return chunkData;
}
void SetNeighbour(Chunk* neighbourChunk)
{
if (neighbourChunk)
{
// safe?
shared_ptr<ChunkData> newNeighbourData = neighbourChunk->GetChunkData();
unique_lock lock(chunkMutex); // lock for chunk properties
{
shared_lock readLock(newNeighbourData->dataMutex); // not sure if this is needed.
neighbourChunkData = newNeighbourData;
}
}
}
int GetDataAt(int index)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
};
Edit 1
I added more detail for DoWork function. Chunk data is read and chunk's member variables are edited in the function.
After Homer512's answer, I came up with other questions.
A) In the DoWork function I write a member variable inside a read lock. Should I only read data in a read-lock scope, and if I need to modify other data based on the data I read, do I have to do it outside the read lock? For example, copy the whole array to a local variable inside a read lock, then modify other members outside the read lock using the local copy.
B) I followed Homer512 and modified GetDataAt/SetDataAt as below. I read/write-lock chunkData->dataMutex before unlocking chunkMutex; I also do this in the DoWork function. Should I instead take the locks separately? For example, make a local shared_ptr variable, assign chunkData to it under a chunkMutex lock, unlock, then finally read/write-lock that local variable's dataMutex and read/write the data.
int GetDataAt(int index)
{
lock_guard chunkLock(chunkMutex);
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
lock_guard chunkLock(chunkMutex);
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
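For illustration, the "separate locks" variant from question B could be sketched like this. The names (Data, Owner, objMutex) are hypothetical, not from the code above; the point is that the object mutex is released before the data mutex is taken, so the two locks are never held together, and the local shared_ptr copy keeps the data alive after the object mutex is released.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Sketch of question B's alternative: snapshot the shared_ptr under the
// object's mutex, release that mutex, then lock only the data's own mutex.
struct Data {
    std::shared_mutex dataMutex;
    std::vector<int> values;
};

class Owner {
    std::mutex objMutex;                                  // guards the pointer itself
    std::shared_ptr<Data> data = std::make_shared<Data>();
public:
    void set(int v)
    {
        std::shared_ptr<Data> snap;
        {
            std::lock_guard<std::mutex> lock(objMutex);
            snap = data;                                  // copy keeps Data alive
        }                                                 // objMutex released here
        std::unique_lock<std::shared_mutex> w(snap->dataMutex);
        snap->values.push_back(v);
    }
    int get(std::size_t i)
    {
        std::shared_ptr<Data> snap;
        {
            std::lock_guard<std::mutex> lock(objMutex);
            snap = data;
        }
        std::shared_lock<std::shared_mutex> r(snap->dataMutex);
        return i < snap->values.size() ? snap->values[i] : 0;
    }
};
```

A trade-off of this pattern: because the pointer read and the data access are no longer one atomic step, the snapshot may point at data that a concurrent InitData has already replaced, so readers can briefly see stale data.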

I have several remarks:
~ChunkData: You could change your data member from int* to unique_ptr<int[]> to get the same result without an explicit destructor. Your code is correct though, just less convenient.
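A minimal sketch of that suggestion, using the question's member names (allocate is an illustrative helper, not part of the original code):

```cpp
#include <memory>
#include <shared_mutex>

// Sketch of ChunkData holding std::unique_ptr<int[]>: the array is freed
// automatically, so no user-declared destructor is needed.
class ChunkData {
public:
    std::shared_mutex dataMutex;  // guards data and size
    std::unique_ptr<int[]> data;  // replaces int* plus manual delete[]
    int size = 0;

    void allocate(int n)
    {
        data = std::make_unique<int[]>(n); // value-initialises (zeroes) the array
        size = n;
    }
};
```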
~Chunk: I don't think you need a lock or call the reset method. By the time the destructor runs, by definition, no one should have a reference to the Chunk object. So the lock can never be contested. And reset is unnecessary because the shared_ptr destructor will handle that.
InitData: Yes, the lock is needed because InitData can race with DoWork. You could avoid this by moving InitData to the constructor but I assume there are reasons for this division. You could also change the shared_ptr to std::atomic<std::shared_ptr<ChunkData> > to avoid the lock.
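As a sketch of that lock-free alternative: since C++11 the free functions std::atomic_load / std::atomic_store work on a shared_ptr (they are superseded by std::atomic<std::shared_ptr<T>> in C++20). The Payload/Holder names here are illustrative, not from the question:

```cpp
#include <memory>

// Sketch: publishing a new payload without a mutex via the atomic
// shared_ptr free functions. A writer swaps in a freshly built object;
// readers take an atomic snapshot that keeps the old object alive.
struct Payload { int value = 0; };

class Holder {
    std::shared_ptr<Payload> current;
public:
    void publish(int v)
    {
        auto fresh = std::make_shared<Payload>();
        fresh->value = v;
        std::atomic_store(&current, fresh);   // atomic pointer swap
    }
    int read() const
    {
        // atomic_load accepts a pointer to const shared_ptr
        std::shared_ptr<Payload> snap = std::atomic_load(&current);
        return snap ? snap->value : -1;       // -1: nothing published yet
    }
};
```

Note that only the pointer swap is atomic; the Payload contents must still be immutable after publication (or protected separately), which matches the "build fully, then publish" pattern in InitData above.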
It is more efficient to write InitData like this:
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
// deletes old chunkData outside locked region if it was initialized before
}
make_shared avoids an additional memory allocation for the reference counter. This also moves all allocations and deallocations out of the critical section.
DoWork: Your comment "read chunkData->data[i] and modify some members". You only take a shared_lock but say that you modify members. Well, which is it, reading or writing? Or do you mean that you modify Chunk but not ChunkData, with Chunk being protected by its own mutex?
SetNeighbour: You need to lock both your own chunkMutex and the neighbour's. You should not hold both at the same time, to avoid the dining philosophers problem (though std::lock would also solve this).
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
std::shared_ptr<ChunkData> newNeighbourData;
{
std::lock_guard<std::mutex> lock(neighbourChunk->chunkMutex);
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> lock(this->chunkMutex);
this->neighbourChunkData = newNeighbourData;
}
GetDataAt and SetDataAt: You need to lock chunkMutex. Otherwise you might race with InitData. There is no need to use std::lock because the order of locks is never swapped around.
EDIT 1:
DoWork: The line if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock()) doesn't keep the neighbour alive beyond the if block. Move the variable declaration out of the if to keep the reference.
EDIT: Alternative design proposal
What I'm bothered with is that your DoWork may be unable to proceed if InitData is still running or waiting to run. How do you want to deal with this? I suggest you make it possible to wait until the work can be done. Something like this:
class Chunk
{
std::mutex chunkMutex;
std::shared_ptr<ChunkData> chunkData;
std::weak_ptr<ChunkData> neighbourChunkData;
std::condition_variable chunkSet;
void waitForChunk(std::unique_lock<std::mutex>& lock)
{
while(! chunkData)
chunkSet.wait(lock);
}
public:
// modified version of my code above
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
chunkSet.notify_all();
}
void DoWork()
{
std::unique_lock<std::mutex> ownLock(chunkMutex);
waitForChunk(ownLock); // blocks until other thread finishes InitData
{
shared_lock readLock(chunkData->dataMutex);
...
}
shared_ptr<ChunkData> neighbour = neighbourChunkData.lock();
if(! neighbour)
return;
shared_lock readLock(neighbour->dataMutex);
...
}
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
shared_ptr<ChunkData> newNeighbourData;
{
std::unique_lock<std::mutex> lock(neighbourChunk->chunkMutex);
neighbourChunk->waitForChunk(lock); // wait until neighbor has finished InitData
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> ownLock(this->chunkMutex);
this->neighbourChunkData = std::move(newNeighbourData);
}
};
The downside to this is that you could deadlock if InitData is never called or if it failed with an exception. There are ways around this, like using an std::shared_future which knows that it is valid (set when InitData is scheduled) and whether it failed (records exception of associated promise or packaged_task).
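That std::shared_future idea might be sketched as follows. This is illustrative only (startInit and the Data alias are invented names): the future becomes ready when initialisation completes, get() can be called by any number of waiters, and an exception thrown during initialisation would be stored and rethrown to every waiter instead of deadlocking them.

```cpp
#include <future>
#include <memory>
#include <vector>

// Sketch: a shared_future<shared_ptr<...>> as the handle to the data.
// Consumers block in get() until initialisation finishes; a failure in
// the initialiser is rethrown to every waiter rather than hanging them.
using Data = std::vector<int>;

std::shared_future<std::shared_ptr<Data>> startInit(int size)
{
    return std::async(std::launch::async, [size] {
        // If this throws, the exception is captured in the shared state.
        return std::make_shared<Data>(size, 0);
    }).share();
}
```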

Related

Destructor: when an object's dynamic variable is locked by a mutex, will it not be freed?

I'm trying to solve a somewhat complicated (for me, at least) asynchronous scenario, but I think it will be better to understand a simpler case first.
Consider an object that holds allocated memory in a member variable:
#include <thread>
#include <mutex>
using namespace std;
mutex mu;
class Object
{
public:
char *var;
Object()
{
var = new char[1]; var[0] = 1;
}
~Object()
{
mu.lock();
delete[]var; // destructor should free all dynamic memory on it's own, as I remember
mu.unlock();
}
}*object = nullptr;
int main()
{
object = new Object();
return 0;
}
What if, while its var variable is being used in a detached (i.e. asynchronous) thread, the object is deleted in another thread?
void do_something()
{
for(;;)
{
mu.lock();
if(object)
if(object->var[0] < 255)
object->var[0]++;
else
object->var[0] = 0;
mu.unlock();
}
}
int main()
{
object = new Object();
thread th(do_something);
th.detach();
Sleep(1000);
delete object;
object = nullptr;
return 0;
}
Is it possible that var will not be deleted in the destructor?
Do I use the mutex with detached threads correctly in the code above?
2.1 Do I also need to cover the delete object line with mutex::lock and mutex::unlock?
I also want to stress, once again, that I need the new thread to be asynchronous. I do not want the main thread to hang while the new one is running; I need two threads running at once.
P.S. From the comments and answers, the most important thing I finally understood is the mutex. My biggest mistake was thinking that an already-locked mutex skips the code between lock and unlock.
Forget about shared variables; the mutex itself has nothing to do with them. A mutex is just a mechanism for safely pausing threads:
mutex mu;
void a()
{
mu.lock();
Sleep(1000);
mu.unlock();
}
int main()
{
thread th(a);
th.detach();
mu.lock(); // hangs here, until mu.unlock from a() will be called
mu.unlock();
return;
}
The concept is extremely simple: a mutex object has (imagine) a flag isLocked. When any thread calls the lock method and isLocked is false, it just sets isLocked to true. But if isLocked is already true, the mutex somehow, at a low level, suspends the thread that called lock until isLocked becomes false. You can find part of the source code of the lock method by scrolling down this page. Instead of a mutex, it seems a plain bool variable could be used, but that would cause undefined behaviour.
Why is it always described in terms of shared data? Because using the same variable (memory) simultaneously from multiple threads causes undefined behaviour, so a thread reaching a variable that may currently be in use by another must wait until the other finishes working with it; that is why a mutex is used here.
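A minimal demonstration of that point: two threads incrementing the same counter, with the mutex making each read-modify-write step exclusive (the names here are illustrative, not from the question's code):

```cpp
#include <mutex>
#include <thread>

// Without the lock, the two threads' increments could interleave and
// lose updates; with it, the final count is exact.
int counter = 0;
std::mutex counterMutex;

void incrementMany(int n)
{
    for (int i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(counterMutex);
        ++counter; // read-modify-write, now done exclusively
    }
}

int runDemo()
{
    std::thread a(incrementMany, 100000);
    std::thread b(incrementMany, 100000);
    a.join();
    b.join();
    return counter;
}
```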
Why accessing mutex itself from different threads does not make undefined behaviour? I don't know, going to google it.
Do I use mutex with detached threads correctly in code above?
Those are orthogonal concepts. I don't think the mutex is used correctly, since you only have one thread mutating and accessing the global variable, and you use the mutex to synchronize waits and exits. You should join the thread instead.
Also, detached threads are usually a code smell. There should be a way to wait for all threads to finish before exiting the main function.
Do I also need to cover the delete object line with mutex::lock and mutex::unlock?
No: since the destructor will call mu.lock(), you're fine here.
Is it possible that var will not be deleted in the destructor?
No, but it will make your main thread wait. There are solutions that avoid using a mutex for this, though.
There are usually two ways to attack this problem: you can block the main thread until all other threads are done, or use shared ownership so that both main and the thread own the object, freeing it only when all owners are gone.
To block all thread until everyone is done then do cleanup, you can use std::barrier from C++20:
void do_something(std::barrier<std::function<void()>>& sync_point)
{
for(;;)
{
if(object)
if(object->var[0] < 255)
object->var[0]++;
else
object->var[0] = 0;
} // break at a point so the thread exits
sync_point.arrive_and_wait();
}
int main()
{
object = new Object();
auto const on_completion = []{ delete object; };
// 2 is the number of threads. I'm counting the main thread since
// you're using detached threads
std::barrier<std::function<void()>> sync_point(2, on_completion);
thread th(do_something, std::ref(sync_point));
th.detach();
Sleep(1000);
sync_point.arrive_and_wait();
return 0;
}
Live example
This will make all the threads (two of them) wait until every thread reaches the sync point. Once the sync point has been reached by all threads, it runs the on_completion function, which deletes the object exactly once, when no one needs it anymore.
The other solution would be to use a std::shared_ptr so anyone can own the pointer and free it only when no one is using it anymore. Note that you will need to remove the object global variable and replace it with a local variable to track the shared ownership:
void do_something(std::shared_ptr<Object> object)
{
for(;;)
{
if(object)
if(object->var[0] < 255)
object->var[0]++;
else
object->var[0] = 0;
}
}
int main()
{
std::shared_ptr<Object> object = std::make_shared<Object>();
// You need to pass it as parameter otherwise it won't be safe
thread th(do_something, object);
th.detach();
Sleep(1000);
// If the thread is done, this line will call delete
// If the thread is not done, the thread will call delete
// when its local `object` variable goes out of scope.
object = nullptr;
return 0;
}
Is it possible that var will not be deleted in the destructor?
With
~Object()
{
mu.lock();
delete[]var; // destructor should free all dynamic memory on it's own, as I remember
mu.unlock();
}
You might have to wait for the lock to be released, but var would be deleted.
Except that your program exhibits undefined behaviour due to unprotected concurrent access to object (delete object isn't protected, and you read object in your other thread), so anything can happen.
Do I use mutex with detached threads correctly in code above?
Detached or not is irrelevant.
And your usage of the mutex is wrong/incomplete.
Which variable should your mutex be protecting?
It seems to be a mix between object and var.
If it is var, you might reduce the scope in do_something (lock only inside the if block).
And it currently misses some protection for object.
2.1 Do I also need to cover the delete object line with mutex::lock and mutex::unlock?
Yes, object needs protection.
But you cannot use that same mutex for it: std::mutex doesn't allow locking twice in the same thread (a protected delete[] var inside a protected delete object), whereas std::recursive_mutex allows that.
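To illustrate the std::recursive_mutex remark, here is a small standalone sketch (names invented for the example): the same thread may lock a recursive_mutex repeatedly, where re-locking a plain std::mutex would deadlock.

```cpp
#include <mutex>

// Sketch: std::recursive_mutex lets the same thread re-lock it, so a
// protected operation called from an already-protected one still works.
std::recursive_mutex rmu;

int inner()
{
    std::lock_guard<std::recursive_mutex> lock(rmu); // second lock, same thread: OK
    return 1;
}

int outer()
{
    std::lock_guard<std::recursive_mutex> lock(rmu); // first lock
    return inner() + 1; // with std::mutex this call would deadlock
}
```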
So we come back to the question: which variable does the mutex protect?
If it is only object (which is enough in your sample), it would be something like:
#include <thread>
#include <mutex>
using namespace std;
mutex mu;
class Object
{
public:
char *var;
Object()
{
var = new char[1]; var[0] = 1;
}
~Object()
{
delete[]var; // destructor should free all dynamic memory on it's own, as I remember
}
}*object = nullptr;
void do_something()
{
for(;;)
{
mu.lock();
if(object)
if(object->var[0] < 255)
object->var[0]++;
else
object->var[0] = 0;
mu.unlock();
}
}
int main()
{
object = new Object();
thread th(do_something);
th.detach();
Sleep(1000);
mu.lock(); // or const std::lock_guard<std::mutex> lock(mu); and get rid of unlock
delete object;
object = nullptr;
mu.unlock();
return 0;
}
Alternatively, as you don't have to share data between threads, you might do:
int main()
{
Object object;
thread th(do_something);
Sleep(1000);
th.join();
return 0;
}
and get rid of all the mutexes.
Have a look at this; it shows the use of scoped_lock, std::async, and management of lifetimes through scopes (demo here: https://onlinegdb.com/FDw9fG9rS):
#include <future>
#include <mutex>
#include <chrono>
#include <iostream>
// using namespace std; <== dont do this
// mutex mu; avoid global variables.
class Object
{
public:
Object() :
m_var{ 1 }
{
}
~Object()
{
}
void do_something()
{
using namespace std::chrono_literals;
for(std::size_t n = 0; n < 30; ++n)
{
// extra scope to reduce time of the lock
{
std::scoped_lock<std::mutex> lock{ m_mtx };
m_var++;
std::cout << ".";
}
std::this_thread::sleep_for(150ms);
}
}
private:
std::mutex m_mtx;
char m_var;
};
int main()
{
Object object;
// extra scope to manage lifecycle of future
{
// use a lambda function to start the member function of object
auto future = std::async(std::launch::async, [&] {object.do_something(); });
std::cout << "do something started\n";
// destructor of future will synchronize with end of thread;
}
std::cout << "\n work done\n";
// safe to go out of scope now and destroy the object
return 0;
}
Everything you assumed and asked in your question is right. The variable will always be freed.
But your code has one big problem. Let's look at your example:
int main()
{
object = new Object();
thread th(do_something);
th.detach();
Sleep(1000);
delete object;
object = nullptr;
return 0;
}
You create a thread that will call do_something(). But let's assume that right after the thread creation the kernel interrupts the thread and does something else, like updating the Stack Overflow tab in your web browser with this answer. So do_something() isn't called yet and won't be for a while, since we all know how slow browsers are.
Meanwhile the main function sleeps one second and then calls delete object;. That calls Object::~Object(), which acquires the mutex, deletes var, releases the mutex, and finally frees the object.
Now assume that right at this point the kernel interrupts the main thread and schedules the other thread. object still holds the address of the object that was just deleted. So the other thread will acquire the mutex, see that object is not nullptr, access it, and BOOM.
PS: object isn't atomic, so object = nullptr in main() also races with if (object).

Mutex Safety with Interrupts (Embedded Firmware)

Edit: @Mike pointed out that my try_lock function in the code below is unsafe and that accessor creation can produce a race condition as well. The suggestions (from everyone) have convinced me that I'm going down the wrong path.
Original Question
The requirements for locking on an embedded microcontroller are different enough from multithreading that I haven't been able to convert multithreading examples to my embedded applications. Typically I don't have an OS or threads of any kind, just main and whatever interrupt functions are called by the hardware periodically.
It's pretty common that I need to fill up a buffer from an interrupt, but process it in main. I've created the IrqMutex class below to try to safely implement this. Each person trying to access the buffer is assigned a unique id through IrqMutexAccessor, then they each can try_lock() and unlock(). The idea of a blocking lock() function doesn't work from interrupts because unless you allow the interrupt to complete, no other code can execute so the unlock() code never runs. I do however use a blocking lock from the main() code occasionally.
However, I know that the double-check lock doesn't work without C++11 memory barriers (which aren't available on many embedded platforms). Honestly despite reading quite a bit about it, I don't really understand how/why the memory access reordering can cause a problem. I think that the use of volatile sig_atomic_t (possibly combined with the use of unique IDs) makes this different from the double-check lock. But I'm hoping someone can: confirm that the following code is correct, explain why it isn't safe, or offer a better way to accomplish this.
class IrqMutex {
friend class IrqMutexAccessor;
private:
std::sig_atomic_t accessorIdEnum;
volatile std::sig_atomic_t owner;
protected:
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
bool have_lock(std::sig_atomic_t accessorId) {
return (owner == accessorId);
}
bool try_lock(std::sig_atomic_t accessorId) {
// Only try to get a lock, while it isn't already owned.
while (owner == SIG_ATOMIC_MIN) {
// <-- If an interrupt occurs here, both attempts can get a lock at the same time.
// Try to take ownership of this Mutex.
owner = accessorId; // SET
// Double check that we are the owner.
if (owner == accessorId) return true;
// Someone else must have taken ownership between CHECK and SET.
// If they released it after CHECK, we'll loop back and try again.
// Otherwise someone else has a lock and we have failed.
}
// This shouldn't happen unless they called try_lock on something they already owned.
if (owner == accessorId) return true;
// If someone else owns it, we failed.
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
// Double check that the owner called this function (not strictly required)
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
// We still return true if the mutex was unlocked anyway.
return (owner == SIG_ATOMIC_MIN);
}
public:
IrqMutex(void) : accessorIdEnum(SIG_ATOMIC_MIN), owner(SIG_ATOMIC_MIN) {}
};
// This class is used to manage our unique accessorId.
class IrqMutexAccessor {
friend class IrqMutex;
private:
IrqMutex& mutex;
const std::sig_atomic_t accessorId;
public:
IrqMutexAccessor(IrqMutex& m) : mutex(m), accessorId(m.nextAccessor()) {}
bool have_lock(void) { return mutex.have_lock(accessorId); }
bool try_lock(void) { return mutex.try_lock(accessorId); }
bool unlock(void) { return mutex.unlock(accessorId); }
};
Because there is one processor, and no threading the mutex serves what I think is a subtly different purpose than normal. There are two main use cases I run into repeatedly.
The interrupt is a Producer and takes ownership of a free buffer and loads it with a packet of data. The interrupt/Producer may keep its ownership lock for a long time spanning multiple interrupt calls. The main function is the Consumer and takes ownership of a full buffer when it is ready to process it. The race condition rarely happens, but if the interrupt/Producer finishes with a packet and needs a new buffer, but they are all full it will try to take the oldest buffer (this is a dropped packet event). If the main/Consumer started to read and process that oldest buffer at exactly the same time they would trample all over each other.
The interrupt is just a quick change or increment of something (like a counter). However, if we want to reset the counter or jump to some new value with a call from the main() code we don't want to try to write to the counter as it is changing. Here main actually does a blocking loop to obtain a lock, however I think its almost impossible to have to actually wait here for more than two attempts. Once it has a lock, any calls to the counter interrupt will be skipped, but that's generally not a big deal for something like a counter. Then I update the counter value and unlock it so it can start incrementing again.
I realize these two samples are dumbed down a bit, but some version of these patterns occurs in many of the peripherals in every project I work on, and I'd like one piece of reusable code that can safely handle this across various embedded platforms. I included the C tag because all of this is directly convertible to C code, and on some embedded compilers that's all that is available. So I'm trying to find a general method that is guaranteed to work in both C and C++.
struct ExampleCounter {
volatile long long int value;
IrqMutex mutex;
} exampleCounter;
struct ExampleBuffer {
volatile char data[256];
volatile size_t index;
IrqMutex mutex; // One mutex per buffer.
} exampleBuffers[2];
const volatile char * const REGISTER;
// This accessor shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutex(exampleCounter.mutex);
void __irqQuickFunction(void) {
// Obtain a lock, add the data then unlock all within one function call.
if (myMutex.try_lock()) {
exampleCounter.value++;
myMutex.unlock();
} else {
// If we failed to obtain a lock, we skipped this update this one time.
}
}
// These accessors shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
void __irqLongFunction(void) {
static size_t bufferIndex = 0;
// Check if we have a lock.
if (!myMutexes[bufferIndex].have_lock() && !myMutexes[bufferIndex].try_lock()) {
// If we can't get a lock, try the other buffer.
bufferIndex = (bufferIndex + 1) % 2;
// One buffer should always be available, so the next line should always succeed.
if (!myMutexes[bufferIndex].try_lock()) return;
}
// ... at this point we know we have a lock ...
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
exampleBuffers[bufferIndex].data[exampleBuffers[bufferIndex].index++] = c;
// We may keep the lock for multiple function calls until the end of packet.
static const char END_PACKET_SIGNAL = '\0';
if (c == END_PACKET_SIGNAL) {
// Unlock this buffer so it can be read from main.
myMutexes[bufferIndex].unlock();
// Switch to the other buffer for next time.
bufferIndex = (bufferIndex + 1) % 2;
}
}
int main(void) {
while (true) {
// Mutex for counter
static IrqMutexAccessor myCounterMutex(exampleCounter.mutex);
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
// Skip any updates that occur while we are updating the counter.
while(!myCounterMutex.try_lock()) {
// Wait for the interrupt to release its lock.
}
// Set the counter to a new value.
exampleCounter.value = 500;
// Updates will start again as soon as we unlock it.
myCounterMutex.unlock();
}
// Mutexes for __irqLongFunction.
static IrqMutexAccessor myBufferMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
// Process buffers from __irqLongFunction.
for (size_t i = 0; i < 2; i++) {
// Obtain a lock so we can read the data.
if (!myBufferMutexes[i].try_lock()) continue;
// Check that the buffer isn't empty.
if (exampleBuffers[i].index == 0) {
myBufferMutexes[i].unlock(); // Don't forget to unlock.
continue;
}
// ... read and do something with the data here ...
exampleBuffers[i].index = 0;
myBufferMutexes[i].unlock();
}
}
}
Also note that I used volatile on every variable that is read or written by the interrupt routine (unless the variable is only accessed from the interrupt, like the static bufferIndex value in __irqLongFunction). I've read that mutexes remove some of the need for volatile in multithreaded code, but I don't think that applies here. Did I use the right amount of volatile? I used it on: ExampleBuffer[].data[256], ExampleBuffer[].index, and ExampleCounter.value.
I apologize for the long answer, but perhaps it is fitting for a long question.
To answer your first question, I would say that your implementation of IrqMutex is not safe. Let me try to explain where I see problems.
Function nextAccessor
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
This function has a race condition because the increment operator is not atomic, despite operating on a value of atomic type marked volatile. It involves three operations: reading the current value of accessorIdEnum, incrementing it, and writing the result back. If two IrqMutexAccessors are created at the same time, it's possible that they both get the same ID.
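On platforms that do have <atomic> (which the question notes many embedded targets lack), the ID generation could be made race-free with fetch_add, since it performs the read-increment-write as one indivisible step. A sketch with an invented class name:

```cpp
#include <atomic>

// Sketch: std::atomic's fetch_add makes the read-increment-write a
// single indivisible operation, so concurrent callers get distinct IDs.
class IdSource {
    std::atomic<int> next{0};
public:
    int nextAccessor() { return next.fetch_add(1) + 1; } // yields 1, 2, 3, ...
};
```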
Function try_lock
The try_lock function also has a race condition. One thread (e.g. main) could enter the while loop, and then, before taking ownership, another thread (e.g. an interrupt) could also enter the while loop and take ownership of the lock (returning true). The first thread then continues to owner = accessorId and thus "also" takes the lock. So two threads (or your main thread and an interrupt) can try_lock an unowned mutex at the same time and both return true.
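Where a compare-and-swap primitive is available, that check-then-set window closes because the check and the set become one atomic step. This is a sketch under that assumption, not the poster's code (the embedded targets in question may lack <atomic>):

```cpp
#include <atomic>

// Sketch: a CAS-based try_lock. compare_exchange_strong atomically
// verifies 'owner == UNOWNED' and installs the new owner, so no second
// caller can slip in between the check and the set.
class CasMutex {
    static constexpr int UNOWNED = 0;
    std::atomic<int> owner{UNOWNED};
public:
    bool try_lock(int accessorId)
    {
        int expected = UNOWNED;
        return owner.compare_exchange_strong(expected, accessorId);
    }
    bool unlock(int accessorId)
    {
        int expected = accessorId;          // only the owner may unlock
        return owner.compare_exchange_strong(expected, UNOWNED);
    }
};
```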
Disabling interrupts by RAII
We can achieve some level of simplicity and encapsulation by using RAII for interrupt disabling, for example the following class:
class InterruptLock {
public:
InterruptLock() {
prevInterruptState = currentInterruptState();
disableInterrupts();
}
~InterruptLock() {
restoreInterrupts(prevInterruptState);
}
private:
int prevInterruptState; // Whatever type this should be for the platform
InterruptLock(const InterruptLock&); // Not copy-constructable
};
And I would recommend disabling interrupts to get the atomicity you need within the mutex implementation itself. For example something like:
bool try_lock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == SIG_ATOMIC_MIN) {
owner = accessorId;
return true;
}
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
return false;
}
Depending on your platform, this might look different, but you get the idea.
As you said, this provides a platform to abstract away from the disabling and enabling interrupts in general code, and encapsulates it to this one class.
Mutexes and Interrupts
Having said how I would consider implementing the mutex class, I would not actually use a mutex class for your use-cases. As you pointed out, mutexes don't really play well with interrupts, because an interrupt can't "block" on trying to acquire a mutex. For this reason, for code that directly exchanges data with an interrupt, I would instead strongly consider just directly disabling interrupts (for a very short time while the main "thread" touches the data).
So your counter might simply look like this:
volatile long long int exampleCounter;
void __irqQuickFunction(void) {
exampleCounter++;
}
...
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
InterruptLock lock;
exampleCounter = 500;
}
In my mind, this is easier to read, easier to reason about, and won't "slip" when there's contention (ie miss timer beats).
Regarding the buffer use-case, I would strongly recommend against holding a lock for multiple interrupt cycles. A lock/mutex should be held for just the slightest moment required to "touch" a piece of memory - just long enough to read or write it. Get in, get out.
So this is how the buffering example might look:
struct ExampleBuffer {
    char data[256];
} exampleBuffers[2];

ExampleBuffer* volatile bufferAwaitingConsumption = nullptr;
ExampleBuffer* volatile freeBuffer = &exampleBuffers[1];

const volatile char* const REGISTER;

void __irqLongFunction(void) {
    static const char END_PACKET_SIGNAL = '\0';
    static size_t index = 0;
    static ExampleBuffer* receiveBuffer = &exampleBuffers[0];
    // Get data from the hardware and modify the buffer here.
    const char c = *REGISTER;
    receiveBuffer->data[index++] = c;
    // End of packet?
    if (c == END_PACKET_SIGNAL) {
        // Make the packet available to the consumer
        bufferAwaitingConsumption = receiveBuffer;
        // Move on to the next buffer
        receiveBuffer = freeBuffer;
        freeBuffer = nullptr;
        index = 0;
    }
}
int main(void) {
    while (true) {
        // Fetch packet from shared variable
        ExampleBuffer* packet;
        {
            InterruptLock lock;
            packet = bufferAwaitingConsumption;
            bufferAwaitingConsumption = nullptr;
        }
        if (packet) {
            // ... read and do something with the data here ...
            // Once we're done with the buffer, we need to release it back to the producer
            {
                InterruptLock lock;
                freeBuffer = packet;
            }
        }
    }
}
This code is arguably easier to reason about, since there are only two memory locations shared between the interrupt and the main loop: one to pass packets from the interrupt to the main loop, and one to pass empty buffers back to the interrupt. We also only touch those variables under "lock", and only for the minimum time needed to "move" the value. (for simplicity I've skipped over the buffer overflow logic when the main loop takes too long to free the buffer).
It's true that in this case one may not even need the locks, since we're just reading and writing simple values, but the cost of disabling interrupts is small, and the risk of making mistakes otherwise is not worth it in my opinion.
Edit
As pointed out in the comments, the above solution was meant to tackle only the multithreading problem and omitted overflow checking. Here is a more complete solution, which should be robust under overflow conditions:
const size_t BUFFER_COUNT = 2;

struct ExampleBuffer {
    char data[256];
    ExampleBuffer* next;
} exampleBuffers[BUFFER_COUNT];

volatile size_t overflowCount = 0;

class BufferList {
public:
    BufferList() : first(nullptr), last(nullptr) { }
    // Atomic enqueue
    void enqueue(ExampleBuffer* buffer) {
        InterruptLock lock;
        buffer->next = nullptr;   // the new tail terminates the list
        if (last)
            last->next = buffer;
        else
            first = buffer;
        last = buffer;
    }
    // Atomic dequeue (or returns null)
    ExampleBuffer* dequeueOrNull() {
        InterruptLock lock;
        ExampleBuffer* result = first;
        if (first) {
            first = first->next;
            if (!first)
                last = nullptr;
        }
        return result;
    }
private:
    ExampleBuffer* first;
    ExampleBuffer* last;
} freeBuffers, buffersAwaitingConsumption;
const volatile char * const REGISTER;
void __irqLongFunction(void) {
    static const char END_PACKET_SIGNAL = '\0';
    static size_t index = 0;
    static ExampleBuffer* receiveBuffer = &exampleBuffers[0];
    // Recovery from overflow?
    if (!receiveBuffer) {
        // Try to get another free buffer
        receiveBuffer = freeBuffers.dequeueOrNull();
        // Still no buffer?
        if (!receiveBuffer) {
            overflowCount++;
            return;
        }
    }
    // Get data from the hardware and modify the buffer here.
    const char c = *REGISTER;
    if (index < sizeof(receiveBuffer->data))
        receiveBuffer->data[index++] = c;
    // End of packet, or out of space?
    if (c == END_PACKET_SIGNAL) {
        // Make the packet available to the consumer
        buffersAwaitingConsumption.enqueue(receiveBuffer);
        // Move on to the next free buffer
        receiveBuffer = freeBuffers.dequeueOrNull();
        index = 0;
    }
}
size_t getAndResetOverflowCount() {
    InterruptLock lock;
    size_t result = overflowCount;
    overflowCount = 0;
    return result;
}
int main(void) {
    // All buffers are free at the start
    for (size_t i = 0; i < BUFFER_COUNT; i++)
        freeBuffers.enqueue(&exampleBuffers[i]);
    while (true) {
        // Fetch packet from the shared queue
        ExampleBuffer* packet = buffersAwaitingConsumption.dequeueOrNull();
        if (packet) {
            // ... read and do something with the data here ...
            // Once we're done with the buffer, we need to release it back to the producer
            freeBuffers.enqueue(packet);
        }
        size_t overflowBytes = getAndResetOverflowCount();
        if (overflowBytes) {
            // ...
        }
    }
}
The key changes:
If the interrupt runs out of free buffers, it will recover
If the interrupt receives data while it doesn't have a receive buffer, it will communicate that to the main thread via getAndResetOverflowCount
If you keep getting buffer overflows, you can simply increase the buffer count
I've encapsulated the multithreaded access into a queue class implemented as a linked list (BufferList), which supports atomic dequeue and enqueue. The previous example also used queues, but of length 0-1 (either an item is enqueued or it isn't), and so the implementation of the queue was just a single variable. In the case of running out of free buffers, the receive queue could have 2 items, so I upgraded it to a proper queue rather than adding more shared variables.
If the interrupt is the producer and mainline code is the consumer, surely it's as simple as disabling the interrupt for the duration of the consume operation?
That's how I used to do it in my embedded micro controller days.

Am I using this deque in a thread safe manner?

I'm trying to understand multithreading in C++. In the following bit of code, will the deque tempData declared in retrieve() always have every element processed once and only once, or could there be multiple copies of tempData across multiple threads with stale data, causing some elements to be processed multiple times? I'm not sure whether passing by reference actually ensures there is only one copy in this case.
static mutex m;

void AudioAnalyzer::analysisThread(deque<shared_ptr<AudioAnalysis>>& aq)
{
    while (true)
    {
        m.lock();
        if (aq.empty())
        {
            m.unlock();
            break;
        }
        auto aa = aq.front();
        aq.pop_front();
        m.unlock();
        if (false) //testing
        {
            retrieveFromDb(aa);
        }
        else
        {
            analyzeAudio(aa);
        }
    }
}

void AudioAnalyzer::retrieve()
{
    deque<shared_ptr<AudioAnalysis>> tempData(data);
    vector<future<void>> futures;
    for (int i = 0; i < NUM_THREADS; ++i)
    {
        futures.push_back(async(bind(&AudioAnalyzer::analysisThread, this, _1), ref(tempData)));
    }
    for (auto& f : futures)
    {
        f.get();
    }
}
Looks OK to me.
Threads share memory, and since the reference to tempData is effectively a pointer inside each thread, every thread sees exactly the same pointer value and the same single copy of tempData. [You can check that if you like with a bit of global code or some logging.]
The mutex then serializes access to the deque, at least within the worker threads.
One problem: somewhere there must be a push onto the deque, and that may need to be locked by the mutex as well. [Obviously the push_back onto the futures queue is just local.]

Thread locks occuring using boost::thread. What's wrong with my condition variables?

I wrote a Link class for passing data between two nodes in a network. I've implemented it with two deques (one for data going from node 0 to node 1, and the other for data going from node 1 to node 0). I'm trying to multithread the application, but I'm getting threadlocks. I'm trying to prevent reading from and writing to the same deque at the same time. In reading more about how I originally implemented this, I think I'm using the condition variables incorrectly (and maybe shouldn't be using the boolean variables?). Should I have two mutexes, one for each deque? Please help if you can. Thanks!
class Link {
public:
    // other stuff...
    void push_back(int sourceNodeID, Data newData);
    void get(int destinationNodeID, std::vector<Data>& linkData);
private:
    // other stuff...
    std::vector<int> nodeIDs_;
    // qVector_ has two deques, one for Data from node 0 to node 1 and
    // one for Data from node 1 to node 0
    std::vector<std::deque<Data> > qVector_;
    void initialize(int nodeID0, int nodeID1);
    boost::mutex mutex_;
    std::vector<boost::shared_ptr<boost::condition_variable> > readingCV_;
    std::vector<boost::shared_ptr<boost::condition_variable> > writingCV_;
    std::vector<bool> writingData_;
    std::vector<bool> readingData_;
};
The push_back function:
void Link::push_back(int sourceNodeID, Data newData)
{
    int idx;
    if (sourceNodeID == nodeIDs_[0]) idx = 1;
    else
    {
        if (sourceNodeID == nodeIDs_[1]) idx = 0;
        else throw runtime_error("Link::push_back: Invalid node ID");
    }
    boost::unique_lock<boost::mutex> lock(mutex_);
    // pause to avoid multithreading collisions
    while (readingData_[idx]) readingCV_[idx]->wait(lock);
    writingData_[idx] = true;
    qVector_[idx].push_back(newData);
    writingData_[idx] = false;
    writingCV_[idx]->notify_all();
}
The get function:
void Link::get(int destinationNodeID,
               std::vector<Data>& linkData)
{
    int idx;
    if (destinationNodeID == nodeIDs_[0]) idx = 0;
    else
    {
        if (destinationNodeID == nodeIDs_[1]) idx = 1;
        else throw runtime_error("Link::get: Invalid node ID");
    }
    boost::unique_lock<boost::mutex> lock(mutex_);
    // pause to avoid multithreading collisions
    while (writingData_[idx]) writingCV_[idx]->wait(lock);
    readingData_[idx] = true;
    std::copy(qVector_[idx].begin(), qVector_[idx].end(), back_inserter(linkData));
    qVector_[idx].erase(qVector_[idx].begin(), qVector_[idx].end());
    readingData_[idx] = false;
    readingCV_[idx]->notify_all();
    return;
}
and here's initialize (in case it's helpful)
void Link::initialize(int nodeID0, int nodeID1)
{
    readingData_ = std::vector<bool>(2, false);
    writingData_ = std::vector<bool>(2, false);
    for (int i = 0; i < 2; ++i)
    {
        readingCV_.push_back(make_shared<boost::condition_variable>());
        writingCV_.push_back(make_shared<boost::condition_variable>());
    }
    nodeIDs_.reserve(2);
    nodeIDs_.push_back(nodeID0);
    nodeIDs_.push_back(nodeID1);
    qVector_.reserve(2);
    qVector_.push_back(std::deque<Data>());
    qVector_.push_back(std::deque<Data>());
}
I'm trying to multithread the application, but I'm getting threadlocks.
What is a "threadlock"? It's difficult to see what your code is trying to accomplish. Consider, first, your push_back() code, whose synchronized portion looks like this:
boost::unique_lock<boost::mutex> lock(mutex_);
while (readingData_[idx]) readingCV_[idx]->wait(lock);
writingData_[idx] = true;
qVector_[idx].push_back(newData);
writingData_[idx] = false;
writingCV_[idx]->notify_all();
Your writingData[idx] boolean starts off as false, and becomes true only momentarily while a thread has the mutex locked. By the time the mutex is released, it is false again. So for any other thread that has to wait to acquire the mutex, writingData[idx] will never be true.
But in your get() code, you have
boost::unique_lock<boost::mutex> lock(mutex_);
// pause to avoid multithreading collisions
while (writingData_[idx]) writingCV_[idx]->wait(lock);
By the time a thread gets the lock on the mutex, writingData[idx] is back to false and so the while loop (and wait on the CV) is never entered.
An exactly symmetric analysis applies to the readingData[idx] boolean, which also is always false outside the mutex lock.
So your condition variables are never waited on. You need to completely rethink your design.
Start with one mutex per queue (the deque is overkill for simply passing data), and for each queue associate a condition variable with the queue being non-empty. The get() method will thus wait until the queue is non-empty, which will be signalled in the push_back() method. Something like this (untested code):
template <typename Data>
class BasicQueue
{
public:
    void push(Data const& data)
    {
        boost::unique_lock<boost::mutex> lock(mutex_);
        queue_.push(data);
        not_empty_.notify_all();
    }
    void get(Data& data)
    {
        boost::unique_lock<boost::mutex> lock(mutex_);
        while (queue_.empty())
            not_empty_.wait(lock); // this releases the mutex
        // mutex is reacquired here, with the queue non-empty
        data = queue_.front();
        queue_.pop();
    }
private:
    std::queue<Data> queue_;
    boost::mutex mutex_;
    boost::condition_variable not_empty_;
};
Yes. You need two mutexes. Your deadlocks are almost certainly a result of contention on the single mutex. If you break into your running program with a debugger you will see where the threads are hanging. Also I don't see why you would need the bools. (EDIT: It may be possible to come up with a design that uses a single mutex but it's simpler and safer to stick with one mutex per shared data structure)
A rule of thumb would be to have one mutex per shared data structure you are trying to protect. That mutex guards the data structure against concurrent access and provides thread safety. In your case one mutex per deque. E.g.:
class SafeQueue
{
private:
std::deque<Data> q_;
boost::mutex m_;
boost::condition_variable v_;
public:
void push_back(Data newData)
{
boost::lock_guard<boost::mutex> lock(m_);
q_.push_back(newData);
// notify etc.
}
// ...
};
In terms of notification via condition variables see here:
Using condition variable in a producer-consumer situation
So there would also be one condition_variable per object which the producer would notify and the consumer would wait on. Now you can create two of these queues for communicating in both directions. Keep in mind that with only two threads you can still deadlock if both threads are blocked (waiting for data) and both queues are empty.

Need some advice to make the code multithreaded

I received a code that is not for multi-threaded app, now I have to modify the code to support for multi-threaded.
I have a Singleton class (MyCenterSigltonClass) that I made thread-safe based on the instructions at:
http://en.wikipedia.org/wiki/Singleton_pattern
Now I see that the class contains 10-12 members, some with getter/setter methods.
Some members are declared static and are class pointers, like:
static Class_A* f_static_member_a;
static Class_B* f_static_member_b;
For these members, I defined a mutex (like mutex_a) INSIDE the class (Class_A); I didn't add the mutex directly to MyCenterSigltonClass. The reason is that they have a one-to-one association with MyCenterSigltonClass, so I think I have the option of defining the mutex for f_static_member_a in either MyCenterSigltonClass or Class_A.
1) Am I right?
Also, my Singleton class (MyCenterSigltonClass) contains some other members, like
Class_C f_classC;
For these kinds of member variables, should I define a mutex for each of them in MyCenterSigltonClass to make them thread-safe? What would be a good way to handle these cases?
Appreciate for any suggestion.
-Nima
Whether the members are static or not doesn't really matter. How you protect the member variables really depends on how they are accessed from public methods.
You should think about a mutex as a lock that protects some resource from concurrent read/write access. You don't need to think about protecting the internal class objects necessarily, but the resources within them. You also need to consider the scope of the locks you'll be using, especially if the code wasn't originally designed to be multithreaded. Let me give a few simple examples.
class A
{
private:
    int mValuesCount;
    int* mValues;
public:
    A(int count, int* values)
    {
        mValuesCount = count;
        mValues = (count > 0) ? new int[count] : NULL;
        if (mValues)
        {
            memcpy(mValues, values, count * sizeof(int));
        }
    }
    int getValues(int count, int* values) const
    {
        if (mValues && values)
        {
            memcpy(values, mValues, ((count < mValuesCount) ? count : mValuesCount) * sizeof(int));
        }
        return mValuesCount;
    }
};
class B
{
private:
    A* mA;
public:
    B()
    {
        int values[5] = { 1, 2, 3, 4, 5 };
        mA = new A(5, values);
    }
    const A* getA() const { return mA; }
};
In this code, there's no need to protect mA because there's no chance of conflicting access across multiple threads. None of the threads can modify the state of mA, so all concurrent access just reads from mA. However, if we modify class A:
class A
{
private:
    int mValuesCount;
    int* mValues;
public:
    A(int count, int* values)
    {
        mValuesCount = 0;
        mValues = NULL;
        setValues(count, values);
    }
    int getValues(int count, int* values) const
    {
        if (mValues && values)
        {
            memcpy(values, mValues, ((count < mValuesCount) ? count : mValuesCount) * sizeof(int));
        }
        return mValuesCount;
    }
    void setValues(int count, int* values)
    {
        delete [] mValues;
        mValuesCount = count;
        mValues = (count > 0) ? new int[count] : NULL;
        if (mValues)
        {
            memcpy(mValues, values, count * sizeof(int));
        }
    }
};
We can now have multiple threads calling B::getA() and one thread can read from mA while another thread writes to mA. Consider the following thread interaction:
Thread A: a->getValues(maxCount, values);
Thread B: a->setValues(newCount, newValues);
It's possible that Thread B will delete mValues while Thread A is in the middle of copying it. In this case, you would need a mutex within class A to protect access to mValues and mValuesCount:
int getValues(int count, int* values) const
{
    // TODO: Lock mutex.
    if (mValues && values)
    {
        memcpy(values, mValues, ((count < mValuesCount) ? count : mValuesCount) * sizeof(int));
    }
    int returnCount = mValuesCount;
    // TODO: Unlock mutex.
    return returnCount;
}
void setValues(int count, int* values)
{
    // TODO: Lock mutex.
    delete [] mValues;
    mValuesCount = count;
    mValues = (count > 0) ? new int[count] : NULL;
    if (mValues)
    {
        memcpy(mValues, values, count * sizeof(int));
    }
    // TODO: Unlock mutex.
}
This will prevent concurrent read/write on mValues and mValuesCount. Depending on the locking mechanisms available in your environment, you may be able to use a read-write (shared) lock in getValues() so that concurrent readers don't block one another.
However, you'll also need to understand the scope of the locking you need to implement if class A is more complex:
class A
{
private:
    int mValuesCount;
    int* mValues;
public:
    A(int count, int* values)
    {
        mValuesCount = 0;
        mValues = NULL;
        setValues(count, values);
    }
    int getValueCount() const { return mValuesCount; }
    int getValues(int count, int* values) const
    {
        if (mValues && values)
        {
            memcpy(values, mValues, ((count < mValuesCount) ? count : mValuesCount) * sizeof(int));
        }
        return mValuesCount;
    }
    void setValues(int count, int* values)
    {
        delete [] mValues;
        mValuesCount = count;
        mValues = (count > 0) ? new int[count] : NULL;
        if (mValues)
        {
            memcpy(mValues, values, count * sizeof(int));
        }
    }
};
In this case, you could have the following thread interaction:
Thread A: int maxCount = a->getValueCount();
Thread A: // allocate memory for "maxCount" int values
Thread B: a->setValues(newCount, newValues);
Thread A: a->getValues(maxCount, values);
Thread A has been written as though calls to getValueCount() and getValues() will be an uninterrupted operation, but Thread B has potentially changed the count in the middle of Thread A's operations. Depending on whether the new count is larger or smaller than the original count, it may take a while before you discover this problem. In this case, class A would need to be redesigned or it would need to provide some kind of transaction support so the thread using class A could block/unblock other threads:
Thread A: a->lockValues();
Thread A: int maxCount = a->getValueCount();
Thread A: // allocate memory for "maxCount" int values
Thread B: a->setValues(newCount, newValues); // Blocks until Thread A calls unlockValues()
Thread A: a->getValues(maxCount, values);
Thread A: a->unlockValues();
Thread B: // Completes call to setValues()
Since the code wasn't initially designed to be multithreaded, it's very likely you'll run into these kinds of issues where one method call uses information from an earlier call, but there was never a concern for the state of the object changing between those calls.
Now, begin to imagine what could happen if there are complex state dependencies among the objects within your singleton and multiple threads can modify the state of those internal objects. It can all become very, very messy with a large number of threads and debugging can become very difficult.
So as you try to make your singleton thread-safe, you need to look at several layers of object interactions. Some good questions to ask:
Do any of the methods on the singleton reveal internal state that may change between method calls (as in the last example I mention)?
Are any of the internal objects revealed to clients of the singleton?
If so, do any of the methods on those internal objects reveal internal state that may change between method calls?
If internal objects are revealed, do they share any resources or state dependencies?
You may not need any locking if you're just reading state from internal objects (first example). You may need to provide simple locking to prevent concurrent read/write access (second example). You may need to redesign the classes or provide clients with the ability to lock object state (third example). Or you may need to implement more complex locking where internal objects share state information across threads (e.g. a lock on a resource in class Foo requires a lock on a resource in class Bar, but locking that resource in class Bar doesn't necessarily require a lock on a resource in class Foo).
Implementing thread-safe code can become a complex task depending on how all your objects interact. It can be much more complicated than the examples I've given here. Just be sure you clearly understand how your classes are used and how they interact (and be prepared to spend some time tracking down difficult to reproduce bugs).
If this is the first time you're doing threading, consider not accessing the singleton from the background thread. You can get it right, but you probably won't get it right the first time.
Realize that if your singleton exposes pointers to other objects, these should be made thread safe as well.
You don't have to define a mutex for each member. For example, you could instead use a single mutex to synchronize access to each member, e.g.:
class foo
{
public:
    ...
    void some_op()
    {
        // acquire "lock_" and release using RAII ...
        Lock guard(lock_);
        a_++;
    }
    void set_b(bar* b)
    {
        // acquire "lock_" and release using RAII ...
        Lock guard(lock_);
        b_ = b;
    }
private:
    int a_;
    bar* b_;
    mutex lock_;
};
Of course a "one lock" solution may not be suitable in your case. That's up to you to decide. Regardless, simply introducing locks doesn't make the code thread-safe. You have to use them in the right place in the right way to avoid race conditions, deadlocks, etc. There are lots of concurrency issues you could run into.
Furthermore you don't always need mutexes, or other threading mechanisms like TSS, to make code thread-safe. For example, the following function "func" is thread-safe:
class Foo;

void func(Foo& f)
{
    f.some_op(); // Foo::some_op() of course needs to be thread-safe.
}

// Thread 1
Foo a;
func(a);

// Thread 2
Foo b;
func(b);
While the func function above is thread-safe, the operations it invokes may not be. The point is that you don't always need to pepper your code with mutexes and other threading mechanisms to make it thread-safe. Sometimes restructuring the code is sufficient.
There's a lot of literature on multithreaded programming. It's definitely not easy to get right so take your time in understanding the nuances, and take advantage of existing frameworks like Boost.Thread to mitigate some of the inherent and accidental complexities that exist in the lower-level multithreading APIs.
I'd really recommend the Interlocked... methods to increment, decrement, and compare-and-swap values when writing code that needs to be multi-thread-aware. I don't have first-hand C++ experience, but a quick search for http://www.bing.com/search?q=c%2B%2B+interlocked reveals lots of confirming advice. If you need performance, these will likely be faster than locking.
As stated by #Void a mutex alone is not always the solution to a concurrency problem:
Regardless, simply introducing locks doesn't make the code
thread-safe. You have to use them in the right place in the right way
to avoid race conditions, deadlocks, etc. There are lots of
concurrency issues you could run in to.
I want to add another example:
class MyClass
{
    mutex m_mutex;
    AnotherClass m_anotherClass;

    void setObject(AnotherClass& anotherClass)
    {
        m_mutex.lock();
        m_anotherClass = anotherClass;
        m_mutex.unlock();
    }

    AnotherClass getObject()
    {
        AnotherClass anotherClass;
        m_mutex.lock();
        anotherClass = m_anotherClass;
        m_mutex.unlock();
        return anotherClass;
    }
};
In this case the getObject() method is always safe, because it is protected by the mutex and a copy of the object is returned to the caller, which may be a different class and a different thread. This means you are working on a copy, which might be stale (in the meantime another thread might have changed m_anotherClass by calling setObject()). Now what if you turn m_anotherClass into a pointer instead of an object variable?
class MyClass
{
    mutex m_mutex;
    AnotherClass* m_anotherClass;

    void setObject(AnotherClass* anotherClass)
    {
        m_mutex.lock();
        m_anotherClass = anotherClass;
        m_mutex.unlock();
    }

    AnotherClass* getObject()
    {
        AnotherClass* anotherClass;
        m_mutex.lock();
        anotherClass = m_anotherClass;
        m_mutex.unlock();
        return anotherClass;
    }
};
This is an example where a mutex is not enough to solve all the problems.
With pointers you only get a copy of the pointer; the pointed-to object is the same for both the caller and the method. So even if the pointer was valid at the time getObject() was called, you have no guarantee that the pointed-to value will still exist during the operations you perform with it, simply because you don't control the object's lifetime. That's why you should use object variables as much as possible and avoid raw pointers (if you can).