pthread mutex lock and unlock per variable - c++

I'm wondering what the best practice is for locking and unlocking mutexes for variables within an object that is shared between threads.
This is what I have been doing, and it seems to work just fine so far; I'm just wondering whether it is excessive or not:
class sharedobject
{
private:
    bool m_Var1;
    pthread_mutex_t var1mutex;
public:
    sharedobject()
    {
        // PTHREAD_MUTEX_INITIALIZER only works in a declaration;
        // a mutex created at runtime must be set up with pthread_mutex_init
        pthread_mutex_init(&var1mutex, NULL);
    }
    ~sharedobject()
    {
        pthread_mutex_destroy(&var1mutex);
    }
    bool GetVar1()
    {
        pthread_mutex_lock(&var1mutex);
        bool temp = m_Var1;
        pthread_mutex_unlock(&var1mutex);
        return temp;
    }
    void SetVar1(bool status)
    {
        pthread_mutex_lock(&var1mutex);
        m_Var1 = status;
        pthread_mutex_unlock(&var1mutex);
    }
};
This isn't my actual code, but it shows how I am using mutexes for every variable that is shared in an object between threads. The reason I don't have a single mutex for the entire object is that one thread might take seconds to complete an operation on part of the object, while another thread checks the status of the object, and yet another thread gets data from the object.
My question is: is it good practice to create a mutex for every variable within an object that is accessed by multiple threads, and then lock and unlock that mutex whenever the variable is read or written?
I use trylock for variables whose status I'm checking (so I don't create extra threads while the variable is being processed, and don't make the program wait to get a lock).
I haven't had a lot of experience working with threading. I would like to make the program thread safe, but it also needs to perform as well as possible.

If the members you're protecting are read-write, and may be accessed at any time by more than one thread, then what you're doing is not excessive - it's necessary.
If you can prove that a member will not change (is immutable) then there is no need to protect it with a mutex.
Many people prefer multi-threaded solutions where each thread has an immutable copy of data rather than those in which many threads access the same copy. This eliminates the need for memory barriers and very often improves execution times and code safety.
Your mileage may vary.

Related

Does a mutex lock itself, or the memory positions in question?

Let's say we've got a global variable, and a global non-member function.
int GlobalVariable = 0;
void GlobalFunction();
and we have
std::mutex MutexObject;
then inside one of the threads, we have this block of code:
{
    std::lock_guard<std::mutex> lock(MutexObject);
    GlobalVariable++;
    GlobalFunction();
}
Now, inside another thread running in parallel, what happens if we do something like this:
{
    //std::lock_guard<std::mutex> lock(MutexObject);
    GlobalVariable++;
    GlobalFunction();
}
So the question is: does a mutex only prevent itself from being acquired while it is owned by another thread, without caring about what memory the critical section accesses? Or does the compiler (or, at run time, the OS) actually mark the memory locations accessed in the critical section as blocked by MutexObject?
My guess is the former, but I need to hear from an experienced programmer; Thanks for taking the time to read my question.
It’s the former. There’s no relationship between the mutex and the objects you’re protecting with the mutex. (In general, it's not possible for the compiler to deduce exactly which objects a given block of code will modify.) The magic behind the mutex comes entirely from the temporal ordering guarantees it makes: that everything the thread does before releasing the mutex is visible to the next thread after it’s grabbed the mutex. But the two threads both need to actively use the mutex for that to happen.
A system which actually cares about what memory locations a thread has accessed and modified, and builds safely on top of that, is “transactional memory”. It’s not widely used.

Why do we keep the mutex as a member instead of declaring it before the guard every time?

Please consider this classical approach, I have simplified it to highlight the exact question:
#include <iostream>
#include <mutex>
using namespace std;
class Test
{
public:
    void modify()
    {
        std::lock_guard<std::mutex> guard(m_);
        // modify data
    }
private:
    /// some private data
    std::mutex m_;
};
This is the classical approach of using std::mutex to avoid data races.
The question is why are we keeping an extra std::mutex in our class? Why can't we declare it every time before the declaration of std::lock_guard like this?
void modify()
{
    std::mutex m_;
    std::lock_guard<std::mutex> guard(m_);
    // modify data
}
Let's say two threads are calling modify in parallel. Each thread then gets its own, new mutex, so the guard has no effect, as each guard is locking a different mutex. The resource you are trying to protect from race conditions will be exposed.
The misunderstanding comes from what the mutex is and what the lock_guard is good for.
A mutex is an object that is shared among different threads, and each thread can lock and release the mutex. That's how synchronization among different threads works. So you can work with m_.lock() and m_.unlock() as well, yet you have to be very careful that all code paths (including exceptional exits) in your function actually unlocks the mutex.
To avoid the pitfall of missing unlocks, a lock_guard is a wrapper object which locks the mutex at wrapper object creation and unlocks it at wrapper object destruction. Since the wrapper object is an object with automatic storage duration, you will never miss an unlock - that's why.
A local mutex does not make sense, as it would be local and not a shared resource. A local lock_guard makes perfect sense, as its automatic storage duration prevents missed locks / unlocks.
Hope it helps.
This all depends on the context of what you want to prevent from being executed in parallel.
A mutex will work when multiple threads try to access the same mutex object. So when 2 threads try to access and acquire the lock of a mutex object, only one of them will succeed.
Now in your second example, if two threads call modify(), each thread will have its own instance of that mutex, so nothing will stop them from running that function in parallel as if there's no mutex.
So to answer your question: It depends on the context. The mission of the design is to ensure that all threads that should not be executed in parallel will hit the same mutex object at the critical part.
Synchronization of threads involves checking whether another thread is executing the critical section. A mutex is the object that holds the state for us to check whether it was "locked" by a thread. lock_guard, on the other hand, is a wrapper that locks the mutex on initialization and unlocks it during destruction.
Having realized that, it should be clearer why there has to be only one instance of the mutex that all lock_guards need access to - they need to check if it's clear to enter the critical section against the same object. In the second snippet of your question each function call creates a separate mutex that is seen and accessible only in its local context.
You need a mutex at class level. Otherwise, each thread has a mutex for itself, and therefore the mutex has no effect.
If for some reason you don't want your mutex to be stored in a class attribute, you could use a static mutex as shown below.
void modify()
{
    static std::mutex myMutex;
    std::lock_guard<std::mutex> guard(myMutex);
    // modify data
}
Note that here there is only 1 mutex for all the class instances. If the mutex is stored in an attribute, you would have one mutex per class instance. Depending on your needs, you might prefer one solution or the other.

Should a critical section or mutex be really member variable or when should it be?

I have seen code where mutex or critical section is declared as member variable of the class to make it thread safe something like the following.
class ThreadSafeClass
{
public:
    ThreadSafeClass() { x = new int; }
    ~ThreadSafeClass() { delete x; } // without this, x would leak
    void reallocate()
    {
        std::lock_guard<std::mutex> lock(m);
        delete x;
        x = new int;
    }
    int * x;
    std::mutex m;
};
But doesn't that make it thread safe only if the same object was being shared by multiple threads? In other words, if each thread was creating its own instance of this class, they will be very much independent and its member variables will never conflict with each other and synchronization will not even be needed in that case!?
It appears to me that defining the mutex as member variable really reduces synchronization to the events when the same object is being shared by multiple threads. It doesn't really make the class any thread safer if each thread has its own copy of the class (for example if the class were to access other global objects). Is this a correct assessment?
If you can guarantee that any given object will only be accessed by one thread then a mutex is an unnecessary expense. It however must be well documented on the class's contract to prevent misuse.
PS: new and delete have their own synchronization mechanisms, so even without a lock they will create contention.
EDIT: The more you keep threads independent from each other the better (because it eliminates the need for locks). However, if your class will work heavily with a shared resource (e.g. database, file, socket, memory, etc ...) then having a per-thread instance is of little advantage so you might as well share an object between threads. Real independence is achieved by having different threads work with separate memory locations or resources.
If you will have potentially long waits on your locks, then it might be a good idea to have a single instance running in its own thread and take "jobs" from a synchronized queue.

What is the easiest way to provide an undetermined-lifespan bool to share between threads?

If I want to have some bool flag that I want to share between threads and whose lifespan is unclear because thread1, thread2, ... could be the particular last thread to use it, how can I provide such a type?
I could obviously have a shared_ptr<bool> with a mutex to synchronize access to it. Without the shared_ptr, however, I would just use an atomic<bool> because it would do the job.
Now, can I combine both of the concepts by using a shared_ptr<atomic<bool>>?
If not, what would be the easiest way to have an undetermined-lifespan bool to share between threads? Is it the mutex?
It might be necessary to say that I have multiple jobs in my system, and for each job I would like to provide a shared abort flag. If the job is already done, a thread that wants to abort it should not crash when it tries to set the flag. And if the thread that wants to abort the job does not keep the flag (or a shared_ptr to it), it should still be able to read the flag without crashing. However, once no thread uses the bool anymore, the memory should be freed naturally.
Once you have created your atomic bool:
std::shared_ptr<std::atomic<bool>> flag = std::make_shared<std::atomic<bool>>(false /*or true*/);
You should be fine to use this among threads. Reference counting and memory deallocation on std::shared_ptr are thread safe.
The other thing that might be of interest is if you want some threads to opt out of reference counting, then you can use:
std::weak_ptr<std::atomic<bool>> weak_flag = flag;
...
std::shared_ptr<std::atomic<bool>> temporary_flag = weak_flag.lock();
if (temporary_flag != nullptr)
{
    // you now have safe access to the allocated std::atomic<bool> and it
    // cannot go out of scope while you are using it
}
// now let temporary_flag go out of scope to release your temporary reference count

Boost, mutex concept

I am new to multi-threading programming, and confused about how Mutex works. In the Boost::Thread manual, it states:
Mutexes guarantee that only one thread can lock a given mutex. If a code section is surrounded by a mutex locking and unlocking, it's guaranteed that only a thread at a time executes that section of code. When that thread unlocks the mutex, other threads can enter to that code region:
My understanding is that a mutex is used to protect a section of code from being executed by multiple threads at the same time, NOT to protect the memory address of a variable. It's hard for me to grasp the concept; what happens if I have 2 different functions trying to write to the same memory address?
Is there something like this in Boost library:
1. Lock the memory address of a variable, e.g., double x: lock(x), so that other threads in a different function can not write to x.
2. Do something with x, e.g., x = x + rand();
3. unlock(x)
Thanks.
The mutex itself only ensures that only one thread of execution can lock the mutex at any given time. It's up to you to ensure that modification of the associated variable happens only while the mutex is locked.
C++ does give you a way to do that a little more easily than in something like C. In C, it's pretty much up to you to write the code correctly, ensuring that anywhere you modify the variable, you first lock the mutex (and, of course, unlock it when you're done).
In C++, it's pretty easy to encapsulate it all into a class with some operator overloading:
class protected_int {
    int value;   // this is the value we're going to share between threads
    std::mutex m;
public:
    operator int() { return value; } // we'll assume no lock is needed to read
    protected_int &operator=(int new_value) {
        std::lock_guard<std::mutex> lock(m); // unlocks when lock is destroyed
        value = new_value;
        return *this;
    }
};
Obviously I'm simplifying that a lot (to the point that it's probably useless as it stands), but hopefully you get the idea, which is that most of the code just treats the protected_int object as if it were a normal variable.
When you do that, however, the mutex is automatically locked every time you assign a value to it, and unlocked immediately thereafter. Of course, that's pretty much the simplest possible case -- in many cases, you need to do something like lock the mutex, modify two (or more) variables in unison, then unlock. Regardless of the complexity, however, the idea remains that you centralize all the code that does the modification in one place, so you don't have to worry about locking the mutex in the rest of the code. Where you do have two or more variables together like that, you generally will have to lock the mutex to read, not just to write -- otherwise you can easily get an incorrect value where one of the variables has been modified but the other hasn't.
No, there is nothing in boost(or elsewhere) that will lock memory like that.
You have to protect the code that access the memory you want protected.
what happens if I have 2 different functions trying to write to the same memory address.
Assuming you mean 2 functions executing in different threads, both functions should lock the same mutex, so only one of the threads can write to the variable at a given time.
Any other code that accesses (either reads or writes) the same variable will also have to lock the same mutex; failure to do so will result in nondeterministic behavior.
It is possible to perform non-blocking atomic operations on certain types using Boost.Atomic; these are generally much faster than taking a mutex. For example, to add something atomically you can do:
boost::atomic<int> n(10);
n.fetch_add(5, boost::memory_order_acq_rel);
This code atomically adds 5 to n.
In order to protect a memory address shared by multiple threads in two different functions, both functions have to use the same mutex ... otherwise you will run into a scenario where threads in either function can indiscriminately access the same "protected" memory region.
So boost::mutex works just fine for the scenario you describe, but you just have to make sure that for a given resource you're protecting, all paths to that resource lock the exact same instance of the boost::mutex object.
I think the detail you're missing is that a "code section" is an arbitrary section of code. It can be two functions, half a function, a single line, or whatever.
So the portions of your 2 different functions that hold the same mutex when they access the shared data, are "a code section surrounded by a mutex locking and unlocking" so therefore "it's guaranteed that only a thread at a time executes that section of code".
Also, this is explaining one property of mutexes. It is not claiming this is the only property they have.
Your understanding is correct with respect to mutexes. They protect the section of code between the locking and unlocking.
As for what happens when two threads write to the same memory location: the writes are serialized. One thread writes its value, then the other overwrites it. The problem is that you don't know which thread will write first (or last), so the code is not deterministic.
Finally, to protect the variable itself, you can find a related concept in atomic variables. Atomic variables are variables that are protected by either the compiler or the hardware, and can be modified atomically. That is, the three phases you mention (read, modify, write) happen atomically. Take a look at Boost atomic_count.