Avoiding data race of boolean variables with pthreads

in my code I have the following structure:
Parent thread
somedatatype thread1_continue, thread2_continue; // Does bool guarantee no data race?
Thread 1:
while (thread1_continue) {
// Do some work
}
Thread 2:
while (thread2_continue) {
// Do some work
}
So I wonder which data type should be thread1_continue or thread2_continue to avoid data race. And also if there is any data type or technique in pthread to solve this problem.

There is no built-in basic type that guarantees thread safety, no matter how small. Even if you are working with bool or unsigned char, neither reading nor writing is guaranteed to be atomic. In other words, if several threads work independently on the same memory, one thread may have only partially overwritten it when another thread reads it and gets a garbage value; in that case the behavior is undefined.
You could use a mutex to wrap the critical section with lock and unlock calls to ensure mutual exclusion: only one thread at a time will be able to execute that code. For more sophisticated synchronization there are semaphores, condition variables, or even patterns / idioms describing how synchronization can be handled using these (light switch, turnstile, etc.). Just study more about these, some simple examples can be found here :)
Note that there are also more complex types / wrappers that control how an object is accessed, such as the std::atomic template in C++11, which handles the synchronization internally so that you don't need to do it explicitly. With std::atomic there is a guarantee that: "if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined".
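As a minimal sketch of the std::atomic suggestion applied to the loop from the question: the flag name thread1_continue comes from the question, while worker1, iterations and run_until_stopped are hypothetical names introduced only for illustration. It uses std::thread rather than raw pthreads, which is an assumption about the available toolchain (C++11 or later).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Flag name taken from the question; everything else here is illustrative.
std::atomic<bool> thread1_continue{true};
std::atomic<long> iterations{0};

void worker1() {
    // load() is an atomic read, so a concurrent store() from the parent
    // thread is not a data race.
    while (thread1_continue.load()) {
        iterations.fetch_add(1);      // stand-in for "do some work"
        std::this_thread::yield();
    }
}

// Starts the worker, waits until it has run at least once, then stops it.
long run_until_stopped() {
    thread1_continue.store(true);
    iterations.store(0);
    std::thread t(worker1);
    while (iterations.load() == 0) {
        std::this_thread::yield();    // wait until the worker did some work
    }
    thread1_continue.store(false);    // atomic write seen by the worker's next load()
    t.join();
    assert(!thread1_continue.load());
    return iterations.load();
}
```

Calling run_until_stopped() from main should return a count of at least one. The same pattern would apply to thread2_continue with a second flag.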

For booleans (and others), be sure to avoid
thread 1 loop
{
do actions1;
myFlag = true;
do more1;
}
thread 2 loop
{
do actions2;
if (myFlag)
{
myFlag = false;
do flagged actions;
}
do more2;
}
This nearly always works until myFlag is set by thread 1 while thread 2 is in between checking and resetting myFlag. There are CPU-dependent primitives to handle test-and-set, but the normal solution is to lock when accessing shared resources, even booleans.
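One portable way to get the test-and-set behavior mentioned above, if C++11 is available, is std::atomic<bool>::exchange, which performs the check and the reset as a single indivisible step. The sketch below is only an illustration: consume_flag and count_flagged_actions are hypothetical names, and the producer waits for each signal to be consumed so that none is overwritten.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> myFlag{false};

// Thread 2's "if (myFlag) { myFlag = false; ... }" as one atomic step:
// exchange(false) stores false and returns the value that was there before.
bool consume_flag() {
    return myFlag.exchange(false);
}

// Thread 1 raises the flag `signals` times; thread 2 counts how many it saw.
int count_flagged_actions(int signals) {
    int handled = 0;  // written only by the consumer, read here after join()
    std::thread consumer([&] {
        while (handled < signals) {
            if (consume_flag()) {
                ++handled;                    // stand-in for "do flagged actions"
            } else {
                std::this_thread::yield();
            }
        }
    });
    for (int i = 0; i < signals; ++i) {
        myFlag.store(true);
        while (myFlag.load()) {
            std::this_thread::yield();        // wait until the consumer took it
        }
    }
    consumer.join();
    assert(handled == signals);
    return handled;
}
```

Because the read and the reset happen together, a set by thread 1 can no longer slip in between them and be lost.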

Related

Multithreaded access to global variable : should I use mutex

Suppose I have 2 threads : std::thread thd1; std::thread thd2; Thread thd1 periodically sets some global variable by calling the following setFlag function :
static std::int32_t g_flag;
static std::mutex io_mutex;
void setFlag( std::int32_t flag )
{
//std::lock_guard<std::mutex> lk(io_mutex);
g_flag = flag;
}
And thread thd2 periodically reads this flag
std::int32_t getFlag()
{
//std::lock_guard<std::mutex> lk(io_mutex);
return g_flag;
}
The question is - should I use mutex in this case? Is it safe to access variable in read-write manner from several threads without having mutex?

Accessing a memory location for a write in one thread and either a read or write in another thread without synchronization and at least one of them non-atomically, is known as a data race and causes undefined behavior in C++.
In your code the write access to g_flag of thread 1 is not synchronized with the read access of thread 2 to the same variable.
Therefore your program has undefined behavior (as none of the accesses are atomic).
One possible solution for this is to use a mutex as you are demonstrating correctly in the commented code, which will synchronize the read and write access, such that one happens-before the other, although the order in which these happen-before is still indeterminate.
Another possibility is to declare g_flag as an atomic:
std::atomic<std::int32_t> g_flag{};
As mentioned above, atomic accesses (which std::atomic provides) are specifically exempt from causing data races and undefined behavior when accessed potentially in parallel for write and read.
An atomic will (in general) not make the other thread wait as a mutex/lock does. This does however also make it trickier to use correctly if you are accessing other shared memory as well.
Instead there are further options for std::atomic to specify whether and how other memory accesses around the atomic access will be ordered, i.e. whether and to what degree it will also cause synchronization between the threads.
Without further details I cannot determine what the appropriate tool is in your case.
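To illustrate the ordering options mentioned above, here is a hedged sketch of the common release/acquire pairing, where an atomic flag publishes an ordinary variable to another thread. The names payload, payload_ready, producer, consumer and publish_and_read are hypothetical and not part of the question's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

std::int32_t payload = 0;                // ordinary, non-atomic shared data
std::atomic<bool> payload_ready{false};  // atomic flag that publishes it

void producer() {
    payload = 42;                                          // plain write
    payload_ready.store(true, std::memory_order_release);  // publish it
}

std::int32_t consumer() {
    // The acquire load that observes true synchronizes with the release
    // store, so the earlier write to payload is visible afterwards.
    while (!payload_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

std::int32_t publish_and_read() {
    payload = 0;
    payload_ready.store(false);
    std::int32_t seen = 0;
    std::thread t1(producer);
    std::thread t2([&] { seen = consumer(); });
    t1.join();
    t2.join();
    assert(seen == 42);
    return seen;
}
```

With std::memory_order_relaxed on both sides the flag itself would still be race-free, but the read of payload would no longer be ordered after the write, which is the kind of subtlety that makes atomics trickier than a mutex.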

C++ member update visibility inside a critical section when not atomic

I stumbled across the following Code Review StackExchange and decided to read it for practice. In the code, there is the following:
Note: I am not looking for a code review and this is just a copy paste of the code from the link so you can focus in on the issue at hand without the other code interfering. I am not interested in implementing a 'smart pointer', just understanding the memory model:
// Copied from the link provided (all inside a class)
unsigned int count;
mutex m_Mutx;
void deref()
{
m_Mutx.lock();
count--;
m_Mutx.unlock();
if (count == 0)
{
delete rawObj;
count = 0;
}
}
Seeing this makes me immediately think "what if two threads enter when count == 1 and neither sees the updates of the other? Can both end up seeing count as zero and double delete? And is it possible for two threads to cause count to become -1 so that deletion never happens?"
The mutex will make sure one thread enters the critical section, however does this guarantee that all threads will be properly updated? What does the C++ memory model tell me so I can say this is a race condition or not?
I looked at the Memory model cppreference page and std::memory_order cppreference, however the latter page seems to deal with a parameter for atomic. I didn't find the answer I was looking for or maybe I misread it. Can anyone tell me if what I said is wrong or right, and whether or not this code is safe or not?
For correcting the code if it is broken:
Is the correct answer for this to turn count into an atomic member? Or does this work and after releasing the lock on the mutex, all the threads see the value?
I'm also curious if this would be considered the correct answer:
Note: I am not looking for a code review and trying to see if this kind of solution would solve the issue with respect to the C++ memory model.
#include <atomic>
#include <mutex>
struct ClassNameHere {
int* rawObj;
std::atomic<unsigned int> count;
std::mutex mutex;
// ...
void deref()
{
std::scoped_lock lock{mutex};
count--;
if (count == 0)
delete rawObj;
}
};

"what if two threads enter when count == 1" -- if that happens, something else is fishy. The idea behind smart pointers is that the refcount is bound to an object's lifetime (scope). The decrement happens when the object (via stack unrolling) is destroyed. If two threads trigger that, the refcount can not possibly be just 1 unless another bug is present.
However, what could happen is that two threads enter this code when count = 2. In that case, the decrement operation is locked by the mutex, so it can never reach negative values. Again, this assumes non-buggy code elsewhere. Since all this does is to delete the object (and then redundantly set count to zero), nothing bad can happen.
What can happen is a double delete though. If two threads at count = 2 decrement the count, they could both see the count = 0 afterwards. Just determine whether to delete the object inside the mutex as a simple fix. Store that info in a local variable and handle accordingly after releasing the mutex.
Concerning your third question, turning the count into an atomic is not going to fix things magically. Also, the point behind atomics is that you don't need a mutex, because locking a mutex is an expensive operation. With atomics, you can combine operations like decrement and check for zero, which is similar to the fix proposed above. Atomics are typically slower than "normal" integers. They are still faster than a mutex though.
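The idea of combining the decrement and the check for zero in one atomic operation could look roughly like the sketch below. Counted, deletions and release_from_two_threads are hypothetical names used only for this illustration; the deletions counter exists purely so the result can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Counted {  // hypothetical stand-in for the class in the question
    int* rawObj;
    std::atomic<unsigned int> count;
    std::atomic<int> deletions{0};  // only here to observe the outcome

    explicit Counted(unsigned int owners)
        : rawObj(new int(7)), count(owners) {}

    void deref() {
        // fetch_sub returns the value *before* the decrement, so exactly one
        // caller observes 1 and becomes responsible for the delete.
        if (count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete rawObj;
            rawObj = nullptr;
            deletions.fetch_add(1);
        }
    }
};

// Two owners release their reference concurrently.
int release_from_two_threads() {
    Counted c(2);
    std::thread a([&] { c.deref(); });
    std::thread b([&] { c.deref(); });
    a.join();
    b.join();
    assert(c.count.load() == 0);
    return c.deletions.load();
}
```

No mutex is needed here because the decision to delete is derived from the return value of the single atomic operation rather than from a second, separate read of count.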

In both cases there’s a data race. Thread 1 decrements the counter to 1, and just before the if statement a thread switch occurs. Thread 2 decrements the counter to 0 and then deletes the object. Thread 1 resumes, sees that count is 0, and deletes the object again.
Move the unlock() to the end of the function or, better, use std::lock_guard to do the lock; its destructor will unlock the mutex even when the delete call throws an exception.

If two threads potentially* enter deref() concurrently, then, regardless of the previous or previously expected value of count, a data race occurs, and your entire program, even the parts that you would expect to be chronologically prior, has undefined behavior as stated in the C++ standard in [intro.multithread/20] (N4659):
Two actions are potentially concurrent if
(20.1) they are performed by different threads, or
(20.2) they are unsequenced, at least one is performed by a signal handler, and they are not both performed by the same signal handler invocation.
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is
not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
The potentially concurrent actions in this case, of course, are the read of count outside of the locked section, and the write of count within it.
*) That is, if current inputs allow it.
UPDATE 1: The section you reference, describing atomic memory order, explains how atomic operations synchronize with each other and with other synchronization primitives (such as mutexes and memory barriers). In other words, it describes how atomics can be used for synchronization so that some operations aren't data races. It does not apply here. The standard takes a conservative approach here: Unless other parts of the standard explicitly make clear that two conflicting accesses are not concurrent, you have a data race, and hence UB (where conflicting means same memory location, and at least one of them isn't read-only).

Your lock prevents the operation count-- from being corrupted when it is performed concurrently in different threads. It does not, however, synchronize reads of count made outside the critical section, so reading count again after the unlock bears the risk of a data race.
You could rewrite it as follows:
void deref()
{
bool isLast;
m_Mutx.lock();
--count;
isLast = (count == 0);
m_Mutx.unlock();
if (isLast) {
delete rawObj;
}
}
Thereby, the lock makes sure that access to count is synchronized and always in a valid state. This valid state is carried over to the non-critical section through a local variable (without race condition). Thereby, the critical section can be kept rather short.
A simpler version would be to synchronize the complete function body; this can become a disadvantage if you want to do more elaborate things than just delete rawObj:
void deref()
{
std::lock_guard<std::mutex> lock(m_Mutx);
if (! --count) {
delete rawObj;
}
}
BTW: std::atomic alone will not solve this issue, as it synchronizes each single access but not a "transaction". Therefore, your scoped_lock is necessary, and since it then spans the complete function, the std::atomic becomes superfluous.

Is mutex mandatory to access extern variable from a different thread?

I am developing an application in Qt/C++. At some point, there are two threads : one is the UI thread and the other one is the background thread. I have to do some operation from the background thread based on the value of an extern variable which is type of bool. I am setting this value by clicking a button on UI.
header.cpp
extern bool globalVar;
mainWindow.cpp
//main ui thread on button click
setVale(bool val){
globalVar = val;
}
backgroundThread.cpp
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
Here, writing to globalVar happens only when the user clicks the button whereas reading happens continuously.
So my question is :
In a situation like the one above, is mutex mandatory?
If read and write happens at the same time, does this cause the application to crash?
If read and write happens at same time, is globalVar going to have some value other than true or false?
Finally, does the OS provide any kind of locking mechanism to prevent the read/write operation to access a memory location at the same time by a different thread?

The loop
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
is busy waiting, which is extremely wasteful. Thus, you're probably better off with some classic synchronization that will wake the background thread (mostly) when there is something to be done. You should consider adapting this example of std::condition_variable.
Say you start with:
#include <thread>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
bool ready = false;
Your worker thread can then be something like this:
void worker_thread()
{
while(true)
{
// Wait until main() sends data
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return ready;});
ready = false;
lk.unlock();
// Do the actual work here, after releasing the lock
}
}
The notifying thread should do something like this:
{
std::lock_guard<std::mutex> lk(m);
ready = true;
}
cv.notify_one();

Since it is just a single plain bool, I'd say a mutex is overkill; you should just go for an atomic (an atomic integer or std::atomic<bool>) instead. On mainstream platforms, reads and writes of such an atomic are typically lock-free and cheap, which is generally preferable when it is possible.
If it is something more complex, then by all means go for a mutex.
It won't crash from that alone, but you can get data corruption, which may crash the application.
The system will not manage that stuff for you, you do it manually, just make sure all access to the data goes through the mutex.
Edit:
Since you specify a number of times that you don't want a complex solution, you may opt for simply using a mutex instead of the bool. There is no need to protect the bool with a mutex, since you can use the mutex as a bool, and yes, you could go with an atomic, but that's what the mutex already does (plus some extra functionality in the case of recursive mutexes).
It also matters what is your exact workload, since your example doesn't make a lot of sense in practice. It would be helpful to know what those some operations are.
So in your ui thread you could simply val ? mutex.lock() : mutex.unlock(), and in your secondary thread you could use if (mutex.tryLock()) { doStuff; mutex.unlock(); } else doOtherStuff;. Now if the operation in the secondary thread takes too long and you happen to be changing the lock in the main thread, that will block the main thread until the secondary thread unlocks. You could use tryLock(timeout) in the main thread, depending on what you prefer: lock() will block until success, while tryLock(timeout) will prevent blocking but the lock may fail. Also, take care not to unlock from a thread other than the one you locked with, and not to unlock an already unlocked mutex.
Depending on what you are actually doing, maybe an asynchronous event driven approach would be more appropriate. Do you really need that while(1)? How frequently do you perform those operations?

In a situation like the one above, is a mutex necessary?
A mutex is one tool that will work. What you actually need are three things:
a means of ensuring an atomic update (a bool will give you this as it's mandated to be an integral type by the standard)
a means of ensuring that the effects of a write made by one thread are actually visible in the other thread. This may sound counter-intuitive, but unless told otherwise, optimisations (software and hardware) treat the code as if it were single-threaded and do not need to consider cross-thread communication, and...
a means of preventing the compiler (and CPU!!) from re-ordering the reads and writes.
The answer to the implied question is 'yes'. You will need something that does all of these things (see below)
If a read and a write happen at the same time, does this cause the application to crash?
not when it's a bool, but the program won't behave as you expect. In fact, because the program is now exhibiting undefined behaviour you can no longer reason about its behaviour at all.
If a read and a write happen at the same time, is globalVar going to have some value other than true or false?
not in this case because it's an intrinsic (atomic) type.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing a memory location at the same time?
Not unless you specify one.
Your options are:
std::atomic<bool>
std::mutex
std::atomic_signal_fence

Realistically speaking, as long as you use an integer type (not bool), make it volatile, and keep inside of its own cache line by properly aligning its storage, you don't need to do anything special at all.
In a situation like the one above, is a mutex necessary?
Only if you want to keep the value of the variable synchronized with other state.
If a read and a write happen at the same time, does this cause the application to crash?
According to C++ standard, it's undefined behavior. So anything can happen: e.g. your application might not crash, but its state might be subtly corrupted. In real life, though, compilers often offer some sane implementation defined behavior and you're fine unless your platform is really weird. Anything commonplace, like 32 and 64 bit intel, PPC and ARM will be fine.
If a read and a write happen at the same time, is globalVar going to have some value other than true or false?
globalVar can only have these two values, so it makes no sense to speak of any other values unless you're talking about its binary representation. Yes, it could happen that the binary representation is incorrect and not what the compiler would expect. That's why you shouldn't use a bool but a uint8_t instead.
I wouldn't love to see such flag in a code review, but if a uint8_t flag is the simplest solution to whatever problem you're solving, I say go for it. The if (globalVar) test will treat zero as false, and anything else as true, so temporary "gibberish" is OK and won't have any odd effects in practice. According to the standard, you'll be facing undefined behavior, of course.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing a memory location at the same time?
It's not the OS's job to do that.
Speaking of practice, though: on any reasonable platform, the use of a std::atomic_bool will have no overhead over the use of a naked uint8_t, so just use that and be done.

Safely Destroying a Thread Pool

Consider the following implementation of a trivial thread pool written in C++14.
threadpool.h
threadpool.cpp
Observe that each thread is sleeping until it's been notified to awaken -- or some spurious wake up call -- and the following predicate evaluates to true:
std::unique_lock<mutex> lock(this->instance_mutex_);
this->cond_handle_task_.wait(lock, [this] {
return (this->destroy_ || !this->tasks_.empty());
});
Furthermore, observe that a ThreadPool object uses the data member destroy_ to determine if it's being destroyed -- the destructor has been called. Toggling this data member to true will notify each worker thread that it's time to finish its current task and any of the other queued tasks, then synchronize with the thread that's destroying this object, in addition to prohibiting the enqueue member function.
For your convenience, the implementation of the destructor is below:
ThreadPool::~ThreadPool() {
{
std::lock_guard<mutex> lock(this->instance_mutex_); // this line.
this->destroy_ = true;
}
this->cond_handle_task_.notify_all();
for (auto &worker : this->workers_) {
worker.join();
}
}
Q: I do not understand why it's necessary to lock the object's mutex while toggling destroy_ to true in the destructor. Furthermore, is it only necessary for setting its value or is it also necessary for accessing its value?
BQ: Can this thread pool implementation be improved or optimized while maintaining its original purpose: a thread pool that can pool N threads and distribute tasks to them to be executed concurrently?
This thread pool implementation is forked from Jakob Progsch's C++11 thread pool repository with a thorough code step through to understand the purpose behind its implementation and some subjective style changes.
I am introducing myself to concurrent programming and there is still much to learn -- I am a novice concurrent programmer as it stands right now. If my questions are not worded correctly then please make the appropriate correction(s) in your provided answer. Moreover, if the answer can be geared towards a client who is being introduced to concurrent programming for the first time then that would be best -- for myself and any other novices as well.

If the owning thread of the ThreadPool object is the only thread that atomically writes to the destroy_ variable, and the worker threads only atomically read from the destroy_ variable, then no, a mutex is not needed to protect the destroy_ variable in the ThreadPool destructor. Typically a mutex is necessary when an atomic set of operations must take place that can't be accomplished through a single atomic instruction on a platform (i.e., operations beyond an atomic swap, etc.). That being said, the author of the thread pool may be trying to force some type of acquire semantics on the destroy_ variable without resorting to atomic operations (i.e. a memory fence operation), and/or the setting of the flag itself is not considered an atomic operation (platform dependent)... Some other options include declaring the variable as volatile to prevent it from being cached, etc. You can see this thread for more info.
Without some sort of synchronization operation in place, the worst case scenario could end up with a worker that won't complete due to the destroy_ variable being cached on a thread. On platforms with weaker memory ordering models, that's always a possibility if you allowed a benign memory race condition to exist ...

C++ defines a data race as multiple threads potentially accessing an object simultaneously with at least one of those accesses being a write. Programs with data races have undefined behavior. If you were to write to destroy_ in your destructor without holding the mutex, your program would have undefined behavior and we cannot predict what would happen.
If you were to read destroy_ elsewhere without holding the mutex, that read could potentially happen while the destructor is writing to it, which is also a data race.
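The sketch below shows the same locking discipline in a deliberately tiny, single-worker form. It is not the thread pool from the question: MiniWorker, its members and drain_three_tasks are hypothetical names. The point it illustrates is that destroy_ is both written and read only while the mutex is held. As an additional observation beyond the answers above, setting the flag under the mutex also prevents the notification from arriving between the worker's predicate check and its going to sleep.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

class MiniWorker {  // hypothetical single-worker stand-in for a pool
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<int> tasks_;
    bool destroy_ = false;  // only touched while mutex_ is held
    int processed_ = 0;
    std::thread thread_;    // declared last so the members above exist first

    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            cond_.wait(lock, [this] { return destroy_ || !tasks_.empty(); });
            if (tasks_.empty()) return;  // destroy_ is set and nothing is left
            tasks_.pop();                // stand-in for executing the task
            ++processed_;
        }
    }

public:
    MiniWorker() : thread_([this] { run(); }) {}
    ~MiniWorker() { shutdown(); }

    void enqueue(int task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(task);
        }
        cond_.notify_one();
    }

    // Sets destroy_ under the mutex, wakes the worker, and waits for it.
    int shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            destroy_ = true;
        }
        cond_.notify_all();
        if (thread_.joinable()) thread_.join();
        return processed_;  // safe to read: the worker has finished
    }
};

int drain_three_tasks() {
    MiniWorker w;
    w.enqueue(1);
    w.enqueue(2);
    w.enqueue(3);
    return w.shutdown();  // the worker drains the queue before exiting
}
```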

Boost, mutex concept

I am new to multi-threading programming, and confused about how Mutex works. In the Boost::Thread manual, it states:
Mutexes guarantee that only one thread can lock a given mutex. If a code section is surrounded by a mutex locking and unlocking, it's guaranteed that only a thread at a time executes that section of code. When that thread unlocks the mutex, other threads can enter to that code region:
My understanding is that Mutex is used to protect a section of code from being executed by multiple threads at the same time, NOT protect the memory address of a variable. It's hard for me to grasp the concept, what happen if I have 2 different functions trying to write to the same memory address.
Is there something like this in Boost library:
lock a memory address of a variable, e.g., double x, lock (x); So
that other threads with a different function can not write to x.
do something with x, e.g., x = x + rand();
unlock (x)
Thanks.

The mutex itself only ensures that only one thread of execution can lock the mutex at any given time. It's up to you to ensure that modification of the associated variable happens only while the mutex is locked.
C++ does give you a way to do that a little more easily than in something like C. In C, it's pretty much up to you to write the code correctly, ensuring that anywhere you modify the variable, you first lock the mutex (and, of course, unlock it when you're done).
In C++, it's pretty easy to encapsulate it all into a class with some operator overloading:
#include <mutex>
class protected_int {
int value; // this is the value we're going to share between threads
std::mutex m;
public:
operator int() { return value; } // we'll assume no lock needed to read
protected_int &operator=(int new_value) {
std::lock_guard<std::mutex> guard(m); // locks m here, unlocks it on return
value = new_value;
return *this;
}
};
Obviously I'm simplifying that a lot (to the point that it's probably useless as it stands), but hopefully you get the idea, which is that most of the code just treats the protected_int object as if it were a normal variable.
When you do that, however, the mutex is automatically locked every time you assign a value to it, and unlocked immediately thereafter. Of course, that's pretty much the simplest possible case -- in many cases, you need to do something like lock the mutex, modify two (or more) variables in unison, then unlock. Regardless of the complexity, however, the idea remains that you centralize all the code that does the modification in one place, so you don't have to worry about locking the mutex in the rest of the code. Where you do have two or more variables together like that, you generally will have to lock the mutex to read, not just to write -- otherwise you can easily get an incorrect value where one of the variables has been modified but the other hasn't.

No, there is nothing in Boost (or elsewhere) that will lock memory like that.
You have to protect the code that access the memory you want protected.
what happen if I have 2 different functions trying to write to the same
memory address.
Assuming you mean 2 functions executing in different threads, both functions should lock the same mutex, so only one of the threads can write to the variable at a given time.
Any other code that accesses (either reads or writes) the same variable will also have to lock the same mutex; failing to do so will result in nondeterministic behavior.
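A minimal sketch of that rule, using std::mutex from the standard library instead of boost::mutex (an assumption made so the example needs no external dependency): two different functions modify the same double x, and every access, including the read, goes through the same mutex. The names x_mutex, add_one, add_two, read_x and run_both are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

double x = 0.0;
std::mutex x_mutex;  // the one mutex that guards every access to x

void add_one() {
    std::lock_guard<std::mutex> lock(x_mutex);
    x = x + 1.0;
}

void add_two() {
    std::lock_guard<std::mutex> lock(x_mutex);
    x = x + 2.0;
}

double read_x() {
    std::lock_guard<std::mutex> lock(x_mutex);  // readers lock as well
    return x;
}

// Runs both functions concurrently, `rounds` times each.
double run_both(int rounds) {
    {
        std::lock_guard<std::mutex> lock(x_mutex);
        x = 0.0;
    }
    std::thread a([rounds] { for (int i = 0; i < rounds; ++i) add_one(); });
    std::thread b([rounds] { for (int i = 0; i < rounds; ++i) add_two(); });
    a.join();
    b.join();
    return read_x();
}
```

The mutex does not know anything about x; the protection comes solely from the convention that no code touches x without holding x_mutex.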

It is possible to do non-blocking atomic operations on certain types using Boost.Atomic. These operations are non-blocking and generally much faster than a mutex. For example, to add something atomically you can do:
boost::atomic<int> n(10);
n.fetch_add(5, boost::memory_order_acq_rel);
This code atomically adds 5 to n.

In order to protect a memory address shared by multiple threads in two different functions, both functions have to use the same mutex ... otherwise you will run into a scenario where threads in either function can indiscriminately access the same "protected" memory region.
So boost::mutex works just fine for the scenario you describe, but you just have to make sure that for a given resource you're protecting, all paths to that resource lock the exact same instance of the boost::mutex object.

I think the detail you're missing is that a "code section" is an arbitrary section of code. It can be two functions, half a function, a single line, or whatever.
So the portions of your 2 different functions that hold the same mutex when they access the shared data, are "a code section surrounded by a mutex locking and unlocking" so therefore "it's guaranteed that only a thread at a time executes that section of code".
Also, this is explaining one property of mutexes. It is not claiming this is the only property they have.

Your understanding is correct with respect to mutexes. They protect the section of code between the locking and unlocking.
As per what happens when two threads write to the same location of memory, they are serialized. One thread writes its value, the other thread writes to it. The problem with this is that you don't know which thread will write first (or last), so the code is not deterministic.
Finally, to protect a variable itself, you can find a near concept in atomic variables. Atomic variables are variables that are protected by either the compiler or the hardware, and can be modified atomically. That is, the three phases you comment (read, modify, write) happen atomically. Take a look at Boost atomic_count.
