Suppose I have 2 threads: std::thread thd1; std::thread thd2; Thread thd1 periodically sets a global variable by calling the following setFlag function:
static std::int32_t g_flag;
static std::mutex io_mutex;
void setFlag( std::int32_t flag )
{
//std::lock_guard<std::mutex> lk(io_mutex);
g_flag = flag;
}
And thread thd2 periodically reads this flag
std::int32_t getFlag()
{
//std::lock_guard<std::mutex> lk(io_mutex);
return g_flag;
}
The question is - should I use a mutex in this case? Is it safe to access a variable in a read-write manner from several threads without a mutex?
Accessing a memory location for a write in one thread and for either a read or a write in another thread, without synchronization and with at least one of the accesses non-atomic, is known as a data race and causes undefined behavior in C++.
In your code the write access to g_flag of thread 1 is not synchronized with the read access of thread 2 to the same variable.
Therefore your program has undefined behavior (as none of the accesses are atomic).
One possible solution is to use a mutex, as the commented-out code correctly demonstrates. This synchronizes the read and write accesses so that one happens-before the other, although which one comes first remains indeterminate.
Another possibility is to declare g_flag as an atomic:
std::atomic<std::int32_t> g_flag{};
As mentioned above, atomic accesses (which std::atomic provides) are specifically exempt from causing data races and undefined behavior when accessed potentially in parallel for write and read.
An atomic will (in general) not make the other thread wait the way a mutex/lock does. However, this also makes it trickier to use correctly if you are accessing other shared memory as well.
For that, std::atomic offers further options (memory orders) to specify whether and how other memory accesses around the atomic access are ordered, i.e. whether and to what degree it also causes synchronization between the threads.
Without further details I cannot determine what the appropriate tool is in your case.
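As a sketch of the atomic variant (reusing the question's names, and assuming no other shared data depends on the flag, so relaxed ordering suffices):

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

static std::atomic<std::int32_t> g_flag{0};

void setFlag(std::int32_t flag)
{
    // Atomic store: no data race even if getFlag runs concurrently.
    g_flag.store(flag, std::memory_order_relaxed);
}

std::int32_t getFlag()
{
    return g_flag.load(std::memory_order_relaxed);
}

// Demo: one thread writes, the main thread reads after join().
std::int32_t demo_flag()
{
    std::thread writer([] { setFlag(42); });
    writer.join();  // join() synchronizes-with the end of the writer thread
    return getFlag();
}
```

If the flag is used to publish other non-atomic data, relaxed ordering is not enough and release/acquire would be needed instead.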
Related
I have a class method in which I want to read the same non-atomic class member simultaneously in different threads. The method is const, so it doesn't write to the member being read. Is it safe to not bother about any locks in this case?
Edit:
I should have given an example:
class SomeClass
{
public:
    void someMethod() const;
    //...
private:
    std::string someMember_; // might be changed by some methods
    //...
};
void SomeClass::someMethod() const
{
    // the lambdas need to capture this to read the member
    std::jthread thr1([this](){ /* read someMember_ here */ });
    std::jthread thr2([this](){ /* read someMember_ here */ });
    //...
}
It is safe, provided you can guarantee that the data being read won't change during the period that multiple threads have unsynchronized access to it.
(Note that this period includes not just the brief moments when threads are calling your const method, but the entire region of time during which a thread could call it. For example, in many programs this no-modifications-allowed period begins when the first child thread is spawned and ends when the last child thread has exited and been join()'d.)
Given that:
Yes, the value can be changed by other methods, but not during the execution of the method with multiple threads.
It is not inherently safe to blindly read non-atomic values that have potentially been modified by some other thread, even if you know for 100% certain that no modification is happening concurrently with the reads.
There has to be synchronization between the two threads before a modification is guaranteed to be seen by the reading threads.
The potential problems go beyond reading an out-of-date value. If the compiler can establish that there is no synchronization happening between two subsequent reads of the same memory location, it's perfectly allowed to only read it once and cache the result.
There are various ways for the synchronization to be done, and which one is best in a given scenario will depend on the context.
See https://en.cppreference.com/w/cpp/language/memory_model for more details.
However, once synchronization has been established, any number of threads can concurrently read from the same memory location.
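As a hedged illustration (names invented here), thread creation and join() are themselves synchronization points, so a value written before spawning reader threads is safely visible to any number of them without locks:

```cpp
#include <cstddef>
#include <string>
#include <thread>

static std::string shared_value;  // non-atomic shared data

std::size_t read_from_two_threads()
{
    shared_value = "hello";  // the modification happens first...
    std::size_t a = 0, b = 0;
    // ...and thread creation synchronizes-with the start of each reader,
    // so both threads may read shared_value concurrently without locks.
    std::thread t1([&] { a = shared_value.size(); });
    std::thread t2([&] { b = shared_value.size(); });
    t1.join();
    t2.join();
    return a + b;  // 5 + 5
}
```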
I have a class method in which I want to read the same non-atomic class member simultaneously in different threads. The method is const, so it doesn't write to the member being read. Is it safe to not bother about any locks in this case?
Yes, it is 100% safe. I am assuming that the only issue is that multiple threads are reading and that the code would be safe if a single thread were reading. That additional threads are reading the same data has no effect on whether the reads are safe or not.
A data race can only occur between a read and a modification. So if one read from one thread wouldn't race with any modifications, additional reads can't either.
I have a size_t variable which is updated by a std::thread and read by another std::thread.
I know that I can mutex protect the reads and writes.
But, would it be the same or would it be beneficial if I make the size_t as std::atomic<size_t>?
Yes, it is worth it. In fact it is mandatory to use std::atomic or synchronize access to a non-atomic if multiple threads use the same variable and at least one is writing to the variable.
Not following this rule is data-race undefined behavior.
Depending on your use of the std::size_t, the compiler can assume that non-atomic and otherwise non-synchronized variables will not be changed by other threads and optimize the code accordingly. This can cause Bad Things™ to happen.
My usual example for this is a loop where a non-atomic boolean is used:
// make keepRunning an std::atomic<bool> to avoid endless loop
bool keepRunning {true};
unsigned number = 0;
void stop()
{
keepRunning = false;
}
void loop()
{
while(keepRunning) {
number += 1;
}
}
When compiling this code with optimizations enabled, GCC and Clang will both only check keepRunning once and then start an endless loop.
See https://godbolt.org/z/GYMiLE for the generated assembler output.
i.e. they optimize it into if (keepRunning) infinite_loop;, hoisting the load out of the loop. Because it's non-atomic, they're allowed to assume no other thread can be writing it. See Multithreading program stuck in optimized mode but runs normally in -O0 for a more detailed look at the same problem.
Note that this example only shows the error if the loop body is sufficiently simple. However the undefined behaviour is still present and should be avoided by using std::atomic or synchronization.
In this case you can use std::atomic<bool> with std::memory_order_relaxed because you don't need any synchronization or ordering wrt. other operations in either the writing or reading thread. That will give you atomicity (no tearing) and the assumption that the value can change asynchronously, without making the compiler use any asm barrier instructions to create more ordering wrt. other operations.
So it's possible and safe to use atomics without any synchronization, and without even creating synchronization between the writer and reader the way seq_cst or acquire/release loads and stores do. You can use this synchronization to safely share a non-atomic variable or array, e.g. with atomic<int*> buffer that the reader reads when the pointer is non-NULL.
But if only the atomic variable itself is shared, you can just have readers read the current value, not caring about synchronization. You may want to read it into a local temporary if you don't need to re-read every iteration of a short loop, only once per function call.
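A sketch of that relaxed-atomic pattern (helper names invented here; nothing else is shared, so relaxed ordering is sufficient):

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> keepRunning{true};

unsigned long spin()
{
    unsigned long number = 0;
    // Relaxed load: atomicity plus "may change asynchronously", but no
    // ordering of surrounding memory accesses -- none is needed here.
    while (keepRunning.load(std::memory_order_relaxed)) {
        ++number;
    }
    return number;
}

bool demo_stop()
{
    unsigned long result = 0;
    std::thread worker([&] { result = spin(); });
    keepRunning.store(false, std::memory_order_relaxed);  // stop the loop
    worker.join();
    return true;  // reaching here proves the loop terminated
}
```

Unlike the plain bool version, the compiler cannot hoist the load out of the loop, so the worker is guaranteed to observe the store eventually.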
in my code I have the following structure:
Parent thread
somedatatype thread1_continue, thread2_continue; // Does bool guarantee no data race?
Thread 1:
while (thread1_continue) {
// Do some work
}
Thread 2:
while (thread2_continue) {
// Do some work
}
So I wonder which data type should be thread1_continue or thread2_continue to avoid data race. And also if there is any data type or technique in pthread to solve this problem.
There is no built-in basic type that guarantees thread safety, no matter how small. Even if you are working with bool or unsigned char, neither reading nor writing is guaranteed to be atomic. In other words: there is a chance that if more threads are independently working with the same memory, one thread can overwrite this memory only partially while the other reads a garbage value; in that case the behavior is undefined.
You could use a mutex to wrap the critical section with lock and unlock calls to ensure mutual exclusion: only one thread at a time will be able to execute that code. For more sophisticated synchronization there are semaphores, condition variables, and patterns/idioms describing how synchronization can be handled using these (lightswitch, turnstile, etc.). Just study more about these.
Note that there are also more complex types/wrappers available that control how the object is accessed, such as the std::atomic template in C++11, which internally handles the synchronization for you so that you don't need to do it explicitly. With std::atomic there is a guarantee that "if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined".
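In C++11 terms (the std::atomic suggestion above; the demo scaffolding here is invented for illustration), such a stop flag might look like:

```cpp
#include <atomic>
#include <thread>

// The parent thread owns the flag; std::atomic<bool> makes the
// cross-thread read/write well-defined (no torn values, no hoisting
// of the check out of the loop).
std::atomic<bool> thread1_continue{true};

bool demo_flags()
{
    unsigned iterations = 0;
    std::thread t1([&] {
        while (thread1_continue.load()) {  // seq_cst by default
            ++iterations;                  // "do some work"
        }
    });
    thread1_continue.store(false);         // parent asks the worker to stop
    t1.join();
    return true;                           // the loop terminated
}
```

With raw pthreads the equivalent would be a flag guarded by a pthread_mutex_t in both the reader and the writer.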
For booleans (and others), be sure to avoid
thread 1 loop
{
do actions1;
myFlag = true;
do more1;
}
thread 2 loop
{
do actions2;
if (myFlag)
{
myFlag = false;
do flagged actions;
}
do more2;
}
This nearly always works, until myFlag is set by thread 1 while thread 2 is in between checking and resetting myFlag. There are CPU-dependent primitives to handle test-and-set, but the normal solution is to lock when accessing shared resources, even booleans.
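One lock-free way to close that check-then-reset window is an atomic exchange, which reads and clears the flag in a single indivisible step (a sketch, not the only option; the counter is invented for the demo):

```cpp
#include <atomic>

std::atomic<bool> myFlag{false};
int flaggedActions = 0;  // only touched by the polling thread

// Thread 2's check-and-reset, made race-free: exchange(false) returns
// the previous value and clears the flag in one atomic step, so a set
// from thread 1 can never slip in between the check and the reset.
void poll_flag()
{
    if (myFlag.exchange(false)) {
        ++flaggedActions;  // "do flagged actions"
    }
}
```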
The questions are limited to an x86/Linux environment.
One thread writes a variable under a lock; other threads read this variable without a lock.
When the writing thread unlocks, can the other threads read the new value immediately?
volatile int a=0;
/* thread 1(write) */
lock();
a = 10;
unlock();
/* thread 2(read) */
printf("%d",a);
One thread reads a variable under a lock; another thread writes this variable without a lock.
If the reading thread reads after the write completes, can it read the new value immediately?
volatile int a=0;
/* thread 1(read) */
lock();
printf("%d",a);
unlock();
/* thread 2(write) */
a = 10;
One thread writes a variable under a lock; other threads read this variable without a lock. When the writing thread unlocks, can the other threads read the new value immediately?
Yes, they can, but what ensures all the reading threads will not read before the writing starts?
One thread reads a variable under a lock; another thread writes this variable without a lock. If the reading thread reads after the write completes, can it read the new value immediately?
Yes, but again what ensures the ordering of the read and writes?
Since you need the operations to occur in a certain order, you will need to provide some form of synchronization here. The simplest is a signalling counter such as a semaphore.
Note that volatile does not give you any ordering guarantees; it only prevents certain compiler optimizations, so the ordering is still your responsibility.
They could, but it's not guaranteed. In both cases, you've got undefined behavior. As soon as more than one thread accesses an object, and at least one thread modifies it, all accesses must be synchronized, or undefined behavior results.
This is true according to both C++11 and POSIX, and Linux follows the POSIX rules. In some versions of VC++, volatile has been extended with atomic semantics. (Under POSIX, the only semantics associated with volatile concern signals and longjmp/setjmp. It's totally irrelevant to, and largely ignored in, threading.)
Be careful. Just because a variable is written inside a lock doesn't stop other threads from reading it, even during the lock, if other parts of your code don't protect access to a. In your first example, you lock around the code that changes a, but the read after unlock() is unprotected. Just before or during that read, another thread could change the value of a, leaving you with quite unexpected and unpredictable results.
In other words, you are not locking access to a variable per se, but making certain paths in the code mutually exclusive.
Additionally, your use of volatile is concerning. I'm not sure why you used it but I'm guessing it will not give you what you were expecting. Please read this for a fuller explanation.
I am new to multi-threading programming, and confused about how Mutex works. In the Boost::Thread manual, it states:
Mutexes guarantee that only one thread can lock a given mutex. If a code section is surrounded by a mutex locking and unlocking, it's guaranteed that only a thread at a time executes that section of code. When that thread unlocks the mutex, other threads can enter to that code region:
My understanding is that a mutex is used to protect a section of code from being executed by multiple threads at the same time, NOT to protect the memory address of a variable. It's hard for me to grasp the concept; what happens if I have 2 different functions trying to write to the same memory address?
Is there something like this in Boost library:
lock the memory address of a variable, e.g., double x: lock(x), so that other threads running a different function cannot write to x;
do something with x, e.g., x = x + rand();
unlock(x)
Thanks.
The mutex itself only ensures that only one thread of execution can lock the mutex at any given time. It's up to you to ensure that modification of the associated variable happens only while the mutex is locked.
C++ does give you a way to do that a little more easily than in something like C. In C, it's pretty much up to you to write the code correctly, ensuring that anywhere you modify the variable, you first lock the mutex (and, of course, unlock it when you're done).
In C++, it's pretty easy to encapsulate it all into a class with some operator overloading:
class protected_int {
    int value;  // this is the value we're going to share between threads
    std::mutex m;
public:
    operator int() { return value; }  // we'll assume no lock needed to read
    protected_int &operator=(int new_value) {
        std::lock_guard<std::mutex> lk(m);  // locks m, unlocks on return
        value = new_value;
        return *this;
    }
};
Obviously I'm simplifying that a lot (to the point that it's probably useless as it stands), but hopefully you get the idea, which is that most of the code just treats the protected_int object as if it were a normal variable.
When you do that, however, the mutex is automatically locked every time you assign a value to it, and unlocked immediately thereafter. Of course, that's pretty much the simplest possible case -- in many cases, you need to do something like lock the mutex, modify two (or more) variables in unison, then unlock. Regardless of the complexity, however, the idea remains that you centralize all the code that does the modification in one place, so you don't have to worry about locking the mutex in the rest of the code. Where you do have two or more variables together like that, you generally will have to lock the mutex to read, not just to write -- otherwise you can easily get an incorrect value where one of the variables has been modified but the other hasn't.
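A sketch of that multi-variable case (class and invariant invented for illustration), showing why the lock must cover reads as well, so a reader can never observe one variable updated but not the other:

```cpp
#include <mutex>
#include <utility>

class protected_pair {
    int x_ = 0;
    int y_ = 0;              // invariant: y_ == -x_
    mutable std::mutex m_;   // mutable so get() can lock in a const method
public:
    void set(int v) {
        std::lock_guard<std::mutex> lk(m_);
        x_ = v;
        y_ = -v;             // both updated under the same lock
    }
    std::pair<int, int> get() const {
        std::lock_guard<std::mutex> lk(m_);  // lock to read, too
        return {x_, y_};     // always a consistent pair
    }
};
```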
No, there is nothing in boost(or elsewhere) that will lock memory like that.
You have to protect the code that access the memory you want protected.
what happens if I have 2 different functions trying to write to the same memory address.
Assuming you mean 2 functions executing in different threads, both functions should lock the same mutex, so only one of the threads can write to the variable at a given time.
Any other code that accesses (either reads or writes) the same variable will also have to lock the same mutex; failure to do so will result in nondeterministic behavior.
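A minimal sketch of that rule (function names invented): two different functions, possibly running in different threads, both lock the one mutex that guards x, so only one body executes at a time.

```cpp
#include <mutex>

double x = 0.0;
std::mutex x_mutex;  // the single mutex guarding x

// Both functions lock the same mutex, so their bodies are mutually
// exclusive even when called from different threads.
void add_one() { std::lock_guard<std::mutex> lk(x_mutex); x += 1.0; }
void halve()   { std::lock_guard<std::mutex> lk(x_mutex); x /= 2.0; }
```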
It is possible to do non-blocking atomic operations on certain types using Boost.Atomic. These operations are non-blocking and generally much faster than a mutex. For example, to add something atomically you can do:
boost::atomic<int> n = 10;
n.fetch_add(5, boost::memory_order_acq_rel);
This code atomically adds 5 to n.
In order to protect a memory address shared by multiple threads in two different functions, both functions have to use the same mutex ... otherwise you will run into a scenario where threads in either function can indiscriminately access the same "protected" memory region.
So boost::mutex works just fine for the scenario you describe, but you just have to make sure that for a given resource you're protecting, all paths to that resource lock the exact same instance of the boost::mutex object.
I think the detail you're missing is that a "code section" is an arbitrary section of code. It can be two functions, half a function, a single line, or whatever.
So the portions of your 2 different functions that hold the same mutex when they access the shared data, are "a code section surrounded by a mutex locking and unlocking" so therefore "it's guaranteed that only a thread at a time executes that section of code".
Also, this is explaining one property of mutexes. It is not claiming this is the only property they have.
Your understanding is correct with respect to mutexes. They protect the section of code between the locking and unlocking.
As for what happens when two threads write to the same memory location: the writes are serialized. One thread writes its value, then the other overwrites it. The problem is that you don't know which thread writes first (or last), so the code is not deterministic.
Finally, to protect the variable itself, the nearest concept is atomic variables. Atomic variables are variables protected by either the compiler or the hardware, and they can be modified atomically. That is, the three phases you mention (read, modify, write) happen atomically. Take a look at Boost's atomic_count.