Atomic/not-atomic mix, any guarantees? - c++

Let's say I have a GUI thread with code like this:
std::vector<int> vec;
std::atomic<bool> data_ready{false};
std::thread th([&data_ready, &vec]() {
//we get data
vec.push_back(something);
data_ready = true;
});
draw_progress_dialog();
while (!data_ready) {
process_not_user_events();
sleep_a_little();
}
//is it here safe to use vec?
As you can see, I do not protect "vec" with any kind of lock, but I never use "vec" in two threads at the same moment. The only problem is memory access reordering.
Is it impossible, according to the C++11 standard, for some modifications to "vec" to happen after "data_ready = true;"?
It is not clear to me from the documentation whether memory ordering is relevant only for other atomics or not.
Bonus question: is the "default" memory order what I want, or do I have to change the memory order?

As long as the memory order you use is at least acquire/release (the default, sequentially consistent ordering, is stronger and includes it), you are guaranteed to see all updates the writing thread made before setting the flag to true (not just the ones to atomic variables) once you have read true from the flag.
So yes, this is fine.
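A minimal sketch of the pattern discussed above, with the memory orders spelled out instead of relying on the default. The helper name produce_and_wait and the 1 ms sleep are assumptions made for illustration; the sleep stands in for the question's event processing.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: fills a vector on a worker thread and publishes it
// through an atomic flag, mirroring the pattern from the question.
std::vector<int> produce_and_wait(int count) {
    std::vector<int> vec;
    std::atomic<bool> data_ready{false};

    std::thread th([&data_ready, &vec, count]() {
        for (int i = 0; i < count; ++i) {
            vec.push_back(i);  // plain, non-atomic writes
        }
        // Release store: the writes above become visible to any thread
        // that reads `true` from the flag with an acquire load.
        data_ready.store(true, std::memory_order_release);
    });

    // Acquire load: pairs with the release store in the worker.
    while (!data_ready.load(std::memory_order_acquire)) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    // Reading vec is safe here, even before join(): the worker does not
    // touch it after setting the flag.
    const std::size_t seen = vec.size();
    th.join();
    assert(seen == static_cast<std::size_t>(count));
    return vec;
}
```

Calling produce_and_wait(3) would return a vector holding 0, 1, 2. Writing `data_ready = true;` and `while (!data_ready)` as in the question uses sequentially consistent ordering, which gives at least the same guarantee.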

Related

Acquiring lock by checking against a condition and rechecking it

Is something like this valid:
std::vector<std::vector<int>> data;
std::shared_mutex m;
...
void Resize() {
// AreAllVectorsEmpty: std::all_of(data.begin(), data.end(), [](auto& v) { return v.empty(); })
if (!AreAllVectorsEmpty()) {
m.lock();
if (!AreAllVectorsEmpty()) {
data.resize(new_size);
}
m.unlock();
}
}
I am checking AreAllVectorsEmpty() and, if the condition succeeds, taking the lock and then checking the same condition again to decide whether to do the resize.
Would this be thread safe? Resize is only called by one thread, but other threads manipulate elements of data.
Is it a requirement that AreAllVectorsEmpty have a memory fence or acquire semantics?
Edit: Other threads would of course block while m is locked by Resize.
Edit: Let's also assume new_size is large enough that reallocation happens.
Edit: Update code for shared_mutex.
Edit: AreAllVectorsEmpty is iterating over the data vector. Nobody else modifies the data vector itself, but data[0], data[1] etc. are modified by other threads. My assumption is that since data[0]'s size variable is inside the vector and is a simple integer, it is safe to access data[0].size(), data[1].size() etc. in the Resize thread. AreAllVectorsEmpty is iterating over data and checking vector.empty().
I would use a shared_mutex and use:
a shared lock in all threads that just read the vector (while reading the vector)
a unique lock in this thread when resizing the vector
I think first checking for the size, then resizing it, is safe, provided that this is the only thread that modifies the contents of the vector.
A lock automatically implies a memory barrier, otherwise the lock would not make much sense.
The answer depends entirely on how AreAllVectorsEmpty is implemented.
If it just checks a flag that can be set atomically, then yes, it is safe. If it iterates over the vector you intend to change (or other commonly used containers), then no, it is not safe: what happens to the iterators if the vector reallocates internally?
If doing the latter, you need a read/write lock mechanism, have a look at shared mutexes.
You'd then acquire the shared lock before checking, and in case of modification, the exclusive lock.
Be aware that if AreAllVectorsEmpty uses some independent data structure (other than the mentioned atomic flag), you might have to protect that one with a separate mutex as well.
The standard does not seem to request that this works, compare http://en.cppreference.com/w/cpp/container#Thread_safety. If it works with your specific compiler and STL? You'll need to look into the sources. But I would not rely on it.
This brings me to the question: why do you want to do it? For performance reasons? Have you measured performance? Is it really a measurable performance hit when you lock before calling AreAllVectorsEmpty?
BTW, please don't directly lock the mutex, please use a std::lock_guard.
// AreAllVectorsEmpty: std::all_of(data.begin(), data.end(), [](auto& v) { return v.empty(); })
Here you are accessing the internals of the inner vectors (by calling empty()) while another thread could be inserting elements into one of those inner vectors at the same time. That is a data race.
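One way the shared_mutex suggestion above could look, as a rough sketch rather than a drop-in fix. The class name Buckets and its member functions are hypothetical. It assumes each worker thread only ever touches its own index, so workers can hold the lock in shared mode, while the resizing thread performs both the check and the resize under the exclusive lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

class Buckets {
public:
    explicit Buckets(std::size_t n) : data_(n) {}

    // Worker side: the shared lock keeps the outer vector from being
    // resized underneath us. Assumption: each worker uses its own index.
    bool push(std::size_t index, int value) {
        std::shared_lock<std::shared_mutex> lock(m_);
        if (index >= data_.size()) return false;
        data_[index].push_back(value);
        return true;
    }

    // Resizer side: check and resize under one exclusive lock, so no
    // worker can modify an inner vector while it is being inspected.
    bool resize_if_not_all_empty(std::size_t new_size) {
        std::unique_lock<std::shared_mutex> lock(m_);
        const bool all_empty = std::all_of(
            data_.begin(), data_.end(),
            [](const std::vector<int>& v) { return v.empty(); });
        if (all_empty) return false;
        data_.resize(new_size);
        return true;
    }

    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lock(m_);
        return data_.size();
    }

private:
    mutable std::shared_mutex m_;
    std::vector<std::vector<int>> data_;
};

// Usage sketch (single-threaded, only to show the call pattern).
inline void buckets_demo() {
    Buckets b(2);
    assert(!b.resize_if_not_all_empty(8));  // everything empty: no resize
    b.push(0, 42);
    assert(b.resize_if_not_all_empty(8));
    assert(b.size() == 8);
}
```

This gives up the unlocked first check from the question. Whether that costs anything measurable is exactly what the answers above suggest measuring first.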

Is mutex mandatory to access extern variable from a different thread?

I am developing an application in Qt/C++. At some point, there are two threads : one is the UI thread and the other one is the background thread. I have to do some operation from the background thread based on the value of an extern variable which is type of bool. I am setting this value by clicking a button on UI.
header.cpp
extern bool globalVar;
mainWindow.cpp
//main ui thread on button click
void setValue(bool val){
globalVar = val;
}
backgroundThread.cpp
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
Here, writing to globalVar happens only when the user clicks the button whereas reading happens continuously.
So my question is :
In a situation like the one above, is mutex mandatory?
If read and write happens at the same time, does this cause the application to crash?
If read and write happens at same time, is globalVar going to have some value other than true or false?
Finally, does the OS provide any kind of locking mechanism to prevent the read/write operation to access a memory location at the same time by a different thread?
The loop
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
is busy waiting, which is extremely wasteful. Thus, you're probably better off with some classic synchronization that will wake the background thread (mostly) when there is something to be done. You should consider adapting this example of std::condition_variable.
Say you start with:
#include <thread>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
bool ready = false;
Your worker thread can then be something like this:
void worker_thread()
{
while(true)
{
// Wait until main() sends data
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return ready;});
ready = false;
lk.unlock();
// do the actual work here, outside the lock
}
}
The notifying thread should do something like this:
{
std::lock_guard<std::mutex> lk(m);
ready = true;
}
cv.notify_one();
Since it is just a single plain bool, I'd say a mutex is overkill; you should just go for an atomic integer (or std::atomic<bool>) instead. Reads and writes of an atomic are indivisible, so no worries there, and on mainstream platforms it will be lock-free, which is generally preferable when possible.
If it is something more complex, then by all means go for a mutex.
It won't crash from that alone, but you can get data corruption, which may crash the application.
The system will not manage that stuff for you, you do it manually, just make sure all access to the data goes through the mutex.
Edit:
Since you specify a number of times that you don't want a complex solution, you may opt for simply using a mutex instead of the bool. There is no need to protect the bool with a mutex, since you can use the mutex as a bool, and yes, you could go with an atomic, but that's what the mutex already does (plus some extra functionality in the case of recursive mutexes).
It also matters what is your exact workload, since your example doesn't make a lot of sense in practice. It would be helpful to know what those some operations are.
So in your UI thread you could simply do val ? mutex.lock() : mutex.unlock(), and in your secondary thread you could use if (mutex.tryLock()) { doStuff(); mutex.unlock(); } else { doOtherStuff(); }. Now if the operation in the secondary thread takes too long and you happen to be changing the lock in the main thread, that will block the main thread until the secondary thread unlocks. You could use tryLock(timeout) in the main thread, depending on what you prefer: lock() will block until success, while tryLock(timeout) will prevent blocking but the lock may fail. Also, take care not to unlock from a thread other than the one you locked with, and not to unlock an already unlocked mutex.
Depending on what you are actually doing, maybe an asynchronous event driven approach would be more appropriate. Do you really need that while(1)? How frequently do you perform those operations?
In a situation like the one above, is a mutex necessary?
A mutex is one tool that will work. What you actually need are three things:
a means of ensuring an atomic update (a bool will give you this as it's mandated to be an integral type by the standard)
a means of ensuring that the effects of a write made by one thread are actually visible in the other thread. This may sound counter-intuitive, but the C++ memory model is single-threaded and optimisations (software and hardware) do not need to consider cross-thread communication, and...
a means of preventing the compiler (and CPU!!) from re-ordering the reads and writes.
The answer to the implied question is 'yes'. You will need something that does all of these things (see below).
If read and write happen at the same time, does this cause the application to crash?
Not when it's a bool, but the program won't behave as you expect. In fact, because the program is now exhibiting undefined behaviour you can no longer reason about its behaviour at all.
If read and write happen at the same time, is globalVar going to have some value other than true or false?
Not in this case because it's an intrinsic (atomic) type.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing a memory location at the same time?
Not unless you specify one.
Your options are:
std::atomic<bool>
std::mutex
std::atomic_signal_fence
Realistically speaking, as long as you use an integer type (not bool), make it volatile, and keep inside of its own cache line by properly aligning its storage, you don't need to do anything special at all.
In a situation like the one above, is a mutex necessary?
Only if you want to keep the value of the variable synchronized with other state.
If read and write happen at the same time, does this cause the application to crash?
According to C++ standard, it's undefined behavior. So anything can happen: e.g. your application might not crash, but its state might be subtly corrupted. In real life, though, compilers often offer some sane implementation defined behavior and you're fine unless your platform is really weird. Anything commonplace, like 32 and 64 bit intel, PPC and ARM will be fine.
If read and write happen at the same time, is globalVar going to have some value other than true or false?
globalVar can only have these two values, so it makes no sense to speak of any other values unless you're talking about its binary representation. Yes, it could happen that the binary representation is incorrect and not what the compiler would expect. That's why you shouldn't use a bool but a uint8_t instead.
I wouldn't love to see such flag in a code review, but if a uint8_t flag is the simplest solution to whatever problem you're solving, I say go for it. The if (globalVar) test will treat zero as false, and anything else as true, so temporary "gibberish" is OK and won't have any odd effects in practice. According to the standard, you'll be facing undefined behavior, of course.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing a memory location at the same time?
It's not the OS's job to do that.
Speaking of practice, though: on any reasonable platform, the use of a std::atomic_bool will have no overhead over the use of a naked uint8_t, so just use that and be done.
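To make the std::atomic_bool suggestion concrete, here is a small sketch of the question's flag rewritten with std::atomic<bool>. The Counts struct, the bounded loop, and run_in_thread are assumptions added so the example terminates; the real background thread would loop indefinitely and do actual work in each branch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Replaces the plain `extern bool globalVar` from the question.
std::atomic<bool> globalVar{false};

// UI thread side: called from the button handler.
void setValue(bool val) { globalVar.store(val); }

// Hypothetical bookkeeping so the sketch has something to report.
struct Counts {
    int primary = 0;   // iterations that took the "some operation" branch
    int fallback = 0;  // iterations that took the "other operation" branch
};

// Background side: same branch as in the question, but bounded.
Counts background_loop(int iterations) {
    Counts c;
    for (int i = 0; i < iterations; ++i) {
        if (globalVar.load()) {
            ++c.primary;
        } else {
            ++c.fallback;
        }
    }
    assert(c.primary + c.fallback == iterations);
    return c;
}

// Runs the loop on a separate thread and waits for it to finish.
Counts run_in_thread(int iterations) {
    Counts result;
    std::thread t([&result, iterations] { result = background_loop(iterations); });
    t.join();
    return result;
}
```

The read in the loop is now well-defined even if setValue runs concurrently. The busy loop itself is still wasteful, which is what the condition_variable answer above addresses.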

Do I need a memory barrier for a change notification flag between threads?

I need a very fast (in the sense "low cost for reader", not "low latency") change notification mechanism between threads in order to update a read cache:
The situation
Thread W (Writer) updates a data structure (S) (in my case a setting in a map) only once in a while.
Thread R (Reader) maintains a cache of S and does read this very frequently. When Thread W updates S Thread R needs to be notified of the update in reasonable time (10-100ms).
Architecture is ARM, x86 and x86_64. I need to support C++03 with gcc 4.6 and higher.
Code
is something like this:
// variables shared between threads
bool updateAvailable;
SomeMutex dataMutex;
std::string myData;
// variables used only in Thread R
std::string myDataCache;
// Thread W
dataMutex.Lock();
myData = "newData";
updateAvailable = true;
dataMutex.Unlock();
// Thread R
if(updateAvailable)
{
dataMutex.Lock();
myDataCache = myData;
updateAvailable = false;
dataMutex.Unlock();
}
doSomethingWith(myDataCache);
My Question
In Thread R no locking or barriers occur in the "fast path" (no update available).
Is this an error? What are the consequences of this design?
Do I need to qualify updateAvailable as volatile?
Will R get the update eventually?
My understanding so far
Is it safe regarding data consistency?
This looks a bit like "Double Checked Locking". According to http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html a memory barrier can be used to fix it in C++.
However the major difference here is that the shared resource is never touched/read in the Reader fast path. When updating the cache, the consistency is guaranteed by the mutex.
Will R get the update?
Here is where it gets tricky. As I understand it, the CPU running Thread R could cache updateAvailable indefinitely, effectively moving the Read way way before the actual if statement.
So the update could take until the next cache flush, for example when another thread or process is scheduled.
Use C++ atomics and make updateAvailable an std::atomic<bool>. The reason for this is that it's not just the CPU that can see an old version of the variable but especially the compiler which doesn't see the side effect of another thread and thus never bothers to refetch the variable so you never see the updated value in the thread. Additionally, this way you get a guaranteed atomic read, which you don't have if you just read the value.
Other than that, you could potentially get rid of the lock, if for example the producer only ever produces data when updateAvailable is false, you can get rid of the mutex because the std::atomic<> enforces proper ordering of the reads and writes. If that's not the case, you'll still need the lock.
You do have to use a memory fence here. Without the fence, there is no guarantee updates will be ever seen on the other thread. In C++03 you have the option of either using platform-specific ASM code (mfence on Intel, no idea about ARM) or use OS-provided atomic set/get functions.
Do I need to qualify updateAvailable as volatile?
Because volatile has no relation to the threading model in C++, you should use atomics to make your program strictly standard-conformant:
In C++11 or newer, the preferable way is to use atomic<bool> with memory_order_relaxed store/load:
atomic<bool> updateAvailable;
//Writer
....
updateAvailable.store(true, std::memory_order_relaxed); //set (under mutex locked)
// Reader
if(updateAvailable.load(std::memory_order_relaxed)) // check
{
...
updateAvailable.store(false, std::memory_order_relaxed); // clear (under mutex locked)
....
}
gcc since 4.7 supports similar functionality with its atomic builtins.
As for gcc 4.6, there seems to be no strictly conformant way to avoid fences when accessing the updateAvailable variable. In practice, a memory fence costs far less than the 10-100 ms latency you can tolerate, so you can use gcc's __sync builtins:
int updateAvailable = 0;
//Writer
...
__sync_fetch_and_or(&updateAvailable, 1); // set to non-zero
....
//Reader
if(__sync_fetch_and_and(&updateAvailable, 1)) // check, but never change
{
...
__sync_fetch_and_and(&updateAvailable, 0); // clear
...
}
Is it safe regarding data consistency?
Yes, it is safe. Your reason is absolutely correct here:
the shared resource is never touched/read in the Reader fast path.
This is NOT double-check locking!
It is explicitly stated in the question itself.
When updateAvailable is false, the Reader thread uses the variable myDataCache, which is local to that thread (no other threads use it). With a double-checked locking scheme, all threads use the shared object directly.
Why memory fences/barriers are NOT NEEDED here
The only variable accessed concurrently is updateAvailable. The myData variable is accessed under mutex protection, which provides all the needed fences. myDataCache is local to the Reader thread.
When the Reader thread sees updateAvailable as false, it uses the myDataCache variable, which is only changed by that thread itself. Program order guarantees correct visibility of changes in that case.
As for visibility guarantees for the updateAvailable variable, the C++11 standard provides such guarantees for atomic variables even without fences. 29.3 p13 says:
Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.
Jonathan Wakely has confirmed in chat that this paragraph applies even to memory_order_relaxed accesses.
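Putting the pieces of this answer together, the design could be sketched roughly as follows. The class name CachedSetting and its method names are hypothetical, std::mutex stands in for the question's SomeMutex, and the sketch assumes C++11, so it does not apply to the gcc 4.6 / C++03 case.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <utility>

class CachedSetting {
public:
    // Thread W: update the shared value and raise the flag, both under
    // the mutex.
    void write(std::string value) {
        std::lock_guard<std::mutex> lock(dataMutex_);
        myData_ = std::move(value);
        updateAvailable_.store(true, std::memory_order_relaxed);
    }

    // Thread R only: the fast path is a single relaxed load with no
    // lock. The mutex orders the accesses to myData_ on the slow path.
    const std::string& read() {
        if (updateAvailable_.load(std::memory_order_relaxed)) {
            std::lock_guard<std::mutex> lock(dataMutex_);
            myDataCache_ = myData_;
            updateAvailable_.store(false, std::memory_order_relaxed);
        }
        return myDataCache_;
    }

private:
    std::mutex dataMutex_;
    std::string myData_;                        // shared, mutex-protected
    std::atomic<bool> updateAvailable_{false};  // shared, atomic
    std::string myDataCache_;                   // reader thread only
};

// Usage sketch (single-threaded, only to show the call pattern).
inline void cached_setting_demo() {
    CachedSetting s;
    s.write("newData");
    assert(s.read() == "newData");  // slow path: copies into the cache
    assert(s.read() == "newData");  // fast path: cache reused, no lock
}
```

The reference returned by read() points at the reader's private cache, so it is only meant to be used on the reader thread.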

Avoiding data race of boolean variables with pthreads

in my code I have the following structure:
Parent thread
somedatatype thread1_continue, thread2_continue; // Does bool guarantee no data race?
Thread 1:
while (thread1_continue) {
// Do some work
}
Thread 2:
while (thread2_continue) {
// Do some work
}
So I wonder which data type should be thread1_continue or thread2_continue to avoid data race. And also if there is any data type or technique in pthread to solve this problem.
There is no built-in basic type that guarantees thread safety, no matter how small. Even if you are working with bool or unsigned char, neither reading nor writing is guaranteed to be atomic. In other words, if several threads work independently on the same memory, one thread may have overwritten it only partially when another thread reads it and gets a garbage value; in that case the behavior is undefined.
You could use a mutex to wrap the critical section with lock and unlock calls to ensure mutual exclusion: only one thread at a time will be able to execute that code. For more sophisticated synchronization there are semaphores, condition variables, or even patterns / idioms describing how the synchronization can be handled using these (light switch, turnstile, etc.). Just study more about these, some simple examples can be found here :)
Note that there might be some more complex types / wrappers available that wrap the way the object is being accessed, such as the std::atomic template in C++11, which handles the synchronization internally so that you don't need to do it explicitly. With std::atomic there is a guarantee that: "if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined".
For booleans (and others), be sure to avoid
thread 1 loop
{
do actions1;
myFlag = true;
do more1;
}
thread 2 loop
{
do actions2;
if (myFlag)
{
myFlag = false;
do flagged actions;
}
do more2;
}
This nearly always works, until myFlag is set by thread 1 while thread 2 is in between checking and resetting myFlag. There are CPU-dependent primitives to handle test-and-set, but the normal solution is to lock when accessing shared resources, even booleans.
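As an alternative to locking, the test-and-set idea mentioned above can be expressed portably in C++11 with std::atomic<bool>::exchange. This is a sketch that swaps in C++11 atomics instead of pthread primitives; the function names raise_flag and consume_flag are made up for illustration.

```cpp
#include <atomic>
#include <cassert>

std::atomic<bool> myFlag{false};

// Thread 1: raise the flag.
void raise_flag() { myFlag.store(true); }

// Thread 2: test and clear in one indivisible step, so a raise can no
// longer slip in between the check and the reset.
bool consume_flag() { return myFlag.exchange(false); }

// Usage sketch (single-threaded, only to show the call pattern).
inline void flag_demo() {
    raise_flag();
    assert(consume_flag());   // the flag was set and is now cleared
    assert(!consume_flag());  // nothing new has been raised since
}
```

If thread 1 raises the flag twice before thread 2 looks, the two raises still collapse into one, so this fits "something changed" notifications rather than counting events.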

Is it ok to read a shared boolean flag without locking it when another thread may set it (at most once)?

I would like my thread to shut down more gracefully, so I am trying to implement a simple signalling mechanism. I don't think I want a fully event-driven thread, so I have a worker with a method to gracefully stop it using a critical section Monitor (equivalent to a C# lock, I believe):
DrawingThread.h
class DrawingThread {
bool stopRequested;
Runtime::Monitor CSMonitor;
CPInfo *pPInfo;
//More..
};
DrawingThread.cpp
void DrawingThread::Run() {
if (!stopRequested) {
//Time consuming call#1
}
if (!stopRequested) {
CSMonitor.Enter();
pPInfo = new CPInfo(/**/);
//Not time consuming but pPInfo must either be null or constructed.
CSMonitor.Exit();
}
if (!stopRequested) {
pPInfo->foobar(/**/);//Time consuming and can be signalled
}
if (!stopRequested) {
//One more optional but time consuming call.
}
}
void DrawingThread::RequestStop() {
CSMonitor.Enter();
stopRequested = true;
if (pPInfo) pPInfo->RequestStop();
CSMonitor.Exit();
}
I understand (at least in Windows) Monitor/locks are the least expensive thread synchronization primitive but I am keen to avoid overuse. Should I be wrapping each read of this boolean flag? It is initialized to false and only set once to true when stop is requested (if it is requested before the task completes).
My tutors advised protecting even bools because reading and writing may not be atomic. I think this one-shot flag is the exception that proves the rule?
It is never OK to read something possibly modified in a different thread without synchronization. What level of synchronization is needed depends on what you are actually reading. For primitive types, you should have a look at atomic reads, e.g. in the form of std::atomic<bool>.
The reason synchronization is always needed is that the processors will have the data possibly shared in a cache line. It has no reason to update this value to a value possibly changed in a different thread if there is no synchronization. Worse, yet, if there is no synchronization it may write the wrong value if something stored close to the value is changed and synchronized.
Boolean assignment is atomic. That's not the problem.
The problem is that a thread may not see changes to a variable made by a different thread, due to compiler or CPU instruction reordering or data caching (i.e. the thread that reads the boolean flag may read a cached value instead of the actual updated value).
The solution is a memory fence, which indeed is implicitly added by lock statements, but for a single variable it's overkill. Just declare it as std::atomic<bool>.
The answer, I believe, is "it depends." If you're using C++03, threading isn't defined in the Standard, and you'll have to read what your compiler and your thread library say, although this kind of thing is usually called a "benign race" and is usually OK.
If you're using C++11, benign races are undefined behavior. Even when undefined behavior doesn't make sense for the underlying data type. The problem is that compilers can assume that programs have no undefined behavior, and make optimizations based on that (see also the Part 1 and Part 2 linked from there). For instance, your compiler could decide to read the flag once and cache the value because it's undefined behavior to write to the variable in another thread without some kind of mutex or memory barrier.
Of course, it may well be that your compiler promises to not make that optimization. You'll need to look.
The easiest solution is to use std::atomic<bool> in C++11, or something like Hans Boehm's atomic_ops elsewhere.
No, you have to protect every access, since modern compilers and CPUs reorder code without your multithreading in mind. Unprotected reads from different threads might work, but they are not guaranteed to.
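Several answers above converge on std::atomic<bool> for the stop flag. A stripped-down sketch of that idea follows; the Worker class and the step-counting Run are stand-ins for DrawingThread and its time-consuming calls, and the pPInfo handling from the question, which would still need the Monitor, is left out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class Worker {
public:
    // May be called from any thread; no lock is needed for the flag.
    void RequestStop() { stopRequested_.store(true); }

    // Performs up to `steps` units of placeholder work, checking the
    // flag before each one. Returns how many units were completed.
    int Run(int steps) {
        int done = 0;
        while (done < steps && !stopRequested_.load()) {
            ++done;  // stands in for one time-consuming call
        }
        return done;
    }

private:
    std::atomic<bool> stopRequested_{false};
};

// Usage sketch: the stop request may arrive before, during, or after
// the work, so only the bounds of the result are known.
inline void worker_demo() {
    Worker w;
    int done = -1;
    std::thread t([&w, &done] { done = w.Run(1000); });
    w.RequestStop();
    t.join();
    assert(done >= 0 && done <= 1000);
}
```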