Is there a data race in this code? - c++

On this page, this sample code is given to explain how to use notify_one:
#include <iostream>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <chrono>

std::condition_variable cv;
std::mutex cv_m;
int i = 0;
bool done = false;

void waits()
{
    std::unique_lock<std::mutex> lk(cv_m);
    std::cout << "Waiting... \n";
    cv.wait(lk, []{ return i == 1; });
    std::cout << "...finished waiting. i == 1\n";
    done = true;
}

void signals()
{
    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::cout << "Notifying...\n";
    cv.notify_one();
    std::unique_lock<std::mutex> lk(cv_m);
    i = 1;
    while (!done) {
        lk.unlock();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        lk.lock();
        std::cerr << "Notifying again...\n";
        cv.notify_one();
    }
}

int main()
{
    std::thread t1(waits), t2(signals);
    t1.join(); t2.join();
}
However, valgrind (helgrind, actually) complains that:
Probably a race condition: condition variable 0x605420 has been
signaled but the associated mutex 0x605460 is not locked by the
signalling thread.
If the second thread runs before the first one and reaches cv.notify_one(); before anyone else, it will signal other threads without any lock being held.
I am actually learning how to use these condition variables and trying to understand who should lock/unlock the mutex associated with them. So my question is: is this code doing things right, or is Helgrind wrong?

[Truth in advertising: I was, until very recently, the architect of a commercial data-race and memory error detector that "competes" with Helgrind/Valgrind.]
There is no data race in your code. Helgrind is issuing this warning because of a subtlety about the way condition variables work. There is some discussion about it in the "hints" section of the Helgrind manual. Briefly: Helgrind is doing happens-before data race detection. It derives the "happens-before" relation by observing the order in which your code is calling pthread_mutex_lock/unlock and pthread_cond_wait/signal (these are the C primitives upon which the C++11 primitives are implemented.)
If you follow the discipline that your cv.notify_one() calls are always protected by the same mutex that surrounds the corresponding cv.wait() calls then Helgrind knows that the mutexes will enforce the correct happens-before relationship and so everything will be okay.
In your case Helgrind is complaining about the initial (gratuitous) cv.notify_one() call at the top of signals(), before you acquire the lock on cv_m. It knows that this is the kind of situation that can confuse it (although the real confusion is that it might later report false positives, so the warning message here is a bit misleading.)
Please note the advice to "use semaphores instead of condition_variables" in the hints section of the Helgrind manual is horrible advice. Semaphores are much harder to check for correctness than condition variables both for tools and for humans. Semaphores are "too general" in the sense that there are all sorts of invariants you can not rely on. The same thread that "locks" with a semaphore doesn't have to be the thread that "unlocks". Two threads that "wait" on a non-binary semaphore may or may not have a happens-before relationship. So semaphores are pretty much useless if you are trying to reason (or automatically detect) deadlock or data race conditions.
Better advice is to use condition variables to signal/wait, but to make sure that you follow a discipline where all calls on a particular condition variable happen within critical sections protected by the same mutex.
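A minimal sketch of that discipline (the names producer, consumer and ready are illustrative, not from the question): every wait and every notify_one on the condition variable sits inside a critical section guarded by the same mutex, so a happens-before checker has everything it needs:

#include <condition_variable>
#include <mutex>
#include <thread>

std::condition_variable cv;
std::mutex m;
bool ready = false;

void consumer()
{
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{ return ready; }); // predicate checked under m
}

void producer()
{
    std::lock_guard<std::mutex> lk(m);
    ready = true;    // shared state modified under m
    cv.notify_one(); // notified while m is still held
}

int main()
{
    std::thread t1(consumer), t2(producer);
    t1.join();
    t2.join();
}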

In your case, there is no problem.
In general, issues can arise if there is a third thread that may be using the same mutex, or if it's possible that the waiting thread may destroy the condition variable when it finishes running.
It's just safer to hold the mutex while signalling, to ensure that the signalling code runs entirely before the woken-up code.

Edit: Turns out this isn't true, but it's being preserved here because #dauphic's explanation of why it isn't true in the comments is helpful.
You have reversed the order of unlock and lock in your while loop, such that you are attempting to unlock the mutex before you have ever locked it, which would seem consistent with the valgrind message you are seeing.

Related

Does it make sense to use different mutexes with the same condition variable?

The documentation of the notify_one() function of std::condition_variable at cppreference.com states the following:
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.
The first part of the sentence is strange: if I hold different mutexes in the notifying and notified threads, then the mutexes have no real meaning, as there is no 'blocking' operation here. In fact, if different mutexes are held, it becomes possible for a spurious wakeup to cause the notification to be missed! I get the impression that we might as well not lock on the notifying thread in such a case. Can someone clarify this?
Consider the following from cppreference page on condition variables as an example.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

std::mutex m; // this is supposed to be a pessimization
std::condition_variable cv;
std::string data;
bool ready = false;
bool processed = false;

void worker_thread()
{
    // Wait until main() sends data
    std::unique_lock<std::mutex> lk(m); // a different, local std::mutex is supposedly better
    cv.wait(lk, []{ return ready; });
    // after the wait, we own the lock.
    std::cout << "Worker thread is processing data\n";
    data += " after processing";
    // Send data back to main()
    processed = true;
    std::cout << "Worker thread signals data processing completed\n";
    lk.unlock();
    cv.notify_one();
}

int main()
{
    std::thread worker(worker_thread);
    data = "Example data";
    // send data to the worker thread
    {
        std::lock_guard<std::mutex> lk(m); // a different, local std::mutex is supposedly better
        ready = true;
        std::cout << "main() signals data ready for processing\n";
    }
    cv.notify_one();
    // wait for the worker
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, []{ return processed; });
    }
    std::cout << "Back in main(), data = " << data << '\n';
    worker.join();
}
PS. I saw a few questions that are similarly titled but they refer to different aspects of the problem.
I think the wording of cppreference is somewhat awkward here. I think they were just trying to differentiate the mutex used in conjunction with the condition variable from other unrelated mutexes.
It makes no sense to use a condition variable with different mutexes. The mutex is used to make any change to the actual semantic condition (in the example it is just the variable ready) atomic and it must therefore be held whenever the condition is updated or checked. Also it is needed to ensure that a waiting thread that is unblocked can immediately check the condition without again running into race conditions.
I understand it as follows:
It is OK not to hold the lock on the mutex associated with the condition variable when notify_one is called, or indeed any mutex at all; however, it is OK to hold other mutexes for different reasons.
The pessimization is not that only one mutex is used, but that this mutex is held for longer than necessary when you know that another thread will immediately try to acquire it after being notified.
I think that my interpretation agrees with the explanation given in cppreference on condition variable:
The thread that intends to modify the shared variable has to
acquire a std::mutex (typically via std::lock_guard)
perform the modification while the lock is held
execute notify_one or notify_all on the std::condition_variable (the lock does not need to be held for notification)
Even if the shared variable is atomic, it must be modified under the mutex in order to correctly publish the modification to the waiting thread.
Any thread that intends to wait on std::condition_variable has to
acquire a std::unique_lock<std::mutex>, on the same mutex as used to protect the shared variable
Furthermore the standard expressly forbids using different mutexes for wait, wait_for, or wait_until:
lock.mutex() returns the same value for each of the lock arguments supplied by all concurrently waiting (via wait, wait_for, or wait_until) threads.
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s);
That's misleading. The problem is the word, "same." They should have said, "...does not need to hold the lock on any mutex..." That's the real point. There's an important reason why the waiting thread should have a mutex locked when it enters the wait() call: It's awaiting some change in some shared data structure, and it needs the mutex to be locked when it accesses the structure to check whether or not the awaited change actually has happened.
The notify()ing thread probably needs to lock the same mutex in order to effect that change, but the correctness of the program won't depend on whether it calls notify() before or after it releases the mutex.
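To make the two orderings concrete, here is a small sketch reusing the m, cv and ready names from the example above; both notifiers are correct, the second merely avoids the pessimization of waking a thread that must immediately block on the still-held mutex:

void notify_holding_lock()
{
    std::lock_guard<std::mutex> lk(m);
    ready = true;
    cv.notify_one(); // correct, but the woken thread may block on m
                     // again until lk's destructor releases it
}

void notify_after_unlock()
{
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true; // the modification itself must stay under the mutex
    }
    cv.notify_one(); // equally correct, no extra block/wake cycle
}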

Is it always necessary for a notifying thread to lock the shared data during modification?

When using a condition variable, http://en.cppreference.com/w/cpp/thread/condition_variable describes the typical steps for the thread that notifies as:
acquire a std::mutex (typically via std::lock_guard)
perform the modification while the lock is held
execute notify_one or notify_all on the std::condition_variable (the lock does not need to be held for notification)
For the simple case shown below, is it necessary for the main thread to lock "stop" when it modifies it? While I understand that locking when modifying shared data is almost always a good idea, I'm not sure why it would be necessary in this case.
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;

std::condition_variable cv;
std::mutex mutex;
bool stop = false;

void worker() {
    std::unique_lock<std::mutex> lock(mutex);
    cv.wait(lock, [] { return stop; });
}

// No lock (Why would this not work?)
int main() {
    std::thread mythread(worker);
    std::this_thread::sleep_for(1s);
    stop = true;
    cv.notify_one();
    mythread.join();
}

// With lock: why is this necessary?
int main() {
    std::thread mythread(worker);
    std::this_thread::sleep_for(1s);
    {
        std::unique_lock<std::mutex> lock(mutex);
        stop = true;
    }
    cv.notify_one();
    mythread.join();
}
For the simple case shown below, is it necessary for the main thread to lock "stop" when it modifies it?
Yes, to prevent a race condition. Aside from issues of accessing shared data from different threads, which could be fixed by std::atomic, imagine this order of events:
worker_thread: checks value of `stop` it is false
main_thread: sets `stop` to true and sends signal
worker_thread: sleeps on condition variable
In this situation the worker thread could sleep forever: it misses the event of stop being set to true simply because the wakeup signal was sent before it went to sleep.
Acquiring the mutex when modifying the shared data is required because only then can the worker thread treat checking the condition and going to sleep (or acting on it) as one atomic operation.
Yes, it is necessary.
First of all, from the standard's point of view, your code exhibits undefined behavior, as stop is not atomic and isn't changed under a lock while other threads may read or write to it. Both reading and writing must occur under a lock if multiple threads read and write to a shared memory location.
From a hardware perspective, if stop isn't changed under a lock, other threads, even though they lock some lock, aren't guaranteed to "see" the change that was made. Locks don't just prevent other threads from intervening; they also force the latest change to be read from or written to main memory, and hence all threads work with the latest values.
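Both answers point at the same trap: making stop a std::atomic<bool> would remove the undefined behaviour, but not the lost wakeup. A sketch of why (the comments mark the fatal window):

#include <atomic>
#include <condition_variable>
#include <mutex>

std::condition_variable cv;
std::mutex mutex;
std::atomic<bool> stop{false}; // atomic: no data race on `stop` anymore...

void worker()
{
    std::unique_lock<std::mutex> lock(mutex);
    // ...but if the notifier never takes `mutex`, it can set `stop` and
    // call notify_one() in the window between this predicate evaluating
    // to false and the thread actually blocking inside wait(). The
    // signal then arrives while nobody is waiting, and the worker
    // sleeps forever. Taking the mutex around the store closes the window.
    cv.wait(lock, [] { return stop.load(); });
}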

Why does this double mutex lock not cause deadlock?

I am testing the C++11 mutex on my CentOS machine. I tried to lock this mutex twice to produce a deadlock, but after running it, everything is fine and no deadlock occurs.
#include <thread>
#include <mutex>
#include <iostream>

std::mutex m;

int main()
{
    m.lock();
    m.lock();
    std::cout << "i am ok" << std::endl;
    return 0;
}
The compiler is g++ 4.8.5 on CentOS 3.10.0-327.36.3.el7.x86_64:
[zzhao010#localhost shareLibPlay]$ ./3.out
i am ok
Locking a std::mutex that is already locked by the same thread is undefined behavior and therefore it may work, it may fail, it may drink all your beer and throw up on the couch. No guarantees.
The behavior is undefined when you invoke lock twice as you did.
That it works as you would expect is indeed a valid form of undefined behavior.
See here for further details.
For a deadlock, you need at least two
By definition, a deadlock involves at least 2 parties. This was laid down by many authors, among others Hoare in his pioneering work Communicating Sequential Processes. This is also reminded in the C++ standard definitions (emphasis is mine):
17.3.8: Deadlock: one or more threads are unable to continue execution because each is blocked waiting for one or more of the others to satisfy some condition
A more illustrative definition is given by Anthony Williams in C++ Concurrency in Action:
Neither thread can proceed, because each is waiting for the other to release its mutex. This scenario is called deadlock and it's the biggest problem with having to lock two or more mutexes.
You therefore cannot, by definition, create a deadlock with a single thread in a single process.
Don't misunderstand the standard
The standard says on mutexes:
30.4.1.2.1/4 [Note: A program may deadlock if the thread that owns a mutex object calls lock() on that object.]
This is a non-normative note. I think it embarrassingly contradicts the standard's own definition: from a terminology point of view, a thread that locks a mutex it already owns is simply in a blocked state.
But more importantly, and beyond the issue of deadlock terminology, the word "may" allows the said behavior for C++ implementations (e.g. if a particular OS cannot detect a redundant lock acquisition), but it is not required at all: I believe that most mainstream C++ implementations will work fine, exactly as you have experienced yourself.
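As an aside, if re-locking from the owning thread is behaviour you actually want, the standard provides std::recursive_mutex, which makes the double lock well defined; a minimal sketch:

#include <iostream>
#include <mutex>

std::recursive_mutex rm;

int main()
{
    rm.lock();
    rm.lock(); // well defined: the owning thread may re-acquire
    std::cout << "i am ok" << std::endl;
    rm.unlock();
    rm.unlock(); // one unlock per lock
    return 0;
}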
Want to experiment with deadlocks?
If you want to experiment with real deadlocks, or if you simply want to find out whether your C++ implementation is able to detect the resource_deadlock_would_occur error, here is a short example. It could go fine, but it has a high probability of creating a deadlock:
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m1, m2;

void foo() {
    m1.lock();
    std::cout << "foo locked m1" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
    m2.lock();
    m1.unlock();
    std::cout << "foo locked m2 and releases m1" << std::endl;
    m2.unlock();
    std::cout << "foo is ok" << std::endl;
}

void bar() {
    m2.lock();
    std::cout << "bar locked m2" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
    m1.lock();
    m2.unlock();
    std::cout << "bar locked m1 and releases m2" << std::endl;
    m1.unlock();
    std::cout << "bar is ok" << std::endl;
}

int main()
{
    std::thread t1(foo);
    bar();
    t1.join();
    std::cout << "Everything went fine" << std::endl;
    return 0;
}
This kind of deadlock is avoided by always locking the mutexes in the same order.
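The standard can also enforce that ordering for you: std::lock acquires several mutexes with a deadlock-avoidance algorithm (std::scoped_lock wraps the same idea in C++17). A sketch of foo rewritten that way, reusing m1 and m2 from the example above (foo_safe is an illustrative name):

void foo_safe()
{
    std::unique_lock<std::mutex> lk1(m1, std::defer_lock);
    std::unique_lock<std::mutex> lk2(m2, std::defer_lock);
    std::lock(lk1, lk2); // takes m1 and m2 without risking deadlock
    std::cout << "foo_safe holds m1 and m2" << std::endl;
} // both released by the destructors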

Deadlocks related to scheduling

On the Oracle docs for multithreading they have this paragraph about deadlocks when trying to acquire a lock:
Because there is no guaranteed order in which locks are acquired, a problem in threaded programs is that a particular thread never acquires a lock, even though it seems that it should.
This usually happens when the thread that holds the lock releases it, lets a small amount of time pass, and then reacquires it. Because the lock was released, it might seem that the other thread should acquire the lock. But, because nothing blocks the thread holding the lock, it continues to run from the time it releases the lock until it reacquires the lock, and so no other thread is run.
Just to make sure I understood this, I tried to write this out in code (let me know if this is a correct interpretation):
#include <mutex>
#include <chrono>
#include <thread>
#include <iostream>

std::mutex m;

void f()
{
    std::unique_lock<std::mutex> lock(m); // acquire the lock
    std::cout << std::this_thread::get_id() << " now has the mutex\n";
    lock.unlock(); // release the lock
    std::this_thread::sleep_for(std::chrono::seconds(2)); // sleep for a while
    lock.lock(); // reacquire the lock
    std::cout << std::this_thread::get_id() << " has the mutex after sleep\n";
}

int main()
{
    std::thread(f).join(); // thread A
    std::thread(f).join(); // thread B
}
So what the quote above is saying is that the time during which the lock is released and the thread is sleeping (as in the code above) is not sufficient to guarantee that a thread waiting on the lock acquires it? How does this relate to deadlocks?
The document is addressing a specific kind of fairness problem and is lumping it into its discussions about deadlock. The document correctly defines deadlock to mean a "permanent blocking of a set of threads". It then describes a slightly less obvious way of achieving the permanent blocking condition.
In the scenario described, assume two threads attempt to acquire the same lock simultaneously. Only one will win, so call it W, and the other loses, so call it L. The loser is put to sleep to wait its turn to get the lock.
The quoted text says that L may never get a chance to get the lock, even if it is released by W.
The reason this might happen is the lock does not impose fairness over which thread has acquired it. The lock is more than happy to let W acquire and release it forever. If W is implemented in such a way that it does not need to context switch after it releases the lock, it may just end up acquiring the lock again before L has a chance to wake up to see if the lock is available.
So, in the code below, if W_thread wins the initial race against L_thread, L_thread is effectively deadlocked, even though in theory it could acquire the lock between iterations of W_thread.
void W_thread () {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        //...
    }
}

void L_thread () {
    std::unique_lock<std::mutex> lock(m);
    //...
}
The document recommends using thr_yield() to force a context switch to another thread. This API is specific to Solaris/SunOS. The POSIX version of it is called sched_yield(), although some UNIX versions (including Linux) provide a wrapper called pthread_yield().
In C++11, this is accomplished via std::this_thread::yield().
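Translated to the snippet above (reusing m), a sketch of the fix: have the winner yield after releasing the mutex, giving the loser a chance to be scheduled. Note that yield is only a hint to the scheduler, not a guarantee:

void W_thread_fair () {
    for (;;) {
        {
            std::unique_lock<std::mutex> lock(m);
            //...
        }                          // mutex released here
        std::this_thread::yield(); // offer the CPU so L_thread can wake
                                   // up and grab the mutex
    }
}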

Need to mutex-protect (atomic) assignment sought by condition variable?

I understand how to use condition variables (crummy name for this construct, IMO, as the cv object neither is a variable nor indicates a condition). So I have a pair of threads, canonically set up with Boost.Thread as:
#include <boost/thread.hpp>

bool awake = false;
boost::mutex sync;
boost::condition_variable cv;

void thread1()
{
    boost::unique_lock<boost::mutex> lock1(sync);
    while (!awake)
        cv.wait(lock1);
    lock1.unlock(); // this line actually not canonical, but why not?
    // proceed...
}

void thread2()
{
    //...
    boost::unique_lock<boost::mutex> lock2(sync); // must actually lock `sync`
    awake = true;
    lock2.unlock();
    cv.notify_all();
}
My question is: does thread2 really need to be protecting the assignment to awake? It seems to me the notify_all() call should be sufficient. If the data being manipulated and checked against were more than a simple "ok to proceed" flag, I see the value in the mutex, but here it seems like overkill.
A secondary question is that asked in the code fragment: Why doesn't the Boost documentation show the lock in thread1 being unlocked before the "process data" step?
EDIT: Maybe my question is really: Is there a cleaner construct than a CV to implement this kind of wait?
does thread2 really need to be protecting the assignment to awake?
Yes. Modifying an object from one thread and accessing it from another without synchronisation gives undefined behaviour. Even if it's just a bool.
For example, on some multiprocessor systems the write might only affect local memory; without an explicit synchronisation operation, other threads might never see the change.
Why doesn't the Boost documentation show the lock in thread1 being unlocked before the "process data" step?
If you unlocked the mutex before clearing the flag, then you might miss another signal.
Is there a cleaner construct than a CV to implement this kind of wait?
In Boost and the standard C++ library, no; a condition variable is flexible enough to handle arbitrary shared state and not particularly over-complicated for this simple case, so there's no particular need for anything simpler.
More generally, you could use a semaphore or a pipe to send a simple signal between threads.
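For instance, with C++20 (well after this Boost-era question) the one-shot "ok to proceed" signal can be written with std::binary_semaphore; a sketch:

#include <semaphore>
#include <thread>

std::binary_semaphore awake{0}; // starts unavailable

void thread1()
{
    awake.acquire(); // blocks until thread2 calls release()
    // proceed...
}

void thread2()
{
    //...
    awake.release(); // wakes thread1; no mutex or flag needed
}

int main()
{
    std::thread t1(thread1), t2(thread2);
    t1.join();
    t2.join();
}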
Formally, you definitely need the lock in both threads: if any thread modifies an object, and more than one thread accesses it, then all accesses must be synchronized.

In practice, you'll probably get away with it without the lock; it's almost certain that notify_all will issue the necessary fence or membar instructions to ensure that the memory is properly synchronized. But why take the risk?

As to the absence of the unlock, that's the whole point of the scoped locking pattern: the unlock is in the destructor of the object, so that the mutex will be unlocked even if an exception passes through.