Why does this double mutex lock not cause deadlock?

I am testing the C++11 std::mutex on my CentOS machine. I try to lock the mutex twice to create a deadlock, but after running the program everything is fine and no deadlock occurs.
#include <thread>
#include <mutex>
#include <iostream>

std::mutex m;

int main()
{
    m.lock();
    m.lock();
    std::cout << "i am ok" << std::endl;
    return 0;
}
The compiler is g++ 4.8.5 on CentOS (kernel 3.10.0-327.36.3.el7.x86_64):
[zzhao010@localhost shareLibPlay]$ ./3.out
i am ok

Locking a std::mutex that is already locked by the same thread is undefined behavior and therefore it may work, it may fail, it may drink all your beer and throw up on the couch. No guarantees.

The behavior is undefined when you invoke lock twice from the same thread, as you did.
That it works as you would expect is a perfectly valid manifestation of undefined behavior.
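If you are curious what your particular implementation does with the second lock(), here is a minimal probe (my own sketch, not something to rely on: since the call is undefined, returning normally, throwing, and blocking forever are all conforming outcomes):

#include <iostream>
#include <mutex>
#include <system_error>

std::mutex m;

int main()
{
    m.lock();
    try {
        m.lock(); // undefined behavior: may return, throw, or block forever
        std::cout << "second lock() returned normally\n";
    } catch (const std::system_error& e) {
        // Some implementations detect the self-lock and report
        // std::errc::resource_deadlock_would_occur here.
        std::cout << "second lock() threw: " << e.what() << '\n';
    }
    return 0;
}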

For a deadlock, you need at least two
By definition, a deadlock involves at least 2 parties. This was laid down by many authors, among them Hoare in his pioneering work Communicating Sequential Processes. It is also reflected in the C++ standard's definitions (emphasis is mine):
17.3.8: Deadlock: one or more threads are unable to continue execution because each is blocked waiting for one or more of the others to satisfy some condition
A more illustrative definition is given by Anthony Williams in C++ Concurrency in Action:
Neither thread can proceed, because each is waiting for the other to release its mutex. This scenario is called deadlock and it's the biggest problem with having to lock two or more mutexes.
You can therefore by definition not create a deadlock with a single thread in a single process.
Don't misunderstand the standard
The standard says on mutexes:
30.4.1.2.1/4 [Note: A program may deadlock if the thread that owns a mutex object calls lock() on that object.]
This is a non-normative note. I think it embarrassingly contradicts the standard's own definition. From the terminology point of view, a process that locks itself is in a blocked state.
But more importantly, and beyond the issue of deadlock terminology, the word "MAY" allows the said behavior for C++ implementations (e.g. if an implementation on a particular OS is not able to detect a redundant lock acquisition). But it's not required at all: I believe that most mainstream C++ implementations will work fine, exactly as you have experienced yourself.
Want to experiment with deadlocks?
If you want to experiment with real deadlocks, or if you simply want to find out whether your C++ implementation is able to detect the resource_deadlock_would_occur error, here is a short example. It could go fine, but it has a high probability of creating a deadlock:
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m1, m2;

void foo() {
    m1.lock();
    std::cout << "foo locked m1" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
    m2.lock();
    m1.unlock();
    std::cout << "foo locked m2 and releases m1" << std::endl;
    m2.unlock();
    std::cout << "foo is ok" << std::endl;
}

void bar() {
    m2.lock();
    std::cout << "bar locked m2" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
    m1.lock();
    m2.unlock();
    std::cout << "bar locked m1 and releases m2" << std::endl;
    m1.unlock();
    std::cout << "bar is ok" << std::endl;
}

int main()
{
    std::thread t1(foo);
    bar();
    t1.join();
    std::cout << "Everything went fine" << std::endl;
    return 0;
}
This kind of deadlock is avoided by always locking the different mutexes in the same order.
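A sketch of that discipline made automatic (not part of the original example): std::lock acquires several mutexes with a built-in deadlock-avoidance algorithm, so foo and bar no longer depend on a manual ordering convention:

#include <mutex>

std::mutex m1, m2;

void safe_foo()
{
    // std::lock locks both mutexes without deadlocking, whatever order
    // other threads use; adopt_lock hands ownership to the guards so the
    // mutexes are released automatically.
    std::lock(m1, m2);
    std::lock_guard<std::mutex> g1(m1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(m2, std::adopt_lock);
    // ... critical section using both resources ...
}

(In C++17, std::scoped_lock lock(m1, m2); does the same in one line.)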

Related

Destruction of condition variable randomly loses notification

Given a condition_variable as a member of a class, my understanding is that:
The condition variable is destroyed after the class destructor completes.
Destruction of a condition variable does not need to wait for notifications to have been received.
In light of these expectations, my question is: why does the example code below randomly fail to notify a waiting thread?
#include <mutex>
#include <condition_variable>
#include <thread>

#define NOTIFY_IN_DESTRUCTOR

struct notify_on_delete {
    std::condition_variable cv;

    ~notify_on_delete() {
#ifdef NOTIFY_IN_DESTRUCTOR
        cv.notify_all();
#endif
    }
};

int main () {
    for (int trial = 0; trial < 10000; ++trial) {
        notify_on_delete* nod = new notify_on_delete();
        std::mutex flag;
        bool kill = false;

        std::thread run([nod, &flag, &kill] () {
            std::unique_lock<std::mutex> lock(flag);
            kill = true;
            nod->cv.wait(lock);
        });

        while (true) {
            std::unique_lock<std::mutex> lock(flag);
            if (!kill) continue;
#ifdef NOTIFY_IN_DESTRUCTOR
            delete nod;
#else
            nod->cv.notify_all();
#endif
            break;
        }
        run.join();
#ifndef NOTIFY_IN_DESTRUCTOR
        delete nod;
#endif
    }
    return 0;
}
In the code above, if NOTIFY_IN_DESTRUCTOR is not defined then the test will run to completion reliably. However, when NOTIFY_IN_DESTRUCTOR is defined the test will randomly hang (usually after a few thousand trials).
I am compiling using Apple Clang:
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.3.0
Thread model: posix
C++14 specified, compiled with DEBUG flags set.
EDIT:
To clarify: this question is about the semantics of the specified behavior of instances of condition_variable. The second point above appears to be reinforced in the following quote:
Requires: There shall be no thread blocked on *this. [ Note: That is, all threads shall have been notified; they may subsequently block on the lock specified in the wait. This relaxes the usual rules, which would have required all wait calls to happen before destruction. Only the notification to unblock the wait needs to happen before destruction. The user should take care to ensure that no threads wait on *this once the destructor has been started, especially when the waiting threads are calling the wait functions in a loop or using the overloads of wait, wait_for, or wait_until that take a predicate. — end note ]
The core semantic question seems to be what "blocked on" means. My present interpretation of the quote above would be that after the line
cv.notify_all(); // defined NOTIFY_IN_DESTRUCTOR
in ~notify_on_delete(), the test thread is no longer "blocked on" nod - which is to say that I presently understand that after this call "the notification to unblock the wait" has occurred, so according to the quote the requirement has been met to proceed with the destruction of the condition_variable instance.
Can someone provide a clarification of "blocked on" or "notification to unblock" to the effect that in the code above, the call to notify_all() does not satisfy the requirements of ~condition_variable()?
When NOTIFY_IN_DESTRUCTOR is defined:
Calling notify_one()/notify_all() doesn't mean that the waiting thread is immediately woken up, nor that the current thread waits for the other thread. It just means that if the waiting thread wakes up at some point after the current thread has called notify, it should proceed. So in essence, you might be deleting the condition variable before the waiting thread wakes up (depending on how the threads are scheduled).
The explanation for why it hangs, even though the condition variable is deleted while the other thread is waiting on it, lies in the fact that the wait/notify operations are implemented using queues associated with the condition variable. These queues hold the threads waiting on the condition variable, and freeing the condition variable means getting rid of these thread queues.
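For contrast, here is a sketch of a teardown that avoids the problem entirely (my own illustration, not the question's code): make the predicate true and notify while holding the mutex, then join the waiting thread before the condition variable can be destroyed.

#include <condition_variable>
#include <mutex>
#include <thread>

int main()
{
    std::condition_variable cv;
    std::mutex m;
    bool ready = false;

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return ready; }); // predicate guards against spurious wakeups
    });

    {
        std::lock_guard<std::mutex> lock(m); // notify while holding the mutex
        ready = true;
        cv.notify_all();
    }

    waiter.join(); // cv is destroyed only after the waiter has fully returned
    return 0;
}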
I am pretty sure your vendor's implementation is broken. Your program looks almost OK from the perspective of obeying the contract with the cv/mutex classes. I couldn't verify it 100%; I am one version behind.
The notion of “blocking” is confusing in the condition_variable (CV) class because there are multiple things to be blocking on. The contract requires the implementation to be more complex than a veneer on pthread_cond* (for example). My reading of it indicates that a single CV would require at least 2 pthread_cond_t’s to implement.
The crux is the destructor having a definition while threads are waiting upon a CV; and its ruin is in a race between CV.wait and ~CV. The naive implementation simply has ~CV broadcast the condvar then eliminate it, and has CV.wait remember the lock in a local variable, so that when it awakens from the runtime notion of blocking it no longer has to reference the object. In that implementation, ~CV becomes a “fire and forget” mechanism.
Sadly, a racing CV.wait could meet the preconditions yet not be finished interacting with the object when ~CV sneaks in and destroys it. To resolve the race, CV.wait and ~CV need to exclude each other, thus the CV requires at least a private mutex to resolve races.
We aren’t finished yet. There usually isn’t underlying support [e.g. in the kernel] for an operation like “wait on cv controlled by lock and release this other lock once I am blocked”. I think that even the posix folks found that too funny to require. Thus, burying a mutex in my CV isn’t enough; I actually require a mechanism that permits me to process events within it, so a private condvar is required inside the implementation of CV. Obligatory David Parnas meme.
Almost OK, because as Marek R points out, you are relying on referencing a class after its destruction has begun; not the cv/mutex class, but your notify_on_delete class. The conflict is a bit academic. I doubt clang would depend upon nod remaining valid after control had transferred to nod->cv.wait(); but the real customers of most compiler vendors are benchmarks, not programmers.
As a general note, multi-threaded programming is difficult, and having now peeked at the C++ threading model, it might be best to give it a decade or two to settle down. Its contracts are astonishing. When I first looked at your program, I thought ‘duh, there is no way you can destroy a cv that can be accessed, because RAII’. Silly me.
Pthreads is another awful API for threading. At least it doesn’t attempt over-reach, and is mature enough that robust test suites keep vendors in line.

Deadlocks related to scheduling

On the Oracle docs for multithreading they have this paragraph about deadlocks when trying to acquire a lock:
Because there is no guaranteed order in which locks are acquired, a problem in threaded programs is that a particular thread never acquires a lock, even though it seems that it should.

This usually happens when the thread that holds the lock releases it, lets a small amount of time pass, and then reacquires it. Because the lock was released, it might seem that the other thread should acquire the lock. But, because nothing blocks the thread holding the lock, it continues to run from the time it releases the lock until it reacquires the lock, and so no other thread is run.
Just to make sure I understood this, I tried to write this out in code (let me know if this is a correct interpretation):
#include <mutex>
#include <chrono>
#include <thread>
#include <iostream>

std::mutex m;

void f()
{
    std::unique_lock<std::mutex> lock(m); // acquire the lock
    std::cout << std::this_thread::get_id() << " now has the mutex\n";

    lock.unlock(); // release the lock
    std::this_thread::sleep_for(std::chrono::seconds(2)); // sleep for a while

    lock.lock(); // reacquire the lock
    std::cout << std::this_thread::get_id() << " has the mutex after sleep\n";
}

int main()
{
    std::thread(f).join(); // thread A
    std::thread(f).join(); // thread B
}
So what the quote above is saying is that the time during which the lock is released and the thread is sleeping (like the code above) is not sufficient to guarantee a thread waiting on the lock to acquire it? How does this relate to deadlocks?
The document is addressing a specific kind of fairness problem and is lumping it into its discussions about deadlock. The document correctly defines deadlock to mean a "permanent blocking of a set of threads". It then describes a slightly less obvious way of achieving the permanent blocking condition.
In the scenario described, assume two threads attempt to acquire the same lock simultaneously. Only one will win; call it W. The other will lose; call it L. The loser is put to sleep to wait its turn to get the lock.
The quoted text says that L may never get a chance to get the lock, even if it is released by W.
The reason this might happen is the lock does not impose fairness over which thread has acquired it. The lock is more than happy to let W acquire and release it forever. If W is implemented in such a way that it does not need to context switch after it releases the lock, it may just end up acquiring the lock again before L has a chance to wake up to see if the lock is available.
So, in the code below, if W_thread wins the initial race against L_thread, L_thread is effectively deadlocked, even though in theory it could acquire the lock between iterations of W_thread.
void W_thread () {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        //...
    }
}

void L_thread () {
    std::unique_lock<std::mutex> lock(m);
    //...
}
The document recommends using thr_yield() to force a context switch to another thread. This API is specific to Solaris/SunOS. The POSIX version of it is called sched_yield(), although some UNIX versions (including Linux) provide a wrapper called pthread_yield().
In C++11, this is accomplished via std::this_thread::yield().
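For illustration, here is a sketch of the greedy loop from above with a yield added (note that yield is only a hint to the scheduler, not a fairness guarantee):

#include <mutex>
#include <thread>

std::mutex m;

void W_thread()
{
    for (;;) {
        {
            std::unique_lock<std::mutex> lock(m);
            //... critical section ...
        } // lock released here
        // Give other threads (e.g. L_thread) a window to acquire m;
        // without this, W may reacquire the lock before L ever wakes.
        std::this_thread::yield();
    }
}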

Why locking a std::mutex doesn't block the thread

I wrote the following code to test my understanding of std::mutex
#include <mutex>

int main() {
    std::mutex m;
    m.lock();
    m.lock(); // expect to block the thread
}
And then I got a system_error: device or resource busy. Isn't the second m.lock() supposed to block the thread?
From std::mutex:
A calling thread must not own the mutex prior to calling lock or try_lock.
and from std::mutex::lock:
If lock is called by a thread that already owns the mutex, the program may deadlock. Alternatively, if an implementation can detect the deadlock, a resource_deadlock_would_occur error condition may be observed.
and the exceptions clause:
Throws std::system_error when errors occur, including errors from the underlying operating system that would prevent lock from meeting its specifications. The mutex is not locked in the case of any exception being thrown.
Therefore it is not supposed to block the thread. On your platform, the implementation appears to be able to detect when a thread is already the owner of a lock and raise an exception. This may not happen on other platforms, as indicated in the descriptions.
Isn't the second m.lock() supposed to block the thread?
No, it gives undefined behaviour. The second m.lock() breaks this requirement:
C++11 30.4.1.2/7 Requires: If m is of type std::mutex or std::timed_mutex, the calling thread does not own the mutex.
It looks like your implementation is able to detect that the calling thread owns the mutex and gives an error; others may block indefinitely, or fail in other ways.
(std::mutex wasn't mentioned in the question when I wrote this answer.)
It depends on the mutex library and mutex type you're using - you haven't told us. Some systems provide a "recursive mutex" that is allowed to be called multiple times like this only if it happens from the same thread (then you must have a matching number of unlocks before another thread can lock it), other libraries consider this an error and may fail gracefully (as yours has) or have undefined behaviour.
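In standard C++ that type is std::recursive_mutex, for which re-locking from the owning thread is well defined. A minimal sketch:

#include <iostream>
#include <mutex>

std::recursive_mutex rm;

int main()
{
    rm.lock();
    rm.lock(); // OK: the owning thread may re-lock a recursive_mutex
    std::cout << "locked twice\n";
    rm.unlock();
    rm.unlock(); // one unlock per lock before another thread can acquire it
    return 0;
}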

Is there a data race in this code?

On this page, this sample code is given to explain how to use notify_one:
#include <iostream>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <chrono>

std::condition_variable cv;
std::mutex cv_m;
int i = 0;
bool done = false;

void waits()
{
    std::unique_lock<std::mutex> lk(cv_m);
    std::cout << "Waiting... \n";
    cv.wait(lk, []{ return i == 1; });
    std::cout << "...finished waiting. i == 1\n";
    done = true;
}

void signals()
{
    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::cout << "Notifying...\n";
    cv.notify_one();

    std::unique_lock<std::mutex> lk(cv_m);
    i = 1;
    while (!done) {
        lk.unlock();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        lk.lock();
        std::cerr << "Notifying again...\n";
        cv.notify_one();
    }
}

int main()
{
    std::thread t1(waits), t2(signals);
    t1.join(); t2.join();
}
However, valgrind (helgrind, actually) complains that:
Probably a race condition: condition variable 0x605420 has been
signaled but the associated mutex 0x605460 is not locked by the
signalling thread.
If the second thread runs before the first one and reaches cv.notify_one() before anyone else, it will signal other threads without any lock being held.
I am actually learning how to use these condition variables and trying to understand who should lock/unlock the mutex associated with them. So my question is: is this code doing things right? or is helgrind wrong?
[Truth in advertising: I was, until very recently, the architect of a commercial data-race and memory error detector that "competes" with Helgrind/Valgrind.]
There is no data race in your code. Helgrind is issuing this warning because of a subtlety about the way condition variables work. There is some discussion about it in the "hints" section of the Helgrind manual. Briefly: Helgrind is doing happens-before data race detection. It derives the "happens-before" relation by observing the order in which your code is calling pthread_mutex_lock/unlock and pthread_cond_wait/signal (these are the C primitives upon which the C++11 primitives are implemented.)
If you follow the discipline that your cv.notify_one() calls are always protected by the same mutex that surrounds the corresponding cv.wait() calls then Helgrind knows that the mutexes will enforce the correct happens-before relationship and so everything will be okay.
In your case Helgrind is complaining about the initial (gratuitous) cv.notify_one() call at the top of signals(), before you acquire the lock on cv_m. It knows that this is the kind of situation that can confuse it (although the real confusion is that it might later report false positives, so the warning message here is a bit misleading.)
Please note that the advice to "use semaphores instead of condition_variables" in the hints section of the Helgrind manual is horrible advice. Semaphores are much harder to check for correctness than condition variables, both for tools and for humans. Semaphores are "too general" in the sense that there are all sorts of invariants you cannot rely on. The same thread that "locks" with a semaphore doesn't have to be the thread that "unlocks". Two threads that "wait" on a non-binary semaphore may or may not have a happens-before relationship. So semaphores are pretty much useless if you are trying to reason about (or automatically detect) deadlock or data race conditions.
Better advice is to use condition variables to signal/wait, but to make sure that you follow a discipline where all calls on a particular condition variable happen within critical sections protected by the same mutex.
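Applied to the question's example, that discipline means moving the initial notify_one() inside the critical section. A sketch of the reworked signals() (assuming the same globals as above); because the predicate is made true before notifying under the lock, the re-notify loop also becomes unnecessary:

void signals()
{
    std::this_thread::sleep_for(std::chrono::seconds(1));
    {
        std::lock_guard<std::mutex> lk(cv_m);
        i = 1;           // make the predicate true first...
        std::cout << "Notifying...\n";
        cv.notify_one(); // ...then notify while still holding cv_m
    }
    // A waiter that arrives late will see i == 1 when it checks the
    // predicate and will not block at all, so one notification suffices.
}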
In your case, there is no problem.
In general, issues can arise if there is a third thread that may be using the same mutex, or if it's possible that the waiting thread may destroy the condition variable when it finishes running.
It's just safer to lock the mutex while signalling, to ensure that the signalling code runs entirely before the woken-up code.
Edit: Turns out this isn't true, but it's being preserved here because @dauphic's explanation of why it isn't true in the comments is helpful.
You have reversed the order of unlock and lock in your while loop, such that you are attempting to unlock the mutex before you have ever locked it, which would seem consistent with the valgrind message you are seeing.

does boost::thread::timed_join(0) acquire a lock?

I need to check if my boost::thread I've created is running from another thread. This SO post explains you can do this by calling:
boost::posix_time::seconds waitTime(0);
myBoostThread.timed_join(waitTime);
I can't have any critical sections in my client thread. Can I guarantee that timed_join() with 0 time argument be lock free?
Boost.Thread provides no guarantees about a lock-free timed_join(). However, here is what the implementation does (always subject to change):
For pthreads, Boost.Thread acquires a mutex, then performs a timed wait on a condition variable.
For Windows, Boost.Thread calls WaitForMultipleObjects. Its documentation indicates that it will always return immediately with a zero timeout. However, I do not know if the underlying OS implementation is lock-free.
For an alternative, consider using atomic operations. While Boost 1.52 does not currently provide a public atomic library, both Boost.Smart_Ptr and Boost.Interprocess have atomic integers within their detail namespaces. However, neither of these guarantees a lock-free implementation, and one of the configurations for Boost.Smart_Ptr will lock with a pthread mutex. Thus, you may need to consult your compiler and system's documentation to identify a lock-free implementation.
Nevertheless, here is a small example using boost::detail::atomic_count:
#include <iostream>

#include <boost/chrono.hpp>
#include <boost/detail/atomic_count.hpp>
#include <boost/thread.hpp>

// Use RAII to perform cleanup.
struct count_guard
{
    count_guard(boost::detail::atomic_count& count) : count_(count) {}
    ~count_guard() { --count_; }

    boost::detail::atomic_count& count_;
};

void thread_main(boost::detail::atomic_count& count)
{
    // Place the guard on the stack. When the thread exits through either normal
    // means or the stack unwinding from an exception, the atomic count will be
    // decremented.
    count_guard decrement_on_exit(count);
    boost::this_thread::sleep_for(boost::chrono::seconds(5));
}

int main()
{
    boost::detail::atomic_count count(1);
    boost::thread t(thread_main, boost::ref(count));

    // Check the count to determine if the thread has exited.
    while (0 != count)
    {
        std::cout << "Sleeping for 2 seconds." << std::endl;
        boost::this_thread::sleep_for(boost::chrono::seconds(2));
    }
    t.join(); // join before the boost::thread object is destroyed
}
In this case, the at_thread_exit() extension could be used as an alternative to using RAII.
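A sketch of that variant (keeping the same thread_main signature as above; boost::this_thread::at_thread_exit registers a callable that Boost runs when the thread finishes):

#include <boost/chrono.hpp>
#include <boost/detail/atomic_count.hpp>
#include <boost/thread.hpp>

void thread_main(boost::detail::atomic_count& count)
{
    // Decrement the count when this thread exits, replacing count_guard.
    boost::this_thread::at_thread_exit([&count]() { --count; });
    boost::this_thread::sleep_for(boost::chrono::seconds(5));
}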
No, there is no such guarantee.
Even if the boost implementation is completely lock free (I haven't checked), there is no guarantee that the underlying OS implementation is completely lock free.
That said, if locks were used here, I would find it unlikely that they will cause any significant delay in the application, so I would not hesitate using timed_join unless there is a hard real-time deadline to meet (which does not equate to UI responsiveness).