Why is locking a std::mutex twice 'Undefined Behaviour'? - c++

As per this article:
If you try and lock a non-recursive mutex twice from the same thread without unlocking in between, you get undefined behavior.
My very naive mind tells me why don't they just return an error? Is there a reason why this has to be UB?

Because it never happens in a correct program, and making a check for something that never happens is wasteful (and to make that check it needs to store the owning thread ID, which is also wasteful).
Note that it being undefined allows debug implementations to throw an exception, for example, while still allowing release implementations to be as efficient as possible.

Undefined behavior allows implementations to do whatever is fastest/most convenient. For example, an efficient implementation of a non-recursive mutex might be a single bit where the lock operation is implemented with an atomic compare-and-swap instruction in a loop. If the thread that owns the mutex tries to lock it again it will deadlock because it is waiting for the mutex to unlock but since nobody else can unlock it (unless there's some other bug where some thread that doesn't own it unlocks it) the thread will wait forever.

Related

std::condition_variable memory writes visibility

Does std::condition_variable::notify_one() or std::condition_variable::notify_all() guarantee that non-atomic memory writes in the current thread prior to the call will be visible in notified threads?
Other threads do:
{
std::unique_lock lock(mutex);
cv.wait(lock, []() { return values[threadIndex] != 0; });
// May a thread here see a zero value and therefore start to wait again?
}
Main thread does:
fillData(values); // All values are zero and all threads wait() before calling this.
cv.notify_all(); // Do need some memory fence or lock before this
// to ensure that new non-zero values will be visible
// in other threads immediately after waking up?
Doesn't notify_all() store some atomic value therefore enforcing memory ordering? I did not clarified it.
UPD: according to Superlokkus' answer and an answer here: we have to acquire a lock to ensure memory writes visibility in other threads (memory propagation), otherwise threads in my case may read zero values.
Also I missed this quote here about condition_variable, which specifically answers my question. Even an atomic variable has to be modified under a lock in a case when the modification must become visible immediately.
Even if the shared variable is atomic, it must be modified under the
mutex in order to correctly publish the modification to the waiting
thread.
I guess you are mixing up memory ordering of so called atomic values and the mechanisms of classic lock based synchronization.
When you have a datum which is shared between threads, lets say an int for example, one thread can not simply read it while the other thread might be write to it meanwhile. Otherwise we would have a data race.
To get around this for long time we used classic lock based synchronization:
The threads share at least a mutex and the int. To read or to write any thread has to hold the lock first, meaning they wait on the mutex. Mutexes are build so that they are fine that this can happen concurrently. If a thread wins gettting the mutex it can change or read the int and then should unlock it, so others can read/write too. Using a conditional variable like you used is just to make the pattern "readers wait for a change of a value by a writer" more efficient, they get woken up by the cv instead of periodically waiting on the lock, reading, and unlocking, which would be called busy waiting.
So because you hold the lock in any after waiting on the mutex or in you case, correctly (mutex is still needed) waiting on the conditional variable, you can change the int. And readers will read the new value after the writer was able to wrote it, never the old. UPDATE: However one thing if have to add, which might also be the cause of confusion: Conditional variables are subject for so called spurious wakeups. Meaning even though you write did not have notified any thread, a read thread might still wake up, with the mutex locked. So you have to check if you writer actually waked you up, which is usually done by the writer by changing another datum just to notify this, or if its suitable by using the same datum you already wanted to share. The lambda parameter overload of std::condition_variable::wait was just made to make the checking and going back to sleep code looking a bit prettier. Based on your question now I don't know if you want to use you values for this job.
However at snippet for the "main" thread is incorrect or incomplete:
You are not synchronizing on the mutex in order to change values.
You have to hold the lock for that, but notifying can be done without the lock.
std::unique_lock lock(mutex);
fillData(values);
lock.unlock();
cv.notify_all();
But these mutex based patters have some drawbacks and are slow, only one thread at a time can do something. This is were so called atomics, like std::atomic<int> came into play. They can be written and read at the same time without an mutex by multiple threads concurrently. Memory ordering is only a thing to consider there and an optimization for cases where you uses several of them in a meaningful way or you don't need the "after the write, I never see the old value" guarantee. However with it's default memory ordering memory_order_seq_cst you would also be fine.

C++ atomics: how to allow only a single thread to access a function?

I'd like to write a function that is accessible only by a single thread at a time. I don't need busy waits, a brutal 'rejection' is enough if another thread is already running it. This is what I have come up with so far:
std::atomic<bool> busy (false);
bool func()
{
if (m_busy.exchange(true) == true)
return false;
// ... do stuff ...
m_busy.exchange(false);
return true;
}
Is the logic for the atomic exchange correct?
Is it correct to mark the two atomic operations as std::memory_order_acq_rel? As far as I understand a relaxed ordering (std::memory_order_relaxed) wouldn't be enough to prevent reordering.
Your atomic swap implementation might work. But trying to do thread safe programming without a lock is most always fraught with issues and is often harder to maintain.
Unless there's a performance improvement that's needed, then std::mutex with the try_lock() method is all you need, eg:
std::mutex mtx;
bool func()
{
// making use of std::unique_lock so if the code throws an
// exception, the std::mutex will still get unlocked correctly...
std::unique_lock<std::mutex> lck(mtx, std::try_to_lock);
bool gotLock = lck.owns_lock();
if (gotLock)
{
// do stuff
}
return gotLock;
}
Your code looks correct to me, as long as you leave the critical section by falling out, not returning or throwing an exception.
You can unlock with a release store; an RMW (like exchange) is unnecessary. The initial exchange only needs acquire. (But does need to be an atomic RMW like exchange or compare_exchange_strong)
Note that ISO C++ says that taking a std::mutex is an "acquire" operation, and releasing is is a "release" operation, because that's the minimum necessary for keeping the critical section contained between the taking and the releasing.
Your algo is exactly like a spinlock, but without retry if the lock's already taken. (i.e. just a try_lock). All the reasoning about necessary memory-order for locking applies here, too. What you've implemented is logically equivalent to the try_lock / unlock in #selbie's answer, and very likely performance-equivalent, too. If you never use mtx.lock() or whatever, you're never actually blocking i.e. waiting for another thread to do something, so your code is still potentially lock-free in the progress-guarantee sense.
Rolling your own with an atomic<bool> is probably good; using std::mutex here gains you nothing; you want it to be doing only this for try-lock and unlock. That's certainly possible (with some extra function-call overhead), but some implementations might do something more. You're not using any of the functionality beyond that. The one nice thing std::mutex gives you is the comfort of knowing that it safely and correctly implements try_lock and unlock. But if you understand locking and acquire / release, it's easy to get that right yourself.
The usual performance reason to not roll your own locking is that mutex will be tuned for the OS and typical hardware, with stuff like exponential backoff, x86 pause instructions while spinning a few times, then fallback to a system call. And efficient wakeup via system calls like Linux futex. All of this is only beneficial to the blocking behaviour. .try_lock leaves that all unused, and if you never have any thread sleeping then unlock never has any other threads to notify.
There is one advantage to using std::mutex: you can use RAII without having to roll your own wrapper class. std::unique_lock with the std::try_to_lock policy will do this. This will make your function exception-safe, making sure to always unlock before exiting, if it got the lock.

Is that OK if I unlock a mutex before I lock it?

I wrote some code before but I realized it has a very bad code style so I changed it.
I changed the A.unlock inside of the if block. I know if the if never run then this thread will unlock the mutex which does not belong to itself and then it will return an undefined behavior.
My question is, if it returns an undefined behavior, will the logic here still work? Because if thread t1 didn't have the lock, t1 unlock the mutex A will return undefine behavior and the mutex will still be held by the thread which holds it right? And it will not affect the other logic in this code.
My old code works as same as I put the unlock part inside of the if block. So that's why I am curious how can this work.
mutex A;
if(something)
{
A.lock();
}
A.unlock();
When calling unlock on a mutex, the mutex must be owned by the current thread or the behavior is undefined. Undefined behavior means anything can happen, including the program appearing to run correctly, the program crashing, or memory elsewhere getting corrupted and a problem not being visible until later.
Normally a mutex is not used directly; one of the other standard classes (like std::unique_lock or std::lock_guard are used to manage it. Then you won't have to worry about unlocking the mutex.
You should consider using std::lock_guard :
mutex A;
if (something)
{
lock_guard<mutex> lock(A);
}
It is an undefined behavior to unlock a mutex you haven't locked. It may work on some cases for so many times but some day it will break and behave differently.
You can use a lock_guard in your case and forget about lock/unlock
std::lock_guard<std::mutex> lock(A);
The ISO C++ standard has this to say on the matter, in [thread.mutex.requirements.mutex] (my emphasis):
The expression m.unlock() shall be well-formed and have the following semantics:
Requires: The calling thread shall own the mutex.
That means what you are doing is a violation of the standard and is therefore undefined. It may work, or it may delete all your files while playing the derisive_maniacal_laughter.mp3 file :-)
Bottom line, don't do it.
I also wouldn't put the unlock inside the loop since modern C++ has higher-level mechanisms for dealing with auto-release of resources. In this case, it's a lock guard:
std::mutex mtxProtectData; // probably defined elsewhere (long-lived)
: : :
if (dataNeedsChanging) {
std::lock_guard<std::mutex> mtxGuard(mtxProtectData);
// Change data, mutex will unlock at brace below
// when mtxGuard goes out of scope.
}

Why doesn't pthread_mutex_t block, when `g++` compile code [duplicate]

As per this article:
If you try and lock a non-recursive mutex twice from the same thread without unlocking in between, you get undefined behavior.
My very naive mind tells me why don't they just return an error? Is there a reason why this has to be UB?
Because it never happens in a correct program, and making a check for something that never happens is wasteful (and to make that check it needs to store the owning thread ID, which is also wasteful).
Note that it being undefined allows debug implementations to throw an exception, for example, while still allowing release implementations to be as efficient as possible.
Undefined behavior allows implementations to do whatever is fastest/most convenient. For example, an efficient implementation of a non-recursive mutex might be a single bit where the lock operation is implemented with an atomic compare-and-swap instruction in a loop. If the thread that owns the mutex tries to lock it again it will deadlock because it is waiting for the mutex to unlock but since nobody else can unlock it (unless there's some other bug where some thread that doesn't own it unlocks it) the thread will wait forever.

Why locking a std::mutex doesn't block the thread

I wrote the following code to test my understanding of std::mutex
int main() {
mutex m;
m.lock();
m.lock(); // expect to block the thread
}
And then I got a system_error: device or resource busy. Isn't the second m.lock() supposed to block the thread?
From std::mutex:
A calling thread must not own the mutex prior to calling lock or try_lock.
and from std::mutex::lock:
If lock is called by a thread that already owns the mutex, the program may deadlock. Alternatively, if an implementation can detect the deadlock, a resource_deadlock_would_occur error condition may be observed.
and the exceptions clause:
Throws std::system_error when errors occur, including errors from the underlying operating system that would prevent lock from meeting its specifications. The mutex is not locked in the case of any exception being thrown.
Therefore it is not supposed to block the thread. On your platform, the implementation appears to be able to detect when a thread is already the owner of a lock and raise an exception. This may not happen on other platforms, as indicated in the descriptions.
Isn't the second m.lock() supposed to block the thread?
No, it gives undefined behaviour. The second m.lock() breaks this requirement:
C++11 30.4.1.2/7 Requires: If m is of type std::mutex or std::timed_mutex, the calling thread does not own the mutex.
It looks like your implementation is able to detect that the calling thread owns the mutex and gives an error; others may block indefinitely, or fail in other ways.
(std::mutex wasn't mentioned in the question when I wrote this answer.)
It depends on the mutex library and mutex type you're using - you haven't told us. Some systems provide a "recursive mutex" that is allowed to be called multiple times like this only if it happens from the same thread (then you must have a matching number of unlocks before another thread can lock it), other libraries consider this an error and may fail gracefully (as yours has) or have undefined behaviour.