As per this article:
If you try and lock a non-recursive mutex twice from the same thread without unlocking in between, you get undefined behavior.
My naive question: why don't implementations just return an error? Is there a reason why this has to be UB?
Because it never happens in a correct program, and making a check for something that never happens is wasteful (and to make that check it needs to store the owning thread ID, which is also wasteful).
Note that it being undefined allows debug implementations to throw an exception, for example, while still allowing release implementations to be as efficient as possible.
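To make the cost concrete, here is a rough sketch of what such a checking "debug" mutex could look like. CheckedMutex and relock_is_detected are names made up for this illustration, not part of any standard library; the point is only that detection requires storing the owner's thread ID next to the mutex and touching it on every lock and unlock.

```cpp
#include <atomic>
#include <mutex>
#include <system_error>
#include <thread>

// Hypothetical debug wrapper (not a standard class): remembers the owning
// thread so that a second lock() from that thread throws instead of being UB.
class CheckedMutex {
public:
    void lock() {
        const std::thread::id self = std::this_thread::get_id();
        // Only this thread can ever have stored `self`, so a relaxed load is
        // enough to answer "do I already own this mutex?".
        if (owner_.load(std::memory_order_relaxed) == self) {
            throw std::system_error(
                std::make_error_code(std::errc::resource_deadlock_would_occur));
        }
        inner_.lock();
        owner_.store(self, std::memory_order_relaxed);
    }

    void unlock() {
        owner_.store(std::thread::id{}, std::memory_order_relaxed);
        inner_.unlock();
    }

private:
    std::mutex inner_;
    std::atomic<std::thread::id> owner_{std::thread::id{}};  // extra state
};

// Returns true if relocking from the owning thread was reported as an error.
bool relock_is_detected() {
    CheckedMutex m;
    m.lock();
    bool detected = false;
    try {
        m.lock();  // same thread, no unlock in between
    } catch (const std::system_error& e) {
        detected = (e.code() == std::errc::resource_deadlock_would_occur);
    }
    m.unlock();
    return detected;
}
```

A release build that skips the owner_ bookkeeping pays for none of this, which is exactly the freedom the undefined behavior leaves open.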
Undefined behavior allows implementations to do whatever is fastest or most convenient. For example, an efficient implementation of a non-recursive mutex might be a single bit, with the lock operation implemented as an atomic compare-and-swap instruction in a loop. If the thread that owns the mutex tries to lock it again, it deadlocks: it is waiting for the mutex to be unlocked, but nobody else can unlock it (unless some other bug lets a thread that doesn't own it unlock it), so the thread waits forever.
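A minimal sketch of that single-flag design, assuming std::atomic<bool> as the "bit". SpinLock and owner_would_spin_forever are hypothetical names for this example, and real mutex implementations typically block in the kernel rather than spin. The demonstration uses try_lock() for the second attempt so that it terminates; calling lock() there would loop forever.

```cpp
#include <atomic>

// Minimal spinlock sketch: one atomic flag, no record of who owns it.
class SpinLock {
public:
    // Single attempt: flips false -> true with compare-and-swap.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire);
    }

    // Retries until the flag is observed clear. If the caller already holds
    // the lock, nobody will clear the flag and this loop never ends.
    void lock() {
        while (!try_lock()) {
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Shows the state a relocking owner would spin on.
bool owner_would_spin_forever() {
    SpinLock s;
    s.lock();
    const bool second_attempt = s.try_lock();  // fails: flag is already set
    s.unlock();
    const bool after_unlock = s.try_lock();    // succeeds once it is clear
    s.unlock();
    return !second_attempt && after_unlock;
}
```

Nothing in SpinLock can tell "held by me" apart from "held by someone else", so there is no cheap place to return an error from.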
Does std::condition_variable::notify_one() or std::condition_variable::notify_all() guarantee that non-atomic memory writes in the current thread prior to the call will be visible in notified threads?
Other threads do:
{
    std::unique_lock lock(mutex);
    cv.wait(lock, []() { return values[threadIndex] != 0; });
    // May a thread here see a zero value and therefore start to wait again?
}
Main thread does:
fillData(values); // All values are zero and all threads wait() before calling this.
cv.notify_all(); // Do we need some memory fence or lock before this
                 // to ensure that the new non-zero values will be visible
                 // in other threads immediately after waking up?
Doesn't notify_all() store some atomic value and thereby enforce memory ordering? I could not find a clear answer.
UPD: according to Superlokkus' answer and an answer here, we have to acquire a lock to ensure that memory writes are visible in other threads (memory propagation); otherwise the threads in my case may read zero values.
I also missed this quote here about condition_variable, which specifically answers my question. Even an atomic variable has to be modified under a lock in cases where the modification must become visible immediately.
Even if the shared variable is atomic, it must be modified under the
mutex in order to correctly publish the modification to the waiting
thread.
I guess you are mixing up the memory ordering of so-called atomic values with the mechanisms of classic lock-based synchronization.
When you have a datum that is shared between threads, let's say an int, one thread cannot simply read it while another thread might be writing to it. Otherwise we would have a data race.
To get around this, for a long time we have used classic lock-based synchronization:
The threads share at least a mutex and the int. To read or write, a thread has to hold the lock first, meaning it waits on the mutex. Mutexes are built so that concurrent attempts to lock them are fine. If a thread wins the mutex, it can change or read the int and should then unlock it so that others can read or write too. Using a condition variable as you did just makes the pattern "readers wait for a change of a value by a writer" more efficient: the readers get woken up by the cv instead of periodically locking, reading, and unlocking, which would be called busy waiting.
So because you hold the lock after waiting on the mutex or, in your case, after correctly waiting on the condition variable (the mutex is still needed), you can change the int. And readers will read the new value after the writer has written it, never the old one.
UPDATE: One thing I have to add, which might also be the cause of confusion: condition variables are subject to so-called spurious wakeups. That is, even though your writer has not notified any thread, a reader thread might still wake up, with the mutex locked. So you have to check whether your writer actually woke you up, which is usually done by having the writer change another datum just to signal this or, if suitable, by using the same datum you already wanted to share. The overload of std::condition_variable::wait that takes a predicate (your lambda) exists to make the check-and-go-back-to-sleep code a bit prettier. Based on your question, I can't tell whether you want to use your values for this job.
However, your snippet for the "main" thread is incorrect or incomplete:
You are not synchronizing on the mutex in order to change values.
You have to hold the lock for that, but notifying can be done without the lock.
std::unique_lock lock(mutex);
fillData(values);
lock.unlock();
cv.notify_all();
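Putting the waiting side and the corrected notifying side together, the whole pattern might look roughly like this. It is only a sketch: run_notify_demo, seen, and the values written are invented for the example, and the question's fillData is replaced by a plain loop.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Starts `workers` waiting threads, fills the shared values under the mutex,
// notifies, and returns how many workers saw a non-zero value after waking.
int run_notify_demo(std::size_t workers) {
    std::mutex mutex;
    std::condition_variable cv;
    std::vector<int> values(workers, 0);  // shared, protected by `mutex`
    std::vector<int> seen(workers, 0);    // each worker writes its own slot

    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < workers; ++i) {
        threads.emplace_back([&, i] {
            std::unique_lock<std::mutex> lock(mutex);
            // The predicate is checked under the lock, so spurious wakeups
            // and a notification sent before we started waiting are both fine.
            cv.wait(lock, [&] { return values[i] != 0; });
            seen[i] = values[i];
        });
    }

    {
        std::lock_guard<std::mutex> lock(mutex);  // write under the mutex
        for (std::size_t i = 0; i < workers; ++i) {
            values[i] = static_cast<int>(i) + 1;
        }
    }                 // unlock first...
    cv.notify_all();  // ...then notify; no lock is needed for this call

    for (auto& t : threads) t.join();

    int nonzero = 0;
    for (int v : seen) nonzero += (v != 0);
    return nonzero;
}
```

Because both the write and the predicate check happen while holding the same mutex, a woken thread cannot observe the old zero value.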
But these mutex-based patterns have some drawbacks and are slow: only one thread at a time can do something. This is where so-called atomics, like std::atomic<int>, come into play. They can be written and read by multiple threads concurrently without a mutex. Memory ordering only matters there; it is an optimization for cases where you use several atomics in a meaningful way or don't need the "after the write, I never see the old value" guarantee. With the default memory ordering, memory_order_seq_cst, you would also be fine.
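As a small illustration of the atomics side, here is a sketch in which several threads update one std::atomic<int> with no mutex at all. The function name and parameters are made up for the example. Note that this covers plain concurrent access only; a value that a condition variable waits on still has to be modified under the mutex, as the quote above says.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each thread increments a shared std::atomic<int> without any mutex.
int count_concurrently(int threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1);  // default: std::memory_order_seq_cst
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

With a plain int instead of std::atomic<int>, the same loop would be a data race and could lose increments.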
I wrote some code a while ago, but I realized its style was very bad, so I changed it.
I moved A.unlock() inside the if block. I know that in the old version, if the if block never runs, the thread unlocks a mutex it does not own, which is undefined behavior.
My question is: given that it is undefined behavior, will the logic here still work? If thread t1 does not hold the lock, unlocking mutex A from t1 is undefined behavior, but the mutex will still be held by the thread that owns it, right? So it should not affect the rest of the logic in this code.
My old code (below) behaves the same as the version with the unlock inside the if block, which is why I am curious how this can work.
mutex A;
if(something)
{
    A.lock();
}
A.unlock();
When calling unlock on a mutex, the mutex must be owned by the current thread or the behavior is undefined. Undefined behavior means anything can happen, including the program appearing to run correctly, the program crashing, or memory elsewhere getting corrupted and a problem not being visible until later.
Normally a mutex is not used directly; one of the other standard classes (like std::unique_lock or std::lock_guard) is used to manage it. Then you won't have to worry about unlocking the mutex.
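If the lock really has to be taken conditionally while the code after the if is shared, one option is std::unique_lock with std::defer_lock. This is a sketch under that assumption; conditional_work is a hypothetical function, and A stands in for the mutex from the question.

```cpp
#include <mutex>

std::mutex A;  // stands in for the mutex from the question

// Locks A only when `something` is true. The unique_lock destructor unlocks
// only if this thread actually owns the mutex, so there is no stray unlock().
bool conditional_work(bool something) {
    std::unique_lock<std::mutex> lock(A, std::defer_lock);  // not locked yet
    if (something) {
        lock.lock();
    }
    // ... work that is valid with or without the lock ...
    return lock.owns_lock();  // reported only so the sketch can be checked
}
```

Here the "unlock a mutex I never locked" case cannot occur, because the unique_lock tracks ownership itself.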
You should consider using std::lock_guard :
mutex A;
if (something)
{
    lock_guard<mutex> lock(A);
}
Unlocking a mutex you haven't locked is undefined behavior. It may appear to work many times, but some day it will break and behave differently.
You can use a lock_guard in your case and forget about lock/unlock:
std::lock_guard<std::mutex> lock(A);
The ISO C++ standard has this to say on the matter, in [thread.mutex.requirements.mutex] (my emphasis):
The expression m.unlock() shall be well-formed and have the following semantics:
Requires: The calling thread shall own the mutex.
That means what you are doing is a violation of the standard and is therefore undefined. It may work, or it may delete all your files while playing the derisive_maniacal_laughter.mp3 file :-)
Bottom line, don't do it.
I also wouldn't put the unlock inside the if block, since modern C++ has higher-level mechanisms for automatically releasing resources. In this case, it's a lock guard:
std::mutex mtxProtectData; // probably defined elsewhere (long-lived)
: : :
if (dataNeedsChanging) {
    std::lock_guard<std::mutex> mtxGuard(mtxProtectData);
    // Change data, mutex will unlock at brace below
    // when mtxGuard goes out of scope.
}
I wrote the following code to test my understanding of std::mutex
#include <mutex>

int main() {
    std::mutex m;
    m.lock();
    m.lock(); // expect to block the thread
}
And then I got a system_error: device or resource busy. Isn't the second m.lock() supposed to block the thread?
From std::mutex:
A calling thread must not own the mutex prior to calling lock or try_lock.
and from std::mutex::lock:
If lock is called by a thread that already owns the mutex, the program may deadlock. Alternatively, if an implementation can detect the deadlock, a resource_deadlock_would_occur error condition may be observed.
and the exceptions clause:
Throws std::system_error when errors occur, including errors from the underlying operating system that would prevent lock from meeting its specifications. The mutex is not locked in the case of any exception being thrown.
Therefore it is not supposed to block the thread. On your platform, the implementation appears to be able to detect when a thread is already the owner of a lock and raise an exception. This may not happen on other platforms, as indicated in the descriptions.
Isn't the second m.lock() supposed to block the thread?
No, it gives undefined behaviour. The second m.lock() breaks this requirement:
C++11 30.4.1.2/7 Requires: If m is of type std::mutex or std::timed_mutex, the calling thread does not own the mutex.
It looks like your implementation is able to detect that the calling thread owns the mutex and gives an error; others may block indefinitely, or fail in other ways.
(std::mutex wasn't mentioned in the question when I wrote this answer.)
It depends on the mutex library and mutex type you're using; you haven't told us. Some systems provide a "recursive mutex" that may be locked multiple times like this, but only from the same thread (and you must then have a matching number of unlocks before another thread can lock it). Other libraries consider this an error and may fail gracefully (as yours has) or have undefined behaviour.
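For the standard library specifically, the recursive variant is std::recursive_mutex. The following sketch (recursive_lock_demo is a made-up name) locks it twice from one thread and uses a second thread to show that a single unlock is not enough to release it.

```cpp
#include <mutex>
#include <thread>

// Returns true if a second thread was refused while one level of ownership
// remained. After the matching second unlock, another thread locks normally.
bool recursive_lock_demo() {
    std::recursive_mutex m;
    m.lock();
    m.lock();    // allowed: same thread, ownership count becomes 2
    m.unlock();  // count 1, still owned by this thread

    bool refused = false;
    std::thread probe([&] {
        refused = !m.try_lock();  // owned by another thread: must fail
        if (!refused) m.unlock();
    });
    probe.join();

    m.unlock();  // count 0, released

    std::thread taker([&] {
        std::lock_guard<std::recursive_mutex> guard(m);  // succeeds now
    });
    taker.join();
    return refused;
}
```

The ownership count is the extra bookkeeping that a plain std::mutex is allowed to do without.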
I understand how to use condition variables (crummy name for this construct, IMO, as the cv object neither is a variable nor indicates a condition). So I have a pair of threads, canonically set up with Boost.Thread as:
bool awake = false;
boost::mutex sync;
boost::condition_variable cv;
void thread1()
{
    boost::unique_lock<boost::mutex> lock1(sync);
    while (!awake)
        cv.wait(lock1);
    lock1.unlock(); // this line actually not canonical, but why not?
    // proceed...
}
void thread2()
{
    //...
    boost::unique_lock<boost::mutex> lock2(sync);
    awake = true;
    lock2.unlock();
    cv.notify_all();
}
My question is: does thread2 really need to be protecting the assignment to awake? It seems to me the notify_all() call should be sufficient. If the data being manipulated and checked against were more than a simple "ok to proceed" flag, I see the value in the mutex, but here it seems like overkill.
A secondary question is that asked in the code fragment: Why doesn't the Boost documentation show the lock in thread1 being unlocked before the "process data" step?
EDIT: Maybe my question is really: Is there a cleaner construct than a CV to implement this kind of wait?
does thread2 really need to be protecting the assignment to awake?
Yes. Modifying an object from one thread and accessing it from another without synchronisation gives undefined behaviour. Even if it's just a bool.
For example, on some multiprocessor systems the write might only affect local memory; without an explicit synchronisation operation, other threads might never see the change.
Why doesn't the Boost documentation show the lock in thread1 being unlocked before the "process data" step?
If you unlocked the mutex before clearing the flag, then you might miss another signal.
Is there a cleaner construct than a CV to implement this kind of wait?
In Boost and the standard C++ library, no; a condition variable is flexible enough to handle arbitrary shared state and not particularly over-complicated for this simple case, so there's no particular need for anything simpler.
More generally, you could use a semaphore or a pipe to send a simple signal between threads.
Formally, you definitely need the lock in both threads: if any thread
modifies an object, and more than one thread accesses it, then all
accesses must be synchronized.
In practice, you'll probably get away with it without the lock; it's
almost certain that notify_all will issue the necessary fence or
membar instructions to ensure that the memory is properly synchronized.
But why take the risk?
As to the absence of the unlock, that's the whole point of the scoped
locking pattern: the unlock is in the destructor of the object, so
that the mutex will be unlocked even if an exception passes through.
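A small sketch of that last point, written against the standard library rather than Boost. TracedMutex is a hypothetical wrapper that exists only so the example can observe whether the lock was released; std::lock_guard accepts it because it provides lock() and unlock().

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper used only to observe the lock state in this sketch.
struct TracedMutex {
    void lock() { inner.lock(); held = true; }
    void unlock() { held = false; inner.unlock(); }
    std::mutex inner;
    bool held = false;  // written only while `inner` is locked
};

// Throws while a scoped lock is alive.
void update_or_throw(TracedMutex& m, bool fail) {
    std::lock_guard<TracedMutex> lock(m);
    if (fail) throw std::runtime_error("update failed");
}  // ~lock_guard calls m.unlock() on normal exit and during unwinding

// Returns true if the mutex was released even though an exception escaped.
bool released_after_exception() {
    TracedMutex m;
    bool caught = false;
    try {
        update_or_throw(m, true);
    } catch (const std::runtime_error&) {
        caught = true;
    }
    return caught && !m.held;
}
```

With a manual lock()/unlock() pair, the throw would skip the unlock and leave the mutex locked.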