C++ Pthread Mutex Locking - c++

I guess I'm a little unsure of how mutexs work. If a mutex gets locked after some conditional, will it only lock out threads that meet that same condition, or will it lock out all threads regardless until the mutex is unlocked?
Ex:
if (someBoolean)
pthread_mutex_lock(& mut);
someFunction();
pthread_mutex_unlock(& mut);
Will all threads be stopped from running someFunction();, or just those threads that pass through the if-statement?

You need to call pthread_mutex_lock() in order for a thread to be locked. If you call pthread_mutex lock in thread A, but not in thread B, thread B does not get locked (and you have likely defeated the purpose of mutual exclusion in your code, as it's only useful if every thread follows the same locking protocol to protect your code).
The code in question has a few issues outlined below:
if (someBoolean) //if someBoolean is shared among threads, you need to lock
//access to this variable as well.
pthread_mutex_lock(& mut);
someFunction(); //now you have some threads calling someFunction() with the lock
//held, and some calling it without the lock held, which
//defeats the purpose of mutual exclusion.
pthread_mutex_unlock(& mut); //If you did not lock the *mut* above, you must not
//attempt to unlock it.

Will all threads be stopped from running someFunction();, or just those threads that pass through the if-statement?
Only the threads for which someBoolean is true will obtain the lock. Therefore, only those threads will be prevented from calling someFunction() while someone else holds the same lock.
However, in the provided code, all threads will call pthread_mutex_unlock on the mutex, regardless of whether they actually locked it. For mutexes created with default parameters this constitutes undefined behavior and must be fixed.

Related

Understanding condition_variable::wait for blocking a thread

While implementing a thread pool pattern in C++ based on this, I came across a few questions.
Let's assume minimal code sample:
std::mutex thread_mutex;
std::condition_variable thread_condition;
void thread_func() {
std::unique_lock<std::mutex> lock(thread_mutex);
thread_condition.wait(lock);
lock.unlock();
}
std::thread t1 = std::thread(thread_func);
Regarding cppreference.com about conditon_variable::wait(), wait() causes the current thread to block. What is locking the mutex then for when I only need one thread at all using wait() to get notified when something is to do?
unique_lock will block the thread when the mutex already has been locked by another thread. But this wouldn't be neccesary as long as wait() blocks anyway or what do I miss here?
Adding a few lines at the bottom...
std::thread t2 = std::thread(thread_func);
thread_condition.notify_all()
When unique_lock is blocking the thread, how will notify_all() reach both threads when one of them is locked by unique_lock and the other is blocked by wait()? I understand that blocking wait() will be freed by notify_all() which afterwards leads to unlocking the mutex and that this gives chance to the other thread for locking first the mutex and blocking thread by wait() afterwards. But how is this thread notified than?
Expanding this question by adding a loop in thread_func()...
std::mutex thread_mutex;
std::condition_variable thread_condition;
void thread_func() {
while(true) {
std::unique_lock<std::mutex> lock(thread_mutex);
thread_condition.wait(lock);
lock.unlock();
}
}
std::thread t1 = std::thread(thread_func);
std::thread t2 = std::thread(thread_func);
thread_condition.notify_all()
While reading documentation, I would now expect both threads running endlessly. But they do not return from wait() lock. Why do I have to use a predicate for expected behaviour like this:
bool wakeup = false;
//[...]
thread_condition.wait(lock, [] { return wakeup; });
//[...]
wakeup = !wakeup;
thread_condition.notify_all();
Thanks in advance.
This is really close to being a duplicate, but it's actually that question that answers this one; we also have an answer that more or less answers this question, but the question is distinct. I think that an independent answer is needed, even though it's little more than a (long) definition.
What is a condition variable?
The operational definition is that it's a means for a thread to block until a message arrives from another thread. A mutex alone can't possibly do this: if all other threads are busy with unrelated work, a mutex can't block a thread at all. A semaphore can block a lone thread, but it's tightly bound to the notion of a count, which isn't always appropriate to the nature of the message to receive.
This "channel" can be implemented in several ways. Very low-tech is to use a pipe, but that involves expensive system calls. Windows provides the Event object which is fundamentally a boolean on whose truth a thread may wait. (C++20 provides a similar feature with atomic_flag::wait.)
Condition variables take a different approach: their structural definition is that they are stateless, but have a special connection to a corresponding mutex type. The latter is necessitated by the former: without state, it is impossible to store a message, so arrangements must be made to prevent sending a message during some interval between a thread recognizing the need to wait (by examining some other state: perhaps that the queue from which it wants to pop is empty) and it actually being blocked. Of course, after the thread is blocked it cannot take any action to allow the message to be sent, so the condition variable must do so.
This is implemented by having the thread take a mutex before checking the condition and having wait release that mutex only after the thread can receive the message. (In some implementations, the mutex is also used to protect the workings of the condition variable, but C++ does not do so.) When the message is received, the mutex is re-acquired (which may block the thread again for a time), as is necessary to consult the external state again. wait thus acts like an everted std::unique_lock: the mutex is unlocked during wait and locked again afterwards, with possibly arbitary changes having been made by other threads in the meantime.
Answers
Given this understanding, the individual answers here are trivial:
Locking the mutex allows the waiting thread to safely decide to wait, given that there must be some other thread affecting the state in question.
If the std::unique_lock blocks, some other thread is currently updating the state, which might actually obviate the need for wait.
Any number of threads can be in wait, since each unlocks the mutex when it calls it.
Waiting on a condition variable, er, unconditionally is always wrong: the state you're after might already apply, with no further messages coming.

What does std::mutex prevent threads from modifying?

What part of memory gets locked by mutex when .lock() or .try_lock(), is it just the function or is it the whole program that gets locked?
Nothing is locked except the mutex. Everything else continues running (until it tries to lock an already locked mutex that is). The mutex is only there so that two threads cannot run the code between a mutex lock and a mutex unlock at the same time.
A mutex doesn't really lock anything, except for itself. You can think of a mutex as being a gate where you can only unlock it from the inside. When the gate is locked, any thread that tries to lock the mutex will sit there at the gate and wait for the current thread that is behind the gate to unlock it and let them in. When they gate is not locked then when you call lock you can just go in, close and lock the gate, and now no threads can get past the gate until you unlock it and let them in.
A mutex doesn't lock anything. You just use a mutex to communicate to other parts of your code that they should consider whatever you decide needs to be protected from access by several threads at the same time to be off-limits for now.
You could think of a mutex as something like a boolean okToModify. Whenever you want to edit something, you check if okToModify is true. If it is, you set it to false (preventing any other threads from modifying it), change it, then set okToModify back to true to tell the other threads you're done and give them a chance to modify:
// WARNING! This code doesn't actually work as a lock!
// it is just an example of the concept.
struct LockedInt {
bool okToModify; // This would be your mutex instead of a bool.
int integer;
};
struct LockedInt myLockedInt = { true, 0 };
...
while (myLockedInt.okToModify == false)
; // wait doing nothing until whoever is modifying the int is done.
myLockedInt.okToModify = false; // Prevent other threads from getting out of while loop above.
myLockedInt.integer += 1;
myLockedInt.okToModify = true; // Now other threads get out of the while loop if they were waiting and can modify.
The while loop and okToModify = false above is basically what locking a mutex does, and okToModify = true is what unlocking a mutex does.
Now, why do we need mutexes and don't use booleans? Because a thread could be running at the same time as those three lines above. The code for locking a mutex actually guarantees that the waiting for okToModify to become true and setting okToModify = false happen in one go, and therefore no other thread can get "in between the lines", for example by using a special machine-code instruction called "compare-and-exchange".
So do not use booleans instead of mutexes, but you can think of a mutex as a special, thread-safe boolean.
m.lock() doesn't really lock anything. What it does is, it waits to take ownership of the mutex. A mutex always either is owned by exactly one thread or else it is available. m.lock() waits until the mutex becomes available, and then it takes ownership of it in the name of the calling thread.
m.unlock releases the mutex (i.e., it relinquishes ownership), causing the mutex to once again become available.
Mutexes also perform another very important function. In modern C++, when some thread T performs a sequence of assignments of various values to various memory locations, the system makes no guarantees about when other threads U, V, and W will see those assignments, whether the other threads will see the assignments happen in the same order in which thread T performed them, or even, whether the other threads will ever see the assignments.
There are some quite complicated rules governing things that a programmer can do to ensure that different threads see a consistent view of shared memory objects (Google "C++ memory model"), but here's one simple rule:
Whatever thread T did before it releases some mutex M is guaranteed to be visible to any other thread U after thread U subsequently locks the same mutex M.

Why do both the notify and wait function of a std::condition_variable need a locked mutex

On my neverending quest to understand std::contion_variables I've run into the following. On this page it says the following:
void print_id (int id) {
std::unique_lock<std::mutex> lck(mtx);
while (!ready) cv.wait(lck);
// ...
std::cout << "thread " << id << '\n';
}
And after that it says this:
void go() {
std::unique_lock<std::mutex> lck(mtx);
ready = true;
cv.notify_all();
}
Now as I understand it, both of these functions will halt on the std::unqique_lock line. Until a unique lock is acquired. That is, no other thread has a lock.
So say the print_id function is executed first. The unique lock will be aquired and the function will halt on the wait line.
If the go function is then executed (on a separate thread), the code there will halt on the unique lock line. Since the mutex is locked by the print_id function already.
Obviously this wouldn't work if the code was like that. But I really don't see what I'm not getting here. So please enlighten me.
What you're missing is that wait unlocks the mutex and then waits for the signal on cv.
It locks the mutex again before returning.
You could have found this out by clicking on wait on the page where you found the example:
At the moment of blocking the thread, the function automatically calls lck.unlock(), allowing other locked threads to continue.
Once notified (explicitly, by some other thread), the function unblocks and calls lck.lock(), leaving lck in the same state as when the function was called.
There's one point you've missed—calling wait() unlocks the mutex. The thread atomically (releases the mutex + goes to sleep). Then, when woken by the signal, it tries to re-acquire the mutex (possibly blocking); once it acquires it, it can proceed.
Notice that it's not necessary to have the mutex locked for calling notify_*, only for wait*
To answer the question as posed, which seems necessary regarding claims that you should not acquire a lock on notification for performance reasons (isn't correctness more important than performance?): The necessity to lock on "wait" and the recommendation to always lock around "notify" is to protect the user from himself and his program from data and logical races. Without the lock in "go", the program you posted would immediately have a data race on "ready". However, even if ready were itself synchronized (e.g. atomic) you would have a logical race with a missed notification, because without the lock in "go" it is possible for the notify to occur just after the check for "ready" and just before the actual wait, and the waiting thread may then remain blocked indefinitely. The synchronization on the atomic variable itself is not enough to prevent this. This is why helgrind will warn when a notification is done without holding the lock. There are some fringe cases where the mutex lock is really not required around the notify. In all of these cases, there needs to be a bidirectional synchronization beforehand so that the producing thread can know for sure that the other thread is already waiting. IMO these cases are for experts only. Actually, I have seen an expert, giving a talk about multi-threading, getting this wrong — he thought an atomic counter would suffice. That said, the lock around the wait is always necessary for correctness (or, at least, an operation that is atomic with the wait), and this is why the standard library enforces it and atomically unlocks the mutex on entering the wait.
POSIX condition variables are, unlike Windows events, not "idiot-proof" because they are stateless (apart from being aware of waiting threads). The recommendation to use a lock on the notify is there to protect you from the worst and most common screwups. You can build a Windows-like stateful event using a mutex + condition var + bool variable if you like, of course.

Does pthread_mutex_t in linux are reentrancy (if a thread tries to acquire a lock that it already holds, the request succeeds)

I am coming from Java , so i am familiar with synchronize and not mutex.
I wonder if pthread_mutex_t is also reentrancy. if not is there another mechanism for this?
Thank you
This depends on the mutex type, the default does no checking and an attempt to lock it more than once in the same thread results in undefined behavior. Read about it here.
You can create a mutex of type PTHREAD_MUTEX_RECURSIVE to be able to recursively lock it, which is done by providing a pthread_mutexattr_t with the desired mutex type to pthread_mutex_init
According to the manual, you can declare a mutex object as PTHREAD_MUTEX_RECURSIVE:
If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex shall
maintain the concept of a lock count. When a thread successfully
acquires a mutex for the first time, the lock count shall be set to
one. Every time a thread relocks this mutex, the lock count shall be
incremented by one. Each time the thread unlocks the mutex, the lock
count shall be decremented by one. When the lock count reaches zero,
the mutex shall become available for other threads to acquire. If a
thread attempts to unlock a mutex that it has not locked or a mutex
which is unlocked, an error shall be returned.
See also pthread_mutex_attr_settype.
By default, pthread_mutex is not recursive, but there is a way for initialize it as recursive:
pthread_mutexattr_t Attr;
pthread_mutexattr_init(&Attr);
pthread_mutexattr_settype(&Attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(&_mutex, &Attr);
I think you don't mean reentrancy but recursiveness. Having reentrant locking with Java is impossible.
Recursive locking is simply when you already own a lock in a thread you might lock it again without any issues. Actually this is very efficient since this doesn't need any atomic operations but just a compare of the owning thread id of the mutex with the current thread id and if both are equal, an owning-counter is - non-atomically (!) - incremented.
Reentrancy is rather is asynchronous execution within the same thread. For example if you have a signal handler in C / C++ and this signal handler locks a mutex already owned by this thread in the synchronous part of the code again. I'm not aware if there are mutex implementations that can handle this situations but there's not much need for that so I guess not. I think such mutexes couldn't be as efficient as normal recursive mutexes.

Non-recursive mutex ownership

I read this answer on SO:
Because the recursive mutex has a sense of ownership, the thread that grabs the mutex must be the same thread that releases the mutex. In the case of non-recursive mutexes, there is no sense of ownership and any thread can usually release the mutex no matter which thread originally took the mutex.
I'm confused by the last statement. Can one thread lock a mutex and another different thread unlock that mutex? I thought that same thread should be the only one able to unlock the mutex? Or is there any specific mutex that allows this? I hope someone can clarify.
Non-recursive mutex
Most mutexes are (or at least should be) non-recursive. A mutex is an object which can be acquired or released atomically, which allows data which is shared between multiple threads to be protected against race conditions, data corruption, and other nasty things.
A single mutex should only ever be acquired once by a single thread within the same call chain. Attempting to acquire (or hold) the same mutex twice, within the same thread context, should be considered an invalid scenario, and should be handled appropriately (usually via an ASSERT as you are breaking a fundamental contract of your code).
Recursive mutex
This should be considered a code smell, or a hack. The only way in which a recursive mutex differs from a standard mutex is that a recursive mutex can be acquired multiple times by the same thread.
The root cause of the need for a recursive mutex is lack of ownership and no clear purpose or delineation between classes. For example, your code may call into another class, which then calls back into your class. The starting class could then try to acquire the same mutex again, and because you want to avoid a crash, you implement this as a recursive mutex.
This kind of topsy-turvy class hierarchy can lead to all sorts of headaches, and a recursive mutex provides only a band-aid solution for a more fundamental architectural problem.
Regardless of the mutex type, it should always be the same thread which acquires and releases the same mutex. The general pattern that you use in code is something like this:
Thread 1
Acquire mutex A
// Modify or read shared data
Release mutex A
Thread 2
Attempt to acquire mutex A
Block as thread 1 has mutex A
When thread 1 has released mutex A, acquire it
// Modify or read shared data
Release mutex A
It gets more complicated when you have multiple mutexes that can be acquired simultaneously (say, mutex A and B). There is the risk that you'll run into a deadlock situation like this:
Thread 1
Acquire mutex A
// Access some data...
*** Context switch to thread 2 ***
Thread 2
Acquire mutex B
// Access some data
*** Context switch to thread 1 ***
Attempt to acquire mutex B
Wait for thread 2 to release mutex B
*** Context switch to thread 2 ***
Attempt to acquire mutex A
Wait for thread 1 to release mutex A
*** DEADLOCK ***
We now have a situation where each thread is waiting for the other thread to release the other lock -- this is known as the ABBA deadlock pattern.
To prevent this situation, it is important that each thread always acquires the mutexes in the same order (e.g. always A, then B).
Recursive mutexes are thread-specific by design (the same thread locking again = recursion, another thread locking meanwhile = block). Regular mutexes do not have this design and thus they can in fact be locked and unlocked in different threads.
I think this covers all your questions. Straight from linux man pages for pthreads:
If the mutex type is PTHREAD_MUTEX_NORMAL, deadlock detection shall not be provided. Attempting to relock the mutex causes deadlock. If a thread
attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results.
If the mutex type is PTHREAD_MUTEX_ERRORCHECK, then error checking shall be provided. If a thread attempts to relock a mutex that it has already
locked, an error shall be returned. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be
returned.
If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex shall maintain the concept of a lock count. When a thread successfully acquires a
mutex for the first time, the lock count shall be set to one. Every time a thread relocks this mutex, the lock count shall be incremented by one.
Each time the thread unlocks the mutex, the lock count shall be decremented by one. When the lock count reaches zero, the mutex shall become
available for other threads to acquire. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall
be returned.
If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior. Attempting to unlock the mutex
if it was not locked by the calling thread results in undefined behavior. Attempting to unlock the mutex if it is not locked results in undefined
behavior.