I have data coupled with a Lock = boost::shared_mutex.
I am locking data access with
reader locks ReadLock = boost::shared_lock<Lock>
and writer locks WriteLock = boost::unique_lock<Lock>.
Obviously, lots of readers may be reading the data at a time, and only one thread may be writing to it. But here's the catch:
A single thread may hold multiple ReadLocks on the same mutex, because it calls functions that themselves lock the data (with a ReadLock). But, as I have found, this causes a deadlock:
Thread 1 read-locks data (Lock1)
Thread 2 waits with a write-lock (LockW)
Thread 1 spawns another read-lock (Lock2) while Lock1 is still alive
Now I get a deadlock: Lock2 is waiting for LockW to be released, LockW is waiting for Lock1, and Lock1 is stuck because its thread is blocked on Lock2.
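For illustration, a minimal sketch of the nesting that produces this (the type aliases match the description above, but dataLock and the helper functions are assumed names; whether a new reader blocks behind a queued writer depends on the shared_mutex implementation):

#include <boost/thread/shared_mutex.hpp>
#include <boost/thread/locks.hpp>

typedef boost::shared_mutex Lock;
typedef boost::shared_lock<Lock> ReadLock;
typedef boost::unique_lock<Lock> WriteLock;

Lock dataLock;

int readInner() {
    ReadLock lock2(dataLock);  // Lock2: can block behind a queued WriteLock
    return 0;                  // placeholder for actually reading the data
}

int readOuter() {
    ReadLock lock1(dataLock);  // Lock1
    return readInner();        // deadlocks if a WriteLock arrived between Lock1 and Lock2
}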
I don't know if it's possible to change the design so that each thread takes only a single ReadLock. I believe that a scheme which starves writers would solve my issue. Is there a common way to handle my case?
Related
Does std::condition_variable::notify_one() or std::condition_variable::notify_all() guarantee that non-atomic memory writes in the current thread prior to the call will be visible in notified threads?
Other threads do:
{
    std::unique_lock lock(mutex);
    cv.wait(lock, []() { return values[threadIndex] != 0; });
    // May a thread here see a zero value and therefore start to wait again?
}
Main thread does:
fillData(values); // All values are zero and all threads wait() before calling this.
cv.notify_all(); // Do need some memory fence or lock before this
// to ensure that new non-zero values will be visible
// in other threads immediately after waking up?
Doesn't notify_all() itself store some atomic value and thereby enforce memory ordering? I could not find clarification on this.
UPD: According to Superlokkus' answer and an answer here, we have to acquire a lock to ensure the visibility of memory writes in other threads (memory propagation); otherwise the threads in my case may read zero values.
I also missed this quote here about condition_variable, which specifically answers my question: even an atomic variable has to be modified under the lock when the modification must become visible immediately to the waiting thread.
Even if the shared variable is atomic, it must be modified under the
mutex in order to correctly publish the modification to the waiting
thread.
I guess you are mixing up the memory ordering of so-called atomics with the mechanisms of classic lock-based synchronization.
When you have a datum that is shared between threads, let's say an int for example, one thread cannot simply read it while another thread might be writing to it at the same time. Otherwise we would have a data race.
To get around this, we have long used classic lock-based synchronization:
The threads share at least a mutex and the int. To read or to write, a thread has to acquire the lock first, which means it may have to wait on the mutex. Mutexes are built so that this can safely happen concurrently. If a thread wins and gets the mutex, it can change or read the int and should then unlock it, so others can read/write too. Using a condition variable, as you did, just makes the pattern "readers wait for a writer to change a value" more efficient: readers get woken up by the CV instead of periodically locking, reading, and unlocking, which would be called busy waiting.
So because you hold the lock after waiting on the mutex, or, in your case, after correctly waiting on the condition variable (the mutex is still needed), you can change the int, and readers will read the new value after the writer has written it, never the old one. UPDATE: One thing I have to add, which might also be the cause of confusion: condition variables are subject to so-called spurious wakeups. That means that even though your writer did not notify any thread, a reader thread might still wake up, with the mutex locked. So you have to check whether your writer actually woke you up, which is usually done by the writer changing another datum just to signal this, or, if suitable, by using the same datum you already wanted to share. The lambda overload of std::condition_variable::wait was added just to make the "check and go back to sleep" code look a bit prettier. Based on your question I don't know whether you want to use your values for this job.
However, the snippet for the "main" thread is incorrect or incomplete:
You are not synchronizing on the mutex in order to change values.
You have to hold the lock for that, but notifying can be done without the lock.
std::unique_lock lock(mutex);
fillData(values);
lock.unlock();
cv.notify_all();
But these mutex-based patterns have some drawbacks and are slow: only one thread at a time can do something. This is where so-called atomics, like std::atomic<int>, come into play. They can be written and read by multiple threads concurrently, without a mutex. Memory ordering is only something to consider there, an optimization for cases where you use several of them in a meaningful way or where you don't need the "after the write, I never see the old value" guarantee. With its default memory ordering, memory_order_seq_cst, you will also be fine.
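As a toy illustration of this point (my own example, not from the question), two threads can share an int through std::atomic without any mutex, using the default memory_order_seq_cst:

#include <atomic>
#include <iostream>
#include <thread>

int main() {
    std::atomic<int> value(0);              // loads/stores default to memory_order_seq_cst
    std::thread writer([&value] { value.store(42); });
    std::thread reader([&value] {
        while (value.load() == 0) {}        // busy-wait, acceptable only in a toy example
        std::cout << value.load() << '\n';  // prints 42, never a stale 0
    });
    writer.join();
    reader.join();
}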
Looking at several videos and the documentation example, we unlock the mutex before calling notify_all(). Would it be better to instead unlock it after the notify_all()?
The common way:
Inside the Notifier thread:
//prepare data for several worker-threads;
//and now, awaken the threads:
std::unique_lock<std::mutex> lock2(sharedMutex);
_threadsCanAwaken = true;
lock2.unlock();
_conditionVar.notify_all(); //awaken all the worker threads;
//wait until all threads completed;
//cleanup:
_threadsCanAwaken = false;
//prepare new batches once again, etc, etc
Inside one of the worker threads:
while (true) {
    // wait for the next batch:
    std::unique_lock<std::mutex> lock1(sharedMutex);
    _conditionVar.wait(lock1, []() { return _threadsCanAwaken; });
    lock1.unlock(); // let sibling worker-threads work on their part as well
    // perform the final task
    // signal the notifier that one more thread has completed;
    // loop back and wait until the next task
}
Notice how lock2 is unlocked before we notify the condition variable. Should we instead unlock it after the notify_all()?
Edit
From my comment below: my concern is, what if the worker wakes up spuriously, sees that the mutex is unlocked, super-quickly completes the task and loops back to the start of the while? Now the slow-poke notifier finally calls notify_all(), causing the worker to loop an additional time (excessive and undesired).
There are no advantages to unlocking the mutex before signaling the condition variable unless your implementation is unusual. There are two disadvantages to unlocking before signaling:
If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal form an atomic operation and that race conditions and deadlocks are impossible.
There is a performance penalty for unlocking before signaling that is avoided by unlocking after signaling. If you signal before you unlock, a good implementation will know that your signal cannot possibly render any thread ready-to-run, because the mutex is held by the calling thread and any thread affected by the condition variable necessarily cannot make forward progress without the mutex. This permits a significant optimization (often called "wait morphing") that is not possible if you unlock first.
So signal while holding the lock unless you have some unusual reason to do otherwise.
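As a sketch of the order recommended here (reusing the question's names sharedMutex, _threadsCanAwaken and _conditionVar purely for illustration):

{
    std::lock_guard<std::mutex> guard(sharedMutex);
    _threadsCanAwaken = true;    // change the shared state...
    _conditionVar.notify_all();  // ...and signal while still holding the lock
}                                // the mutex is released here, after the notify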
should we instead unlock it after the notify_all() ?
It is correct to do it either way, but you may get different behavior in different situations. It is quite difficult to predict how it will affect the performance of your program - I've seen both positive and negative effects for different applications. So you had better profile your program and base the decision for your particular situation on the profiling results.
As mentioned here: cppreference.com
The notifying thread does not need to hold the lock on the same mutex
as the one held by the waiting thread(s); in fact doing so is a
pessimization, since the notified thread would immediately block
again, waiting for the notifying thread to release the lock.
That said, the documentation for wait says:
At the moment of blocking the thread, the function automatically calls
lck.unlock(), allowing other locked threads to continue.
Once notified (explicitly, by some other thread), the function
unblocks and calls lck.lock(), leaving lck in the same state as when
the function was called. Then the function returns (notice that this
last mutex locking may block again the thread before returning).
So when notified, wait will re-attempt to acquire the lock, and in that process it will be blocked again until the original notifying thread releases the lock.
So I'd suggest releasing the lock before calling notify, as done in the example on cppreference.com, and most importantly:
Don't be Pessimistic.
David's answer seems wrong to me.
First, assuming the simple case of two threads, one waiting for the other on a condition variable, unlocking first by the notifier will not wake the waiting thread, as the signal has not arrived yet. The notify call will then immediately wake the waiting thread. You do not need any special optimizations.
On the other hand, signalling first has the potential of waking up a thread and making it sleep immediately again, as it cannot hold the lock—unless wait morphing is implemented.
Wait morphing does not exist in Linux at least, according to the answer under this StackOverflow question: Which OS / platforms implement wait morphing optimization?
The cppreference example also unlocks first before signalling: https://en.cppreference.com/w/cpp/thread/condition_variable/notify_all
It explicitly says:
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s). Doing so may be a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock, though some implementations recognize the pattern and do not attempt to wake up the thread that is notified under lock.
should we instead unlock it after the notify_all() ?
After reading several related posts, I've formed the opinion that it's purely a performance issue. If the OS supports "wait morphing", unlock after; otherwise, unlock before.
I'm adding an answer here to augment @DavidSchwartz's. In particular, I'd like to clarify his point 1.
If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal form an atomic operation and that race conditions and deadlocks are impossible.
The 1st thing I said is that, because it's a CV and not a mutex, a better term for the so-called "deadlock" might be "sleep paralysis": a mistake some programs make is that
a thread that's supposed to wake up
went back to sleep because it did not recheck the condition it had been waiting for before waiting again (see the sketch below).
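A minimal sketch of the recheck-in-a-loop pattern that avoids this (mtx, cv and ready are placeholder names):

std::unique_lock<std::mutex> lk(mtx);
while (!ready)        // recheck after every wakeup, spurious or otherwise
    cv.wait(lk);
// equivalently: cv.wait(lk, [&] { return ready; });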
The 2nd thing is that, when waking some other thread(s),
the default choice should be broadcast/notify_all (broadcast is the POSIX term, which is equivalent to its C++ counterpart).
signal/notify_one is an optimized special case, used when only one other thread is waiting.
Finally 3rd, David is adamant that
it's better to unlock after notify,
because it can avoid the "deadlock" which I've been referring to as "sleep paralysis".
If it's unlock then notify, there is a window where another thread (let's call it the "wrong" thread) may (i) acquire the mutex, (ii) go into wait, and (iii) wake up. If steps (i), (ii) and (iii) happen quickly enough, they consume the signal, leaving the intended ("correct") thread asleep.
I discussed this extensively with David; he clarified that the "sleep paralysis" only occurs when all three points are violated (1. the condvar is associated with several separate conditions and/or the condition isn't rechecked before waiting again; 2. only one thread is signaled/notified while more than one other thread is using the condvar; 3. unlocking before notifying creates a window for a race condition).
Finally, my recommendation is that points 1 and 2 are essential for the correctness of the program, and fixing issues associated with 1 and 2 should be prioritized over 3, which should only be an augmentative "last resort".
For reference, the man pages for signal/broadcast and wait contain some information from version 3 of the Single Unix Specification that explains points 1 and 2, and partly 3. Although specified for POSIX/Unix/Linux in C, its concepts are applicable to C++.
As of this writing (2023-01-31), the 2018 edition of version 4 of the Single Unix Specification has been released, and the drafting of version 5 is underway.
I'm creating a multithreaded C++ program using pthreads (the C++98 standard).
I have a std::map that multiple threads will access. The access will be adding and removing elements, using find, and also accessing elements using the [] operator.
I understand that reading using the [] operator, or even modifying the elements with it is thread safe, but the rest of the operations are not.
First question: do I understand this correctly?
Some threads will just access the elements via [], while others will do some of the other operations. Obviously I need some form of thread synchronisation.
The way I see this should work is:
- While no "write" operation is being done to the map, threads should all be able to "read" from it concurrently.
- When a thread wants to "write" to the map, it should set a lock so no thread starts any "read" or "write" operation, and then it should wait until all "read" operations have completed, at which point it would perform the operation and release the locks.
- After the locks have been released, all threads should be able to read freely.
The main question is: what thread synchronisation methods can I use to achieve this behaviour?
I have read about mutexes, condition variables and semaphores, and as far as I can see they won't do exactly what I need. I'm familiar with mutexes but not with condition variables or semaphores.
The main problem I see is that I need a way of locking threads until something happens (the write operation ends) without those threads then locking anything in turn.
Also I need something like an inverted semaphore, that blocks while the counter is more than 1 and then wakes when it is 0 (i.e. no read operation is being done).
Thanks in advance.
P.S. It's my first post. Kindly indicate if I'm doing something wrong!
I understand that reading using the [] operator, or even modifying the elements with it is thread safe, but the rest of the operations are not.
Do I understand this correctly?
Well, what you've said isn't quite true. Concurrent readers can use [] to access existing elements, or use other const functions (like find, size()...) safely, as long as there are no simultaneous non-const operations like erase or insert mutating the map<>. Concurrent threads can modify different elements, but if one thread modifies an element you must have some synchronisation before another thread attempts to access or further modify that specific element.
When a thread wants to "write" to the map, it should set a lock so no thread starts any "read" or "write" operation, and then it should wait until all "read" operations have completed, at which point it would perform the operation and release the locks. - After the locks have been released, all threads should be able to read freely.
That's not quite the way it works... for writers to be able to 'wait until all "read" operations have completed', the reader(s) need to acquire a lock. Writers then wait for that same lock to be released, and acquire it themselves to restrict other readers or writers until they've finished their update and release it.
what thread synchronisation methods can I use to achieve this behaviour?
A mutex is indeed suitable, though you will often get higher performance from a reader-writer lock (which allows concurrent readers; some also prioritise waiting writers over further readers). Related POSIX threads functions include pthread_rwlock_rdlock, pthread_rwlock_wrlock, pthread_rwlock_unlock etc.
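A rough sketch of the reader-writer-lock approach around a shared map (shared_map, map_lock and the helper functions are illustrative names; the code sticks to C++98 as in the question):

#include <pthread.h>
#include <map>
#include <string>

std::map<int, std::string> shared_map;
pthread_rwlock_t map_lock = PTHREAD_RWLOCK_INITIALIZER;

std::string read_value(int key) {
    pthread_rwlock_rdlock(&map_lock);  // many readers may hold this concurrently
    std::map<int, std::string>::const_iterator it = shared_map.find(key);
    std::string result = (it != shared_map.end()) ? it->second : std::string();
    pthread_rwlock_unlock(&map_lock);
    return result;
}

void write_value(int key, const std::string& value) {
    pthread_rwlock_wrlock(&map_lock);  // exclusive: waits until all readers have left
    shared_map[key] = value;
    pthread_rwlock_unlock(&map_lock);
}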
To contrast the two approaches, with readers and writers using a mutex you get serialisation something like this:
THREAD ACTION
reader1 pthread_mutex_lock(the_mutex) returns having acquired lock, and
thread starts reading data
reader2 pthread_mutex_lock(the_mutex) "hangs", as blocked by reader1
writer1 pthread_mutex_lock(the_mutex) hangs, as blocked by reader1
reader1 pthread_mutex_unlock(the_mutex) -> releases lock
NOTE: some systems guarantee reader2 will unblock before writer1, some don't
reader2 blocked pthread_mutex_lock(the_mutex) returns having acquired lock,
and thread starts reading data
reader1 pthread_mutex_lock(the_mutex) hangs, as blocked by reader2
reader2 pthread_mutex_unlock(the_mutex) -> releases lock
writer1 blocked pthread_mutex_lock(the_mutex) returns having acquired lock,
and thread starts writing and/or reading data
writer1 pthread_mutex_unlock(the_mutex) -> releases lock
reader1 blocked pthread_mutex_lock(the_mutex) returns having acquired lock,
and thread starts reading data
...etc...
With a read-write lock, it might be more like this (notice the first two readers run concurrently):
THREAD ACTION
reader1 pthread_rwlock_rdlock(the_rwlock) returns having acquired lock, and
thread starts reading data
reader2 pthread_rwlock_rdlock(the_rwlock) returns having acquired lock, and
thread starts reading data
writer1 pthread_rwlock_wrlock(the_rwlock) hangs, as blocked by reader1/2
reader1 pthread_rwlock_unlock(the_rwlock) -> releases lock
reader1 pthread_rwlock_rdlock(the_rwlock) hangs, as pending writer
reader2 pthread_rwlock_unlock(the_rwlock) -> releases lock
writer1 blocked pthread_rwlock_wrlock(the_rwlock) returns having acquired lock,
and thread starts writing and/or reading data
writer1 pthread_rwlock_unlock(the_rwlock) -> releases lock
reader1 blocked pthread_rwlock_rdlock(the_rwlock) returns having acquired lock,
and thread starts reading data
...etc...
Is it possible for a thread that already has a lock on a mutex to check whether another thread is already waiting, without releasing the mutex? For example, say a thread has 3 tasks to run on a block of data, but another thread may have a short task to run on the data as well. Ideally, I'd have the first thread check whether another thread is waiting between each of the three tasks and allow the other thread to execute its task before resuming the other two tasks. Does Boost have a type of mutex that supports this functionality (assuming the C++11 mutex or lock types don't support it), or can this be done with condition variables?
You cannot check whether other threads are waiting on a mutex.
If you want to give other threads their chance to run, just release the mutex. No need to know if someone is waiting. Then re-acquire as necessary.
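For example, a minimal sketch of releasing between tasks so a waiting thread can slip in (task1/task2/task3, data and dataMutex are placeholder names):

{
    std::lock_guard<std::mutex> lk(dataMutex);
    task1(data);
}   // lock released: a waiting thread can grab the mutex and run its short task now
{
    std::lock_guard<std::mutex> lk(dataMutex);
    task2(data);
}
{
    std::lock_guard<std::mutex> lk(dataMutex);
    task3(data);
}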
Condition variables are events. Use them if you want to wait until something happens. To check whether something has happened you need a regular (mutex-protected or atomic) variable.
You cannot check whether other threads are waiting on a mutex.
If you do need such functionality, you have to implement it yourself.
A mutex with a use count will suffice for your need:
#include <atomic>
#include <cstddef>
#include <mutex>

class my_mutex {
public:
    my_mutex() : count(0) {}
    void lock()   { ++count; mtx.lock(); }    // count covers the holder plus any waiters
    void unlock() { --count; mtx.unlock(); }
    std::size_t get_waiting_threads() const { return count > 1 ? count - 1 : 0; }
private:
    std::atomic_ulong count;
    std::mutex mtx;
};
If you need to finish tasks 1 and 2 before task 3 is executed, you should use a condition variable (std::condition_variable) instead of a plain mutex, roughly as in the sketch below.
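A rough sketch of that idea (m, cv, tasks_done and the function names are assumptions for illustration):

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
int tasks_done = 0;

void finish_task_1_or_2() {
    // ... do task 1 or task 2 ...
    {
        std::lock_guard<std::mutex> lk(m);
        ++tasks_done;                 // record completion under the lock
    }
    cv.notify_one();                  // wake the thread waiting to run task 3
}

void run_task_3() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [] { return tasks_done == 2; });  // proceed only after tasks 1 and 2
    // ... do task 3 ...
}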
If two threads are waiting on a lock, the thread that started waiting first is not guaranteed to be the one that gets it first when it becomes available. So if you're a tight-looping thread that does something like
while (true) {
    mutex.lock();
    std::cout << "got the lock; releasing it\n";
    mutex.unlock();
}
You might think you're being polite to all the other threads waiting for the lock, but you're not: the system might just give you the lock over and over again without letting any of the other threads jump in, no matter how long they've been waiting.
Condition variables are a reasonable way to solve this problem.
I have a C++ list which is being processed by multiple threads.
Each thread takes a pthread_mutex_lock on the list so that other threads cannot "interfere" with it. As part of the processing, each thread also calls push_back to add data to the list.
My question is: is push_back on a mutexed list a bad idea? Is the mutex still valid while the thread is pushing more data onto the list? Most of the documentation/examples I've seen for pthread_mutex_lock only do "reading", so I am curious what happens when the same thread that acquired the lock writes to the shared resource.
As long as only that particular thread is holding the lock, and no other thread can take the lock, writing is fine. Think about why a problem could happen: it would have been a problem if one thread was writing while another was reading simultaneously. If a ball is yours, you can do anything with it, right? Things change when it is shared.
The mutex needs to be unique for the entire group of threads (i.e. all threads must use the same mutex). If you create a mutex for each thread, then you are not thread-safe at all, because each thread will wait on its own mutex and not be synchronized with the rest.
And yes, an acquired mutex can safely be used for both reading and writing, roughly as in the sketch below.
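A minimal sketch of that single-shared-mutex arrangement (shared_list, list_mutex and worker are illustrative names, not from the question):

#include <pthread.h>
#include <list>

std::list<int> shared_list;
pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

void* worker(void*) {
    pthread_mutex_lock(&list_mutex);                    // every thread locks the same mutex object
    if (shared_list.empty())
        shared_list.push_back(0);                       // writing while holding the lock is safe
    else
        shared_list.push_back(shared_list.back() + 1);  // reading and writing under the same lock
    pthread_mutex_unlock(&list_mutex);
    return 0;
}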