Conditional variables in pthreads and releasing multiple locks - c++

Sorry if this is a trivial question. But I couldn't find an answer anywhere. I'm writing a program that uses pthreads. One thread acquires a lock (mutex) and then attempts to push data into a synchronized buffer. The buffer has its own mutex which gets acquired once the push() method is invoked.
If the buffer is full and the thread needs to wait on a conditional variable, will the wait call release all acquired locks? Or will it just release the one associated with the conditional variable (which happens to be the last acquired lock)? If it is the latter how can I avoid deadlocks if another thread needs to acquire the first lock?
Edit:
The problem I have is the following. I have two threads, say A and B. Thread A has a for loop which inserts data into a number of buffers. Each iteration inserts an element into one of the buffers. Since this is a thread, it continuously executes this loop within another outer loop for the thread. When B is activated, it manipulates those buffers. But B must not operate on the buffers if A is interrupted by the scheduler in the middle of executing the for loop. Therefore I use a mutex to lock the critical section in A, which is the for loop.
Edit-2:
I have been giving it some thought. And the answer to my problem definitely won't be that the buffers' conditional variable releases the first lock as well. That would mean that the first lock was not even necessary in the first place. If consumer threads (different from Thread B) responsible for removing elements from the buffers are doing their job properly, Thread A would resume at some point and the for loop will complete. Therefore, my problem must lie there. I will take a closer look and update.

pthread_cond_wait() will release only the mutex that you pass to it, which in your case will be the buffer's mutex acquired earlier within the push() call.
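A small sketch can make this visible. It assumes two hypothetical mutexes named outer and inner and a helper run_wait_demo(), none of which come from the question. Thread A takes both mutexes and waits on the condition variable with inner. The main thread then finds that it can lock inner, while a trylock on outer still reports EBUSY.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sched.h>

// Hypothetical demo state; the names are illustrative only.
struct WaitDemo {
    pthread_mutex_t outer;
    pthread_mutex_t inner;
    pthread_cond_t cond;
    bool waiting = false;  // guarded by inner
    bool done = false;     // guarded by inner
};

// Thread A: holds both mutexes, then waits on the condition variable.
void* thread_a(void* arg) {
    WaitDemo* d = static_cast<WaitDemo*>(arg);
    pthread_mutex_lock(&d->outer);
    pthread_mutex_lock(&d->inner);
    d->waiting = true;
    while (!d->done) {
        pthread_cond_wait(&d->cond, &d->inner);  // releases inner only
    }
    pthread_mutex_unlock(&d->inner);
    pthread_mutex_unlock(&d->outer);
    return nullptr;
}

// Returns true if outer was still held while A was blocked in the wait.
bool run_wait_demo() {
    WaitDemo d;
    pthread_mutex_init(&d.outer, nullptr);
    pthread_mutex_init(&d.inner, nullptr);
    pthread_cond_init(&d.cond, nullptr);

    pthread_t a;
    int rc = pthread_create(&a, nullptr, thread_a, &d);
    assert(rc == 0);
    (void)rc;

    // Once we hold inner and see waiting == true, A must be inside
    // pthread_cond_wait, because that is the only place A gives up inner.
    for (;;) {
        pthread_mutex_lock(&d.inner);
        if (d.waiting) break;
        pthread_mutex_unlock(&d.inner);
        sched_yield();
    }

    const int try_rc = pthread_mutex_trylock(&d.outer);
    const bool outer_still_held = (try_rc == EBUSY);
    if (try_rc == 0) pthread_mutex_unlock(&d.outer);  // not expected

    d.done = true;
    pthread_cond_signal(&d.cond);
    pthread_mutex_unlock(&d.inner);
    pthread_join(a, nullptr);

    pthread_cond_destroy(&d.cond);
    pthread_mutex_destroy(&d.inner);
    pthread_mutex_destroy(&d.outer);
    return outer_still_held;
}
```

Any thread that needs outer while A is blocked like this has to wait until A is woken and finishes, which is where the deadlock risk comes from.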
It sounds like you will have the possibility of deadlock in your situation, but that's a consequence of your high-level design. If Thread B cannot be allowed to execute while Thread A is running its for() loop, but within that for() loop Thread A might have to wait for Thread B to consume some data to proceed, then deadlock is inevitable.
You will need to update your design to account for this. One possibility is to add a reserve() function to your synchronised buffers that reserves space for a subsequent push(), but doesn't add any data. Your Thread A can then go through and reserve the space it needs without holding the outer mutex (since it's not adding any data visible to Thread B yet) - if it has to wait for Thread B to consume some data, it'll do so here. Once it has reserved all the space it requires, it can then lock the outer mutex and push the data - this is guaranteed not to have to wait for Thread B, because the space has already been reserved.
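The reserve-then-push idea could look roughly like the following. This is only a sketch under assumptions: ReservableBuffer, push_reserved(), try_pop() and insert_batch() are hypothetical names, the element type is int, and it uses std::mutex and std::condition_variable rather than raw pthreads for brevity.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical bounded buffer with a two-phase insert:
// reserve() may block, push_reserved() never does.
class ReservableBuffer {
public:
    explicit ReservableBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Blocks until a free slot exists, then claims it.
    // Call this WITHOUT holding the outer mutex.
    void reserve() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] {
            return items_.size() + reserved_ < capacity_;
        });
        ++reserved_;
    }

    // Fills a previously reserved slot; never waits for space.
    void push_reserved(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(reserved_ > 0);  // caller must have called reserve() first
        --reserved_;
        items_.push_back(value);
    }

    // Consumer side: removes one element if available.
    bool try_pop(int& out) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (items_.empty()) return false;
            out = items_.front();
            items_.pop_front();
        }
        not_full_.notify_one();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable not_full_;
    std::deque<int> items_;
    std::size_t reserved_ = 0;
};

// One iteration of Thread A: all blocking happens before the outer
// mutex is taken, so Thread B is never locked out by a waiting A.
void insert_batch(std::mutex& outer, ReservableBuffer& first,
                  ReservableBuffer& second, int v1, int v2) {
    first.reserve();
    second.reserve();
    std::lock_guard<std::mutex> guard(outer);
    first.push_reserved(v1);
    second.push_reserved(v2);
}
```

Reserved slots count against the capacity, so a push that follows a successful reserve() always finds room.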

Related

std::condition_variable memory writes visibility

Does std::condition_variable::notify_one() or std::condition_variable::notify_all() guarantee that non-atomic memory writes in the current thread prior to the call will be visible in notified threads?
Other threads do:
{
    std::unique_lock lock(mutex);
    cv.wait(lock, []() { return values[threadIndex] != 0; });
    // May a thread here see a zero value and therefore start to wait again?
}
Main thread does:
fillData(values); // All values are zero and all threads wait() before calling this.
cv.notify_all(); // Do I need some memory fence or lock before this
// to ensure that new non-zero values will be visible
// in other threads immediately after waking up?
Doesn't notify_all() store some atomic value, thereby enforcing memory ordering? I could not clarify this.
UPD: according to Superlokkus' answer and an answer here: we have to acquire a lock to ensure memory writes visibility in other threads (memory propagation), otherwise threads in my case may read zero values.
Also I missed this quote here about condition_variable, which specifically answers my question. Even an atomic variable has to be modified under a lock in a case when the modification must become visible immediately.
Even if the shared variable is atomic, it must be modified under the
mutex in order to correctly publish the modification to the waiting
thread.
I guess you are mixing up memory ordering of so called atomic values and the mechanisms of classic lock based synchronization.
When you have a datum which is shared between threads, let's say an int for example, one thread cannot simply read it while another thread might be writing to it at the same time. Otherwise we would have a data race.
To get around this, for a long time we used classic lock-based synchronization:
The threads share at least a mutex and the int. To read or to write, any thread has to hold the lock first, meaning it waits on the mutex. Mutexes are built so that it is fine for several threads to try this concurrently. If a thread wins the mutex, it can change or read the int and then should unlock it, so that others can read/write too. Using a condition variable as you did just makes the pattern "readers wait for a change of a value by a writer" more efficient: they get woken up by the cv instead of periodically locking, reading, and unlocking, which would be called busy waiting.
So because you hold the lock in either case, after waiting on the mutex or, as in your case, after (correctly, since the mutex is still needed) waiting on the condition variable, you can change the int. And readers will read the new value after the writer has written it, never the old one. UPDATE: However, one thing I have to add, which might also be the cause of the confusion: condition variables are subject to so-called spurious wakeups. This means that even though your writer has not notified any thread, a reader thread might still wake up, with the mutex locked. So you have to check whether your writer actually woke you up, which is usually done by having the writer change another datum just to signal this or, if suitable, by using the same datum you already wanted to share. The predicate (lambda) overload of std::condition_variable::wait was made just to make the check-and-go-back-to-sleep code look a bit prettier. Based on your question, I don't know whether you want to use your values for this job.
However, the snippet for the "main" thread is incorrect or incomplete:
You are not synchronizing on the mutex in order to change values.
You have to hold the lock for that, but notifying can be done without the lock.
std::unique_lock lock(mutex);
fillData(values);
lock.unlock();
cv.notify_all();
But these mutex-based patterns have some drawbacks and are slow: only one thread at a time can do something. This is where so-called atomics, like std::atomic<int>, come into play. They can be written and read by multiple threads concurrently without a mutex. Memory ordering is something to consider only there, and it is an optimization for cases where you use several of them in a meaningful way or where you don't need the "after the write, I never see the old value" guarantee. However, with its default memory ordering, memory_order_seq_cst, you would also be fine.
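Putting the waiting side and the corrected main-thread side together, a complete version might look like the sketch below. The function name run_publish_demo and the observed vector are assumptions made for illustration; the question's fillData is replaced by a simple loop.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Starts thread_count waiters, publishes non-zero values under the mutex,
// and returns what each waiter saw after waking up.
std::vector<int> run_publish_demo(int thread_count) {
    assert(thread_count > 0);
    std::mutex mutex;
    std::condition_variable cv;
    std::vector<int> values(thread_count, 0);    // shared, non-atomic
    std::vector<int> observed(thread_count, 0);  // one slot per waiter

    std::vector<std::thread> waiters;
    for (int i = 0; i < thread_count; ++i) {
        waiters.emplace_back([&, i] {
            std::unique_lock<std::mutex> lock(mutex);
            // The predicate runs under the lock, so it handles spurious
            // wakeups and also a notification sent before this thread waits.
            cv.wait(lock, [&] { return values[i] != 0; });
            observed[i] = values[i];
        });
    }

    {
        // The writes happen while holding the same mutex the waiters use.
        std::lock_guard<std::mutex> lock(mutex);
        for (int i = 0; i < thread_count; ++i) values[i] = i + 1;
    }
    cv.notify_all();  // notifying does not require the lock

    for (auto& t : waiters) t.join();
    return observed;
}
```

Because every waiter checks the predicate while holding the mutex, it either sees the new values immediately or is guaranteed to be woken by the later notify_all().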

lock guard - will it queue multiple request

I have four member functions that can be called multiple times asynchronously from another piece of code. Since these functions use the class's member variables, I need to ensure that a second call does not start until the first has finished, and instead waits in a queue.
I have heard of the lock guard feature in C++, which turns a code block (in my case, a whole function) into an automatically locked region:
void DoSomeWork()
{
    std::lock_guard<std::mutex> lg(m); // Lock will be held from here to end of function
    // ... (work elided in the original)
    return;
}
Since my four class methods do independent work, should I have four mutexes, one for the lock guard in each member function? Will the async calls be put in some sort of queue if a lock guard is active?
I mean, if there are, say, 10 calls made to that member method at the same time, then once the 1st call acquires the lock guard, will the remaining 9 calls wait until the lock is free and then execute one by one?
If a mutex is locked, the next request to lock it will block until the previous thread holding the lock has unlocked it.
Note that attempting to lock a std::mutex multiple times from a single thread is undefined behavior. Don't do that.
For more information see e.g. this std::mutex reference.
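As a rough illustration of the blocking behaviour, the sketch below has ten threads call the same member function. The Worker class, its counters and the call_concurrently helper are hypothetical; the counters exist only to show that no two calls are ever inside the function at once. It uses one mutex per object, which is an assumption that fits member functions sharing the same member variables.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class whose member function is serialized by one mutex.
class Worker {
public:
    void DoSomeWork() {
        std::lock_guard<std::mutex> lg(m_);  // held until the function returns
        const int in_flight = ++active_;
        if (in_flight > max_active_) max_active_ = in_flight;
        ++calls_;
        --active_;
    }

    int calls() const {
        std::lock_guard<std::mutex> lg(m_);
        return calls_;
    }

    int max_active() const {
        std::lock_guard<std::mutex> lg(m_);
        return max_active_;
    }

private:
    mutable std::mutex m_;
    int active_ = 0;      // calls currently inside DoSomeWork
    int max_active_ = 0;  // highest value active_ ever reached
    int calls_ = 0;       // completed calls
};

// Calls DoSomeWork() once from each of n threads and waits for all of them.
void call_concurrently(Worker& w, int n) {
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&w] { w.DoSomeWork(); });
    }
    for (auto& t : threads) t.join();
}
```

After call_concurrently(w, 10), all ten calls have completed and max_active() is 1. The order in which the blocked callers get the lock is not specified by the standard.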
Assuming you mean multiple threads issuing locks for the same mutex, based on prior questions, there's no queuing for pthreads or POSIX synchronization types. Say multiple threads each have a loop that starts with a lock and ends with an unlock, looping right back to the lock request. In that case the same thread can keep getting the lock, and none of the other threads will run (there's a very small chance that a time slice could end between the unlock and the lock, switching context to another thread). Using condition variables also has an issue with spurious wakeups.
https://en.wikipedia.org/wiki/Spurious_wakeup
Based on testing, Windows native synchronization types, (CreateMutex, CreateSemaphore, WaitForSingleObject, WaitForMultipleObjects) do queue requests, but I haven't found it documented.
Some server applications on some operating systems will install a device driver that runs at kernel level in order to workaround the limitations of synchronization types on those operating systems.

Boost::Thread / C++11 std::thread, want to wake worker thread on condition

I am using a Boost::thread as a worker-thread. I want to put the worker thread to sleep when there is no work to be done and wake it up as soon as there is work to be done. I have two variables that hold integers. When the integers are equal, there is no work to be done. When the integers are different, there is work to be done. My current code looks like this:
int a;
int b;
void worker_thread()
{
    while(true) {
        if(a != b) {
            //...do something
        }
        //if a == b, then just waste CPU cycles
    }
}
//other code running in the main thread that changes the values of a and b
I tried using a condition variable and having the worker thread go to sleep when a == b. The problem is that there is a race condition. Here is an example situation:
Worker thread evaluates if(a == b), finds that it is true.
Main thread changes a and/or b such that they are no longer equal. Calls notify_one() on the worker thread.
Worker thread ignores notify_one() since it is still awake.
Worker thread goes to sleep.
Deadlock
What would be better is if I could avoid the condition variables since I don't actually need to lock anything. But just have the worker thread go to sleep whenever a == b and wake up whenever a != b. Is there a way to do this?
It seems you are not properly synchronizing your accesses: When you read a and b in the work thread, you'll need to acquire a lock, at least, while accessing the value shared with the producer: since there is a lock held by the work thread, neither a nor b can be changed by the main thread. If they are not equal, the work thread can release the lock and churn away processing the values. If they are equal, the work thread instead wait()s on the condition variable while the lock is held! The main functionality of the condition variable is to atomically release the lock and go to sleep.
When the main thread updates a and/or b it acquires the lock, does the changes, releases the lock and notifies the worker thread. The work thread clearly didn't hold the lock, but acquires it either when the next check is due or as a result of the notification, checks the state of the values and either wait()s or processes the values.
When done correctly, there is no potential for a race condition!
I missed your key confusion: "Since I don't actually need to lock anything"! Well, when you have two threads which concurrently may access the same value and, at least, one of them is modifying the value, you have a data race if there is no synchronization. Any program which has a data race has undefined behavior. Put differently: even if you want to only send a bool value from one thread to another thread, you do need synchronization. The synchronization doesn't have to take the form of locks (the values can be synchronized using atomic variables, for example) but doing non-trivial communication, e.g., involving two ints rather than just one with atomics is generally quite hard! You almost certainly want to use a lock. You may not have discovered this deep desire, yet, however.
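The pattern described in this answer might be sketched as follows, using std::thread and std::condition_variable (Boost's equivalents have the same shape). The Shared struct, the stop flag, the processed counter, and the helpers set_a, request_stop and run_worker_demo are all assumptions added so that the example terminates and can be checked. "Doing the work" is modelled as setting b to the value of a.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state; a and b are the two integers from the question.
struct Shared {
    std::mutex m;
    std::condition_variable cv;
    int a = 0;
    int b = 0;
    bool stop = false;  // shutdown flag, not part of the original question
    int processed = 0;  // number of work items handled
};

void worker_thread(Shared& s) {
    std::unique_lock<std::mutex> lock(s.m);
    for (;;) {
        // Atomically releases the lock and sleeps until there is work
        // (a != b) or a stop request; the check itself runs under the lock.
        s.cv.wait(lock, [&s] { return s.a != s.b || s.stop; });
        if (s.a == s.b) return;  // stop requested and nothing left to do

        const int target = s.a;  // snapshot taken under the lock
        lock.unlock();
        // ... the real work for `target` would run here, unlocked ...
        lock.lock();
        s.b = target;            // mark the work as done
        ++s.processed;
    }
}

// Main-thread side: change the value under the lock, then notify.
void set_a(Shared& s, int value) {
    {
        std::lock_guard<std::mutex> lock(s.m);
        s.a = value;
    }
    s.cv.notify_one();
}

void request_stop(Shared& s) {
    {
        std::lock_guard<std::mutex> lock(s.m);
        s.stop = true;
    }
    s.cv.notify_one();
}

// Runs one worker, posts one change, and returns how many items it handled.
int run_worker_demo() {
    Shared s;
    std::thread worker([&s] { worker_thread(s); });
    set_a(s, 5);
    request_stop(s);
    worker.join();
    assert(s.a == s.b);
    return s.processed;
}
```

The lost-wakeup scenario from the question cannot occur here: the main thread can only change a between the worker's check and its sleep if it holds the mutex, and the worker holds that mutex across both steps.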
Things to think about:
Is there a reason for your threads to stay asleep at all?
Why not launch a new thread and let it die a nice natural death when it has finished its work?
If there is only one code path active at any point in time (all other threads are asleep), then your design does not allow for concurrency.
Finally, if you're using variables that are shared between threads, you should be using atomics. This will make sure that access to your values is synchronized.

Writing to a mutex'ed shared resource

I have a C++ list which is being processed by multiple threads.
Each thread takes a pthread_mutex_lock on the list so that other threads cannot "interfere" with the list. As part of processing, each thread also calls push_back on the list.
My question is - is push_back on a mutex-ed list a bad idea? Is the mutex still valid while the thread is pushing more data on the list? Most of the documentation/examples I've seen on pthread_mutex_lock are only doing "reading", so I am curious to know what happens when the same thread which acquired the lock writes to the shared resource.
As long as only that particular thread is holding the lock, and no other thread can take this lock, writing should be fine. Think of why a problem could happen: it would have been a problem if one thread was writing and another was reading simultaneously. If a ball is yours, you can do anything with it, right? Things change when they're shared.
The mutex needs to be unique for the entire group of threads (i.e. all threads must use the same mutex). If you create a mutex for each thread, then you are not thread-safe at all, because each thread will wait on its own mutex and not be synchronized with the rest.
And yes an acquired mutex can be used safely to both read and write.
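A minimal sketch of this, assuming a hypothetical SharedList struct and helper names that are not in the question: every thread locks the same pthread mutex, reads the last element, and pushes a new one while still holding the lock.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <pthread.h>
#include <vector>

// Hypothetical shared state: ONE mutex guards the list for all threads.
struct SharedList {
    pthread_mutex_t mutex;
    std::list<int> items;
};

// Each thread reads the last element and appends last + 1, 1000 times.
void* process_list(void* arg) {
    SharedList* shared = static_cast<SharedList*>(arg);
    for (int i = 0; i < 1000; ++i) {
        pthread_mutex_lock(&shared->mutex);
        // Both the read and the write happen under the same lock.
        const int last = shared->items.empty() ? 0 : shared->items.back();
        shared->items.push_back(last + 1);
        pthread_mutex_unlock(&shared->mutex);
    }
    return nullptr;
}

// Runs thread_count threads over one list and returns the final element.
int run_list_demo(int thread_count) {
    SharedList shared;
    pthread_mutex_init(&shared.mutex, nullptr);

    std::vector<pthread_t> threads(thread_count);
    for (pthread_t& t : threads) {
        int rc = pthread_create(&t, nullptr, process_list, &shared);
        assert(rc == 0);
        (void)rc;
    }
    for (pthread_t& t : threads) pthread_join(t, nullptr);

    // No push was lost, so the list length matches the number of pushes.
    assert(shared.items.size() ==
           static_cast<std::size_t>(thread_count) * 1000);
    const int last = shared.items.back();
    pthread_mutex_destroy(&shared.mutex);
    return last;
}
```

If each thread had its own mutex instead, the read of back() and the push_back() could interleave across threads, which is a data race.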

How to implement a recursive MRSW lock?

I need a fully-recursive multiple-reader/single-writer lock (shared mutex) for my project - I don't agree with the notion that if you have complete const-correctness you shouldn't need them (there was some discussion about that on the boost mailing list), in my case the lock should protect a completely transparent cache which would be mutable in any case.
As for the semantics of recursive MRSW locks, I think the only ones that make sense are that acquiring an exclusive lock in addition to a shared one temporarily releases the shared one, to be reacquired when the exclusive one is released.
This has the somewhat strange effect that unlocking can wait, but I can live with that - writing rarely happens anyway and recursive locking usually only happens through recursive code paths, in which case the caller has to be prepared that the call might wait in any case. To avoid it one can still simply upgrade the lock instead of using recursive locking.
Acquiring a shared lock on top of an exclusive one should obviously just increase the lock count.
So the question becomes - how should I implement it? The usual approach with a critical section and two semaphores doesn't work here because - as far as I can see - the woken up thread has to handshake, by inserting its thread id into the lock's owner map.
I suppose it would be doable with two condition variables and a couple of mutexes, but the sheer amount of synchronization primitives that I would end up using sounds like a bit too much overhead for my taste.
An idea which just sprang into my mind is to utilize TLS to remember the type of lock I'm holding (and possibly the local lock counts). Have to think it through - but I'll still post the question for now.
Target platform is Win32 but that shouldn't really matter. Note that I'm specifically targeting Win2k so anything related to the new MRSW lock primitive in Windows 7 is not relevant for me. :-)
Okay, I solved it.
It can be done with just 2 semaphores, a critical section and almost no more locking than for a regular non-recursive MRSW lock (there is obviously some more CPU-time spent inside the lock because that multiset must be managed) - but it's tricky. The structure I came up with looks like this:
// Protects everything that follows, except mWriterThreadId and mRecursiveUpgrade
CRITICAL_SECTION mLock;
// Semaphore to wait on for a read lock
HANDLE mSemaReader;
// Semaphore to wait on for a write lock
HANDLE mSemaWriter;
// Number of threads waiting for a write lock.
int mWriterWaiting;
// Number of times the writer entered the write lock.
int mWriterActive;
// Number of threads inside a read lock. Note that this does not include
// recursive read locks.
int mReaderActiveThreads;
// Whether or not the current writer obtained the lock by a recursive
// upgrade. Note that this member might be set outside the critical
// section, so it should only be read from by the writer during his
// unlock.
bool mRecursiveUpgrade;
// This member contains the current thread id once for each
// (recursive) read lock held by the current thread in addition to an
// undefined number of other thread ids which may or may not hold a
// read lock, even inside the critical section (!).
std::multiset<unsigned long> mReaderActive;
// If there is no writer this member contains 0.
// If the current thread is the writer this member contains his
// thread-id.
// Otherwise it can contain either of them, even inside the
// critical section (!).
// Also note that it might be set outside the critical section.
unsigned long mWriterThreadId;
Now, the basic idea is this:
Full update of mWriterWaiting and mWriterActive for an unlock is performed by the unlocking thread.
For mWriterThreadId and mReaderActive this is not possible, as the waiting thread needs to insert itself when it was released.
So the rule is that you may never access those two members except to check whether you are holding a read lock or are the current writer - specifically, they may not be used to check whether or not there are any readers / writers - for that you have to use the (somewhat redundant but necessary for this reason) mReaderActiveThreads and mWriterActive.
I'm currently running some test code (which has been going on deadlock- and crash-free for 30 minutes or so) - when I'm sure that it's stable and I've cleaned up the code somewhat I'll put it on some pastebin and add a link in a comment here (just in case someone else ever needs this).
Well, I did some thinking. Starting from the simple "two semaphores and a critical section", one adds a writer lock count and an owning writer TID to the structure.
Unlock still sets most of the new status in the critsec. Readers still normally increase the lock count - recursive locking simply adds a non-existing reader to the counter.
During a writer's lock() I compare the owning TID, and if the writer already owns it, the write lock counter is increased.
Setting the new writer TID can't be done by the unlock() - it doesn't know which one will be woken - but if writers reset it back to zero in their unlock() it's not a problem: the current thread id won't ever be zero and setting it is an atomic operation.
All sounds simple enough - one nasty problem left: A recursive reader-reader lock while a writer is waiting will deadlock. And I don't know how to solve that short of doing a reader-biased lock... somehow I need to know whether or not I already own a reader lock.
Using TLS doesn't sound too great after I realized that the number of available slots might be rather limited...
As far as I understand, you need to provide your writer exclusive access to the data, while readers can operate simultaneously (if this is not what you want, please clarify your question).
I think you need to implement a sort of "inverse semaphore", i.e. a semaphore that will block a thread when positive, and signal all waiting threads when zero. If you do this, you can use two such semaphores for your program. The operation of your threads could then be the following:
Reader:
(1) wait on sem A
(2) increase sem B
(3) read operation
(4) decrease sem B
Writer:
(1) increase sem A
(2) wait on sem B
(3) write operation
(4) decrease sem A
In this way the writer will perform the write operation as soon as all pending readers have finished reading. As soon as your writer finishes, readers can resume their operation without blocking each other.
I am not familiar with Windows mutex/semaphore facilities but I can think of a way to implement such semaphores using the POSIX threads API (combining a mutex, a counter and a conditional variable).
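A possible shape for such an "inverse semaphore", built from a mutex, a counter and a condition variable, is sketched below. It uses the C++ standard library wrappers rather than the raw POSIX calls, and the class name, method names and demo function are all hypothetical. It covers only the primitive itself, not the full reader/writer protocol.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical "inverse semaphore": waiters block while the count is
// positive and are all released when it drops back to zero.
class InverseSemaphore {
public:
    void increase() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++count_;
    }

    void decrease() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(count_ > 0);  // decrease() must pair with an earlier increase()
        if (--count_ == 0) zero_.notify_all();
    }

    // Returns immediately if the count is zero, otherwise blocks.
    void wait_until_zero() {
        std::unique_lock<std::mutex> lock(mutex_);
        zero_.wait(lock, [this] { return count_ == 0; });
    }

    int count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable zero_;
    int count_ = 0;
};

// Raises the count, starts a waiter, lowers the count, and reports
// whether the waiter got through.
bool run_inverse_semaphore_demo() {
    InverseSemaphore sem;
    sem.increase();
    bool passed = false;  // written by the waiter, read after join()
    std::thread waiter([&] {
        sem.wait_until_zero();
        passed = true;
    });
    sem.decrease();
    waiter.join();
    return passed && sem.count() == 0;
}
```

Note that in the four-step scheme above, a reader that has passed step (1) but not yet reached step (2) is invisible to a writer checking sem B, so a real implementation would need to make those two steps atomic, for example by doing both under one shared mutex.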