#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <condition_variable>

std::mutex m;
std::condition_variable cv;
std::string data;
bool ready = false;
bool processed = false;

void worker_thread()
{
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{ return ready; });

    std::cout << "Worker thread is processing data\n";
    data += " after processing";

    processed = true;
    std::cout << "Worker thread signals data processing completed\n";

    //lk.unlock(); /// here!!!!!!!!!!!
    cv.notify_one();
}

int main()
{
    std::thread worker(worker_thread);

    data = "Example data";
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;
        std::cout << "main() signals data ready for processing\n";
    }
    cv.notify_one();

    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, []{ return processed; });
    }
    std::cout << "Back in main(), data = " << data << '\n';

    worker.join();
}
I don't understand why this code works without an unlock before notify_one in worker_thread.
I thought that if I don't unlock before notifying, the woken-up main thread would block again, because the mutex is still held by worker_thread.
After that, worker_thread would unlock the mutex (because the unique_lock unlocks the mutex when it is destroyed).
Then no one would be left to wake up the sleeping main thread.
But this code works fine without unlocking the mutex before the notify.
How does this work?
(I read the cppreference comments, but I couldn't understand them.)
There are two things to talk about here. First, why it works, and second, why you don't want to call unlock first.
It works because cv.wait(lk, []{return processed;}); actually unlocks lk while waiting for a notification.
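It may help to see what the predicate overload of wait() is specified to do. This is a conceptual sketch (the helper name wait_until_true is mine, just for illustration), not the real library implementation:

#include <condition_variable>
#include <mutex>

// Conceptual sketch of cv.wait(lk, pred): the predicate is always evaluated
// while holding the lock, and each plain wait() atomically releases the lock,
// sleeps, and re-acquires the lock before returning.
template <class Predicate>
void wait_until_true(std::condition_variable& cv,
                     std::unique_lock<std::mutex>& lk,
                     Predicate pred)
{
    while (!pred())
        cv.wait(lk);
}

So the waiter does not sit on the mutex while it sleeps; it only holds it while checking the predicate and after the wait finally returns.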
Some sequences. Main gets the lock first:

MAIN                                    WORKER

auto lk = lock();
                                        auto lk = lock(); (blocks)
cv.wait(lk, condition);
  checks condition; fails
  releases lk
                                        wakes up with lk
                                        fulfills condition
                                        cv.notify_one();
  wakes up from notify
  tries to re-acquire lk, blocks
                                        lk.unlock();
  wakes up with lk
  checks condition; passes
Worker gets the lock first:

MAIN                                    WORKER

                                        auto lk = lock();
auto lk = lock(); (blocks)
                                        fulfills condition
                                        cv.notify_one();
                                        lk.unlock();
wakes up with lk
cv.wait(lk, condition);
  checks condition; passes
For the case where we unlock first. Main gets the lock first:

MAIN                                    WORKER

auto lk = lock();
                                        auto lk = lock(); (blocks)
cv.wait(lk, condition);
  checks condition; fails
  releases lk
                                        wakes up with lk
                                        fulfills condition
                                        lk.unlock();
                                        cv.notify_one();
  wakes up from notify
  gets lk
  checks condition; passes
Worker gets the lock first, two possibilities at the end:

MAIN                                    WORKER

                                        auto lk = lock();
auto lk = lock(); (blocks)
                                        fulfills condition
                                        lk.unlock();
wakes up with lk
cv.wait(lk, condition);
  checks condition; passes
                                        cv.notify_one(); (nobody cares)

MAIN                                    WORKER

                                        auto lk = lock();
auto lk = lock(); (blocks)
                                        fulfills condition
                                        lk.unlock();
                                        cv.notify_one(); (nobody cares)
wakes up with lk
cv.wait(lk, condition);
  checks condition; passes
Now, why is it better to hold the lock?
Because the writers of the C++ standard libraries made it better. The library knows that the cv's waiter is associated with a mutex, and it knows that the notifying thread currently holds that mutex.
So what actually happens is:

MAIN                                    WORKER

auto lk = lock();
                                        auto lk = lock(); (blocks)
cv.wait(lk, condition);
  checks condition; fails
  releases lk
                                        wakes up with lk
                                        fulfills condition
                                        cv.notify_one(); // knows the listener holds the mutex, waits for
                                        lk.unlock();     // it, and only then actually wakes up the listening thread
  wakes up from notify
  gets lk
  checks condition; passes
Now there are far more possibilities than the above. condition_variable has "spurious" wakeups, where you are woken up even though nobody notified you. So discipline has to be followed in using it.
The general rule is that both sides must share a mutex, and the lock must be held at some point between the change to the tested condition and the notification. Optimally, the lock should be held until immediately after the notification is sent.
This is in addition to preventing race conditions on the condition state itself. An atomic condition state avoids race conditions on the state, but it does not satisfy the locking requirements of the condition variable.
The above rule -- hold the lock at some time in that interval (possibly for the entire interval) -- is a simplification, but it is sufficient to guarantee that you don't lose notifications. The full rule means going back to the memory model of C++ and doing painful proofs about what your code does, and I honestly don't want to do that again. So I use that rule of thumb.
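As a concrete illustration of that rule of thumb, both of the following notifier shapes satisfy it. This is a sketch reusing the m / cv / ready names from the example at the top; the function names are just for illustration:

#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool ready = false;

// Variant A: notify while still holding the lock.
void notify_holding_lock() {
    std::lock_guard<std::mutex> lk(m);
    ready = true;     // state change under the lock
    cv.notify_one();  // notification also under the lock
}

// Variant B: change the state under the lock, then notify after unlocking.
void notify_after_unlock() {
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true; // state change under the lock
    }                 // lock released here
    cv.notify_one();  // still fine: the lock was held between the state change
                      // and the notification, so the waiter either sees
                      // ready == true in its predicate check or is already
                      // asleep when this notify arrives
}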
To illustrate what can go wrong with an atomic condition "state" and no lock:

MAIN                                    WORKER

auto lk = lock();
cv.wait(lk, condition);
  checks condition; fails
                                        fulfills condition
                                        cv.notify_one(); (nobody cares)
  goes to sleep, releases lk

and nothing ever happens again.
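Put into code, that broken pattern looks roughly like this (a deliberately broken sketch; the names are illustrative). The flag is atomic, but the notifier never touches the waiter's mutex:

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
std::atomic<bool> done{false}; // atomic, so no data race on the flag itself...

void waiter() {
    std::unique_lock<std::mutex> lk(m);
    // ...but the worker never locks m, so the flag can flip and the notify can
    // fire in the gap between this predicate check and the actual sleep:
    cv.wait(lk, [] { return done.load(); }); // may sleep forever
}

void worker() {
    done = true;     // not under m
    cv.notify_one(); // may happen before the waiter has gone to sleep
}

int main() {
    std::thread t(waiter);
    worker();
    t.join();        // can hang: the notification was lost
}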
I'm having a problem where a few condition_variables get stuck in their wait phase even though they've been notified. Each one even has a predicate that's being set, just in case they miss the notify call from the main thread.
Here's the code:
unsigned int notifyCount = 10000;
std::atomic<int> threadCompletions = 0;

for (unsigned int i = 0; i < notifyCount; i++)
{
    std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
    std::condition_variable* condition = new std::condition_variable();

    // Worker thread //
    std::thread([&, condition, wakeUp]()
    {
        std::mutex mutex;
        std::unique_lock<std::mutex> lock(mutex);
        condition->wait(lock, [wakeUp] { return wakeUp->load(); });
        threadCompletions++;
    }).detach();

    // Notify //
    *wakeUp = true;
    condition->notify_one();
}

Sleep(5000); // Sleep for 5 seconds just in case some threads are taking a while to finish executing

// Check how many threads finished (threadCompletions should be equal to notifyCount)
Unless I'm mistaken, after the for loop is done, threadCompletions should always be equal to notifyCount. Very often though, it is not.
When running in release, I'll sometimes get just one or two out of 10000 threads that never finished, but when running in debug, I'll get 20 or more.
I thought maybe the wait call in the thread was happening after the main thread's notify_one call (meaning it missed its notification to wake up), so I passed a predicate into wait to ensure that it doesn't get stuck waiting. But it still does in some cases.
Does anyone know why this is happening?
You are assuming the call to wait() is atomic. I don't believe it is. That is why it requires the use of a mutex and a lock.
Consider the following:
Main Thread                               Child Thread

                                          // This is your wait unrolled.
                                          while (!wakeUp->load()) {

// This is atomic,
// but it was already checked
// in the child thread.
*wakeUp = true;

// The child has not yet called wait(),
// so this notify_one is wasted.
condition->notify_one();

                                              // The previous call to notify_one
                                              // is not recorded, and thus the
                                              // thread is now stuck in this wait,
                                              // never to be woken.
                                              wait(lock);
                                          }

// Your race condition.
Calls to notify_one() and wait() should be controlled via the same mutex to make sure they don't overlap like this.
for (unsigned int i = 0; i < notifyCount; i++)
{
    std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
    std::mutex* mutex = new std::mutex{};
    std::condition_variable* condition = new std::condition_variable();

    // Worker thread //
    // Capture the pointers by value: they are loop-local variables and would
    // dangle if the detached thread captured them by reference.
    std::thread([&, wakeUp, mutex, condition]()
    {
        std::unique_lock<std::mutex> lock(*mutex);
        condition->wait(lock, [wakeUp] { return wakeUp->load(); });
        threadCompletions++;
    }).detach();

    // Notify //
    *wakeUp = true;
    std::unique_lock<std::mutex> lock(*mutex);
    condition->notify_one();
}
// Don't forget to clean up the new-ed structures correctly.
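For the cleanup itself, one possible approach (a sketch, not the only way; spawn_one is just an illustrative name) is to let std::shared_ptr own the per-iteration objects, so the copies captured by value keep them alive until the detached thread has finished with them:

#include <atomic>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

std::atomic<int> threadCompletions{0};

void spawn_one()
{
    auto wakeUp    = std::make_shared<std::atomic<bool>>(false);
    auto mutex     = std::make_shared<std::mutex>();
    auto condition = std::make_shared<std::condition_variable>();

    // Capture the shared_ptrs by value so the objects outlive this scope.
    std::thread([wakeUp, mutex, condition] {
        std::unique_lock<std::mutex> lock(*mutex);
        condition->wait(lock, [&] { return wakeUp->load(); });
        threadCompletions++;
    }).detach();

    // Notify under the same mutex, as above.
    {
        std::lock_guard<std::mutex> lock(*mutex);
        *wakeUp = true;
    }
    condition->notify_one();
    // The local shared_ptrs go out of scope here; the detached thread's copies
    // keep the mutex, condition_variable and flag alive until it is done.
}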
You have a data race. Consider the following scenario:
Worker thread: the condition variable's predicate tests whether wakeUp is true - it isn't.
Main thread: wakeUp is set to true and the condition variable is notified.
Worker thread: the condition_variable actually starts waiting, but this happens after the notification has already occurred - meaning the notification is missed and the thread might never wake up.
Normally, synchronization of condition variables is done via mutexes - atomics aren't too helpful here. In C++20 there is a dedicated mechanism for waiting on and notifying atomics (std::atomic::wait / std::atomic::notify_one).
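For reference, that C++20 mechanism looks roughly like this (a sketch; it needs a standard library with std::atomic<T>::wait / notify_one support):

#include <atomic>
#include <thread>

std::atomic<bool> wakeUp{false};

int main() {
    std::thread worker([] {
        wakeUp.wait(false); // blocks while the value is still false;
                            // returns immediately if it is already true
        // ... do the work ...
    });

    wakeUp.store(true);     // change the value first...
    wakeUp.notify_one();    // ...then wake the waiter; no mutex needed
    worker.join();
}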
I'm trying to understand condition_variables.
I expected my code to work like this:
1. main locks mx
2. main calls wait() <= here the lock is released
3. threads lock mx
4. threads send a notify
5. threads unlock mx
6. main's wait() finishes and relocks mx
So why can the other threads lock mx again before main's wait() re-acquires it after a notify?
Example
#include <iostream>
#include <future>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <vector>

using namespace std::chrono_literals;

std::shared_future<void> ready;

std::mutex finish_mx;
std::condition_variable finish_cv;

int execute(int val, const std::shared_future<void> &ready) {
    ready.wait();

    std::lock_guard<std::mutex> lock(finish_mx);
    std::cout << "Locked: " << val << std::endl;
    finish_cv.notify_one();

    return val;
}

int main()
{
    std::promise<void> promise;
    auto shared = promise.get_future().share();

    std::vector<std::future<int>> pool;
    for (int i = 0; i < 10; ++i) {
        auto fut = std::async(std::launch::async, execute, i, std::cref(shared));
        pool.push_back(std::move(fut));
    }

    std::this_thread::sleep_for(100ms);

    std::unique_lock<std::mutex> finish_lock(finish_mx);
    promise.set_value();

    for (int i = 0; pool.size() > 0; ++i)
    {
        finish_cv.wait(finish_lock);
        std::cout << "Notifies: " << i << std::endl;

        for (auto it = pool.begin(); it != pool.end(); ++it) {
            auto state = it->wait_for(0ms);
            if (state == std::future_status::ready) {
                pool.erase(it);
                break;
            }
        }
    }
}
example output:
Locked: 6
Locked: 7
Locked: 8
Locked: 9
Locked: 5
Locked: 4
Locked: 3
Locked: 2
Locked: 1
Notifies: 0
Locked: 0
Notifies: 1
Edit
for (int i = 0; pool.size() > 0; ++i)
{
    finish_cv.wait(finish_lock);
    std::cout << "Notifies: " << i << std::endl;

    auto it = pool.begin();
    while (it != pool.end()) {
        auto state = it->wait_for(0ms);
        if (state == std::future_status::ready) {
            /* process result */
            it = pool.erase(it);
        } else {
            ++it;
        }
    }
}
This depends on how your OS schedules threads that are waiting to acquire a mutex lock. All the execute threads are already waiting to acquire the mutex lock before the first notify_one, so if there's a simple FIFO queue of threads waiting to lock the mutex, then they are all ahead of the main thread in the queue. As each thread unlocks the mutex, the next one in the queue locks it.
This has nothing to do with mutexes being "faster" than condition variables; the condition variable has to lock the same mutex to return from the wait.
As soon as the future becomes ready, all the execute threads return from ready.wait() and all try to lock the mutex, joining the queue of waiters. When the condition variable starts to wait, the mutex is unlocked, and one of the other threads (the one at the front of the queue) gets the lock. It calls notify_one, which causes the main thread (waiting inside the condition variable) to try to relock the mutex, joining the back of the queue. The notifying thread unlocks the mutex, and the next thread in the queue gets the lock and calls notify_one (which does nothing, because the condition variable has already been notified and is waiting to lock the mutex). Then the next thread in the queue gets the mutex, and so on.
It seems that one of the execute threads didn't run quickly enough to get into the queue before the first notify_one call, so it ended up in the queue behind the condition variable's relock, which is why you see one "Notifies" line before the last "Locked" line.
This code is a simplification of real project code. The main thread creates a worker thread and uses a std::condition_variable to wait until the worker thread has really started. In the code below, the std::condition_variable wakes up only after current_thread_state becomes ThreadState::Stopping - that is, on the second notification from the worker thread; the main thread did not wake up after the first notification, when current_thread_state became ThreadState::Started. The result was a deadlock. Why does this happen? Why does the std::condition_variable not wake up after the first thread_event.notify_all()?
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>

using std::cout;

int main()
{
    std::thread thread_var;
    struct ThreadState {
        enum Type { Stopped, Started, Stopping };
    };
    ThreadState::Type current_thread_state = ThreadState::Stopped;
    std::mutex thread_mutex;
    std::condition_variable thread_event;
    while (true) {
        {
            std::unique_lock<std::mutex> lck(thread_mutex);
            thread_var = std::move(std::thread([&]() {
                {
                    std::unique_lock<std::mutex> lck(thread_mutex);
                    cout << "ThreadFunction() - step 1\n";
                    current_thread_state = ThreadState::Started;
                }
                thread_event.notify_all();

                // This code disables output to the console (simulating some work).
                cout.setstate(std::ios::failbit);
                cout << "ThreadFunction() - step 1 -> step 2\n";
                cout.clear();

                {
                    std::unique_lock<std::mutex> lck(thread_mutex);
                    cout << "ThreadFunction() - step 2\n";
                    current_thread_state = ThreadState::Stopping;
                }
                thread_event.notify_all();
            }));
            while (current_thread_state != ThreadState::Started) {
                thread_event.wait(lck);
            }
        }
        if (thread_var.joinable()) {
            thread_var.join();
            current_thread_state = ThreadState::Stopped;
        }
    }
    return 0;
}
Once you call the notify_all method, your main thread and your worker thread (after doing its work) both try to get a lock on the thread_mutex mutex. If your workload is insignificant, as in your example, the worker thread is likely to get the lock before the main thread does, advancing the state to ThreadState::Stopping before the main thread ever observes ThreadState::Started. This results in a deadlock.
Try adding a significant workload, e.g.
std::this_thread::sleep_for( std::chrono::seconds( 1 ) );
to the worker thread. Deadlocks are far less likely then. Of course, this is not a fix for your problem; it is just to illustrate the problem.
You have two threads racing: one writes values of current_thread_state twice, while the other reads the value of current_thread_state once.
It is indeterminate whether the sequence of events is write-write-read or write-read-write as you expect; both are valid executions of your application.
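One way to make the waiting side robust against that interleaving (a sketch of the waiting side only, reusing the names from the question) is to wait for "the thread is no longer Stopped" instead of insisting on seeing exactly Started, so it does not matter whether main observes Started or already Stopping:

// Waiting side only (sketch): accept any state the worker may have advanced to.
{
    std::unique_lock<std::mutex> lck(thread_mutex);
    thread_event.wait(lck, [&] {
        return current_thread_state != ThreadState::Stopped;
    });
}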
This application is a recursive, multi-threaded one that uses detached threads. Each thread regenerates a new bunch of threads before it dies.
Option 1 works; however, it uses a single shared mutex and hence slows the application down.
Option 2 should remove this bottleneck.
Option 1 works:
std::condition_variable cv;
bool ready = false;
std::mutex mu;

// go triggers the thread's function
void go() {
    std::unique_lock<std::mutex> lck( mu );
    ready = true;
    cv.notify_all();
}

void ThreadFunc ( ... ) {
    std::unique_lock<std::mutex> lck ( mu );
    cv.wait(lck, []{ return ready; });
    // do something useful
}
Option 2 does NOT trigger the thread:
std::array<std::mutex, DUToutputs*MaxGnodes> arrMutex;

void go ( long m, long Channel )
{
    std::unique_lock<std::mutex> lck( arrMutex[m + MaxGnodes*Channel] );
    ready = true;
    cv.notify_all();
}

void ThreadFunc ( ... ) {
    std::unique_lock<std::mutex> lck ( arrMutex[Inst + MaxGnodes*Channel] );
    while (!ready) cv.wait(lck);
    // do something useful
}
How can I make option #2 work?
The code in Option 2 contains a so-called data race on the variable ready, because the read and write operations on this variable are no longer synchronized. The behaviour of programs with data races is undefined. You can remove the data race by changing bool ready to std::atomic<bool> ready.
That should already fix the problem in Option 2. However, if you use std::atomic, you can also make other optimizations:
std::atomic<bool> ready{false};

void go(long m, long Channel) {
    // no lock required
    ready = true;
    cv.notify_all();
}

void ThreadFunc( ... ) {
    std::unique_lock<std::mutex> lck(arrMutex[Inst + MaxGnodes*Channel]);
    cv.wait(lck, [] { return ready.load(); });
    // do something useful
}
I have three threads in my application; the first thread needs to wait for data to be ready from the other two threads, which prepare the data concurrently.
In order to do that I am using a condition variable in C++ as follows:
boost::mutex mut;
boost::condition_variable cond;

Thread1:

bool check_data_received()
{
    return (data1_received && data2_received);
}

// Wait until socket data has arrived
boost::unique_lock<boost::mutex> lock(mut);
if (!cond.timed_wait(lock, boost::posix_time::milliseconds(200),
                     boost::bind(&check_data_received)))
{
}

Thread2:

{
    boost::lock_guard<boost::mutex> lock(mut);
    data1_received = true;
}
cond.notify_one();

Thread3:

{
    boost::lock_guard<boost::mutex> lock(mut);
    data2_received = true;
}
cond.notify_one();
So my question is: is it correct to do that, or is there a more efficient way? I am looking for the most optimized way to do the waiting.
It looks like you want a semaphore here, so you can wait for two "resources" to be "taken".
For now, just replace the mutual exclusion with an atomic. You can still use a cv to signal the waiter:
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
#include <boost/bind.hpp>
#include <boost/chrono.hpp>
#include <cstdlib>
#include <iostream>

boost::mutex mut;
boost::condition_variable cond;

boost::atomic_bool data1_received(false);
boost::atomic_bool data2_received(false);

bool check_data_received()
{
    return (data1_received && data2_received);
}

void thread1()
{
    // Wait until socket data has arrived
    boost::unique_lock<boost::mutex> lock(mut);
    while (!cond.timed_wait(lock, boost::posix_time::milliseconds(200),
                            boost::bind(&check_data_received)))
    {
        std::cout << "." << std::flush;
    }
}

void thread2()
{
    boost::this_thread::sleep_for(boost::chrono::milliseconds(rand() % 4000));
    data1_received = true;
    cond.notify_one();
}

void thread3()
{
    boost::this_thread::sleep_for(boost::chrono::milliseconds(rand() % 4000));
    data2_received = true;
    cond.notify_one();
}

int main()
{
    boost::thread_group g;
    g.create_thread(thread1);
    g.create_thread(thread2);
    g.create_thread(thread3);
    g.join_all();
}
Note:
Warning: it's essential that you know only the one waiter is waiting on the cv; otherwise you need notify_all() instead of notify_one().
It is not important that the waiter is already waiting before the workers signal their completion, because the predicated timed_wait checks the predicate before blocking.
Because this sample uses atomics and predicated wait, it's not actually critical to signal the cv under the mutex. However, thread checkers will (rightly) complain about this (I think) because it's impossible for them to check proper synchronization unless you add the locking.
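If you do want something closer to the semaphore idea mentioned at the start, here is a small sketch in standard C++ (the names are illustrative): keep a count of outstanding producers under the mutex and wait for it to reach zero, which generalises to any number of data sources:

#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex mut;
std::condition_variable cond;
int pending = 2; // number of producers we are still waiting for

void producer()
{
    // ... prepare the data ...
    {
        std::lock_guard<std::mutex> lock(mut);
        --pending; // this producer's data is ready
    }
    cond.notify_one(); // only one thread ever waits, so notify_one is enough
}

void consumer()
{
    std::unique_lock<std::mutex> lock(mut);
    cond.wait(lock, [] { return pending == 0; });
    // both pieces of data have arrived
}

int main()
{
    std::thread t1(consumer), t2(producer), t3(producer);
    t1.join(); t2.join(); t3.join();
}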