Threads lock mutex faster than std::condition_variable::wait() - c++

I'm trying to understand condition_variables.
I guess my code should work like this:
1. main locks mx
2. main calls wait() and waits for a notify <= the lock is released here
3. a worker thread locks mx
4. the worker sends a notify
5. the worker unlocks mx
6. main's wait() finishes and re-locks mx
So why can the worker threads lock mx faster than wait() re-acquires it after the notify?
Example
#include <iostream>
#include <future>
#include <condition_variable>
#include <vector>

using namespace std::chrono_literals;

std::shared_future<void> ready;

std::mutex finish_mx;
std::condition_variable finish_cv;

int execute(int val, const std::shared_future<void> &ready) {
    ready.wait();
    std::lock_guard<std::mutex> lock(finish_mx);
    std::cout << "Locked: " << val << std::endl;
    finish_cv.notify_one();
    return val;
}
int main()
{
    std::promise<void> promise;
    auto shared = promise.get_future().share();

    std::vector<std::future<int>> pool;
    for (int i = 0; i < 10; ++i) {
        auto fut = std::async(std::launch::async, execute, i, std::cref(shared));
        pool.push_back(std::move(fut));
    }

    std::this_thread::sleep_for(100ms);

    std::unique_lock<std::mutex> finish_lock(finish_mx);
    promise.set_value();

    for (int i = 0; pool.size() > 0; ++i) {
        finish_cv.wait(finish_lock);
        std::cout << "Notifies: " << i << std::endl;

        for (auto it = pool.begin(); it != pool.end(); ++it) {
            auto state = it->wait_for(0ms);
            if (state == std::future_status::ready) {
                pool.erase(it);
                break;
            }
        }
    }
}
example output:
Locked: 6
Locked: 7
Locked: 8
Locked: 9
Locked: 5
Locked: 4
Locked: 3
Locked: 2
Locked: 1
Notifies: 0
Locked: 0
Notifies: 1
Edit
for (int i = 0; pool.size() > 0; ++i) {
    finish_cv.wait(finish_lock);
    std::cout << "Notifies: " << i << std::endl;

    auto it = pool.begin();
    while (it != pool.end()) {
        auto state = it->wait_for(0ms);
        if (state == std::future_status::ready) {
            /* process result */
            it = pool.erase(it);
        } else {
            ++it;
        }
    }
}

This depends on how your OS schedules threads that are waiting to acquire a mutex. All the execute threads are already waiting to acquire the mutex before the first notify_one, so if there is a simple FIFO queue of threads waiting to lock it, they are all ahead of the main thread in that queue. As each thread unlocks the mutex, the next one in the queue locks it.
This has nothing to do with mutexes being "faster" than condition variables; the thread waiting on the condition variable has to lock the same mutex to return from the wait.
As soon as the future becomes ready, all the execute threads return from ready.wait() and try to lock the mutex, joining the queue of waiters. When the condition variable starts to wait, the mutex is unlocked, and one of the other threads (the one at the front of the queue) gets the lock. It calls notify_one, which causes the main thread blocked in the condition variable's wait to try to relock the mutex, joining the back of the queue. The notifying thread unlocks the mutex, the next thread in the queue gets the lock and calls notify_one (which does nothing, because the condition variable has already been notified and main is merely waiting to relock the mutex). Then the next thread in the queue gets the mutex, and so on.
It seems that one of the execute threads didn't run quickly enough to get into the queue before the first notify_one call, so it ended up in the queue behind the main thread, which is why "Locked: 0" only appears after the first "Notifies" line.
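If the goal is simply to know when all workers have finished, the usual remedy is not to count raw wake-ups but to wait on a predicate guarded by the same mutex. A minimal sketch of that approach (the finished counter is an addition for illustration, not part of the original code):

int finished = 0; // guarded by finish_mx

int execute(int val, const std::shared_future<void> &ready) {
    ready.wait();
    {
        std::lock_guard<std::mutex> lock(finish_mx);
        std::cout << "Locked: " << val << std::endl;
        ++finished; // record completion while holding the lock
    }
    finish_cv.notify_one();
    return val;
}

// in main, after promise.set_value():
finish_cv.wait(finish_lock, [&] { return finished == 10; });

Because the predicate is re-checked with the mutex held on every wake-up, it no longer matters in which order the threads win the mutex, and coalesced or spurious notifications are harmless.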

Related

Notifying condition_variable without unlocking mutex still works well. Why?

#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <condition_variable>

std::mutex m;
std::condition_variable cv;
std::string data;
bool ready = false;
bool processed = false;

void worker_thread()
{
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{ return ready; });

    std::cout << "Worker thread is processing data\n";
    data += " after processing";

    processed = true;
    std::cout << "Worker thread signals data processing completed\n";

    //lk.unlock(); /// here!!!!!!!!!!!
    cv.notify_one();
}

int main()
{
    std::thread worker(worker_thread);

    data = "Example data";
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;
        std::cout << "main() signals data ready for processing\n";
    }
    cv.notify_one();

    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, []{ return processed; });
    }
    std::cout << "Back in main(), data = " << data << '\n';

    worker.join();
}
I don't understand why this code works without an unlock before the notify_one in worker_thread.
I thought that if I don't unlock before notifying, the woken-up main thread would block again, because the mutex is still held by worker_thread.
After that, worker_thread would unlock the mutex (the unique_lock unlocks it when destroyed).
Then nobody would be left to wake up the sleeping main thread.
But this code works fine without unlocking the mutex before the notify.
How does this work?
(I read the cppreference comments, but I couldn't understand them.)
There are two things to talk about here. First, why it works, and second, why you don't want to call unlock first.
It works because cv.wait(lk, []{return processed;}); actually unlocks lk while waiting for a notification.
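In other words, the predicate overload behaves like a loop that re-checks the condition with the lock held, releasing it only while actually blocked. A sketch of the equivalence:

// cv.wait(lk, pred) is equivalent to:
while (!pred()) {   // checked while holding lk
    cv.wait(lk);    // atomically releases lk, blocks, re-acquires lk on wake-up
}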
Some sequences. Main gets lock first:

MAIN                                  WORKER
auto lk = lock();
                                      auto lk = lock(); (blocks)
cv.wait(lk, condition);
checks condition; fails
releases lk
                                      wakes up with lk
                                      fulfills condition
                                      cv.notify_one();
wakes up from notify
tries to re-acquire lk, blocks
                                      lk.unlock();
wakes up with lk
checks condition; passes
Worker gets lock first:

MAIN                                  WORKER
                                      auto lk = lock();
auto lk = lock(); (blocks)
                                      fulfills condition
                                      cv.notify_one();
                                      lk.unlock();
wakes up with lk
cv.wait(lk, condition);
checks condition; passes
Now the case where we unlock first. Main gets lock first:

MAIN                                  WORKER
auto lk = lock();
                                      auto lk = lock(); (blocks)
cv.wait(lk, condition);
checks condition; fails
releases lk
                                      wakes up with lk
                                      fulfills condition
                                      lk.unlock();
                                      cv.notify_one();
wakes up from notify
gets lk
checks condition; passes
Worker gets lock first, two possibilities at the end:

MAIN                                  WORKER
                                      auto lk = lock();
auto lk = lock(); (blocks)
                                      fulfills condition
                                      lk.unlock();
wakes up with lk
cv.wait(lk, condition);
checks condition; passes
                                      cv.notify_one(); (nobody cares)

MAIN                                  WORKER
                                      auto lk = lock();
auto lk = lock(); (blocks)
                                      fulfills condition
                                      lk.unlock();
                                      cv.notify_one(); (nobody cares)
wakes up with lk
cv.wait(lk, condition);
checks condition; passes
Now, why is it better to hold the lock?
Because the writers of C++ standard libraries made it better: the library knows that the cv waiter is associated with a mutex, and can tell that the notifying thread currently holds that mutex.
So what actually happens is:
MAIN                                  WORKER
auto lk = lock();
                                      auto lk = lock(); (blocks)
cv.wait(lk, condition);
checks condition; fails
releases lk
                                      wakes up with lk
                                      fulfills condition
                                      cv.notify_one(); // knows the listener holds the mutex, so it defers,
                                      lk.unlock();     // and this actually wakes up the listening thread
wakes up from notify
gets lk
checks condition; passes
Now, there are far more possibilities than the above. condition_variable has "spurious" wakeups, where you are woken up even though nobody notified you. So discipline has to be followed in using it.
The general rule is that both sides must share a mutex, and the lock must be held at some point between the change to the tested condition and the notification. Optimally, the lock should be held until immediately after the notification is sent.
This is in addition to preventing race conditions on the condition state itself. If you use an atomic condition state you avoid races on the state, but that alone does not satisfy the locking requirements of the condition variable.
The above rule -- hold the lock at some point in that interval (possibly for the entire interval) -- is a simplification, but it is sufficient to guarantee that you don't lose notifications. The full rule means going back to the memory model of C++ and doing painful proofs about what your code does, and I honestly don't want to do that again. So I use that rule of thumb.
To illustrate what can go wrong with an atomic condition "state" and no lock:

MAIN                                  WORKER
auto lk = lock();
cv.wait(lk, condition);
checks condition; fails
                                      fulfills condition
                                      cv.notify_one(); (nobody cares)
goes to sleep, releases lk

and nothing ever happens again.
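For contrast, a sketch of the same scenario with the rule applied: the worker takes the mutex around the state change, so it cannot run in the gap between MAIN's predicate check and MAIN going to sleep (MAIN holds the mutex for that entire window):

// WORKER, following the rule of thumb (m and processed from the example above):
{
    std::lock_guard<std::mutex> g(m); // blocks while MAIN is between its check and its sleep
    processed = true;
}
cv.notify_one(); // MAIN either saw the flag on its check, or is asleep and receives this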

condition_variable doesn't get notified to wake up even with a predicate

I'm having a problem where a few condition_variables get stuck in their wait phase even though they've been notified. Each one even has a predicate that is set, just in case they miss the notify call from the main thread.
Here's the code:
unsigned int notifyCount = 10000;
std::atomic<int> threadCompletions = 0;

for (unsigned int i = 0; i < notifyCount; i++)
{
    std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
    std::condition_variable* condition = new std::condition_variable();

    // Worker thread //
    std::thread([&, condition, wakeUp]()
    {
        std::mutex mutex;
        std::unique_lock<std::mutex> lock(mutex);
        condition->wait(lock, [wakeUp] { return wakeUp->load(); });
        threadCompletions++;
    }).detach();

    // Notify //
    *wakeUp = true;
    condition->notify_one();
}
Sleep(5000); // Sleep for 5 seconds just in case some threads are taking a while to finish executing
// Check how many threads finished (threadCompletions should be equal to notifyCount)
Unless I'm mistaken, after the for loop is done, threadCompletions should always be equal to notifyCount. Very often, though, it is not.
When running in release, I'll sometimes get just one or two out of 10000 threads that never finished, but when running in debug, I'll get 20 or more.
I thought maybe the wait call in the thread was happening after the main thread's notify_one call (meaning it missed its notification to wake up), so I passed a predicate into wait to ensure it doesn't get stuck waiting. But it still does in some cases.
Does anyone know why this is happening?
You are assuming the call to wait() is atomic. It is not: checking the flag and going to sleep are separate steps. That is why it requires the use of a mutex and a lock.
Consider the following:
MAIN THREAD                           CHILD THREAD
                                      // This is your wait unrolled.
                                      while (!wakeUp->load()) {
// This store is atomic, but the
// flag was already checked in the
// child thread.
*wakeUp = true;

// The child has not yet called
// wait(), so this notify_one is
// wasted.
condition->notify_one();
                                          // The previous call to notify_one
                                          // is not recorded, so the thread
                                          // is now stuck in this wait, never
                                          // to be freed.
                                          wait(lock);
                                      }

// Your race condition.
Calls to notify_one() and wait() should be controlled via the same mutex to make sure they don't overlap like this.
for (unsigned int i = 0; i < notifyCount; i++)
{
    std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
    std::mutex* mutex = new std::mutex{};
    std::condition_variable* condition = new std::condition_variable();

    // Worker thread //
    // Note: capture the pointers by value; a detached thread that captured
    // the loop-local variables by reference would read dangling pointers.
    std::thread([&threadCompletions, mutex, condition, wakeUp]()
    {
        std::unique_lock<std::mutex> lock(*mutex);
        condition->wait(lock, [wakeUp] { return wakeUp->load(); });
        threadCompletions++;
    }).detach();

    // Notify //
    {
        std::unique_lock<std::mutex> lock(*mutex);
        *wakeUp = true; // flag and notify are now under the same mutex as the wait
        condition->notify_one();
    }
}
// Don't forget to clean up the new'd objects correctly.
You have a data race. Consider the following scenario:
Worker thread: the condition variable tests whether wakeUp is true - it isn't.
Main thread: wakeUp is set to true and the condition variable is notified.
Worker thread: the condition_variable starts its wait, but only after the notification has already occurred - meaning the notification is missed and the thread might never wake up.
Normally, synchronization with condition variables is done via mutexes - atomics aren't much help here. In C++20 there is a dedicated mechanism for waiting on and notifying atomics.
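For reference, that C++20 mechanism looks like this (a sketch, assuming a C++20 compiler; std::atomic's wait/notify_one members are standard):

#include <atomic>
#include <thread>

int main() {
    std::atomic<bool> wakeUp{false};

    std::thread worker([&] {
        wakeUp.wait(false); // blocks only while the value is still false
        // ... proceed with work ...
    });

    wakeUp = true;
    wakeUp.notify_one(); // no lost-wakeup window: wait() re-checks the value
    worker.join();
}

If the store happens before the worker reaches wait(), the wait returns immediately, so the missed-notification scenario above cannot occur.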

pthread mutex lock - Does it check periodically or OS wakes it up [duplicate]

This question already has answers here:
How pthread_mutex_lock is implemented
(3 answers)
Closed 4 years ago.
If Thread1 tries to lock a resource that is locked by Thread2, does it go to sleep for a finite time?
And when Thread2 unlocks the mutex, how does Thread1 come to know that the resource is available? Does the operating system wake it up, or does it check for the resource periodically?
Your second assumption is correct. When a mutex is already locked by a thread, all the remaining threads that try to lock it are put on hold, in a sleep state. Once the mutex is unlocked, the OS wakes the waiters up, and whichever thread manages to lock it first gets it. This is not on a FIFO basis; in fact, there is no rule about which thread gets first preference to lock the mutex once woken. Consider my example below, where I use condition variables to control the threads:
pthread_cond_t cond1 = PTHREAD_COND_INITIALIZER;
pthread_cond_t cond2 = PTHREAD_COND_INITIALIZER;
pthread_cond_t cond3 = PTHREAD_COND_INITIALIZER;
pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock3 = PTHREAD_MUTEX_INITIALIZER;

int TRUE = 1;

void print(char *p)
{
    printf("%s", p);
}

void * threadMethod1(void *arg)
{
    printf("In thread1\n");
    do {
        pthread_mutex_lock(&lock1);
        pthread_cond_wait(&cond1, &lock1);
        print("I am thread 1st\n");
        pthread_cond_signal(&cond3); /* Now allow 3rd thread to process */
        pthread_mutex_unlock(&lock1);
    } while (TRUE);
    pthread_exit(NULL);
}

void * threadMethod2(void *arg)
{
    printf("In thread2\n");
    do {
        pthread_mutex_lock(&lock2);
        pthread_cond_wait(&cond2, &lock2);
        print("I am thread 2nd\n");
        pthread_cond_signal(&cond1);
        pthread_mutex_unlock(&lock2);
    } while (TRUE);
    pthread_exit(NULL);
}

void * threadMethod3(void *arg)
{
    printf("In thread3\n");
    do {
        pthread_mutex_lock(&lock3);
        pthread_cond_wait(&cond3, &lock3);
        print("I am thread 3rd\n");
        pthread_cond_signal(&cond2);
        pthread_mutex_unlock(&lock3);
    } while (TRUE);
    pthread_exit(NULL);
}

int main(void)
{
    pthread_t tid1, tid2, tid3;
    int i = 0;

    printf("Before creating the threads\n");
    if (pthread_create(&tid1, NULL, threadMethod1, NULL) != 0)
        printf("Failed to create thread1\n");
    if (pthread_create(&tid2, NULL, threadMethod2, NULL) != 0)
        printf("Failed to create thread2\n");
    if (pthread_create(&tid3, NULL, threadMethod3, NULL) != 0)
        printf("Failed to create thread3\n");

    pthread_cond_signal(&cond1); /* Now allow first thread to process first */

    sleep(1);
    TRUE = 0; /* Stop all the threads */
    sleep(3);

    /* this is how we would join the threads before exiting:
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);
    pthread_join(tid3, NULL); */
    exit(0);
}
Here I am using 3 mutexes and 3 condition variables. With the above example you can schedule/control or prioritize any number of threads in C. The first thread locks mutex lock1 and waits on cond1; likewise, the second thread locks lock2 and waits on cond2, and the third thread locks lock3 and waits on cond3. This is the situation after all the threads have been created: each is waiting for a signal on its condition variable before executing further. In the main thread (i.e. the main function; every program has one main thread, which in C/C++ is created automatically once the kernel passes control to main) we call pthread_cond_signal(&cond1). Once this call completes, thread1, which was waiting on cond1, is released and starts executing. When it has finished its task it calls pthread_cond_signal(&cond3), so the thread waiting on cond3, i.e. thread3, is released and starts executing; it in turn calls pthread_cond_signal(&cond2), which releases the thread waiting on cond2, in this case thread2.
Fundamental information about the mutex (MUTual EXclusion lock)
A mutex is a special lock that only one thread may lock at a time. If a thread locks a mutex and then a second thread also tries to lock the same mutex, the second thread is blocked, or put on hold. Only when the first thread unlocks the mutex is the second thread unblocked—allowed to resume execution.
Linux guarantees that race conditions do not occur among threads attempting to lock a mutex; only one thread will ever get the lock, and all other threads will be blocked.
A thread may attempt to lock a mutex by calling pthread_mutex_lock on it. If the mutex was unlocked, it becomes locked and the function returns immediately.
What happens when trying to lock a mutex that is locked by another thread?
If the mutex is locked by another thread, pthread_mutex_lock blocks execution and returns only once the mutex has been unlocked by the other thread.
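A small illustration of the blocking behaviour, alongside the non-blocking variant (a sketch, not from the original answer):

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

pthread_mutex_lock(&m);   /* sleeps (no periodic polling) until the mutex is free */
/* ... critical section ... */
pthread_mutex_unlock(&m); /* the kernel now makes a blocked waiter runnable */

if (pthread_mutex_trylock(&m) == 0) { /* non-blocking attempt */
    /* got the lock */
    pthread_mutex_unlock(&m);
} else {
    /* EBUSY: returned immediately instead of sleeping */
}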

Correct way to wait a condition variable that is notified by several threads

I'm trying to do this with the C++11 concurrency support.
I have a sort of thread pool of worker threads that all do the same thing, where a master thread has an array of condition variables (one for each thread; they need to 'start' synchronized, i.e. not run ahead by one cycle of their loop).
for (auto &worker_cond : cond_arr) {
    worker_cond.notify_one();
}
Then this thread has to wait for a notification from each thread of the pool before restarting its cycle. What's the correct way of doing this? Have a single condition variable and wait on some integer that each non-master thread increments? Something like this (still in the master thread):

unique_lock<std::mutex> lock(workers_mtx);
workers_finished.wait(lock, [&workers] { return workers == cond_arr.size(); });
I see two options here:
Option 1: join()
Basically instead of using a condition variable to start the calculations in your threads, you spawn a new thread for every iteration and use join() to wait for it to be finished. Then you spawn new threads for the next iteration and so on.
Option 2: locks
You don't want the main-thread to notify as long as one of the threads is still working. So each thread gets its own lock, which it locks before doing the calculations and unlocks afterwards. Your main-thread locks all of them before calling the notify() and unlocks them afterwards.
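A literal sketch of that scheme (names invented for illustration; it assumes the workers are already blocked on their condition variables whenever the master notifies):

std::vector<std::mutex> work_mtx(num_workers); // one lock per worker

// each worker, around one cycle of its loop:
//     work_mtx[my_id].lock();
//     ... do one cycle of work ...
//     work_mtx[my_id].unlock();

// master: can only pass this point once no worker is mid-calculation
for (auto &m : work_mtx) m.lock();
for (auto &c : cond_arr) c.notify_one();
for (auto &m : work_mtx) m.unlock();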
I see nothing fundamentally wrong with your solution.
Guard workers with workers_mtx and done.
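For instance, the worker side of that could look like this (a sketch; workers is the counter from the question, reset by the master before each cycle):

// worker, at the end of one cycle:
{
    std::lock_guard<std::mutex> lock(workers_mtx);
    ++workers; // modified under the same mutex the master waits with
}
workers_finished.notify_one();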
We could abstract this with a counting semaphore.
struct counting_semaphore {
    std::unique_ptr<std::mutex> m = std::make_unique<std::mutex>();
    std::ptrdiff_t count = 0;
    std::unique_ptr<std::condition_variable> cv = std::make_unique<std::condition_variable>();

    counting_semaphore(std::ptrdiff_t c = 0) : count(c) {}
    counting_semaphore(counting_semaphore&&) = default;

    void take(std::size_t n = 1) {
        std::unique_lock<std::mutex> lock(*m);
        cv->wait(lock, [&]{
            if (count - std::ptrdiff_t(n) < 0) return false;
            count -= n;
            return true;
        });
    }
    void give(std::size_t n = 1) {
        {
            std::unique_lock<std::mutex> lock(*m);
            count += n;
            if (count <= 0) return;
        }
        cv->notify_all();
    }
};
take takes count away, and blocks if there is not enough.
give adds to count, and notifies if there is a positive amount.
Now the worker threads ferry tokens between two semaphores.
std::vector<counting_semaphore> m_worker_start(count); // one per worker
counting_semaphore m_worker_done{0}; // not count, zero
std::atomic<bool> m_shutdown = false;

// master controller:
for (each step) {
    for (auto&& starts : m_worker_start)
        starts.give();
    m_worker_done.take(count);
}

// master shutdown:
m_shutdown = true;
// wake up the workers forever:
for (auto&& starts : m_worker_start)
    starts.give(std::size_t(-1)/2);

// worker thread:
while (true) {
    master->m_worker_start[my_id].take();
    if (master->m_shutdown) return;
    // do work
    master->m_worker_done.give();
}
or somesuch.

std::condition_variable not properly wakes up after std::condition_variable::notify_all() from other thread

This code is a simplification of real project code. The main thread creates a worker thread and waits on a std::condition_variable until the worker thread has really started. In the code below, the std::condition_variable only wakes up after current_thread_state becomes ThreadState::Stopping - that is the second notification from the worker thread; the main thread did not wake up after the first notification, when current_thread_state became ThreadState::Started. The result was a deadlock. Why does this happen? Why does the std::condition_variable not wake up after the first thread_event.notify_all()?
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>

using std::cout;

int main()
{
    std::thread thread_var;
    struct ThreadState {
        enum Type { Stopped, Started, Stopping };
    };
    ThreadState::Type current_thread_state = ThreadState::Stopped;
    std::mutex thread_mutex;
    std::condition_variable thread_event;

    while (true) {
        {
            std::unique_lock<std::mutex> lck(thread_mutex);
            thread_var = std::thread([&]() {
                {
                    std::unique_lock<std::mutex> lck(thread_mutex);
                    cout << "ThreadFunction() - step 1\n";
                    current_thread_state = ThreadState::Started;
                }
                thread_event.notify_all();

                // This code disables output to the console (simulates some work).
                cout.setstate(std::ios::failbit);
                cout << "ThreadFunction() - step 1 -> step 2\n";
                cout.clear();

                {
                    std::unique_lock<std::mutex> lck(thread_mutex);
                    cout << "ThreadFunction() - step 2\n";
                    current_thread_state = ThreadState::Stopping;
                }
                thread_event.notify_all();
            });
            while (current_thread_state != ThreadState::Started) {
                thread_event.wait(lck);
            }
        }
        if (thread_var.joinable()) {
            thread_var.join();
            current_thread_state = ThreadState::Stopped;
        }
    }
    return 0;
}
Once the worker calls the first notify_all, your main thread and your worker thread (continuing with step 2) both try to get a lock on the thread_mutex mutex. If the work load in between is insignificant, as in your example, the worker thread is likely to get the lock before the main thread and to advance the state to ThreadState::Stopping before the main thread ever observes ThreadState::Started. The main thread then wakes up, finds the state is not Started, and waits again; since no further notification will arrive, this results in a deadlock.
Try adding a significant work load, e.g.
std::this_thread::sleep_for( std::chrono::seconds( 1 ) );
to the worker thread. Deadlocks are far less likely now. Of course, this is not a fix for your problem; it just illustrates it.
You have two threads racing: one writes values of current_thread_state twice, the other reads the value of current_thread_state once.
It is indeterminate whether the sequence of events is write-write-read or, as you expect, write-read-write; both are valid executions of your application.
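A minimal fix in that direction (a sketch): make the main thread wait for "the worker has started", i.e. for the state to have left Stopped, so it no longer matters whether it observes Started or already Stopping:

while (current_thread_state == ThreadState::Stopped) {
    thread_event.wait(lck);
}

The worker only ever moves the state forward (Started, then Stopping) until main itself resets it to Stopped after the join, so this loop terminates on either notification.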