The simplified goal is to force three member functions to be called one after another from three different threads (thread A calls F::first, thread B calls F::second, and thread C calls F::third).
To enforce that order I used one condition variable and two bools indicating whether the first and second threads have finished their work.
In the code:
std::mutex mtx;
std::condition_variable cv;
bool firstPrinted = false;
bool secondPrinted = false;
class F {
public:
void first(std::function<void()> printFirst) {
std::unique_lock<std::mutex> lck(mtx);
std::cout << "first\n";
printFirst();
firstPrinted = true;
cv.notify_one();
}
void second(std::function<void()> printSecond) {
std::unique_lock<std::mutex> lck(mtx);
std::cout << "second\n";
cv.wait(lck, []() { return firstPrinted; });
printSecond();
secondPrinted = true;
cv.notify_one();
}
void third(std::function<void()> printThird) {
std::unique_lock<std::mutex> lck(mtx);
std::cout << "third\n";
cv.wait(lck, []() { return secondPrinted; });
printThird();
}
};
auto first = []() {
std::cout << "1";
};
auto second = []() {
std::cout << "2";
};
auto third = []() {
std::cout << "3";
};
F f;
std::thread A(&F::first, &f, first);
std::thread B(&F::second, &f, second);
std::thread C(&F::third, &f, third);
A.join(); B.join(); C.join();
Now let's consider this situation:
Thread A does not start first. Whether the first thread to start is B or C, both block (wait) until they get notified (B blocks until notified by A, and C blocks until notified by B).
The infinite wait (or perhaps deadlock?) appears when the first thread to start is C, which always yields this output:
third
second
first
...and stalling here
Theoretically, this should not happen: calling cv.wait in thread C unlocks the mutex, which allows thread B to run. B in turn also waits (because its condition has not become true yet) and therefore unlocks the mutex as well, allowing thread A to start. A should finally enter the critical section and notify B.
What is the call path that causes the program to stall?
What nuance did I miss?
Please correct me if my reasoning above is wrong.
std::condition_variable::notify_one() will wake one of the threads waiting on the condition_variable. If multiple threads are waiting, one will be picked. It will wake, reacquire the lock, and check its predicate. If that predicate is still false, it will return to its waiting state and the notification is in essence lost.
That is what is happening here when the thread running first is the last to execute. When it reaches its notify_one there will be two threads waiting on the condition_variable. If it notifies the thread running third, that thread's predicate will still return false. The thread will wake, fail its predicate test and return to waiting. Your process now has no running threads and is frozen.
The solution is to use std::condition_variable::notify_all(). This function wakes all waiting threads, which will, one at a time, relock the mutex and check their own predicate.
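A minimal sketch of that fix follows. The class name Ordered, the single stage_ counter (in place of the two bools) and the helper run_in_order are assumptions made for illustration, not the asker's code. The threads are deliberately started in the order C, B, A to exercise the case that stalled.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical variant of the question's class F: the state lives in the
// object and every state change is announced with notify_all().
class Ordered {
public:
    void first(const std::function<void()>& print) {
        std::unique_lock<std::mutex> lck(mtx_);
        print();
        stage_ = 1;
        cv_.notify_all();  // wake both waiters; each re-checks its own predicate
    }
    void second(const std::function<void()>& print) {
        std::unique_lock<std::mutex> lck(mtx_);
        cv_.wait(lck, [this] { return stage_ >= 1; });
        print();
        stage_ = 2;
        cv_.notify_all();
    }
    void third(const std::function<void()>& print) {
        std::unique_lock<std::mutex> lck(mtx_);
        cv_.wait(lck, [this] { return stage_ >= 2; });
        print();
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    int stage_ = 0;  // 0: nothing printed, 1: first done, 2: second done
};

// Starts the threads in the order C, B, A and returns what was printed.
std::string run_in_order() {
    Ordered f;
    std::string out;  // only modified while Ordered's mutex is held
    std::thread c([&] { f.third([&] { out += '3'; }); });
    std::thread b([&] { f.second([&] { out += '2'; }); });
    std::thread a([&] { f.first([&] { out += '1'; }); });
    a.join();
    b.join();
    c.join();
    assert(out.size() == 3);  // every stage ran exactly once
    return out;
}
```

With notify_all() the thread running third may still be woken too early, but it simply goes back to waiting while the thread running second proceeds, so no notification is wasted on the wrong waiter.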
I have two threads. One thread acts as a timer thread which at regular intervals of time needs to send a notification to another thread. I intend to use C++ condition variables. (There is a good article on how to use C++ condition variables along with its traps and pitfalls in the following link)
I have the following constraints/conditions:
The notifying thread need not lock a mutex
The notified (or the receiver) thread does some useful work but there is no critical section
The receiver thread is allowed to miss a notification if and only if it is doing useful work
There should be no spurious wakeups.
Using the above link as a guideline I put together the following piece of code
// conditionVariableAtomic.cpp
#include <atomic>
#include <chrono> // std::chrono::seconds
#include <condition_variable>
#include <iostream> // std::cout, std::endl
#include <mutex> // std::mutex, std::unique_lock
#include <thread> // std::thread, std::this_thread::sleep_for
std::mutex mutex_;
std::condition_variable condVar;
std::atomic<bool> dataReady{false};
void waitingForWork(){
int i = 0;
while (i++ < 10)
{
std::cout << "Waiting " << std::endl;
{
std::unique_lock<std::mutex> lck(mutex_);
condVar.wait(lck, []{ return dataReady.load(); }); // (1)
dataReady = false;
}
std::cout << "Running " << std::endl;
// Do useful work but no critical section.
}
}
void setDataReady(){
int i = 0;
while (i++ < 10)
{
std::this_thread::sleep_for (std::chrono::seconds(1));
dataReady = true;
std::cout << "Data prepared" << std::endl;
condVar.notify_one();
}
}
int main(){
std::cout << std::endl;
std::thread t1(waitingForWork);
std::thread t2(setDataReady);
t1.join();
t2.join();
std::cout << std::endl;
}
I use an atomic predicate to avoid spurious wakeups, but don't use a lock_guard in the notifying thread.
My question is:
Does the above piece of code satisfy the constraints/conditions listed above?
I understand that the receiver thread cannot avoid a mutex, hence the need to use std::unique_lock<std::mutex> lck(mutex_); in the receiver. I have however limited the scope of std::unique_lock<std::mutex> lck(mutex_); i.e. put the following section of code
std::unique_lock<std::mutex> lck(mutex_);
condVar.wait(lck, []{ return dataReady.load(); }); // (1)
dataReady = false;
inside a scope block aka { .... } so that the mutex is unlocked as soon as the wait condition is over (the receiver then does some useful work but since there is no critical section, it does not need to hold on to the mutex for the entire while loop). Could there still be consequences/side effects of this limited scoping in this context ? Or does the unique_lock<std::mutex> need to be locked for the entire while loop?
Your code has a race condition. Between checking the value of dataReady in your wait predicate and actually starting the wait, the other thread can set dataReady and call notify_one. In your example this isn't critical, as you'll just miss one notification and wake up a second later on the next one.
Another race condition is that you can set dataReady to true in one thread, set dataReady back to false in the other thread, and then call notify_one in the first thread. Again, this will cause the wait to block for longer than you intended.
You should hold the mutex in both threads when setting dataReady and using the condition variable to avoid these races.
You could avoid the second race condition by using an atomic counter instead of a boolean, incrementing it on one thread then decrementing on the other and in the predicate checking if it is non-zero.
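The two suggestions above can be combined in a small sketch. The names Signal, pending_ and run_signal_demo are hypothetical. Because the counter is only touched while the mutex is held, a plain int is used here instead of an atomic.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: counts pending notifications instead of storing a bool.
class Signal {
public:
    void notify() {
        {
            std::lock_guard<std::mutex> lck(mutex_);
            ++pending_;  // modified under the same mutex the waiter uses
        }
        cond_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lck(mutex_);
        cond_.wait(lck, [this] { return pending_ > 0; });
        --pending_;  // consume exactly one notification
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    int pending_ = 0;
};

// Sends `count` notifications and returns how many the receiver consumed.
int run_signal_demo(int count) {
    assert(count >= 0);
    Signal signal;
    int received = 0;  // written by the receiver, read after join()
    std::thread receiver([&] {
        for (int i = 0; i < count; ++i) {
            signal.wait();
            ++received;  // the "useful work" runs outside the lock
        }
    });
    std::thread sender([&] {
        for (int i = 0; i < count; ++i) signal.notify();
    });
    sender.join();
    receiver.join();
    return received;
}
```

The lock in wait() is released as soon as the function returns, so limiting its scope as in the question is fine. Note that this sketch never drops a notification; if missed notifications are acceptable while the receiver is busy, the counter could be reset to zero in wait() instead of decremented.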
This is a separate question but related to the previous question I asked here
I am using an std::thread in my C++ code to constantly poll for some data & add it to a buffer. I use a C++ lambda to start the thread like this:
StartMyThread() {
thread_running = true;
the_thread = std::thread { [this] {
while(thread_running) {
GetData();
}
}};
}
thread_running is an atomic<bool> declared in class header. Here is my GetData function:
GetData() {
//Some heavy logic
}
Next I also have a StopMyThread function where I set thread_running to false so that it exits out of the while loop in the lambda block.
StopMyThread() {
thread_running = false;
the_thread.join();
}
As I understand, I can pause & resume the thread using a std::condition_variable as pointed out here in my earlier question.
But is there a disadvantage if I just use the std::atomic<bool> thread_running to execute or not execute the logic in GetData() like below ?
GetData() {
if (thread_running == false)
return;
//Some heavy logic
}
Will this burn more CPU cycles compared to the approach of using an std::condition_variable as described here ?
A condition variable is useful when you want to conditionally halt another thread or not. So you might have an always-running "worker" thread that waits when it notices it has nothing to do.
The atomic solution requires your UI interaction synchronize with the worker thread, or very complex logic to do it asynchronously.
As a general rule, your UI response thread should never block on non-ready state from worker threads.
struct worker_thread {
worker_thread( std::function<void()> t, bool play = true ):
task(std::move(t)),
execute(play)
{
thread = std::async( std::launch::async, [this]{
work();
});
}
// move is not safe. If you need this movable,
// use unique_ptr<worker_thread>.
worker_thread(worker_thread&& )=delete;
~worker_thread() {
if (!exit) finalize();
wait();
}
void finalize() {
auto l = lock();
exit = true;
cv.notify_one();
}
void pause() {
auto l = lock();
execute = false;
}
void play() {
auto l = lock();
execute = true;
cv.notify_one();
}
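void wait() {
assert(exit); // standard assert from <cassert>; finalize() must have been called first
if (thread.valid()) // std::future has no conversion to bool
thread.get();
}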
private:
void work() {
while(true) {
bool done = false;
{
auto l = lock();
cv.wait( l, [&]{
return exit || execute;
});
done = exit; // have lock here
}
if (done) break;
task();
}
}
std::unique_lock<std::mutex> lock() {
return std::unique_lock<std::mutex>(m);
}
std::mutex m;
std::condition_variable cv;
bool exit = false;
bool execute = true;
std::function<void()> task;
std::future<void> thread;
};
or somesuch.
This owns a thread. The thread repeatedly runs task so long as it is in play() mode. If you pause(), then the next time task() finishes the worker thread stops. If you play() again before that task() call finishes, the worker never notices the pause().
The only wait is on destruction of worker_thread, where it automatically informs the worker thread it should exit and it waits for it to finish.
You can manually .wait() or .finalize() as well. .finalize() is async, but if your app is shutting down you can call it early and give the worker thread more time to clean up while the main thread cleans things up elsewhere.
.finalize() cannot be reversed.
Code not tested.
Unless I'm missing something, you already answered this in your original question: You'll be creating and destroying the worker thread each time it's needed. This may or may not be an issue in your actual application.
There are two different problems being solved and it may depend on what you're actually doing. One problem is "I want my thread to run until I tell it to stop." The other seems to be a case of "I have a producer/consumer pair and want to be able to notify the consumer when data is ready." The thread_running and join method works well for the first of those. For the second, you may want to use a mutex and condition variable because you're doing more than just using the state to trigger work. Suppose you have a vector<Work>. You guard that with the mutex, so the condition becomes [&work] (){ return !work.empty(); } or something similar. When the wait returns, you hold the mutex so you can take things out of work and do them. When you're done, you go back to wait, releasing the mutex so the producer can add things to the queue.
You may want to combine these techniques. Have a "done processing" atomic that all of your threads periodically check to know when to exit so that you can join them. Use the condition to cover the case of data delivery between threads.
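A rough sketch of that combination, under some assumptions: the work items are plain ints, the names Consumer and sum_via_consumer are made up for illustration, and the "done" flag is a plain bool guarded by the same mutex as the queue rather than an atomic, so that a stop request cannot slip in between the predicate check and the wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical consumer: sums integers delivered through a guarded vector.
class Consumer {
public:
    Consumer() : thread_([this] { run(); }) {}
    ~Consumer() { stop(); }

    void push(int item) {
        {
            std::lock_guard<std::mutex> lck(mutex_);
            assert(!done_);  // pushing after stop() is a usage error
            work_.push_back(item);
        }
        cond_.notify_one();
    }

    // Requests exit, waits for the thread, and returns the sum of all items.
    long stop() {
        {
            std::lock_guard<std::mutex> lck(mutex_);
            done_ = true;
        }
        cond_.notify_one();
        if (thread_.joinable()) thread_.join();
        return sum_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lck(mutex_);
        while (true) {
            cond_.wait(lck, [this] { return done_ || !work_.empty(); });
            if (work_.empty()) break;  // only reachable once done_ is set
            std::vector<int> batch;
            batch.swap(work_);
            lck.unlock();  // process without blocking the producer
            for (int item : batch) sum_ += item;
            lck.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::vector<int> work_;
    bool done_ = false;
    long sum_ = 0;        // written by the worker, read after join()
    std::thread thread_;  // declared last so the other members exist first
};

long sum_via_consumer(const std::vector<int>& items) {
    Consumer consumer;
    for (int item : items) consumer.push(item);
    return consumer.stop();
}
```

The consumer drains any remaining work before exiting, so stop() doubles as a flush. Whether that is desirable depends on the application.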
The real code is way more complex but I think I managed to make an MCVE.
I'm trying to do the following:
Have some threads do work
Put them ALL into a pause state
Wake up the first of them, wait for it to finish, then wake up the second one, wait for it to finish, wake up the third one.. etc..
The code I'm using is the following and it seems to work
std::atomic_int which_thread_to_wake_up;
std::atomic_int threads_asleep;
threads_asleep.store(0);
std::atomic_bool ALL_THREADS_READY;
ALL_THREADS_READY.store(false);
int threads_num = .. // Number of threads
bool thread_has_finished = false;
std::mutex mtx;
std::condition_variable cv;
std::mutex mtx2;
std::condition_variable cv2;
auto threadFunction = [](int my_index) {
// some heavy workload here..
....
{
std::unique_lock<std::mutex> lck(mtx);
++threads_asleep;
cv.notify_all(); // Wake up any other thread that might be waiting
}
std::unique_lock<std::mutex> lck(mtx);
bool all_ready = ALL_THREADS_READY.load();
size_t index = which_thread_to_wake_up.load();
cv.wait(lck, [&]() {
all_ready = ALL_THREADS_READY.load();
index = which_thread_to_wake_up.load();
return all_ready && my_index == index;
});
// This thread was awaken for work!
.. do some more work that requires synchronization..
std::unique_lock<std::mutex> lck2(mtx2);
thread_has_finished = true;
cv2.notify_one(); // Signal to the main thread that I'm done
};
// launch all the threads..
std::vector<std::thread> ALL_THREADS;
for (int i = 0; i < threads_num; ++i)
ALL_THREADS.emplace_back(threadFunction, i);
// Now the main thread needs to wait for ALL the threads to finish their first phase and go to sleep
std::unique_lock<std::mutex> lck(mtx);
size_t how_many_threads_are_asleep = threads_asleep.load();
while (how_many_threads_are_asleep < threads_num) {
cv.wait(lck, [&]() {
how_many_threads_are_asleep = threads_asleep.load();
return how_many_threads_are_asleep == threads_num;
});
}
// At this point I'm sure ALL THREADS ARE ASLEEP!
// Wake them up one by one (there should only be ONE awake at any time before it finishes his computation)
for (int i = 0; i < threads_num; i++)
{
which_thread_to_wake_up.store(i);
cv.notify_all(); // (*) Wake them all up to check if they're the chosen one
std::unique_lock<std::mutex> lck2(mtx2);
cv2.wait(lck, [&]() { return thread_has_finished; }); // Wait for the chosen one to finish
thread_has_finished = false;
}
I'm afraid that the last notify_all() call (the one I marked with (*)) might cause the following situation:
all threads are asleep
all threads are awakened by the main thread calling notify_all()
the thread which has the right index finishes the last computation and releases the lock
ALL THE OTHER THREADS HAVE BEEN AWAKENED BUT THEY HAVEN'T CHECKED THE ATOMIC VARIABLES YET
the main thread issues a second notify_all() and THIS GETS LOST (since the threads are ALL already awake, they simply haven't checked the atomics yet)
Could this ever happen? I couldn't find any wording on whether notify_all() calls are somehow buffered, or on how they are ordered relative to the threads that actually check the condition.
As per the docs on notify_all:
notify_all is only one half of the requirements to continue a thread. The condition statement has to be true as well. So there has to be a traffic cop designed to wake up the first, wake up the second, wake up the third. The notify function tells the thread to check that condition.
My answer is more high level than code specific but I hope that helps.
The situation you describe can happen. If your worker threads (slaves) are already awake when notify_all() is invoked, they will probably miss that signal.
One way to prevent this is to lock mtx before calling cv.notify_all() and unlock it afterward. As suggested in the documentation of wait(), the lock guards access to pred(). While the master thread holds mtx, no other thread can be checking its condition at the same moment. The workers may be doing other jobs at that time, but in your code they are not likely to enter wait again.
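A small sketch of that idea, assuming a simplified setup that is not the asker's code: one mutex and one condition variable are shared by everyone, and the hypothetical variables turn and finished replace the atomics. Because turn is only changed while mtx is held, a worker either sees the new value when it evaluates its predicate or is already blocked in wait() and receives the notification.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Wakes `n` workers strictly one after another; returns the order they ran in.
std::vector<int> run_turns(int n) {
    assert(n >= 0);
    std::mutex mtx;
    std::condition_variable cv;
    int turn = -1;           // index of the worker allowed to run; guarded by mtx
    int finished = 0;        // number of workers that completed; guarded by mtx
    std::vector<int> order;  // guarded by mtx

    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&, i] {
            std::unique_lock<std::mutex> lck(mtx);
            cv.wait(lck, [&] { return turn == i; });
            order.push_back(i);  // the part of the work that needs synchronization
            ++finished;
            cv.notify_all();     // tell the main thread this turn is over
        });
    }

    for (int i = 0; i < n; ++i) {
        std::unique_lock<std::mutex> lck(mtx);
        turn = i;                // changed while holding mtx
        cv.notify_all();         // notified while holding mtx
        cv.wait(lck, [&] { return finished == i + 1; });
    }

    for (auto& t : workers) t.join();
    return order;
}
```

Workers that are woken when it is not their turn just re-check the predicate and go back to sleep, which is the cost of using notify_all() with a shared condition variable.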
Consider the following code:
int main() {
bool done = false;
condition_variable con;
mutex m;
thread producer([&]() {
this_thread::sleep_for(chrono::seconds(10));
done = true;
//con.notify_one();
});
thread consumer([&]() {
/*unique_lock<mutex> lock(m);
while (!done) {
con.wait(lock);
}*/
while (!done);
cout << "now my turn..."<<endl;
});
producer.join();
consumer.join();
}
If I uncomment the code in the 2 threads, I will use the condition_variable. So the consumer thread will look like this:
thread consumer([&]() {
unique_lock<mutex> lock(m);
while (!done) {
con.wait(lock);
}
// while (!done); <-this is equivalent of the above
cout << "now my turn..."<<endl;
});
It seems that I can achieve the same thing with/without condition_variable.
So my question is: why do we need condition_variable if a notifying variable ('done' variable in this case) has been used already? What is the benefit of using it? Can I do something that a notifying variable cannot do?
When waiting on a condition variable the thread is blocked (i.e. not executing). When notified the thread is put in the ready state so the OS can schedule it.
This is more efficient than the thread "busy-waiting", which is polling a variable constantly to check that it can continue. In that case the thread is using up CPU cycles that could be used for actual work instead.
Also, the mutex that accompanies the condition variable is what correctly protects the critical section from being accessed by multiple threads at a time. You might have 3 consumers running but only one is allowed to work at a time (the others might be doing something else until then).
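For comparison, here is a sketch of the blocking version with the flag written under the mutex. The function name run_blocking_consumer and the delay parameter are assumptions for illustration. Note also that spinning on a plain bool, as in the question's while (!done);, is a data race; a busy-wait would need std::atomic<bool> to be well-defined.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Returns the consumer's message once the producer has signalled completion.
std::string run_blocking_consumer(std::chrono::milliseconds delay) {
    assert(delay.count() >= 0);
    bool done = false;  // guarded by m
    std::condition_variable con;
    std::mutex m;
    std::string message;  // written by the consumer, read after join()

    std::thread producer([&] {
        std::this_thread::sleep_for(delay);
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        con.notify_one();
    });
    std::thread consumer([&] {
        std::unique_lock<std::mutex> lock(m);
        con.wait(lock, [&] { return done; });  // blocked here, not spinning
        message = "now my turn...";
    });

    producer.join();
    consumer.join();
    return message;
}
```

While the consumer is inside wait() it consumes no CPU time, whereas the spinning version keeps a core busy for the whole delay.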
I am having an issue with terminating worker threads from the main thread. So far each method I tried leads to either a race condition or a deadlock.
The worker threads are stored in an inner class inside a class called ThreadPool. ThreadPool maintains a vector of these WorkerThreads using unique_ptr.
Here is the header for my ThreadPool:
class ThreadPool
{
public:
typedef void (*pFunc)(const wpath&, const Args&, Global::mFile_t&, std::mutex&, std::mutex&); // function to point to
private:
class WorkerThread
{
private:
ThreadPool* const _thisPool; // reference enclosing class
// pointers to arguments
wpath _pPath; // member argument that is modifiable by the running thread
Args * _pArgs;
Global::mFile_t * _pMap;
// flags for thread management
bool _terminate; // terminate thread
bool _busy; // is thread busy?
bool _isRunning;
// thread management members
std::mutex _threadMtx;
std::condition_variable _threadCond;
std::thread _thisThread;
// exception ptr
std::exception_ptr _ex;
// private copy constructor
WorkerThread(const WorkerThread&): _thisPool(nullptr) {}
public:
WorkerThread(ThreadPool&, Args&, Global::mFile_t&);
~WorkerThread();
void setPath(const wpath); // sets a new task
void terminate(); // calls terminate on thread
bool busy() const; // returns whether thread is busy doing task
bool isRunning() const; // returns whether thread is still running
void join(); // thread join wrapper
std::exception_ptr exception() const;
// actual worker thread running tasks
void thisWorkerThread();
};
// thread specific information
DWORD _numProcs; // number of processors on system
unsigned _numThreads; // number of viable threads
std::vector<std::unique_ptr<WorkerThread>> _vThreads; // stores thread pointers - workaround for no move constructor in WorkerThread
pFunc _task; // the task threads will call
// synchronization members
unsigned _barrierLimit; // limit before barrier goes down
std::mutex _barrierMtx; // mutex for barrier
std::condition_variable _barrierCond; // condition for barrier
std::mutex _coutMtx;
public:
// argument mutex
std::mutex matchesMap_mtx;
std::mutex coutMatch_mtx;
ThreadPool(pFunc f);
// wake a thread and pass it a new parameter to work on
void callThread(const wpath&);
// barrier synchronization
void synchronizeStartingThreads();
// starts and synchronizes all threads in a sleep state
void startThreads(Args&, Global::mFile_t&);
// terminate threads
void terminateThreads();
private:
};
So far the real issue I am having is that calling terminateThreads() from the main thread causes a deadlock or a race condition.
When I set my _terminate flag to true, there is a chance that main will already have exited scope and destroyed all mutexes before the thread has had a chance to wake up and terminate. In fact I have gotten this crash quite a few times (console window displays: mutex destroyed while busy)
If I add a thread.join() after I notify_all() the thread, there is a chance the thread will terminate before the join occurs, causing an infinite deadlock, as joining a terminated thread suspends the program indefinitely.
If I detach, I get the same issue as above, but it causes a program crash.
If I instead use a while(WorkerThread.isRunning()) Sleep(0); the program may crash because the main thread may exit before the WorkerThread reaches that last closing brace.
I am not sure what else to do to halt the main thread until all worker threads have terminated safely. Also, even with try-catch in the thread and in main, no exceptions are being caught (everything I have tried leads to a program crash).
What can I do to halt the main thread until worker threads have finished?
Here are the implementations of the primary functions:
Terminate Individual worker thread
void ThreadPool::WorkerThread::terminate()
{
_terminate = true;
_threadCond.notify_all();
_thisThread.join();
}
The actual ThreadLoop
void ThreadPool::WorkerThread::thisWorkerThread()
{
_thisPool->synchronizeStartingThreads();
try
{
while (!_terminate)
{
{
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Sleeping..." << std::endl;
_thisPool->_coutMtx.unlock();
_busy = false;
std::unique_lock<std::mutex> lock(_threadMtx);
_threadCond.wait(lock);
}
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Awake..." << std::endl;
_thisPool->_coutMtx.unlock();
if(_terminate)
break;
_thisPool->_task(_pPath, *_pArgs, *_pMap, _thisPool->coutMatch_mtx, _thisPool->matchesMap_mtx);
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Finished Task..." << std::endl;
_thisPool->_coutMtx.unlock();
}
_thisPool->_coutMtx.lock();
std::cout << std::this_thread::get_id() << " Terminating" << std::endl;
_thisPool->_coutMtx.unlock();
}
catch (const std::exception&)
{
_ex = std::current_exception();
}
_isRunning = false;
}
Terminate All Worker Threads
void ThreadPool::terminateThreads()
{
for (std::vector<std::unique_ptr<WorkerThread>>::iterator it = _vThreads.begin(); it != _vThreads.end(); ++it)
{
it->get()->terminate();
//it->get()->_thisThread.detach();
// if thread threw an exception, rethrow it in main
if (it->get()->exception() != nullptr)
std::rethrow_exception(it->get()->exception());
}
}
and lastly, the function that is calling the thread pool (the scan function is running on main)
// scans a path recursively for all files of selected extension type, calls thread to parse file
unsigned int Functions::Scan(wpath path, const Args& args, ThreadPool& pool)
{
wrecursive_directory_iterator d(path), e;
unsigned int filesFound = 0;
while ( d != e )
{
if (args.verbose())
std::wcout << L"Grepping: " << d->path().string() << std::endl;
for (Args::ext_T::const_iterator it = args.extension().cbegin(); it != args.extension().cend(); ++it)
{
if (extension(d->path()) == *it)
{
++filesFound;
pool.callThread(d->path());
}
}
++d;
}
std::cout << "Scan Function: Calling TerminateThreads() " << std::endl;
pool.terminateThreads();
std::cout << "Scan Function: Called TerminateThreads() " << std::endl;
return filesFound;
}
I'll repeat the question: What can I do to halt the main thread until worker threads have finished?
I don't get the issue with thread termination and join.
Joining threads is all about waiting until the given thread has terminated, so it's exactly what you want to do. If the thread has finished execution already, join will just return immediately.
So you'll just want to join each thread during the terminate call as you already do in your code.
Note: currently you immediately rethrow any exception if a thread you just terminated has an active exception_ptr. That might lead to unjoined threads. You'll have to keep that in mind when handling those exceptions.
Update: after looking at your code, I see a potential bug: std::condition_variable::wait() can return when a spurious wakeup occurs. If that is the case, you will work again on the path that was worked on the last time, leading to wrong results. You should have a flag for new work that is set if new work has been added, and that _threadCond.wait(lock) line should be in a loop that checks for the flag and _terminate. Not sure if that one will fix your problem, though.
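A sketch of what that could look like, reduced to the wait/terminate logic. The class Worker, the flags hasWork_ and terminate_, and the helper run_tasks are hypothetical stand-ins, not the asker's WorkerThread. Both flags are only changed while the mutex is held, and the wait uses a predicate, so spurious wakeups and early notifications are both harmless.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for WorkerThread, reduced to the wait/terminate logic.
class Worker {
public:
    Worker() : thread_([this] { loop(); }) {}
    ~Worker() { terminate(); }

    // Hands over one task; blocks while a previous task is still pending.
    void submit(std::function<void()> task) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !hasWork_; });
        assert(!terminate_);  // submitting after terminate() is a usage error
        task_ = std::move(task);
        hasWork_ = true;
        cond_.notify_all();
    }

    void terminate() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            terminate_ = true;  // set under the mutex so the worker cannot miss it
        }
        cond_.notify_all();
        if (thread_.joinable()) thread_.join();  // returns at once if already finished
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            cond_.wait(lock, [this] { return hasWork_ || terminate_; });
            if (!hasWork_) break;  // terminate requested and nothing pending
            std::function<void()> task = std::move(task_);
            lock.unlock();
            task();  // run the task without holding the mutex
            lock.lock();
            hasWork_ = false;
            cond_.notify_all();  // lets a blocked submit() continue
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::function<void()> task_;
    bool hasWork_ = false;
    bool terminate_ = false;
    std::thread thread_;  // declared last so the other members exist first
};

// Runs `count` trivial tasks on one worker and returns how many executed.
int run_tasks(int count) {
    int executed = 0;  // written only by the worker thread, read after join()
    Worker worker;
    for (int i = 0; i < count; ++i) worker.submit([&executed] { ++executed; });
    worker.terminate();
    return executed;
}
```

Because the flag tells the worker whether there is new work, a wakeup without new work never re-runs the previous task.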
The problem was twofold:
synchronizeStartingThreads() would sometimes leave 1 or 2 threads blocked, waiting for the okay to go ahead (a problem in the while (some_condition) barrierCond.wait(lock) loop). The condition would sometimes never evaluate to true. Removing the while loop fixed this blocking issue.
The second issue was the potential for a worker thread to lock _threadMtx while notify_all was called just before it entered _threadCond.wait(). Since the notification had already been sent, the thread would wait forever.
i.e.
{
// terminate() is called
std::unique_lock<std::mutex> lock(_threadMtx);
// _threadCond.notify_all() is called here
_busy = false;
_threadCond.wait(lock);
// thread is blocked forever
}
Surprisingly, locking this mutex in terminate() did not stop this from happening.
This was solved by adding a timeout of 30ms to the _threadCond.wait().
Also, a check was added before the starting of task to make sure the same task wasn't being processed again.
The new code now looks like this:
thisWorkerThread
_threadCond.wait_for(lock, std::chrono::milliseconds(30)); // wait at most 30ms (the lock is released while waiting)
// after the wait, and the termination check
if(_busy)
{
Global::mFile_t rMap = _thisPool->_task(_pPath, *_pArgs, _thisPool->coutMatch_mtx);
_workerMap.element.insert(rMap.element.begin(), rMap.element.end());
}
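As an aside, the timeout works by polling. A possible alternative, sketched here under the assumption that the flag is written under the same mutex the waiter holds and that the wait uses a predicate, is that a notification sent before the wait starts is then not lost. The function name below is made up for the demonstration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true if a waiter that starts after the notification still proceeds.
bool notify_before_wait_is_not_lost() {
    std::mutex mtx;
    std::condition_variable cond;
    bool terminate = false;  // guarded by mtx
    bool woke = false;       // written by the waiter, read after join()

    {
        std::lock_guard<std::mutex> lock(mtx);
        terminate = true;  // state change recorded under the mutex
    }
    cond.notify_all();  // nobody is waiting yet, so this wakes no one

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(mtx);
        // The predicate is evaluated before blocking, so the earlier write is seen.
        cond.wait(lock, [&] { return terminate; });
        assert(lock.owns_lock());  // wait() returns with the mutex re-acquired
        woke = true;
    });
    waiter.join();
    return woke;
}
```

If the waiter had used the predicate-less wait(lock), this function would hang, which matches the behaviour described above.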