Worker thread permanently hibernates after executing too fast - C++

I am trying to incorporate threads into my project, but I have a problem where even a single worker thread makes the program "fall asleep" permanently. Perhaps I have a race condition, but I just can't spot it.
My PeriodicThreads object maintains a collection of threads. Once PeriodicThreads::exec_threads() has been invoked, the threads are notified, wake up and perform their task. Afterwards, they fall back to sleep.
The function of such a worker thread:
void PeriodicThreads::threadWork(size_t threadId){
    // not really used, but needed to declare one in order to use a condition_variable:
    std::mutex mutex;
    std::unique_lock<std::mutex> lck(mutex);
    while (true){
        // wait until told to start working on a task:
        while (_thread_shouldWork[threadId] == false){
            _threads_startSignal.wait(lck);
        }
        thread_iteration(threadId); // virtual function
        _thread_shouldWork[threadId] = false; // vector of flags
        _thread_doneSignal.notify_all();
    } // end while(true) - run until terminated externally or this whole obj is deleted
}
As you can see, each thread monitors its own entry in a vector of flags, and once it sees that its flag is true, it performs the task and then resets the flag.
Here is the function that can awaken all the threads:
std::atomic_bool _threadsWorking = false;

// blocks the current thread until all worker threads have completed:
void PeriodicThreads::exec_threads(){
    if (_threadsWorking){
        throw std::runtime_error("you requested exec_threads(), but threads haven't yet finished executing the previous task!");
    }
    _threadsWorking = true; // NOTICE: doing this after the exception check.

    // tell all threads to unpause by setting their flags to 'true'
    std::fill(_thread_shouldWork.begin(), _thread_shouldWork.end(), true);
    _threads_startSignal.notify_all();

    // wait for threads to complete:
    std::mutex mutex;
    std::unique_lock<std::mutex> lck(mutex); // lock & mutex are not really used.
    auto isContinueWaiting = [&]() -> bool {
        bool threadsWorking = false;
        for (size_t i = 0; i < _thread_shouldWork.size(); ++i){
            threadsWorking |= _thread_shouldWork[i];
        }
        return threadsWorking;
    };
    while (isContinueWaiting()){
        _thread_doneSignal.wait(lck);
    }
    _threadsWorking = false; // set atomic to false
}
Invoking exec_threads() works fine for several hundred, or in rare cases several thousand, consecutive iterations. The invocations occur from the main thread's while loop. The worker thread processes the task, resets its flag and goes back to sleep until the next exec_threads(), and so on.
However, some time after that, the program snaps into a "hibernation": it seems to pause, but doesn't crash.
During such a "hibernation", putting a breakpoint at any of the while loops around my condition_variables never actually causes that breakpoint to trigger.
Being sneaky, I've created my own verify-thread (parallel to main) to monitor my PeriodicThreads object. As it falls into hibernation, my verify-thread keeps printing to the console that no threads are currently running (the _threadsWorking atomic of PeriodicThreads is permanently stuck at false). In other test runs, however, the atomic remains stuck at true once the "hibernation issue" begins.
The strange thing is that if I force the worker thread in PeriodicThreads::threadWork to sleep for at least 10 microseconds before resetting its flag, things work as normal and no "hibernation" occurs. Otherwise, if the thread is allowed to complete its task very quickly, the whole issue can appear.
I've wrapped each condition_variable wait inside a while loop to guard against spurious wakeups and against the situation where notify_all is called before .wait().
Notice that this occurs even when I have only 1 worker thread.
What could be the cause?
Edit
Abandoning these vector flags and just testing on a single atomic_bool with 1 worker thread still shows the same issue.

All shared data should be protected by a mutex. The mutex should have (at least) the same scope as the shared data.
Your _thread_shouldWork container is shared data. You can make a global array of mutexes and each one can protect its own _thread_shouldWork element. (see note below). You should also have at least as many condition variables as you have mutexes. (You can use 1 mutex with several different condition variables, but you should not use several different mutexes with 1 condition variable.)
A condition_variable should protect an actual condition (in this case, the state of an individual element of _thread_shouldWork at any given point) and the mutex is used to protect the variables that encompass that condition.
If you're just using a random local mutex (as you are in your thread code) or just not using a mutex at all (in the main code), then all bets are off. It's undefined behavior. Although I could see it working (by luck) most of the time. What I suspect is happening is that a worker thread is missing the signal from the main thread. It could also be that your main thread is missing the signal from a worker thread. (Thread A reads the state and enters the while loop, then Thread B changes the state and sends the notification, then Thread A goes to sleep... waiting for a notification that was already sent)
Mutexes with local scope are a red flag!
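To illustrate, here is a minimal sketch of the corrected pattern: one class-scope mutex guards the flags and serves both condition variables. The _mutex member and the char element type (sidestepping std::vector<bool>'s proxy references) are my assumptions, not the asker's actual code:

    // Assumed members (sketch):
    // std::mutex _mutex;                    // class scope: guards _thread_shouldWork
    // std::vector<char> _thread_shouldWork; // char, not bool, to get real elements
    // needs <algorithm> for std::fill / std::none_of

    void PeriodicThreads::threadWork(size_t threadId){
        std::unique_lock<std::mutex> lck(_mutex);  // the SAME mutex everywhere
        while (true){
            // predicate is re-checked under the lock: no missed or spurious wakeups
            _threads_startSignal.wait(lck, [&]{ return _thread_shouldWork[threadId] != 0; });
            lck.unlock();
            thread_iteration(threadId);            // do the work without holding the lock
            lck.lock();
            _thread_shouldWork[threadId] = 0;      // modify shared state under the lock
            _thread_doneSignal.notify_all();
        }
    }

    void PeriodicThreads::exec_threads(){
        {
            std::lock_guard<std::mutex> lck(_mutex);
            std::fill(_thread_shouldWork.begin(), _thread_shouldWork.end(), 1);
        }
        _threads_startSignal.notify_all();
        std::unique_lock<std::mutex> lck(_mutex);
        _thread_doneSignal.wait(lck, [&]{
            // done when every flag has been cleared again
            return std::none_of(_thread_shouldWork.begin(), _thread_shouldWork.end(),
                                [](char f){ return f != 0; });
        });
    }

With the predicate overload of wait, checking the flag and going to sleep happen atomically with respect to the mutex, so the missed-signal scenario described above can no longer occur.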
Note: If you're using a vector, you have to watch out because adding or removing items can trigger a resize which will touch elements without grabbing the mutex first (because of course the vector doesn't know about your mutex).
You also have to watch out for false sharing when using arrays, since neighbouring elements can end up on the same cache line.
Edit: Here's a video that @Kari found useful for explaining false sharing:
https://www.youtube.com/watch?v=dznxqe1Uk3E

Related

Stop thread from re-acquiring mutex after releasing it

I am making my own mutex to synchronize my threads and I am having the following issue:
The same thread seems to re-acquire the mutex right after it releases it
What I have tried:
Telling it to yield execution to another thread (SwitchToThread, Sleep, YieldProcessor)
Increasing delay between loops (Up to 1 second)
Here is how it works:
I have a structure with a state value:
volatile unsigned int state;
When I want to acquire the mutex, I check the state until it has been released (open), then acquire (close) it and break out of the infinite loop and do whatever needs to be done:
unsigned int previous = 0;
for (;;)
{
    previous = InterlockedExchangeAdd(&mtx->state, 0);
    if (STATE_OPEN == previous)
    {
        InterlockedExchange(&mtx->state, STATE_CLOSED);
        break;
    }
    Sleep(delay);
}
Then I simply release it for the next thread to acquire it:
InterlockedExchange(&mtx->state, STATE_OPEN);
The way I am using it: I simply have one global volatile integer that I add 1 to in one thread and subtract 1 from in another. Increasing the delay has helped keep the number from drifting very low or very high while the loop gets stuck executing in just a single thread, but a 1+ second delay is not going to work for my other purposes.
How could I go about making sure that all of the threads get a chance to acquire the mutex and not have it get stuck in a single thread?
The mutex does exactly what it is supposed to do: it prevents multiple threads from running at the same time.
To stop a thread from re-acquiring the mutex, the basic solution is to not access the shared resource which is protected by the mutex. The thread probably should be doing something else.
You may also have a design problem. If you have multiple resources protected by a single mutex, you may have false contention between threads. If each resource had its own mutex, multiple threads could each work on their own resource.
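If what you want is fairness, here is a minimal sketch of a FIFO "ticket lock" in standard C++ (not your Interlocked-based code): each thread takes a ticket, and the lock is granted strictly in ticket order, so a thread that just released the lock cannot barge back in ahead of threads already waiting:

    #include <atomic>
    #include <thread>

    // Minimal sketch of a fair, first-come-first-served spinlock.
    class TicketLock {
        std::atomic<unsigned> next_ticket{0};
        std::atomic<unsigned> now_serving{0};
    public:
        void lock() {
            // take a ticket; ordering comes from the now_serving handoff below
            const unsigned my_ticket = next_ticket.fetch_add(1, std::memory_order_relaxed);
            while (now_serving.load(std::memory_order_acquire) != my_ticket)
                std::this_thread::yield();  // spin politely until it's our turn
        }
        void unlock() {
            // hand the lock to the next ticket in line
            now_serving.fetch_add(1, std::memory_order_release);
        }
    };

In practice, though, a plain std::mutex plus doing less work inside the critical section is usually the better answer.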

What happens to a thread, waiting on condition variable, that is getting joined?

I've got a class named TThreadpool, which holds a member pool of type std::vector<std::thread>, with the following destructor:
~TThreadpool() {
    for (size_t i = 0; i < pool.size(); i++) {
        assert(pool[i].joinable());
        pool[i].join();
    }
}
I'm confident that when the destructor is called, all of the threads are waiting on a single condition variable (spurious wakeups are controlled with an always-false predicate), and joinable() outputs true.
A reduced example of a running thread would be:
void my_thread() {
    std::unique_lock<std::mutex> lg(mutex);
    while (true) {
        my_cond_variable.wait(lg, [] {
            return false;
        });
        // do some work and possibly break, but never gets farther than wait,
        // so this probably should not matter
    }
}
To check what threads are running, I'm launching top -H. At the start of the program, there are pool.size() threads + 1 thread where TThreadpool itself lives. And to my surprise, joining these alive threads does not remove them from list of threads that top is giving. Is this expected behaviour?
(Originally, my program was a bit different - I made a simple ui application using qt, that used threadpool running in ui thread and other threads controlled by threadpool, and on closing the ui window joining of threads had been called, but QtCreator said my application still worked after I closed the window, requiring me to shut it down with a crash. That made me check state of my threads, and it turned out it had nothing to do with qt. Although I'm adding this in case I missed some obvious detail with qt).
A bit later, I tried printing joinable() instead of asserting it, and found out the loop inside the TThreadpool destructor never moved past the first join - behaviour I did not expect and cannot explain.
join() doesn't do anything to the child thread -- all it does is block until the child thread has exited. It only has an effect on the calling thread (i.e. by blocking its progress). The child thread can keep running for as long as it wants (although typically you'd prefer it to exit quickly, so that the thread calling join() doesn't get blocked for a long time -- but that's up to you to implement)
And to my surprise, joining these alive threads does not remove them from list of threads that top is giving. Is this expected behaviour?
That suggests the thread(s) are still running. Calling join() on a thread doesn't have any impact on that running thread; it simply makes the calling thread wait for the called-on thread to exit.
found out the loop inside Threadpool destructor never moved further than first join
That means the first thread hasn't completed yet. So none of the other threads have been joined yet either (even if they have exited).
However, if the thread function is implemented correctly, the first thread (and all the other threads in the pool) should eventually complete and the join() calls should return (assuming the threads in the pool are supposed to exit - this doesn't need to be true in general; depending on the application, you could make the threads run forever too).
So it appears there's some sort of deadlock, or a wait on some resource, that's holding up one or more threads. You need to run it through a debugger; Helgrind would be very useful.
You could also try reducing the number of threads (say, to 2) to see if the problem becomes reproducible/obvious, and then increase the count again.
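For illustration, here is a minimal sketch of a shutdown path that would let those join() calls return: the destructor flips a stop flag under the mutex and notifies before joining, and the wait predicate observes the flag instead of always returning false (the stop flag is my addition; the other names mirror the question):

    bool stop = false;  // guarded by `mutex`, read by the wait predicate

    void my_thread() {
        std::unique_lock<std::mutex> lg(mutex);
        while (true) {
            my_cond_variable.wait(lg, [] { return stop; });  // no longer always false
            if (stop) break;  // leave the loop so the thread can actually finish
            // ... do some work ...
        }
    }

    ~TThreadpool() {
        {
            std::lock_guard<std::mutex> lg(mutex);
            stop = true;                    // change the state under the mutex...
        }
        my_cond_variable.notify_all();      // ...then wake every waiting thread
        for (auto& t : pool) {
            t.join();                       // now each join() can return
        }
    }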

Correct usage of std::condition_variable to trigger timed execution

I am trying to execute a piece of code in fixed time intervals. I have something based on naked pthread and now I want to do the same using std::thread.
#include <thread>
#include <mutex>
#include <condition_variable>
#include <chrono>
#include <iostream>

bool running;
std::mutex mutex;
std::condition_variable cond;

void timer(){
    while (running) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1000));
        std::lock_guard<std::mutex> guard(mutex);
        cond.notify_one();
    }
    cond.notify_one();
}

void worker(){
    while (running){
        std::unique_lock<std::mutex> mlock(mutex);
        cond.wait(mlock);
        std::cout << "Hello World" << std::endl;
        //... do something that takes a variable amount of time ...//
    }
}

int main(){
    running = true;
    auto t_work = std::thread(worker);
    auto t_time = std::thread(timer);
    std::this_thread::sleep_for(std::chrono::milliseconds(10000));
    running = false;
    t_time.join();
    t_work.join();
}
The worker in reality does something that takes a variable amount of time, but it should be scheduled at fixed intervals. It seems to work, but I am pretty new to this, so some things aren't clear to me...
Why do I need a mutex at all? I do not really use a condition, but whenever the timer sends a signal, the worker should do its job.
Does the timer really need to call cond.notify_one() again after the loop? This was taken from the older code and iirc the reasoning is to prevent the worker to wait forever, in case the timer finishes while the worker is still waiting.
Do I need the running flag, or is there a nicer way to break out of the loops?
PS: I know that there are other ways to ensure a fixed time interval, and I know that there are some problems with my current approach (eg if worker needs more time than the interval used by the timer). However, I would like to first understand that piece of code, before changing it too much.
Why do I need a mutex at all? I do not really use a condition, but whenever the timer sends a signal, the worker should do its job.
The reason you need a mutex is that a thread waiting on the condition variable can be subject to spurious wakeups. To make sure your thread actually received a notification for a correctly satisfied condition, you need to check the condition itself, preferably with a lambda passed to the wait call. And to guarantee that the condition is not modified between the wakeup and your check of it, you need to hold a mutex, so that your thread is the only one that can observe or modify the condition at that moment. In your case, that means you need to add a means for the worker thread to actually verify that the timer did run out.
Does the timer really need to call cond.notify_one() again after the loop? This was taken from the older code and iirc the reasoning is to prevent the worker to wait forever, in case the timer finishes while the worker is still waiting.
If you don't call notify after the loop, the worker thread will wait indefinitely. So to cleanly exit your program you should actually call notify_all(), to make sure every thread waiting on the condition variable wakes up and can terminate cleanly.
Do I need the running flag, or is there a nicer way to break out of the loops?
A running flag is the cleanest way to accomplish what you want.
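To make that concrete, here is a minimal sketch along the lines of the question's code: a tick counter guarded by the mutex gives the worker an actual condition to verify, and running becomes atomic so the shutdown is seen reliably (the ticks and seen names are my additions):

    #include <atomic>
    #include <condition_variable>
    #include <chrono>
    #include <iostream>
    #include <mutex>
    #include <thread>

    std::atomic<bool> running{false};
    std::mutex mutex;
    std::condition_variable cond;
    unsigned ticks = 0;           // guarded by `mutex`: the actual condition

    void timer(){
        while (running) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1000));
            {
                std::lock_guard<std::mutex> guard(mutex);
                ++ticks;          // record the elapsed period under the mutex
            }
            cond.notify_one();
        }
        cond.notify_all();        // final wakeup so the worker can see !running
    }

    void worker(){
        unsigned seen = 0;
        std::unique_lock<std::mutex> mlock(mutex);
        while (running) {
            // wait until a new tick arrived or we are asked to shut down:
            cond.wait(mlock, [&]{ return ticks != seen || !running; });
            seen = ticks;         // consume the tick
            if (!running) break;
            std::cout << "Hello World" << std::endl;
        }
    }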
Let's first check the background concepts.
Critical Section
First of all, a mutex is needed to mutually exclude access to a critical section. Usually the critical section is a shared resource, e.g. a queue, some I/O (e.g. a socket), etc. In plain words, a mutex guards a shared resource against a race condition, which could bring the resource into an undefined state.
Example: Producer / Consumer Problem
A queue contains work items to be done. There might be multiple threads which put work items into the queue (i.e. produce items => producer threads) and multiple threads which consume these items and do something useful with them (=> consumer threads).
Put and consume operations modify the queue (especially its storage and internal representation). Thus, when running either a put or a consume operation, we want to exclude all other operations from doing the same. This is where the mutex comes into play: in the most basic constellation, only one thread (no matter whether producer or consumer) can lock the mutex at a time. There are also higher-level locking primitives that increase throughput depending on the usage scenario (e.g. reader-writer locks). A sketch of this constellation follows below.
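Here is a minimal sketch of such a queue, with the mutex guarding the storage and the condition variable announcing "not empty" to sleeping consumers (the class and member names are mine):

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    template <typename T>
    class WorkQueue {
        std::queue<T> items;            // shared storage, guarded by m
        std::mutex m;
        std::condition_variable not_empty;
    public:
        void put(T item) {              // producer side
            {
                std::lock_guard<std::mutex> lock(m);
                items.push(std::move(item));
            }
            not_empty.notify_one();     // wake one sleeping consumer
        }
        T consume() {                   // consumer side
            std::unique_lock<std::mutex> lock(m);
            // atomically releases the lock while sleeping, re-checks under it
            not_empty.wait(lock, [&] { return !items.empty(); });
            T item = std::move(items.front());
            items.pop();
            return item;
        }
    };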
Concept of Condition Variables
condition_variable::notify_one wakes up one currently waiting thread. At least one thread has to be waiting on the variable:
If no thread is waiting on the variable, the posted event is simply lost.
If there was a waiting thread, it will wake up and start running as soon as it can lock the mutex associated with the condition variable. So if the thread which issued the notify_one or notify_all call does not give up the mutex (e.g. via mutex::unlock() or condition_variable::wait()), the woken-up thread(s) cannot run.
In the timer() thread, the mutex is unlocked after the notify_one() call because the scope ends and the guard object is destroyed (its destructor implicitly calls mutex::unlock()).
Problems with this approach
Cancellation and Variable Caching
Compilers are allowed to cache the values of variables. Thus, setting running to false in main() might never be observed by the other threads, because the value they read may be cached; strictly speaking, unsynchronized concurrent reads and writes of running are a data race. To avoid that, declare running as std::atomic<bool> (volatile hinders some compiler caching, but it does not provide the inter-thread guarantees you need here).
worker Thread
You point out that worker needs to run at some fixed time interval and that it may run for a variable amount of time. But the timer thread can only make the worker run after the previous iteration has finished: the two threads always execute as one linear chain and share no critical section. So why do you need a second thread just to measure time? Why not simply put the desired sleep call after the task execution and resume as soon as the time has elapsed? As it turns out, only std::cout is a shared resource, and currently it is used from one thread only; otherwise you would need a mutex (without a condition variable) to guard the writes to cout.
#include <thread>
#include <atomic>
#include <iostream>
#include <chrono>

std::atomic<bool> running{false};

void worker(){
    while (running){
        auto start_point = std::chrono::system_clock::now();
        std::cout << "Hello World" << std::endl;
        //... do something that takes a variable amount of time ...//
        std::this_thread::sleep_until(start_point + std::chrono::milliseconds(1000));
    }
}

int main(){
    running = true;
    auto t_work = std::thread(worker);
    std::this_thread::sleep_for(std::chrono::milliseconds(10000));
    running = false;
    t_work.join();
}
Note: if the task itself takes longer than 1000 ms from start_point, sleep_until is handed a time point in the past and returns immediately, so the next iteration starts right away.

std::condition_variable without a lock

I'm trying to synchronise a set of threads. These threads sleep most of the time, waking up to do their scheduled job. I'm using std::thread for them.
Unfortunately, when I terminate the application, the threads prevent it from exiting. In C# I can make a thread a background thread, so that it will be terminated on app exit. It seems to me that an equivalent feature is not available in C++.
So I decided to use a kind of event indicator and make the threads wake up when the app exits. Standard C++11 std::condition_variable requires a unique lock, so I cannot use it, as I need both threads to wake up at the same time (they do not share any resources).
Eventually, I decided to use WinAPI's CreateEvent + SetEvent + WaitForSingleObject to solve the issue.
Is there a way to achieve the same behavior using just C++11?
Again, what I want:
- a set of threads work independently and are usually asleep for a particular period (which can differ between threads);
- all threads check a variable that is available to all of them to know whether it is time to stop working (I call this variable IsAlive). In fact, all threads are spinning in a loop like this:

    while (IsAlive) {
        // Do work
        std::this_thread::sleep_for(...);
    }

- threads must be able to work simultaneously, without blocking each other;
- when the app is closed, an event is raised that makes the threads wake up right away, no matter the timeout;
- once woken up, each thread checks IsAlive and exits.
Yes, you can do this using standard C++ mechanisms: a condition variable, a mutex and a flag of some kind.
// Your class or global variables
std::mutex deathLock;
std::condition_variable deathCv;
bool deathTriggered = false;

// The kill thread runs this code to kill all other threads:
{
    std::lock_guard<std::mutex> lock(deathLock);
    deathTriggered = true;
}
deathCv.notify_all();

// Your worker threads run this code:
while (true)
{
    // ... do work

    // Now wait for 1000 milliseconds or until death is triggered:
    std::unique_lock<std::mutex> lock(deathLock);
    deathCv.wait_for(lock, std::chrono::milliseconds(1000), [](){ return deathTriggered; });

    // Check for death
    if (deathTriggered)
    {
        break;
    }
}
Note that this runs correctly even if death is triggered before a thread enters the wait, because the predicate is checked first. You could also use the return value from wait_for, but this way is easier to read IMO. Also, while it's not obvious, multiple sleeping threads are fine: wait_for internally unlocks the unique_lock while sleeping and reacquires it both to check the condition and when it returns.
Finally, all the threads do wake up "at the same time": although they are serialised while checking the bool flag, that takes only a few instructions, and each then releases the lock as it breaks out of the loop. The difference would be unnoticeable.
In C++11, you should be able to detach() a thread, so that it is treated like a daemon thread: it no longer prevents the app from exiting, and it is stopped automatically when the app terminates.
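As a minimal sketch (with the caveat that a detached thread is terminated abruptly at process exit, so it must not hold resources that need orderly cleanup):

    #include <chrono>
    #include <thread>

    int main() {
        std::thread([] {
            while (true)  // background work; never joined
                std::this_thread::sleep_for(std::chrono::seconds(1));
        }).detach();      // main() may now return without waiting for this thread
        // ... app work ...
    }   // process exit terminates the detached thread abruptly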

Boost::Thread / C++11 std::thread, want to wake worker thread on condition

I am using a Boost::thread as a worker-thread. I want to put the worker thread to sleep when there is no work to be done and wake it up as soon as there is work to be done. I have two variables that hold integers. When the integers are equal, there is no work to be done. When the integers are different, there is work to be done. My current code looks like this:
int a;
int b;

void worker_thread()
{
    while (true) {
        if (a != b) {
            //...do something
        }
        //if a == b, then just waste CPU cycles
    }
}

//other code running in the main thread that changes the values of a and b
I tried using a condition variable and having the worker thread go to sleep when a == b. The problem is that there is a race condition. Here is an example situation:
Worker thread evaluates if(a == b), finds that it is true.
Main thread changes a and/or b such that they are no longer equal. Calls notify_one() on the worker thread.
Worker thread ignores notify_one() since it is still awake.
Worker thread goes to sleep.
Deadlock
What would be better is if I could avoid the condition variables, since I don't actually need to lock anything, and just have the worker thread go to sleep whenever a == b and wake up whenever a != b. Is there a way to do this?
It seems you are not properly synchronizing your accesses. When you read a and b in the worker thread, you need to acquire a lock, at least while accessing the values shared with the producer: while the worker thread holds the lock, neither a nor b can be changed by the main thread. If they are not equal, the worker thread can release the lock and churn away processing the values. If they are equal, the worker thread instead wait()s on the condition variable while the lock is held. The main functionality of the condition variable is to atomically release the lock and go to sleep.
When the main thread updates a and/or b, it acquires the lock, makes the changes, releases the lock and notifies the worker thread. The worker thread didn't hold the lock at that point, but acquires it either when the next check is due or as a result of the notification, checks the state of the values, and either wait()s or processes the values.
When done correctly, there is no potential for a race condition!
I missed your key confusion: "since I don't actually need to lock anything"! Well, when you have two threads which may concurrently access the same value and at least one of them modifies it, you have a data race if there is no synchronization, and any program with a data race has undefined behavior. Put differently: even if you only want to send a bool from one thread to another, you do need synchronization. The synchronization doesn't have to take the form of locks (the values could be synchronized using atomic variables, for example), but non-trivial communication, e.g. involving two ints rather than just one, is generally quite hard to get right with atomics alone. You almost certainly want to use a lock. You may not have discovered this deep desire yet, however.
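Here is a minimal sketch of the pattern described above, applied to the question's a and b; the mutex and condition-variable names are my additions:

    #include <condition_variable>
    #include <mutex>

    int a = 0, b = 0;                 // shared state, guarded by m
    std::mutex m;
    std::condition_variable work_available;

    void worker_thread() {
        std::unique_lock<std::mutex> lock(m);
        while (true) {
            // Atomically releases the lock and sleeps; the predicate is
            // re-checked under the lock, so a notify sent between "check"
            // and "sleep" can no longer be lost.
            work_available.wait(lock, [] { return a != b; });
            // ...do something with a and b while holding the lock,
            //    or copy them out, unlock, and work on the copies...
        }
    }

    // Main thread, when it changes a and/or b:
    void update(int new_a, int new_b) {
        {
            std::lock_guard<std::mutex> lock(m);
            a = new_a;
            b = new_b;
        }
        work_available.notify_one();  // wake the worker to re-check a != b
    }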
Things to think about:
Is there a reason for your threads to stay asleep at all?
Why not launch a new thread and let it die a nice natural death when it has finished its work?
If there is only one code path active at any point in time (all other threads are asleep), then your design does not allow for concurrency.
Finally, if you're using variables that are shared between threads, you should be using atomics. This will make sure that accesses to your values are synchronized.