Pattern for thread to check if it should stop? - c++

Is there a canonical pattern for a thread to check if it should stop working?
The scenario is that a thread is spinning a tight working loop but it should stop if another thread tells it to. I was thinking of checking an atomic bool in the loop condition but I'm not sure if that is an unnecessary performance hit or not. E.g.
std::atomic<bool> stop{false};
while(!stop){
//...
}

You have to make it atomic (or volatile in pre-C++11 code) to ensure that the compiler doesn't hoist the read out of the loop and test stop only once.
If you call a function in the loop that cannot be inlined (like reading sockets) you might be safe with a non-atomic bool, but why risk it - especially as the atomic read is unlikely to be a performance issue in that case?
To minimize the overhead, you could do something like:
std::atomic<bool> stop{false};
void rx_thread() {
// ...
while(!stop.load(std::memory_order_relaxed)){
// ...
}
}
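For completeness, here is a minimal self-contained sketch of the relaxed-flag loop above. The function name run_worker_for and the iteration counter are assumptions made for illustration; they are not part of the question.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical demo: lets a worker spin for roughly `run_time`, then stops it.
// Returns the number of loop iterations the worker completed.
std::uint64_t run_worker_for(std::chrono::milliseconds run_time)
{
    std::atomic<bool> stop{false};  // explicitly initialized
    std::uint64_t iterations = 0;   // written only by the worker thread

    std::thread worker([&] {
        // Relaxed ordering is enough: the flag does not publish any other data.
        while (!stop.load(std::memory_order_relaxed)) {
            ++iterations;           // stand-in for the real work
        }
    });

    std::this_thread::sleep_for(run_time);
    stop.store(true, std::memory_order_relaxed);
    worker.join();                  // after join(), reading `iterations` is safe
    return iterations;
}
```

The result is only read after join(), so the counter itself does not need to be atomic.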

I don't see any reason why the bool should be atomic. There isn't really a potential for a race condition. When you want to stop the thread, you first set the variable to true and then issue a call that wakes up the blocking function inside the loop (however you do that, depends on the blocking call).
bool stop = false;
void rx_thread() {
// ...
while(!stop){
// ...
blocking_call();
// ...
}
}
void stop_thread() {
stop = true;
wakeup_rx_thread();
}
If the blocking call happens to wake up between setting stop to true and calling wakeup_rx_thread(), then the loop will finish anyway. The call to wakeup_rx_thread() will be needless, but that doesn't matter.

Related

How to let a thread wait itself out without using Sleep()?

I want the while loop in the thread to run, wait a second, then run again, and so on, but this doesn't seem to work. How would I fix it?
main(){
bool flag = true;
pthread = CreateThread(NULL, 0, ThreadFun, this, 0, &ThreadIP);
}
ThreadFun(){
while(flag == true)
WaitForSingleObject(pthread,1000);
}
This is one way to do it. I prefer condition variables over sleeps since they are more responsive, and std::async over std::thread, mainly because std::async returns a future which can send information back to the starting thread (even though that feature is not used in this example).
#include <iostream>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>
#include <condition_variable>
// A very useful primitive to communicate between threads is the condition_variable.
// Despite its name it isn't a variable per se. It is more of an inter-thread signal
// saying: hey, wake up thread, something may have changed that's interesting to you.
// They come with some conditions of their own:
// - always use with a lock
// - never wait without a predicate
// (https://www.modernescpp.com/index.php/c-core-guidelines-be-aware-of-the-traps-of-condition-variables)
// - have some state to observe (in this case just a bool)
//
// Since these three things go together I usually pack them in a class,
// in this case signal_t, which will be used to let threads signal each other
class signal_t
{
public:
// wait for boolean to become true, or until a certain time period has passed
// then return the value of the boolean.
bool wait_for(const std::chrono::steady_clock::duration& duration)
{
std::unique_lock<std::mutex> lock{ m_mtx };
m_cv.wait_for(lock, duration, [&] { return m_signal; });
return m_signal;
}
// wait until the boolean becomes true, wait infinitely long if needed
void wait()
{
std::unique_lock<std::mutex> lock{ m_mtx };
m_cv.wait(lock, [&] {return m_signal; });
}
// set the signal
void set()
{
std::unique_lock<std::mutex> lock{ m_mtx };
m_signal = true;
m_cv.notify_all();
}
private:
bool m_signal { false };
std::mutex m_mtx;
std::condition_variable m_cv;
};
int main()
{
// create two signals to let mainthread and loopthread communicate
signal_t started; // indicates that loop has really started
signal_t stop; // lets mainthread communicate a stop signal to the loop thread.
// in this example I use a lambda to implement the loop
auto future = std::async(std::launch::async, [&]
{
// signal this thread has been scheduled and has started.
started.set();
do
{
std::cout << ".";
// stop.wait_for will either wait 500 ms and return false,
// or return immediately with true when the stop signal is set.
// Waiting on a condition variable is much more responsive
// than implementing a loop with sleep (which would only
// check the stop condition every 500 ms)
} while (!stop.wait_for(std::chrono::milliseconds(500)));
});
// wait for loop to have started
started.wait();
// give the thread some time to run
std::this_thread::sleep_for(std::chrono::seconds(3));
// then signal the loop to stop
stop.set();
// synchronize with thread stop
future.get();
return 0;
}
While the other answer shows a possible way to do it, my answer approaches the question from a different angle, trying to see what could be wrong with your code...
Well, if you don't mind waiting up to one second after flag is set to false, and you want a delay of at least 1000 ms, then a loop with Sleep could work, but you need
an atomic variable (e.g. std::atomic)
or an interlocked function (e.g. InterlockedCompareExchange)
or a MemoryBarrier
or some other means of synchronisation to check the flag.
Without proper synchronisation, there is no guarantee that the compiler will re-read the value from memory on each iteration rather than reuse a copy held in a register.
Using Sleep or a similar function from a UI thread would also be suspicious.
For a console application, you could wait some time in the main thread if the purpose of your application really is to work for a given duration. But usually you want to wait until processing is completed. In most cases, you should wait until the threads you have started have completed.
Another problem with the Sleep function is that the thread always has to wake up at every interval even if there is nothing to do. This can be bad if you want to optimize battery usage. On the other hand, a relatively long timeout on a function that waits on some signal (handle) might make your code a bit more robust against missed wakeups if your code has bugs in it.
You also need a delay in cases where you don't really have anything to wait on but need to poll some data at a regular interval.
A large timeout can also be useful as a kind of watchdog timer. For example, if you expect to have something to do and receive nothing for an extended period, you could report a warning so that the user can check whether something is not working properly.
I highly recommend reading a book on multithreading, such as C++ Concurrency in Action, before writing multithreaded code.
Without a proper understanding of multithreading, it is almost certain that anyone's code will have bugs. You need to properly understand the C++ memory model (https://en.cppreference.com/w/cpp/language/memory_model) to write correct code.
A thread waiting on itself makes no sense. When you wait on a thread, you are waiting for it to terminate, and obviously, if it has terminated, it cannot be executing your code. Your main thread should wait for the background thread to terminate.
I also usually recommend using the C++ threading facilities over the Win32 API, as they:
Make your code portable to other systems.
Are usually higher-level constructs (std::async, std::future, std::condition_variable...) than the corresponding Win32 API code.
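As a rough sketch of the points above (an atomic flag, a portable sleep instead of Sleep, and the main thread joining the worker rather than the worker waiting on itself), something like the following could work. The helper name run_periodic and its parameters are made up for this example.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper: lets a worker tick periodically until at least
// `ticks_wanted` ticks have happened, then stops it and returns the tick count.
int run_periodic(int ticks_wanted, std::chrono::milliseconds period)
{
    std::atomic<bool> keep_running{true};
    std::atomic<int> ticks{0};

    std::thread worker([&] {
        while (keep_running.load()) {
            ticks.fetch_add(1);                   // the periodic work
            std::this_thread::sleep_for(period);  // portable replacement for Sleep()
        }
    });

    // The main thread polls the worker's progress.
    while (ticks.load() < ticks_wanted)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

    keep_running.store(false);  // ask the worker to stop
    worker.join();              // wait for the *other* thread to finish
    return ticks.load();
}
```

Note that stopping can still take up to one full period, which is the responsiveness cost of a sleep-based loop mentioned above.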

How to end a thread properly?

My main program creates a thread. This thread initializes some data, then enters a 'while' loop and runs until the main program sets the control variable to 'false'. Then it calls join(), which blocks the whole code endlessly.
bool m_ThreadMayRun;
void main(){
thread mythread = thread(&ThreadFunction);
//do stuff
m_ThreadMayRun = false;
mythread.join(); // this blocks endlessly even when I ask 'joinable' before
}
void ThreadFunction{
initdata();
m_ThreadMayRun=true;
while(m_ThreadMayRun){
//do stuff that can be / has to be done for ever
}
deinitdata();
}
Am I missing something here?
What would be a proper solution to make the loop leave from the main thread?
Is it at all necessary to call join?
Thanks for help
You have a race condition with two threads writing to m_ThreadMayRun. Consider what happens if the main thread first executes m_ThreadMayRun = false; and then the thread you spawned executes m_ThreadMayRun = true;: you have an infinite loop. However, strictly speaking that line of reasoning is irrelevant, because when you have a data race your code has undefined behavior.
Am I missing something here?
You need to synchronize access to m_ThreadMayRun by making it either an std::atomic<bool> or using a std::mutex and make sure that m_ThreadMayRun = false is executed after m_ThreadMayRun = true;.
PS For this situation it is better to use a std::condition_variable.
The issue is that access to bool m_ThreadMayRun; is not synchronized, and according to the C++ rules, the compiler may assume that a non-atomic variable is not modified concurrently by another thread. So you end up with a data race (a form of undefined behavior).
To make the intention clear, make it atomic.
std::atomic<bool> m_ThreadMayRun;
With this, every load/store of m_ThreadMayRun uses the default sequentially consistent ordering, which not only synchronizes its own value but also makes other work done by the storing thread visible to a thread that loads the value, due to the acquire/release semantics of an atomic load/store.
Though there is still a small race possible between m_ThreadMayRun = true in the thread and setting m_ThreadMayRun = false. Either one can execute first, sometimes leading to undesired results. To avoid this, initialize it to true before starting the thread.
std::atomic<bool> m_ThreadMayRun;
void ThreadFunction();
int main(){
m_ThreadMayRun = true; // set before the thread starts, so only main ever writes it
std::thread mythread(&ThreadFunction);
//do stuff
m_ThreadMayRun = false;
mythread.join(); // returns once the loop has seen the flag and the thread has finished
}
void ThreadFunction(){
initdata();
while(m_ThreadMayRun){
//do stuff that can be / has to be done for ever
}
deinitdata();
}
For more details about memory fences and acquire/release semantics, refer to the following excellent resources: the book "C++ Concurrency in Action" and Herb Sutter's atomic<> weapons talk.
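To illustrate the acquire/release point, here is a small sketch (not taken from the answers above) in which a plain int is published through an atomic flag. The names publish_and_read, payload and ready are assumptions made for this example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical demo: a producer writes plain data, then publishes it with a
// release store; the consumer waits with acquire loads before reading the data.
int publish_and_read()
{
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};

    std::thread producer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    while (!ready.load(std::memory_order_acquire))     // 3. wait for publication
        std::this_thread::yield();

    int seen = payload;  // the acquire load that saw `true` makes this read safe
    producer.join();
    return seen;
}
```

With relaxed ordering on both sides the read of payload would be a data race; the release/acquire pair is what orders the write before the read.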

std::atomic_bool for cancellation flag: is std::memory_order_relaxed the correct memory order?

I have a thread that reads from a socket and generates data. After every operation, the thread checks a std::atomic_bool flag to see if it must exit early.
In order to cancel the operation, I set the cancellation flag to true, then call join() on the worker thread object.
The code of the thread and the cancellation function looks something like this:
std::thread work_thread;
std::atomic_bool cancel_requested{false};
void thread_func()
{
while(! cancel_requested.load(std::memory_order_relaxed))
process_next_element();
}
void cancel()
{
cancel_requested.store(true, std::memory_order_relaxed);
work_thread.join();
}
Is std::memory_order_relaxed the correct memory order for this use of an atomic variable?
As long as there is no dependency between cancel_requested flag and anything else, you should be safe.
The code as shown looks OK, assuming you use cancel_requested only to expedite the shutdown, but also have a provision for an orderly shutdown, such as a sentinel entry in the queue (and of course that the queue itself is synchronized).
Which means your code actually looks like this:
std::thread work_thread;
std::atomic_bool cancel_requested{false};
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue<Element> work_queue; // Element: placeholder for the actual element type
void thread_func()
{
while(! cancel_requested.load(std::memory_order_relaxed))
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue_filled_cond.wait(lock, []{ return !work_queue.empty(); });
auto element = work_queue.front();
work_queue.pop();
lock.unlock();
if (element == exit_sentinel)
break;
process_next_element(element);
}
}
void cancel()
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue.push(exit_sentinel);
work_queue_filled_cond.notify_one();
lock.unlock();
cancel_requested.store(true, std::memory_order_relaxed);
work_thread.join();
}
And if we're that far, then cancel_requested may just as well become a regular variable, the code even becomes simpler.
std::thread work_thread;
bool cancel_requested = false;
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue<Element> work_queue; // Element: placeholder for the actual element type
void thread_func()
{
while(true)
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue_filled_cond.wait(lock, []{ return cancel_requested || !work_queue.empty(); });
if (cancel_requested)
break;
auto element = work_queue.front();
work_queue.pop();
lock.unlock();
process_next_element(element);
}
}
void cancel()
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
cancel_requested = true;
work_queue_filled_cond.notify_one();
lock.unlock();
work_thread.join();
}
memory_order_relaxed is generally hard to reason about, because it blurs the general notion of sequentially executing code. So the usefulness of it is very, very limited as Herb explains in his atomic weapons talk.
Note std::thread::join() by itself acts as a memory barrier between the two threads.
Whether this code is correct depends on a lot of things. Most of all it depends on what exactly you mean by "correct". As far as I can tell, the bits of code that you show don't invoke undefined behavior (assuming your work_thread and cancel_requested are not actually initialized in the order your snippet above suggests as you would then have the thread potentially reading the uninitialized value of the atomic). If all you need to do is change the value of that flag and have the thread eventually see the new value at some point independent of whatever else may be going on, then std::memory_order_relaxed is sufficient.
However, I see that your worker thread calls a process_next_element() function. That suggests that there is some mechanism through which the worker thread receives elements to process. I don't see any way for the thread to exit when all elements have been processed. What does process_next_element() do when there's no next element available right away? Does it just return immediately? In that case you've got yourself a busy wait for more input or cancellation, which will work but is probably not ideal.
Or does process_next_element() internally call some function that blocks until an element becomes available? If that is the case, then cancelling the thread would have to involve first setting the cancellation flag and then doing whatever is needed to make sure the blocking call your thread is potentially waiting in returns. In this case, it's potentially essential that the thread can never miss the new value of the cancellation flag after the blocking call returns. Otherwise, the call could return, the thread could go back into the loop, still read the old cancellation flag, and then call process_next_element() again. If process_next_element() is guaranteed to just return again, then you're fine. If that is not the case, you have a deadlock. So I believe it technically depends on what exactly process_next_element() does. One could imagine an implementation of process_next_element() where you would potentially need more than relaxed memory order.
However, if you already have a mechanism for fetching new elements to process, why even use a separate cancellation flag? You could simply handle cancellation through that same mechanism, e.g., by having it return a next element with a special value, or return no element at all, to signal cancellation of processing and cause the thread to return instead of relying on a separate flag.
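A minimal sketch of that last idea, assuming a small queue wrapper whose pop() returns "no element" once the queue has been closed. ClosableQueue, sum_until_closed and the int element type are invented for this example; note that this variant drains the remaining elements before stopping, which may or may not be what you want.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical queue type; int stands in for the real element type.
class ClosableQueue {
public:
    void push(int value) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(value); }
        cv_.notify_one();
    }
    // After close(), pop() hands out what is left and then returns false.
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until an element is available or the queue has been closed.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;  // closed and drained
        out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;  // protected by m_, so it does not need to be atomic
};

// Hypothetical demo: the worker sums elements until the queue is closed.
int sum_until_closed()
{
    ClosableQueue queue;
    int sum = 0;
    std::thread worker([&] {
        int element;
        while (queue.pop(element))  // no separate cancellation flag needed
            sum += element;
    });
    for (int i = 1; i <= 4; ++i) queue.push(i);
    queue.close();   // doubles as the stop request
    worker.join();
    return sum;
}
```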

Using std::condition_variable with atomic<bool>

There are several questions on SO dealing with atomic, and others that deal with std::condition_variable. But my question is whether my use below is correct.
Three threads, one ctrl thread that does preparation work before unpausing the two other threads. The ctrl thread also is able to pause the worker threads (sender/receiver) while they are in their tight send/receive loops.
The idea with using the atomic is to make the tight loops faster in case the boolean for pausing is not set.
class SomeClass
{
public:
//...
// Disregard that data is public...
std::condition_variable cv; // UDP threads will wait on this cv until allowed
// to run by ctrl thread.
std::mutex cv_m;
std::atomic<bool> pause_test_threads;
};
void do_pause_test_threads(SomeClass *someclass)
{
if (!someclass->pause_test_threads)
{
// Even though we use an atomic, mutex must be held during
// modification. See documentation of condition variable
// notify_all/wait. Mutex does not need to be held for the actual
// notify call.
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = true;
}
}
void unpause_test_threads(SomeClass *someclass)
{
if (someclass->pause_test_threads)
{
{
// Even though we use an atomic, mutex must be held during
// modification. See documentation of condition variable
// notify_all/wait. Mutex does not need to be held for the actual
// notify call.
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = false;
}
someclass->cv.notify_all(); // Allow send/receive threads to run.
}
}
void wait_to_start(SomeClass *someclass)
{
std::unique_lock<std::mutex> lk(someclass->cv_m); // RAII, no need for unlock.
auto not_paused = [someclass](){return someclass->pause_test_threads == false;};
someclass->cv.wait(lk, not_paused);
}
void ctrl_thread(SomeClass *someclass)
{
// Do startup work
// ...
unpause_test_threads(someclass);
for (;;)
{
// ... check for end-program etc, if so, break;
if (lost ctrl connection to other endpoint)
{
pause_test_threads();
}
else
{
unpause_test_threads();
}
sleep(SLEEP_INTERVAL);
}
unpause_test_threads(someclass);
}
void sender_thread(SomeClass *someclass)
{
wait_to_start(someclass);
...
for (;;)
{
// ... check for end-program etc, if so, break;
if (someclass->pause_test_threads) wait_to_start(someclass);
...
}
}
void receiver_thread(SomeClass *someclass)
{
wait_to_start(someclass);
...
for (;;)
{
// ... check for end-program etc, if so, break;
if (someclass->pause_test_threads) wait_to_start(someclass);
...
}
}
I looked through your code manipulating the condition variable and the atomic, and it seems that it is correct and won't cause problems.
Why you should protect writes to the shared variable even if it is atomic:
There could be problems if a write to the shared variable happens between checking it in the predicate and waiting on the condition variable. Consider the following:
1. The waiting thread wakes spuriously, acquires the mutex, checks the predicate and evaluates it to false, so it must wait on the cv again.
2. The controlling thread changes the shared variable so that the predicate becomes true.
3. The controlling thread sends a notification, which is not received by anybody, because no thread is waiting on the condition variable.
4. The waiting thread waits on the condition variable. Since the notification was already sent, it will wait until the next spurious wakeup or the next time the controlling thread sends a notification, potentially waiting indefinitely.
Reading shared atomic variables without locking is generally safe, unless it introduces TOCTOU problems.
In your case you are reading the shared variable to avoid unnecessary locking and then checking it again after locking (in the condition wait call). It is a valid optimisation, called double-checked locking, and I do not see any potential problems here.
You might want to check whether atomic<bool> is lock-free. Otherwise you will have even more locks than you would have without it.
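Condensed into one class, the pattern discussed above might look roughly like this. PauseGate and count_after_resume are names made up for the sketch; the fast path reads the atomic without the lock, while every write happens under the mutex so a notification cannot slip in between the predicate check and the wait.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical wrapper around the mutex + condition variable + atomic flag trio.
class PauseGate {
public:
    void pause() {
        std::lock_guard<std::mutex> lock(m_);
        paused_ = true;
    }
    void resume() {
        {
            std::lock_guard<std::mutex> lock(m_);  // closes the lost-wakeup window
            paused_ = false;
        }
        cv_.notify_all();  // the mutex need not be held for the notify itself
    }
    void wait_if_paused() {
        if (!paused_.load()) return;               // fast path, no lock taken
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !paused_.load(); });  // re-check under lock
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::atomic<bool> paused_{true};  // workers start out paused
};

// Hypothetical demo: the worker does `iterations` steps once it is unpaused.
int count_after_resume(int iterations)
{
    PauseGate gate;
    int done = 0;
    std::thread worker([&] {
        for (int i = 0; i < iterations; ++i) {
            gate.wait_if_paused();
            ++done;
        }
    });
    gate.resume();
    worker.join();
    return done;
}
```

Whether the lock-free fast path actually pays off is something only profiling can tell.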
In general, you want to treat the fact that the variable is atomic independently of how it works with a condition variable.
If all code that interacts with the condition variable follows the usual pattern of locking the mutex before query/modification, and the code interacting with the condition variable does not rely on code that does not interact with the condition variable, it will continue to be correct even if the guarded variable is atomic.
From a quick read of your pseudo-code, this appears to be correct. However, pseudo-code is often a poor substitute for real code for multi-threaded code.
The "optimization" of only waiting on the condition variable (and locking the mutex) when an atomic read says you might want to may or may not be an optimization. You need to profile throughput.
Atomic data doesn't need additional synchronization; it is the basis of lock-free algorithms and data structures.
void do_pause_test_threads(SomeClass *someclass)
{
if (!someclass->pause_test_threads)
{
/// your pause_test_threads might be changed here by other thread
/// so you have to acquire mutex before checking and changing
/// or use atomic methods - compare_exchange_weak/strong,
/// but not all together
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = true;
}
}
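For the compare_exchange route mentioned in the comment, a minimal sketch could look like this. try_set_paused is a hypothetical helper; it only makes the check-and-set itself atomic and does not replace holding the mutex when the flag is also the predicate of a condition variable.

```cpp
#include <atomic>

// Hypothetical helper: atomically flips the flag from false to true.
// Returns true only for the caller that actually performed the change.
bool try_set_paused(std::atomic<bool>& paused)
{
    bool expected = false;
    // If paused == expected (false), it is set to true and true is returned.
    // Otherwise `expected` is overwritten with the current value and
    // false is returned.
    return paused.compare_exchange_strong(expected, true);
}
```

If two threads call this at the same time, exactly one of them sees true.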

delete order/speed of std::lock_guard relative to other stack-allocated objects?

As far as I can tell, there is quite a bit of time between the lock_guard getting destroyed and when a function (run in another thread) actually returns. See the comment below in TEST(...).
bool bDone = false;
void run_worker(Foo* f) {
f->Compute();
bDone = true;
}
TEST(FooTest,ThreadFoo) {
Foo* f = makeFoo();
std::thread worker( run_worker, f );
worker.detach();
micro_wait(100); // wait for N microseconds
f->Reset(); // should block until Compute() is done
// !!?? Why is this necessary !?!?
int k=0;
while(++k<500 && !bDone)
micro_wait(100);
EXPECT_TRUE(bDone); // Fails even with a single micro_wait(100)!
}
Is there a good explanation for when/why there can be such a time lapse
between when f->Compute() finishes and bDone gets set? My suspicion is that the mutex gets unlocked while there is still work to be done cleaning up stack-based variables allocated in Compute() but this is purely a hypothesis.
Stubs for Compute and Reset are below:
void Foo::Compute() {
std::lock_guard<std::mutex> guard(m_Mutex);
// ... allocate bunch of temporary stuff on stack, update *this
}
void Foo::Reset() {
std::lock_guard<std::mutex> guard(m_Mutex);
// ... simpler stuff, clear
}
There is no synchronization of bDone.
It's quite possible that the compiler loads bDone into a register while its value is false, and then continues to use the register-cached value instead of fetching the updated value from memory. Alternatively, since bDone is set to true only after Compute() has returned and released the lock, Reset() may acquire the lock and finish before the worker reaches that assignment.
The correct way to approach this is to use an std::atomic<bool>. The worker thread can update it with a call to bDone.store(true) and the waiting thread can read its most current value with a call to bDone.load().
If you want to read into memory ordering to help understand why an atomic is needed, you can further improve this (though for a unit test, it really doesn't matter) by using acquire and release ordering.
Aside from this, what you really should be doing is joining your worker thread. A join blocks until the thread has ended, so you can be assured that your Compute function has completed execution. If you're afraid that it may run forever (or for too long), I'd suggest using boost::thread instead of std::thread, as it provides a timed_join function, which stops waiting for the thread after a specified period of time.
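A minimal sketch of the atomic-plus-join approach, using only the standard library. compute_then_check is a made-up name, and the comment inside the lambda stands in for the real f->Compute() call.

```cpp
#include <atomic>
#include <thread>

// Hypothetical demo: the worker sets an atomic flag when it is done, and the
// caller joins the thread instead of polling with timed waits.
bool compute_then_check()
{
    std::atomic<bool> done{false};

    std::thread worker([&] {
        // ... stand-in for f->Compute() ...
        done.store(true, std::memory_order_release);
    });

    worker.join();  // returns only after the thread function has finished
    return done.load(std::memory_order_acquire);
}
```

Once join() has returned, the flag is guaranteed to be set, so no retry loop is needed in the test.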