Exit thread within a nested function - c++

I want to check a global bool value in my thread whenever necessary:
If it's true the thread should exit, and the thread should continue if it's false.
While checking, I could be inside a single function or I could be within a nested function. I need to ensure that I return to the main function first, then return 0 in the main function, which seems very stupid.
So the only way I can think of is to throw an exception when the condition is fulfilled and then catch it at the end of this thread, so that all the elements are destructed correctly.
So is there a more standard way in C++ language to do this? How do you exit a thread while you are in a nested function?

Your approach is fine. Compared to the costs of thread creation and destruction, throwing an exception is negligible.
If for some reasons you aren't allowed to use exceptions:
Check that signaling variable several places in your code. Note that regular code runs pretty damn fast so you only need these checks inside/before long calculations (loops) or IO operations that could block. Make sure the rest of the code doesn't depend on the results of some unfinished calculations.
Use a coding style where you always return an error code from every function (at least for this thread).

There are system-specific functions like ExitThread and pthread_exit but they are not recommended to use because they will result in memory leak: destructors will not be called, including CRT/stdlib internal objects if the thread was created using std::thread.
So the answer is: no, there is no standard way to exit a thread in C++. Just consider the thread as a regular function.

Related

Why do I need to explicitly detach a short term variable?

Let's say I have a small operation which I want to perform in a separate thread. I do not need to know when it completes, nor do I need to wait for its completion, but I do not want the operation blocking my current thread. When I write the following code, I will get a crash:
void myFunction() {
// do other stuff
std::thread([]()
{
// do thread stuff
});
}
This crash is solved by assigning the thread to a variable, and detaching it:
void myFunction() {
// do other stuff
std::thread t([]()
{
// do thread stuff
});
t.detach();
}
Why is this step necessary? Or is there a better way to create a small single-use thread?
Because the std::thread::~thread() specification says so:
A thread object does not have an associated thread (and is safe to destroy) after
it was default-constructed
it was moved from
join() has been called
detach() has been called
It looks like detach() is the only one of these that makes sense in your case, unless you want to return the thread object (by moving) to the caller.
Why is this step necessary?
Consider that the thread object represents a long-running "thread" of execution (a lightweight process or kernel schedulable entity or similar).
Allowing you to destroy the object while the thread is still executing, leaves you no way to subsequently join (and find the result of) that thread. This may be a logical error, but it can also make it hard even to correctly exit your program.
Or is there a better way to create a small single-use thread?
Not obviously, but it's frequently better to use a thread pool for running tasks in the background, instead of starting and stopping lots of short-lived threads.
You might be able to use std::async() instead, but the future it returns may block in the destructor in some circumstances, if you try to discard it.
See the documentation of the destructor of std:thread:
If *this has an associated thread (joinable() == true), std::terminate() is called.
You should explicitly say that you don't care what's going to happen with the thread, and that you're OK with loosing any control over it. And that is what detach is for.
In general, this looks like a design problem so crashing makes sense: it's hard to propose a general and not surprising rule about what should happen in such a case (e.g. your program might as well normally end its execution - what should happen with the thread?).
Basically, your use case requires a call to detach() because your use case is pretty weird, and not what C++ is trying to make easy.
While Java and .Net blithely let you toss away a Thread object whose associated thread is still running, in the C++ model the Thread is closer to being the thread, in the sense that the existence of the Thread object coincides with the lifetime, or at least joinability, of the execution it refers to. Note how it's not possible to create a Thread without starting it (except in the case of the default constructor, which is really just there in the service of move semantics), or to copy it or to make one from a thread id. C++ wants Thread to outlive the thread.
Maintaining that condition has various benefits. Final cleanup of a thread's control data doesn't have to be done automagically by the OS, because once a Thread goes away, nothing can ever try to join it. It's easier to ensure that variables with thread storage get destroyed in time, since the main thread is the last to exit (barring some move shenanigans). And a missing join -- which is an extremely common type of bug -- gets properly flagged at runtime.
Letting some thread wander off into the distance, in contrast, is allowed, but it's an unusual thing to do. Unless it's interacting with your other threads through sync objects, there's no way to ensure it's done whatever it was meant to do. A detached thread is on the level of reinterpret_cast: You're allowed to tell the compiler that you know something it doesn't, but that has to be explicit, not just the consequence of the function you didn't call.
Consider this: thread A creates thread B and thread A leaves its scope of execution. The handle for thread B is about to be lost. What should happen now? There are several possibilities, with most obvious as follows:
Thread B is detached and continues its execution indempedently
Thread A waits (joins) thread B before quiting its own scope
Now you can argue which is better: 1 or 2? How should we (the compiler) decide on which one of these is better?
So what the designers did was something different: crash terminate the code so that the developer picks one of these solutions explicitely. In order to avoid implicit (perhaps unwanted) behaviuor. It's a signal for you: "hey, pay attention now, this piece of code is important and I (the compiler) don't want to decide for you".

Is this a valid use of the noreturn attribute?

While working on a thread (fiber) scheduling class, I found myself writing a function that never returns:
// New thread, called on an empty stack
// (implementation details, exception handling etc omitted)
[[noreturn]] void scheduler::thread() noexcept
{
current_task->state = running;
current_task->run();
current_task->state = finished;
while (true) yield();
// can't return, since the stack contains no return address.
}
This function is never directly called (by thread();). It is "called" only by a jmp from assembly code, right after switching to a new context, so there is no way for it to "return" anywhere. The call to yield(), at the end, checks for state == finished and removes this thread from the thread queue.
Would this be a valid use of the [[noreturn]] attribute? And if so, would it help in any way?
edit: Not a duplicate. I understand what the attribute is normally used for. My question is, would it do anything in this specific case?
I'd say that it is valid but pointless.
It's valid because the function does not return. The contract cannot be broken.
It's pointless because the function is never called from C++ code. So no caller can make use of the fact that the function does not return because there is no caller. And at the point of definition of the function, the compiler should not require your assistance to determine that code following the while statement is dead, including a function postlude if any.
Well, the jmp entering the function is a bit weird, but to answer your Question is
"Most likely no".
Why most likely ? Because I don't believe you understand the idea of no return OR you are stating your use case wrongfully.
1st of all the function is never entered (or so you are stating) which means that by default is not a no-return (dead code could be removed by the compiler).
But let's consider that you are actually calling the function without realizing it (via that "JMP").
The idea of a no-return function is to never reach the end of scope (or not in a normal way at least). Meaning that either the whole Program is terminated within the function OR an error is thrown (meaning that the function won't pop the stack in a normal way). std::terminate is a good example of such a function. If you call it within your function, then your function becomes no return.
In your case, you are checking if the thread is finished.
If you are in a scenario where you are murdering the thread through suicide , and this is the function which is checking for the thread completion , and you call this function from the thread itself (suicide, which I highly doubt, since the thread will become blocked by the while and never finish), and you are forcing the thread to exit abruptly (OS Specific how to do that) then yes the function is indeed a no return because the execution on the stack won't be finished.
Needless to say, if you're in the above scenario you have HUGE problems with your program.
Most likely you are calling this function from another thread, or you are exiting the thread normally, cases in which the function won't be a no-return.

Is it safe to use pthread_exit() in C++?

pthread_exit() doesn't have the mechanism of stack unwinding. Can we completely avoid using pthread_exit() in C++? Or, are there any cases where we require this API instead of return from thread function in C++?
pthread functions are platform-specific. C++ has its own platform-independent thread API. It doesn't have a function that would correspond to pthread_exit. Use the std::thread API and don't worry about pthreads.
If you need to terminate a thread in the middle without explicitly returning all the way up to the thread starting function, throw an exception and catch it in the starting function. This will unwind the stack properly.
To quote from the documentation:
An implicit call to pthread_exit() is made when a thread other than the thread in which main() was first invoked returns from the start routine that was used to create it. The function's return value serves as the thread's exit status.
Which means that you don't need to call it yourself unless you want to avoid returning from that function (perhaps naked functions, or interrupt handlers or something really low level).

cancel a c++ 11 async task

How can I stop/cancel an asynchronous task created with std::async and policy std::launch::async? In other words, I have started a task running on another thread, using future object. Is there a way to cancel or stop the running task?
In short no.
Longer explanation: There is no safe way to cancel any threads in standard C++. This would require thread cancellation. This feature has been discussed many times during the C++11 standardisation and the general consensus is that there is no safe way to do so. To my knowledge there were three main considered ways to do thread cancellation in C++.
Abort the thread. This would be rather like an emergency stop. Unfortunately it would result in no stack unwinding or destructors called. The thread could have been in any state so possibly holding mutexes, having heap allocated data which would be leaked, etc. This was clearly never going to be considered for long since it would make the entire program undefined. If you want to do this yourself however just use native_handle to do it. It will however be non-portable.
Compulsory cancellation/interruption points. When a thread cancel is requested it internally sets some variable so that next time any of a predefined set of interruption points is called (such as sleep, wait, etc) it will throw some exception. This would cause the stack to unwind and cleanup can be done. Unfortunately this type of system makes it very difficult make any code exception safe since most multithreaded code can then suddenly throw. This is the model that boost.thread uses. It uses disable_interruption to work around some of the problems but it is still exceedingly difficult to get right for anything but the simplest of cases. Boost.thread uses this model but it has always been considered risky and understandably it was not accepted into the standard along with the rest.
Voluntary cancellation/interruption points. ultimately this boils down to checking some condition yourself when you want to and if appropriate exiting the thread yourself in a controlled fashion. I vaguely recall some talk about adding some library features to help with this but it was never agreed upon.
I would just use a variation of 3. If you are using lambdas for instance it would be quite easy to reference an atomic "cancel" variable which you can check from time to time.
In C++11 (I think) there is no standard way to cancel a thread. If you get std::thread::native_handle(), you can do something with it but that's not portable.
maybe you can do like this way by checking some condition:
class Timer{
public:
Timer():timer_destory(false){}
~Timer(){
timer_destory=true;
for(auto result:async_result){
result.get();
}
}
int register_event(){
async_result.push_back(
std::async(std::launch::async,[](std::atomic<bool>& timer_destory){
while(!timer_destory){
//do something
}
},std::ref(timer_destory))
);
}
private:
std::vector<std::future<int>> async_result;
std::atomic<bool> timer_destory;
}

C++ Thread question - setting a value to indicate the thread has finished

Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag);
Where finished_flag is a boolean member of my class. When the thread is finished it sets the value and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread)
So is this okay, or am I missing something, and need to use locks and mutexes, etc
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache the reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes are happening as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads half way through a read / write. That might not be such a problem here where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand it's a thread-safe type, like a CEvent in MFC (or equivilent in boost) then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already are using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that is has finished, and the main thread can check during execution if the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general case I would neve make the assumption that a resource will only be modified by the thread. You might know what it is for, however someone else might not - causing no ends of grief as the main thread thinks that the work is done and tries to access data that is not correct! It might even delete it while the worker thread is still using it, and causing the app to crash. Using a condition will help this.
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount for the thread to 'join' (join means that the thread has finsihed)
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag);
// ... do other things maybe ...
// wait for the thread to complete
thrd.join();
If you really want to get into the details of communication between threads via shared memory, even declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order as long (x86 usually doesn't, but PPC definitely does) and there is nothing in C++9x that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag);
while(!finished_flag) { // ignore the polling loop, it's besides the point
sleep();
}
delete thrd;
}
void myclass::mymethod() {
// do stuff
finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.