Protecting main thread from errors in worker thread

When using POSIX threads, is there some way to "protect" the main thread from errors (such as null-pointer dereferences, division by zero, etc.) caused by worker threads? By "worker thread" I mean a POSIX thread created by pthread_create().
Unfortunately, we cannot use exceptions, so no "catch", etc.
Here is my test program (C++):
#include <iostream>
#include <pthread.h>
using namespace std;
void* workerThreadFunc(void* threadId) {
    int* a = NULL;
    *a = 5; // Error (segmentation fault)
    pthread_exit(NULL);
}
int main() {
    cout << "Main thread start" << endl;
    pthread_t workerThread;
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    pthread_create(&workerThread, &attr, workerThreadFunc, (void*)0);
    pthread_join(workerThread, NULL);
    cout << "Main thread end" << endl;
}
In the example above, the error caused by workerThread will terminate the whole program. But I would like the main thread to continue running despite this error. Is this possible to achieve?

It sounds to me like you should be using multiple processes, not threads. Independent processes are automatically protected from these sorts of errors happening in other processes.
You can use pipes or shared memory (or other forms of IPC) to pass data between the processes, which has the additional benefit of sharing only the memory you intend to share. A bug in the worker "thread" then cannot stomp on the stack of the main "thread", because it is a separate process with a separate address space.
Threads can be useful, but they come with several disadvantages; sometimes running in separate processes is more appropriate.
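As a rough illustration of the process-based approach, here is a minimal sketch assuming a POSIX system with fork() and waitpid(). The names runIsolated, crashingWorker and healthyWorker are made up for this example, and raise(SIGSEGV) stands in for the null-pointer write from the question so that the failure is deterministic.

```cpp
#include <cassert>      // for the checks shown in the usage note
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical worker that fails like the question's thread does.
// raise(SIGSEGV) stands in for the null-pointer write.
void crashingWorker() {
    std::raise(SIGSEGV);
}

// Hypothetical worker that finishes without error.
void healthyWorker() {}

// Runs `worker` in a child process. Returns the number of the signal that
// killed the child, 0 if the child exited on its own, or -1 if fork() or
// waitpid() failed.
int runIsolated(void (*worker)()) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {      // child: works on its own copy of the address space
        worker();
        _exit(0);        // leave without running the parent's atexit handlers
    }
    int status = 0;      // parent: still alive whatever happened in the child
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With this in place, assert(runIsolated(crashingWorker) == SIGSEGV) holds in the parent, which then simply carries on. Results would have to come back through a pipe or shared memory, which this sketch leaves out.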

The only way I can think of doing this is to register a signal handler which, instead of aborting the program, terminates the thread that is currently running, something like this:
void handler(int sig)
{
    pthread_exit(NULL);
}
signal(SIGSEGV, handler);
Note, however, that this is unsafe: pthread_exit() is not on the list of async-signal-safe functions that may be called from a signal handler. It might work and it might not, depending on the OS you are running under and on which signal you are handling.

Assuming your system uses signals in a POSIX sort of a way (though that may fall under the "no exceptions" rule), then POSIX says:
At the time of generation, a determination shall be made whether the signal has been generated for the process or for a specific thread within the process. Signals which are generated by some action attributable to a particular thread, such as a hardware fault, shall be generated for the thread that caused the signal to be generated.
So you can handle SIGSEGV, SIGFPE, etc. on a per pthread basis (but note that you can only set one signal handler function for the entire process). So, you can "protect" the process from being stopped dead by a failure in a single pthread... up to a point. The problem, of course, is that you may find it very difficult to tell what state the process -- the failed pthread, and all the other pthreads -- is in. The failed pthread may be holding a number of mutexes. The failed pthread may be leaving some shared data structure(s) in a mess. Who knows what sort of a tangle things are in -- unless the pthreads are essentially independent. It may be possible to arrange for other pthreads to close down "gracefully"... rather than crash and burn. It may, in the end, be safer to stop all pthreads dead, rather than try to continue in some less than well defined state. It will depend entirely on the nature of the application.
Nothing is for nothing... threads can communicate with each other more easily than processes, and cost less to start and stop -- processes are less vulnerable to failure of other processes.

Related

How to cleanly exit a threaded C++ program?

I am creating multiple threads in my program. On pressing Ctrl-C, a signal handler is called, and the last thing that handler does is call exit(0). Sometimes the program terminates safely, but at other times I get a runtime error stating
abort() has been called
What would be a possible solution to avoid the error?

The usual way is to set an atomic flag (like std::atomic<bool>) which is checked by all threads (including the main thread). If set, then the sub-threads exit, and the main thread starts to join the sub-threads. Then you can exit cleanly.
If you use std::thread for the threads, that's a possible reason for the crashes you have. You must join (or detach) a thread before its std::thread object is destroyed; otherwise the destructor calls std::terminate(), which by default calls abort().
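A minimal sketch of the flag-and-join idea, assuming C++11 std::thread. The names stopRequested, exitedCleanly and runAndStop are invented for the example, and the 1 ms sleep stands in for real work. In a real program the Ctrl-C handler would do nothing except set the flag, which is safe from a handler when std::atomic<bool> is lock-free (it usually is, but that is an assumption about the platform).

```cpp
#include <atomic>
#include <cassert>      // for the check shown in the usage note
#include <chrono>
#include <thread>
#include <vector>

std::atomic<bool> stopRequested{false};  // hypothetical shared stop flag
std::atomic<int>  exitedCleanly{0};      // workers that returned normally

void worker() {
    while (!stopRequested.load()) {
        // One small unit of work per iteration; the sleep stands in for it.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    ++exitedCleanly;                     // reached only by leaving the loop
}

// Starts `n` workers, requests a stop, joins all of them and returns how
// many left their loop through the normal return path.
int runAndStop(int n) {
    stopRequested = false;
    exitedCleanly = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker);
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    stopRequested = true;                // all a signal handler would do
    for (auto& t : threads) t.join();    // join before any std::thread dies
    return exitedCleanly.load();
}
```

For example, assert(runAndStop(4) == 4) passes: every worker notices the flag, and the vector of std::thread objects is destroyed only after all of them have been joined.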

Others have mentioned having the signal-handler set a std::atomic<bool> and having all the other threads periodically check that value to know when to exit.
That approach works well as long as all of your other threads are periodically waking up anyway, at a reasonable frequency.
It's not entirely satisfactory if one or more of your threads is purely event-driven, however -- in an event-driven program, threads are only supposed to wake up when there is some work for them to do, which means that they might well be asleep for days or weeks at a time. If they are forced to wake up every (so many) milliseconds simply to poll an atomic-boolean-flag, that makes an otherwise extremely CPU-efficient program much less CPU-efficient, since now every thread is waking up at short regular intervals, 24/7/365. This can be particularly problematic if you are trying to conserve battery life, as it can prevent the CPU from going into power-saving mode.
An alternative approach that avoids polling would be this one:
1) On startup, have your main thread create an fd-pipe or socket-pair (by calling pipe() or socketpair()).
2) Have your main thread (or possibly some other responsible thread) include the receiving-socket in its read-ready select() fd_set (or take a similar action for poll() or whatever wait-for-IO function that thread blocks in).
3) When the signal-handler is executed, have it write a byte (any byte, doesn't matter what) into the sending-socket.
4) That will cause the main thread's select() call to immediately return, with FD_ISSET(receivingSocket) indicating true because of the received byte.
5) At that point, your main thread knows it is time for the process to exit, so it can start directing all of its child threads to start shutting down (via whatever mechanism is convenient; atomic booleans or pipes or something else).
6) After telling all the child threads to start shutting down, the main thread should then call join() on each child thread, so that it can be guaranteed that all of the child threads are actually gone before main() returns. (This is necessary because otherwise there is a risk of a race condition -- e.g. the post-main() cleanup code might occasionally free a resource while a still-executing child thread was still using it, leading to a crash.)
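The signal-to-pipe part of the steps above can be sketched roughly as follows, assuming a POSIX system. This version uses pipe() and poll() rather than socketpair() and select(); the names wakeupWriteFd, onShutdownSignal and signalWakesPoll are made up for the example, and raise() stands in for the user pressing Ctrl-C. Telling the child threads to stop and joining them is left out.

```cpp
#include <cassert>      // for the check shown in the usage note
#include <poll.h>
#include <signal.h>
#include <unistd.h>

static int wakeupWriteFd = -1;   // hypothetical global: write end of the pipe

// Signal handler: write() is async-signal-safe, so sending one byte is fine.
extern "C" void onShutdownSignal(int) {
    const char byte = 1;
    ssize_t ignored = write(wakeupWriteFd, &byte, 1);
    (void)ignored;
}

// Creates the pipe, installs the handler for `signo`, raises that signal and
// reports whether the read end of the pipe became readable as a result.
bool signalWakesPoll(int signo) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    wakeupWriteFd = fds[1];

    struct sigaction sa = {};
    sa.sa_handler = onShutdownSignal;
    sigemptyset(&sa.sa_mask);
    struct sigaction old = {};
    if (sigaction(signo, &sa, &old) != 0) return false;

    raise(signo);                // stands in for Ctrl-C

    // The main thread would normally block here, with the pipe's read end
    // listed next to whatever other descriptors it waits on.
    struct pollfd pfd = {fds[0], POLLIN, 0};
    int ready = poll(&pfd, 1, 1000 /* ms; a real loop could wait forever */);
    bool woke = (ready == 1) && (pfd.revents & POLLIN);

    sigaction(signo, &old, nullptr);   // restore the previous disposition
    close(fds[0]);
    close(fds[1]);
    wakeupWriteFd = -1;
    return woke;
}
```

Calling signalWakesPoll(SIGUSR1) should return true: the handler's single byte is what makes poll() return, so no thread has to wake up periodically just to look at a flag.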

The first thing you must accept is that threading is hard.
A "program using threading" is about as generic as a "program using memory", and your question is similar to "how do I not corrupt memory in a program using memory?"
The way you handle threading problems is to restrict how you use threads and the behavior of the threads.
If your threading system is a bunch of small operations composed into a data flow network, with an implicit guarantee that if an operation is too big it is broken down into smaller operations and/or does checkpoints with the system, then shutting down looks very different than if you have a thread that loads an external DLL that then runs it for somewhere from 1 second to 10 hours to infinite length.
Like most things in C++, solving your problem is going to be about ownership, control and (as a last resort) hacks.
Like data in C++, every thread should be owned. The owner of a thread should have significant control over that thread, and be able to tell it that the application is shutting down. The shut down mechanism should be robust and tested, and ideally connected to other mechanisms (like early-abort of speculative tasks).
The fact you are calling exit(0) is a bad sign. It implies your main thread of execution doesn't have a clean shutdown path. Start there; the interrupt handler should signal the main thread that shutdown should begin, and then your main thread should shut down gracefully. All stack frames should unwind, data should be cleaned up, etc.
Then the same kind of logic that permits that clean and fast shutdown should also be applied to your threaded off code.
Anyone telling you it is as simple as a condition variable/atomic boolean and polling is selling you a bill of goods. That will only work in simple cases if you are lucky, and determining if it works reliably is going to be quite hard.

In addition to Some programmer dude's answer, and related to the discussion in the comment section: you need to make the flag that controls the termination of your threads an atomic type.
Consider the following case:
bool done = false; // plain bool: accessed by two threads without synchronization
void pending_thread()
{
    while(!done)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // do something that depends on working thread results
}
void worker_thread()
{
    //do something for pending thread
    done = true;
}
Here the worker thread can also be your main thread, and done is the termination flag of your thread; the pending thread needs to do something with the data produced by the worker thread before exiting.
This example has a race condition, and therefore undefined behaviour, and in the real world it is really hard to find out what the actual problem is.
Now the corrected version using std::atomic:
std::atomic<bool> done(false);
void pending_thread()
{
    while(!done.load())
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // do something that depends on working thread results
}
void worker_thread()
{
    //do something for pending thread
    done = true;
}
Now the thread can exit without any concern about a race condition or UB.

Is there a reliable way to force a thread to stop in C++? (especially detached ones)

I have recently been working with threads in C++11, and now I am wondering how to force a thread to stop. I couldn't find an answer on Stack Overflow. I have also tried these:
One variable per thread: not so reliable
return in the main thread: I need to force-quit only one thread, not all of them
and I have no more ideas. I have heard about WinAPI, but I want a portable solution (which also means I won't use fork()).
Can you please give me a solution for this? I really want to do it.

One of the biggest problems with force-closing a thread in C++ is the RAII violation.
When a function (and subsequently, a thread) finishes gracefully, everything it held is cleaned up by the destructors of the objects the function/thread created:
Memory gets freed.
OS resources (handles, file descriptors, etc.) are closed and returned to the OS.
Locks are unlocked so other threads can use the shared resources they protect.
Other important tasks are performed (such as updating counters, logging, etc.).
If you brutally kill a thread (with TerminateThread on Windows, for example), none of these actually happen, and the program is left in a very dangerous state.
A (not-so) common pattern is to register a "cancellation token" which the thread monitors, shutting itself down gracefully when another thread asks it to (a la TPL/PPL). Something like:
auto cancellationToken = std::make_shared<std::atomic_bool>();
cancellationToken->store(false);
class ThreadTerminator : public std::exception{/*...*/};
std::thread thread([cancellationToken]{
    try{
        //... do things
        if (cancellationToken->load()){
            // someone asked the thread to close
            throw ThreadTerminator();
        }
        //do other things...
        if (cancellationToken->load()){
            // someone asked the thread to close
            throw ThreadTerminator();
        }
        //...
    }catch(const ThreadTerminator&){
        return; // stack unwinding has already run the destructors
    }
});
Usually, one doesn't even open a new thread for a small task; it's better to think of a multithreaded application as a collection of concurrent tasks and parallel algorithms. One opens a new thread for some long-running background task, which is usually performed in some sort of loop (such as accepting incoming connections).
So the cases for asking a small task to be cancelled are rare anyway.
tldr:
Is there a reliable way to force a thread to stop in C++?
No.

Here is my approach for most of my designs:
Think of 2 kinds of Threads:
1) primary - I call main.
2) subsequent - any thread launched by main or any subsequent thread
When I launch std::thread's in C++ (or posix threads in C++):
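a) I provide all subsequent threads access to a boolean "done", initialized to false. This bool can be directly passed from main (or indirectly through other mechanisms). Strictly speaking it should be a std::atomic<bool>, since a plain bool written by one thread and read by another is a data race.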
b) All my threads have a regular 'heartbeat', typically with a posix semaphore or std::mutex, sometimes with just a timer, and sometimes simply during normal thread operation.
Note that a 'heartbeat' is not polling.
Also note that checking a boolean is really cheap.
Thus, whenever main wants to shut down, it merely sets done to true and 'join's with the subsequent threads.
On occasion main will also signal any semaphore (prior to join) that a subsequent thread might be waiting on.
And sometimes, a subsequent thread has to let its own subsequent thread know it is time to end.
Here is an example -
main launching a subsequent thread:
std::thread* thrd =
new std::thread(&MyClass_t::threadStart, this, id);
assert(nullptr != thrd);
Note that I pass the this pointer to this launch ... within this class instance is a boolean m_done.
Main Commanding shutdown:
In main thread, of course, all I do is
m_done = true;
In a subsequent thread (and in this design, all are using the same critical section):
void threadStart(uint id) {
std::cout << id << " " << std::flush; // thread announce
do {
doOnce(id); // the critical section is in this method
}while(!m_done); // exit when done
}
And finally, at an outer scope, main invokes the join.
Perhaps the takeaway is: when designing a threaded system, you should also design the system shutdown, not just add it on.

Interrupt boost thread that is already in condition variable wait call

I'm using the boost interprocess library to create server and client programs for passing opencv mat objects around in shared memory. Each server and client process has two boost threads which are members of a boost::thread_group. One handles command line IO while the other manages data processing. Shared memory access is synchronized using boost::interprocess condition_variables.
Since this program involves shared memory, I need to do some manual cleaning before exiting. My problem is that if the server terminates prematurely, then the processing thread on the client blocks at the wait() call, since the server is responsible for sending notifications. I need to somehow interrupt the thread stuck at wait() to initiate shared memory destruction. I understand that calling interrupt() (in my case, thread_group.interrupt_all()) on the thread will cause the boost::thread_interrupted exception to be thrown upon reaching an interruption point (such as wait()), which, if left unhandled, would allow the shared memory destruction to proceed. However, when I try to interrupt the thread while it is in wait(), nothing seems to happen. For instance, this prints nothing to the command line:
try {
shared_mat_header->new_data_condition.wait(lock);
} catch (...) {
std::cout << "Thread interrupt occurred\n";
}
I am not at all sure, but it seems like the interrupt() call needs to occur before the thread enters wait() for the exception to be thrown. Is this true? If not, then what is the proper way to interrupt a boost thread that is blocked by a condition_variable.wait() call?
Thanks for any insight.
Edit
I accepted Chris Desjardins' answer, which does not answer the question directly, but has the intended effect. Here I'm translating his code snippet for use with boost::interprocess condition variables, which have slightly different syntax than boost::thread condition variables:
while (_running) {
boost::system_time timeout = boost::get_system_time() + boost::posix_time::milliseconds(1);
if (shared_mat_header->new_data_condition.timed_wait(lock, timeout))
{
//process data
}
}

I prefer to wait with timeouts, then check the return code of the wait call to see if it timed out or not. In fact I have a thread pattern I like to use that resolves this situation (and other common problems with threads in c++).
http://blog.chrisd.info/how-to-run-threads/
The main point for you is to not block infinitely in a thread, so your thread would look like this:
while (_running == true)
{
if (shared_mat_header->new_data_condition.wait_for(lock, boost::chrono::milliseconds(1)) == boost::cv_status::no_timeout)
{
// process data
}
}
Then in your destructor you set _running = false; and join the thread(s).
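For comparison, here is a minimal sketch of the same timed-wait loop written with the standard library (std::condition_variable) instead of Boost, so that it is self-contained. The Consumer type, its member names and consumerStopsWithoutData are invented for this example and are not part of the original code.

```cpp
#include <atomic>
#include <cassert>      // for the check shown in the usage note
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

struct Consumer {
    std::mutex m;
    std::condition_variable newData;
    bool dataReady = false;          // guarded by m
    int processed = 0;               // guarded by m
    std::atomic<bool> running{true};
    std::thread thread;              // last member: starts after the rest exist

    Consumer() : thread([this] { loop(); }) {}
    ~Consumer() { if (thread.joinable()) stop(); }

    // Clears the flag, joins the thread and returns the processed count.
    int stop() {
        running = false;             // noticed within one timeout period
        thread.join();
        return processed;
    }

    void loop() {
        std::unique_lock<std::mutex> lock(m);
        while (running) {
            // Returns true if dataReady was set, false after the timeout.
            if (newData.wait_for(lock, std::chrono::milliseconds(1),
                                 [this] { return dataReady; })) {
                dataReady = false;
                ++processed;         // "process data"
            }
        }
    }
};

// A consumer that is never notified can still be stopped and joined.
int consumerStopsWithoutData() {
    Consumer c;
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    return c.stop();
}
```

Here consumerStopsWithoutData() returns 0 and, more to the point, it returns at all: the thread never blocks for longer than the timeout, so it sees running == false even if the producer has died.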

Try using the "notify function". Keep a pointer to your condition variable and call that instead of interrupting the threads. Interrupting is much more costly than a notify call.
So instead of doing
thread_group.interrupt_all()
call this instead
new_data_condition_pointer->notify_one()

How do I terminate a thread in C++11?

I don't need to terminate the thread correctly, or make it respond to a "terminate" command. I am interested in terminating the thread forcefully using pure C++11.

1) You could call std::terminate() from any thread and the thread you're referring to will forcefully end.
2) You could arrange for ~thread() to be executed on the object of the target thread, without an intervening join() or detach() on that object. This will have the same effect as option 1.
3) You could design an exception which has a destructor which throws an exception. And then arrange for the target thread to throw this exception when it is to be forcefully terminated. The tricky part on this one is getting the target thread to throw this exception.
Options 1 and 2 don't leak intra-process resources, but they terminate every thread.
Option 3 will probably leak resources, but is partially cooperative in that the target thread has to agree to throw the exception.
There is no portable way in C++11 (that I'm aware of) to non-cooperatively kill a single thread in a multi-thread program (i.e. without killing all threads). There was no motivation to design such a feature.
A std::thread may have this member function:
native_handle_type native_handle();
You might be able to use this to call an OS-dependent function to do what you want. For example on Apple's OS's, this function exists and native_handle_type is a pthread_t. If you are successful, you are likely to leak resources.

@Howard Hinnant's answer is both correct and comprehensive. But it might be misunderstood if it's read too quickly, because std::terminate() (whole process) happens to have the same name as the "terminating" that @Alexander V had in mind (1 thread).
Summary: "terminate 1 thread + forcefully (target thread doesn't cooperate) + pure C++11 = No way."

I guess the thread that needs to be killed is either in any kind of waiting mode, or doing some heavy job.
I would suggest using a "naive" way.
Define some global boolean:
std::atomic_bool stop_thread_1{false}; // brace-init: copy-initialization from bool is ill-formed before C++17
Put the following code (or similar) in several key points, in a way that it will cause all functions in the call stack to return until the thread naturally ends:
if (stop_thread_1)
return;
Then to stop the thread from another (main) thread:
stop_thread_1 = true;
thread1.join ();
stop_thread_1 = false; //(for next time. this can be when starting the thread instead)

Tips for using an OS-dependent function to terminate a C++ thread:
std::thread::native_handle() only returns the thread's valid native handle before join() or detach() is called. After that, native_handle() no longer returns a usable handle (it may return 0), and passing that value to pthread_cancel() will crash with a core dump.
To call a native thread termination function (e.g. pthread_cancel()) effectively, you need to save the native handle before calling std::thread::join() or std::thread::detach(), so that your native terminator always has a valid native handle to use.
For more explanation, please refer to: http://bo-yang.github.io/2017/11/19/cpp-kill-detached-thread .

This question actually goes deeper, and a good understanding of multithreading concepts in general will give you insight into the topic. In fact, there is no language or operating system that provides facilities for asynchronous, abrupt thread termination without a warning not to use them. All of these execution environments strongly advise developers, or even require them, to build multithreaded applications on the basis of cooperative (synchronous) thread termination. The reason for this common decision and advice is that they are all built on the same general multithreading model.
Let's compare the multiprocessing and multithreading concepts to better understand the advantages and limitations of the latter.
Multiprocessing assumes splitting the entire execution environment into a set of completely isolated processes controlled by the operating system. A process incorporates and isolates the state of its execution environment, including the local memory of the process and the data inside it, and all system resources such as files, sockets and synchronization objects. Isolation is a critically important characteristic of a process, because it limits fault propagation to the process boundary. In other words, no process can affect the consistency of any other process in the system. The same is true for process behaviour, but in a less restricted and more blurred way. In such an environment any process can be killed at any "arbitrary" moment because, firstly, each process is isolated; secondly, the operating system has full knowledge of all resources used by the process and can release all of them without leaking; and finally, the process is killed by the OS not really at an arbitrary moment, but at one of a number of well-defined points where the state of the process is well known.
In contrast, multithreading assumes running multiple threads in the same process. All of these threads share the same isolation box, and the operating system has no control over the internal state of the process. As a result, any thread is able to change the global process state, as well as corrupt it. At the same time, the points at which the state of a thread is known to be safe for killing it depend completely on the application logic and are known neither to the operating system nor to the programming language runtime. As a result, terminating a thread at an arbitrary moment means killing it at an arbitrary point of its execution path, and can easily lead to process-wide data corruption, memory and handle leaks, thread leaks, and spinlocks and other intra-process synchronization primitives left in the locked state, preventing other threads from making progress.
Because of this, the common approach is to make developers implement synchronous or cooperative thread termination, where one thread can request the termination of another, and the other thread can check for this request at a well-defined point and start the shutdown procedure from a well-defined state, releasing all global system-wide resources and local process-wide resources in a safe and consistent way.

Maybe TerminateThread? Windows only.
WINBASEAPI WINBOOL WINAPI TerminateThread (HANDLE hThread, DWORD dwExitCode);
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-terminatethread

You can't use the C++ std::thread destructor to terminate a single thread in a multithreaded program. Here's the relevant code snippet of the std::thread destructor, located in the thread header file (Visual C++):
~thread()
{
if (joinable())
std::terminate();
}
If you call the destructor of a joinable thread, the destructor calls std::terminate(), which acts on the whole process, not on the thread; otherwise, the destructor does nothing.
It is possible to "terminate the thread forcefully" (C++11 std::thread) by using an OS function. On Windows, you can use TerminateThread. "TerminateThread is a dangerous function that should only be used in the most extreme cases." - Microsoft | Learn.
TerminateThread(tr.native_handle(), 1);
For TerminateThread to take effect, you must not call join() / detach() beforehand, since such a call will nullify native_handle().
You should call detach() (or join()) after TerminateThread. Otherwise, as described in the first paragraph, std::terminate() will be called in the thread destructor and the whole process will be terminated.
Example:
#include <chrono>
#include <cstdint>
#include <iostream>
#include <thread>
#include <Windows.h>
void Work10Seconds()
{
std::cout << "Work10Seconds - entered\n";
for (uint8_t i = 0; i < 20; ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(500));
std::cout << "Work10Seconds - working\n";
}
std::cout << "Work10Seconds - exited\n";
}
int main() {
std::cout << "main - started\n";
std::thread tr{};
std::cout << "main - Run 10 seconds work thread\n";
tr = std::thread(Work10Seconds);
std::cout << "main - Sleep 2 seconds\n";
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "main - TerminateThread\n";
TerminateThread(tr.native_handle(), 1);
tr.detach(); // After TerminateThread
std::cout << "main - Sleep 2 seconds\n";
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "main - exited\n";
}
Output:
main - started
main - Run 10 seconds work thread
main - Sleep 2 seconds
Work10Seconds - entered
Work10Seconds - working
Work10Seconds - working
Work10Seconds - working
main - TerminateThread
main - Sleep 2 seconds
main - exited

Is it possible to kill a spinning thread?

I am using ZThreads to illustrate the question but my question applies to PThreads, Boost Threads and other such threading libraries in C++.
class MyClass: public Runnable
{
public:
void run()
{
while(1)
{
}
}
};
I now launch this as follows:
MyClass *myClass = new MyClass();
Thread t1(myClass);
Is it now possible to kill (violently if necessary) this thread? I know I could do it if the thread were blocking, that is, if it had a Thread::Sleep(100000) instead of the infinite loop. But can I kill a spinning thread (one that is doing computation)? If yes, how? If not, why not?

As far as Windows goes (from MSDN):
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
Boost certainly doesn't have a thread-killing function.

A general solution to the kind of question posted can be found in Herb Sutter's article:
Prefer Using Active Objects Instead of Naked Threads
This permits you to have something like this (excerpt from article):
class Active {
public:
typedef function<void()> Message;
private:
Active( const Active& ); // no copying
void operator=( const Active& ); // no copying
bool done; // le flag
message_queue<Message> mq; // le queue
unique_ptr<thread> thd; // le thread
void Run() {
while( !done ) {
Message msg = mq.receive();
msg(); // execute message
} // note: last message sets done to true
}
In the active object destructor you can have then:
~Active() {
    Send( [&]{ done = true; } );
thd->join();
}
This solution promotes a clean exit from the thread function, and avoids all the other issues related to unclean thread termination.
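The excerpt above leaves out the constructor, Send() and the message_queue type. Here is one way the missing pieces could look, as a sketch only: it substitutes a std::queue guarded by a mutex and condition variable for message_queue, stores the thread directly instead of in a unique_ptr, and adds a made-up helper, countWithActive, to show the object in use. None of this is taken from the article itself.

```cpp
#include <cassert>      // for the check shown in the usage note
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class Active {
public:
    typedef std::function<void()> Message;

    Active() : done(false), thd([this] { Run(); }) {}

    // Queues the shutdown message behind all pending work, then waits.
    ~Active() {
        Send([this] { done = true; });
        thd.join();
    }

    Active(const Active&) = delete;             // no copying
    Active& operator=(const Active&) = delete;  // no copying

    void Send(Message msg) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            mq.push(std::move(msg));
        }
        cv.notify_one();
    }

private:
    void Run() {
        while (!done) {               // done is only touched on this thread
            Message msg;
            {
                std::unique_lock<std::mutex> lock(mtx);
                cv.wait(lock, [this] { return !mq.empty(); });
                msg = std::move(mq.front());
                mq.pop();
            }
            msg();                    // execute the message outside the lock
        }
    }

    bool done;                        // le flag
    std::mutex mtx;
    std::condition_variable cv;
    std::queue<Message> mq;           // le queue (stand-in for message_queue)
    std::thread thd;                  // le thread; last, so it starts last
};

// Sends `n` increments and returns the counter after the object is gone.
int countWithActive(int n) {
    int counter = 0;
    {
        Active a;
        for (int i = 0; i < n; ++i) a.Send([&counter] { ++counter; });
    }                                 // destructor drains the queue, joins
    return counter;
}
```

For instance, countWithActive(100) returns 100: the "done" message is queued last, so every earlier message runs before the thread leaves its loop and the destructor's join() returns.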

It is possible to terminate a thread forcefully, but the call to do it is going to be platform specific. For example, under Windows you could do it with the TerminateThread function.
Keep in mind that if you use TerminateThread, the thread will not get a chance to release any resources it is using until the program terminates.

If you need to kill a thread, consider using a process instead.
Especially if you tell us that your "thread" is a while (true) loop that may sleep for a long period of time, performing operations that are necessarily blocking. To me, that indicates process-like behavior.
Processes can be terminated in a number of ways at almost any time, and always in a clean way. They may also offer more reliability in case of a crash.
Modern operating systems offer an array of interprocess communication facilities: sockets, pipes, shared memory, memory-mapped files... Processes may even exchange file descriptors.
Good OSes have a copy-on-write mechanism, so processes are cheap to fork.
Note that if your operations can be made in a non-blocking way, then you should use a poll-like mechanism instead. Boost::asio may help there.

You can with TerminateThread() API, but it is not recommended.
More details at:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686717(v=vs.85).aspx

As people have already said, there is no portable way to kill a thread, and in some cases it is not possible at all. If you have control over the code (i.e. can modify it), one of the simplest ways is to have a boolean variable that the thread checks at regular intervals; if it is set, the thread terminates as soon as possible.

Can't you add something like the following?
do {
    //stuff here
} while (!abort);
Then check the flag once in a while: between computations if they are short (as in the loop above), or in the middle of a computation if it is long, aborting it there.

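Not sure about the other libraries, but the pthread library provides the pthread_kill() function. Note that, despite its name, pthread_kill() only sends a signal to the given thread; what happens next depends on how that signal is handled.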

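Yes.
Define a keepAlive variable as an int (in C++11 and later, preferably a std::atomic<int>, so that the spinning thread is guaranteed to see the update).
Initially set keepAlive = 1.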
class MyClass: public Runnable
{
public:
void run()
{
while(keepAlive)
{
}
}
};
Now, whenever you want to kill the thread, just set keepAlive = 0.
Q. How does this work?
A. The thread stays alive as long as its function keeps executing, so it is pretty simple to terminate it: set the variable to 0 and the loop exits, which ends the thread. [This is the safest way I have found to date.]
