Why and when shouldn't I kill a thread? - c++

I am writing a multithreaded socket server and I need to know for sure.
Articles about threads say that I should wait for the thread to return, instead of killing it. In some cases though, the thread of the user I want to kick/ban will not be able to return properly (for example, I started to send a big block of data and send() is blocking the thread at the moment), so I'll need to just kill it.
Why are thread-killing functions dangerous, and when can they crash the whole application?

Killing a thread means stopping all execution exactly where it is at the moment. In particular, it will not execute any destructors. This means sockets and files won't be closed, dynamically-allocated memory will not be freed, mutexes and semaphores won't be released, etc. Killing a thread is almost guaranteed to cause resource leaks and deadlocks.
Thus, your question is kind of reversed. The real question should read:
When, and under what conditions can I kill a thread?
So, you can kill a thread only when you are convinced that no leaks or deadlocks can occur, neither now nor after the other thread's code is modified later (which is pretty much impossible to guarantee).
In your specific case, the solution is to use non-blocking sockets and check some thread/user-specific flag between calls to send() and recv(). This will likely complicate your code, which is probably why you've been resisting doing so, but it's the proper way to go about it.
Moreover, you will quickly realize that a thread-per-client approach doesn't scale, so you'll change your architecture and re-write lots of it anyways.
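The flag-checking approach described above might look roughly like the following. This is a minimal sketch, not the answerer's code: try_send_chunk() is a hypothetical stand-in for a non-blocking send() that accepts a few bytes per call, and send_until_kicked() and demo_kick() are names made up for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for a non-blocking send(): pretends the socket
// accepts at most 4 bytes per call and returns how many bytes it took.
std::size_t try_send_chunk(const char* /*data*/, std::size_t len) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return len < 4 ? len : 4;
}

// Sends `len` bytes in small pieces and checks the per-client flag between
// calls. Returns the number of bytes sent before finishing or being kicked.
std::size_t send_until_kicked(const char* data, std::size_t len,
                              const std::atomic<bool>& kicked) {
    std::size_t sent = 0;
    while (sent < len && !kicked.load()) {
        sent += try_send_chunk(data + sent, len - sent);
    }
    return sent;  // normal return: locals are destroyed, nothing leaks
}

// Demo: kick the client while a large transfer is still in progress.
bool demo_kick() {
    static char big[1 << 20] = {};
    std::atomic<bool> kicked{false};
    std::size_t sent = 0;
    std::thread client([&] { sent = send_until_kicked(big, sizeof big, kicked); });
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    kicked = true;   // request the exit instead of killing the thread
    client.join();   // the thread returns on its own shortly afterwards
    return sent < sizeof big;
}
```

The client thread notices the flag within one chunk and returns through its normal path, so the kick costs at most one short send call of latency.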

Killing a thread can cause your program to leak resources because the thread did not get a chance to clean up after itself. Consider closing the socket handle the thread is sending on. This will cause the blocking send() to return immediately with an appropriate error code. The thread can then clean up and die peacefully.

If you kill your thread the hard way it can leak resources.
You can avoid this if you design your thread to support cancelation.
Avoid blocking calls, or use blocking calls with a timeout. Receive or send data in smaller chunks or asynchronously.

You really don't want to do this.
If you kill a thread while it holds a critical section, that critical section is never released, which will likely break your whole application. Certain C library calls, such as heap memory allocation, use critical sections; if you happen to kill your thread while it's doing a "new", then calling new from anywhere else in your program will cause the calling thread to hang.
You simply can't do this safely without really extreme measures which are much more restrictive than simply signalling the thread to terminate itself.

There are many reasons, but here's an easy one: there's only one heap. If a thread allocates ANYTHING on the heap, and you kill it, whatever it has allocated stays around until the process ends. Each thread gets its own stack, and so that MAY be freed (implementation-dependent), but you GUARANTEE leaks on the heap by not letting the thread shut itself down.

In the case of a thread blocked in I/O you never really need to kill it; instead you have the choice between non-blocking I/O, timeouts, and closing the socket from another thread. Any of these will unblock the thread.
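As a rough illustration of unblocking a thread from the outside, here is a POSIX-only sketch. It rests on several assumptions: a socketpair() stands in for a real client connection, the function name is made up, and it calls shutdown() before close(), because on some systems (Linux, for example) closing a descriptor does not reliably wake a thread that is already blocked on it.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <chrono>
#include <thread>

// Blocks a worker in recv() and wakes it from the calling thread.
// Returns the value recv() produced once it was woken (-2 on setup failure).
long unblock_by_shutdown() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return -2;

    long result = -3;
    std::thread worker([&] {
        char buf[64];
        // Nothing is ever written to the other end, so this would block forever.
        result = recv(fds[0], buf, sizeof buf, 0);
        // ... normal clean-up would go here, then the thread returns ...
    });

    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    shutdown(fds[0], SHUT_RDWR);  // wakes the blocked recv()
    worker.join();                // the worker ends by returning

    close(fds[0]);                // close only after the worker is done with it
    close(fds[1]);
    return result;
}
```

After the shutdown, recv() returns 0 (end of stream), which the worker can treat as "connection gone" and exit through its normal path.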

Related

When is a thread actually terminated when calling TerminateThread?

If I terminate a thread on Windows using the TerminateThread function, is that thread actually terminated once the function returns or is termination asynchronous?
Define "actually terminated". The documentation says the thread can not execute any more user-mode code, so effectively: yes, it is terminated, nothing of your code is going to be executed by that thread any more.
If you "WaitForSingleObject" on it right after terminating, I guess there could still be some slight delay because of cleanup that Windows is doing.
By the way: TerminateThread is the worst way of ending a thread. Try using some other means of synchronization, like a global variable that tells the thread to stop, or an event for example.
Terminating a thread is akin to killing a process, only on a per-thread level. It may in fact be implemented by raising an (uncatchable) signal in the targeted thread.
The result is essentially the same: Your program is not in any particular, predictable state. There's not much you can do with the dead thread. The control flow of your program becomes generally indeterminate, and thus it is extremely hard to reason about your program's behaviour in the presence of thread termination.
Basically, unless your thread is doing something extremely narrow, specific and restricted (e.g. increment an atomic counter once every second), there's no good model for the need to terminate a thread, and for the state of the program after the thread termination.
Don't do it. Design your threads so that you can communicate with them and so that their entry functions can return. Design your program so that you can always join all threads eventually and account for everything.
It is a synchronous call. That does not mean that it necessarily returns quickly - there may be some blocking involved if the OS has to resort to using its inter-core driver to stop the thread (i.e. the thread is actually running on a different core than the thread requesting the termination).
There are issues with calling TerminateThread from user code during an app run, (as distinct from the kernel using it during app/process termination), as clearly posted by others.
I try very hard to never terminate threads at all during an app run, with TerminateThread or by any other means. App-lifetime threads and thread pools often do not require any explicit termination before the OS destroys them on app close.

CreateThread followed by TerminateThread leaves behind a lot of memory

I'm using CreateThread then TerminateThread to cancel threads. It seems like stack space is still allocated. Is there a way to deal with this? I am not using any form of dynamic memory calls such as malloc/new. Threads do not have to exit gracefully. 10 threads leave behind a whopping 5 MB of memory! The threads are all in varying parts of the code, so is there a simple way to implement an inter-thread communication system which can tell them all to exit gracefully, and therefore reorient the stack?
In most cases you should not use TerminateThread(). If you create new threads in your application, it's your responsibility to make sure that those threads do exit gracefully. When you use TerminateThread(), all kinds of resources may be left behind because this function simply terminates the thread without calling clean-up functions.
TerminateThread documentation
What you should do is use events (or other signaling methods) to tell your threads that they're supposed to shut down. When the thread internally receives the message (the event is signaled or a wait expires, etc.) the thread function can internally clean up and return. This way you'll exit your threads correctly and not leave a mess behind.
A manual-reset (non-auto-reset) event and a WaitForMultipleObjects on your primary thread will do what you want. If you find yourself exceeding 64 concurrent worker threads, you'll have to retool to use a different approach, such as a manual-reset event and a semaphore. There are literally dozens of ways to approach this problem, and countless examples on forums throughout the internet, as well as MS's examples in their distribution of Visual Studio. Start with those.
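The answers above talk about Win32 events. As a portable approximation, the same shape can be sketched with standard C++ primitives. StopEvent is a hypothetical class that imitates a manual-reset event with a mutex and a condition variable; it is not the Win32 API, and joining each std::thread plays the role of WaitForMultipleObjects.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Portable stand-in for a manual-reset event: once set, it stays set and
// every waiter observes it.
class StopEvent {
public:
    void set() {
        { std::lock_guard<std::mutex> lock(m_); signaled_ = true; }
        cv_.notify_all();
    }
    // Waits up to `timeout`; returns true if the event is signaled.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Starts `n` workers, signals them to stop, joins them all, and returns how
// many of them ran their clean-up code.
int run_and_stop_workers(int n) {
    StopEvent stop;
    std::mutex count_mutex;
    int cleaned_up = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&] {
            while (!stop.wait_for(std::chrono::milliseconds(10))) {
                // ... one unit of work per wake-up ...
            }
            std::lock_guard<std::mutex> lock(count_mutex);
            ++cleaned_up;  // clean-up runs because the thread returns normally
        });
    }
    stop.set();                        // tell every worker to shut down
    for (auto& w : workers) w.join();  // wait until all of them have returned
    return cleaned_up;
}
```

Because each worker returns from its entry function, its stack and any other resources it holds are released in the ordinary way.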

How to know when to kill threads?

I'm designing a thread library. So far I have a method that initializes the library, one that creates threads, and one that yields the current thread to the next one on a queue of ready threads.
Before I move on to implementing semaphores for the threads, I figured I should probably kill the threads as soon as they are done and free up their allocated memory, but I'm having trouble figuring out how to do that. How do I tell when a thread has "finished"?
You don't just kill threads safely or reliably -- let them exit naturally (when their entry returns).
Although the system provides a means to kill the thread, nearly any C++ program out there could expect undefined behavior if it were to continue. You could dream up cases where killing could be accomplished without side effects (to the rest of the program), but that program does not at all resemble idiomatic C++. Such a program would be very exotic, with many unusual and severe restrictions.
When you want to know whether a thread has exited, you can add some cleanup code that runs just before it exits in order to track its status.
When you want the ability to request a thread exit (naturally), consider run loops and messages.
You don't explicitly kill the threads when they are finished running their forked procedures as the code which would be doing that would still be in the context of the thread to be killed.
You have a scheduler/interrupt handler which handles the context switching of the threads and maintains a few queues for managing this. You can have it save a reference to the threads to be killed, something like scheduler->SetThreadToKill( currentThread ); probably inside your finish() method (or similar), which sets a flag for the corresponding threads.
When a context switch occurs, and you have swapped out all data structures of the current thread with those of the next thread, your scheduler can call the destructor for all the threads which have the toBeKilled flag set.
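A toy model of that deferred destruction could look like this. It is only a sketch: there are no real stacks or register contexts, and Scheduler, Thread, and the method names are hypothetical, loosely following the SetThreadToKill and toBeKilled names mentioned above.

```cpp
#include <deque>
#include <memory>
#include <utility>

struct Thread {
    explicit Thread(int id_) : id(id_) {}
    int id;
    bool toBeKilled = false;
    // A real library would also own the stack and saved registers here.
};

class Scheduler {
public:
    void spawn(int id) { ready_.push_back(std::unique_ptr<Thread>(new Thread(id))); }

    // Called from a thread's finish(): only mark it, because the caller is
    // still running in the context of that thread.
    void setThreadToKill() { if (current_) current_->toBeKilled = true; }

    // Switches to the next ready thread, then destroys the previous thread
    // if it was marked.
    void contextSwitch() {
        std::unique_ptr<Thread> previous = std::move(current_);
        if (previous && !previous->toBeKilled) {
            ready_.push_back(std::move(previous));  // still alive: requeue it
        }
        if (!ready_.empty()) {
            current_ = std::move(ready_.front());
            ready_.pop_front();
        }
        if (previous) {        // non-null only if it was marked for deletion
            previous.reset();  // safe now: we no longer run in its context
            ++reaped_;
        }
    }

    int reaped() const { return reaped_; }
    int currentId() const { return current_ ? current_->id : -1; }

private:
    std::deque<std::unique_ptr<Thread>> ready_;
    std::unique_ptr<Thread> current_;
    int reaped_ = 0;
};

// Demo: thread 1 finishes, and is destroyed at the next context switch.
int demo_reap() {
    Scheduler s;
    s.spawn(1);
    s.spawn(2);
    s.contextSwitch();    // thread 1 becomes current
    s.setThreadToKill();  // thread 1 calls finish()
    s.contextSwitch();    // thread 2 becomes current, thread 1 is destroyed
    return s.reaped();
}
```

The point of the two-step design is that the memory of a finished thread is released by code that is no longer running on that thread's stack.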
The best policy, by far, for killing threads is to not explicitly do it (unless you are an OS, i.e. on app shutdown). Queue messages and tasks to threads that loop around some queue to perform more work. If you don't write any code to continually new, create, start, terminate, delete, test, check, enlist, delist, enqueue, dequeue and otherwise micro-manage threads, then that code cannot contain bugs.
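A thread that loops around a queue could be sketched like this, using standard C++ primitives. Worker and its members are hypothetical names, and treating an empty std::function as the "please stop" message is a convention chosen for this sketch, not something the answer prescribes.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// A long-lived thread that executes queued tasks until it is asked to stop.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    ~Worker() {
        post(std::function<void()>());  // empty task acts as the stop message
        thread_.join();                 // the thread exits by returning from run()
    }

    void post(std::function<void()> task) {
        { std::lock_guard<std::mutex> lock(m_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            if (!task) return;  // stop message: leave the loop normally
            task();             // run the task outside the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::thread thread_;  // declared last so the other members exist before it starts
};

// Demo: queue four tasks, then let the destructor stop and join the worker.
int demo_worker() {
    int sum = 0;
    {
        Worker w;
        for (int i = 1; i <= 4; ++i) w.post([&sum, i] { sum += i; });
    }  // the stop message is queued behind the tasks, so all of them run first
    return sum;
}
```

The thread is created once and reused for every task, so there is no per-task create/terminate code to get wrong.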

How to implement a timed wait around a blocking call?

So, the situation is this. I've got a C++ library that is doing some interprocess communication, with a wait() function that blocks and waits for an incoming message. The difficulty is that I need a timed wait, which will return with a status value if no message is received in a specified amount of time.
The most elegant solution is probably to rewrite the library to add a timed wait to its API, but for the sake of this question I'll assume it's not feasible. (In actuality, it looks difficult, so I want to know what the other option is.)
Here's how I'd do this with a busy wait loop, in pseudocode:
while (message == false && current_time - start_time < timeout)
{
    if (Listener.new_message()) then message = true;
}
I don't want a busy wait that eats processor cycles, though. And I also don't want to just add a sleep() call in the loop to avoid processor load, as that means slower response. I want something that does this with a proper sort of blocks and interrupts. If the better solution involves threading (which seems likely), we're already using boost::thread, so I'd prefer to use that.
I'm posting this question because this seems like the sort of situation that would have a clear "best practices" right answer, since it's a pretty common pattern. What's the right way to do it?
Edit to add: A large part of my concern here is that this is in a spot in the program that's both performance-critical and critical to avoid race conditions or memory leaks. Thus, while "use two threads and a timer" is helpful advice, I'm still left trying to figure out how to actually implement that in a safe and correct way, and I can easily see myself making newbie mistakes in the code that I don't even know I've made. Thus, some actual example code would be really appreciated!
Also, I have a concern about the multiple-threads solution: If I use the "put the blocking call in a second thread and do a timed-wait on that thread" method, what happens to that second thread if the blocked call never returns? I know that the timed-wait in the first thread will return and I'll see that no answer has happened and go on with things, but have I then "leaked" a thread that will sit around in a blocked state forever? Is there any way to avoid that? (Is there any way to avoid that and avoid leaking the second thread's memory?) A complete solution to what I need would need to avoid having leaks if the blocking call doesn't return.
You could use sigaction(2) and alarm(2), which are both POSIX. You set a callback action for the timeout using sigaction, then you set a timer using alarm, then make your blocking call. The blocking call will be interrupted if it does not complete within your chosen timeout (in seconds; if you need finer granularity you can use setitimer(2)).
Note that signals in C are somewhat hairy, and there are fairly onerous restrictions on what you can do in your signal handler.
This page is useful and fairly concise:
http://www.gnu.org/s/libc/manual/html_node/Setting-an-Alarm.html
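A minimal POSIX sketch of the sigaction/alarm idea follows. The assumptions are stated up front: a read() on an empty pipe stands in for the library's blocking call, the function names are made up, and the handler is installed without SA_RESTART so that the interrupted call fails with EINTR instead of being restarted. It also assumes no other code in the process uses SIGALRM.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Empty handler: its only job is to interrupt the blocking system call.
extern "C" void on_alarm(int) {}

// Runs a blocking read() on `fd` with a timeout in whole seconds.
// Returns true if the read was interrupted by the alarm.
bool read_with_alarm(int fd, unsigned seconds) {
    struct sigaction sa{}, old{};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // deliberately no SA_RESTART, so read() fails with EINTR
    sigaction(SIGALRM, &sa, &old);

    alarm(seconds);   // schedule SIGALRM
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    int saved_errno = errno;
    alarm(0);         // cancel the alarm if read() returned early

    sigaction(SIGALRM, &old, nullptr);  // restore the previous handler
    return n == -1 && saved_errno == EINTR;
}

// Demo: nobody ever writes to the pipe, so only the alarm can end the read.
bool demo_timed_read() {
    int fds[2];
    if (pipe(fds) != 0) return false;
    bool timed_out = read_with_alarm(fds[0], 1);
    close(fds[0]);
    close(fds[1]);
    return timed_out;
}
```

Whether this works for a given library depends on the library propagating EINTR rather than silently retrying the call, which is one more reason to treat it as a sketch.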
What you want is something like select(2), depending on the OS you are targeting.
It sounds like you need a 'monitor', capable of signaling availability of resource to threads via a shared mutex (typically). In Boost.Thread a condition_variable could do the job.
You might want to look at timed locks: Your blocking method can acquire the lock before starting to wait and release it as soon as the data is available. You can then try to acquire the lock (with a timeout) in your timed wait method.
Encapsulate the blocking call in a separate thread. Have an intermediate message buffer in that thread that is guarded by a condition variable (as said before). Make your main thread timed-wait on that condition variable. Receive the intermediately stored message if the condition is met.
So basically put a new layer capable of timed-wait between the API and your application. Adapter pattern.
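The adapter layer described above might be sketched as follows. The question mentions boost::thread; this sketch uses the standard-library equivalents (std::thread, std::condition_variable) instead, and MessageBuffer, timed_pop(), and blocking_wait() are hypothetical names, with blocking_wait() simulating the library's blocking call.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Intermediate buffer between the blocking listener thread and the caller.
class MessageBuffer {
public:
    void push(std::string msg) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(msg)); }
        cv_.notify_one();
    }
    // Returns true and fills `out` if a message arrived within `timeout`.
    bool timed_pop(std::string& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return !q_.empty(); })) {
            return false;  // timed out: no busy waiting happened meanwhile
        }
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
};

// Hypothetical stand-in for the library's blocking wait().
std::string blocking_wait() {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    return "hello";
}

// Demo: the first timed wait expires, the second one receives the message.
bool demo_timed_wait() {
    MessageBuffer buffer;
    std::thread listener([&] { buffer.push(blocking_wait()); });
    std::string msg;
    bool early = buffer.timed_pop(msg, std::chrono::milliseconds(10));
    bool late = buffer.timed_pop(msg, std::chrono::milliseconds(5000));
    listener.join();  // returns only because blocking_wait() eventually returns
    return !early && late && msg == "hello";
}
```

The predicate form of wait_for handles spurious wake-ups. Note that the final join() would hang if the blocking call never returned, which is exactly the leaked-thread concern raised in the question.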
Regarding "what happens to that second thread if the blocked call never returns?":
I believe there is nothing you can do to recover cleanly without cooperation from the called function (or library). 'Cleanly' means cleaning up all resources owned by that thread, including memory, other threads, locks, files, locks on files, sockets, GPU resources... Un-cleanly, you can indeed kill the runaway thread.

Boost, C++ how to kill thread opened by another thread?

So I have some main function. 24 times a second it starts a Boost thread A with a function. That function takes in a buffer with data. It starts a Boost timer. It starts another thread B with a function, passing the buffer into it. I need thread A to kill thread B if it is executing for way too long. Of course, if thread B finishes in time I do not need to kill it; it should end by itself. What Boost function can help me kill a created thread (not join; stop/kill or something like that)?
BTW, I cannot affect the speed of the function I am executing in thread B; that's why I need to be capable of killing it when needed.
There's no clean way to kill a thread, so if you need to do something like this, your clean choices are to either use a function that includes some cancellation capability, or use a separate process for it, since you can kill a process cleanly.
Other than that, my immediate reaction is that instead of "opening" (do you mean creating?) thread A 24 times a second, you'd be better off with thread A reading a buffer, sending it on to thread B, then sleeping until it's ready to read another buffer. Creating and killing threads isn't terribly expensive, but doing it at a rate of 24 (or, apparently, 48) a second strikes me as a bit excessive.
The term you are looking for is "cancellation", as in pthread_cancel(3). Cancellation is troublesome, because the cancelled thread might not execute C++ destructors or release locks on the way out ... but then again it might; the uncertainty is actually worse than a definitive no.
Because of this, boost threads do not support cancellation (see for instance this older question) but they do support interruption, which you might be able to bend to fit. Interruption works by way of a regular C++ exception so it has predictable semantics.
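To show why exception-based interruption has predictable semantics, here is a hand-rolled imitation in standard C++. It is not Boost's API: InterruptFlag, thread_interrupted, and interruption_point() are defined locally for this sketch, and it assumes the long-running function can be made to call interruption_point() now and then.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Thrown at an interruption point once an interruption has been requested.
struct thread_interrupted {};

struct InterruptFlag {
    std::atomic<bool> requested{false};
    void interruption_point() const {
        if (requested.load()) throw thread_interrupted();
    }
};

// Stands for any resource with a destructor: a lock, a file, a buffer.
struct Guard {
    explicit Guard(std::atomic<int>& c) : counter(c) {}
    ~Guard() { ++counter; }
    std::atomic<int>& counter;
};

// The work done in thread B; it polls for interruption between steps.
void long_job(const InterruptFlag& flag, std::atomic<int>& released) {
    Guard guard(released);
    for (;;) {
        flag.interruption_point();
        std::this_thread::sleep_for(std::chrono::milliseconds(1));  // one step
    }
}

// Demo: thread A's role is played by the caller, which interrupts thread B.
int demo_interrupt() {
    InterruptFlag flag;
    std::atomic<int> released{0};
    std::thread b([&] {
        try {
            long_job(flag, released);
        } catch (const thread_interrupted&) {
            // Stack unwinding already ran Guard's destructor.
        }
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    flag.requested = true;  // request the interruption; B notices at its next check
    b.join();
    return released.load();
}
```

Because the exit travels through ordinary stack unwinding, every destructor between the interruption point and the catch block runs, which is what a hard kill cannot promise. The limitation is the one already noted: it only helps if the code running in thread B contains interruption points.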
Please don't kill threads at random unless you completely control their execution (and in that case, just set up proper signals for the threads to exit gracefully). You never know whether the other thread is inside a critical section of some library you have never heard of; if it is, your program will end up stalling on that critical section, because it was never exited.