Safely cleaning up a blocking std::thread after an automated test - c++

Consider a test case for some Mutex class implementation. The test creates several std::thread instances during execution. All threads should finish if the Mutex class is implemented correctly according to the test. If there is a problem, it's possible that one thread may block indefinitely. How can the test correctly cleanup after itself?
At first I thought to detach the thread, but then the thread is leaked. Even worse, the thread relies on a Mutex instance from inside the test case which will sporadically cause access violations after the test case returns.
Some thread libraries such at Qt’s QThread have terminate() methods, but I’d like to use std::thread even though Qt is already a dependency for my project.
Is there a general pattern for testing potentially indefinitely blocking concurrent code?

Killing threads that may hold a lock is one of the main reasons terminating threads forcibly is frowned upon, and why C++11 doesn't support it. You are not supposed to do it, period.
If you need to do something like it, your best best would probably be to spawn a new process to run the test in; if it locks up, you can terminate the process without the same risks.
For examples of why terminating threads is bad news, take a look at the specific example from the Old New Thing on what sort of garbage thread termination leaves lying around on Windows; similar issues occur on most operating systems under different contexts.

I think destructors could come in help here, is the only thing 100% sure that be executed after any problem, by design. I suggest a nice blocking test inside of some destructor and release resources in a SECURE WAY (smart pointers ?) before leave it.

Related

Fire-and-forget std::thread objects cleaning themselves up

When implementing a service, etc, there may be a need for fire-and-forget functionality where one creates a thread and leaves it to its own devices. However, one needs to keep the std::thread object around somewhere to prevent it going out of scope, but when the thread completes there is no neat delete this support, and even if there was, that's going to be a problem for non-pointer allocations. Similarly, higher-level libraries may have Timer objects, where a one-shot timer may be fired off but needs to be cleaned up when done.
One can perhaps keep a collection of std::thread and Timer objects, and every so often go through the list and delete the finished objects, but that seems bothersome. Is there some useful idiom for managing these kinds of temporaries?
My immediate solution has been to use a combination of std::mutex and std::atomic to get my service to return BUSY so I only ever have one thread around, but that feels like a Code Smell
This is what std::thread::detach() is for -- if you want the thread to be "fire and forget", then you call detach on it after creating it. This causes the std:thread object to no longer refer to the actual thread of execution, so you can destroy the std::thread and is has no effect on the execution.
In case of a lot fire-and-forget functionality you may consider using Thread Pool. Creating and destroying of threads have overheads, and a lot of it may eat your RAM in RT. Thread Pool fire up for you X threads ahead for you, that will manage every "fire-and-forget" thread for you.
A simple Thread Pool that solve the problem for me.
std::mutex isn't related to managing threads' resources, it's related to synchronize threads.

C++ graceful shutdown best practices

I'm writing a multi-threaded c++ application for *nix operating systems. What are some best practices for terminating such an application gracefully? My instinct is that I'd want to install a signal handler on SIGINT (SIGTERM?) which stops/joins my threads. Also, is it possible to "guarantee" that all destructors are called (provided no other errors or exceptions are thrown while handling the signal)?
Some considerations come to mind:
designate 1 thread to be responsible for orchestrating the shutdown, eg, as Dithermaster suggested, this could be the main thread if you are writing a standalone application. Or if you are writing a library, provide an interface (eg function call) whereby a client program can terminate the objects created within the library.
you cannot guarantee destructors are called; that is up to you, and requires carefully calling delete for each new. Maybe smart pointers will help you. But, really, this is a design consideration. The major components should have start & stop semantics, which you could choose to invoke from the class constructor & destructor.
the shutdown sequence for a set of interacting objects is something that can require some effort to get correct. E.g., before you delete an object, are you sure some timer mechanism is not going to try calling it in few micro/milli/seconds later? Trial and error is your friend here; develop a framework which can repeatedly & rapidly start and stop your application to tease out shutdown related race-conditions.
signals are one way to trigger an event; others might be periodically polling for a known file, or opening a socket and receiving some data on it. Either way, you want to decouple the shutdown sequence code from the trigger event.
My recommendation is that the main thread shut down all worker threads before exiting itself. Send each worker an event telling it to clean up and exit, and wait for each one to do so. This will allow all C++ destructors to run.
Regarding signal management, the only thing you can portably and safely do inside a signal handler is to write to a variable of type sig_atomic_t (possibly volatile-qualified) and return. In general, you cannot call most functions and must not write to global memory. In other words, the handler should just set a flag to be tested inside your main routine, at some point you find appropriate, and the action resulting from the signal itself should be performed from there.
(Since there might be blocking I/O involved, consider studying POSIX Thread Cancellation. Your Unix clone (most notably Linux) might have peculiarities with respect to this and to the above.)
Regarding destructors, no magic is involved. They will be executed if control leaves a given scope through any means defined in the language. Leaving a scope through other means (for example, longjmp() or even exit()) does not trigger destructors.
Regarding general shutdown practices, there are divergent opinions on the field.
Some state that a "graceful termination", in the sense of releasing every resource ever allocated, should be performed. In C++, this usually means that all destructors should be properly executed before the process terminates. This is tricky in practice and often a source of much grief, specially in multithreaded programs, for a variety of reasons. Signals further complicate things by the very nature of asynchronous signal dispatching.
Because most of this work is totally useless, some others, like me, contend that the program must just terminate immediately, possibly shortly after undoing persistent changes to the system (like removing temporary files or restoring the screen resolution) and saving configuration. An apparently tidier cleanup is not only a waste of time (because the operating system will clean up most things like allocated memory, dangling threads and open file descriptors), but might be a serious waste of time (deallocators might touch paged out memory, uselessly forcing the system to page them in just for releasing them soon after the process terminates, for example), not mentioning the possibility of deadlocks being originated from joining threads.
Just say no. When you want to leave, call exit() (or even _exit(), but watch out for unflushed I/O) and that's it. More annoying than slow starting programs are slow terminating programs.

SetThreadAffinityMask of pooled thread

I am wondering whether it is possible to set the processor affinity of a thread obtained from a thread pool. More specifically the thread is obtained through the use of TimerQueue API which I use to implement periodic tasks.
As a sidenote: I found TimerQueues the easiest way to implement periodic tasks but since these are usually longliving tasks might it be more appropriate to use dedicated threads for this purpose? Furthermore it is anticipated that synchronization primites such as semapores and mutexes need to be used to synchronize the various periodic tasks. Are the pooled threads suitable for these?
Thanks!
EDIT1: As Leo has pointed out the above question is actually two only loosely related ones. The first one is related to processor affinity of pooled threads. The second question is related to whether pooled threads obtained from the TimerQueue API are behaving just like manually created threads when it comes to synchronization objects. I will move this second question the a seperate topic.
If you do this, make sure you return things to how they were every time you release a thread back to the pool. Since you don't own those threads and other code which uses them may have other requirements/assumptions.
Are you sure you actually need to do this, though? It's very, very rare to need to set processor affinity. (I don't think I've ever needed to do it in anything I've written.)
Thread affinity can mean two quite different things. (Thanks to bk1e's comment to my original answer for pointing this out. I hadn't realised myself.)
What I would call processor affinity: Where a thread needs to be run consistently on a the same processor. This is what SetThreadAffinityMask deals with and it's very rare for code to care about it. (Usually it's due to very low-level issues like CPU caching in high performance code. Usually the OS will do its best to keep threads on the same CPU and it's usually counterproductive to force it to do otherwise.)
What I would call thread affinity: Where objects use thread-local storage (or some other state tied to the thread they're accessed from) and will go wrong if a sequence of actions is not done on the same thread.
From your question it sounds like you may be confusing #1 with #2. The thread itself will not change while your callback is running. While a thread is running it may jump between CPUs but that is normal and not something you have to worry about (except in very special cases).
Mutexes, semaphores, etc. do not care if a thread jumps between CPUs.
If your callback is executed by the thread pool multiple times, there is (depending on how the pool is used) usually no guarantee that the same thread will be used each time. i.e. Your callback may jump between threads, but not while it is in the middle of running; it may only change threads each time it runs again.
Some synchronization objects will care if your callback code runs on one thread and then, still thinking it holding locks on those objects, runs again on a different thread. (The first thread will still hold the locks, not the second one, although it depends which kind of synchronization object you use. Some don't care.) That isn't a #1, though; that's #2, and not something you'd use SetThreadAffinityMask to deal with.
As an example, Mutexes (CreateMutex) are owned by a thread. If you acquire a mutex on Thread A then any other thread which tries to acquire the mutex will block until you release the mutex on Thread A. (It is also an error for a thread to release a mutex it does not own.) So if your callback acquired a mutex, then exited, then ran again on another thread and released the mutex from there, it would be wrong.
On the other hand, an Event (CreateEvent) does not care which threads create, signal or destroy it. You can signal an event on one thread and then reset it on another and that's fine (normal, in fact).
It'd also be rare to hold a synchronization object between two separate runs of your callback (that would invite deadlocks, although there are certainly situations where you could legitimately want/do such a thing). However, if you created (for example) an apartment-threaded COM object then that would be something you would want to only access from one specific thread.
You shouldn't. You're only supposed to use that thread for the job at hand, on the processor it's running on at that point. Apart from the obvious inefficiency, the threadpool might destroy every thread as soon as you're done, and create a new one for your next job. The affinity masks wouldn't disappear that soon in practice, but it's even harder to debug if they disappear at random.

Why and when shouldn't I kill a thread?

I am writing a multithreaded socket server and I need to know for sure.
Articles about threads say that I should wait for the thread to return, instead of killing it. In some cases though, the user's thread i want to kick/ban, will not be able to return properly (for example, I started to send a big block of data and send() blocks the thread at the moment) so I'll need just to kill it.
Why killing thread functions are dangerous and when can they crash the whole application?
Killing a thread means stopping all execution exactly where it is a the moment. In particular, it will not execute any destructors. This means sockets and files won't be closed, dynamically-allocated memory will not be freed, mutexes and semaphores won't be released, etc. Killing a thread is almost guaranteed to cause resource leaks and deadlocks.
Thus, your question is kind of reversed. The real question should read:
When, and under what conditions can I kill a thread?
So, you can kill the thread when you're convinced no leaks and deadlocks can occur, not now, and not when the other thread's code will be modified (thus, it is pretty much impossible to guarantee).
In your specific case, the solution is to use non-blocking sockets and check some thread/user-specific flag between calls to send()and recv(). This will likely complicate your code, which is probably why you've been resisting to do so, but it's the proper way to go about it.
Moreover, you will quickly realize that a thread-per-client approach doesn't scale, so you'll change your architecture and re-write lots of it anyways.
Killing a thread can cause your program to leak resources because the thread did not get a chance to clean up after itself. Consider closing the socket handle the thread is sending on. This will cause the blocking send() to return immediately with an appropriate error code. The thread can then clean up and die peacefully.
If you kill your thread the hard way it can leak resources.
You can avoid it when you design your thread to support cancelation.
Do not use blocking calls or use blocking calls with a timeout. Receive or send data in smaller chunks or asynchronously.
You really don't want to do this.
If you kill a thread while it holds a critical section it won't be released which will likely result in your whole application breaking. Certain C library calls like heap memory allocation use critical sections and if you happen to kill your thread while it's doing a "new" then calling new from anywhere else in your program will cause that thread to stop.
You simply can't do this safely without really extreme measures which are much more restrictive than simply signalling the thread to terminate itsself.
There's many reasons, but here's an easy one: there's only one heap. If a thread allocates ANYTHING on the heap, and you kill it, whatever it has allocated is around until the process ends. Each thread gets its own stack, and so that MAY be freed (implementation-dependent), but you GUARANTEE leaks on the heap by not letting it shut itself down.
In the case of a thread blocked in I/O you never really need to kill it, instead you have the choice between non-blocking I/O, timeouts, and closing the socket from another thread. Either of these will unblock the thread.

Why would I want to start a thread "suspended"?

The Windows and Solaris thread APIs both allow a thread to be created in a "suspended" state. The thread only actually starts when it is later "resumed". I'm used to POSIX threads which don't have this concept, and I'm struggling to understand the motivation for it. Can anyone suggest why it would be useful to create a "suspended" thread?
Here's a simple illustrative example. WinAPI allows me to do this:
t = CreateThread(NULL,0,func,NULL,CREATE_SUSPENDED,NULL);
// A. Thread not running, so do... something here?
ResumeThread(t);
// B. Thread running, so do something else.
The (simpler) POSIX equivalent appears to be:
// A. Thread not running, so do... something here?
pthread_create(&t,NULL,func,NULL);
// B. Thread running, so do something else.
Does anyone have any real-world examples where they've been able to do something at point A (between CreateThread & ResumeThread) which would have been difficult on POSIX?
To preallocate resources and later start the thread almost immediately.
You have a mechanism that reuses a thread (resumes it), but you don't have actually a thread to reuse and you must create one.
It can be useful to create a thread in a suspended state in many instances (I find) - you may wish to get the handle to the thread and set some of it's properties before allowing it to start using the resources you're setting up for it.
Starting is suspended is much safer than starting it and then suspending it - you have no idea how far it's got or what it's doing.
Another example might be for when you want to use a thread pool - you create the necessary threads up front, suspended, and then when a request comes in, pick one of the threads, set the thread information for the task, and then set it as schedulable.
I dare say there are ways around not having CREATE_SUSPENDED, but it certainly has its uses.
There are some example of uses in 'Windows via C/C++' (Richter/Nasarre) if you want lots of detail!
There is an implicit race condition in CreateThread: you cannot obtain the thread ID until after the thread started running. It is entirely unpredictable when the call returns, for all you know the thread might have already completed. If the thread causes any interaction in the rest of that process that requires the TID then you've got a problem.
It is not an unsolvable problem if the API doesn't support starting the thread suspended, simply have the thread block on a mutex right away and release that mutex after the CreateThread call returns.
However, there's another use for CREATE_SUSPENDED in the Windows API that is very difficult to deal with if API support is lacking. The CreateProcess() call also accepts this flag, it suspends the startup thread of the process. The mechanism is identical, the process gets loaded and you'll get a PID but no code runs until you release the startup thread. That's very useful, I've used this feature to setup a process guard that detects process failure and creates a minidump. The CREATE_SUSPEND flag allowed me to detect and deal with initialization failures, normally very hard to troubleshoot.
You might want to start a thread with some other (usually lower) priority or with a specific affinity mask. If you spawn it as usual it can run with undesired priority/affinity for some time. So you start it suspended, change the parameters you want, then resume the thread.
The threads we use are able to exchange messages, and we have arbitrarily configurable priority-inherited message queues (described in the config file) that connect those threads. Until every queue has been constructed and connected to every thread, we cannot allow the threads to execute, since they will start sending messages off to nowhere and expect responses. Until every thread was constructed, we cannot construct the queues since they need to attach to something. So, no thread can be allowed to do work until the very last one was configured. We use boost.threads, and the first thing they do is wait on a boost::barrier.
I stumbled with a similar problem once upon I time. The reasons for suspended initial state are treated in other answer.
My solution with pthread was to use a mutex and cond_wait, but I don't know if it is a good solution and if can cover all the possible needs. I don't know, moreover, if the thread can be considered suspended (at the time, I considered "blocked" in the manual as a synonim, but likely it is not so)