I'm building plugins for a host application using C++11/14, for now targeting Windows and MacOS. The plugins start up async worker threads when the host app starts us up, and if they're still running when the host shuts the plugins down they get signaled to stop. Some of these worker threads are started with std::async so I can use an std::future to get the thread result back, while other less involved threads are just std::threads which I ultimately just join to see when they're done. It all works nicely this way.
Unless the host decides not to call our shutdown procedure when it shuts down itself... Yeah, I know, but it really is that bad sometimes -- it often enough just crashes during shutdown. And they even plan to make that into a 'feature' and call it "Fast Exit" to please their users; just pull the plug and we're done extra fast :(
For that case I have registered an std::atexit handler. It last-minute signals any still running threads to exit NOW (atomic bools and/or signals to wake them up), then it waits a second to give the threads some time to respond, and finally it detaches the regular std::thread threads and hopes for the best. This way at least the threads get a heads up to quickly write intermediate state to disk for a next round (if needed), and quit writing to probably already deceased data structures, thus avoiding crashes which would make any crash dump point the finger at my plugins.
However, atexit handlers run at OS DLL unload time, so I'm not even allowed to use thread synchronization (right?). And under the debugger I just saw all of the worker threads were presumably already killed by the OS, since the atexit handler's thread was the only thread left under the debugger. Needless to say, all remaining std::futures went into full blocking mode, hanging up the remaining corpse of the dead host app...
Is there a way to abandon an std::future? In MS Visual C++ I saw futures have an _Abandon method, but that's too platform specific (and undocumented) for my taste. Or is my only recourse to not use std::future, do all thread communication via my own data structures and synchronization, and work with simple std::threads which can just be detached?
Related
There was no direct and satisfactory answer found on quite a simple question:
Given multiple threads running is there a generic/correct way to wait on them to finish while exiting the process? Or "is doing timed wait Ok in this case?"
Yes, we attempt to signal threads to finish but it is observed that during process exit some of them tend to stall. We recently had a discussion and it was decided to rid of "arbitrary wait":
m_thread.quit(); // the way we had threads finished
m_thread.wait(kWaitMs); // with some significant expiration (~1000ms)
m_thread.quit(); // the way we have threads finished now
m_thread.wait(); // wait forever until finished
I understand that kWaitMs constant should be chosen somewhat proportional to one uninterrupted "job cycle" for the thread to finish. Say, if the thread processes some chunk of data for 10 ms then we should probably wait on it to respond to quit signal for 100 ms and if it still does not quit then we just don't wait anymore. We don't wait in that case as long as we quit the program and no longer care. But some engineers don't understand such "paradigm" and want an ultimate wait. Mind that the program process stuck in memory on the client machine will cause problems on the next program start in our case for sure not to mention that the log will not be properly finished to process as an error.
Can the question about the proper thread finishing on process quit be answered?
Is there some assistance from Qt/APIs to resolve the thread hang-up better, so we can log the reason for it?
P.S. Mind that I am well aware on why it is wrong to terminate the thread forcefully and how can that be done. This question I guess is not about synchronization but about limited determinism of threads that run tons of our and framework and OS code. The OS is not Real Time, right: Windows / MacOS / Linux etc.
P.P.S. All the threads in question have event loop so they should respond to QThread::quit().
Yes, we attempt to signal threads to finish but it is observed that
during process exit some of them tend to stall.
That is your real problem. You need to figure out why some of your threads are stalling, and fix them so that they do not stall and always quit reliably when they are supposed to. (The exact amount of time they take to quit isn't that important, as long as they do quit in a reasonable amount of time, i.e. before the user gets tired of waiting and force-quits the whole application)
If you don't/can't do that, then there is no way to shut down your app reliably, because you can't safely free up any resources that a thread might still be accessing. It is necessary to 100% guarantee that a thread has exited before the main thread calls the destructors of any objects that the thread uses (e.g. the QThread object associated with the thread)
So to sum up: don't bother playing games with wait-timeouts or forcibly-terminating threads; all that will get you is an application that sometimes crashes on shutdown. Use an indefinite-wait, and make sure your threads always (always!) quit after the main thread has asked them to, as that is the only way you'll achieve a reliable shutdown sequence.
I use Qt 4.8.6, MS Visual Studio 2008, Windows 7. I've created a GUI program. It contains main GUI thread and worker thread (I have not made QThread subclass, by the way), which makes synchronous calls to 3rd party DLL functions. These functions are rather slow. QTcpServer instance is also under worker thread. My worker class contains QTcpServer and DLL wrapper methods.
I know that quit() is preferred over terminate(), but I don't wanna wait for a minute (because of slow DLL functions) during program shutdown. When I try to terminate() worker thread, I notice warnings about stopping QTcpServer from another thread. What is a correct way of process shutdown?
QThread::quit tells the thread's event loop to exit. After calling it the thread will get finished as soon as the control returns to the event loop of the thread
You may also force a thread to terminate right now via QThread::terminate(), but this is a very bad practice, because it may terminate the thread at an undefined position in its code, which means you may end up with resources never getting freed up and other nasty stuff. So use this only if you really can't get around it.
So i think the right approach is to first tell the thread to quit normally and if something goes wrong and takes much time and you have no way to wait for it, then terminate it:
QThread * th = myWorkerObject->thread();
th->quit();
th->wait(5000); // Wait for some seconds to quit
if(th->isRunning()) // Something took time more than usual, I have to terminate it
th->terminate();
You should always try to avoid killing threads from the outside by force and instead ask them nicely to finish what they're doing. This usually means that the thread checks regularly if it should terminate itself and the outside world tells it to terminate when needed (by setting a flag, signaling an event or whatever is appropriate for the situation at hand).
When a thread is asked to terminate itself, it finishes up what it's doing and exists cleanly. The application waits for the thread to terminate and then exits.
You say that in your case the thread takes a long time to finish. You can take this into consideration and still terminate the thread "the nice way" (for example you can hide the application window and give the impression that the app has exited, even if the process takes a little more time until it finally terminates; or you can show some form of progress indication to the user telling him that the application is shutting down).
Unless there is an overriding reason to do so, you should not attempt to terminate threads with user code at process-termination.
If there is no such reason, just call your OS process termination syscall, eg. ExitProcess(0). The OS can, and will will stop all process threads in any state before releasing all process resources. User code cannot do that, and should not try to terminate threads, or signal them to self-terminate, unless absolutely necessary.
Attempting to 'clean up' with user code sounds 'nice', (aparrently), but is an expensive luxury that you will pay for with extra code, extra testing and extra maintenance.
That is, if your customers don't stop buying your app because they get pissed off with it taking so long to shut down.
The OS is very good at stopping threads and cleaning up. It's had endless thousands of hours of testing during development and decades of life in the wild where problems with process termination would have become aparrent and got fixed. You will not even get close to that with your flags, events etc. as you struggle to stop threads running on another core without the benefit of an interprocessor driver.
There are surely times when you will have to resort to user code to stop threads. If you need to stop them before process termination, or you need to close some DB connection, flush some file at shutdown, deal with interprocess comms or the like issues, then you will have to resort to some of the approaches already suggested in other answers.
If not, don't try to duplicate OS functionality in the name of 'niceness'. Just ask it to terminate your process. You can get your warm, fuzzy feeling when your app shuts down immedately while other developers are still struggling to implement 'Shutdown' progress bars or trying to explain to customers why they have 15 zombie apps still running.
Is it possible to have a boost::thread sleep indefinitely after its work is completed and then wake it from another boost::thread?
Using while(1)s are perfect for a dedicated server where I want the threads to run all cores at 100%, but I'm writing a websocket++ server to be run on a desktop, thus I only want the boost::threads to run when they actually have work to do, so I can do other work on my desktop without performance suffering.
I've seen other examples where boost::threads are set to sleep() for constant a amount of time, but I'd rather not spend the time trying to find that optimal constant; besides, I need the websocket++ server to respond as quickly as possible when it receives data to process.
If this is possible, how can it be done with multiple threads trying to wake?
This mechanism is implemented by what is called a condition-variable, see boost::condition_variable. Essentially, the waiting thread will sleep on a locked mutex until another thread signals the condition, thereby unlocking it.
Watch out for spurious wake-ups. Sometimes the waiting thread will wake-up without being signaled. This means that you should still put a while-loop that checks a predicate (or condition) to decipher between real wake-ups and spurious ones.
yes, pthread_mutex_t+pthread_cond_t is the right thing to use, you can find the corresponding
thing in boost.
I'm designing a thread library. So far I have a method that initializes the library, one that creates threads, and one that yields the current thread to the next one on a queue of ready threads.
Before I move on to implementing semaphores for the threads, I figured I should probably kill the threads as soon as they are done and free up their allocated memory, but I'm having trouble figuring out how to do that. How do I tell when a thread has "finished"?
You don't just kill threads safely or reliably -- let them exit naturally (when their entry returns).
Although the system provides a means to kill the thread, nearly any C++ program out there could expect undefined behavior if it were to continue. You could dream up cases where killing could be accomplished without side effects (to the rest of the program), but that program does not at all resemble idiomatic C++. Such a program would be very exotic, with many unusual and severe restrictions.
When you want to known when a thread has exited or not, you can add some cleanup before it exits in order to track its status.
When you want the ability to request a thread exit (naturally), consider run loops and messages.
You don't explicitly kill the threads when they are finished running their forked procedures as the code which would be doing that would still be in the context of the thread to be killed.
You have a scheduler/interrupt handler which handles the context switching of the threads and maintains a few queues for managing this. You can have it save a reference to to the threads to be killed, something like scheduler->SetThreadToKill( currentThread ); inside probably your finish() method (or similar), which sets a flag for the corresponding threads.
When a context switch occurs, and you have swapped out all data structures of the current thread with that of the next thread, you scheduler can call the destructor for all the threads which have the toBeKilled flag set.
The best policy, by far, for killing threads is to not explicitly do it, (unless you are an OS, ie. on app shutdown). Queue messages and tasks to threads that loop around some queue to perform more work. If you don't write any code to continually new, create, start, terminate, delete, test, check, enlist, delist, enqueue, dequeue and otherwise micro-manage threads, then that code cannot contain bugs.
I've got a C++ Win32 application that has a number of threads that might be busy doing IO (HTTP calls, etc) when the user wants to shutdown the application. Currently, I play nicely and wait for all the threads to end before returning from main. Sometimes, this takes longer than I would like and indeed, it seems kind of pointless to make the user wait when I could just exit. However, if I just go ahead and return from main, I'm likely to get crashes as destructors start getting called while there are still threads using the objects.
So, recognizing that in an ideal, platonic world of virtue, the best thing to do would be to wait for all the threads to exit and then shutdown cleanly, what is the next best real world solution? Simply making the threads exit faster may not be an option. The goal is to get the process dead as quickly as possible so that, for example, a new version can be installed over it. The only disk IO I'm doing is in a transactional db, so I'm not terribly concerned about pulling the plug on that.
Use overlapped IO so that you're always in control of the threads that are dealing with your I/O and can always stop them at any point; you either have them waiting on an IOCP and can post an application level shutdown code to it, OR you can wait on the event in your OVERLAPPED structure AND wait on your 'all threads please shutdown now' event as well.
In summary, avoid blocking calls that you can't cancel.
If you can't and you're stuck in a blocking socket call doing IO then you could always just close the socket from the thread that has decided that it's time to shut down and have the thread that's doing IO always check the 'shutdown now' event before retrying...
I use an exception-based technique that's worked pretty well for me in a number of Win32 applications.
To terminate a thread, I use QueueUserAPC() to queue a call to a function which throws an exception. However, the exception that's thrown isn't derived from the type "Exception", so will only be caught by my thread's wrapper procedure.
The advantages of this are as follows:
No special code needed in your thread to make it 'stoppable' - as soon as it enters an alertable wait state, it will run the APC function.
All destructors get invoked as the exception runs up the stack, so your thread exits cleanly.
The things you need to watch for:
Anything doing catch (...) will eat your exception. User code should always use catch(const Exception &e) or similar!
Make sure your I/O and delays are done in an "alertable" way. For example, this means calling sleepex(N, true) instead of sleep(N).
CPU-bound threads need to call sleepex(0,true) occasionally to check for termination.
You can also 'protect' areas of your code to prevent task termination during critical sections.
Best way: Do your work while the app is running, and do nothing (or as close to) at shutdown (works for startup too). If you stick to that pattern, then you can tear down the threads immediately (rather than "being nice" about it) when the shutdown request comes without worrying about work that still needs to be done.
In your specific situation, you'd probably need to wait for IO to finish (writes, at least) if you're doing local work there. HTTP requests and such you can probably just abandon/close outright (again, unless you're writing something). But if it is the case that you're writing during this shutdown and waiting on that, then you may want to notify the user of that, rather than letting your process look hung while you're wrapping things up.
I'd recommend having your GUI and work be done on different threads. When a user requests a shutdown, dismiss the GUI immediately giving the appearance that the application has closed. Allow the worker threads to close gracefully in the background.
If you want to pull the plug messily, exit(0) will do the trick.
I once had a similar problem, albeit in Visual Basic 6: threads from an app would connect to different servers, download some data, perform some operations looping upon that data, and store on a centralized server the result.
Then, new requirement was that threads should be stoppable from main form. I accomplished this in an easy though dirty fashion, by having the threads stop after N loops (equivalent roughly to half a second) to try to open a mutex with a specific name. Upon success, they immediately stopped whatever they were doing and quit, continued otherwise.
This mutex was created only by the main form, once it was created all the threads would soon close themselves. The disadvantage was that user needed to manually specify it wanted to run the threads again - another button to "Enable threads to run" accomplished this by releasing the mutex :D
This trick is guaranteed to work for mutex operations are atomic. Problem is you're never sure a thread really closed - a failure in the logic of handling the "openMutex succeeded" case could mean it never ends. You also don't know when/if all the threads have closed (assuming your code is right, this would take roughly the same time it takes for the loops to stop and "listen").
With VB's "apartment" model of multi-threading it's somewhat difficult to send info from the threads to the main app back and forth, it's much easier to "fire and forget" or to send it only from the main app to the thread. Thus, the need of these kind of long-cuts. Using C++ you're free to use your multi-threading model, so these constraints might not apply to you.
Whatever you do, do NOT use TerminateThread, especially on anything that could be in OS HTTP calls. You could potentially break IE until reboot.
Change all of your IO to an asynchronous or non-blocking model so that they can watch for termination events.
If you need to shutdown suddenly: Just call ExitProcess - which is what is going to be called just as soon as you return from WinMain anyway. Windows itself creates many worker threads that have no way to be cleaned up - they are terminated by process shutdown.
If you have any threads that are performing writes of some kind - obviously those need a chance to close their resources. But anything else - ignore the bounds checker warnings and just pull the rug from under their feet.
You can call TerminateProcess - this will stop the process immediately, without notifying anyone and without waiting for anything.
*NULL = 0 is the fastest way. if you don't want to crash, call exit() or its win32 equivalent.
Instruct the user to unplug the computer. Short of that, you have to abandon your asynchronous activities to the wind. Or is that HWIND? I can never remember in C++. Of course, you could take the middle road and quickly note in a text file or reg key what action was abandoned so that the next time the program runs it can take up that action again automatically or ask the user if they want to do so. Depending on what data you lose when you abandon the asynch action, you may not be able to do that. If you're interacting with the user, you may want to consider a dialog or some UI interaction that explains why its taking so long.
Personally, I prefer the instruction to the user to just unplug the computer. :)