So, the situation is this. I've got a C++ library that is doing some interprocess communication, with a wait() function that blocks and waits for an incoming message. The difficulty is that I need a timed wait, which will return with a status value if no message is received in a specified amount of time.
The most elegant solution is probably to rewrite the library to add a timed wait to its API, but for the sake of this question I'll assume it's not feasible. (In actuality, it looks difficult, so I want to know what the other option is.)
Here's how I'd do this with a busy wait loop, in pseudocode:
while (!message && current_time() - start_time < timeout)
{
    if (Listener.new_message()) message = true;
}
I don't want a busy wait that eats processor cycles, though. And I also don't want to just add a sleep() call in the loop to avoid processor load, as that means slower response. I want something that does this with a proper sort of blocks and interrupts. If the better solution involves threading (which seems likely), we're already using boost::thread, so I'd prefer to use that.
I'm posting this question because this seems like the sort of situation that would have a clear "best practices" right answer, since it's a pretty common pattern. What's the right way to do it?
Edit to add: A large part of my concern here is that this is in a spot in the program that's both performance-critical and critical to avoid race conditions or memory leaks. Thus, while "use two threads and a timer" is helpful advice, I'm still left trying to figure out how to actually implement that in a safe and correct way, and I can easily see myself making newbie mistakes in the code that I don't even know I've made. Thus, some actual example code would be really appreciated!
Also, I have a concern about the multiple-threads solution: If I use the "put the blocking call in a second thread and do a timed-wait on that thread" method, what happens to that second thread if the blocked call never returns? I know that the timed-wait in the first thread will return and I'll see that no answer has happened and go on with things, but have I then "leaked" a thread that will sit around in a blocked state forever? Is there any way to avoid that? (Is there any way to avoid that and avoid leaking the second thread's memory?) A complete solution to what I need would need to avoid having leaks if the blocking call doesn't return.
You could use sigaction(2) and alarm(2), which are both POSIX. You set a callback action for the timeout using sigaction, then you set a timer using alarm, then make your blocking call. The blocking call will be interrupted if it does not complete within your chosen timeout (in seconds; if you need finer granularity you can use setitimer(2)).
Note that signals in C are somewhat hairy, and there are fairly onerous restrictions on what you can do in your signal handler.
This page is useful and fairly concise:
http://www.gnu.org/s/libc/manual/html_node/Setting-an-Alarm.html
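A minimal sketch of the sigaction/alarm approach, assuming the blocking call is a plain read() on a file descriptor (the library's wait() in the question may behave differently, for example by retrying on EINTR internally). The names on_alarm and read_with_timeout are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// The handler does nothing; delivering the signal is what interrupts read().
extern "C" void on_alarm(int) {}

// Hypothetical helper: read() that gives up after `seconds`.
// Returns the byte count, or -1 with errno == EINTR when the alarm fired.
ssize_t read_with_timeout(int fd, void* buf, size_t len, unsigned seconds) {
    assert(seconds > 0);  // alarm(0) would cancel the timer instead of arming it

    struct sigaction sa = {};
    struct sigaction old_sa = {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // no SA_RESTART: the interrupted read() must not resume
    if (sigaction(SIGALRM, &sa, &old_sa) != 0) return -1;

    alarm(seconds);
    ssize_t n = read(fd, buf, len);
    int saved_errno = errno;

    alarm(0);                              // disarm if read() finished first
    sigaction(SIGALRM, &old_sa, nullptr);  // restore the previous disposition
    errno = saved_errno;
    return n;
}
```

Two caveats with this sketch: if the alarm fires before the thread has actually entered read(), the call is not interrupted, and in a multi-threaded process SIGALRM may be delivered to any thread that has not blocked it. Both are reasons to be careful with this in threaded code.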
What you want is something like select(2), depending on the OS you are targeting.
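As a rough sketch of the select(2) route: this only applies if the library exposes the file descriptor it waits on, which is an assumption here. The helper name wait_readable is hypothetical.

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: wait until `fd` is readable or `timeout_ms` elapses.
// Returns 1 if readable, 0 on timeout, -1 on error (errno set by select()).
int wait_readable(int fd, int timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE);  // select() cannot watch larger fds

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    if (rc > 0) return 1;
    return rc;  // 0 = timeout, -1 = error
}
```

The thread sleeps inside the kernel until data arrives or the timeout expires, so there is neither a busy loop nor a polling delay. Once it returns 1, the library's blocking wait() should return promptly.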
It sounds like you need a 'monitor', capable of signaling the availability of a resource to threads, typically via a shared mutex. In Boost.Thread, a condition_variable could do the job.
You might want to look at timed locks: your blocking method can acquire the lock before starting to wait and release it as soon as the data is available. You can then try to acquire the lock (with a timeout) in your timed wait method.
Encapsulate the blocking call in a separate thread. Have an intermediate message buffer in that thread that is guarded by a condition variable (as said before). Make your main thread timed-wait on that condition variable. Receive the intermediately stored message if the condition is met.
So basically put a new layer capable of timed-wait between the API and your application. Adapter pattern.
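Here is one way the adapter could look. It is a sketch that uses the C++11 standard library (std::thread, std::condition_variable) rather than Boost.Thread, which has the same overall shape. TimedReceiver and its blocking_wait parameter are hypothetical stand-ins for the library's wait().

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <thread>

class TimedReceiver {
public:
    // `blocking_wait` stands in for the library call that blocks until a message arrives.
    explicit TimedReceiver(std::function<std::string()> blocking_wait)
        : state_(std::make_shared<State>()) {
        assert(blocking_wait != nullptr);
        std::shared_ptr<State> s = state_;  // the worker co-owns the buffer
        std::thread([s, blocking_wait] {
            for (;;) {
                std::string msg = blocking_wait();  // may block indefinitely
                std::lock_guard<std::mutex> lock(s->m);
                if (s->stop) return;  // owner is gone: drop the message and exit
                s->queue.push_back(std::move(msg));
                s->cv.notify_one();
            }
        }).detach();
    }

    ~TimedReceiver() {
        std::lock_guard<std::mutex> lock(state_->m);
        state_->stop = true;  // worker exits after its current blocking call returns
    }

    // Returns true and fills `out` if a message arrives within `timeout`.
    bool wait(std::string& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(state_->m);
        if (!state_->cv.wait_for(lock, timeout,
                                 [this] { return !state_->queue.empty(); }))
            return false;
        out = std::move(state_->queue.front());
        state_->queue.pop_front();
        return true;
    }

private:
    struct State {
        std::mutex m;
        std::condition_variable cv;
        std::deque<std::string> queue;
        bool stop = false;
    };
    std::shared_ptr<State> state_;
};
```

The worker is detached and shares the buffer through a shared_ptr, so destroying the TimedReceiver never touches freed memory even while the worker is still blocked. If the blocking call never returns, that one thread does stay parked until the process exits; the sketch limits the damage but cannot remove it.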
Regarding
what happens to that second thread if the blocked call never returns?
I believe there is nothing you can do to recover cleanly without cooperation from the called function (or library). 'Cleanly' means cleaning up all resources owned by that thread, including memory, other threads, locks, files, locks on files, sockets, GPU resources... Un-cleanly, you can indeed kill the runaway thread.
The Windows and Solaris thread APIs both allow a thread to be created in a "suspended" state. The thread only actually starts when it is later "resumed". I'm used to POSIX threads which don't have this concept, and I'm struggling to understand the motivation for it. Can anyone suggest why it would be useful to create a "suspended" thread?
Here's a simple illustrative example. WinAPI allows me to do this:
t = CreateThread(NULL,0,func,NULL,CREATE_SUSPENDED,NULL);
// A. Thread not running, so do... something here?
ResumeThread(t);
// B. Thread running, so do something else.
The (simpler) POSIX equivalent appears to be:
// A. Thread not running, so do... something here?
pthread_create(&t,NULL,func,NULL);
// B. Thread running, so do something else.
Does anyone have any real-world examples where they've been able to do something at point A (between CreateThread & ResumeThread) which would have been difficult on POSIX?
To preallocate resources and later start the thread almost immediately.
You have a mechanism that reuses threads by resuming them, but no thread is currently available for reuse, so you must create one.
It can be useful to create a thread in a suspended state in many instances (I find) - you may wish to get the handle to the thread and set some of its properties before allowing it to start using the resources you're setting up for it.
Starting it suspended is much safer than starting it and then suspending it - in the latter case you have no idea how far it has got or what it's doing.
Another example might be for when you want to use a thread pool - you create the necessary threads up front, suspended, and then when a request comes in, pick one of the threads, set the thread information for the task, and then set it as schedulable.
I dare say there are ways around not having CREATE_SUSPENDED, but it certainly has its uses.
There are some examples of its uses in 'Windows via C/C++' (Richter/Nasarre) if you want lots of detail!
There is an implicit race condition in CreateThread: you cannot obtain the thread ID until after the thread has started running. It is entirely unpredictable when the call returns; for all you know, the thread might have already completed. If the thread causes any interaction in the rest of the process that requires the TID, then you've got a problem.
It is not an unsolvable problem if the API doesn't support starting the thread suspended: simply have the thread block on a mutex right away, and release that mutex after the CreateThread call returns.
However, there's another use for CREATE_SUSPENDED in the Windows API that is very difficult to deal with if API support is lacking. The CreateProcess() call also accepts this flag; it suspends the startup thread of the process. The mechanism is identical: the process gets loaded and you'll get a PID, but no code runs until you release the startup thread. That's very useful. I've used this feature to set up a process guard that detects process failure and creates a minidump. The CREATE_SUSPENDED flag allowed me to detect and deal with initialization failures, which are normally very hard to troubleshoot.
You might want to start a thread with some other (usually lower) priority or with a specific affinity mask. If you spawn it as usual it can run with undesired priority/affinity for some time. So you start it suspended, change the parameters you want, then resume the thread.
The threads we use are able to exchange messages, and we have arbitrarily configurable priority-inherited message queues (described in the config file) that connect those threads. Until every queue has been constructed and connected to every thread, we cannot allow the threads to execute, since they would start sending messages off to nowhere and expect responses. Until every thread has been constructed, we cannot construct the queues, since they need to attach to something. So, no thread can be allowed to do work until the very last one has been configured. We use Boost.Thread, and the first thing the threads do is wait on a boost::barrier.
I stumbled on a similar problem once upon a time. The reasons for a suspended initial state are covered in the other answers.
My solution with pthreads was to use a mutex and cond_wait, but I don't know if it is a good solution or if it can cover all the possible needs. Moreover, I don't know whether the thread can be considered suspended (at the time, I treated "blocked" in the manual as a synonym, but it likely is not).
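The mutex-plus-condition approach could be sketched as a one-shot "start gate". This uses std::thread and std::condition_variable rather than raw pthreads, and StartGate and run_gated_thread are names invented for the example. Strictly speaking the new thread is blocked, not suspended, but none of its real work runs until the gate opens.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a one-shot gate that new threads wait on before working.
class StartGate {
public:
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return open_; });  // predicate guards against spurious wakeups
    }
    void open() {
        {
            std::lock_guard<std::mutex> lock(m_);
            open_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool open_ = false;
};

// Creates a gated thread, configures shared state at "point A", then releases
// the thread. Returns the value the thread observed.
int run_gated_thread() {
    StartGate gate;
    int config = 0;
    int seen = -1;

    std::thread t([&] {
        gate.wait();    // parked here, much like a thread created suspended
        seen = config;  // safe: the write to config happened before open()
    });

    config = 42;  // point A: set things up while the thread is parked
    gate.open();  // the ResumeThread() moment
    t.join();
    assert(seen == 42);
    return seen;
}
```

Unlike CREATE_SUSPENDED, the thread has already been scheduled briefly by the time it reaches the gate, so this does not help with things like setting priority or affinity before the very first instruction runs.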
In my application I have two threads:
a "main thread" which is busy most of the time
an "additional thread" which sends out some HTTP request and which blocks until it gets a response.
However, the HTTP response can only be handled by the main thread, since it relies on its thread-local storage and on non-thread-safe functions.
I'm looking for a way to tell the main thread when an HTTP response has been received, along with the corresponding data. The main thread should be interrupted by the additional thread and process the HTTP response as soon as possible, and afterwards continue working from the point where it was interrupted.
One way I can think about is that the additional thread suspends the main thread using SuspendThread, copies the TLS from the main thread using some inline assembler, executes the response-processing function itself and resumes the main thread afterwards.
Another way I have thought of is setting a breakpoint on some specific address in the second thread's callback routine, so that the main thread gets notified when the second thread's instruction pointer hits that breakpoint and has, therefore, received the HTTP response.
However, neither method seems nice at all; they hurt even to think about, and they don't look really reliable.
What can I use to interrupt my main thread, telling it that it should be polite and process the HTTP response before doing anything else? Answers without dependencies on libraries are appreciated, but I would also accept a dependency if it provides a nice solution.
A follow-up question (regarding the QueueUserAPC solution) was answered, and the answer explained that there is no safe way to get push behaviour in my case.
This may be one of those times where one works themselves into a very specific idea without reconsidering the bigger picture. There is no singular mechanism by which a single thread can stop executing in its current context, go do something else, and resume execution at the exact line from which it broke away. If it were possible, it would defeat the purpose of having threads in the first place. As you already mentioned, without stepping back and reconsidering the overall architecture, the most elegant of your options seems to be using another thread to wait for an HTTP response, have it suspend the main thread in a safe spot, process the response on its own, then resume the main thread. In this scenario you might rethink whether thread-local storage still makes sense or if something a little higher in scope would be more suitable, as you could potentially waste a lot of cycles copying it every time you interrupt the main thread.
What you are describing is what QueueUserAPC does. But the notion of using it for this sort of synchronization makes me a bit uncomfortable. If you don't know that the main thread is in a safe place to be interrupted, then you probably shouldn't interrupt it.
I suspect you would be better off giving the main thread's work to another thread so that it can sit and wait for you to send it notifications to handle work that only it can handle.
PostMessage or PostThreadMessage usually works really well for handing off bits of work to your main thread. Posted messages are handled before user input messages, but not until the thread is ready for them.
I might not understand the question, but CreateSemaphore and WaitForSingleObject should work. If one thread is waiting for the semaphore, it will resume when the other thread signals it.
Update based on the comment: The main thread can call WaitForSingleObject with a wait time of zero. In that situation, it will resume immediately if the semaphore is not signaled. The main thread could then check it on a periodic basis.
It looks like the answer should be discoverable from Microsoft's MSDN. Especially from this section on 'Synchronizing Execution of Multiple Threads'
If your main thread is a GUI thread, why not send a Windows message to it? That's what we all do to interact with the Win32 GUI from worker threads.
One way to do this that is deterministic is to periodically check whether an HTTP response has been received.
It's better for you to say what you're trying to accomplish.
In this situation I would do a couple of things. First and foremost, I would restructure the work that the main thread is doing so that it is broken into pieces that are as small as possible. That gives you a series of safe places to break execution at. Then you want to create a work queue, probably using the Microsoft SList (the interlocked singly linked list). The SList will give you the ability to have one thread adding while another reads without the need for locking.
Once you have that in place you can essentially make your main thread run in a loop over each piece of work, checking periodically to see if there are requests to handle in the queue. Long-term what is nice about an architecture like that is that you could fairly easily eliminate the thread localized storage and parallelize the main thread by converting the slist to a work queue (probably still using the slist), and making the small pieces of work and the responses into work objects which can be dynamically distributed across any available threads.
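A portable sketch of that loop follows. It swaps the lock-free SList for a mutex-protected std::queue (a substitution made only to keep the example self-contained), and ResponseQueue and run_main_loop are hypothetical names. The worker only posts callbacks; every handler runs on the main thread at a safe point between pieces of work.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Mutex-protected queue of callbacks; stands in for the lock-free SList.
class ResponseQueue {
public:
    void post(std::function<void()> fn) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(std::move(fn));
    }
    // Runs everything queued so far on the calling thread.
    void drain() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(m_);
            std::swap(pending, q_);
        }
        for (; !pending.empty(); pending.pop()) pending.front()();
    }

private:
    std::mutex m_;
    std::queue<std::function<void()>> q_;
};

// Main-thread loop: do one small piece of work, then handle pending responses.
// Returns the ids of the threads that ran each response handler.
std::vector<std::thread::id> run_main_loop(int pieces) {
    assert(pieces >= 0);
    ResponseQueue responses;
    std::vector<std::thread::id> handled_on;

    // Hypothetical "HTTP" worker: it only posts and never touches main-thread state.
    std::thread worker([&] {
        for (int i = 0; i < 3; ++i)
            responses.post([&] { handled_on.push_back(std::this_thread::get_id()); });
    });

    for (int i = 0; i < pieces; ++i) {
        // ... one small, self-contained piece of the main thread's work ...
        responses.drain();  // safe point: nothing is half-done here
    }
    worker.join();
    responses.drain();  // pick up anything posted after the last piece
    return handled_on;
}
```

Because the handlers run on the main thread, they can use its thread-local storage and non-thread-safe functions freely. The latency of handling a response is bounded by the size of the largest piece of work.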
I have several thread pools and I want my application to handle a cancel operation.
To do this I implemented a shared operation controller object which I poll at various spots in each thread pool worker function that is called.
Is this a good model, or is there a better way to do it?
I just worry about having all of these operationController.checkState() littered throughout the code.
Yes it's a good approach. Herb Sutter has a nice article comparing it with the alternatives (which are worse).
With any kind of asynchronous cancellation you're going to have to periodically poll some sort of flag. There's a fundamental issue of having to keep things in a consistent state. If you just kill a thread in the middle of whatever it's doing, bad things will happen sooner or later.
Depending on what you are actually doing, you may be able to just ignore the result of the operation instead of cancelling it. You let the operation continue on, but just don't wait for it to complete and never check the result.
If you actually need to stop the operation, then you're going to have to poll at appropriate points, and do whatever cleanup is necessary.
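For what it's worth, the polling itself can be kept very small. The sketch below assumes the work divides into units with a consistent state between them; CancelToken, process_units and run_and_cancel are illustrative names, not part of the asker's code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the question's operation controller.
class CancelToken {
public:
    void cancel() { cancelled_.store(true, std::memory_order_release); }
    bool cancelled() const { return cancelled_.load(std::memory_order_acquire); }

private:
    std::atomic<bool> cancelled_{false};
};

// Worker that checks the token once per unit of work, where its state is consistent.
// Returns how many of `total_units` it completed before stopping.
int process_units(const CancelToken& token, int total_units) {
    int completed = 0;
    for (int i = 0; i < total_units; ++i) {
        if (token.cancelled()) break;  // safe point: nothing half-done
        std::this_thread::sleep_for(std::chrono::milliseconds(1));  // one unit of work
        ++completed;
    }
    return completed;  // locals are destroyed normally, so cleanup is ordinary RAII
}

// Starts the worker, cancels it shortly afterwards, and reports its progress.
int run_and_cancel(int total_units) {
    CancelToken token;
    int completed = -1;
    std::thread worker([&] { completed = process_units(token, total_units); });
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    token.cancel();
    worker.join();
    assert(completed >= 0 && completed <= total_units);
    return completed;
}
```

Checking an atomic flag costs very little, so one check per loop iteration or per unit of work is usually affordable. The worst-case cancellation delay is the duration of one unit.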
It's a good way to do it.
Another possible way to do it is, if there's some other subroutine[s] which the threads call regularly anyway, to check within that subroutine and throw an exception (to be caught at the top of the thread), assuming that "cancel" may be considered exceptional and assuming that the code being executed by the thread is exception-safe.
I wouldn't do it that way, checking a shared object.
I would most likely provide each thread object with a way to cancel execution from inside its own thread, be it an event, a thread-safe state variable or whatever.
The problem with the shared operation controller is that, from my point of view, the logic is reversed. Why are you calling it a "controller" when it doesn't control anything?
For me, the operation controller should receive a cancellation order and then, in turn, select the appropriate threads and signal them to stop. That would be a correct "chain of command", if you know what I mean. The way you do it, you introduce unnatural behaviour in a thread which doesn't "obey" orders to stop; instead it checks each time whether its "superior" has "written the order somewhere". Somehow it just doesn't feel right.
In addition, what if you want just some of the threads to stop in the future? What if you want to include some advanced logic so that threads will only stop given a condition? Then you'll have to rewrite the code in each and every thread to handle that condition.
So I would provide a way for each thread to handle signals sent to it, for example by using a Command pattern with a FIFO structure.
(By the way, I realize they're thread pool workers, not actual Thread Classes but still, I think each worker must be signaled to stop separately, not the other way around).
In similar situations I have used an event, non-auto-reset, all threads can look at that event. Quite similar to polling except that if your threads block at times, they can sleep for the "stop"-event as well. (Easier on Windows.)
/L
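A rough portable analogue of such a non-auto-reset event can be built on a condition variable, so that threads can sleep on the stop request instead of only polling it. StopEvent and stop_latency_ms are hypothetical names, and this is a sketch rather than a drop-in replacement for the Win32 event object.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable analogue of a Win32 manual-reset event.
class StopEvent {
public:
    void set() {
        {
            std::lock_guard<std::mutex> lock(m_);
            set_ = true;
        }
        cv_.notify_all();  // stays set, so every waiter (now or later) sees it
    }
    bool is_set() const {
        std::lock_guard<std::mutex> lock(m_);
        return set_;
    }
    // Sleeps up to `timeout`; returns true as soon as the event is set.
    bool wait_for(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(m_);
        return cv_.wait_for(lock, timeout, [this] { return set_; });
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    bool set_ = false;
};

// A worker that would otherwise sleep 10 s between rounds wakes at once on stop.
// Returns the milliseconds that elapsed between set() and the worker exiting.
long long stop_latency_ms() {
    StopEvent stop;
    std::thread worker([&] {
        while (!stop.wait_for(std::chrono::seconds(10))) {
            // ... periodic work would go here ...
        }
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    auto t0 = std::chrono::steady_clock::now();
    stop.set();
    worker.join();
    auto elapsed = std::chrono::steady_clock::now() - t0;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Busy workers can call is_set() at safe points, while idle workers replace their sleeps with wait_for() and so react to the stop request immediately.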
I am writing an application which blocks on input from two istreams.
Reading from either istream is a synchronous (blocking) call, so, I decided to create two Boost::threads to do the reading.
Either one of these threads can get to the "end" (based on some input received), and once the "end" is reached, both input streams stop receiving. Unfortunately, I cannot know which will do so.
Thus, I cannot join() on both threads, because only one thread (cannot be predetermined which one) will actually return (unblock).
I must somehow force the other to exit, but it is blocked waiting for input, so it cannot itself decide it is time to return (condition variables or what not).
Is there a way to either:
Send a signal to a boost::thread, or
Force an istream to "fail", or
Kill a Boost::thread?
Note:
One of the istreams is cin
I am trying to restart the process, so I cannot close the input streams in a way that prohibits resetting them.
Edit:
I do know when the "end" is reached, and I do know which thread has successfully finished, and which needs to be killed. It's the killing I need to figure out (or a different strategy for reading from an istream).
I need both threads to exit and cleanup properly :(
Thanks!
I don't think there is a way to do it cross platform, but pthread_cancel should be what you are looking for. With a boost thread you can get the native_handle from a thread, and call pthread_cancel on it.
In addition a better way might be to use the boost asio equivalent of a select call on multiple files. That way one thread will be blocked waiting for the input, but it could come from either input stream. I don't know how easy it is to do something like this with iostreams though.
Yes there is!
boost::thread::interrupt() will do the job to your specifications.
It will cause the targeted thread to throw an exception (boost::thread_interrupted). Assuming it's uncaught, the stack will unwind properly, destroying all resources and terminating thread execution.
The interruption isn't instant. (The wrong thread is running at that moment, anyway.)
It happens under predefined conditions (interruption points) - the most convenient for you would probably be when calling boost::this_thread::sleep();, which you could have that thread do periodically.
If a boost thread is blocking on an I/O operation (e.g. cin >> whatever), boost::thread::interrupt() will not kill the thread. cin I/O is not a valid interruption point. Catch-22.
Well, on Linux I use pthread_kill(thread, SIGUSR1), as it interrupts blocking I/O. There is no such call on Windows, as I discovered when porting my code, only a deprecated one for socket reads. On Windows you have to explicitly define an event that will interrupt your blocking call. So there is no such thing (AFAIK) as a generic way to interrupt blocking I/O.
The Boost.Thread design handles this by managing well-identified interruption points. I don't know Boost.Asio well, and it seems that you don't want to rely on it anyway. If you don't want to refactor to use a non-blocking paradigm, what you can do is use something between non-blocking (polling) and blocking I/O. That is, do something like this (pseudocode):
while (!stopped && !interrupted)
{
    io.blockingCall(timeout);
    if (!stopped && !interrupted)
    {
        doSomething();
    }
}
Then you interrupt your two threads and join them ...
Perhaps it is simpler in your case ? If you have a master thread that knows one thread is ended you just have to close the IO of the other thread ?
Edit:
By the way I'm interested in the final solution you have ...
I had a similar issue myself and have reached this solution, which some other readers of this question might find useful:
Assuming that you are using a condition variable with a wait() command, it is important for you to know that in Boost, the wait() statement is a natural interrupt point. So just put a try/catch block around the code with the wait statement and allow the function to terminate normally in your catch block.
Now, assuming you have a container with your thread pointers, iterate over your thread pointers and call interrupt() on each thread, followed by join().
Now all of your threads will terminate gracefully and any Boost-related memory cleanup should work cleanly.
Rather than trying to kill your thread, you can always try to join the thread instead, and if that fails, join the other one. (This assumes you will always be able to join at least one of your two threads.)
In boost::thread you're looking for the timed_join function.
If you want the correct answer, however, that would be to use non-blocking I/O with timed waits, which gives you the flow structure of synchronous I/O with the non-blocking behaviour of asynchronous I/O.
You talk about reading from an istream, but an istream is only an interface. For stdin, you can just fclose stdin to interrupt the read. As for the other, it depends on where you're reading from...
It seems that threads are not helping you do what you want in a simple way. If Boost.Asio is not to your liking, consider using select().
The idea is to get two file descriptors and use select() to tell you which of them has input available. The file descriptor for cin is typically STDIN_FILENO; how to get the other one depends on your specifics (if it's a file, just open() it instead of using ifstream).
Call select() in a loop to find out which input to read, and when you want to stop, just break out of the loop.
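A sketch of that loop, assuming both inputs are available as POSIX file descriptors. The function name read_until_marker and the single-character end marker are simplifications made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <sys/select.h>
#include <unistd.h>

// Reads from two descriptors on one thread until `end_marker` shows up on
// either of them. Returns everything read up to and including that chunk.
std::string read_until_marker(int fd_a, int fd_b, char end_marker) {
    assert(fd_a >= 0 && fd_b >= 0 && fd_a < FD_SETSIZE && fd_b < FD_SETSIZE);
    std::string collected;
    const int fds[2] = {fd_a, fd_b};

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(fd_a, &readfds);
        FD_SET(fd_b, &readfds);

        // No timeout: block until at least one descriptor has input.
        if (select(std::max(fd_a, fd_b) + 1, &readfds, nullptr, nullptr, nullptr) < 0)
            return collected;  // error; a real caller would inspect errno

        for (int fd : fds) {
            if (!FD_ISSET(fd, &readfds)) continue;
            char buf[256];
            ssize_t n = read(fd, buf, sizeof buf);
            if (n <= 0) return collected;  // EOF or error on this input
            collected.append(buf, static_cast<size_t>(n));
            if (collected.find(end_marker) != std::string::npos)
                return collected;  // "end" reached: just leave the loop
        }
    }
}
```

Since only one thread is involved, nothing needs to be killed or joined when the end is reached, and both descriptors stay open for the restart. Note that reading with read() bypasses the buffering of std::cin, so the two should not be mixed on the same descriptor.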
Under Windows, use QueueUserAPC to queue a proc which throws an exception. That approach works fine for me.
HOWEVER: I've just found that boost mutexes etc are not "alertable" on win32, so QueueUserAPC cannot interrupt them.
Very late, but in Windows (and its precursors like VMS or RSX, for those who remember such things) I'd use something like ReadFileEx with a completion routine that signals when finished, and CancelIo if the read needs to be cancelled early.
Linux/BSD has an entirely different underlying API which isn't as flexible. Using pthread_kill to send a signal works for me, that will stop the read/open operation.
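On the POSIX side, the pthread_kill approach might look roughly like this. The sketch assumes std::thread is backed by pthreads (so native_handle() yields a pthread_t) and uses SIGUSR1 with an empty handler; on_wakeup and interrupt_blocked_read are names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <csignal>
#include <pthread.h>
#include <thread>
#include <unistd.h>

extern "C" void on_wakeup(int) {}  // empty: the signal only needs to interrupt read()

// Blocks a thread in read() on `fd`, then interrupts it with SIGUSR1.
// Returns the errno the reader saw (EINTR when the interruption worked).
int interrupt_blocked_read(int fd) {
    struct sigaction sa = {};
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // no SA_RESTART, otherwise read() would silently resume
    int rc = sigaction(SIGUSR1, &sa, nullptr);
    assert(rc == 0);
    (void)rc;

    std::atomic<bool> finished{false};
    int reader_errno = 0;

    std::thread reader([&] {
        char c;
        ssize_t n = read(fd, &c, 1);  // blocks until data, EOF, or a signal
        reader_errno = (n < 0) ? errno : 0;
        finished.store(true);
    });

    // Re-send until the reader reports back: a signal that lands before the
    // thread has entered read() is consumed without interrupting anything.
    while (!finished.load()) {
        pthread_kill(reader.native_handle(), SIGUSR1);
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }
    reader.join();
    return reader_errno;
}
```

The retry loop matters: a single pthread_kill can arrive just before the thread blocks, in which case the read would never be interrupted. The reader must also treat EINTR as "asked to stop" rather than retrying the call.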
It's worth implementing different code in this area for each platform, IMHO.
I've got a C++ Win32 application that has a number of threads that might be busy doing IO (HTTP calls, etc) when the user wants to shut down the application. Currently, I play nicely and wait for all the threads to end before returning from main. Sometimes, this takes longer than I would like and indeed, it seems kind of pointless to make the user wait when I could just exit. However, if I just go ahead and return from main, I'm likely to get crashes as destructors start getting called while there are still threads using the objects.
So, recognizing that in an ideal, platonic world of virtue, the best thing to do would be to wait for all the threads to exit and then shut down cleanly, what is the next best real-world solution? Simply making the threads exit faster may not be an option. The goal is to get the process dead as quickly as possible so that, for example, a new version can be installed over it. The only disk IO I'm doing is in a transactional db, so I'm not terribly concerned about pulling the plug on that.
Use overlapped IO so that you're always in control of the threads that are dealing with your I/O and can always stop them at any point; you either have them waiting on an IOCP and can post an application level shutdown code to it, OR you can wait on the event in your OVERLAPPED structure AND wait on your 'all threads please shutdown now' event as well.
In summary, avoid blocking calls that you can't cancel.
If you can't and you're stuck in a blocking socket call doing IO then you could always just close the socket from the thread that has decided that it's time to shut down and have the thread that's doing IO always check the 'shutdown now' event before retrying...
I use an exception-based technique that's worked pretty well for me in a number of Win32 applications.
To terminate a thread, I use QueueUserAPC() to queue a call to a function which throws an exception. However, the exception that's thrown isn't derived from the type "Exception", so will only be caught by my thread's wrapper procedure.
The advantages of this are as follows:
No special code needed in your thread to make it 'stoppable' - as soon as it enters an alertable wait state, it will run the APC function.
All destructors get invoked as the exception runs up the stack, so your thread exits cleanly.
The things you need to watch for:
Anything doing catch (...) will eat your exception. User code should always use catch(const Exception &e) or similar!
Make sure your I/O and delays are done in an "alertable" way. For example, this means calling SleepEx(N, TRUE) instead of Sleep(N).
CPU-bound threads need to call SleepEx(0, TRUE) occasionally to check for termination.
You can also 'protect' areas of your code to prevent task termination during critical sections.
Best way: Do your work while the app is running, and do nothing (or as close to) at shutdown (works for startup too). If you stick to that pattern, then you can tear down the threads immediately (rather than "being nice" about it) when the shutdown request comes without worrying about work that still needs to be done.
In your specific situation, you'd probably need to wait for IO to finish (writes, at least) if you're doing local work there. HTTP requests and such you can probably just abandon/close outright (again, unless you're writing something). But if it is the case that you're writing during this shutdown and waiting on that, then you may want to notify the user of that, rather than letting your process look hung while you're wrapping things up.
I'd recommend having your GUI and work be done on different threads. When a user requests a shutdown, dismiss the GUI immediately giving the appearance that the application has closed. Allow the worker threads to close gracefully in the background.
If you want to pull the plug messily, exit(0) will do the trick.
I once had a similar problem, albeit in Visual Basic 6: threads from an app would connect to different servers, download some data, perform some operations looping upon that data, and store on a centralized server the result.
Then, new requirement was that threads should be stoppable from main form. I accomplished this in an easy though dirty fashion, by having the threads stop after N loops (equivalent roughly to half a second) to try to open a mutex with a specific name. Upon success, they immediately stopped whatever they were doing and quit, continued otherwise.
This mutex was created only by the main form, once it was created all the threads would soon close themselves. The disadvantage was that user needed to manually specify it wanted to run the threads again - another button to "Enable threads to run" accomplished this by releasing the mutex :D
This trick is guaranteed to work because mutex operations are atomic. The problem is that you're never sure a thread really closed - a failure in the logic of handling the "openMutex succeeded" case could mean it never ends. You also don't know when/if all the threads have closed (assuming your code is right, this would take roughly the same time it takes for the loops to stop and "listen").
With VB's "apartment" model of multi-threading it's somewhat difficult to send info from the threads to the main app back and forth, it's much easier to "fire and forget" or to send it only from the main app to the thread. Thus, the need of these kind of long-cuts. Using C++ you're free to use your multi-threading model, so these constraints might not apply to you.
Whatever you do, do NOT use TerminateThread, especially on anything that could be in OS HTTP calls. You could potentially break IE until reboot.
Change all of your IO to an asynchronous or non-blocking model so that they can watch for termination events.
If you need to shutdown suddenly: Just call ExitProcess - which is what is going to be called just as soon as you return from WinMain anyway. Windows itself creates many worker threads that have no way to be cleaned up - they are terminated by process shutdown.
If you have any threads that are performing writes of some kind - obviously those need a chance to close their resources. But anything else - ignore the bounds checker warnings and just pull the rug from under their feet.
You can call TerminateProcess - this will stop the process immediately, without notifying anyone and without waiting for anything.
*NULL = 0 is the fastest way. if you don't want to crash, call exit() or its win32 equivalent.
Instruct the user to unplug the computer. Short of that, you have to abandon your asynchronous activities to the wind. Or is that HWIND? I can never remember in C++. Of course, you could take the middle road and quickly note in a text file or reg key what action was abandoned, so that the next time the program runs it can take up that action again automatically or ask the user if they want to do so. Depending on what data you lose when you abandon the async action, you may not be able to do that. If you're interacting with the user, you may want to consider a dialog or some UI interaction that explains why it's taking so long.
Personally, I prefer the instruction to the user to just unplug the computer. :)