c++ implementing cancel across thread pools

c++ implementing cancel across thread pools - c++

I have several thread pools and I want my application to handle a cancel operation.
To do this I implemented a shared operation controller object which I poll at various spots in each thread pool worker function that is called.
Is this a good model, or is there a better way to do it?
I just worry about having all of these operationController.checkState() littered throughout the code.

Yes it's a good approach. Herb Sutter has a nice article comparing it with the alternatives (which are worse).

With any kind of ansynchronous cancellation you're going to have to periodically poll some sort of flag. There's a fundamental issue of having to keep things in a consitant state. If you just kill a thread in the middle of whatever it's doing, bad things will happen sooner or later.
Depending on what you are actually doing, you may be able to just ignore the result of the operation instead of cancelling it. You let the operation continue on, but just don't wait for it to complete and never check the result.
If you actually need to stop the operation, then you're going to have to poll at appropriate points, and do whatever cleanup is necessary.

It's a good way to do it.
Another possible way to do it is, if there's some other subroutine[s] which the threads call regularly anyway, to check within that subroutine and throw an exception (to be caught at the top of the thread), assuming that "cancel" may be considered exceptional and assuming that the code being executed by the thread is exception-safe.

I wouldn't do it that way, checking a shared object.
I most likely will provide each thread object with a way to cancel the execution inside the own thread, be it an event, a threadsafe state variable or whatever.
The problem with the shared operation controller is that, from my point of view, the logic is reversed, Why are you calling it "controller" when it doesn't control anything?
For me, Operation Controller shall recive a cancelation order and then, in turn select the appropiate threads and signal them to stop. That would be a correct "chain of command" if you know what I mean. The way you do it you introduce an unnatural behaivour on the thread wich doesn't "obey" orders to stop, instead if checks each time if his "superior" has "written the order somewere". Somehow it just doesn't feel right.
In addition, what if you just one "some" of the threads to stop in the future? What if you want to include some advanced logic so that threads will only stop given a condition? Then you'll have to rewrite the code in each and every thread to handle that condition.
So I will provide a way, for each thread to be able to handle signals to them, for example by using a Command Pattern with a FIFO structure.
(By the way, I realize they're thread pool workers, not actual Thread Classes but still, I think each worker must be signaled to stop separately, not the other way around).

In similar situations I have used an event, non-auto-reset, all threads can look at that event. Quite similar to polling except that if your threads block at times, they can sleep for the "stop"-event as well. (Easier on Windows.)
/L

Related

How do I signal a std::thread to exit gracefully?

Using C++17, for a worker thread with a non-blocking loop in it that performs some task, I see three ways to signal the thread to exit:
A std::atomic_bool that the thread checks in a loop. If it is set to true, the thread exits. The main thread sets it to true before invoking std::thread::join().
A std::condition_variable with a bool. This is similar to the above, except it allows you to invoke std::condition_variable::wait_for() to effectively "sleep" the thread (to lower CPU usage) while it waits for a potential exit signal (via setting the bool, which is checked in the 3rd argument to wait_for() (the predicate). The main thread would lock a mutex, change the bool to true, and invoke std::condition_variable::notify_all() before invoking std::thread::join() to signal the thread to exit.
A std::future and std::promise. The main thread holds a std::promise<void> while the worker thread holds the corresponding std::future<void>. The worker thread uses std::future::wait_for() similar to the step above. Main thread invokes std::promise::set_value() before calling std::thread::join().
My thoughts on each:
This is simple, but lacks the ability to "slow down" the worker thread loop without explicitly calling std::this_thread::sleep_for(). Seems like an "old fashioned" way of doing thread signals.
This one is comprehensive, but very complicated, because you need a condition variable plus a boolean variable.
This one seems like the best option, because it has the simplicity of #1 without the verbosity of #2. But I have no personal experience with std::future and std::promise yet, so I am not sure if it's the ideal solution. In my mind, promise & future are meant to transfer values across threads, not really be used as signals. So I'm not sure if there are efficiency concerns.
I see multiple ways of signaling a thread to exit. And sadly, my Google searching has only introduced more as I keep looking, without actually coming to a general consensus on the "modern" and/or "best" way of doing this with C++17.
I would love to see some light shed on this confusion. Is there a conclusive, definitive way of doing this? What is the general consensus? What are the pros/cons of each solution, if there is no "one size fits all"?

If you have a busy working thread which requires one-way notification if it should stop working the best way is to just use an atomic<bool>. It is up to the worker thread if it wants to slow down or it doesn't want to slow down. The requirement to "throttle" the worker thread is completely orthogonal to the thread cancellation and, in my opinion, should not be considered with the cancellation itself. This approach, to my knowledge, has 2 drawbacks: you can't pass back the result (if any) and you can't pass back an exception (if any). But if you do not need any of those then use atomic<bool> and don't bother with anything else. It is as modern as any; there is nothing old-fashioned about it.
condition_variable is part of the consumer/producer pattern. So there is something which produces work and there is something that consumes what was produced. To avoid busy waiting for the consumer while there is nothing to consume the condition_variable is a great option to use. It is just a perfect primitive for such tasks. But it doesn't make sense for the thread cancellation process. And you will have to use another variable anyway because you can't rely on condition_variable alone. It might spuriously wake up the thread. You might "set" it before it gets in the waiting process, losing the "set" completely, and so on. It just can't be used alone so we back to square one but now with an atomic<bool> variable to accompany our condition_variable
The future/promise pair is good when you need to know the result of the operation done on the other thread. So it is not a replacement of the approach with the atomic<bool> but it rather complements it. So to remove the drawbacks described in the first paragraph you add future/promise to the equation. You provide the calling side with the future extracted from the promise which lives within the thread. That promise gets set once the thread is finished:
Because exception is thrown.
Because thread has done its work and completed on its own.
Because we asked it to stop by setting the atomic<bool> variable.
So as you see the future/promise pair just helps to provide some feedback for the callee it has nothing to do with the cancellation itself.
P.S. You can always use an electric sledgehammer to crack a nut but it doesn't make the approach any more modern.

I can't say that this is conclusive, or definitive, but since this is somewhat an opinion question, I'll give an answer that it is based upon a lot of trial and error to solve the kind of problem you are asking about (I think).
My preferred pattern is to signal the thread to stop using atomic bool, and control the 'loop' timing with a condition variable.
We ran into the requirement for running repeating tasks on worker threads so often that we created a class that we called 'threaded_worker'. This class handles the complexities of aborting the thread, and timing the calls to the worker function.
The abort is handled via a method that sets the atomic bool 'abort' signal which tells the thread to stop calling the work function and terminate.
The loop timing can be controlled by methods that set the wait time for the condition variable. The thread can be released to continue via method that calls the notify on the condition variable.
We use the class as a base class for all kinds of objects that have some function that needs to execute on a separate thread. The class is designed to run the 'work' function once, or in a loop.
We use the bool for the abort, because it is simple and suitable to do the job. We use the condition variable for loop timing, because it has the benefit of being notified to 'short circuit' the timing. This is very useful when the threaded object is a consumer. When a producer has work for the threaded object, it can queue the work and notify that the work is available. The threaded object immediately continues, instead of waiting for the specified wait time on the condition variable.
The reason for both (the abort signal, and the condition variable) is that I see terminating the thread as one function, and timing the loop as another.
We used to time loops by putting the thread to sleep for some duration. This made it almost impossible to get predictable loop timing on Windows computers. Some computers will return from sleep(1) in about 1ms, but others will return in 15ms. Our performance was highly dependent on the specific hardware. Using condition variables we have greatly improved the timing of critical tasks. The added benefit of notifying a waiting thread when work is available is more than worth the complexity of the condition variable.

Is It Possible to Send a Signal to a Specific pthread ID?

I'm working on a multi-thread scheduling assignment, which involves adding threads to a variety of queues and selecting the appropriate one to execute.
The pthread_cond_signal(&condition) command is completely asynchronous from what I can tell; it's simply thrown into memory and the first thread to find it with the appropriate pthread_cond_wait() will consume it.
However, say I have a vector of thread ids that have been pushed as the thread is created, ie:
threadIDVector1[0] = 3061099328
threadIDVector1[1] = 3077884736
...
threadIDVector2[0] = 3294747394
threadIDVector2[1] = 3384567393
...
etc.
And I wanted to send a signal specifically to the thread with an id that matches the appropriate element of a vector. I.e. the algorithm would be:
While (at least one threadVector is non-empty):
Look at the first element in each vector
Select the appropriate one to signal by some criteria
Send a signal to ONLY that thread
Complete the thread and remove from threadIDVectorX
Is there some way to execute the above, or some accepted standard for achieving the same result?

There is no way to "send" a signal to a specific thread, nor to know which thread among many will be woken by the OS. It is entirely non-deterministic.
You could use the "multiple condition variable" solution as proposed in the comments. But my preferred solution to something like this is a pipe or socket pair. Have the thread doing the waking write something (like a single byte) to the pipe for the corresponding thread to signal it.
This has a lot of benefits in my book. First, it allows bidirectional communication. Your pseudocode loop at the end of your question seems to also want to remove a finished thread from the list, so you need to know when that thread is done. You could have another CV, or you could have the completing thread write a single byte back to the manager object before exiting. Much easier, I feel.
It also allows you to choose between blocking or nonblocking I/O, or to use synchronous multiplexing with select(2) or epoll(2). If you were not exiting from the worker threads, but instead wanted to reuse them, the notifying thread would need to know when they're ready for more work. Again, a CV would be fine here, but the file-descriptor approach allows the notifier to wait for all of the worker threads in a single select(2) call.
The last thing is that I find files simpler. pthreads are pretty complicated, and multithreading is already hard enough to get right. I find that files are easier to manage and reason about in a multithreaded context, making it easier to avoid locking or crashes.

Changing Thread Task?

I know you cannot kill a boost thread, but can you change it's task?
Currently I have an array of 8 threads. When a button is pressed, these threads are assigned a task. The task which they are assigned to do is completely independent of the main thread and the other threads. None of the the threads have to wait or anything like that, so an interruption point is never reach.
What I need is to is, at anytime, change the task that each thread is doing. Is this possible? I have tried looping through the array of threads and changing what each thread object points to to a new one, but of course that doesn't do anything to the old threads.
I know you can interrupt pThreads, but I cannot find a working link to download the library to check it out.

A thread is not some sort of magical object that can be made to do things. It is a separate path of execution through your code. Your code cannot be made to jump arbitrarily around its codebase unless you specifically program it to do so. And even then, it can only be done within the rules of C++ (ie: calling functions).
You cannot kill a boost::thread because killing a thread would utterly wreck some of the most fundamental assumptions a programmer makes. You now have to take into account the possibility that the next line doesn't execute for reasons that you can neither predict nor prevent.
This isn't like exception handling, where C++ specifically requires destructors to be called, and you have the ability to catch exceptions and do special cleanup. You're talking about executing one piece of code, then suddenly inserting a call to some random function in the middle of already compiled code. That's not going to work.
If you want to be able to change the "task" of a thread, then you need to build that thread with "tasks" in mind. It needs to check every so often that it hasn't been given a new task, and if it has, then it switches to doing that. You will have to define when this switching is done, and what state the world is in when switching happens.

How to implement a timed wait around a blocking call?

So, the situation is this. I've got a C++ library that is doing some interprocess communication, with a wait() function that blocks and waits for an incoming message. The difficulty is that I need a timed wait, which will return with a status value if no message is received in a specified amount of time.
The most elegant solution is probably to rewrite the library to add a timed wait to its API, but for the sake of this question I'll assume it's not feasible. (In actuality, it looks difficult, so I want to know what the other option is.)
Here's how I'd do this with a busy wait loop, in pseudocode:
while(message == false && current_time - start_time < timeout)
{
if (Listener.new_message()) then message = true;
}
I don't want a busy wait that eats processor cycles, though. And I also don't want to just add a sleep() call in the loop to avoid processor load, as that means slower response. I want something that does this with a proper sort of blocks and interrupts. If the better solution involves threading (which seems likely), we're already using boost::thread, so I'd prefer to use that.
I'm posting this question because this seems like the sort of situation that would have a clear "best practices" right answer, since it's a pretty common pattern. What's the right way to do it?
Edit to add: A large part of my concern here is that this is in a spot in the program that's both performance-critical and critical to avoid race conditions or memory leaks. Thus, while "use two threads and a timer" is helpful advice, I'm still left trying to figure out how to actually implement that in a safe and correct way, and I can easily see myself making newbie mistakes in the code that I don't even know I've made. Thus, some actual example code would be really appreciated!
Also, I have a concern about the multiple-threads solution: If I use the "put the blocking call in a second thread and do a timed-wait on that thread" method, what happens to that second thread if the blocked call never returns? I know that the timed-wait in the first thread will return and I'll see that no answer has happened and go on with things, but have I then "leaked" a thread that will sit around in a blocked state forever? Is there any way to avoid that? (Is there any way to avoid that and avoid leaking the second thread's memory?) A complete solution to what I need would need to avoid having leaks if the blocking call doesn't return.

You could use sigaction(2) and alarm(2), which are both POSIX. You set a callback action for the timeout using sigaction, then you set a timer using alarm, then make your blocking call. The blocking call will be interrupted if it does not complete within your chosen timeout (in seconds; if you need finer granularity you can use setitimer(2)).
Note that signals in C are somewhat hairy, and there are fairly onerous restriction on what you can do in your signal handler.
This page is useful and fairly concise:
http://www.gnu.org/s/libc/manual/html_node/Setting-an-Alarm.html

What you want is something like select(2), depending on the OS you are targeting.

It sounds like you need a 'monitor', capable of signaling availability of resource to threads via a shared mutex (typically). In Boost.Thread a condition_variable could do the job.

You might want to look at timed locks: Your blocking method can aquire the lock before starting to wait and release it as soon as the data is availabe. You can then try to acquire the lock (with a timeout) in your timed wait method.

Encapsulate the blocking call in a separate thread. Have an intermediate message buffer in that thread that is guarded by a condition variable (as said before). Make your main thread timed-wait on that condition variable. Receive the intermediately stored message if the condition is met.
So basically put a new layer capable of timed-wait between the API and your application. Adapter pattern.

Regarding
what happens to that second thread if the blocked call never returns?
I believe there is nothing you can do to recover cleanly without cooperation from the called function (or library). 'Cleanly' means cleaning up all resources owned by that thread, including memory, other threads, locks, files, locks on files, sockets, GPU resources... Un-cleanly, you can indeed kill the runaway thread.

Inter-thread communication. How to send a signal to another thread

In my application I have two threads
a "main thread" which is busy most of the time
an "additional thread" which sends out some HTTP request and which blocks until it gets a response.
However, the HTTP response can only be handled by the main thread, since it relies on it's thread-local-storage and on non-threadsafe functions.
I'm looking for a way to tell the main thread when a HTTP response was received and the corresponding data. The main thread should be interrupted by the additional thread and process the HTTP response as soon as possible, and afterwards continue working from the point where it was interrupted before.
One way I can think about is that the additional thread suspends the main thread using SuspendThread, copies the TLS from the main thread using some inline assembler, executes the response-processing function itself and resumes the main thread afterwards.
Another way in my thoughts is, setting a break point onto some specific address in the second threads callback routine, so that the main thread gets notified when the second threads instruction pointer steps on that break point - and therefore - has received the HTTP response.
However, both methods don't seem to be nicely at all, they hurt even if just thinking about them, and they don't look really reliable.
What can I use to interrupt my main thread, saying it that it should be polite and process the HTTP response before doing anything else? Answers without dependencies on libraries are appreciated, but I would also take some dependency, if it provides some nice solution.
Following question (regarding the QueueUserAPC solution) was answered and explained that there is no safe method to have a push-behaviour in my case.

This may be one of those times where one works themselves into a very specific idea without reconsidering the bigger picture. There is no singular mechanism by which a single thread can stop executing in its current context, go do something else, and resume execution at the exact line from which it broke away. If it were possible, it would defeat the purpose of having threads in the first place. As you already mentioned, without stepping back and reconsidering the overall architecture, the most elegant of your options seems to be using another thread to wait for an HTTP response, have it suspend the main thread in a safe spot, process the response on its own, then resume the main thread. In this scenario you might rethink whether thread-local storage still makes sense or if something a little higher in scope would be more suitable, as you could potentially waste a lot of cycles copying it every time you interrupt the main thread.

What you are describing is what QueueUserAPC does. But The notion of using it for this sort of synchronization makes me a bit uncomfortable. If you don't know that the main thread is in a safe place to interrupt it, then you probably shouldn't interrupt it.
I suspect you would be better off giving the main thread's work to another thread so that it can sit and wait for you to send it notifications to handle work that only it can handle.
PostMessage or PostThreadMessage usually works really well for handing off bits of work to your main thread. Posted messages are handled before user input messages, but not until the thread is ready for them.

I might not understand the question, but CreateSemaphore and WaitForSingleObject should work. If one thread is waiting for the semaphore, it will resume when the other thread signals it.
Update based on the comment: The main thread can call WaitForSingleObject with a wait time of zero. In that situation, it will resume immediately if the semaphore is not signaled. The main thread could then check it on a periodic basis.

It looks like the answer should be discoverable from Microsoft's MSDN. Especially from this section on 'Synchronizing Execution of Multiple Threads'

If your main thread is GUI thread why not send a Windows message to it? That what we all do to interact with win32 GUI from worker threads.

One way to do this that is determinate is to periodically check if a HTTP response has been received.
It's better for you to say what you're trying to accomplish.

In this situation I would do a couple of things. First and foremost I would re-structure the work that the main thread is doing to be broken into as small of pieces as possible. That gives you a series of safe places to break execution at. Then you want to create a work queue, probably using the microsoft slist. The slist will give you the ability to have one thread adding while another reads without the need for locking.
Once you have that in place you can essentially make your main thread run in a loop over each piece of work, checking periodically to see if there are requests to handle in the queue. Long-term what is nice about an architecture like that is that you could fairly easily eliminate the thread localized storage and parallelize the main thread by converting the slist to a work queue (probably still using the slist), and making the small pieces of work and the responses into work objects which can be dynamically distributed across any available threads.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js